<div dir="ltr">Hello Everyone,<div><br></div><div style>I have been working on implementing a non-blocking version of our halo exchange routines. To gather feedback on the approach, I put together a design document.</div>


<div style><br></div><div style>Although the benefit proposed in the design document applies directly to cores that support multiple blocks, cores that don&#39;t support multiple blocks could benefit too.</div>


<div style><br></div><div style>Cores without multiple-block support could split their halo exchanges into three grouped phases: the initialization of the halo exchanges for all fields, then the local copies for all fields, and finally the finalization of the halo exchanges for all fields. This could allow more computation between posting the MPI_Isend/MPI_Irecv pairs and the MPI_Wait calls that precede unpacking the buffers.</div>


<div style><br></div><div style>Anyway, please let me know if you have any comments or questions. I&#39;ve also committed this to the repository.</div><div style><br></div><div style>Thanks,</div><div style>Doug</div></div>