<p><b>dwj07@fsu.edu</b> 2012-05-09 15:51:31 -0600 (Wed, 09 May 2012)</p><p><br>

        -- BRANCH COMMIT --<br>

<br>

        Adding changes to the design document for multiple blocks.<br>

</p><hr noshade><pre><font color="gray">Modified: branches/omp_blocks/docs/mpas_multiple_blocks.pdf

===================================================================

(Binary files differ)

Modified: branches/omp_blocks/docs/mpas_multiple_blocks.tex

===================================================================

--- branches/omp_blocks/docs/mpas_multiple_blocks.tex        2012-05-09 20:28:47 UTC (rev 1882)

+++ branches/omp_blocks/docs/mpas_multiple_blocks.tex        2012-05-09 21:51:31 UTC (rev 1883)

@@ -3,6 +3,7 @@

 \usepackage{graphicx}

 \usepackage{listings}

 \usepackage{color}

+\usepackage{placeins}

 \setlength{\topmargin}{0in}

 \setlength{\headheight}{0in}

@@ -15,7 +16,7 @@

 \begin{document}

-\title{Revisions to MPAS block decomposition routines}

+\title{Implementing Multiple Blocks within MPAS Framework}

 \author{}

 \maketitle

@@ -100,14 +101,15 @@

 There are significant changes to MPAS' framework that have to be made in order

 to support multiple blocks. A list of requirements that determine these changes

-are listed below.

+are listed below, with the reasons for these requirements written below them.

 \begin{itemize}

-        \item Block creation must be robust, and handle an arbitrary number of

-                blocks per processor.

+        \item Block creation must be robust, and handle an arbitrary number

+                of blocks per processor.

         \item Blocks should be created using the derived data types created in an

-                earlier project, utilizing the field data types.

+                earlier project, promoting the use of field data types rather than

+                simple arrays.

         \item Block creation routines should be created with an arbitrary number of

                 halos assumed, although the default is currently two.

@@ -117,9 +119,43 @@

         \item Exchange list creation should be performed at the block/field level.

-        \item A new module should be setup to handle the management of blocks.

+        \item Block creation code should be isolated from the rest of MPAS code.

 \end{itemize}

+Blocks per processor should be allowed to be any non-negative number, including

+zero. This could be useful if a user wanted the ability to specify certain

+processors to do certain tasks, without doing any actual computation work on

+blocks. To use this ability, however, the user would have to give an explicit

+block-to-processor decomposition.
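As an illustration of an explicit block-to-processor decomposition that leaves one processor without blocks, here is a minimal sketch. It is not MPAS code: the program name, the blockOwner and nBlocksOnProc arrays, and the values in them are hypothetical and chosen only for this example.

```fortran
! Minimal sketch, not MPAS code: count the blocks assigned to each
! processor from an explicit block-to-processor map. All names and
! values here are hypothetical.
program block_to_proc_sketch
  implicit none

  integer, parameter :: nProcs = 4
  integer, parameter :: nBlocks = 6
  integer, dimension(nBlocks) :: blockOwner        ! owning processor of each block
  integer, dimension(0:nProcs-1) :: nBlocksOnProc  ! blocks held by each processor
  integer :: iBlock, iProc

  ! Explicit decomposition: processor 2 is deliberately given no blocks.
  blockOwner = (/ 0, 0, 0, 1, 1, 3 /)

  nBlocksOnProc = 0
  do iBlock = 1, nBlocks
    nBlocksOnProc(blockOwner(iBlock)) = nBlocksOnProc(blockOwner(iBlock)) + 1
  end do

  do iProc = 0, nProcs - 1
    if (nBlocksOnProc(iProc) == 0) then
      print *, 'processor', iProc, 'owns no blocks'
    else
      print *, 'processor', iProc, 'owns', nBlocksOnProc(iProc), 'blocks'
    end if
  end do
end program block_to_proc_sketch
```

Any block creation routine would then have to cope with a zero count, for example by skipping its loop over local blocks entirely.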

+

+In the creation of blocks, field data types should be used in place of simple

+arrays to promote the use of internal derived data types that are used

+elsewhere within MPAS. This will allow similar techniques to be identified by

+developers of cores, and allow a similar work flow with variables and fields

+within all of MPAS.

+

+Although fields currently are restricted to having two halo layers, at some

+point in the future we might like to be able to extend halo layers or even have

+different halo layers on each field. In order to make this task easier to

+accomplish in the future, block creation routines need to be able to create an

+arbitrary number of halo layers.

+

+When two blocks are neighbors on a single processor, shared memory copies

+could be used for halo exchanges and other sorts of block-to-block

+communications rather than using MPI send/recv routines.

+

+Exchange lists are limited in their functionality by the fact that the only

+information they hold refers to the other processor/block involved in the

+communication. For example, if processor 0 owns block 0 and this has to send

+information from 15 cells to processor 2 block 5 then the send list for block 0

+only gives the information on where it has to send the information, while the

+receive list on block 5 gives information on where to receive the information

+from. Because of this, exchange lists have to be created on a per block basis,

+and linked to a specific block. This way each block knows which cells it's

+supposed to send/recv/copy to/from another block. Each field and block have

+their own exchange lists, so the creation of exchange lists should place them

+within these already existing structures.

+

 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

 %

 % Design

@@ -128,12 +164,75 @@

 \chapter{Design}

 Only a small amount of design has been completed thus far. So, all information

-in this section should be regarded as a work in progress for now.

+in this section should be regarded as a work in progress for now. As a visual

+for the design process, the proposed module layout can be seen in figure

+\ref{fig:module_layout}.

-The current prototyping efforts have determined the following routines which

-require changes to their infrastructure.

+\begin{figure}[H!]

+        \centering

+        \includegraphics[scale=0.35]{DesignLayout.eps}

+        \caption{Layout of modules for input/output with multiple blocks}

+        \label{fig:module_layout}

+\end{figure}

+To begin, a change to the exchange lists needs to be made to support cleaner

+versions of local copies. The current structure of an exchange list can be seen

+below.

 \begin{lstlisting}[language=fortran,escapechar=@,frame=single]

+type exchange_list

+  integer :: procID

+  integer :: blockID

+  integer :: nlist

+  integer, dimension(:), pointer :: list

+  type (exchange_list), pointer :: next

+  real (kind=RKIND), dimension(:), pointer :: rbuffer

+  integer, dimension(:), pointer           :: ibuffer

+  integer :: reqID

+end type exchange_list

+\end{lstlisting}

+

+The exchange lists need to be split to accommodate shared memory copies. The

+distributed memory exchange lists won't change, aside from their name. These

+can be seen below

+

+\begin{lstlisting}[language=fortran,escapechar=@,frame=single]

+@\colorbox{yellow}{type distributed\_exchange\_list}@

+  integer :: procID

+  integer :: blockID

+  integer :: nlist

+  integer, dimension(:), pointer :: list

+  type (distributed_exchange_list), pointer :: next

+  real (kind=RKIND), dimension(:), pointer :: rbuffer

+  integer, dimension(:), pointer           :: ibuffer

+  integer :: reqID

+@\colorbox{yellow}{end type distributed\_exchange\_list}@

+\end{lstlisting}

+

+In addition to these new distributed memory exchange lists, shared memory

+exchange lists will be created as well. The shared memory exchange list can be

+seen below

+\begin{lstlisting}[language=fortran,escapechar=@,frame=single]

+@\colorbox{yellow}{type shared\_exchange\_list}@

+  integer :: blockID

+  integer :: nlist

+  @\colorbox{yellow}{integer, dimension(:), pointer :: srcList}@

+  @\colorbox{yellow}{integer, dimension(:), pointer :: destList}@

+  type (shared_exchange_list), pointer :: next

+@\colorbox{yellow}{end type shared\_exchange\_list}@

+\end{lstlisting}

+

+The new exchange list will be tied to a field which should be receiving data,

+therefore blockID now refers to the block that the data should be taken from.

+Since these are all shared memory copies the procID is not needed anymore, and

+because they are local copies no buffers are required to send/receive the data.

+However, the list field is now split into srcList and destList to allow the

+complement of two exchange lists within the shared memory context.
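To make the role of srcList and destList concrete, the following is a minimal sketch of a local copy driven by one shared memory exchange list. The derived type follows the proposed shared\_exchange\_list, with next declared as a pointer to the same type. The program name, the srcField and destField arrays, and the index values are assumptions made for this example and are not taken from MPAS.

```fortran
! Minimal sketch, not MPAS code: a block-to-block copy driven by the
! srcList/destList pair of a shared memory exchange list.
program shared_copy_sketch
  implicit none

  type shared_exchange_list
    integer :: blockID                          ! block the data is taken from
    integer :: nlist                            ! number of elements to copy
    integer, dimension(:), pointer :: srcList   ! indices within the source block
    integer, dimension(:), pointer :: destList  ! indices within the receiving block
    type (shared_exchange_list), pointer :: next
  end type shared_exchange_list

  type (shared_exchange_list) :: copyList
  real, dimension(5) :: srcField    ! hypothetical field data on the source block
  real, dimension(6) :: destField   ! hypothetical field data on the receiving block
  integer :: i

  srcField = (/ 10.0, 20.0, 30.0, 40.0, 50.0 /)
  destField = 0.0

  ! Hypothetical case: cells 2 and 4 of block 7 fill cells 5 and 6 here.
  copyList % blockID = 7
  copyList % nlist = 2
  allocate(copyList % srcList(2), copyList % destList(2))
  copyList % srcList = (/ 2, 4 /)
  copyList % destList = (/ 5, 6 /)
  nullify(copyList % next)

  ! Direct copy: no pack/unpack buffers and no MPI request are needed.
  do i = 1, copyList % nlist
    destField(copyList % destList(i)) = srcField(copyList % srcList(i))
  end do

  print *, 'copied values:', destField(5), destField(6)

  deallocate(copyList % srcList, copyList % destList)
end program shared_copy_sketch
```

Because both index lists live in the same structure, the receiving side needs no matching send list on the source block.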

+

+Changes within the framework of MPAS can be seen below. To begin, a name change

+to the dmpar module is proposed. The new name will be mpas\_comm.F rather than

+mpas\_dmpar.F.

+

+\begin{lstlisting}[language=fortran,escapechar=@,frame=single]

 mpas_dmpar_alltoall_field

 mpas_dmpar_exch_halo_field

 mpas_dmpar_get_owner_list

@@ -144,6 +243,79 @@

 be added where they are not currently in place. Also, shared memory copies

 within local blocks need to be added.

+The old allToAll interfaces will have to change in order to accommodate the new

+field data types. The old interface looks like

+\begin{lstlisting}[language=fortran,escapechar=@,frame=single]

+subroutine mpas_dmpar_alltoall_field1d_real(dminfo, arrayIn, 

+                          arrayOut, nOwnedList, nNeededList, 

+                          sendList, recvList)

+\end{lstlisting}

+

+In this case, arrayIn and arrayOut are simple arrays representing 1d real

+values (there are other interfaces for integers, chars, and multi-dimensional

+arrays but they are all similar), nOwnedList and nNeededList are integers

+representing the sizes of arrayIn and arrayOut respectively, and sendList and

+recvList are exchange lists describing how the data from arrayIn needs to be

+communicated into arrayOut.

+

+The proposed new interface looks like

+

+\begin{lstlisting}[language=fortran,escapechar=@,frame=single]

+subroutine mpas_dmpar_alltoall_field1d_integer(dminfo, 

+                                    fieldIn, fieldOut)

+\end{lstlisting}

+

+Where fieldIn and fieldOut are pointers to the fields that need to be

+communicated. In this case the exchange lists are stored within the field, or

+possibly the field \% block data structure. Each of the two fields represents a

+linked list of fields, where each of the fields in this linked list is the

+field for a given block. Each of the fields in the linked list also has its

+own unique exchange lists.
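The linked list of per-block fields described above can be sketched as follows. This is only an illustration under stated assumptions: the field1d\_sketch type is a hypothetical stand-in for the real MPAS field types, which carry more components (including the exchange lists), and the block count and array sizes are invented.

```fortran
! Minimal sketch, not MPAS code: one field node per local block,
! chained through a next pointer and visited in a loop.
program field_list_sketch
  implicit none

  ! Hypothetical stand-in for an MPAS 1d integer field type.
  type field1d_sketch
    integer :: blockID
    integer, dimension(:), pointer :: array
    type (field1d_sketch), pointer :: next
  end type field1d_sketch

  type (field1d_sketch), pointer :: fieldHead, fieldPtr, newField
  integer :: iBlock

  nullify(fieldHead)

  ! Build the list for three local blocks by prepending each new node.
  do iBlock = 3, 1, -1
    allocate(newField)
    newField % blockID = iBlock
    allocate(newField % array(iBlock + 1))
    newField % array = iBlock
    newField % next => fieldHead
    fieldHead => newField
  end do

  ! Visit the field of every block, as a communication routine would.
  fieldPtr => fieldHead
  do while (associated(fieldPtr))
    print *, 'block', fieldPtr % blockID, 'holds', size(fieldPtr % array), 'values'
    fieldPtr => fieldPtr % next
  end do

  ! Release the list.
  do while (associated(fieldHead))
    fieldPtr => fieldHead % next
    deallocate(fieldHead % array)
    deallocate(fieldHead)
    fieldHead => fieldPtr
  end do
end program field_list_sketch
```

Passing only the head pointer is what lets the proposed interface drop the separate size and exchange list arguments.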

+

+In order to handle allToAll communications, the following pseudo code is used

+\begin{lstlisting}[language=fortran,escapechar=@,frame=single]

+loop over fieldOut list

+  loop over recvList for specific field

+    initiate mpi_irecv for specific recvList

+  end loop

+end loop

+

+loop over fieldIn list

+  loop over sendList for specific field

+    if sendList % procID == dminfo % my_proc_id

+      loop over fieldOut list

+        loop over copyList for specific field

+          if copyList % blockID == fieldIn_ptr % block % blockID

+            copy data from fieldIn_ptr to fieldOut_ptr

+          end if

+        end loop

+      end loop

+    else

+      pack data from fieldIn_ptr

+      initiate mpi_isend for specific sendList

+    end if

+  end loop

+end loop

+

+loop over fieldOut list

+  loop over recvList for specific field

+    wait for mpi_irecv to finish

+    unpack data into fieldOut_ptr

+  end loop

+end loop

+

+loop over fieldIn list

+  loop over sendList for specific field

+    wait for mpi_isend to finish

+  end loop

+end loop

+\end{lstlisting}

+

+The only changes within the mpas\_dmpar\_exch\_halo\_field are internal, and

+only refer to the addition of checking copyList for shared memory copies. These

+changes should be similar to the internal changes to the allToAll routine

+presented within the pseudocode above.

+

 The whole structure of mpas\_dmpar\_get\_owner\_list has to change in order to

 support multiple blocks. This routine currently builds the exchange lists for a

 single block and has global communications. In order to handle the creation of

@@ -155,6 +327,40 @@

 the data types used in the creation of these routines in line with how the rest

 of MPAS deals with fields.

+The previous interface for mpas\_dmpar\_get\_owner\_list can be seen below

+

+\begin{lstlisting}[language=fortran,escapechar=@,frame=single]

+subroutine mpas_dmpar_get_owner_list(dminfo,

+                    nOwnedList, nNeededList,

+                      ownedList, neededList,

+               sendList, recvList, inOffset)

+\end{lstlisting}

+

+where in this case, ownedList and neededList are simple arrays representing

+indices of owned and needed elements, nOwnedList and nNeededList are the number

+of elements in each respective list, sendList and recvList are output fields to

+represent the send and receive lists for the communications between these

+fields, and inOffset is an offset for receive lists.

+

+The proposed new interface for this can be seen below

+\begin{lstlisting}[language=fortran,escapechar=@,frame=single]

+subroutine mpas_get_exchange_lists(dminfo, ownedListField, 

+                         ownedDecomposed, neededListField, 

+                            neededDecomposed, offSetField)

+\end{lstlisting}

+where ownedListField and neededListField are pointers to linked lists of 1d

+integer fields, ownedDecomposed and neededDecomposed are logical flags

+determining if the *ListField is decomposed using mpas\_block\_decomp or if

+there is one block per processor from the field, and offSetField represents a

+pointer to a linked list of 0d integer fields determining each block's receive

+list offset.

+

+The two major differences here are that the input data are given as fields

+rather than arrays, and the send/recv/copy lists are not output separately from

+the fields; instead they are stored within the field structure. However, these

+can be modified to put the send/recv/copy lists within the block rather than

+the field.

+

 mpas\_input\_state\_for\_domain also has to be modified in order to setup the

 fields and blocks as required for mpas\_dmpar\_get\_owner\_list, and to create

 multiple blocks. Currently it is written under the assumption that only one

@@ -173,10 +379,22 @@

               nEdgesOnCell_nHalos, cellsOnCell_nHalos)

 \end{lstlisting}

-At the time of writing this document, this routine can be seen within the

-src/framework/mpas\_block\_decomp.F module, but this may change when a new

-module is created.

+In addition to these changes, the dimension variable nCellsSolve will now refer

+to an array. This array will contain the index to the end of a given halo. For

+example, if one wanted to do some computation over all 0 halo cells (i.e., owned

+cells) the max index would be nCellsSolve(1), while computations to the end of

+the 1 halo would use nCellsSolve(2). In order to accommodate this, the

+indexToCellID array will also be packed appropriately, meaning the first

+nCellsSolve(1) indices will all be 0 halo cells while the next

+nCellsSolve(2)-nCellsSolve(1) cells will be the 1 halo cells, and so on

+until the max halo number is reached.

+Because this routine relates specifically to cells, within the routine exchange

+lists are created for cells and stored within each block's cellsToSend,

+cellsToRecv, and cellsToCopy variables. This is done to keep in line with the

+previously identified requirement, and because the exchange lists that are

+actually used belong within this structure.
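The halo-indexed nCellsSolve array and the packed indexToCellID ordering described above can be sketched as follows. The array contents, the halo count of two, and the program name are assumptions chosen for illustration; only the indexing convention comes from the text.

```fortran
! Minimal sketch, not MPAS code: nCellsSolve(k) holds the last local
! index of halo layer k-1, and indexToCellID is packed in that order.
program halo_index_sketch
  implicit none

  integer, parameter :: nHalos = 2
  integer, dimension(nHalos + 1) :: nCellsSolve
  integer, dimension(10) :: indexToCellID   ! hypothetical global cell IDs
  integer :: iHalo, iCell, firstCell, nOwned

  ! 5 owned (0 halo) cells, 3 cells in the 1 halo, 2 cells in the 2 halo.
  nCellsSolve = (/ 5, 8, 10 /)
  indexToCellID = (/ 12, 13, 14, 20, 21, 7, 8, 22, 3, 30 /)

  ! A computation over owned cells only stops at nCellsSolve(1).
  nOwned = 0
  do iCell = 1, nCellsSolve(1)
    nOwned = nOwned + 1
  end do
  print *, 'owned cells:', nOwned

  ! Each halo layer occupies a contiguous index range.
  firstCell = 1
  do iHalo = 0, nHalos
    print *, 'halo', iHalo, 'global IDs:', indexToCellID(firstCell:nCellsSolve(iHalo + 1))
    firstCell = nCellsSolve(iHalo + 1) + 1
  end do
end program halo_index_sketch
```

With this packing, extending a loop by one halo layer only changes the upper bound of the loop, not the loop body.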

+

 One general change that has to be made in order to support these field data

 types being used in the input stage of MPAS is the addition of a deallocate

 field routine. This routine would be used to deallocate all fields within a

@@ -192,10 +410,26 @@

 modified to be local indices. 

 As mentioned in the requirements section, a large portion of these changes

-might be pushed into a new module. This new module would be written to handle

-the management of blocks. The proposed name would be mpas\_block\_manager.

+should be made so that they are isolated from the remainder of MPAS. In order to meet

+this requirement, a new module will be created named mpas\_block\_creator. 

+The creation of this new module should allow a reorganization of the input and

+output routines. It should also allow the code written for the creation of

+blocks to be more transparent to other MPAS developers.

+

+One issue that comes up with the creation of this new module is that MPI calls

+are now required within a module that's external to mpas\_dmpar. Previously all

+MPI calls were isolated within the dmpar module; however, because of some

+circular dependency issues that come up now, this is no longer the case. Now,

+MPI calls can be restricted to the dmpar module and the block creation module.

+

 \chapter{Testing}

+\textbf{NOTE:}

+All of the testing described in this section relates only to the ocean core.

+Other core developers may test this with similar procedures but different

+simulations. \\

+

+

 The end goal of this project is to provide a framework that allows

 bit-for-bit reproduction of data using an arbitrary combination of blocks and

 processor numbers.

@@ -204,13 +438,21 @@

 exploring bit-for-bit reproduction of output data using the following four

 simulations:

 \begin{itemize}

-        \item Current trunk simulation run with 8 processors and 8 blocks

-        \item Finished branch simulation run with 8 processors and 8 blocks

-        \item Finished branch simulation run with 1 processor and 8 blocks

-        \item Finished branch simulation run with 2 processors and 8 blocks

+        \item Current trunk simulation run with 8 processors and 8 blocks (1 block per proc).

+        \item Finished branch simulation run with 8 processors and 8 blocks (1 block per proc).

+        \item Finished branch simulation run with 1 processor and 8 blocks (8 blocks per proc).

+        \item Finished branch simulation run with 2 processors and 8 blocks (4 blocks per proc).

 \end{itemize}

-If all of these simulations produce bit-for-bit output then the project would

-be deemed as completed.

+If all of these simulations produce bit-for-bit output, then testing can move on to a set of larger-scale simulations:

+\begin{itemize}

+        \item Current trunk 15km simulation with 1200 processors and 1200 blocks (1 block per proc).

+        \item Finished branch simulation with 1200 processors and 1200 blocks (1 block per proc).

+        \item Finished branch simulation with 600 processors and 1200 blocks (2 blocks per proc).

+        \item Finished branch simulation with 24 processors and 1200 blocks (50 blocks per proc).

+\end{itemize}

+

+Once these final four simulations show bit-for-bit output, the project can be deemed complete.

+

 \end{document}

</font>

</pre>