<p><b>dwj07@fsu.edu</b> 2012-01-27 11:04:19 -0700 (Fri, 27 Jan 2012)</p><p><br>
        -- BRANCH COMMIT --<br>
<br>
        Updating design doc.<br>
</p><hr noshade><pre><font color="gray">Modified: branches/omp_blocks/docs/mpas_block_decomp_redesign.pdf
===================================================================
(Binary files differ)
Modified: branches/omp_blocks/docs/mpas_block_decomp_redesign.tex
===================================================================
--- branches/omp_blocks/docs/mpas_block_decomp_redesign.tex        2012-01-27 00:25:04 UTC (rev 1429)
+++ branches/omp_blocks/docs/mpas_block_decomp_redesign.tex        2012-01-27 18:04:19 UTC (rev 1430)
@@ -70,9 +70,22 @@
This document concerns the last item, namely, the extensions to the block decomposition
module that will be necessary for supporting multiple blocks per task in other infrastructure
-modules.
+modules. \\
+For the broader scope of this project, the intent of these five previously
+detailed tasks is to provide the capabilities within MPAS to support PIO and
+simulations where the number of blocks in a decomposition is not equal to the
+number of MPI tasks. For example, a simulation could run on 16 processors with
+a total of 64 blocks, as opposed to the current framework where only 16 blocks
+can run on 16 processors. \\
+After these tasks are implemented, shared memory parallelism can be implemented
+at the core level to (hopefully) improve performance, but also to allow greater
+flexibility in terms of the parallel infrastructure of MPAS. \\
+
+As a rough timeline, these five tasks are planned to be completed by the end of February 2012.
+
+
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Requirements
@@ -86,12 +99,20 @@
\begin{itemize}
-        \item Domain decomposition information needs to be read from a file based
-                on number of total blocks as opposed to number of MPI tasks.
+        \item The user must be able to specify the number of blocks in a
+                simulation.
-        \item Block decomposition routines need to provide a list of cells in a on
-                a processor, as well as a list of blocks on a processor, and the cells
-                that make up each block.
+        \item Block decomposition modules must provide information describing the
+                cell and block relationship for a given MPI task.
+
+        \item Block decomposition modules need to be flexible enough to support
+                multiple methods of acquiring a decomposition.
+
+        \item Block decomposition modules need to support a different number of
+                blocks than MPI tasks, even when they are not evenly divisible.
+
+        \item Block decomposition modules should provide an interface to map a
+                global block number to a local block number and an owning processor number.
\end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -126,13 +147,16 @@
The api for mpas\_block\_decomp\_cells\_for\_proc will change from
\begin{lstlisting}[language=fortran,escapechar=@,frame=single]
-@\small{subroutine mpas\_block\_decomp\_cells\_for\_proc(dminfo, partial\_global\_graph\_info, local\_cell\_list)}@
+subroutine mpas_block_decomp_cells_for_proc(dminfo, &
+ partial_global_graph_info, local_cell_list)
\end{lstlisting}
to
\begin{lstlisting}[language=fortran,escapechar=@,frame=single]
-@\small{subroutine mpas\_block\_decomp\_cells\_for\_proc(dminfo, partial\_global\_graph\_info, local\_cell\_list, \colorbox{yellow}{local\_block\_list})}@
+subroutine mpas_block_decomp_cells_for_proc(dminfo, &
+ partial_global_graph_info, @\colorbox{yellow}{cellsOnCell}@, &
+ local_cell_list, @\colorbox{yellow}{local\_block\_list}@)
\end{lstlisting}
where local\_cell\_list is a list of cells owned by a processor, and
@@ -149,22 +173,84 @@
block number is as follows.
\begin{lstlisting}[language=fortran,escapechar=@,frame=single]
-blocks_per_proc = config_number_of_blocks / dminfo % nprocs
+subroutine mpas_get_blocks_per_proc(dminfo, blocks_per_proc)
+ type(domain_info), intent(in) :: dminfo
+ integer, dimension(:), pointer :: blocks_per_proc
+ integer :: blocks_per_proc_min, even_blocks, remaining_blocks
+ integer :: i
-local_block_id = mod(global_block_id, blocks_per_proc)
-owning_proc = (global_block_id) / blocks_per_proc
+ allocate(blocks_per_proc(dminfo % nprocs))
+
+ blocks_per_proc_min = config_number_of_blocks / dminfo % nprocs
+ remaining_blocks = config_number_of_blocks - &
+ (blocks_per_proc_min * dminfo % nprocs)
+ even_blocks = config_number_of_blocks - remaining_blocks
+
+ do i = 1, dminfo % nProcs
+ blocks_per_proc(i) = blocks_per_proc_min
+ if (i .le. remaining_blocks) then
+ blocks_per_proc(i) = blocks_per_proc(i) + 1
+ end if
+ end do
+end subroutine mpas_get_blocks_per_proc
\end{lstlisting}
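The even-plus-remainder distribution above can be checked outside of Fortran. The following Python sketch mirrors the arithmetic of mpas\_get\_blocks\_per\_proc (the function name and 0-based ranks are illustrative, not MPAS code): each rank receives the base count, and the first remainder ranks receive one extra block.

```python
def blocks_per_proc(num_blocks, nprocs):
    """Distribute num_blocks over nprocs ranks: every rank gets the base
    count, and the first (num_blocks mod nprocs) ranks get one extra block."""
    base = num_blocks // nprocs
    remainder = num_blocks % nprocs
    return [base + 1 if rank < remainder else base for rank in range(nprocs)]
```

For example, 64 blocks on 16 processors yields 4 blocks on every rank, while 10 blocks on 3 processors yields an uneven [4, 3, 3].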
-An array called global\_block\_list needs to be added, which will be similar to
-global\_cell\_list except that instead of storing cell ids it will store global
-block ids that should be paired with the cell id. For example,
-global\_block\_list(10) gives the global block number that owns the cell from
-global\_cell\_list(10). \\
+\begin{lstlisting}[language=fortran,escapechar=@,frame=single]
+subroutine mpas_get_local_block_id(dminfo, &
+ global_block_number, local_block_number)
+ type(domain_info), intent(in) :: dminfo
+ integer, intent(in) :: global_block_number
+ integer, intent(out) :: local_block_number
+ integer :: blocks_per_proc_min, even_blocks, remaining_blocks
-An additional mpas\_dmpar\_scatter\_ints needs to be added to communicate the
-global\_block\_list into nprocs copies of local\_block\_list containing a list
-of the global block ids that a processor owns. \\
+ blocks_per_proc_min = config_number_of_blocks / dminfo % nprocs
+ remaining_blocks = config_number_of_blocks - &
+ (blocks_per_proc_min * dminfo % nprocs)
+ even_blocks = config_number_of_blocks - remaining_blocks
+ if (global_block_number >= even_blocks) then
+ local_block_number = blocks_per_proc_min
+ else
+ local_block_number = mod(global_block_number, blocks_per_proc_min)
+ end if
+end subroutine mpas_get_local_block_id
+\end{lstlisting}
+
+\pagebreak
+
+\begin{lstlisting}[language=fortran,escapechar=@,frame=single]
+subroutine mpas_get_owning_proc(dminfo, &
+ global_block_number, owning_proc)
+ type(domain_info), intent(in) :: dminfo
+ integer, intent(in) :: global_block_number
+ integer, intent(out) :: owning_proc
+ integer :: blocks_per_proc_min, even_blocks, remaining_blocks
+
+ blocks_per_proc_min = config_number_of_blocks / dminfo % nprocs
+ remaining_blocks = config_number_of_blocks - &
+ (blocks_per_proc_min * dminfo % nprocs)
+ even_blocks = config_number_of_blocks - remaining_blocks
+
+ if (global_block_number >= even_blocks) then
+ owning_proc = global_block_number - even_blocks
+ else
+ owning_proc = global_block_number / blocks_per_proc_min
+ end if
+end subroutine mpas_get_owning_proc
+\end{lstlisting}
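The two lookups above can be combined into a single mapping. This Python sketch (names hypothetical, 0-based ids throughout) follows the same convention as the Fortran sketches: the first even\_blocks blocks are dealt out blocks\_per\_proc\_min at a time, and each remaining block becomes the extra block of ranks 0, 1, and so on.

```python
def owner_and_local_id(global_block, num_blocks, nprocs):
    """Map a 0-based global block id to (owning_proc, local_block_id).
    Blocks below even_blocks are assigned base-at-a-time to each rank;
    each remaining block is the (base+1)-th block of ranks 0, 1, ..."""
    base = num_blocks // nprocs
    remainder = num_blocks % nprocs
    even_blocks = num_blocks - remainder
    if global_block >= even_blocks:
        # extra block: one per rank, local id comes after the base blocks
        return global_block - even_blocks, base
    return global_block // base, global_block % base
```

With 10 blocks on 3 processors (counts [4, 3, 3]), block 9 is the extra block of rank 0 with local id 3, while blocks 0 through 8 fall in the evenly divided region.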
+
+In this case, the variable blocks\_per\_proc\_min is a module variable.
+Additional module variables will be added as integer vectors of length nProcs
+that describe the mapping from global block id to local block id and owning
+processor number. \\
+
+In addition to this ad-hoc method of determining which blocks belong to which
+processors, a file-based method will be added. This method will be toggled by
+a logical namelist option named config\_block\_decomp\_file. If this option is
+true, a file (proc.graph.info.part.N) will be provided, where N is the number
+of processors. This file will have one line per block, and each line gives the
+processor that should own the corresponding block. This file can be created
+externally using METIS. \\
+
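As a sketch of consuming such a file, the following Python reads one owning-processor number per line, so the list index is the global block id. The parsing here is an assumption based only on the format described above, not an existing MPAS routine.

```python
def read_block_owners(path):
    """Read a block decomposition file: line i holds the MPI rank that
    owns global block i (blank lines are skipped)."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]
```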
%\begin{lstlisting}[language=fortran,escapechar=@,frame=single]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -172,8 +258,11 @@
% Implementation
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\chapter{Implementation}
+\chapter{Testing}
+Only limited testing can be performed on this task. Since this task alone
+doesn't allow the use of multiple blocks, the only testing that can really be
+performed is to provide a mismatched number of blocks and MPI tasks and verify
+that the block decomposition routines provide the correct block numbers for a
+processor and place the cells in their correct blocks.
-Should we outline a plan for implementing these changes?
-
\end{document}
</font>
</pre>