[Dart-dev] DART/branches Revision: 11952
dart at ucar.edu
Thu Sep 21 16:59:35 MDT 2017
nancy at ucar.edu
2017-09-21 16:59:34 -0600 (Thu, 21 Sep 2017)
918
add a namelist option to make a copy of the buffer before
a broadcast (there was already an option to do this for send/recv).
this is expected to be needed only for mpi implementations
which can't deal with array sections properly.
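
for illustration only, a minimal sketch of what the namelist entry could look
like (the actual defaults and any other items in your namelist file may differ,
and per the comment in mpi_utilities_mod.f90 the reading of this namelist has
to be enabled in the source before it takes effect):

   &mpi_utilities_nml
      make_copy_before_sendrecv  = .false.
      make_copy_before_broadcast = .true.
   /
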
rearrange the code in send, recv, and broadcast to combine
the common sections in a saner way depending on whether you are
making a copy or not. the copy option is now supported even if
a large transfer has to be broken up into multiple operations.
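
as a rough sketch only (not the actual mpi_utilities_mod.f90 code), the chunked
send with an optional contiguous copy could look something like the following,
reusing the names from send_to (srcarray, itemcount, SNDRCV_MAXSIZE, tmpdata,
datasize, dest_id, tag, my_local_comm, errcode); offset and nitems are local
helpers invented for this sketch:

   integer :: offset, nitems

   offset = 1
   do while (offset <= itemcount)
      ! never hand more than SNDRCV_MAXSIZE items to a single mpi call
      nitems = min(SNDRCV_MAXSIZE, itemcount - offset + 1)
      if (make_copy_before_sendrecv) then
         ! stage this chunk through a contiguous local buffer first
         tmpdata(1:nitems) = srcarray(offset : offset+nitems-1)
         call MPI_Ssend(tmpdata, nitems, datasize, dest_id, tag, &
                        my_local_comm, errcode)
      else
         ! pass the array section to the mpi library directly
         call MPI_Ssend(srcarray(offset : offset+nitems-1), nitems, datasize, &
                        dest_id, tag, my_local_comm, errcode)
      endif
      offset = offset + nitems
   enddo
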
you should *not* need to make these copies; they cost time
and space. however, when trying to debug strange mpi problems,
turning them on is a good initial test.
there is still an open question about what the max buffer limit should
be for a single operation; we currently use 2,000,000,000
(not 2 * 1024^3) for send, recv, and broadcast.
johnny says it might be necessary to drop this slightly for
broadcast with some of the mpi implementations on cheyenne.
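
for reference, some back-of-the-envelope numbers (mine, not from the commit;
the round number is presumably chosen to stay under the ceiling of a default
32-bit integer count argument):

   2 * 1024**3   = 2,147,483,648   (one more than 2**31 - 1 = 2,147,483,647)
   limit in use  = 2,000,000,000   (about 147 million items of headroom)
   example: a 5,000,000,000-item transfer splits into ceiling(5.0e9 / 2.0e9) = 3 operations
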
Modified: DART/branches/rma_trunk/assimilation_code/modules/utilities/mpi_utilities_mod.f90
===================================================================
--- DART/branches/rma_trunk/assimilation_code/modules/utilities/mpi_utilities_mod.f90 2017-09-21 22:41:58 UTC (rev 11951)
+++ DART/branches/rma_trunk/assimilation_code/modules/utilities/mpi_utilities_mod.f90 2017-09-21 22:59:34 UTC (rev 11952)
@@ -72,16 +72,16 @@
! this directory. It is a sed script that comments in and out the interface
! block below. Please leave the BLOCK comment lines unchanged.
- !!SYSTEM_BLOCK_EDIT START COMMENTED_IN
- ! interface block for getting return code back from system() routine
- interface
- function system(string)
- character(len=*) :: string
- integer :: system
- end function system
- end interface
- ! end block
- !!SYSTEM_BLOCK_EDIT END COMMENTED_IN
+! !!SYSTEM_BLOCK_EDIT START COMMENTED_OUT
+! ! interface block for getting return code back from system() routine
+! interface
+! function system(string)
+! character(len=*) :: string
+! integer :: system
+! end function system
+! end interface
+! ! end block
+! !!SYSTEM_BLOCK_EDIT END COMMENTED_OUT
! allow global sum to be computed for integers, r4, and r8s
@@ -159,9 +159,10 @@
! from any task will print regardless of this setting.
logical :: all_tasks_print = .false. ! by default only msgs from 0 print
-! make local copy for send/recv. was needed on an old, buggy version
+! make local array copy for send/recv/bcast. was needed on an old, buggy version
! of the mpi libs but seems unneeded now.
-logical :: make_copy_before_sendrecv = .false. ! should not be needed; .true. is very slow
+logical :: make_copy_before_sendrecv = .false. ! should not be needed; .true. is very slow
+logical :: make_copy_before_broadcast = .false. ! should not be needed; .true. is very slow
! NAMELIST: change the following from .false. to .true. to enable
! the reading of this namelist. This is the only place you need
@@ -171,7 +172,7 @@
namelist /mpi_utilities_nml/ reverse_task_layout, all_tasks_print, &
verbose, async2_verbose, async4_verbose, &
shell_name, separate_node_sync, create_local_comm, &
- make_copy_before_sendrecv
+ make_copy_before_sendrecv, make_copy_before_broadcast
contains
@@ -545,57 +546,57 @@
! use my task id as the tag; unused at this point.
tag = myrank
-if (make_copy_before_sendrecv) then
- if (itemcount > SNDRCV_MAXSIZE) then
- write(errstring, '(a,i12,a,i12)') "MPI msg contains ", itemcount, &
- " items; above single msg limit of ", SNDRCV_MAXSIZE
- write(errstring1, '(a)') "cannot make local copy; change 'make_copy_before_sendrecv' to .false. so msg can be sent in multiple chunks"
- call error_handler(E_ERR,'send_to', errstring, source, revision, revdate, &
- text2=errstring1)
- endif
+if (make_copy_before_sendrecv) allocate(tmpdata(min(itemcount, SNDRCV_MAXSIZE)))
- allocate(tmpdata(itemcount))
- ! this copy should be unneeded, but on the intel fortran 9.0 compiler and mpich
- ! on one platform, calling this routine with an array section resulted in some
- ! apparently random memory corruption. making a copy of the data into a local,
- ! contiguous buffer before send and receive fixed the corruption. this should
- ! be examined at some later time for any performance impact.
- tmpdata = srcarray
+if (itemcount <= SNDRCV_MAXSIZE) then
- if (verbose) write(*,*) "PE", myrank, ": send_to alloctmp ", itemcount, " dest ", dest_id
- call MPI_Ssend(tmpdata, itemcount, datasize, dest_id, tag, &
- my_local_comm, errcode)
+ if (verbose) write(*,*) "PE", myrank, ": send_to ", itemcount, " dest ", dest_id
+
+ if (.not. make_copy_before_sendrecv) then
+ call MPI_Ssend(srcarray, itemcount, datasize, dest_id, tag, &
+ my_local_comm, errcode)
+ else
+ ! this copy should be unneeded, but on the intel fortran 9.0 compiler and mpich
+ ! on one platform, calling this subroutine with an array section resulted in some
+ ! apparently random memory corruption. making a copy of the data into a local,
+ ! contiguous buffer before send and receive fixed the corruption. this shouldn't
+ ! have been needed, and is a performance/memory sink.
+
+ tmpdata = srcarray
+ call MPI_Ssend(tmpdata, itemcount, datasize, dest_id, tag, &
+ my_local_comm, errcode)
+ endif
else
! there are a few places in the code where we send/receive a full state vector.
! as these get really large, they may exceed the limits of the MPI library.
! break large messages up into smaller chunks if needed.
- if (itemcount > SNDRCV_MAXSIZE) then