[Wrf-users] scalability of WRF w/OpenMPI
Dominikus Heinzeller
climbfuji at ymail.com
Mon Jul 4 03:47:12 MDT 2016
Hi Marc,
Attached are scaling plots (parallel efficiency and wall-clock time versus the number of tasks and the number of grid points owned per task) for a domain with 500 x 330 grid points, obtained on the JURECA HPC at Research Centre Juelich (dm only, opt = -O3). At 512 tasks the parallel efficiency is down to only 60%. “Acceptable” scaling for my purposes extends to about 256 tasks for a domain 1.8 times larger than yours. These results were obtained for a single domain; nesting might change the picture.
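(By parallel efficiency I mean the usual E(N) = N_ref * T(N_ref) / (N * T(N)), i.e. the speed-up relative to a reference run with N_ref tasks divided by the corresponding increase in task count, where T is the measured wall-clock time.)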
How about the I/O in your runs? Do you get better scaling when I/O is switched off? If so, you could use I/O quilting.
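As a rough sketch, quilting is switched on in the &namelist_quilt section of namelist.input; the task counts below are placeholders that would need tuning for your domain and node count:

 &namelist_quilt
  nio_tasks_per_group = 4,
  nio_groups          = 1,
 /

The dedicated I/O servers (nio_tasks_per_group x nio_groups of them) are taken out of the total MPI task count, so the compute decomposition shrinks accordingly.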
Cheers
Dom
> On 2/07/2016, at 5:42 PM, Marcella, Marc <MMarcella at AIR-WORLDWIDE.COM> wrote:
>
> Hi Dom,
>
> Thanks for the reply. Yes, unfortunately we've already gone through all of the kernel TCP tuning for 10 Gb Ethernet, and for MPI over 10 Gb Ethernet. We are coming nowhere near saturating the interconnects between the nodes. We're open to other suggestions. Do you see near-linear scaling for, say, 256, 512, 1024, etc. cores? Or should we at least see something reasonably close?
>
> Marc
>
>
> From: Dominikus Heinzeller [mailto:climbfuji at ymail.com]
> Sent: Friday, July 01, 2016 11:14 AM
> To: Marcella, Marc
> Cc: wrf-users at ucar.edu
> Subject: Re: [Wrf-users] scalability of WRF w/OpenMPI
>
> Hi Marc,
>
> One thing you could check is the interconnect over which the MPI traffic is routed, in particular if you use different interconnects for I/O and MPI (e.g. Ethernet for I/O and InfiniBand for MPI): if this is not configured correctly, MPI might actually use the Ethernet connection, too. Do you see similar problems with parallel benchmarks on your network?
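> With Open MPI 1.6.x you can force the transport at run time to check this; something along these lines should work (the interface names are placeholders for your system):
>
>   # allow only the InfiniBand (openib) BTL plus shared memory and self
>   mpirun --mca btl openib,sm,self -np 256 ./wrf.exe
>
>   # or, if TCP has to be used, exclude the loopback and management interfaces explicitly
>   mpirun --mca btl tcp,sm,self --mca btl_tcp_if_exclude lo,eth0 -np 256 ./wrf.exe
>
> If the openib-only run aborts at startup, the MPI traffic was most likely going over Ethernet.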
>
> Cheers
>
> Dom
>
> On 1/07/2016, at 3:05 PM, Marcella, Marc <MMarcella at AIR-WORLDWIDE.COM> wrote:
>
> Hi all,
>
> I'm trying to run WRF 3.6.1 in parallel using OpenMPI on a Linux HPC, with an openmpi-1.6.5 build and the PGI compiler 13.6-0 (64-bit). I've tried configure option 3, and what we're finding is that WRF doesn't seem to be scaling properly. For a domain of 450 x 200 grid points, anything past 256 cores actually gives negative returns, and linear scaling stops at essentially 16 cores. When we introduce the DM+SM option, we see better returns, but we still don't see any real scalability (4 cores = 30 hr simulated, 16 cores = 74 hr simulated, 256 cores = 432 hr simulated).
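> By the DM+SM option I mean a hybrid MPI + OpenMP run, launched roughly along these lines (the 64 x 4 split is only an example):
>
>   export OMP_NUM_THREADS=4    # OpenMP threads per MPI task
>   mpirun -np 64 ./wrf.exe     # 64 MPI tasks x 4 threads = 256 cores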
>
> From what I have researched online, it appears that WRF should scale much better, close to linearly, up to a significantly larger number of cores, particularly in DM mode. Are there any known issues with WRF 3.6 or with certain versions of MPI we should use, or anything else we should be aware of, to scale parallel WRF properly?
>
> Any experience shared would be greatly appreciated…
>
> Thanks,
> Marc
>
>
> _______________________________________________
> Wrf-users mailing list
> Wrf-users at ucar.edu
> http://mailman.ucar.edu/mailman/listinfo/wrf-users
Attachments:
scaling_wrf1.pdf (application/pdf): http://mailman.ucar.edu/pipermail/wrf-users/attachments/20160704/dd434162/attachment-0002.pdf
scaling_wrf2.pdf (application/pdf): http://mailman.ucar.edu/pipermail/wrf-users/attachments/20160704/dd434162/attachment-0003.pdf