[Wrf-users] scalability of WRF w/OpenMPI

Marcella, Marc MMarcella at AIR-WORLDWIDE.COM
Sat Jul 2 09:42:36 MDT 2016


Hi Dom,

Thanks for the reply.  Yes, unfortunately we've already gone through all of the kernel TCP tuning for 10 Gb Ethernet, and for MPI over 10 Gb Ethernet.  We are coming nowhere near topping out the interconnects between the nodes, so we're open to other suggestions.  Do you see near-linear scaling for, say, 256, 512, 1024, etc. cores?  Or should we at least see something reasonably close?
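(For anyone who wants to sanity-check the node-to-node link independently of WRF, here is a minimal point-to-point bandwidth sketch using mpi4py.  It is purely illustrative and assumes mpi4py is built against the same OpenMPI install; launch it with two ranks placed on different nodes, e.g. via a hostfile.)

# bandwidth_check.py -- illustrative sketch only
from mpi4py import MPI
import numpy as np
import time

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

nbytes = 64 * 1024 * 1024              # 64 MB message
reps = 20
buf = np.zeros(nbytes, dtype=np.uint8)

comm.Barrier()
t0 = time.time()
for _ in range(reps):
    if rank == 0:
        comm.Send(buf, dest=1, tag=0)
        comm.Recv(buf, source=1, tag=1)
    elif rank == 1:
        comm.Recv(buf, source=0, tag=0)
        comm.Send(buf, dest=0, tag=1)
t1 = time.time()

if rank == 0:
    # each repetition moves nbytes in both directions
    print("~%.2f GB/s aggregate" % (2.0 * reps * nbytes / (t1 - t0) / 1e9))

If that number comes out well below what the 10 GbE link should deliver, the bottleneck is in the fabric or the MPI transport rather than in WRF itself.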

Marc


From: Dominikus Heinzeller [mailto:climbfuji at ymail.com]
Sent: Friday, July 01, 2016 11:14 AM
To: Marcella, Marc
Cc: wrf-users at ucar.edu
Subject: Re: [Wrf-users] scalability of WRF w/OpenMPI

Hi Marc,

One thing you could check is the interconnect over which the MPI traffic is routed.  In particular, if you use different interconnects for I/O and MPI (e.g. Ethernet for I/O and InfiniBand for MPI) and this is not configured correctly, MPI might actually use the Ethernet connection too.  Do you see similar problems with parallel benchmarks on your network?
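As a quick check (a rough sketch only, and it assumes mpi4py is available on your cluster), something like the script below prints which address each rank's hostname resolves to.  If those addresses sit on the management Ethernet subnet rather than the high-speed network, that is a strong hint the MPI traffic is going over the wrong interface.

# rank_addresses.py -- illustrative sketch only
import socket
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

host = socket.gethostname()
try:
    addr = socket.gethostbyname(host)   # address the hostname resolves to
except socket.gaierror:
    addr = "unresolved"

# gather everything on rank 0 so the report comes out in one place
info = comm.gather((rank, host, addr), root=0)
if rank == 0:
    for r, h, a in sorted(info):
        print("rank %d on %s -> %s" % (r, h, a))

With Open MPI you can also pin the TCP transport to a specific interface via the btl_tcp_if_include MCA parameter (e.g. --mca btl_tcp_if_include eth2, where eth2 stands in for whichever interface carries your fast network), which takes the routing question out of the picture.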

Cheers

Dom

On 1/07/2016, at 3:05 PM, Marcella, Marc <MMarcella at AIR-WORLDWIDE.COM> wrote:

Hi all,

I'm trying to run WRF 3.6.1 in parallel using OpenMPI on a Linux HPC; the build uses openmpi-1.6.5 with the PGI compiler 13.6-0 (64-bit).  I've tried option 3 when configuring, and what we're finding is that WRF doesn't seem to be scaling properly.  For a domain of 450x200, anything past 256 cores actually gives negative returns, and linear scaling stops at essentially 16 cores.  When we introduce the DM+SM option, we see better returns, but still no real scalability (4 cores = 30 hr simulated, 16 cores = 74 hr simulated, 256 cores = 432 hr simulated).
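(For context, here is a rough back-of-the-envelope look at how small the per-core patches get on this domain.  It is only a sketch: it assumes WRF splits the 450x200 grid into a roughly square arrangement of patches, and the halo width of 3 points is an assumption.)

# patch_size.py -- illustrative sketch only
import math

NX, NY = 450, 200        # grid points west-east, south-north
HALO = 3                 # assumed halo width in grid points

for cores in (4, 16, 64, 256, 1024):
    # most "square" factorization of the core count, larger factor on the longer axis
    small = max(d for d in range(1, int(math.sqrt(cores)) + 1) if cores % d == 0)
    big = cores // small
    px, py = NX / big, NY / small
    ratio = 2.0 * HALO * (px + py) / (px * py)
    print("%5d cores: patch ~%4.0f x %3.0f, halo/interior ~ %.2f"
          % (cores, px, py, ratio))

By 256 cores each patch is only about 28 x 13 points, so the halo cells exchanged every step approach the number of interior cells actually being computed.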

From what I've researched online, it appears WRF should scale much better, close to linear, up to a significantly larger number of cores, particularly in DM mode.  Are there any known issues with WRF 3.6, certain versions of MPI we should use, or anything else we should be aware of to get parallel WRF to scale properly?

Any experience shared would be greatly appreciated…

Thanks,
Marc




