[Wrf-users] Re: Wrf-users Digest, Vol 20, Issue 1
Wrfhelp
wrfhelp at ucar.edu
Wed Apr 5 14:49:57 MDT 2006
Hi,
Some comments on this enquiry have been posted on the WRF Users Forum page.
For more info visit:
http://tornado.meso.com/wrf_forum/index.php?showtopic=430
Hope this helps,
--Wrfhelp
On Mon, Apr 03, 2006 at 12:00:04PM -0600, wrf-users-request at ucar.edu wrote:
> Send Wrf-users mailing list submissions to
> wrf-users at ucar.edu
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://mailman.ucar.edu/mailman/listinfo/wrf-users
> or, via email, send a message with subject or body 'help' to
> wrf-users-request at ucar.edu
>
> You can reach the person managing the list at
> wrf-users-owner at ucar.edu
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Wrf-users digest..."
>
>
> Today's Topics:
>
> 1. mpirun giving unexpected results (Brian.Hoeth at noaa.gov)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 03 Apr 2006 12:01:14 -0500
> From: Brian.Hoeth at noaa.gov
> Subject: [Wrf-users] mpirun giving unexpected results
> To: wrf-users at ucar.edu
> Message-ID: <59e10599da.599da59e10 at noaa.gov>
> Content-Type: text/plain; charset=us-ascii
>
> Hello,
>
> The post below was sent to the online WRF Users Forum by one of our
> software support group members (Brice), so I will cut and paste it
> here to see if we get any replies on this list as well.
>
> Thanks,
> Brian Hoeth
> Spaceflight Meteorology Group
> Johnson Space Center
> Houston, TX
> 281-483-3246
>
>
>
> The Spaceflight Meteorology Group here at Johnson Space Center has
> recently acquired a small Linux-based cluster to run the WRF-NMM in
> support of Space Shuttle operations. I am the software support lead
> and have been running some 'bench' testing on the system. The results
> of the tests have raised some questions that I would appreciate help
> in answering.
>
> I may not have the exact details of the configuration of the model run
> here, but the SMG folks will probably supply that if more information
> is needed. The testing involved running the WRF-NMM at 4 km
> resolution over an area around New Mexico, using the real-data test
> case downloaded from the WRF-NMM users' site.
>
> The cluster is composed of a head node with dual hyper-threading Intel
> Xeons at 3.2 GHz and 16 subnodes with dual Intel Xeons at 3.2 GHz. All
> of the subnodes NFS-mount the head node's home directory. Communication
> between the nodes is via Gigabit Ethernet.
>
> The WRF-NMM package was installed using PGI CDK 6.0, as were MPICH and
> netCDF. One thing I ran into during the installation was a mismatch: I
> had started out building the supporting packages with the 32-bit PGI
> compilers, while the WRF build defaulted to 64-bit. That was corrected,
> and all of the software associated with the model (MPICH, netCDF,
> real_nmm.exe and wrf.exe) is now compiled with 64-bit support. The head
> node is running RHEL AS 3.4 and the compute nodes are running RHEL WS
> 3.4.
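>
> (A quick sanity check for the 32-bit/64-bit mix-up, with purely
> illustrative paths, is to inspect the executables and their linked
> libraries:
>
>   file wrf.exe real_nmm.exe   # both should report "ELF 64-bit"
>   ldd wrf.exe                 # every listed library should resolve to a 64-bit build
>
> Any entry still pointing at a 32-bit library would be a leftover from
> the earlier 32-bit install.)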
>
> Ok, that's the basic background to jump past all of those questions.
> Additional information: I have not tried any of the debugging tools
> yet; I am using /usr/bin/time -v to gather timing data; and I am not
> using any scheduling applications such as OpenPBS, just mpirun and
> various combinations of machine and process files. I have the time
> results and the actual command lines captured and can supply them if
> someone needs them. One last bit of background: I am not a long-term
> cluster programmer (20+ years programming in Fortran and other
> languages, but not on clusters), nor a heavy-duty Linux administrator
> (though that is changing rapidly, and I have several years of HP-UX
> administration experience). So now you know some measure of how many
> questions I will ask before I understand the answers I get ;-) The SMG
> has had a Beowulf cluster for a couple of years, but my group was
> giving it minimal admin support. So I, like any good programmer, am
> looking for 'prior art' and experience.
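>
> For what it is worth, the timings below were gathered roughly along
> these lines (the log file name is just an example):
>
>   /usr/bin/time -v -o time-np32.log mpirun -np 32 ./wrf.exe
>
> so the "wall time" figures are the elapsed times reported in the
> /usr/bin/time resource summary.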
>
> Here are some of the summarized results, and then I will get to the
> questions (a note on the machinefile format follows the results):
>
> WRF-NMM run with 1 process on head node and 31 processes on subnodes
> 'mpirun -np 32 ./wrf.exe'
> 13:21.32 wall time (all times from the head node's perspective)
>
> WRF-NMM run with 3 processes on head node and 32 processes on subnodes
> 'mpirun -p4pg PI-35proc ./wrf.exe'
> 13:53.70 wall time
>
> WRF-NMM run with 1 process on head node and 15 processes on subnodes
> 'mpirun -np 16 ./wrf.exe'
> 14:09.29 wall time
>
> WRF-NMM run with 1 process on head node and 7 processes on subnodes
> 'mpirun -np 8 ./wrf.exe'
> 20:08.88 wall time
>
> WRF-NMM run with NO processes on head node and 16 processes on subnodes
> 'mpirun -np 16 -nolocal -machinefile wrf-16p.machines ./wrf.exe'
> 1:36:56 - an hour and a half of wall time
>
> and finally, two simultaneous runs of the model, each with 1 process on
> the head node and 15 processes pushed out to separate banks of the
> compute nodes:
>
> 'mpirun -np 16 -machinefile wrf-16p-plushead.machines ./wrf.exe'
> 17:27.70 wall time
> 'mpirun -np 16 -machinefile wrf-16p-test2.machines ./wrf.exe'
> 17:08.21 wall time
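>
> For reference, the machinefiles above follow the standard MPICH
> (ch_p4) format, one "hostname" or "hostname:processes" entry per line;
> the hostnames here are made up purely for illustration:
>
>   # wrf-16p.machines -- 16 processes over 8 dual-CPU subnodes
>   node01:2
>   node02:2
>   node03:2
>   node04:2
>   node05:2
>   node06:2
>   node07:2
>   node08:2
>
> Without -nolocal, MPICH normally starts the first process on the
> machine where mpirun is invoked (the head node here), which is how the
> "1 process on head node" runs were arranged; -nolocal pushes all of the
> ranks onto the machinefile hosts.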
>
> The results that raise questions are the minimal difference between 16
> and 32 processes (and, in fact, 8 processes), and the huge difference
> when no processes are placed on the head node. Taking the last case
> first, my thought, based on some web research, is that the difference
> between NFS and local writes could be influencing the time, but could
> it instead be a shared-memory issue?
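>
> One quick way to test the NFS-versus-local-write idea (the paths are
> illustrative; adjust them to the real mount points) is to compare a
> large sequential write from a subnode into the NFS-mounted home area
> against one to a local disk:
>
>   # run on a compute node
>   time dd if=/dev/zero of=$HOME/nfs_write_test bs=1M count=1024
>   time dd if=/dev/zero of=/tmp/local_write_test bs=1M count=1024
>
> If the NFS rate is dramatically lower, having the model's output
> written across NFS instead of locally on the head node could plausibly
> account for a large part of that hour-and-a-half run.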
>
> Going back to the base issue of how the number of processes influences
> the run time: does anyone have other experience with the scaling of WRF
> on larger or smaller clusters (I did note one report in an earlier
> post, but I am unsure what to make of those results at this point)? I
> did look at the graph that was referred to, but we are a much smaller
> shop than most of the tests shown there. Can anybody suggest some
> tuning that might be useful, or a tool that would help in gaining a
> better understanding of what is going on and what to expect if (when)
> the users expand their activities?
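>
> As a rough way to frame the scaling question, the wall times above
> convert to the following speedups (seconds, computed with bc):
>
>   T8=1208.88; T16=849.29; T32=801.32   # 20:08.88, 14:09.29, 13:21.32
>   echo "scale=2; $T8/$T16" | bc        # speedup from 8 to 16 processes (about 1.4x)
>   echo "scale=2; $T16/$T32" | bc       # speedup from 16 to 32 processes (barely above 1x)
>
> Ideal scaling would be close to 2x at each doubling, so whatever is
> limiting the run beyond roughly 16 processes (communication, I/O, or
> decomposition overhead on a domain this size) is what any tuning or
> profiling effort would need to pin down.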
>
> Pardon the length of this post, but I figured it was better to get as
> many details out up front as possible.
>
> Thanks,
>
> Brice
>
>
>
>
> ------------------------------
>
> _______________________________________________
> Wrf-users mailing list
> Wrf-users at ucar.edu
> http://mailman.ucar.edu/mailman/listinfo/wrf-users
>
>
> End of Wrf-users Digest, Vol 20, Issue 1
> ****************************************