[Wrf-users] mpirun giving unexpected results

Brian.Hoeth at noaa.gov
Tue Apr 11 12:41:07 MDT 2006


All,

Thanks to all who replied to my post!  Your comments and suggestions 
were very helpful.  It turns out that it was basically an NFS issue. 
For those who did not see Brice's post on the online WRF Users Forum, 
I have pasted it below. 


I want to thank all of you who replied. Working through your comments, 
some other web information, and the data, we determined that the 
problem was indeed NFS tuning. The default NFS installation on the 
nodes was not what the documentation indicates: it was actually running 
with a 4K read/write size and synchronous writes, and only 4 nfsd's 
were started. By increasing the number of nfsd's to 32, increasing the 
read/write block size to 16K, and setting the exports and mounts to 
async, the times went from 1.5+ hours (with no local processes on the 
head node) to 13 minutes and a few seconds.
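
For anyone wanting to make the same changes, they correspond roughly to 
the following (the export path, subnet, and hostname below are 
illustrative rather than our exact configuration, and the file 
locations assume the stock RHEL init scripts):

  # /etc/sysconfig/nfs on the head node: run 32 server threads instead of 4
  RPCNFSDCOUNT=32

  # /etc/exports on the head node: allow asynchronous writes
  /home  192.168.1.0/255.255.255.0(rw,async)

  # /etc/fstab entry on each subnode: 16K read/write blocks, async writes
  headnode:/home  /home  nfs  rw,async,rsize=16384,wsize=16384  0 0

followed by 'exportfs -ra' on the head node and a remount of /home on 
the subnodes.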

Thanks again!

Brice 



Thanks,
Brian Hoeth
Spaceflight Meteorology Group
Johnson Space Center 
Houston, TX
281-483-3246



----- Original Message -----
From: Brian.Hoeth at noaa.gov
Date: Monday, April 3, 2006 12:01 pm
Subject: [Wrf-users] mpirun giving unexpected results

> Hello,
> 
> The post below was sent to the online WRF Users Forum by one of our 
> software support group members (Brice), so I will just cut and paste 
> the post here to see if we get any replies here also.
> 
> Thanks,
> Brian Hoeth
> Spaceflight Meteorology Group
> Johnson Space Center 
> Houston, TX
> 281-483-3246
> 
> 
> 
> The Spaceflight Meteorology Group here at Johnson Space Center has 
> recently acquired a small Linux-based cluster to run the WRF-NMM in 
> support of Space Shuttle operations. I am the software support lead 
> and have been running some 'bench' testing on the system. The results 
> of the tests have raised some questions that I would appreciate help 
> in answering.
> 
> I may not have the exact details of the configuration of the model 
> run here, but the SMG folks will probably supply that if more 
> information is needed. The testing involved running the WRF-NMM at a 
> 4 km resolution over an area around New Mexico, using the real-data 
> test case downloaded from the WRF-NMM users' site.
> 
> The cluster is composed of a head node with dual hyper-threading 
> Intel Xeons at 3.2 GHz and 16 subnodes with dual Intel Xeons at 
> 3.2 GHz. All of the subnodes mount the head node's home directory. 
> Communication between the nodes is via Gigabit Ethernet.
> 
> The WRF-NMM package was installed using the PGI CDK 6.0, as were 
> MPICH and netCDF. One thing that I ran into in the installation was a 
> mismatch between the pieces I started out building with the 32-bit 
> PGI compilers and WRF itself, which chose to build as 64-bit. That 
> was corrected, and all of the software packages associated with the 
> model (MPICH, netCDF, real-nmm.exe and wrf.exe) are now compiled with 
> 64-bit support. The head node is running RHEL AS 3.4 and the compute 
> nodes are running RHEL WS 3.4.
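> 
> As an aside, the bitness of each piece can be double-checked with the 
> 'file' command; the sample output here is just the typical form, not 
> copied from our system:
> 
>   file wrf.exe
>   wrf.exe: ELF 64-bit LSB executable, AMD x86-64 ...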
> 
> OK, that's the basic background, to jump past all of those questions. 
> Additional information: I have not tried any of the debugging tools 
> yet; I am using /usr/bin/time -v to gather timing data; and I am not 
> using any scheduling applications such as OpenPBS, just mpirun and 
> various combinations of machine and process files. I have the time 
> results and the actual command lines captured and can supply those if 
> someone needs them. The last bit of 'background' is that I am not a 
> long-term cluster development programmer (20+ years programming in 
> FORTRAN and other things, but not clusters), nor a heavy-duty Linux 
> administrator (though that is changing rapidly, and I have several 
> years of experience in HP-UX administration). So now you know some 
> measure of how many questions I will ask before I understand the 
> answers I get ;-) The SMG has had a Beowulf cluster for a couple of 
> years, but my group was giving it minimal admin support. So I, like 
> any good programmer, am looking for 'prior art' and experience.
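> 
> For the record, the timings below come from wrapping the launch 
> command with /usr/bin/time, along the lines of:
> 
>   /usr/bin/time -v mpirun -np 32 ./wrf.exe
> 
> and reading the 'Elapsed (wall clock) time' line from the -v output.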
> 
> Here are some of the summarized results, and then I will get to the 
> questions:
> 
> WRF-NMM run with 1 process on head node and 31 processes on subnodes
> 'mpirun -np 32 ./wrf.exe'
> 13:21.32 wall time (all times from the headnode perspective)
> 
> WRF-NMM run with 3 processes on head node and 32 processes on subnodes
> 'mpirun -p4pg PI-35proc ./wrf.exe'
> 13:53.70 wall time
> 
> WRF-NMM run with 1 process on head node and 15 processes on subnodes
> 'mpirun -np 16 ./wrf.exe'
> 14:09.29 wall time
> 
> WRF-NMM run with 1 process on head node and 7 processes on subnodes
> 'mpirun -np 8 ./wrf.exe'
> 20:08.88 wall time
> 
> WRF-NMM run with NO processes on head node and 16 processes on subnodes
> 'mpirun -np 16 -nolocal -machinefile wrf-16p.machines ./wrf.exe'
> 1:36:56 wall time (an hour and a half)
> 
> and finally, dual runs of the model with 1 process each on the head 
> node and 15 processes pushed out to separate banks of the compute nodes
> 'mpirun -np 16 -machinefile wrf-16p-plushead.machines ./wrf.exe'
> 17:27.70 wall time
> 'mpirun -np 16 -machinefile wrf-16p-test2.machines ./wrf.exe'
> 17:08.21 wall time
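> 
> For reference, the machinefiles in the commands above are plain MPICH 
> host lists, one entry per line; the node names here are placeholders 
> rather than our actual hostnames:
> 
>   # wrf-16p.machines (illustrative)
>   node01:2
>   node02:2
>   ...
>   node08:2
> 
> where the ':2' lets MPICH start two processes on each dual-CPU subnode.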
> 
> The results that raise questions are the minimal difference between 
> 16 and 32 processes (and, in fact, 8 processes), and the huge 
> difference when no processes are placed on the head node. Taking the 
> last case first, my thought, based on some web research, is that the 
> difference between NFS and local writes could be influencing the 
> time, but could it instead be a shared memory issue?
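> 
> To check the NFS side of that hypothesis, the mount options the 
> subnodes are actually using, including rsize/wsize and sync vs. 
> async, can be inspected with something like:
> 
>   grep nfs /proc/mounts
>   nfsstat -m
> 
> run on one of the subnodes.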
> 
> Going back to the base issue of how the number of processes affects 
> the run time: does anyone have other experience with the scaling of 
> WRF to larger or smaller clusters (I did note one report in an 
> earlier post, but I am unsure what to make of those results at this 
> point)? I did look at the graph that was referred to, but we are a 
> much smaller shop than most of the tests there. Can anybody suggest 
> some tuning that might be useful, or a tool that would assist in 
> gaining a better understanding of what is going on and what to expect 
> if (when) the users expand their activities?
> 
> Pardon the length of this post, but I figured it was better to get as 
> many details out up front as possible.
> 
> Thanks,
> 
> Brice 
> 
> 
> _______________________________________________
> Wrf-users mailing list
> Wrf-users at ucar.edu
> http://mailman.ucar.edu/mailman/listinfo/wrf-users
> 


