[Wrf-users] Poor scalability of WRFV221 on 8-core machine

Erick van Rijk erick at vanrijk.nl
Fri Oct 3 11:19:45 MDT 2008


Roman,
Yes, the chipset defines the memory bandwidth (hence I wrote 5400).
Are you saying that MPI performs better than OpenMP on a single
machine? Could you explain further why the memory traffic would be
lower than with OpenMP? The same amount of communication needs to be
done to share the data.
Nothing I have tested points to that, and neither do Mr. Fovell's tests: http://macwrf.blogspot.com/2008/03/wrfv221-on-dual-quad-mac-pro.html
I use a machine similar to the one he used for his tests.
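
For comparison, the two launch modes look roughly like this (a sketch;
the pinning options shown are assumptions and depend on the OpenMP
runtime and MPI library in use, Intel and Open MPI style here):

```shell
# Pure OpenMP: one wrf.exe process with 8 threads. KMP_AFFINITY
# pins threads for ifort-built binaries (Intel OpenMP runtime).
export OMP_NUM_THREADS=8
export KMP_AFFINITY=compact
./wrf.exe

# Pure MPI: 8 processes communicating via shared memory, each pinned
# to a core. The binding flag is Open MPI style; other MPI libraries
# have their own affinity flags, or you can use taskset/numactl.
mpirun -np 8 --bind-to core ./wrf.exe
```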

Erick

On Oct 2, 2008, at 10:38 PM, Dubtsov, Roman S wrote:

> Erick,
>
> (a) Gerardo was right; you should use MPI so that more computations
> are done in parallel. Memory bandwidth is defined by the chipset, not
> the CPUs. The OpenMP version is likely to cause a large amount of
> cache coherency traffic, lowering the effective data bandwidth for
> WRF. With MPI, cache coherency traffic is much lower. The MPI library
> needs to support 1) communication via shared memory and 2) "pinning"
> MPI processes to CPU cores so that they do not migrate between
> cores/sockets and do not thrash the caches.
>
> (b) In your case it may also make sense to experiment with the
> numtiles namelist option. Setting it to a higher value may improve
> cache utilization and lower memory pressure. For CONUS12km-sized
> domains and 8 MPI processes I suggest trying numtiles=64 first.
> However, results with different numtiles settings are not bit-for-bit
> identical. You can also try experimenting with numtiles even if you
> use only OpenMP.
>
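> The corresponding namelist fragment would look roughly like this (a
> sketch; numtiles belongs in the &domains section of namelist.input):
>
> ```
> &domains
>   numtiles = 64,
> /
> ```
>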
> Regards,
> Roman
> :wbr
>
>> -----Original Message-----
>> From: wrf-users-bounces at ucar.edu [mailto:wrf-users-bounces at ucar.edu]
>> On Behalf Of Erick van Rijk
>> Sent: Friday, October 03, 2008 08:20
>> To: Gerardo Cisneros
>> Cc: wrf-users at ucar.edu
>> Subject: Re: [Wrf-users] Poor scalability of WRFV221 on 8-core machine
>>
>> My reasoning for using OpenMP is that my test machine is a single
>> unit; MPI would only hurt in that scenario.
>> The overhead of launching separate processes and duplicating the
>> dataset for every instance is considerable, and with OpenMP the
>> communication latency is lower than with MPI.
>>
>> I agree that the available bandwidth per core declines as you add
>> more cores sharing the same bus, but I expected that two Intel Xeon
>> 5400 processors could handle that.
>>
>> Erick
>> On Oct 2, 2008, at 5:59 PM, Gerardo Cisneros wrote:
>>
>>> On Thu, 2 Oct 2008, Erick van Rijk wrote:
>>>
>>>> Hello everybody,
>>>> I have been looking into the scalability of WRFV221 on my 8-core
>>>> machine and I have noticed that the scalability is very poor [...]
>>>>
>>>> Do any of the users/developers want to comment on this? Any reason
>>>> why this is happening, or a pointer to what could be causing this
>>>> behaviour?
>>>> I have built wrf221 with ifort and OpenMP enabled (not using MPI).
>>>
>>> (a)  WRF scaling with OpenMP only isn't anywhere
>>> near what can be obtained by using MPI.
>>>
>>> (b)  Memory bandwidth per core dwindles as you
>>> use more cores in your shared-memory machine.
>>>
>>> Saludos,
>>>
>>> Gerardo
>>> --
>>> Dr. Gerardo Cisneros	|SGI (Silicon Graphics, S.A. de C.V.)
>>> Scientist             	|Av. Vasco de Quiroga 3000, Col. Santa Fe
>>> gerardo at sgi.com		|01210 Mexico, D.F., MEXICO
>>> (+52-55)5563-7958 	|http://www.sgi.com/
>>>
>>
>> _______________________________________________
>> Wrf-users mailing list
>> Wrf-users at ucar.edu
>> http://mailman.ucar.edu/mailman/listinfo/wrf-users
>
> --------------------------------------------------------------------
> Closed Joint Stock Company Intel A/O
> Registered legal address: Krylatsky Hills Business Park,
> 17 Krylatskaya Str., Bldg 4, Moscow 121614,
> Russian Federation
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
>


