[Wrf-users] Wrf-users Digest, Vol 76, Issue 6

wrfhelp wrfhelp at ucar.edu
Tue Dec 7 14:03:47 MST 2010


Just a comment on Hein's comment: there may be another reason he sees
the bottleneck at 64+ CPUs.  For a given domain size, performance will
eventually flatten out simply because there is not enough computation
to scale up.  There is a balance between computation and communication
(via halo exchanges for decomposed domains).  If one uses nests,
performance may be limited by the smallest domain.
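
As a rough illustration of that balance: in a 2-D decomposition,
per-process computation shrinks with the tile area, while halo
exchange shrinks only with the tile perimeter, so the communication
fraction grows as processes are added.  A minimal sketch (the domain
size, halo width, and square process grid are illustrative
assumptions, not WRF internals):

    import math

    def halo_ratio(nx, ny, nprocs, halo=3):
        """Ratio of halo points exchanged to points computed per
        process, for a roughly square 2-D decomposition."""
        px = int(math.sqrt(nprocs))   # process grid, assumed square
        py = nprocs // px
        tile_x, tile_y = nx / px, ny / py
        compute = tile_x * tile_y                # ~ points computed per step
        comm = 2 * halo * (tile_x + tile_y)      # ~ halo points exchanged
        return comm / compute

    for p in (4, 16, 64, 256):
        print(f"{p:4d} procs: comm/compute ~ {halo_ratio(300, 300, p):.2f}")

For a fixed 300x300 domain this ratio doubles every time the process
count quadruples, which is the flattening-out described above.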

wrfhelp

On Dec 7, 2010, at 12:00 PM, wrf-users-request at ucar.edu wrote:

>
> Today's Topics:
>
>   1. Re: Speed improvement through SSD hard drives? (Hein Zelle)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 7 Dec 2010 09:40:22 +0100
> From: Hein Zelle <hein.zelle at bmtargoss.com>
> Subject: Re: [Wrf-users] Speed improvement through SSD hard drives?
> To: wrf-users at ucar.edu
> Message-ID: <20101207084022.GA2342 at giotto.bmtargoss.org>
> Content-Type: text/plain; charset=us-ascii
>
> Jonas Kaufmann wrote:
>
>> I am thinking about getting a new server for my WRF model
>> computations, and I am wondering about the hardware specs I should
>> use for that.  Obviously the most important thing is CPU power, but
>> I am wondering what to do about hard drives in general.  I know that
>> SSD drives can give a significant performance boost for I/O tasks,
>> so I am thinking about using those drives.
>>
>> Has anyone already tried this and, if so, what were your results
>> compared to normal hard drives?  If you have not tried it, do you
>> think WRF performance would be affected?
>
> I have not tried SSD drives, but I can tell you our experience with
> WRF bottlenecks on a 64+ CPU cluster.  We used to run on a 64-core
> cluster: 8 nodes, each with two 4-core Intel Xeon CPUs.  The front
> ends each had an HP RAID array with 8 SAS drives.  Performance of
> those arrays is relatively pathetic: 80 MB/s sustained read/write
> (megabytes per second).
>
> On this system the bottleneck was NOT I/O, strangely enough: it was
> memory bandwidth.  Above 32 cores WRF scaled badly, while I/O speeds
> for a single model simulation were quite acceptable.  What we did
> notice was that it was easy to lock up the server hosting the disk
> pack: under write loads it would often become unresponsive.  That was
> a combination of the RAID controller, the Linux kernel version
> (2.6.32+ is much improved), the RAID setup (RAID 5 is BAD here), and
> the file system (ext3 with journalling).
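>
> Since memory bandwidth rather than I/O was the limiter, a quick
> sanity check is a STREAM-triad-style loop.  A minimal sketch in
> Python with numpy (the array size and repetition count are arbitrary
> assumptions, and interpreter overhead makes this a lower bound, not
> a tuned benchmark):
>
>     import time
>     import numpy as np
>
>     n = 20_000_000                  # three ~160 MB float64 arrays
>     b = np.random.rand(n)
>     c = np.random.rand(n)
>
>     reps = 10
>     t0 = time.time()
>     for _ in range(reps):
>         a = b + 2.0 * c             # triad: read b, read c, write a
>     elapsed = time.time() - t0
>
>     bytes_moved = 3 * 8 * n * reps  # bytes touched per pass, all passes
>     print(f"~{bytes_moved / elapsed / 1e9:.1f} GB/s effective bandwidth")
>
> Running one copy per socket shows how quickly the shared memory bus
> saturates as more cores join in.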
>
> We eventually switched to RAID 1 with ext2, which did not improve
> throughput, but the front end no longer locked up.
>
>
> Our new setup uses Nehalem CPUs in a blade configuration, 64 cores
> (again, two quad-core CPUs per motherboard).  On this cluster the
> model scales much better and the memory bandwidth problems have
> largely gone away.  However, running multiple models at once made it
> all too easy to overload the disk pack server: a single model
> simulation would perform fine, but 3 or more would completely lock up
> the NFS server with huge wait loads.
>
> We have since moved to a new RAID server with 12 SATA drives:
> hardware RAID 1/0, a beefier RAID controller card, and 2 network
> cards in parallel (about a 200 MB/s throughput limit).  Linux 2.6.32
> kernel, Ubuntu 10.04.  We can still lock up the server by running 10
> models (not all WRF) in parallel, but it's much harder to reach the
> limit.  This server sustains about 400-500 MB/s read/write.
>
>
> So, summarizing: if you're going to upgrade your server and can
> afford it,
>
> - use Nehalem CPUs or better (I believe that's the 5500 series or
>   up, but please verify).
> - memory bandwidth is a critical factor for WRF; older Intel CPUs
>   perform much worse.
> - disk I/O only becomes a bottleneck for very large models, or for
>   several running at once (that's our experience, at least).
> - use a Linux kernel of at least 2.6.32.
> - test your disk performance in several configurations (see the
>   sketch after this list)!  You can get huge gains with the right
>   RAID/filesystem configuration.
> - ext4 seems to work well, so far.
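>
> For the disk tests, a minimal sustained-write check (a sketch: the
> path, file size, and block size are assumptions -- point it at the
> filesystem under test, and make the file much larger than RAM so the
> page cache cannot hide the disk):
>
>     import os
>     import time
>
>     PATH = "/scratch/iotest.bin"    # hypothetical mount under test
>     BLOCK = 1024 * 1024             # 1 MB per write
>     COUNT = 4096                    # 4 GB total
>
>     buf = os.urandom(BLOCK)
>     t0 = time.time()
>     with open(PATH, "wb") as f:
>         for _ in range(COUNT):
>             f.write(buf)
>         f.flush()
>         os.fsync(f.fileno())        # flush to disk before stopping the clock
>     elapsed = time.time() - t0
>     print(f"write: {BLOCK * COUNT / elapsed / 1e6:.0f} MB/s")
>     os.remove(PATH)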
>
> For a small server, I think a RAID array (e.g. 1/0) of a couple of
> SATA disks is fine.  For a large cluster you might want to consider
> heavier options.  Keep in mind that extra I/O bandwidth will not help
> once you exceed your network bandwidth (assuming you have a networked
> cluster).  It may well be worth getting one or two SSD disks instead
> of multiple SATA drives, if you can achieve the same performance.
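>
> As a back-of-the-envelope example of that network ceiling (the link
> count and speed are assumptions, e.g. two bonded gigabit NICs like
> the setup described above):
>
>     # Fast disks stall behind the network once NFS traffic saturates it.
>     links, gbit_each = 2, 1.0
>     net_mb_s = links * gbit_each * 1000 / 8    # ~250 MB/s theoretical
>     disk_mb_s = 450                            # mid-range of the array above
>     print(f"effective ceiling: {min(net_mb_s, disk_mb_s):.0f} MB/s")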
>
> Hope that helps,
> Kind regards
>
>     Hein Zelle
>
>
> End of Wrf-users Digest, Vol 76, Issue 6
> ****************************************