[Wrf-users] Wrf-users Digest, Vol 76, Issue 6
wrfhelp at ucar.edu
Tue Dec 7 14:03:47 MST 2010
Just a comment on Hein's comment: there may be another reason he sees a
bottleneck at 64+ CPUs. For a given domain size, performance will
flatten out simply because there aren't enough computations to scale up.
There is a balance
between computation and communication (via halos for decomposed
domains). If one
uses nests, then the performance might be limited by the smallest domain.
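The computation/communication balance described above can be sketched with a rough, illustrative calculation. The domain size, halo width, and decomposition below are made-up assumptions for illustration, not figures from this thread:

```python
import math

def halo_fraction(nx, ny, nprocs, halo=3):
    """Rough fraction of per-patch work that is halo (ghost-cell) exchange,
    assuming an approximately square nprocs_x * nprocs_y decomposition of an
    nx * ny grid. As the patch shrinks, this fraction grows and
    communication starts to dominate over computation."""
    px = int(math.sqrt(nprocs))
    while nprocs % px:               # find a factorization close to square
        px -= 1
    py = nprocs // px
    sub_x, sub_y = nx / px, ny / py  # patch size per process
    interior = sub_x * sub_y                # points computed per patch
    halo_pts = 2 * halo * (sub_x + sub_y)   # points exchanged per patch
    return halo_pts / interior

# Hypothetical 425x300 grid: halo work per patch grows as CPUs are added.
for procs in (16, 64, 256):
    print(procs, round(halo_fraction(425, 300, procs), 3))
```

With these assumed numbers the halo fraction roughly doubles each time the core count quadruples, which is one way the scaling curve flattens for a fixed domain.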
On Dec 7, 2010, at 12:00 PM, wrf-users-request at ucar.edu wrote:
> Send Wrf-users mailing list submissions to
> wrf-users at ucar.edu
> To subscribe or unsubscribe via the World Wide Web, visit
> or, via email, send a message with subject or body 'help' to
> wrf-users-request at ucar.edu
> You can reach the person managing the list at
> wrf-users-owner at ucar.edu
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Wrf-users digest..."
> Today's Topics:
> 1. Re: Speed improvement through SSD hard drives? (Hein Zelle)
> Message: 1
> Date: Tue, 7 Dec 2010 09:40:22 +0100
> From: Hein Zelle <hein.zelle at bmtargoss.com>
> Subject: Re: [Wrf-users] Speed improvement through SSD hard drives?
> To: wrf-users at ucar.edu
> Message-ID: <20101207084022.GA2342 at giotto.bmtargoss.org>
> Content-Type: text/plain; charset=us-ascii
> Jonas Kaufmann wrote:
>> I am thinking about getting a new server for my WRF model
>> computations, and I am wondering about the hardware specs I should
>> choose for that. Obviously the most important thing is CPU power, but I am
>> wondering what to do about hard drives in general. I know that SSD
>> drives can give a significant performance boost for I/O tasks, so I am
>> thinking about using those drives.
>> Has anyone already tried this and if so, what were your results
>> compared to normal harddrives? If you did not try this, do you think
>> the WRF performance will be affected by this?
> I have not tried SSD drives, but I can tell you our experiences with
> WRF bottlenecks on a 64+ CPU cluster. We used to run on a 64-core
> cluster: 8 nodes, each with 2x 4-core Intel Xeon CPUs. The front ends
> each had a RAID array (HP) with 8 SAS drives. Performance of those
> arrays is relatively pathetic: 80 MB/s sustained read/write speeds
> (megabytes per second).
> On this system the bottleneck was NOT I/O, strangely enough: it was
> memory bandwidth. Above 32 cores WRF scaled badly. I/O speeds for a
> single model simulation were quite acceptable. What we did notice was
> that it's easy to lock up the server with the disk pack: under write
> loads, the server would easily become unresponsive. This was a
> combination of the RAID controller, Linux kernel version (2.6.32+ is
> much improved), RAID setup (RAID 5 is BAD here), and file system (ext3).
> We eventually switched to RAID 1 with ext2, which did not improve the
> throughput, but the front end did not lock up anymore.
> Our new setup uses Nehalem CPUs in a blade configuration, 64 cores
> (again, 2 quad-core CPUs per motherboard). Using this cluster the
> model scales much better; the memory bandwidth problems have largely
> gone away. However, running multiple models at once, it was all too easy
> to overload the disk pack server. A single model simulation would
> perform fine, but 3 or more would completely lock up the NFS server
> with huge wait loads.
> We have moved to a new RAID server with 12 SATA drives, hardware RAID
> 1/0, a beefier RAID controller card, and 2 network cards in parallel
> (200 MB/s throughput limit). Linux 2.6.32 kernel, Ubuntu 10.04. We can
> still lock up the server by running 10 models (not all WRF) in
> parallel, but it's much harder to reach the limit. This server has a
> read/write performance of about 400-500 MB/s sustained.
> So, summarizing, if you're going to upgrade your server and can
> afford it:
> - use Nehalem CPUs or better (I believe that's the 5500 series or up,
>   but please verify that)
> - memory bandwidth is a critical factor for WRF; older Intel CPUs
>   perform much worse
> - disk I/O only becomes a bottleneck for very large models or
>   multiple models at once (that's our experience, at least)
> - use a Linux kernel of at least 2.6.32
> - test your disk performance in several configurations!
>   You can get huge gains with the right RAID/filesystem configuration
> - ext4 seems to work well, so far
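For the "test your disk performance" point, a minimal sequential write/read timing can look like the sketch below. This is illustrative, not from the thread; serious testing would use fio or dd with O_DIRECT, since buffered I/O largely measures the page cache unless the file is much larger than RAM:

```python
import os
import tempfile
import time

def disk_throughput_mbps(total_mb=64, chunk_mb=4):
    """Write then read a temporary file sequentially; return rough
    (write_MBps, read_MBps). fsync forces the write to reach the device,
    but the read-back is likely served from the page cache."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    fd, path = tempfile.mkstemp()
    try:
        t0 = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(total_mb // chunk_mb):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())          # flush data to the device
        write_s = time.perf_counter() - t0

        t0 = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(chunk_mb * 1024 * 1024):
                pass
        read_s = time.perf_counter() - t0
        return total_mb / write_s, total_mb / read_s
    finally:
        os.remove(path)

w, r = disk_throughput_mbps()
print(f"write ~{w:.0f} MB/s, read ~{r:.0f} MB/s (read likely cache-inflated)")
```

Comparing numbers like these across RAID levels and filesystems (RAID 5 vs 1/0, ext3 vs ext4) is the kind of configuration testing the list above recommends.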
> For a small server, I think a RAID array (e.g. 1/0) of a couple of
> SATA disks is fine. For a large cluster you might want to consider
> heavier options. Keep in mind that your I/O bandwidth will not help
> once you exceed your network bandwidth (assuming you have a networked
> cluster). It may well be worth getting one or two SSD disks instead
> of multiple SATA drives, if you can achieve the same performance.
> Hope that helps,
> Kind regards
> Hein Zelle
> Wrf-users mailing list
> Wrf-users at ucar.edu
> End of Wrf-users Digest, Vol 76, Issue 6