[Wrf-users] real.exe failing on huge domains

Mon Aug 31 16:59:02 MDT 2009

Could your larger domain be extending past the North Pole?
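
A rough way to check this is sketched below. It is only a back-of-the-envelope estimate under stated assumptions: the latitude used for Fairbanks (about 64.84 N) is my own figure and not from your message, the Earth is treated as a sphere, and map-projection distortion is ignored. The function names are made up for illustration. Whether reaching the pole is actually a problem will also depend on the map projection you chose.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def km_to_north_pole(center_lat_deg):
    """Great-circle distance from a latitude to the North Pole, in km."""
    return EARTH_RADIUS_KM * math.radians(90.0 - center_lat_deg)


def half_extent_km(n_points, dx_km):
    """Distance from the domain centre to its edge along one axis, in km."""
    return (n_points - 1) * dx_km / 2.0


def reaches_pole(n_points, dx_km, center_lat_deg):
    """True if the domain's northern edge lies at or beyond the pole."""
    return half_extent_km(n_points, dx_km) >= km_to_north_pole(center_lat_deg)


FAIRBANKS_LAT_DEG = 64.84  # ASSUMPTION: approximate latitude, not from the post

for n in (3038, 6075):
    print(n,
          half_extent_km(n, 1.0),
          round(km_to_north_pole(FAIRBANKS_LAT_DEG)),
          reaches_pole(n, 1.0, FAIRBANKS_LAT_DEG))
```

With these assumptions the pole is roughly 2800 km from the domain centre, so the 3038-point grid (half-width about 1500 km) stays well short of it, while the 6075-point grid (half-width about 3000 km) does not.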

Best,
Abdullah.

On 01/09/2009, Don Morton <morton at arsc.edu> wrote:
> First - the basic question - has anybody been successful in WPS'ing
> and real.exe'ing a large domain, on the order of 6075x6075x27 grid
> points (approximately 1 billion)?
>
> I've almost convinced myself (I say "almost" because I recognize that
> I, like others, am capable of making stupid mistakes) that there is an
> issue with real.exe which, for large grids, results in an error
> message of the form:
>
> =====================
>  p_top_requested =     5000.000
>  allowable grid%p_top in data   =     55000.00
>  -------------- FATAL CALLED ---------------
>  FATAL CALLED FROM FILE:  module_initialize_real.b  LINE:     526
>  p_top_requested < grid%p_top possible from data
> =====================
>
> and I'm beginning to think that this is somehow related to memory
> allocation issues.  I'm currently working on a 1km resolution case,
> centered on Fairbanks, Alaska.  If I use a 3038x3038 horizontal grid,
> it all works fine, but with a 6075x6075 grid, I get the above error.
> In both cases, I've written an NCL script to print the min/max/avg
> values of the PRES field in met_em*, and at the top level they both
> come out to 1000 Pa and at the next level down they both come out to
> 2000 Pa.  So I'm guessing that my topmost pressure fields are fine,
> and hence that the met_em file being fed to real.exe is good.
>
> Further information:
>
> - I've tried these cases under a number of varying conditions -
> different resolutions, different machines (a Sun Opteron cluster and a
> Cray XT5).  In all cases, however, I've been using the PGI compilers
> (but I may try Pathscale on one of the machines to see if that makes a
> difference).  I feel pretty good about having ruled out resolution,
> physics, etc. as the cause, and feel like I've narrowed this down to
> a problem that's a function of domain size.
>
> - With some guidance from John Michalakes and folks at Cray, I feel
> pretty certain that I'm not running out of memory on the compute
> nodes, though I'll be probing this a little more.  In one case (that
> failed with the above problem) I put MPI Task 0 on a 32 GByte node all
> by itself, then partitioned the other 255 tasks, 8 to an 8-core node
> (two quad-core processors) each with 32 GBytes memory (4 GBytes per
> task).
>
> - I've tried this with both WRFV3.0.1.1 and WRFV3.1.
>
>
> I'll continue to probe, and may need to start digging into the
> real.exe source, but just wanted to know if anybody else has
> experienced success or failure with this size of a problem.  I'm aware
> that a Gordon Bell entry last year was performed with about 2 billion
> grid points, but I think I remember someone telling me that the run
> wasn't prepared with WPS.
>
> Thanks,
>
> Don Morton
> --
> Arctic Region Supercomputing Center
> http://www.arsc.edu/~morton/
> _______________________________________________
> Wrf-users mailing list
> Wrf-users at ucar.edu
> http://mailman.ucar.edu/mailman/listinfo/wrf-users
>

-- 
Abdullah KAHRAMAN
Istanbul Technical University
Department of Meteorology
http://www.students.itu.edu.tr/~kahramanab/
University of Helsinki
Department of Physics