[Wrf-users] da_wrfvar.exe failing on "large" domains
Steven G Decker
decker at envsci.rutgers.edu
Wed Sep 23 12:49:25 MDT 2009
Don,
Assuming the test case works fine, my wild guess is that somewhere in
the code the wrong integer kind is being used (either a bug in the
source code or a bug in the compiler or improper compiler flags), and
the large domain size is leading to integer overflow.
What you are seeing is similar to the results of the following Fortran
program:
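program overflow
  implicit none
  ! Long holds at least 8 decimal digits (typically a 4-byte integer);
  ! LLong holds at least 16 (typically an 8-byte integer).
  integer, parameter :: Long = selected_int_kind(8)
  integer, parameter :: LLong = selected_int_kind(16)
  integer(Long) :: i
  integer(LLong) :: j

  i = 400000000
  print *, i
  ! 20*i is evaluated in kind Long, so the product overflows before
  ! its bit pattern is copied into the wider j.
  j = transfer(20*i,j)
  print *, j
end program overflow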
Change the kind of i to LLong and the "bug" goes away.
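
For comparison, here is a minimal sketch of the other way to avoid the
overflow, reusing the Long and LLong kinds from the program above. It
leaves i as it is and widens it before the multiplication, so the
product is formed in the LLong kind. The program name no_overflow is
made up for illustration; none of this is taken from the WRF source.

```fortran
program no_overflow
  implicit none
  integer, parameter :: Long = selected_int_kind(8)
  integer, parameter :: LLong = selected_int_kind(16)
  integer(Long) :: i
  integer(LLong) :: j

  i = 400000000
  ! Convert i to the wider kind first; the multiplication then happens
  ! in LLong arithmetic, where 8000000000 is representable.
  j = 20_LLong * int(i, LLong)
  print *, j
end program no_overflow
```

This should print 8000000000. The same pattern applies to any size
calculation of the form count * bytes-per-element: widen the operands
before multiplying, not the result afterwards.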
Try turning on all of the compiler flags involving debugging (bounds
checks, type checking, interface checking, etc.) and cross your fingers.
If a Fortran 77-style implicit interface is involved, be prepared to
pull out your hair.
The negative values for the "m" indices are fine as they allow for a
halo of points around each core's portion of the domain.
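
As an illustration only: Fortran arrays can be declared with any lower
bound, so a negative memory start index is perfectly legal. The sketch
below uses the i-direction bounds from your log, assuming the usual
reading that ims:ime is the memory (allocated) range and ips:ipe is the
patch the task actually owns. The names halo_bounds and field are
hypothetical.

```fortran
program halo_bounds
  implicit none
  integer, parameter :: ims = -4, ime = 73   ! memory bounds from the log
  integer, parameter :: ips = 1,  ipe = 66   ! patch bounds from the log
  real :: field(ims:ime)

  field = 0.0              ! everything, including the surrounding points
  field(ips:ipe) = 1.0     ! only the points this task owns

  ! Bounds and total extent of the allocated range
  print *, lbound(field, 1), ubound(field, 1), size(field)
  ! Owned points versus surrounding points
  print *, count(field == 1.0), count(field == 0.0)
end program halo_bounds
```

The first line of output should be -4, 73 and 78; the second should be
66 and 12, i.e. 66 owned points plus 12 extra points on either side of
the patch in this direction.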
Hope this helps,
Steve
> Date: Tue, 22 Sep 2009 14:19:15 -0800
> From: Don Morton <morton at arsc.edu>
> Subject: [Wrf-users] da_wrfvar.exe failing on "large" domains
> To: wrf-users at ucar.edu
> Message-ID:
> <237e74280909221519x1187c3f8oa198c2cb506fbfb2 at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> I'm trying to run da_wrfvar.exe on a 1050x1050x75 grid point domain,
> at 3km resolution. This strikes me as a "large" domain, but not
> really unreasonably large. Inevitably, even with 256 cores, I get the
> following ABEND message:
>
> taskid: 0 hostname: nid00318
> Ntasks in X 16 , ntasks in Y 16
> *************************************
> Parent domain
> ids,ide,jds,jde 1 1050 1 1050
> ims,ime,jms,jme -4 73 -4 73
> ips,ipe,jps,jpe 1 66 1 66
> *************************************
> DYNAMICS OPTION: Eulerian Mass Coordinate
> alloc_space_field: domain 1 , 454678064 bytes allocated
> WRF NUMBER OF TILES = 1
> 0: ALLOCATE: 18446744072020333888 bytes requested; not enough memory
>
>
> Although I can believe the figure for the alloc_space_field, I'm just
> a little suspicious of the number of bytes requested for Task 0 - if
> I've read this correctly, it comes out to 18.4 Exabytes! :)
>
> Although I'm not sure, I believe the ims, ime,jms,jme values are the
> start/stop dimensions of a subdomain in a given task, and if that's
> the case, I'm suspicious about the negative start value.
>
> I'll look into the code, but I'd like to first pose the question of
> whether anybody has used da_wrfvar.exe for domains this big and/or if
> anybody knows of inherent limitations that might prevent me from doing
> so.
>
> Thanks,
>
> Don Morton
--
Steve Decker, Assistant Professor
Department of Environmental Sciences Phone: (732) 932-9800 x 6203
Rutgers University Fax: (732) 932-8644
14 College Farm Rd Email: decker at envsci.rutgers.edu
New Brunswick, NJ 08901-8551