[Wrf-users] WPSV3 metgrid.exe seg fault GFS/GFDL
Eric_Meyers
emeyers3 at atmos.uiuc.edu
Tue Jun 24 09:41:46 MDT 2008
Dear WPS Users:
SUMMARY:
metgrid.exe segmentation fault when combining GFS/GFDL input AND processing domains whose total number of grid points exceeds some as-yet-unidentified upper bound.
PROCESS:
After trying to process GFDL input (Vtable.GFDL) for one domain (9-km grid spacing) with WPSV3 and watching metgrid.exe fail for lack of the necessary soil parameters, I now run WPSV3 with multiple data sources - GFS and GFDL - the latter taking precedence for all duplicate fields, the former supplying the necessary soil parameters.
First, I run WPSV3 with the GFS input only, without running metgrid.exe (i.e., just geogrid.exe & ungrib.exe; Vtable.GFS; GFS input):
&ungrib
out_format = 'WPS',
prefix = 'GFS',
/
in the directory ~/WPSV3_GFS1dm/, for example. This produces intermediate files such as the following (a command sketch follows the list):
GFS:2005-07-08_00
GFS:2005-07-08_03
GFS:2005-07-08_06
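For completeness, the first pass amounts to something like the following (the GRIB path is a placeholder, and I'm assuming the standard WPS layout with Vtable.GFS under ungrib/Variable_Tables):

cd ~/WPSV3_GFS1dm
ln -sf ungrib/Variable_Tables/Vtable.GFS Vtable
./link_grib.csh /path/to/GFS/grib/files        # placeholder path to the GFS GRIB files
./geogrid.exe
./ungrib.exe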
Second, I run WPSV3 again (this time geogrid.exe, ungrib.exe, AND metgrid.exe; Vtable.GFDL; GFDL input) in a separate directory, with fg_name set so that metgrid.exe processes the GFS input first (the path points to the GFS:* output from ungrib.exe in the previous WPSV3 run) and the GFDL input second:
&ungrib
out_format = 'WPS',
prefix = 'GFDL',
/
&metgrid
fg_name = '~/WPSV3_GFS1dm/GFS', 'GFDL'
io_form_metgrid = 2,
/
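The second pass is, again roughly (the directory name and GRIB path are placeholders, and the Vtable link points at whichever Vtable.GFDL you are using; the source listed last in fg_name overwrites duplicate fields from earlier sources, which is why GFDL takes precedence here):

cd ~/WPSV3_GFDL1dm
ln -sf /path/to/Vtable.GFDL Vtable
./link_grib.csh /path/to/GFDL/grib/files       # placeholder path to the GFDL GRIB files
./geogrid.exe
./ungrib.exe
./metgrid.exe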
RESULTS:
For SMALL DOMAIN SIZES, such as 100x79, metgrid.exe SUCCEEDS in producing, for example:
met_em.d01.2005-07-08_00:00:00.nc
met_em.d01.2005-07-08_03:00:00.nc
met_em.d01.2005-07-08_06:00:00.nc
HOWEVER, when I simply try LARGER DOMAIN SIZES, such as 340x340, metgrid.exe FAILS after processing all of the GFS fields for the initial time, while reporting "Processing SKINTEMP at level 200100.000000" for the GFDL input:
Processing domain 1 of 1
Processing 2005-07-08_00
~/WPSV3_GFS1dm/GFS
GFDL
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
metgrid.exe 400000000005DDD0 interp_module_mp_ 377 interp_module.f90
metgrid.exe 4000000000054B30 interp_module_mp_ 193 interp_module.f90
metgrid.exe 4000000000085AC0 interp_module_mp_ 751 interp_module.f90
metgrid.exe 40000000000556B0 interp_module_mp_ 203 interp_module.f90
metgrid.exe 4000000000077600 interp_module_mp_ 585 interp_module.f90
metgrid.exe 40000000000539F0 interp_module_mp_ 178 interp_module.f90
metgrid.exe 400000000008D070 interp_module_mp_ 821 interp_module.f90
metgrid.exe 4000000000052E70 interp_module_mp_ 168 interp_module.f90
metgrid.exe 400000000009C7F0 interp_module_mp_ 942 interp_module.f90
metgrid.exe 4000000000054570 interp_module_mp_ 188 interp_module.f90
metgrid.exe 400000000025AF70 process_domain_mo 1619 process_domain_module.f90
metgrid.exe 400000000024BA20 process_domain_mo 1518 process_domain_module.f90
metgrid.exe 4000000000212FC0 process_domain_mo 883 process_domain_module.f90
metgrid.exe 40000000001D69D0 process_domain_mo 137 process_domain_module.f90
metgrid.exe 400000000002BC00 MAIN__ 66 metgrid.f90
metgrid.exe 4000000000003E90 Unknown Unknown Unknown
libc.so.6.1 2000000000138060 Unknown Unknown Unknown
metgrid.exe 40000000000038C0 Unknown Unknown Unknown
The first frame listed, corresponding to line 377 of interp_module.f90, is inside the routine search_extrap:
if (array(qdata%x,qdata%y,izz) /= msgval .and. mask_array(qdata%x,qdata%y) /= maskval) then
The second frame listed, corresponding to line 193, is the call to search_extrap:
interp_sequence = search_extrap(xx, yy, izz, array, start_x, end_x, &...
CHECK:
I compiled with -check all, which should have caught an out-of-bounds error for array() or mask_array().
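As a point of reference, here is a minimal, self-contained Fortran sketch (not WPS code; every name and value is made up for illustration, though the inner test mirrors the snippet above) of the kind of explicit bounds guard that would turn an out-of-range queued point into a printed diagnostic instead of a segmentation fault:

program bounds_guard_sketch
   implicit none
   integer, parameter :: nx = 5, ny = 4, nz = 2
   real    :: array(nx,ny,nz), mask_array(nx,ny)
   real    :: msgval, maskval
   integer :: qx, qy, izz

   array      = 1.0
   mask_array = 0.0
   msgval     = -1.e30
   maskval    = 1.0
   izz        = 1

   ! A queued point whose x index is deliberately out of range,
   ! standing in for a bad index coming off the search queue.
   qx = nx + 3
   qy = 2

   if (qx < 1 .or. qx > nx .or. qy < 1 .or. qy > ny) then
      write(*,*) 'queued point out of bounds: ', qx, qy
   else if (array(qx,qy,izz) /= msgval .and. mask_array(qx,qy) /= maskval) then
      write(*,*) 'valid data at point: ', array(qx,qy,izz)
   end if
end program bounds_guard_sketch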
I'm certain the seg fault has no dependence on the input time, as I've encountered the same seg fault for various input times. In addition, the specified domains (e.g., 340x340) never extend beyond the boundaries of either the GFS or GFDL input.
The fact that smaller domains process correctly suggests that my procedure for running WPSV3 with the multiple GFS/GFDL input sources is sound, but there is evidently some limitation on the total number of grid points, whether in the compilation or in the code itself.
EXPLORATION:
I tried running WPSV3 with more processors, using 'unlimit' before running metgrid.exe, and changing the optimization level, but the seg fault occurred regardless.
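For reference, the limit change amounts to the following under csh (with the bash equivalent for comparison); whether the stack size is actually the binding constraint here is only a guess:

limit stacksize unlimited      # csh/tcsh (or simply: unlimit)
ulimit -s unlimited            # bash equivalent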
In addition to 100x79, metgrid.exe SUCCEEDED for domain sizes 202x160 and 250x220, but it FAILED (with the above seg fault) for LARGER domains. I performed two tests: 1) increasing only the x dimension from 250 to the desired 340; 2) increasing only the y dimension from 220 to the desired 340. I had hoped these tests would show whether some region of the GFDL input to the east or north was incompatible with the 9-km domain once it was enlarged beyond 250x220, or whether simply exceeding an upper limit on the total number of grid points in the domain caused the seg fault. Both 1) and 2) failed, suggesting the latter.
HOWEVER, when I processed the GFS input ONLY (i.e., no GFDL; a single WPSV3 run - geogrid.exe, ungrib.exe, and metgrid.exe; Vtable.GFS; GFS input; fg_name = 'GFS') for the 340x340 domain, metgrid.exe SUCCEEDED. So the total-grid-point constraint applies only when the GFDL input is used in combination with the GFS input. In other words, the seg fault is confined to large grid point counts (roughly > 250x220 = 55,000) AND combined GFS/GFDL data sources.
Although seemingly redundant, I also tried processing the GFS data twice (i.e., simply replacing the link to the GFDL input with a link to the GFS input), treating the two passes as if they were different data sets, to see whether my methodology for processing multiple data sources (even with identical content in this case) was itself producing the seg fault for larger domains. I followed the same procedure described in the PROCESS section above, but pointed the GFDL link at the GFS input, so that only GFS input would be processed, although with Vtable.GFDL. In other words, I ran WPSV3 again (geogrid.exe, ungrib.exe, AND metgrid.exe; Vtable.GFDL; GFS input THIS TIME) in a separate directory, with fg_name set so that metgrid.exe processed the GFS input first (the path pointing to the earlier GFS:* output from ungrib.exe in ~/WPSV3_GFS1dm/, for example) and the relabeled GFS input second. metgrid.exe SUCCEEDED for the 340x340 domain. The success of this test shows that the GFDL input in particular is causing the seg fault, not the methodology I am using to process multiple data sources - but why only for large total grid point counts (i.e., roughly > 250x220)?
As another sensitivity test (again with combined GFS/GFDL input, following the PROCESS section above, except for the fg_name order), and even though this ordering is not what I want, since GFDL should take priority over GFS, I changed
fg_name = '~/WPSV3_GFS1dm/GFS', 'GFDL'
to
fg_name = 'GFDL', '~/WPSV3_GFS1dm/GFS'
With the second ordering, the GFDL fields processed without error for the 340x340 domain, in addition to the GFS fields (as usual). I find it quite peculiar that merely switching the processing order (from GFS, GFDL to GFDL, GFS; in each case the later-listed source takes priority) results in SUCCESSFUL processing of the GFDL input, but of course not in the manner intended (i.e., the relatively coarse GFS input, processed second, now "erases" the GFDL bogus vortex).
CONCLUSION:
The metgrid.exe failure is linked to the use of combined GFS/GFDL input; for some as-yet-unexplained reason it occurs only when the domain's total grid point count exceeds roughly 250x220 and GFDL takes priority (i.e., is listed last in fg_name). The seg fault does not occur when processing GFS input only.
A 340x340, 9-km domain with GFDL input would benefit my analysis, although that resolution is not strictly necessary for my simulation: for now I'm using a 27-km grid, which I can process with the methodology described because, despite covering the same area intended for the 9-km domain, it contains fewer than 250x220 grid points (roughly 1/9 of 340x340). Ideally, I would like to nest a 3-km domain with WPSV3 for enhanced terrain resolution, but that would require even larger nx and ny, which would certainly produce the seg fault (a sketch of such a nest configuration follows).
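For concreteness, the kind of nest I have in mind would look something like this in namelist.wps (the start indices and nest dimensions below are placeholders, not values I have tested; the nest uses a 3:1 ratio off the 9-km parent):

&share
 max_dom = 2,
/
&geogrid
 parent_id         = 1,   1,
 parent_grid_ratio = 1,   3,
 i_parent_start    = 1,  60,
 j_parent_start    = 1,  60,
 e_we              = 340, 211,
 e_sn              = 340, 211,
 dx = 9000,
 dy = 9000,
/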
*****If anyone has experience processing GFDL input with WPS and/or can help me resolve this problem, please reply. More broadly, if there is a way other than WPSV3 nesting of finer-horizontal-resolution grids to obtain enhanced terrain resolution, please let me know.*****
THANK YOU!
--
--------------------------------------------------------------
Eric C. Meyers
Graduate Research Assistant
University of Illinois at Urbana-Champaign
Department of Atmospheric Sciences
emeyers3 at atmos.uiuc.edu
--------------------------------------------------------------