[Wrf-users] Odd WRF crashes
Bart Brashers
bbrashers at environcorp.com
Wed Aug 15 17:15:21 MDT 2012
My setup: WRF v3.4 built with the PGI compilers, WPS v3.4 initialized from ERA-Interim + RTG SST, OpenMPI 1.4.3, running on 2x3=6 cores of an AMD 6100-series processor. Some of my namelist settings:
max_dom = 3,
e_we = 165, 100, 133,
e_sn = 129, 100, 100,
e_vert = 34, 34, 34,
dx = 36000, 12000, 4000,
mp_physics = 10, 10, 10,
mp_zero_out = 1,
mp_zero_out_thresh = 1.e-8,
ra_lw_physics = 4, 4, 4,
ra_sw_physics = 4, 4, 4,
radt = 30, 10, 5,
sf_sfclay_physics = 2, 2, 2,
sf_surface_physics = 2, 2, 2,
sf_urban_physics = 0, 0, 0,
bl_pbl_physics = 2, 2, 2,
bldt = 0, 0, 0,
cu_physics = 5, 5, 0,
cudt = 0, 0, 0,
ishallow = 0,
prec_acc_dt = 0., 0., 0.,
Many of my 5.5-day inits run OK, but a few here and there are crashing. Here's an example of the frustrating lack of details:
# tail -20 rsl.error.0002
OBS NUDGING FOR IN,J,KTAU,XTIME,IVAR,IPL: 3 10 936 171.14 2 2 rindx=45.0
OBS NUDGING: Reading new obs for time window TBACK = 1.902 TFORWD = 3.902 for grid = 3
****** CALL IN4DOB AT KTAU = 954 AND XTIME = 174.13: NSTA = 134 ******
++++++CALL ERROB AT KTAU = 954 AND INEST = 3: NSTA = 134 ++++++
OBS NUDGING FOR IN,J,KTAU,XTIME,IVAR,IPL: 3 10 954 174.13 3 3 rindx=45.0
OBS NUDGING FOR IN,J,KTAU,XTIME,IVAR,IPL: 3 10 954 174.13 4 4 rindx=45.0
OBS NUDGING FOR IN,J,KTAU,XTIME,IVAR,IPL: 3 10 954 174.13 1 1 rindx=45.0
OBS NUDGING FOR IN,J,KTAU,XTIME,IVAR,IPL: 3 10 954 174.13 2 2 rindx=45.0
OBS NUDGING: Reading new obs for time window TBACK = 1.956 TFORWD = 3.956 for grid = 3
****** CALL IN4DOB AT KTAU = 972 AND XTIME = 177.35: NSTA = 134 ******
++++++CALL ERROB AT KTAU = 972 AND INEST = 3: NSTA = 134 ++++++
OBS NUDGING FOR IN,J,KTAU,XTIME,IVAR,IPL: 3 10 972 177.35 3 3 rindx=45.0
OBS NUDGING FOR IN,J,KTAU,XTIME,IVAR,IPL: 3 10 972 177.35 4 4 rindx=45.0
OBS NUDGING FOR IN,J,KTAU,XTIME,IVAR,IPL: 3 10 972 177.35 1 1 rindx=45.0
OBS NUDGING FOR IN,J,KTAU,XTIME,IVAR,IPL: 3 10 972 177.35 2 2 rindx=45.0
[compute-0-3:19366] *** Process received signal ***
[compute-0-3:19366] Signal: Segmentation fault (11)
[compute-0-3:19366] Signal code: (128)
[compute-0-3:19366] Failing at address: (nil)
[compute-0-3:19366] *** End of error message ***
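Since each MPI rank writes its own rsl.error.* file, one way to narrow this down is to check every rank's file rather than tailing a single one. What follows is only a rough Python sketch, not anything shipped with WRF: the function names summarize_rsl and scan_run_dir are made up for illustration, and the two patterns are copied from the lines in the log above. The KTAU/XTIME lines come from OBS nudging, so a run without nudging would report None for those fields.

```python
import re
from pathlib import Path

# Patterns taken from the log excerpt above (assumption: other runs print the same text).
SIGNAL_RE = re.compile(r"\*\*\* Process received signal \*\*\*")
KTAU_RE = re.compile(r"KTAU\s*=\s*(\d+)\s+AND\s+XTIME\s*=\s*([\d.]+)")


def summarize_rsl(text):
    """Return (crashed, last_ktau, last_xtime) for the text of one rsl.error.* file."""
    crashed = False
    last_ktau = None
    last_xtime = None
    for line in text.splitlines():
        if SIGNAL_RE.search(line):
            crashed = True
            break  # everything after this banner is the MPI error report
        match = KTAU_RE.search(line)
        if match:
            last_ktau = int(match.group(1))
            last_xtime = float(match.group(2))
    return crashed, last_ktau, last_xtime


def scan_run_dir(run_dir="."):
    """Summarize every rsl.error.* file found in run_dir (hypothetical helper)."""
    results = {}
    for path in sorted(Path(run_dir).glob("rsl.error.*")):
        results[path.name] = summarize_rsl(path.read_text(errors="replace"))
    return results


if __name__ == "__main__":
    for name, (crashed, ktau, xtime) in scan_run_dir().items():
        flag = "SIGNAL" if crashed else "ok"
        print(f"{name}: {flag}, last KTAU={ktau}, XTIME={xtime}")
```

Run from the WRF run directory, this prints one line per rank, which shows whether only one rank received the signal and how far each rank had progressed when it did.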
It's repeatable: the crash happens whether or not I use adaptive time stepping, and whether or not I use OBS nudging.
Any suggestions for how to track down this error?
Bart Brashers