[Wrf-users] Segmentation faults, DX/DY, and timekeeping errors.

Elliot tornadodrummer at yahoo.com
Wed Apr 7 13:27:53 MDT 2010

Good afternoon,

I am running into several problems with real-data, one-way nested runs using ndown in both WRFV3.0 and WRFV3.1.1.  I am running a number of different cases, and while most of them run to completion without a problem, a few abort.  When those specific cases are repeated, they fail at the same point with the same error.  One problem in WRFV3.0 occurs in ndown.exe, which aborts with the following output:

  Nesting domain
  ids,ide,jds,jde            1         151           1         151
  ims,ime,jms,jme           -4          81          -4          45
  ips,ipe,jps,jpe            1          75           1          38
  ids,ide,jds,jde            3          58           3          58
  ims,ime,jms,jme           -2          35          -2          23
  ips,ipe,jps,jpe            1          29           1          17
 -------------- FATAL CALLED ---------------
 FATAL CALLED FROM FILE:  set_timekeeping.b  LINE:     274
 WRFU_TimeSet(startTime) FAILED   Routine returned error code =           -1
[u213.uncc.edu:05171] MPI_ABORT invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 1
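
The message above only says that the start time could not be set. One check that can be done outside WRF is whether the start-time entries in the namelist form a valid calendar date for every domain. The following is a minimal sketch, not part of WRF: it assumes the values from the &time_control section (start_year, start_month, start_day, start_hour, start_minute, start_second) have already been read into Python lists, one entry per domain. The function name check_start_times and the dictionary layout are hypothetical, and an invalid or missing date is only one possible cause of this error.

```python
from datetime import datetime

# Namelist variable names from the &time_control section of namelist.input.
FIELDS = ("start_year", "start_month", "start_day",
          "start_hour", "start_minute", "start_second")

def check_start_times(time_control, max_dom):
    """Return a list of (domain_number, message) for suspicious start times.

    time_control is a hypothetical dict mapping each name in FIELDS to a
    list with one integer per domain. A missing entry is reported as a
    problem here, even though WRF may fall back to a default for it.
    """
    problems = []
    for dom in range(max_dom):
        values = []
        for name in FIELDS:
            column = time_control.get(name, [])
            if dom >= len(column):
                problems.append((dom + 1, f"{name} has no entry for this domain"))
                break
            values.append(column[dom])
        else:
            # All six fields are present; let datetime validate the date.
            try:
                datetime(*values)
            except ValueError as exc:
                problems.append((dom + 1, f"invalid date: {exc}"))
    return problems
```

For example, a start_day of 31 with a start_month of 4 would be reported for that domain, as would a variable that has only one column when max_dom is 2.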

Another problem in WRFV3.0 occurs when I run the first child nest, after ndown.exe and real.exe have completed successfully.  Immediately after wrf.exe starts for the first child nest, I get the following error:

  Parent domain
  ids,ide,jds,jde            1         151           1         151
  ims,ime,jms,jme           -4          81          -4          44
  ips,ipe,jps,jpe            1          75           1          38
 DYNAMICS OPTION: Eulerian Mass Coordinate
   med_initialdata_input: calling input_model_input
 -------------- FATAL CALLED ---------------
 FATAL CALLED FROM FILE:  input_wrf.b  LINE:     173
 DX and DY do not match from the namelist and the input file
[b227.uncc.edu:25854] MPI_ABORT invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 1
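
The error text suggests that the grid spacing in the namelist is compared with the grid spacing stored in the input file. A minimal sketch of such a comparison follows, under stated assumptions: the file values are taken to be the DX and DY global attributes of the wrfinput file (readable with ncdump -h), the namelist values are the first-column dx and dy, and the relative tolerance of 1e-5 is a guess rather than the value WRF uses. The function name grid_spacing_matches is hypothetical.

```python
def grid_spacing_matches(namelist_dx, namelist_dy, file_dx, file_dy,
                         rel_tol=1e-5):
    """Return True if the namelist and file grid spacings agree.

    All spacings are in meters. rel_tol is an assumed relative
    tolerance, not a value taken from the WRF source.
    """
    dx_ok = abs(file_dx - namelist_dx) <= rel_tol * abs(file_dx)
    dy_ok = abs(file_dy - namelist_dy) <= rel_tol * abs(file_dy)
    return dx_ok and dy_ok


# Illustrative numbers only: a namelist still carrying a coarser
# spacing while the input file holds a finer one would not match.
print(grid_spacing_matches(12000.0, 12000.0, 4000.0, 4000.0))
```

Comparing the two pairs of values by hand in this way shows whether the namelist used for the child run still carries a spacing that differs from the one written into the input file.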

Finally, wrf.exe sometimes aborts abruptly in the middle of a run with a segmentation fault.  Again, the segmentation fault occurs at exactly the same point when I repeat the same case.  The output is as follows:

[b223:12055] *** Process received signal ***
[b223:12055] Signal: Segmentation fault (11)
[b223:12055] Signal code: Address not mapped (1)
[b223:12055] Failing at address: 0xfffffffe06fcfb00
forrtl: error (78): process killed (SIGTERM)
Image              PC                Routine            Line        Source             
libgcc_s.so.1      00000033180087B0  Unknown               Unknown  Unknown
libc.so.6          00000033148E5338  Unknown               Unknown  Unknown
libopen-pal.so.0   00002B28CAB0598E  Unknown               Unknown  Unknown
libopen-pal.so.0   00002B28CAB044DE  Unknown               Unknown  Unknown
libpthread.so.0    000000331540E4C0  Unknown               Unknown  Unknown
wrf.exe            0000000001454868  Unknown               Unknown  Unknown
wrf.exe            000000000144F6E8  Unknown               Unknown  Unknown
wrf.exe            000000000144E6F4  Unknown               Unknown  Unknown
wrf.exe            000000000144C528  Unknown               Unknown  Unknown
wrf.exe            00000000011538DD  Unknown               Unknown  Unknown
wrf.exe            0000000001238307  Unknown               Unknown  Unknown
wrf.exe            0000000000C53DA2  Unknown               Unknown  Unknown
wrf.exe            00000000007BEE81  Unknown               Unknown  Unknown
wrf.exe            000000000049EAD6  Unknown               Unknown  Unknown
wrf.exe            000000000046718B  Unknown               Unknown  Unknown
wrf.exe            0000000000467019  Unknown               Unknown  Unknown
wrf.exe            0000000000466FC2  Unknown               Unknown  Unknown
libc.so.6          000000331481D974  Unknown               Unknown  Unknown
wrf.exe            0000000000466EE9  Unknown               Unknown  Unknown

Has anyone run into any of these issues, and if so, how did you fix them?



