[Wrf-users] Segmentation faults, DX/DY, and timekeeping errors.

Jorge Alejandro Arevalo Borquez jaareval at gmail.com
Wed Apr 7 17:21:00 MDT 2010


For the DX/DY problem, use ncdump (or another tool) to read the value of
DX stored in the input file (ncdump -h wrfinput_d01) and copy exactly the
same value into the namelist; sometimes, due to rounding, the two differ.
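As a sketch of that check (the DX values below are made up for illustration): WRF compares the namelist value against the file attribute exactly, so a rounded namelist entry fails even when it looks the same.

```shell
# Hypothetical values: file_dx stands for what `ncdump -h wrfinput_d01`
# reports in the :DX global attribute; namelist_dx is what was typed in
# namelist.input. WRF compares them exactly, so a rounding difference is
# enough to trigger the "DX and DY do not match" fatal error.
file_dx="12000.0002"
namelist_dx="12000"

if [ "$file_dx" = "$namelist_dx" ]; then
  echo "DX matches"
else
  echo "DX mismatch: copy $file_dx from wrfinput_d01 into namelist.input"
fi
```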

For the other problems, how large is your stack size limit? It must be large:
if you are running with MPI under bash, try ulimit -s unlimited; for OpenMP,
try ulimit -s 8000000.
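For example, in bash the limit can be raised in the shell that will launch wrf.exe (note that `ulimit -s unlimited` may be refused when the hard limit is lower, hence the fallback here):

```shell
# Raise the stack size limit before launching wrf.exe.
# `ulimit -s unlimited` (the MPI suggestion) can fail under a lower
# hard limit, so fall back to an explicit 8000000 kB (the OpenMP
# suggestion above), and finally leave the limit unchanged.
ulimit -s unlimited 2>/dev/null || ulimit -s 8000000 2>/dev/null || true
ulimit -s   # print the limit now in effect to confirm it took
```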
Regards,
Jorge Arévalo Bórquez


On Wed, Apr 7, 2010 at 3:27 PM, Elliot <tornadodrummer at yahoo.com> wrote:

> Good afternoon,
>
> I am running into a couple of problems while trying to run real-data,
> one-way nested runs (with ndown) in both WRFV3.0 and WRFV3.1.1.  I am
> running a number of different cases; while a good number of them run to
> completion without a problem, a few abort.
> When those specific cases are repeated, they fail at the same point
> with the same error.  An example of a problem I am having in WRFV3.0 is
> with ndown.exe, which is given as follows:
>
> NDOWN_EM V3.0 PREPROCESSOR
>   *************************************
>   Nesting domain
>   ids,ide,jds,jde            1         151           1         151
>   ims,ime,jms,jme           -4          81          -4          45
>   ips,ipe,jps,jpe            1          75           1          38
>   INTERMEDIATE domain
>   ids,ide,jds,jde            3          58           3          58
>   ims,ime,jms,jme           -2          35          -2          23
>   ips,ipe,jps,jpe            1          29           1          17
>   *************************************
>  -------------- FATAL CALLED ---------------
>  FATAL CALLED FROM FILE:  set_timekeeping.b  LINE:     274
>  WRFU_TimeSet(startTime) FAILED   Routine returned error code =
> -1
>  -------------------------------------------
> [u213.uncc.edu:05171] MPI_ABORT invoked on rank 0 in communicator
> MPI_COMM_WORLD with errorcode 1
>
>
> Another problem I am having in WRFV3.0 occurs when trying to run the first
> child nest, after ndown.exe and real.exe have completed successfully.
> Immediately upon the start of the first child wrf.exe, I get the following
> error:
>
>
> WRF V3.0 MODEL
>   *************************************
>   Parent domain
>   ids,ide,jds,jde            1         151           1         151
>   ims,ime,jms,jme           -4          81          -4          44
>   ips,ipe,jps,jpe            1          75           1          38
>   *************************************
>  DYNAMICS OPTION: Eulerian Mass Coordinate
>    med_initialdata_input: calling input_model_input
>  -------------- FATAL CALLED ---------------
>  FATAL CALLED FROM FILE:  input_wrf.b  LINE:     173
>  DX and DY do not match from the namelist and the input file
>  -------------------------------------------
> [b227.uncc.edu:25854] MPI_ABORT invoked on rank 0 in communicator
> MPI_COMM_WORLD with errorcode 1
>
>
> Finally, I am running into problems with WRF.exe abruptly aborting in the
> middle of a run, ending with a segmentation fault.  Again, these
> segmentation faults occur in the exact same place when I attempt to repeat
> the same case.  The segmentation fault is given as follows:
>
> [b223:12055] *** Process received signal ***
> [b223:12055] Signal: Segmentation fault (11)
> [b223:12055] Signal code: Address not mapped (1)
> [b223:12055] Failing at address: 0xfffffffe06fcfb00
> forrtl: error (78): process killed (SIGTERM)
> Image              PC                Routine            Line        Source
> libgcc_s.so.1      00000033180087B0  Unknown               Unknown  Unknown
> libc.so.6          00000033148E5338  Unknown               Unknown  Unknown
> libopen-pal.so.0   00002B28CAB0598E  Unknown               Unknown  Unknown
> libopen-pal.so.0   00002B28CAB044DE  Unknown               Unknown  Unknown
> libpthread.so.0    000000331540E4C0  Unknown               Unknown  Unknown
> wrf.exe            0000000001454868  Unknown               Unknown  Unknown
> wrf.exe            000000000144F6E8  Unknown               Unknown  Unknown
> wrf.exe            000000000144E6F4  Unknown               Unknown  Unknown
> wrf.exe            000000000144C528  Unknown               Unknown  Unknown
> wrf.exe            00000000011538DD  Unknown               Unknown  Unknown
> wrf.exe            0000000001238307  Unknown               Unknown  Unknown
> wrf.exe            0000000000C53DA2  Unknown               Unknown  Unknown
> wrf.exe            00000000007BEE81  Unknown               Unknown  Unknown
> wrf.exe            000000000049EAD6  Unknown               Unknown  Unknown
> wrf.exe            000000000046718B  Unknown               Unknown  Unknown
> wrf.exe            0000000000467019  Unknown               Unknown  Unknown
> wrf.exe            0000000000466FC2  Unknown               Unknown  Unknown
> libc.so.6          000000331481D974  Unknown               Unknown  Unknown
> wrf.exe            0000000000466EE9  Unknown               Unknown  Unknown
>
>
> Has anyone had any of these same issues, and if so, what have you done to
> fix them?
>
> Thanks,
>
> Elliot.
>
>
> _______________________________________________
> Wrf-users mailing list
> Wrf-users at ucar.edu
> http://mailman.ucar.edu/mailman/listinfo/wrf-users
>
>