[Wrf-users] Runtime issues

Brian Smith brs at usf.edu
Mon Jan 14 09:44:15 MST 2008


Hello All,

I'm sending this to the list as well as the forums, since this is a 
somewhat urgent matter for the person I am helping with this 
problem.

We recently upgraded our compute nodes from CentOS 4.5 to CentOS 5.1. 
We've also updated our compiler from PGI 7.0-u2 to 7.0-u7. We've built 
and run the code successfully with OpenMPI 1.2.3 and 1.2.4.  We also 
compiled wrf.exe and tried to run it in parallel, but it fails with the 
following error:

"Timing for Writing wrfout_d01_2006-04-03_00:00:00 for domain 1: 1.06300 elapsed seconds.
2 input_wrf: wrf_get_next_time current_date: 2006-04-03_00:00:00 Status = -11
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE: input_wrf.b LINE: 491
... Could not find matching time in input file 
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
-------------------------------------------"

There did not appear to be any errors in the compile.log file. It 
appears to me that whatever data is being read above is not coming from 
the correct memory location. Serial execution works correctly.  We've 
tried reverting to the older libraries with some LD_LIBRARY_PATH voodoo, 
so that the executable is compiled and linked against known-good MPI 
libraries.  We've also recompiled with -O1 -g for debugging and to rule 
out any optimizations that might be causing this problem.  Loading up 
the debugger with core files tends to take some time ;)  So far, the 
only interesting finding was a SIGILL caused by an SSE3 instruction 
executed on a revision 1 Opteron system.  Recompiling with -tp=amd64 
should have fixed that, but no dice.
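
In case it helps anyone check which nodes are affected by the SSE3 
issue, here is a minimal sketch, assuming x86 Linux nodes where 
/proc/cpuinfo reports SSE3 as the "pni" flag. The function name 
has_sse3 is made up for illustration and is not part of WRF or PGI.

```python
def has_sse3(cpuinfo_text):
    """Return True only if every 'flags' line in the given
    /proc/cpuinfo text lists 'pni' (the Linux name for SSE3)."""
    # One 'flags' line is printed per logical processor on x86 Linux.
    flag_lines = [line for line in cpuinfo_text.splitlines()
                  if line.startswith("flags")]
    if not flag_lines:
        # No flags lines at all: treat as "cannot confirm SSE3".
        return False
    for line in flag_lines:
        # Everything after the first ':' is a space-separated flag list.
        flags = line.split(":", 1)[1].split()
        if "pni" not in flags:
            return False
    return True
```

On a node, this could be called as has_sse3(open("/proc/cpuinfo").read()). 
A False result would be consistent with the SIGILL described above, 
though it says nothing about the wrf_get_next_time failure itself.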

Let me know if there is any more information that I can provide and 
thanks in advance.

Brian Smith



