[Wrf-users] Runtime Problem running WRF2.1.2 on multi processors
Diego M. Vadell
dvadell at linuxclusters.com.ar
Thu Sep 14 19:30:45 MDT 2006
Hi Jayanthi,
Did you follow PGI's instructions at
http://www.pgroup.com/resources/wrf/wrfv2_pgi52.htm? I used v6 of
their compilers and it worked OK.
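
If rebuilding does not settle it, it may help to see which process first reports bad values. The following is only a minimal sketch in Python, not part of WRF: it looks for the "OUT OF BOUNDS" and nan markers shown in the fort.98 excerpt and reports the first such line per file. The function names are made up for illustration, and the rsl.out.* / rsl.error.* patterns assume one such log per MPI process in the run directory.

```python
import re
from pathlib import Path

# Markers taken from the fort.98 excerpt: NaN values and "OUT OF BOUNDS".
# The look-arounds avoid matching "nan" inside ordinary words.
BAD = re.compile(r"(?<![A-Za-z])-?nan(?![A-Za-z])|OUT OF BOUNDS", re.IGNORECASE)


def first_bad_line(path):
    """Return (line_number, text) of the first suspicious line, or None."""
    with open(path, errors="replace") as fh:
        for number, line in enumerate(fh, start=1):
            if BAD.search(line):
                return number, line.rstrip("\n")
    return None


def scan(run_dir=".", patterns=("fort.98", "rsl.out.*", "rsl.error.*")):
    """Map each matching file name to its first suspicious line.

    The default patterns are an assumption about how the run directory
    is laid out; adjust them to match the files actually present.
    """
    hits = {}
    for pattern in patterns:
        for path in sorted(Path(run_dir).glob(pattern)):
            hit = first_bad_line(path)
            if hit is not None:
                hits[path.name] = hit
    return hits


if __name__ == "__main__":
    for name, (number, text) in scan().items():
        print(f"{name}:{number}: {text.strip()}")
```

Run from the directory that holds wrf.exe after a failed run. If only some rsl files show a hit, that would suggest the bad values start on particular patches of the domain rather than everywhere at once.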
Hope it helps.
-- Diego.
On Thu, 14 Sep 2006 16:08:05 -0500
"Srikishen, Jayanthi" <Jayanthi.Srikishen at msfc.nasa.gov> wrote:
> Hi Wrf-users !
>
>
> I'm trying to run the WRF model on a Linux cluster (Intel processors).
> I successfully built real.exe and wrf.exe.
>
> ./real.exe worked fine
>
> mpirun -np 4 ./wrf.exe runs for 2 time steps and aborts. fort.98 contains
> OUT OF BOUNDS messages and NaNs.
>
> I've tried with mpirun -np 1 ./wrf.exe and it works FINE.
>
> Could you tell me what is causing the problem ?
>
> Compiler switches and other system-related info are given below, along
> with the rsl.out.0000 output and a few lines of fort.98 output.
>
> ************************************************************************
> setenv MP_STACK_SIZE 64000000
> setenv WRF_EM_CORE 1
> limit stacksize unlimited
>
> mpif90 -V = pgf90 5.2-2
> netcdf = netcdf-3.6.1_GCC_PGI5.2 (portland group compiler-fortran)
> mpich = mpich2-1.0.3_GCC_PGI5.2
> mpich = mpich-1.2.7p1_GCC_PGI5.2 (tried with this version also)
>
> WRF = WRFV2.1.2
> wrfsi = wrfsi_v2.1.2
> uname -rs = Linux 2.6.9-34.0.1.ELsmp
>
> #### Architecture specific settings ####
>
> # Settings for PC Linux i486 i586 i686, PGI compiler DM-Parallel (RSL, MPICH, Allows nesting)
> #
> # Notes: for experimental implementation of moving nests, add -DMOVE_NESTS to ARCHFLAGS
> # for experimental implementation of vortex tracking nests, add -DMOVE_NESTS -DVORTEX_CENTER to ARCHFLAGS
> #
>
> DMPARALLEL = 1
> MAX_PROC = 1024
> FC = /rstor17/sriki/mpich-1.2.7p1/bin/mpif90 -f90=pgf90 -Bstatic (with and without static option)
> LD = /rstor17/sriki/mpich-1.2.7p1/bin/mpif90 -f90=pgf90 -Bstatic (with and without static option)
> CC = /rstor17/sriki/mpich-1.2.7p1/bin/mpicc -cc=gcc -static -DMPI2_SUPPORT -DFSEEKO64_OK
> SCC = gcc
> SFC = pgf90
> RWORDSIZE = $(NATIVE_RWORDSIZE)
> PROMOTION = -r$(RWORDSIZE) -i4
> CFLAGS = -DDM_PARALLEL -DWRF_RSL_IO \
> -DMAXDOM_MAKE=$(MAX_DOMAINS) -DMAXPROC_MAKE=$(MAX_PROC) -I../external/RSL/RSL \
> -I/rstor17/sriki/mpich-1.2.7p1/include
> FCOPTIM = -O2 # -fast # ALSO TRIED WITH -O0
> FCDEBUG = #-g
> #FCBASEOPTS = -w -byteswapio -Ktrap=fp -Mfree -tp p6 $(FCDEBUG)
> FCBASEOPTS = -w -byteswapio -Mfree -tp p6 $(FCDEBUG) # -Mlfs
> FCFLAGS = $(FCOPTIM) $(FCBASEOPTS)
> ARCHFLAGS = -DDEREF_KLUDGE -DIO_DEREF_KLUDGE -DGRIB1 -DINTIO -DWRF_RSL_IO -DRSL -DDM_PARALLEL \
> -DIWORDSIZE=4 -DDWORDSIZE=8 -DRWORDSIZE=$(RWORDSIZE) -DLWORDSIZE=4 -DNETCDF \
> -DTRIEDNTRUE \
> -DLIMIT_ARGS
> INCLUDE_MODULES = -module ../main -I../external/io_netcdf -I../external/io_int -I../external/esmf_time_f90 \
> -I../external -I../frame -I../share -I../phys -I../chem -I../inc \
> /rstor17/sriki/mpich-1.2.7p1/include
> PERL = perl
> REGISTRY = Registry
> LIB = -L../external/io_netcdf -lwrfio_nf -L/usr/local/netcdf/lib -lnetcdf -L../external/RSL/RSL -lrsl \
> -L../external/io_grib1 -lio_grib1 \
> -L../external/io_int -lwrfio_int \
> -L/rstor17/sriki//mpich-1.2.7p1/lib -lmpichf90 \
> ../frame/module_internal_header_util.o ../frame/pack_utils.o -L../external/esmf_time_f90 -lesmf_time
> LDFLAGS = -byteswapio $(FCFLAGS)
> ENVCOMPDEFS =
> WRF_CHEM = 0
> CPP = /lib/cpp -C -P -traditional
> POUND_DEF = -DNO_RRTM_PHYSICS -traditional $(COREDEFS) -DNONSTANDARD_SYSTEM -DF90_STANDALONE -DCONFIG_BUF_LEN=$(CONFIG_BUF_LEN) -DMAX_DOMAINS_F=$(MAX_DOMAINS)
> CPPFLAGS = -I$(LIBINCLUDE) -C -P $(ARCHFLAGS) -I../external/RSL/RSL -C -P `cat ../inc/dm_comm_cpp_flags` $(ENVCOMPDEFS) $(POUND_DEF)
> AR = ar ru
> M4 = m4
> RANLIB = ranlib
> NETCDFPATH = /usr/local/netcdf
> CC_TOOLS = cc
> ************************************************************************
> mpirun -np 4 ./wrf.exe
>
> The program aborts with the following message:
> rm_l_2_19671: (28.535918) net_send: could not write to fd=5, errno = 32
>
> tail rsl.out.0000
>
> STEPRA,STEPCU,STEPBL 7 3 1
> Timing for Writing wrfout_d01_2005-11-20_06:00:00 for domain 1: 4.25400 elapsed seconds.
> Timing for processing lateral boundary for domain 1: 0.84300 elapsed seconds.
> WRF NUMBER OF TILES = 1
> Timing for main: time 2005-11-20_06:01:30 on domain 1: 16.37400 elapsed seconds.
> Timing for main: time 2005-11-20_06:03:00 on domain 1: 5.29800 elapsed seconds.
>
>
> more fort.98
>
>
> **** OUT OF BOUNDS *********
> **** OUT OF BOUNDS *********
> **** OUT OF BOUNDS *********
> **** OUT OF BOUNDS *********
> LFS,LDB,LDT = 26 24 24 TIMEC, TADVEC, NSTEP= 3600. 4308. 2NCOUNT, FABE, AINC= 1 1.000 nan
>
> P(LC), DTP, WKL, WKLCL = 590.3015 -nan -1.8851364E-02 2.0000000E-02
> TLCL, DTLCL, DTRH, TENV = 266.8293 0.0000000 0.0000000 -nan
> KLCL=25 ZLCL= 5248.4M DTLCL= 0.00 LTOP=36 P0(LTOP)=122.8MB FRZ LV= 0 TMIX=-0.7 PMIX= 577.0 QMIX= 4.5 CAPE= nan
> P0(LET) = 122.8 P0(LTOP) = 122.8 VMFLCL = -nan PLCL = -nan WLCL = 1.000 CLDHGT = 10237.9
> PEF(WS)=0.90(CB)=0.31LC,LET= 23 36WKL=-0.019VWS= 0.66
> PRECIP EFFICIENCY = nan
> LFS,LDB,LDT = 26 24 24 TIMEC, TADVEC, NSTEP= 3600. 4308. 2NCOUNT, FABE, AINC= 1 1.000 nan
> P DP DT K/D DR K/D OMG DOMGDP UMF UER UDR DMF DER DDR EMS W0 DETLQ DETIC
> just before DO 300...
> 122.76 45.42 -17.29 -nan nan nan 0.00 0.000 -nan 0.00 0.000 0.000 6.000 0.852 nan nan
> 168.13 45.42 26.98 -nan nan nan -nan -nan -nan 0.00 0.000 0.000 6.000 1.578 nan nan
>
> ************************************************************************
>
> Thanks
> Jayanthi
>
>
>
>