[cam-users] bug report
Xuejin Zhang
xuejin_zhang at ncsu.edu
Fri Jul 30 12:35:20 MDT 2004
Dear CAM supporting scientist(s),
I tried to run CAM3.0 on an IBM p690 (
http://www.ncsu.edu/itd/hpc/Hardware/Hardware.php ), but the run stops
before it completes. The problems are:
1. After several hundred steps the program stops. The stopping point seems
arbitrary, around steps 800-1000, and differs between submissions even
though the namelist and the cam executable are the same.
2. Restart runs do not work.
The attached files are my config_cache.xml, namelist, and standard output.
BTW, one-day runs and 3-day continuous runs complete fine.
Thank you for your assistance.
-Xuejin
--
********************************************
* Xuejin Zhang *
* North Carolina State University *
* Dept. of Marine, Earth & Atmos. Sciences *
* Campus Box 8208 *
* Raleigh, NC 27606-8208 *
* Tel:919-513-2325 *
********************************************
-------------- next part --------------
&camexp
absems_data = '/ncsu/xuejin/CAM3.0/inputdata/atm/cam2/rad/abs_ems_factors_fastvx.c030508.nc'
aeroptics = '/ncsu/xuejin/CAM3.0/inputdata/atm/cam2/rad/AerosolOptics_c040105.nc'
bndtvaer = '/ncsu/xuejin/CAM3.0/inputdata/atm/cam2/rad/AerosolMass_V_64x128_clim_c031022.nc'
bndtvdms = '/ncsu/xuejin/CAM3.0/inputdata/atm/cam2/scyc/DMS_emissions_64x128_c030722.nc'
bndtvghg = '/ncsu/xuejin/CAM3.0/inputdata/atm/cam2/ggas/ghg_1870_2100_c040122.nc'
bndtvo = '/ncsu/xuejin/CAM3.0/inputdata/atm/cam2/ozone/pcmdio3.r8.64x1_L60_clim_c970515.nc'
bndtvoxid = '/ncsu/xuejin/CAM3.0/inputdata/atm/cam2/scyc/oxid_3d_64x128_L26_c030722.nc'
bndtvs = '/ncsu/xuejin/CAM3.0/inputdata/atm/cam2/sst/sst_HadOIBl_bc_64x128_clim_c020411.nc'
bndtvscon = '/ncsu/xuejin/CAM3.0/inputdata/atm/cam2/rad/scon_1870_2100_c040122.nc'
bndtvsox = '/ncsu/xuejin/CAM3.0/inputdata/atm/cam2/scyc/SOx_emissions_64x128_L2_1870-1871_c040520.nc'
bndtvvolc = '/ncsu/xuejin/CAM3.0/inputdata/atm/cam2/rad/VolcanicMass_1870-1999_64x1_L18_c040115.nc'
caseid = 'camrun'
iyear_ad = 1996
ncdata = '/ncsu/xuejin/CAM3.0/inputdata/atm/cam2/inic/gaus/cami_0000-09-01_64x128_L26_c030918.nc'
nelapse = -490
nsrest = 0
mss_irt = 0
/
&clmexp
finidat = '/ncsu/xuejin/CAM3.0/inputdata/lnd/clm2/inidata_2.1/cam/clmi_0000-09-01_64x128_T42_USGS_c030609.nc'
fpftcon = '/ncsu/xuejin/CAM3.0/inputdata/lnd/clm2/pftdata/pft-physiology'
fsurdat = '/ncsu/xuejin/CAM3.0/inputdata/lnd/clm2/srfdata/cam/clms_64x128_USGS_c030605.nc'
/
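For reference, the requested run length can be compared with the step at which the run stops. The sketch below is only an illustration: it assumes that a negative nelapse counts days and a positive one counts time steps, and that the model time step is 1200 s (neither is stated in the namelist above). The helper names parse_namelist and requested_steps are hypothetical, and only the non-path entries of &camexp are repeated.

```python
# Abbreviated copy of the &camexp group above (file paths omitted).
NAMELIST_TEXT = """\
&camexp
 caseid = 'camrun'
 iyear_ad = 1996
 nelapse = -490
 nsrest = 0
 mss_irt = 0
/
"""


def parse_namelist(text):
    """Parse simple 'key = value' namelist lines into {group: {key: value}}."""
    groups, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("&"):
            # '&name' opens a namelist group.
            current = groups.setdefault(line[1:].lower(), {})
        elif line == "/":
            # '/' closes the current group.
            current = None
        elif current is not None and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            value = value.rstrip(",")
            if value.startswith("'") and value.endswith("'"):
                current[key.lower()] = value[1:-1]   # quoted string
            else:
                current[key.lower()] = int(value)    # integers only in this sketch
    return groups


def requested_steps(nelapse, dtime_seconds=1200):
    """Convert nelapse to time steps.

    ASSUMPTION: negative values are days, positive values are time steps,
    and dtime_seconds is the model time step (1200 s is assumed here).
    """
    if nelapse < 0:
        return -nelapse * 86400 // dtime_seconds
    return nelapse


camexp = parse_namelist(NAMELIST_TEXT)["camexp"]
steps = requested_steps(camexp["nelapse"])
print(f"nelapse = {camexp['nelapse']} -> {steps} time steps requested")
```

Under those assumptions the request is far longer than the 800-1000 steps reached before the run stops.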
-------------- next part --------------
t_setoption: option disabled: Usr Sys
t_setoption: option disabled: Usr Sys
t_setoption: option disabled: Usr Sys
t_setoption: option disabled: Usr Sys
4 pes participating in computation
-----------------------------------
NODE# NAME
0 mcrae.ncsc.org
1 mcrae.ncsc.org
2 mcrae.ncsc.org
3 mcrae.ncsc.org
------------------------------------------------------------
NCAR Community Atmospheric Model (CAM)
$Name: cam3_0_brnchT_release01 $
$Date: 2004/05/20 18:36:01 $
------------------------------------------------------------
(Online documentation is available on the CAM
home page: http://www.ccsm.ucar.edu/models/atm-cam/
License information is available as a link from above or from:
home page: http://www.ccsm.ucar.edu/models/atm-cam/license.html)
------------------------------------------------------------
DATE 07/29/04 TIME 11:49:22
------------------------------------------------------------
DYCORE is EUL
scon set to fixed value of 1367000.00000000000
Filename specifier for tape 1 = %c.cam2.h%t.%y-%m.nc
Filename specifier for tape 2 = %c.cam2.h%t.%y-%m-%d-%s.nc
Filename specifier for tape 3 = %c.cam2.h%t.%y-%m-%d-%s.nc
Filename specifier for tape 4 = %c.cam2.h%t.%y-%m-%d-%s.nc
Filename specifier for tape 5 = %c.cam2.h%t.%y-%m-%d-%s.nc
Filename specifier for tape 6 = %c.cam2.h%t.%y-%m-%d-%s.nc
AEROSOL_SETOPTS: prognostic sulfur aerosols are off
AEROSOL_SETOPTS: feedback of prognostic sulfur aerosols is disabled
AEROSOL_SETOPTS: prognostic carbon aerosols are off
AEROSOL_SETOPTS: feedback of prognostic carbon aerosols is disabled
AEROSOL_SETOPTS: prognostic sea salt aerosols are off
AEROSOL_SETOPTS: feedback of prognostic sea salt aerosols is disabled
READ_NAMELIST:rest_pfile= /home/edu/xuejin/cam2.camrun.rpointer
------------------------------------------
*** INPUT VARIABLES (CAMEXP) ***
------------------------------------------
Initial run
********** CASE = camrun **********
SPMDBUF: Allocating SPMD buffers of size 1790976
SPMDBUF: Allocating SPMD buffers of size 1790976
SPMDBUF: Allocating SPMD buffers of size 1790976
Initial dataset is: /ncsu/xuejin/CAM3.0/inputdata/atm/cam2/inic/gaus/cami_0000-09-01_64x128_L26_c030918.nc
History-file archive directory = /XUEJIN/csm/camrun/atm/hist/
Restart-file archive directory = /XUEJIN/csm/camrun/atm/rest/
Initial-file archive directory = /XUEJIN/csm/camrun/atm/init/
Time-variant boundary dataset (sst) is: /ncsu/xuejin/CAM3.0/inputdata/atm/cam2/sst/sst_HadOIBl_bc_64x128_clim_c020411.nc
Time-variant boundary dataset (ozone) is: /ncsu/xuejin/CAM3.0/inputdata/atm/cam2/ozone/pcmdio3.r8.64x1_L60_clim_c970515.nc
Time-invariant (absorption/emissivity) factor dataset is: /ncsu/xuejin/CAM3.0/inputdata/atm/cam2/rad/abs_ems_factors_fastvx.c030508.nc
Time-variant boundary dataset (aerosols) is: /ncsu/xuejin/CAM3.0/inputdata/atm/cam2/rad/AerosolMass_V_64x128_clim_c031022.nc
Time-variant boundary dataset (carbonscale) is: bndtvcarbonscale
Time-variant boundary dataset (solar constant) is: /ncsu/xuejin/CAM3.0/inputdata/atm/cam2/rad/scon_1870_2100_c040122.nc
Time-variant boundary dataset (greenhouse gas surface values) is: /ncsu/xuejin/CAM3.0/inputdata/atm/cam2/ggas/ghg_1870_2100_c040122.nc
Time-variant boundary dataset (volcanics) is: /ncsu/xuejin/CAM3.0/inputdata/atm/cam2/rad/VolcanicMass_1870-1999_64x1_L18_c040115.nc
Aerosol Optics dataset is: /ncsu/xuejin/CAM3.0/inputdata/atm/cam2/rad/AerosolOptics_c040105.nc
Time-variant boundary dataset (carbon emissions) is: co_emis
Time-variant boundary dataset (DMS emissions) is: /ncsu/xuejin/CAM3.0/inputdata/atm/cam2/scyc/DMS_emissions_64x128_c030722.nc
Time-variant boundary dataset (soil erodibility) is: soil_erod
Time-variant boundary dataset (oxidants) is: /ncsu/xuejin/CAM3.0/inputdata/atm/cam2/scyc/oxid_3d_64x128_L26_c030722.nc
Time-variant boundary dataset (SOx emissions) is: /ncsu/xuejin/CAM3.0/inputdata/atm/cam2/scyc/SOx_emissions_64x128_L2_1870-1871_c040520.nc
READ_NAMELIST3:rest_pfile=/home/edu/xuejin/cam2.camrun.rpointer
Restart pointer file is: /home/edu/xuejin/cam2.camrun.rpointer
Restart flag (NSREST) 0=no,1=yes,3=branch 0
Output files will NOT be disposed to Mass Store
Initial conditions history files will be written yearly.
Time filter coefficient (EPS) 0.060
DEL2 Horizontal diffusion coefficient (DIF2) 0.250E+06
DEL4 Horizontal diffusion coefficient (DIF4) 0.100E+17
Number of levels Courant limiter applied 5
Lowest level for dry adiabatic adjust (NLVDRY) 3
Frequency of Shortwave Radiation calc. (IRADSW) 3
Frequency of Longwave Radiation calc. (IRADLW) 3
Frequency of Absorptivity/Emissivity calc. (IRADAE) 36
Frequency of SST Initialization calc. (ITSST) 1
SST dataset will be reused for each model year
Snow will accumulate to a maximum over sea-ice
ICE dataset will be reused for each model year
OZONE dataset will be reused for each model year
Output files will be disposed ASYNCHRONOUSLY
divergence damper NOT invoked
Visible optical depth (tauback) = 0.000000000000000000E+00
(shr_orb_print) Orbital parameters calculated for year: AD 1996
------------------------------------------
ISCCP calcs and history IO will NOT be done
Problem factors: 2** 6 * 3** 0 * 5** 0
procid 0 assigned 16 latitude values from 1 through 16
procid 1 assigned 16 latitude values from 17 through 32
procid 2 assigned 16 latitude values from 33 through 48
procid 3 assigned 16 latitude values from 49 through 64
-----------------------------------------
Number of lats passed north & south = 3
Node Partition Extended Partition
-----------------------------------------
0 1- 16 -2- 19
1 17- 32 14- 35
2 33- 48 30- 51
3 49- 64 46- 67
*** Original Courant limit exceeded at k,lat= 2 2 (estimate = 1.073) ***
NSTEP = 847 8.873557234594424E-05 7.214638342030866E-06 253.131 9.84634E+04 2.461509048740180E+01 1.14 0.19
nstep, te 848 3343945943.82261229 2.26178480923175806 -0.225758616495982359E-03 98463.3810348214902
COURLIM: *** Courant limit exceeded at k,lat= 1 1 (estimate = 1.136), solution has been truncated to wavenumber 36 ***
COURLIM: *** Courant limit exceeded at k,lat= 2 2 (estimate = 1.067), solution has been truncated to wavenumber 39 ***
*** Original Courant limit exceeded at k,lat= 1 1 (estimate = 1.136) ***
*** Original Courant limit exceeded at k,lat= 2 2 (estimate = 1.067) ***
NSTEP = 848 8.873643110863055E-05 7.224214470055816E-06 253.131 9.84634E+04 2.461442050169447E+01 1.14 0.20
TID HOST_NAME COMMAND_LINE STATUS TERMINATION_TIME
==== ========== ================ ======================= ===================
0001 mcrae.ncsc poejob ./cam Signaled (SIGUSR2) 07/29/2004 12:19:47
0002 mcrae.ncsc poejob ./cam Killed by PAM (SIGTERM) 07/29/2004 12:19:47
0003 mcrae.ncsc poejob ./cam Killed by PAM (SIGTERM) 07/29/2004 12:19:47
0004 mcrae.ncsc poejob ./cam Killed by PAM (SIGTERM) 07/29/2004 12:19:47
I am going to do poe kill then exit
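The standard output above carries enough information to put numbers on each failed submission. The following is a minimal sketch, not part of CAM: it reads the DATE banner, the last NSTEP line, the COURLIM messages, and the termination time from a log excerpt copied from the output above. The function name summarize_run is hypothetical, and the 1200 s time step used to convert steps to simulated days is an assumption.

```python
import re
from datetime import datetime

# A few lines copied from the standard output above.
LOG_EXCERPT = """\
DATE 07/29/04 TIME 11:49:22
NSTEP = 847 8.873557234594424E-05 7.214638342030866E-06 253.131 9.84634E+04 2.461509048740180E+01 1.14 0.19
COURLIM: *** Courant limit exceeded at k,lat= 1 1 (estimate = 1.136), solution has been truncated to wavenumber 36 ***
COURLIM: *** Courant limit exceeded at k,lat= 2 2 (estimate = 1.067), solution has been truncated to wavenumber 39 ***
NSTEP = 848 8.873643110863055E-05 7.224214470055816E-06 253.131 9.84634E+04 2.461442050169447E+01 1.14 0.20
0001 mcrae.ncsc poejob ./cam Signaled (SIGUSR2) 07/29/2004 12:19:47
"""


def summarize_run(log_text, dtime_seconds=1200):
    """Summarize one run log.

    ASSUMPTION: dtime_seconds is the model time step; 1200 s is assumed.
    """
    # Start of the run: the 'DATE mm/dd/yy TIME hh:mm:ss' banner.
    start = re.search(r"DATE (\d\d/\d\d/\d\d) TIME (\d\d:\d\d:\d\d)", log_text)
    # End of the run: first 'mm/dd/yyyy hh:mm:ss' stamp in the task status table.
    end = re.search(r"(\d\d/\d\d/\d{4}) (\d\d:\d\d:\d\d)", log_text)
    started = datetime.strptime(" ".join(start.groups()), "%m/%d/%y %H:%M:%S")
    ended = datetime.strptime(" ".join(end.groups()), "%m/%d/%Y %H:%M:%S")

    # Highest step number printed on an 'NSTEP =' line.
    last_step = max(int(n) for n in re.findall(r"NSTEP\s*=\s*(\d+)", log_text))

    return {
        "last_step": last_step,
        "simulated_days": last_step * dtime_seconds / 86400,
        "wall_seconds": int((ended - started).total_seconds()),
        "courant_truncations": log_text.count("COURLIM:"),
    }


summary = summarize_run(LOG_EXCERPT)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Running the same summary over the output of several failed submissions would show whether the stopping point tracks the step count or the elapsed wall-clock time.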