[Dart-dev] [6392] DART/trunk/models/CESM/shell_scripts: Committing the functionality wherein the CESM multi-instance setup

nancy at ucar.edu nancy at ucar.edu
Mon Aug 12 10:10:09 MDT 2013


Revision: 6392
Author:   thoar
Date:     2013-08-12 10:10:09 -0600 (Mon, 12 Aug 2013)
Log Message:
-----------
Committing the functionality wherein the CESM multi-instance setup
and DART setup are separate.

CESM1_1_1_initial.csh        sets up a hybrid case from disjoint initial/restart conditions.
CESM1_1_1_continuation.csh   sets up a hybrid case from a previous reference case.
CESM_DART_config             gets copied to the CASEROOT but not run. Configures for DART.
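
A minimal sketch of the "copied but not run" step, assuming CESM_DART_config
carries the literal placeholder BOGUS_DART_ROOT_STRING, which the setup scripts
replace with the DART location while copying. The file names
demo_config.template and demo_config, the one-line template content, and the
example path are hypothetical stand-ins for illustration only. The commands are
kept plain enough to run under either csh or sh.

```shell
# Hypothetical one-line stand-in for CESM_DART_config (the real contents differ).
printf 'setenv DARTROOT BOGUS_DART_ROOT_STRING\n' > demo_config.template

# Substitute the placeholder. '#' is the sed delimiter because the
# replacement is a path containing '/'. The path is an example value only.
sed -e "s#BOGUS_DART_ROOT_STRING#/glade/u/home/someuser/svn/DART/trunk#" \
    < demo_config.template > demo_config

# Make the staged copy executable so it can be run at a later date.
chmod 755 demo_config
cat demo_config
```

In the setup scripts the substituted copy is written into the CASEROOT, where
it can be run later to configure the case for DART.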

Modified Paths:
--------------
    DART/trunk/models/CESM/shell_scripts/CESM1_1_1_continuation.csh
    DART/trunk/models/CESM/shell_scripts/CESM1_1_1_initial.csh
    DART/trunk/models/CESM/shell_scripts/CESM_DART_config

-------------- next part --------------
Modified: DART/trunk/models/CESM/shell_scripts/CESM1_1_1_continuation.csh
===================================================================
--- DART/trunk/models/CESM/shell_scripts/CESM1_1_1_continuation.csh	2013-08-09 21:27:22 UTC (rev 6391)
+++ DART/trunk/models/CESM/shell_scripts/CESM1_1_1_continuation.csh	2013-08-12 16:10:09 UTC (rev 6392)
@@ -10,29 +10,60 @@
 # Purpose
 # ---------------------
 #
-# This version of the setup script is designed for a B compset where
-# CAM, POP, and CLM are all active and going to be assimilated separately.
-# It also differs from other versions of this script in that it is going
-# to stop CESM every 6 hours.  It will assimilate into CAM (but not POP
-# or CLM) every time, and all components at 0Z.
+# This script is designed to set up, stage, and build a multi-instance run of CESM
+# using a B compset where CAM, POP, and CLM are all active. The initial states for
+# the models come from a single reference case, so a CESM hybrid setup is used.
 #
-# It depends on a couple patched utility files in ~nancy/cesm1_1_1
-# Copy them over to your own $HOME/cesm1_1_1 dir before running this script.
+# POP: uses the result of the 'b40.20th.005_ens${instance}' experiments. The POP
+#      restart files were saved as binary files, which is somewhat problematic for
+#      data assimilation. Consequently, the entire model state must be advanced for
+#      several days before a viable netCDF restart file can be produced. We advance
+#      for 72 hours.
 #
-# This script is designed to configure and build a multi-instance CESM model.
-# It will use DART to assimilate observations at regular intervals.
-# This script will build the DART executables first if they are not found.
-# 
-# Until the binary POP restart files have been converted from big-endian
-# to little-endian, you MUST COMPILE DART with intel and -convert big-endian
-# contact dart at ucar.edu if you want to use another compiler.
+# RTM: uses the result of the 'b40.20th.005_ens${instance}' experiments. These
+#      restart files are actually the CLM restart files from this experiment because
+#      the RTM was part of CLM when the experiment was run. The standalone version of
+#      RTM can read old CLM restart files.
 #
+# CICE: uses the result of the 'b40.20th.005_ens${instance}' experiments.
+#
+# CLM: uses the result of the 'b40.20th.005_ens${instance}' experiments - sort of.
+#      CLM has changed since then and the CLM restart files needed to be converted
+#      to the new format. I ran 'interpinic' to (essentially) reformat the files
+#      and changed the CASENAME to 'cesm_test' and used the multi-instance naming
+#      convention for the new files.
+#
+# CAM: We want to use a newer version of CAM than was used for the b40.20th.005
+#      experiments. Consequently, we took a SINGLE instance of CAM and replicated
+#      it for all the desired instances. After just a few days, the differences in
+#      the ocean and land states will induce enough variability in the CAM states.
+#
+# Much of the complexity comes from ensuring compatibility between the namelists
+# for each instance and staging of the files. The original experiments were run
+# before the multi-instance capability was developed and the naming convention decided.
+# Consequently, there is a lot of manipulation of the 'instance' portion of the
+# filenames.
+#
+# This script results in a viable setup for a CESM multi-instance experiment. You
+# are STRONGLY encouraged to run the multi-instance CESM a few times and experiment
+# with different settings BEFORE you try to assimilate observations. The data
+# volume is quite large and you should become comfortable using CESM's restart
+# capability to re-stage files in your RUN directory.
+#
+# To perform assimilation, it is required only to insert a few dozen lines into the
+# CASEROOT/*.run script. This, and the required setup, is performed in
+# CESM_DART_config - which can be run at a later date. e.g. you can use this
+# script to advance this ensemble from 2004-01-01 to 2004-02-01 and then run
+# CESM_DART_config to augment the existing run script, modify STOP_N to 24 hours,
+# and start assimilating observations at midnight from 2004-02-01 on ...
+#
 # This script relies heavily on the information in:
 # http://www.cesm.ucar.edu/models/cesm1.1/cesm/doc/usersguide/book1.html
 #
 # ---------------------
 # How to set up the script
 # ---------------------
+#
 # -- Either edit and run this script in the $DART/models/CESM/shell_scripts
 #    directory where it first appears,
 #    or copy it to somewhere that it will be preserved and run it there.
@@ -75,22 +106,18 @@
 #    script names; so consider its length and information content.
 # num_instances:  Number of ensemble members
 
-setenv case                 cesm_hybrid_6h
+setenv case                 cesm_continue
 setenv compset              B_2000_CAM5
 setenv resolution           0.9x1.25_gx1v6
 setenv cesmtag              cesm1_1_1
 setenv num_instances        30
-setenv refcase              cesm_assim
 
 # ==============================================================================
 # define machines and directories
 #
 # mach            Computer name
-# cesm_datadir    Root path of the public CESM data files
 # cesmroot        Location of the cesm code base
 #                 For cesm1_1 on yellowstone
-# DARTroot        Location of DART code tree.
-#                    Executables, scripts and input in $DARTroot/models/dev/...
 # caseroot        Your (future) cesm case directory, where this CESM+DART will be built.
 #                    Preferably not a frequently scrubbed location.
 #                    This script will delete any existing caseroot, so this script,
@@ -99,31 +126,33 @@
 # exeroot         (Future) directory for executables - scrubbable, large amount of space needed.
 # archdir         (Future) Short-term archive directory
 #                    until the long-term archiver moves it to permanent storage.
+# dartroot        Location of _your_ DART installation
+#                    This is passed on to the CESM_DART_config script.
 # ==============================================================================
 
 setenv mach         yellowstone
-setenv cesm_datadir /glade/p/cesm/cseg/inputdata
 setenv cesmroot     /glade/p/cesm/cseg/collections/$cesmtag
 setenv caseroot     /glade/p/work/${USER}/cases/${case}
 setenv exeroot      /glade/scratch/${USER}/${case}/bld
 setenv rundir       /glade/scratch/${USER}/${case}/run
 setenv archdir      /glade/scratch/${USER}/archive/${case}
+setenv dartroot     /glade/u/home/${USER}/svn/DART/trunk
 
-setenv DARTroot     /glade/u/home/${USER}/DART
-
-# start a hybrid run off of tim's existing run
-set stagedir = /glade/scratch/thoar/archive/cesm_hybrid/rest/2004-01-10-00000
-
 # ==============================================================================
 # configure settings ... run_startdate format is yyyy-mm-dd
 # ==============================================================================
 
+setenv run_refcase cesm_hybrid
 setenv refyear     2004
 setenv refmon      01
 setenv refday      10
 setenv run_reftod  00000
 setenv run_refdate $refyear-$refmon-$refday
 
+# THIS IS THE LOCATION of the 'reference case'.
+
+set stagedir = /glade/scratch/thoar/archive/${run_refcase}/rest/${run_refdate}-${run_reftod}
+
 # ==============================================================================
 # runtime settings --  How many assimilation steps will be done after this one
 #                      plus archiving options
@@ -132,11 +161,15 @@
 # stop_option   Units for determining the forecast length between assimilations
 # stop_n        Number of time units in the first forecast
 # assim_n       Number of time units between assimilations
+#
+# If the long-term archiver is off, you get a chance to examine the files before
+# you
+#
 # ==============================================================================
 
-setenv resubmit            10
+setenv resubmit            12
 setenv stop_option         nhours
-setenv stop_n              12
+setenv stop_n              6
 setenv assim_n             6
 setenv short_term_archiver on
 setenv long_term_archiver  off
@@ -144,16 +177,15 @@
 # ==============================================================================
 # job settings
 #
-# timewall   can be changed during a series by changing the ${case}.run
 # queue      can be changed during a series by changing the ${case}.run
+# timewall   can be changed during a series by changing the ${case}.run
 #
-# TJH: How many T62_gx1v6 CESM instances can fit on 1 node?
+# TJH: 30 instances with 900 pes took about 30 minutes on yellowstone.
 # ==============================================================================
 
-setenv ACCOUNT      P8685nnnn
-setenv timewall     0:20
+setenv ACCOUNT      P86850054
 setenv queue        economy
-setenv ptile        15
+setenv timewall     0:20
 
 # ==============================================================================
 # set these standard commands based on the machine you are running on.
@@ -184,23 +216,11 @@
 endsw
 
 # ==============================================================================
-# some simple error checking before diving into the work
+# Make sure the CESM directories exist.
+# VAR is the shell variable name, DIR is the value
 # ==============================================================================
 
-# fatal idea to make caseroot the same dir as where this setup script is
-# since the build process removes all files in the caseroot dir before
-# populating it.  try to prevent shooting yourself in the foot.
-if ( $caseroot == `dirname $0` ) then
-   echo "ERROR: the setup script should not be located in the caseroot"
-   echo "directory, because all files in the caseroot dir will be removed"
-   echo "before creating the new case.  move the script to a safer place."
-   exit -1
-endif
-
-# make sure these directories exist
-set musthavedirs = "cesm_datadir cesmroot DARTroot"
-foreach VAR ( $musthavedirs )
-   # VAR is the shell variable name, DIR is the value
+foreach VAR ( stagedir cesmroot dartroot )
    set DIR = `eval echo \${$VAR}`
    if ( ! -d $DIR ) then
       echo "ERROR: directory '$DIR' not found"
@@ -209,56 +229,77 @@
    endif
 end
 
-# make sure there is a filter in these dirs.   try to build
-# them if we can't find filter already built for each model.
-set musthavefiles = "cam POP clm CESM"
-foreach MODEL ( $musthavefiles )
-   set targetdir = $DARTroot/models/$MODEL/work
-   if ( ! -x $targetdir/filter ) then
-      echo "WARNING: executable file 'filter' not found"
-      echo " Looking for: $targetdir/filter "
-      echo " Trying to rebuild all model files now."
-      (cd $targetdir; ./quickbuild.csh -mpi)
-      if ( ! -x $targetdir/filter ) then
-         echo "ERROR: executable file 'filter' not found"
-         echo " Unsuccessfully tried to rebuild: $targetdir/filter "
-         echo " Required DART assimilation executables are not found "
-         exit -1
-      endif
-   endif
-end
-
 # ==============================================================================
-# Create the case.
+# Create the case - this creates the CASEROOT directory.
 #
 # For list of the pre-defined cases: ./create_newcase -list
 # To create a variant case, see the CESM documentation and carefully
 # incorporate any needed changes into this script.
 # ==============================================================================
 
-   echo "removing old files from ${caseroot}"
-   echo "removing old files from ${exeroot}"
-   echo "removing old files from ${rundir}"
-   ${REMOVE} ${caseroot}
-   ${REMOVE} ${exeroot}
-   ${REMOVE} ${rundir}
-   ${cesmroot}/scripts/create_newcase -case ${caseroot} -mach ${mach} \
-                   -res ${resolution} -compset ${compset}
+# fatal idea to make caseroot the same dir as where this setup script is
+# since the build process removes all files in the caseroot dir before
+# populating it.  try to prevent shooting yourself in the foot.
 
-   if ( $status != 0 ) then
-      echo "ERROR: Case could not be created."
-      exit -1
-   endif
+if ( $caseroot == `dirname $0` ) then
+   echo "ERROR: the setup script should not be located in the caseroot"
+   echo "directory, because all files in the caseroot dir will be removed"
+   echo "before creating the new case.  move the script to a safer place."
+   exit -1
+endif
 
+echo "removing old files from ${caseroot}"
+echo "removing old files from ${exeroot}"
+echo "removing old files from ${rundir}"
+${REMOVE} ${caseroot}
+${REMOVE} ${exeroot}
+${REMOVE} ${rundir}
+
+${cesmroot}/scripts/create_newcase -case ${caseroot} -mach ${mach} \
+                -res ${resolution} -compset ${compset}
+
+if ( $status != 0 ) then
+   echo "ERROR: Case could not be created."
+   exit -1
+endif
+
 # ==============================================================================
-# Configure the case - this creates the CASEROOT directory.
+# Record the DARTROOT directory and copy the DART setup script to CASEROOT.
+# CESM_DART_config can be run at some later date if desired, but it presumes
+# to be run from a CASEROOT directory. If CESM_DART_config does not exist locally,
+# then it better exist in the expected part of the DARTROOT tree.
 # ==============================================================================
 
+if ( ! -e CESM_DART_config ) then
+   ${COPY} ${dartroot}/models/CESM/shell_scripts/CESM_DART_config .
+endif
+
+if (   -e CESM_DART_config ) then
+   sed -e "s#BOGUS_DART_ROOT_STRING#$dartroot#" < CESM_DART_config >! temp.$$
+   ${MOVE} temp.$$ ${caseroot}/CESM_DART_config
+   chmod 755       ${caseroot}/CESM_DART_config
+else
+   echo "WARNING: the script to configure for data assimilation is not available."
+   echo "         CESM_DART_config should be present locally or in"
+   echo "         ${dartroot}/models/CESM/shell_scripts/"
+   echo "         You can stage this script later, but you must manually edit it"
+   echo "         to reflect the location of the DART code tree."
+endif
+
+# ==============================================================================
+# Configure the case.
+# ==============================================================================
+
 cd ${caseroot}
 
+source ./Tools/ccsm_getenv || exit -2
+
+@ ptile = $MAX_TASKS_PER_NODE / 2
+@ nthreads = 1
+
 # Save a copy for debug purposes
 foreach FILE ( *xml )
-   if ( ~ -e        ${FILE}.original ) then
+   if ( ! -e        ${FILE}.original ) then
       ${COPY} $FILE ${FILE}.original
    endif
 end
@@ -266,48 +307,46 @@
 if ( $num_instances < 10) then
 
    # This is only for the purpose of debugging the code.
-   # A more efficient layout must be found
-   @ atm_pes = $ptile * $num_instances * 4
-   @ ocn_pes = $ptile * $num_instances * 4
-   @ lnd_pes = $ptile * $num_instances * 4
-   @ ice_pes = $ptile * $num_instances * 1
-   @ glc_pes = $ptile * $num_instances
-   @ rof_pes = $ptile * $num_instances
-   @ cpl_pes = $ptile * 4
+   @ atm_tasks = $ptile * $num_instances * 4
+   @ ocn_tasks = $ptile * $num_instances * 4
+   @ lnd_tasks = $ptile * $num_instances * 4
+   @ ice_tasks = $ptile * $num_instances * 1
+   @ glc_tasks = $ptile * $num_instances
+   @ rof_tasks = $ptile * $num_instances
+   @ cpl_tasks = $ptile * 4
 
 else
 
-   # This is only for the purpose of debugging the code.
-   # A more efficient layout must be found
-   #
-   @ atm_pes = $ptile * $num_instances * 2
-   @ ocn_pes = $ptile * $num_instances * 2
-   @ lnd_pes = $ptile * $num_instances * 2
-   @ ice_pes = $ptile * $num_instances
-   @ glc_pes = $ptile * $num_instances
-   @ rof_pes = $ptile * $num_instances
-   @ cpl_pes = $ptile * $num_instances
+   # This works, but a more efficient layout should be used
+   @ atm_tasks = $ptile * $num_instances * 2
+   @ ocn_tasks = $ptile * $num_instances * 2
+   @ lnd_tasks = $ptile * $num_instances * 2
+   @ ice_tasks = $ptile * $num_instances
+   @ glc_tasks = $ptile * $num_instances
+   @ rof_tasks = $ptile * $num_instances
+   @ cpl_tasks = $ptile * $num_instances
 
 endif
 
-#echo "task partitioning ... atm+ocn // lnd+ice+glc+rof"
+# echo "task partitioning ... perhaps ... atm // ocn // lnd+ice+glc+rof"
+# presently, all components run 'serially' - one after another.
 echo ""
-echo "ATM  gets $atm_pes"
-echo "CPL  gets $cpl_pes"
-echo "ICE  gets $ice_pes"
-echo "LND  gets $lnd_pes"
-echo "GLC  gets $glc_pes"
-echo "DROF gets $rof_pes"
-echo "OCN  gets $ocn_pes"
+echo "ATM  gets $atm_tasks"
+echo "CPL  gets $cpl_tasks"
+echo "ICE  gets $ice_tasks"
+echo "LND  gets $lnd_tasks"
+echo "GLC  gets $glc_tasks"
+echo "DROF gets $rof_tasks"
+echo "OCN  gets $ocn_tasks"
 echo ""
 
-./xmlchange NTHRDS_CPL=1,NTASKS_CPL=$cpl_pes
-./xmlchange NTHRDS_GLC=1,NTASKS_GLC=$glc_pes,NINST_GLC=1
-./xmlchange NTHRDS_ATM=1,NTASKS_ATM=$atm_pes,NINST_ATM=$num_instances
-./xmlchange NTHRDS_LND=1,NTASKS_LND=$lnd_pes,NINST_LND=$num_instances
-./xmlchange NTHRDS_ICE=1,NTASKS_ICE=$ice_pes,NINST_ICE=$num_instances
-./xmlchange NTHRDS_ROF=1,NTASKS_ROF=$rof_pes,NINST_ROF=$num_instances
-./xmlchange NTHRDS_OCN=1,NTASKS_OCN=$ocn_pes,NINST_OCN=$num_instances
+./xmlchange NTHRDS_CPL=$nthreads,NTASKS_CPL=$cpl_tasks
+./xmlchange NTHRDS_GLC=$nthreads,NTASKS_GLC=$glc_tasks,NINST_GLC=1
+./xmlchange NTHRDS_ATM=$nthreads,NTASKS_ATM=$atm_tasks,NINST_ATM=$num_instances
+./xmlchange NTHRDS_LND=$nthreads,NTASKS_LND=$lnd_tasks,NINST_LND=$num_instances
+./xmlchange NTHRDS_ICE=$nthreads,NTASKS_ICE=$ice_tasks,NINST_ICE=$num_instances
+./xmlchange NTHRDS_ROF=$nthreads,NTASKS_ROF=$rof_tasks,NINST_ROF=$num_instances
+./xmlchange NTHRDS_OCN=$nthreads,NTASKS_OCN=$ocn_tasks,NINST_OCN=$num_instances
 ./xmlchange ROOTPE_ATM=0
 ./xmlchange ROOTPE_OCN=0
 ./xmlchange ROOTPE_CPL=0
@@ -328,14 +367,19 @@
 # climate, however, should be continuous provided that no model source code or
 # namelists are changed in the hybrid run. In a hybrid initialization, the ocean
 # model does not start until the second ocean coupling (normally the second day),
-# and the coupler does a "cold start" without a restart file.
+# and the coupler does a "cold start" without a restart file."
 
+# TJH:
+# DART's CAM implementation causes a bit more complexity. DART only uses CAM _initial_
+# files, not RESTART files, so there are sourcemods to force a hybrid start for CAM to
+# read initial files - even when CONTINUE_RUN = TRUE.
+
 ./xmlchange RUN_TYPE=hybrid
 ./xmlchange RUN_STARTDATE=$run_refdate
 ./xmlchange START_TOD=$run_reftod
+./xmlchange RUN_REFCASE=$run_refcase
 ./xmlchange RUN_REFDATE=$run_refdate
 ./xmlchange RUN_REFTOD=$run_reftod
-./xmlchange RUN_REFCASE=$refcase
 ./xmlchange GET_REFCASE=FALSE
 ./xmlchange EXEROOT=${exeroot}
 
@@ -344,12 +388,11 @@
 ./xmlchange STOP_OPTION=$stop_option
 ./xmlchange STOP_N=$stop_n
 ./xmlchange CONTINUE_RUN=FALSE
-./xmlchange RESUBMIT=0
+./xmlchange RESUBMIT=$resubmit
 ./xmlchange PIO_TYPENAME=pnetcdf
 
-# this is to set the ocean coupling time to 6 hours.
-# it should set all other related namelists based on
-# this setting.
+# This sets the ocean coupling time to 6 hours.
+# All related namelist settings are based on this value.
 ./xmlchange OCN_NCPL=4
 
 ./xmlchange CLM_CONFIG_OPTS='-bgc cn'
@@ -374,7 +417,6 @@
 ./xmlchange DEBUG=FALSE
 ./xmlchange INFO_DBUG=0
 
-
 # ==============================================================================
 # Set up the case.
 # This creates the EXEROOT and RUNDIR directories.
@@ -387,18 +429,54 @@
    exit -2
 endif
 
-# this should be removed when the files are fixed:
-echo "PATCHING ADDITIONAL FILES HERE - SHOULD BE REMOVED WHEN FIXED"
-echo caseroot is ${caseroot}
-if (    -d     ~/${cesmtag} ) then
-   ${COPY} ~/${cesmtag}/*buildnml.csh ${caseroot}/Buildconf/
-   ls -l ~/${cesmtag}/*buildnml.csh ${caseroot}/Buildconf/*buildnml.csh
-   ${COPY} ~/${cesmtag}/preview_namelists ${caseroot}
-   ls -l ~/${cesmtag}/preview_namelists ${caseroot}/preview_namelists
+# ==============================================================================
+# Edit the run script to reflect queue and wallclock
+# ==============================================================================
+
+echo ''
+echo 'Updating the run script to set wallclock and queue.'
+echo ''
+
+if ( ! -e  ${case}.run.original ) then
+   ${COPY} ${case}.run ${case}.run.original
 endif
 
+source Tools/ccsm_getenv
+set BATCH = `echo $BATCHSUBMIT | sed 's/ .*$//'`
+switch ( $BATCH )
+   case bsub*:
+      # NCAR "bluefire", "yellowstone"
+      set TIMEWALL=`grep BSUB ${case}.run | grep -e '-W' `
+      set    QUEUE=`grep BSUB ${case}.run | grep -e '-q' `
+      sed -e "s/$TIMEWALL[3]/$timewall/" \
+          -e "s/ptile=[0-9][0-9]*/ptile=$ptile/" \
+          -e "s/$QUEUE[3]/$queue/" < ${case}.run >> temp.$$
+          ${MOVE} temp.$$ ${case}.run
+          chmod 755       ${case}.run
+   breaksw
+
+   default:
+
+   breaksw
+endsw
+
 # ==============================================================================
-# Modify namelist templates for each instance.
+# Modify namelist templates for each instance. This is a bit of a nuisance in
+# that we are pulling in restart and initial files from 'all over the place'
+# and each model component has a different strategy.
+#
+# In a hybrid run with CONTINUE_RUN = FALSE (i.e. just starting up):
+#
+# CAM has been forced to read initial files - specified by namelist var:ncdata
+# POP reads from pointer files
+# CICE reads from namelist variable 'ice_ic'
+# CLM builds its own 'finidat' value from the REFCASE variables but in CESM1_1_1
+#     it does not use the instance string. There is a patch for clm.buildnml.csh
+# RTM reads from namelist variable 'finidat_rtm', but rtm.buildnml.csh also is buggy.
+#
+# All of these must later on be staged with these same filenames.
+# OR - all these namelists can be changed to match whatever has been staged.
+# MAKE SURE THE STAGING SECTION OF THIS SCRIPT MATCHES THESE VALUES.
 # ==============================================================================
 
 @ inst = 1
@@ -412,50 +490,39 @@
    # ===========================================================================
    # For a HOP TEST ... empty_htapes = .false.
    # For a HOP TEST ... use a default fincl1
+   # FIXME ... add documentation for configuring CAM history files
 
-   echo " inithist      = '6-HOURLY'"                   >> $fname
-   echo " ncdata        = 'cam_initial${instance}.nc'"  >> $fname
-   echo " empty_htapes  = .true. "                      >> $fname
-   echo " fincl1        = 'PHIS:I' "                    >> $fname
-   echo " nhtfrq        = -$assim_n "                   >> $fname
-   echo " mfilt         = 1 "                           >> $fname
+   echo " inithist      = '6-HOURLY'"                   >> ${fname}
+   echo " ncdata        = 'cam_initial${instance}.nc'"  >> ${fname}
+   echo " empty_htapes  = .true. "                      >> ${fname}
+   echo " fincl1        = 'PHIS:I' "                    >> ${fname}
+   echo " nhtfrq        = -$assim_n "                   >> ${fname}
+   echo " mfilt         = 1 "                           >> ${fname}
 
    # ===========================================================================
-   set fname = "user_nl_pop2${instance}"
-   # ===========================================================================
-
-   # POP Namelists
-   # init_ts_suboption = 'data_assim'   for non bit-for-bit restarting (assimilation mode)
-   # init_ts_suboption = 'rest'         for exact restart
-   # init_ts_suboption = 'spunup'       for ?
-   # init_ts_suboption = 'null'         for ?
-   # For a HOP TEST (untested)... tavg_file_freq_opt = 'nmonth' 'nday' 'once'"
-   # For a HOP TEST ... cool to have restart files every day, not just for end.
-
-   echo "init_ts_suboption = 'data_assim'" >> $fname
-
-   # ===========================================================================
-   set fname = "user_nl_cice${instance}"
-   # ===========================================================================
-   # CICE Namelists
-
-   echo "ice_ic = '${refcase}.cice${instance}.r.${run_refdate}-${run_reftod}.nc'" >> $fname
-
-   # ===========================================================================
    set fname = "user_nl_clm${instance}"
    # ===========================================================================
 
    # Customize the land namelists
-   # The history tapes are a work in progress. If you write out the instantaneous
-   # flux variables every 30 minutes to the .h1. file, the forward observation
-   # operators for these fluxes should just read them from the .h1. file rather
-   # than trying to create them from the (incomplete DART) CLM state.
+   # The filename is built using the REFCASE/REFDATE/REFTOD information.
+   #
+   # This is the time to consider how DART and CESM will interact.  If you intend
+   # on assimilating flux tower observations (nominally at 30min intervals),
+   # then it is required to create a .h1. file with the instantaneous flux
+   # variables every 30 minutes. Despite being in a namelist, these values
+   # HAVE NO EFFECT once CONTINUE_RUN = TRUE so now is the time to set these.
+   #
+   # DART's forward observation operators for these fluxes just read them
+   # from the .h1. file rather than trying to create them from the subset of
+   # CLM variables that are available in the DART state vector.
+   #
    # For a HOP TEST ... hist_empty_htapes = .false.
    # For a HOP TEST ... use a default hist_fincl1
    #
+   # FIXME ... add documentation for configuring CLM history files
+
    @ thirtymin = $assim_n * 2
 
-   #echo "\!finidat = '${refcase}.clm2${instance}.r.${run_refdate}-${run_reftod}.nc'" >> $fname
    echo "hist_empty_htapes = .true."                 >> $fname
    echo "hist_fincl1 = 'NEP'"                        >> $fname
    echo "hist_fincl2 = 'NEP','FSH','EFLX_LH_TOT_R'"  >> $fname
@@ -464,24 +531,48 @@
    echo "hist_avgflag_pertape = 'A','A'"             >> $fname
 
    # ===========================================================================
+   set fname = "user_nl_pop2${instance}"
+   # ===========================================================================
+
+   # POP Namelists
+   # GIVEN: init_ts_option = 'ccsm_hybrid'  when  RUN_TYPE=hybrid, then
+   #
+   # init_ts_suboption = 'data_assim'   --> non bit-for-bit restarting (assimilation mode)
+   # init_ts_suboption = 'rest'         --> default behavior
+   #
+   # No matter the setting of init_ts_suboption, POP always uses the
+   # 'data_assim' behavior when RUN_TYPE=hybrid and CONTINUE_RUN=FALSE.
+   # As soon as CONTINUE_RUN=TRUE, the value becomes important.
+   #
+   # For a HOP TEST ... Would like to have restart files every day, not just for end.
+   # For a HOP TEST (untested)... tavg_file_freq_opt = 'nmonth' 'nday' 'once'"
+
+   echo "init_ts_suboption = 'rest'" >> $fname
+
+   # ===========================================================================
+   set fname = "user_nl_cice${instance}"
+   # ===========================================================================
+   # CICE Namelists
+
+   echo "ice_ic = '${run_refcase}.cice${instance}.r.${run_refdate}-${run_reftod}.nc'" >> $fname
+
+   # ===========================================================================
    set fname = "user_nl_rtm${instance}"
    # ===========================================================================
    # RIVER RUNOFF CAN START FROM AN OLD CLM RESTART FILE
-   # but in this hybrid restart case we have real rtm files.
+   # you can specify the RTM filename here and override the settings from
+   # RUN_REFCASE/RUN_REFDATE/RUN_REFTOD (something you cannot do with CLM).
 
-   echo "finidat_rtm = '${refcase}.rtm${instance}.r.${run_refdate}-${run_reftod}.nc'" >> $fname
+   echo "finidat_rtm = '${run_refcase}.rtm${instance}.r.${run_refdate}-${run_reftod}.nc'" >> $fname
 
    @ inst ++
 end
 
-./preview_namelists
-
 # ==============================================================================
-# Update source files if need be
-#    Ideally, using DART will not require any source mods.
-#    Until then, this script accesses source mods from a hard-wired location below.
-#    Those may eventually be packaged into the DART repository.
-#    If you have additional source mods, they will need to be merged into any DART
+# Update source files.
+#    Ideally, using DART would not require any modifications to the model source.
+#    Until then, this script accesses sourcemods from a hardwired location.
+#    If you have additional sourcemods, they will need to be merged into any DART
 #    mods and put in the SourceMods subdirectory found in the 'case' directory.
 # ==============================================================================
 
@@ -498,523 +589,85 @@
    exit -4
 endif
 
-# ==============================================================================
-# build
-# ==============================================================================
+# The CESM multi-instance capability is relatively new and still has a few
+# implementation bugs. These are known problems and will be fixed soon.
+# this should be removed when the files are fixed:
 
-echo ''
-echo 'Building the case'
-echo ''
+echo "REPLACING BROKEN CESM FILES HERE - SHOULD BE REMOVED WHEN FIXED"
+echo caseroot is ${caseroot}
+if ( -d ~/${cesmtag} ) then
 
-./${case}.build
-
-if ( $status != 0 ) then
-   echo "ERROR: Case could not be built."
-   exit -5
-endif
-
-# ==============================================================================
-# Stage the restarts now that the run directory exists
-# I ran this compset once without setting user_nl_atm_00nn to see where the
-# initial files come from.
-# ==============================================================================
-
-cd ${rundir}
-
-echo ''
-echo "Copying the restart files from the staging directories"
-echo 'into the CESM run directory and creating the pointer files.'
-echo ''
-
-# TJH FIXME ... simply put full path in the pointer file? traceability?
-# TJH FIXME ... what do we do with the *.hdr file
-# After a run completes, the 'normal' pointer files created are:
-# rpointer.drv
-# rpointer.atm_0001           cesm_2000.cam_0001.r.2004-01-04-00000.nc
-# rpointer.lnd_0001           cesm_2000.clm2_0001.r.2004-01-04-00000.nc
-# rpointer.ice_0001           cesm_2000.cice_0001.r.2004-01-04-00000.nc
-# rpointer.rof_0001           cesm_2000.rtm_0001.r.2004-01-04-00000.nc
-# rpointer.ocn_0001.ovf       cesm_2000.pop_0001.ro.2004-01-04-00000
-# rpointer.ocn_0001.restart   cesm_2000.pop_0001.r.2004-01-04-00000.nc       RESTART_FMT=nc
-
-echo "Staging restarts for $num_instances instances"
-
-# files that have an instance
-@ inst = 1
-while ($inst <= $num_instances)
-   set inst_string = `printf _%04d $inst`
-
-   ${COPY} ${stagedir}/*${inst_string}*  .
-   ${LINK} ${stagedir}/${refcase}.cam${inst_string}.i.${run_refdate}-${run_reftod}.nc cam_initial${inst_string}.nc
-
-   @ inst ++
-end
-
-# files that do not
-${COPY} ${stagedir}/*prior_inflate_restart* .
-${COPY} ${stagedir}/*cpl.r*nc               .
-${COPY} ${stagedir}/rpointer.drv            .
-
-echo "Restarts copied to run dir"
-
-# ==============================================================================
-# Edit the run script to reflect project, queue, and wallclock
-# ==============================================================================
-
-cd ${caseroot}
-
-echo ''
-echo 'Updating the run script to set wallclock and queue.'
-echo ''
-
-if ( ~ -e  ${case}.run.original ) then
-   ${COPY} ${case}.run ${case}.run.original
-endif
-
-source Tools/ccsm_getenv
-set BATCH = `echo $BATCHSUBMIT | sed 's/ .*$//'`
-switch ( $BATCH )
-   case bsub*:
-      # NCAR "bluefire", "yellowstone"
-      set TIMEWALL=`grep BSUB ${case}.run | grep -e '-W' `
-      set    QUEUE=`grep BSUB ${case}.run | grep -e '-q' `
-      sed -e "s/ptile=[0-9][0-9]*/ptile=$ptile/" \
-          -e "s/$TIMEWALL[3]/$timewall/" \
-          -e "s/$QUEUE[3]/$queue/" < ${case}.run >! temp.$$
-      ${MOVE} temp.$$  ${case}.run
-   breaksw
-
-   default:
-
-   breaksw
-endsw
-
-# ==============================================================================
-# The *.run script must be modified to call the DART assimilate script.
-# The modifications are contained in a "here" document that MUST NOT
-# expand the wildcards etc., before it is run. This is achieved by
-# double-quoting the characters used to delineate the start/stop of
-# the "here" document. No kidding. It has to be "EndOfText",
-# not 'EndOfText' or EndOfText.
-# ==============================================================================
-
-echo ''
-echo 'Adding the call to assimilate.csh to the *.run script.'
-echo ''
-
-cat << "EndOfText" >! add_to_run.txt
-
-# -------------------------------------------------------------------------
-# START OF DART: if CESM finishes correctly (pirated from ccsm_postrun.csh);
-# perform an assimilation with DART.
-
-set CplLogFile = `ls -1t cpl.log* | head -n 1`
-if ($CplLogFile == "") then
-   echo 'ERROR: Model did not complete - no cpl.log file present - exiting.'
-   echo 'ERROR: Assimilation will not be attempted.'
-   setenv LSB_PJL_TASK_GEOMETRY "{(0)}"
-   setenv EXITCODE -1
-   mpirun.lsf ${CASEROOT}/shell_exit.sh
-   exit -1
-endif
-
-grep 'SUCCESSFUL TERMINATION' $CplLogFile
-if ( $status == 0 ) then
-   ${CASEROOT}/assimilate.csh
-
-   if ( $status == 0 ) then
-      echo "`date` -- DART HAS FINISHED"
-   else
-      echo "`date` -- DART FILTER ERROR - ABANDON HOPE"
-      setenv LSB_PJL_TASK_GEOMETRY "{(0)}"
-      setenv EXITCODE -3
-      mpirun.lsf ${CASEROOT}/shell_exit.sh
-      exit -3
+   # preserve the original version of the files
+   if ( ! -e  ${caseroot}/Buildconf/clm.buildnml.csh.original ) then
+      ${MOVE} ${caseroot}/Buildconf/clm.buildnml.csh \
+              ${caseroot}/Buildconf/clm.buildnml.csh.original
    endif
-else
-   echo 'ERROR: Model did not complete successfully - exiting.'
-   echo 'ERROR: Assimilation will not be attempted.'
-   setenv LSB_PJL_TASK_GEOMETRY "{(0)}"
-   setenv EXITCODE -2
-   mpirun.lsf ${CASEROOT}/shell_exit.sh
-   exit -2
-endif
+   if ( ! -e  ${caseroot}/Buildconf/rtm.buildnml.csh.original ) then
+      ${MOVE} ${caseroot}/Buildconf/rtm.buildnml.csh \
+              ${caseroot}/Buildconf/rtm.buildnml.csh.original
+   endif
+   if ( ! -e  ${caseroot}/preview_namelists.original ) then
+      ${MOVE} ${caseroot}/preview_namelists \
+              ${caseroot}/preview_namelists.original
+   endif
 
-# END OF DART BLOCK
-# -------------------------------------------------------------------------
-"EndOfText"
+   # patch/replace the broken files
+   ${COPY} ~/${cesmtag}/clm.buildnml.csh  ${caseroot}/Buildconf/.
+   ${COPY} ~/${cesmtag}/rtm.buildnml.csh  ${caseroot}/Buildconf/.
+   ${COPY} ~/${cesmtag}/preview_namelists ${caseroot}/.
 
-# Now that the "here" document is created,
-# determine WHERE to insert it -- ONLY IF it is not already there.
-
-grep "ABANDON HOPE" ${case}.run
-set STATUSCHECK = $status
-
-if ( ${STATUSCHECK} == 0 ) then
-   echo "DART block already present in ${case}.run"
-else if ( ${STATUSCHECK} == 1 ) then
-
-   set MYSTRING = `grep --line-number "CSM EXECUTION HAS FINISHED" ${case}.run`
-   set MYSTRING = `echo $MYSTRING | sed -e "s#:# #g"`
-
-   @ origlen = `cat ${case}.run | wc -l`
-   @ keep = $MYSTRING[1]
-   @ lastlines = $origlen - $keep
-
-   head -n $keep      ${case}.run    >! temp.$$
-   cat                add_to_run.txt >> temp.$$
-   tail -n $lastlines ${case}.run    >> temp.$$
-
-   ${MOVE} temp.$$ ${case}.run
-   ${REMOVE} add_to_run.txt
-
-else
-   echo "ERROR in grep of ${case}.run: aborting"
-   echo "status was ${STATUSCHECK}"
-   exit -6
 endif
 
-chmod 0744 ${case}.run
+./preview_namelists
 
 # ==============================================================================
-# Stage the required parts of DART in the CASEROOT directory.
+# Stage the restarts now that the run directory exists
+# THIS IS THE STAGING SECTION - MAKE SURE THIS MATCHES THE NAMELISTS.
+# POP/CAM/CICE read from pointer files. The others use namelist values initially.
 # ==============================================================================
 
-# The standard CESM short-term archiving script may need to be altered
-# to archive addtional or subsets of things, or to reduce the amount of
-# data that is sent to the long-term archive.  Put a version of st_archive.sh
-# in  ${DARTroot}/models/CESM/shell_scripts when/if necessary
-
-if ( ~ -e  Tools/st_archive.sh.original ) then
-   ${COPY} Tools/st_archive.sh Tools/st_archive.sh.original
-else
-   echo "a Tools/st_archive.sh backup copy already exists"
-endif
-
-${COPY} ${DARTroot}/models/CESM/shell_scripts/st_archive.sh           Tools/
-${COPY} ${DARTroot}/models/CESM/shell_scripts/assimilate.csh          .
-${COPY} ${DARTroot}/shell_scripts/shell_exit.sh                       .
-${COPY} ${DARTroot}/models/CESM/shell_scripts/cam_assimilate.csh      .
-${COPY} ${DARTroot}/models/CESM/shell_scripts/pop_assimilate.csh      .
-${COPY} ${DARTroot}/models/CESM/shell_scripts/clm_assimilate.csh      .
-
-
-# ==============================================================================
-# Construct a set of scripts for restarting broken runs without having to
-# run this entire script again and rebuild CESM.  This section creates 5 scripts
-# which do the following tasks:
-#  1) resetting the xml files to run (or rerun) the first step of the experiment
-#  2) the xml changes you need to make between steps 1 and 2
-#  3) restage initial case files to start over
-#  4) restore the files from the last successful cesm advance to restart
-#      in the middle of a run
-#  5) restage the dart executables from the dartroot directory to the
-#      CESM bld directory
-# ==============================================================================
-
-
-#-----
-
-cat << EndOfText >! xml_changes_for_step1.sh
+cat << EndOfText >! stage_initial_cesm_files
 #!/bin/sh
 
-# this script changes the env_run options that are needed for
-# the first job step.
-# this script was autogenerated by $0
-# using the variables set in that script
-
-./xmlchange STOP_OPTION=$stop_option
-./xmlchange STOP_N=$stop_n
-./xmlchange CONTINUE_RUN=FALSE
-./xmlchange RESUBMIT=0
-exit 0
-
-EndOfText
-chmod 0775 xml_changes_for_step1.sh
-
-#-----
-
-cat << EndOfText >! xml_changes_for_stepN.sh
-#!/bin/sh
-
-# this script changes the env_run options that are needed for
-# any jobs step after the first one.
-# this script was autogenerated by $0
-# using the variables set in that script
-
-./xmlchange STOP_OPTION=$stop_option
-./xmlchange STOP_N=$assim_n
-./xmlchange CONTINUE_RUN=TRUE
-./xmlchange RESUBMIT=$resubmit
-exit 0
-
-EndOfText
-chmod 0775 xml_changes_for_stepN.sh
-
-#-----
-
-cat << EndOfText >! reset_step1.sh
-#!/bin/sh
-
-# this script removes the current contents of the run directory and
-# replaces the initial staged files needed to start the experiment over.
-# this script was autogenerated by $0
-# using the variables set in that script
-
-# before removing everything, be sure we make it to the run dir
 cd ${rundir}
-if [[ "\`pwd\`" != ${rundir} ]]; then
-   echo did not change to run directory successfully.  exiting.
-   exit -1
-fi
-${REMOVE} ${rundir}/*
 
-${COPY} ${stagedir}/*prior_inflate_restart* .
-${COPY} ${stagedir}/*cpl.r*nc .
-${COPY} ${stagedir}/rpointer.drv .
+echo ''
+echo 'Copying the restart files from the staging directory.'
+echo ''
 
-let inst=1
-while [[ \$inst -le $num_instances ]]
-do
-   # the instance numbers
-   echo staging files for instance \$inst
+${COPY} ${stagedir}/* .
 
-   # instance string includes the leading underscore
-   inst_string=\`printf _%04d \$inst\`
-
-   ${COPY} ${stagedir}/*\${inst_string}*  .
-   ${LINK} ${refcase}.cam\${inst_string}.i.${run_refdate}-${run_reftod}.nc cam_initial\${inst_string}.nc
-
-   let inst=inst+1
-done
-
-cd ${caseroot}
-
-# reset the env_run options to start a new run (or start over)
-./xml_changes_for_step1.sh
-
-exit 0
-
-EndOfText
-chmod 0775 reset_step1.sh
-
-#-----
-
-cat << EndOfText >! reset_last_successful_step.sh
-#!/bin/sh
-
-# this script removes the current contents of the run directory and
-# restores the files from the last successfully archived directory.
-# this script was autogenerated by $0
-# using the variables set in that script
-
-lastarchivedir=\`ls -1dt ${archdir}/.sta2/* | head -n 1\`
-if [[ ! -d \$lastarchivedir ]]; then
-  lastarchivedir=\`ls -1dt ${archdir}/rest/* | head -n 1\`
-  if [[ ! -d \$lastarchivedir ]]; then
-    echo cannot find last archive directory in ${archdir}/.sta2 
-    echo or in ${archdir}/rest.  exiting.
-    exit -1
-  fi
-fi
-
-# before removing everything, be sure we make it to the run dir
-cd ${rundir}
-if [[ "\`pwd\`" != ${rundir} ]]; then

@@ Diff output truncated at 40000 characters. @@
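
The "REPLACING BROKEN CESM FILES" block in the diff above follows a
back-up-once-then-replace pattern: each stock file is moved to a
*.original name only if no such backup exists yet, and the patched copy
is then dropped in its place. A minimal sketch of that pattern in POSIX
sh follows. The function name replace_with_backup, the scratch directory
layout, and the file contents are assumptions made for illustration;
they are not part of the commit, which does this inline in csh with
${MOVE} and ${COPY}.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): back up a file once, then
# overwrite it with a patched copy.
# usage: replace_with_backup <patched_file> <target_file>
replace_with_backup() {
    patched=$1
    target=$2
    # Keep only the first, pristine copy; never overwrite an existing backup.
    if [ ! -e "${target}.original" ]; then
        mv "$target" "${target}.original" || return 1
    fi
    cp "$patched" "$target"
}

# Demonstration in a scratch directory with made-up file contents.
demo=$(mktemp -d)
mkdir "$demo/patches" "$demo/Buildconf"
echo "stock"   > "$demo/Buildconf/clm.buildnml.csh"
echo "patched" > "$demo/patches/clm.buildnml.csh"

replace_with_backup "$demo/patches/clm.buildnml.csh" "$demo/Buildconf/clm.buildnml.csh"
# A second call must leave the .original file untouched.
replace_with_backup "$demo/patches/clm.buildnml.csh" "$demo/Buildconf/clm.buildnml.csh"

cat "$demo/Buildconf/clm.buildnml.csh.original"
cat "$demo/Buildconf/clm.buildnml.csh"
```

Guarding the move with the existence test is what makes the setup script
safe to rerun: without it, a second run would overwrite the pristine
backup with the already-patched file.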

