[Dart-dev] [9967] DART/trunk/models/cam/shell_scripts:

Wed Mar 16 14:28:32 MDT 2016

Revision: 9967
Author:   raeder
Date:     2016-03-16 14:28:24 -0600 (Wed, 16 Mar 2016)
Log Message:
-----------

CESM1_5_setup_advanced:

This commit is a preliminary version based on a CESM development tag, 
/glade/p/work/juliob/cesm_tags/cesm1_5_beta03_clm_r162_pop_20151215_mosart1_0_14, 
which has not been extensively tested.  It was the best version available
as of the CESM 2016 winter meetings, and was chosen for testing the interactions
of Bacmeister's anisotropic gravity wave drag, turbulent mountain stress, and CLUBB.
It has also been tested with the tag used for ATM_spinup$N: cesm1_5_alpha02d.
It has not been tested with WACCM or CAM-SE.

There are extensive changes due to CESM1_5's move to perl and xml files for all setup
and run scripts, all of which are kept in the new cime directory.
The new CESM scripts incorporate an optional call to data assimilation after each forecast.
CESM1_5 accommodates running multiple cycles in each job, with the st_archive run as a 
separate single processor job after each CESM_DART job (not after each cycle).  
Job resubmission is still controlled by RESUBMIT.

This version has support for using a non-standard grid, which we may want to keep in
...setup_advanced, but won't want in ...setup_hybrid.
The example in these scripts is the use of 1/4 degree SST files from AVHRR.

This version has support for an active land ice model. 

Details:
> ccsmgetenv has been replaced by numerous xmlquery commands.
> The CESM run, submit, archive, ... scripts no longer have $CASE in their names.
  They're called case.run, ...
> cesm_setup has been replaced by case.setup.
> CESM batch job characteristics (queue, account, ...) are now controlled by env_batch.xml,
  so those values are set with xmlchange calls, rather than editing case.run, case.st_archive, ...
> (no_)assimilate.csh is chosen by providing its full pathname to env_run.xml,
  rather than by editing case.run, which was the method in pre-CESM1_5.
> CESM's st_archive now handles DART output, which is controlled by the new env_archive.xml.
  So an st_archive modified for DART should not be copied to $CASEROOT.
> Moved BASEOBSDIR definition from assimilate.csh to CESM#_#_setup_advanced 'baseobsdir'.
  It's now inherited by CESM#_#_DART_config and inserted into assimilate.csh,
  so that assimilate.csh will only need to be modified if the user wants to change obs sets.
> Set a new parameter to tell assimilate.csh to keep every Mth day's restart set.
> River runoff models RTM and MOSART can be handled by this script.
> Land ice, 'glc', 20km resolution CISM1, can be an active component.
  CISM1 is single threaded, and all instances are piled on the first num_ens tasks,
  so finer resolution than 20km can't be used.  
  CISM2 can be multi-threaded, but is not ready yet (2016-3-14).
> The script figures out which river runoff and land ice models are in the compset
  by querying the COMPSET environment variable.
> CESM1_5_setup_advanced can be run as a batch script; this required alternatives to $0,
  which doesn't refer to the actual script name in the LSF environment.
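
As a rough illustration of the compset query described above (not the code
in CESM1_5_setup_advanced), a compset long name can be split on '_' and each
piece matched against the component names. The function name
compset_components and the matching rules are assumptions made for this
sketch; the two compset strings are the ones quoted in the script comments.

```shell
#!/bin/sh
# Hypothetical helper: report which river runoff (RTM or MOSART) and land ice
# (CISM*) models appear in a CESM compset long name. The matching rules are an
# assumption for illustration, not the logic used by the setup script.
compset_components() {
   rof=none
   glc=none
   # Components are separated by '_'; options follow a '%'.
   for part in $(printf '%s\n' "$1" | tr '_' ' '); do
      case "$part" in
         RTM*)    rof=RTM    ;;
         MOSART*) rof=MOSART ;;
         CISM*)   glc=$(printf '%s' "$part" | cut -d% -f1) ;;
      esac
   done
   echo "rof=$rof glc=$glc"
}

compset_components 'AMIP_CAM5_CLM50%BGC_CICE%PRES_DOCN%DOM_MOSART_CISM1%NOEVOLVE_SWAV'
compset_components 'AMIP_CAM5_CLM40%SP_CICE%PRES_DOCN%DOM_RTM_SGLC_SWAV'
```

The first call reports MOSART and CISM1; the second reports RTM and no active
land ice model, since SGLC is the stub component.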

-------------------
CESM1_5_DART_config:
> It seems that we will need different versions of CESM_DART_config for evolving CESM versions,
  so I've put a version number in the name, and refer to that version in CESM1_5_setup_advanced.
> CESM1_5 has a mechanism (env_run.xml:DATA_ASSIMILATION*) for running a data assimilation script 
  after the forecast.  This removes the need for all of the pieces of ...DART_config which modified 
  CESM scripts to do DA.
> Modifies assimilate#_#.csh in several more ways and puts the result in $CASEROOT/assimilate.csh:
  $baseobsdir,$save_every_Mth_day_restart. 
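
The substitution step can be sketched as follows. The BOGUS* tokens are the
ones used in this commit; the function name fill_template, the template
lines, and the paths are made up for illustration and are not taken from
assimilate1_5.csh.

```shell
#!/bin/sh
# Sketch of the placeholder substitution done by CESM1_5_DART_config.
# Only the BOGUS* tokens come from the commit; everything else is hypothetical.
fill_template() {
   # $1 = caseroot, $2 = baseobsdir, $3 = save_every_Mth_day_restarts
   # '#' is the sed delimiter because the values contain '/'.
   sed -e "s#BOGUSCASEROOT#$1#" \
       -e "s#BOGUSBASEOBSDIR#$2#" \
       -e "s#BOGUS_save_every_Mth#$3#"
}

# Made-up template lines standing in for the real assimilate script.
printf '%s\n' \
   'cd BOGUSCASEROOT' \
   'set BASEOBSDIR = BOGUSBASEOBSDIR' \
   'set save_every_Mth_day_restart = BOGUS_save_every_Mth' |
   fill_template /work/me/Exp/mycase /obs/NCEP+ACARS 3
```

Because the values are filled in once at setup time, the staged
$CASEROOT/assimilate.csh only needs hand editing when the obs set changes.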

--------------
assimilate1_5.csh:
> Will be modified by CESM1_5_DART_config and put into $CASEROOT/assimilate.csh.
> New method (xmlquery) for getting CESM environment variables, 
  since they're not inherited from the new CESM1_5 perl scripts.
> Adapted the queries of input.nml to the new input.nml,
  which has multiple versions of some namelists (or sections),
  all of which are commented out except one.  The greps in assimilate.csh
  were finding all of them, instead of the active one.  
  Now $CASEROOT/input.nml goes through sed, which removes all lines
  with # or ! as the first non-blank character, so a stripped version of
  input.nml ends up in $RUNDIR/assimilate_cam and the queries find the
  active variable.
> Removes unneeded restart sets at the start of each assimilation.
  This is controlled by CESM1_5_setup_advanced:save_every_Mth_day_restarts.
> Hides (potentially) 1 more restart set in the parent of $RUNDIR,
  where the short term archiver won't find it.
> Looks for observations in YYYYMM_6H_CESM, where the files (links) with the CESM date format are found.
> Copies obs_seq.final files to a local place where they won't be purged by {st,lt}_archive.
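
The comment stripping described above for input.nml can be sketched with a
single sed expression. This is a minimal example and not necessarily the
exact command in assimilate1_5.csh; the function name strip_nml_comments and
the namelist fragment are hypothetical.

```shell
#!/bin/sh
# Delete every line whose first non-blank character is '#' or '!'.
# Reads the named files, or standard input if none are given.
strip_nml_comments() {
   sed -e '/^[[:space:]]*[#!]/d' "$@"
}

# Hypothetical namelist fragment: one commented-out and one active setting.
printf '%s\n' \
   '&filter_nml' \
   '!  ens_size = 20,' \
   '   ens_size = 80,' \
   '   # obsolete note' \
   '/' | strip_nml_comments | grep ens_size
```

After stripping, the grep finds only the active ens_size line, which is the
behavior the queries in assimilate.csh rely on.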

Added Paths:
-----------
    DART/trunk/models/cam/shell_scripts/CESM1_5_DART_config
    DART/trunk/models/cam/shell_scripts/CESM1_5_setup_advanced
    DART/trunk/models/cam/shell_scripts/assimilate1_5.csh

-------------- next part --------------
Added: DART/trunk/models/cam/shell_scripts/CESM1_5_DART_config
===================================================================

--- DART/trunk/models/cam/shell_scripts/CESM1_5_DART_config	                        (rev 0)
+++ DART/trunk/models/cam/shell_scripts/CESM1_5_DART_config	2016-03-16 20:28:24 UTC (rev 9967)
@@ -0,0 +1,359 @@
+#!/bin/csh
+#
+# DART software - Copyright 2004 - 2013 UCAR. This open source software is
+# provided by UCAR, "as is", without charge, subject to all terms of use at
+# http://www.image.ucar.edu/DAReS/DART/DART_download
+#
+# DART $Id: CESM_DART_config 7742 2015-03-20 00:17:16Z thoar $
+
+# ---------------------
+# Purpose
+# ---------------------
+#
+# This script integrates DART with a pre-existing CESM multi-instance case.
+# It must be run from a valid CASEROOT directory and some environment variables
+# must be set (as in CESM#_#_setup_YYY). If the case was created
+# using one of the DART scripts, this script should be staged in the
+# CASEROOT directory automatically, and DARTROOT is set at that time.
+#
+# CAM is the only active model component.
+# CESM starts and stops to allow for CAM to assimilate every 6 hours.
+#
+# This script will build the DART executables if they are not found.
+#
+# There are many CESM binary files in big-endian format, and DART reads
+# some of them, so you MUST compile DART accordingly e.g.,
+# ifort -convert big_endian
+# Contact dart at ucar.edu if you want to use another compiler.
+#
+# ---------------------
+# How to set up the script
+# ---------------------
+#
+# -- Ensure DARTROOT references a valid DART directory.
+# -- Examine the whole script to identify things to change for your experiments.
+# -- Provide any initial files needed by your run:
+#       inflation
+#       sampling error correction
+# -- Run this script.
+# -- Edit the DART input.nml that appears in the ${CASEROOT} directory.
+# -- Submit the job using ${CASEROOT}/case.submit
+#
+# ==============================================================================
+# Get the environment of the case - defines number of instances/ensemble size ...
+# Each model component has its own number of instances.
+# ==============================================================================
+
+if ( ! -e ./xmlquery ) then
+   echo "ERROR: $0 must be run from a CASEROOT directory."
+   exit -1
+endif
+
+setenv CASE          `./xmlquery CASE           -value`
+setenv CASEROOT      `./xmlquery CASEROOT       -value`
+setenv CCSM_COMPSET  `./xmlquery CCSM_COMPSET   -value`
+setenv EXEROOT       `./xmlquery EXEROOT        -value`
+setenv NINST_ATM     `./xmlquery NINST_ATM      -value`
+setenv RUNDIR        `./xmlquery RUNDIR         -value`
+
+set num_instances = $NINST_ATM
+
+# DARTROOT is set by the DART CESM_configure scripts. Under certain
+# situations, you may need to set this manually. It should reference the
+# base portion of the DART code tree.
+
+setenv DARTROOT  BOGUS_DART_ROOT_STRING
+
+# ==============================================================================
+# Some
+# ==============================================================================
+
+set nonomatch       # suppress "rm" warnings if wildcard does not match anything
+
+# The FORCE options are not optional.
+# The VERBOSE options are useful for debugging though
+# some systems don't like the -v option to any of the following
+switch ("`hostname`")
+   case be*:
+      # NCAR "bluefire"
+      set   MOVE = '/usr/local/bin/mv -fv'
+      set   COPY = '/usr/local/bin/cp -fv --preserve=timestamps'
+      set   LINK = '/usr/local/bin/ln -fvs'
+      set REMOVE = '/usr/local/bin/rm -fr'
+
+   breaksw
+   default:
+      # NERSC "hopper", NWSC "yellowstone"
+      set   MOVE = '/bin/mv -fv'
+      set   COPY = '/bin/cp -fv --preserve=timestamps'
+      set   LINK = '/bin/ln -fvs'
+      set REMOVE = '/bin/rm -fr'
+
+   breaksw
+endsw
+
+echo ""
+
+# ==============================================================================
+# make sure the required directories exist
+# VAR is the shell variable name, DIR is the value
+# ==============================================================================
+
+foreach VAR ( CASEROOT DARTROOT )
+   set DIR = `eval echo \${$VAR}`
+   if ( ! -d $DIR ) then
+      echo "ERROR: directory '$DIR' not found"
+      echo "       In the setup script check the setting of: $VAR"
+      exit -1
+   endif
+end
+
+# ==============================================================================
+# Make sure the DART executables exist or build them if we can't find them.
+# The DART input.nml in the model directory IS IMPORTANT during this part
+# because it defines what observation types are supported.
+# ==============================================================================
+
+foreach MODEL ( cam )
+   set targetdir = $DARTROOT/models/$MODEL/work
+   if ( ! -x $targetdir/filter ) then
+      echo ""
+      echo "WARNING: executable file 'filter' not found."
+      echo "         Looking for: $targetdir/filter "
+      echo "         Trying to rebuild all executables for $MODEL now ..."
+      (cd $targetdir; ./quickbuild.csh -mpi)
+      if ( ! -x $targetdir/filter ) then
+         echo "ERROR: executable file 'filter' not found."
+         echo "       Unsuccessfully tried to rebuild: $targetdir/filter "
+         echo "       Required DART assimilation executables are not found."
+         echo "       Stopping prematurely."
+         exit -1
+      endif
+   endif
+end
+
+# ==============================================================================
+# Stage the required parts of DART in the CASEROOT directory.
+# ==============================================================================
+
+# The new case.st_archive job script calls st_archive.  It runs after the case.run job.
+# It submits the next case.run job, if RESUBMIT > 0.
+# Fix some pieces.
+# /X/ means search for lines with X in them.
+# 'c' means replace the line with the following.
+# This might want to have a conditional around it, to only execute if it's a bsub machine.
+sed -e "/BSUB[ ]*-o/c\#BSUB  -o cesm_st_arch.stdout.%J" \
+    -e "/BSUB[ ]*-e/c\#BSUB  -e cesm_st_arch.stderr.%J" \
+    -e "/BSUB[ ]*-J/c\#BSUB  -J ${CASE}.st_arch" case.st_archive >! temp.$$  || exit 20
+${MOVE} temp.$$ case.st_archive
+chmod 755       case.st_archive
+
+# Same for lt_archive
+# CESM1_5; queue and wall_clock can/should be modified via xmlchange in CESM1_5_setup_advanced 
+# (see env_batch.xml)
+sed -e "/BSUB[ ]*-o/c\#BSUB  -o cesm_lt_arch.stdout.%J  \n" \
+    -e "/BSUB[ ]*-e/c\#BSUB  -e cesm_lt_arch.stderr.%J  \n" \
+    -e "/BSUB[ ]*-J/c\#BSUB  -J ${CASE}.lt_arch         \n" case.lt_archive >! temp.$$  || exit 21
+${MOVE} temp.$$ case.lt_archive
+chmod 755       case.lt_archive
+
+# SetupFileName comes from CESM#_#_setup*, the calling script.
+sed -e "s#BOGUSCASEROOT#$CASEROOT#"  ${DARTROOT}/models/cam/shell_scripts/no_assimilate.csh \
+    > no_assimilate.csh  || exit 22
+sed -e "s#BOGUSCASEROOT#$CASEROOT#" \
+    -e "s#BOGUSBASEOBSDIR#$baseobsdir#"  \
+    -e "s#BOGUS_save_every_Mth#$save_every_Mth_day_restarts#" \
+    ${DARTROOT}/models/cam/shell_scripts/assimilate1_5.csh    > assimilate.csh  || exit 23
+#     -e "s#BOGUSSETUP#$SetupFileName#"  \
+chmod 755 assimilate.csh
+chmod 755 no_assimilate.csh
+# chmod 755 perfect_model.csh
+
+# ==============================================================================
+# Stage the DART executables in the CESM execution root directory: EXEROOT
+# If you recompile the DART code (maybe to support more observation types),
+# the script generated here makes it easy to install the new DART executables.
+# ==============================================================================
+
+cat << EndOfText >! stage_dart_files
+#!/bin/sh
+
+# Run this script in the ${CASEROOT} directory.
+# This script copies over the dart executables and POSSIBLY a namelist
+# to the proper directory.  If you have to update any dart executables,
+# do it in the ${DARTROOT} directory and then rerun stage_dart_files.
+# If an input.nml does not exist in the ${CASEROOT} directory,
+# a default one will be copied into place.
+#
+# This script was autogenerated by $0 using the variables set in that script.
+
+if [[ -e input.nml ]]; then
+   echo "stage_dart_files: Using existing ${CASEROOT}/input.nml"
+   if [[ -e input.nml.original ]]; then
+      echo "input.nml.original already exists - not making another"
+   else
+      ${COPY} input.nml input.nml.original
+   fi
+
+elif [[ -e ${DARTROOT}/models/cam/work/input.nml ]]; then
+   ${COPY} ${DARTROOT}/models/cam/work/input.nml  input.nml
+   if [[ -x update_dart_namelists ]]; then
+          ./update_dart_namelists
+   fi
+else
+   echo "ERROR: stage_dart_files could not find an input.nml.  Aborting"
+   exit -99
+fi
+
+${COPY} ${DARTROOT}/models/cam/work/cam_to_dart       ${EXEROOT}
+${COPY} ${DARTROOT}/models/cam/work/dart_to_cam       ${EXEROOT}
+${COPY} ${DARTROOT}/models/cam/work/filter            ${EXEROOT}
+${COPY} ${DARTROOT}/models/cam/work/perfect_model_obs ${EXEROOT}
+
+exit 0
+
+EndOfText
+chmod 0755 stage_dart_files
+
+./stage_dart_files  || exit -8
+
+# ==============================================================================
+# Ensure the DART namelists are consistent with the ensemble size,
+# suggest settings for num members in the output diagnostics files, etc.
+# The user is free to update these after setup and before running.
+# ==============================================================================
+
+cat << EndOfText >! update_dart_namelists
+#!/bin/sh
+
+# this script makes certain namelist settings consistent with the number
+# of ensemble members built by the setup script.
+# this script was autogenerated by $0
+# using the variables set in that script
+
+# Ensure that the input.nml ensemble size matches the number of instances.
+# WARNING: the output files contain ALL ensemble members ==> BIG
+
+ex input.nml <<ex_end
+g;ens_size ;s;= .*;= ${NINST_ATM};
+g;num_output_state_members ;s;= .*;= ${NINST_ATM};
+g;num_output_obs_members ;s;= .*;= ${NINST_ATM};
+wq
+ex_end
+
+# If we are using WACCM (i.e. WCCM or WACCM) we have preferred values
+echo "${CCSM_COMPSET}" | grep CCM
+if [[ \$? == 0 ]]; then 
+   sed -e "/ vert_normalization_scale_height /c\ vert_normalization_scale_height = 2.5" \
+       -e "/ highest_obs_pressure_Pa /c\ highest_obs_pressure_Pa = 0.0001" \
+       -e "/ highest_state_pressure_Pa /c\ highest_state_pressure_Pa = 0.01" \
+       -e "/ vert_coord /c\ vert_coord = 'log_invP'" \
+       input.nml > input.nml.waccm || exit 1
+   mv input.nml.waccm input.nml || exit 1
+else
+   echo "Apparently not configured for WACCM"
+   echo "CCSM_COMPSET is $CCSM_COMPSET"
+fi
+
+exit 0
+
+EndOfText
+chmod 0755 update_dart_namelists
+
+./update_dart_namelists || exit -9
+
+#=========================================================================
+# Stage the files needed for SAMPLING ERROR CORRECTION - even if not
+# initially requested. The file is static, small, and may be needed later.
+#
+# If it is requested and is not present ... it is an error.
+#
+# The sampling error correction is a lookup table.  Each ensemble size
+# has its own (static) file.  It is only needed if any
+# input.nml:&assim_tools_nml:sampling_error_correction = .true.,
+#=========================================================================
+
+if ( $num_instances > 1 ) then
+   set SAMP_ERR_FILE = ${DARTROOT}/system_simulation/final_full_precomputed_tables/final_full.${num_instances}
+   if (  -e   ${SAMP_ERR_FILE} ) then
+      ${COPY} ${SAMP_ERR_FILE} .
+   else
+      echo ""
+      echo "WARNING: no final_full.xx file found for an ensemble size of ${num_instances}."
+      echo "         This file is NOT needed unless you want to turn on the"
+      echo "         sampling_error_correction feature in any of the models."
+      echo "         To use it, in addition to setting the namelist to .true., cd to:"
+      echo "         ${DARTROOT}/system_simulation"
+      echo "         and create a final_full.${num_instances} file"
+      echo "         one can be generated for any ensemble size; see docs."
+      echo "         Copy it into ${CASEROOT} before running."
+      echo ""
+   endif
+   
+   foreach N ( input.nml )
+      set  MYSTRING = `grep sampling_error_correction $N`
+      set  MYSTRING = `echo $MYSTRING | sed -e "s#[=,'\.]# #g"`
+      set  MYSTRING = `echo $MYSTRING | sed -e 's#"# #g'`
+      set SECSTRING = `echo $MYSTRING[2] | tr '[:upper:]' '[:lower:]'`
+   
+      if ( ${SECSTRING} == true ) then
+         if ( ! -e  ${SAMP_ERR_FILE} ) then
+            echo "ERROR: no sampling error correction file for this ensemble size."
+            echo "ERROR: looking for ${SAMP_ERR_FILE} in"
+            echo "ERROR: ${DARTROOT}/system_simulation/final_full_precomputed_tables"
+            echo "ERROR: one can be generated for any ensemble size; see docs."
+            exit -3
+         endif
+      endif
+   end
+else
+   # sampling error correction not used for perfect_model_obs
+endif
+
+# ==============================================================================
+# What to do next
+# ==============================================================================
+
+
+cat << EndOfText >! DART_instructions.txt
+
+-------------------------------------------------------------------------
+
+Check the DART configuration:
+
+1) The default behavior is to _not_ invoke DART and simply run CESM.
+   We recommend that you make sure this works before proceeding.
+
+2) When you want to run DART, edit the env_run.xml: DATA_ASSIMILATION_* 
+   to enable running a DART script (assimilate.csh or perfect_model.csh)
+   after the model forecast.
+
+3) Modify what you need to in the DART namelist file, i.e. ${CASEROOT}/input.nml
+
+4) If you have recompiled any part of the DART system, 'stage_dart_files'
+   will copy them into the correct places.
+
+5) If you stage your own inflation files, make sure you read the INFLATION section
+   in ${CASEROOT}/CESM_DART_config
+
+6) Make sure the observation directory name in assimilate.csh or perfect_model.csh 
+   matches the one on your system.
+
+7) Submit the CESM job in the normal way.
+
+8) You can use ${CASEROOT}/stage_cesm_files
+    to stage files to restart a run.
+
+-------------------------------------------------------------------------
+
+EndOfText
+
+cat DART_instructions.txt
+
+exit 0
+
+# <next few lines under version control, do not edit>
+# $URL: https://subversion.ucar.edu/DAReS/DART/trunk/models/cam/shell_scripts/CESM_DART_config $
+# $Revision: 7742 $
+# $Date: 2015-03-19 18:17:16 -0600 (Thu, 19 Mar 2015) $
+

Added: DART/trunk/models/cam/shell_scripts/CESM1_5_setup_advanced
===================================================================
--- DART/trunk/models/cam/shell_scripts/CESM1_5_setup_advanced	                        (rev 0)
+++ DART/trunk/models/cam/shell_scripts/CESM1_5_setup_advanced	2016-03-16 20:28:24 UTC (rev 9967)
@@ -0,0 +1,1567 @@
+#!/bin/csh -f
+#BSUB  -n 1 
+#BSUB  -R "span[ptile=1]"
+#BSUB  -q caldera 
+#BSUB  -P P86850054
+#BSUB  -W 1:00
+#BSUB  -u raeder at ucar.edu
+#BSUB  -N  
+#BSUB  -a poe 
+# The job name MUST be the name of this script(file), or this file will not be
+# archived in $CASEROOT.
+#BSUB  -J CESM1_5_setup_advanced
+#BSUB  -o CESM1_5_setup_ATM.bld1
+#BSUB  -e CESM1_5_setup_ATM.bld1
+#
+# DART software - Copyright 2004 - 2013 UCAR. This open source software is
+# provided by UCAR, "as is", without charge, subject to all terms of use at
+# http://www.image.ucar.edu/DAReS/DART/DART_download
+#
+# DART $Id: CESM1_5_a2d_setup_advanced 7946 2015-04-30 21:11:34Z nancy $
+
+
+#*******************************************************************************
+#
+# ---------------------
+# Purpose
+# ---------------------
+#
+# This script is designed to set up, stage, and build a multi-instance run
+# of CESM using an F compset where CAM, CLM and CICE are active. The initial state
+# can come from a single multi-instance reference case so a CESM hybrid setup
+# is used. Instructions on what to change to use the SE core or WACCM are
+# outlined in the models/cam/model_mod.html documentation.
+#
+# This script (CESM1_5_setup_advanced) has many changes from CESM1_2_1_setup_advanced 
+# due to CESM's implementation of cime and move to perl scripts and xml files 
+# for all of the setup and running.  CESM1_5 enables running multiple cycles in a single
+# job without running st_archive.  This can make use of a large run directory, 
+# but requires knowledge of how many cycles may be completed before the archiving must happen.
+#
+# This script differs from the "simple" CESM1_5_setup_hybrid in several ways.
+# This script also has examples of how to specify alternate support files that
+# cover more recent dates than the current defaults.  For more detail, consult
+# the models/cam/shell_scripts/README_advanced and the models/cam/model_mod.html.
+#
+# Variable resolution in CAM-SE is not yet available in the public releases of CESM.
+#
+# DOCN: We are using a single data ocean.
+#       This script also supports the use of non-standard (ocean) resolution 
+#       (e.g. 0.25x0.25 AVHRR)
+#
+# Because the atmosphere assimilations typically occur every 6 hours,
+# the methodology here reflects that. All of CESM stops every 6 hours
+# so that a CAM output file would be available for assimilation.
+#
+# CESM/DART requires some modifications to the CESM source code EVEN IF YOU
+# ARE NOT USING DART. 
+#
+# This script results in a viable setup for a CESM multi-instance experiment.
+# You are STRONGLY encouraged to run the multi-instance CESM a few times and
+# experiment with different settings BEFORE you try to assimilate observations.
+# The data volume is quite large and you should become comfortable using
+# CESM's restart capability to re-stage files in your RUN directory.
+#
+# ${CASEROOT}/CESM_DART_config is automatically run by this script and will
+# augment the CESM case with the required setup and configuration to use DART
+# to perform an assimilation. 
+#
+# Previous versions of this script relied heavily on the information in:
+# http://www.cesm.ucar.edu/models/cesm1.2/cesm/doc/usersguide1_2/book1.html
+#
+# ---------------------
+# How to use this script.
+# ---------------------
+#
+# -- You will have to read and understand the script in its entirety.
+#    You will have to modify things outside this script.
+#    This script sets up a plain CESM multi-instance run without DART,
+#    intentionally.  Once it is running, calls to DART can be added.
+#
+# -- Examine the whole script to identify things to change for your experiments.
+#
+# -- Edit this script in the $DART/models/cam/shell_scripts directory
+#    or copy it to somewhere where it will be preserved.
+#
+# -- Locate the initial multi-instance files that CESM will need.
+#
+# -- Run this script. When it is executed, it will create:
+#    1) a CESM 'CASE' directory, where the model will be built,
+#    2) a run directory, where each forecast (and assimilation) will take place,
+#    3) a bld directory for the executables.
+#    4) The short term archiver will use a fourth directory for
+#    storage of model output until it can be moved to long term storage (HPSS)
+#
+#    This script also executes ${CASEROOT}/CESM_DART_config to 
+#    make the SourceMods for CAM
+#    effective. CESM_DART_config will also augment the case with all
+#    the pieces necessary to run DART when the time comes.
+#
+# -- (if running DART) Edit the DART input.nml that appears in the ${CASEROOT}
+#    directory.
+#
+# -- Submit the job using ${CASEROOT}/case.submit
+#
+# ---------------------
+# Important features
+# ---------------------
+#
+# If you want to change something in your case other than the runtime
+# settings, it is safest to delete everything and start the run from scratch.
+# For the brave, read
+#
+# http://www.cesm.ucar.edu/models/cesm1.2/cesm/doc/usersguide1_2/x1080.html
+#
+# and you may be able to salvage something with
+# ./case.setup -clean
+# ./case.setup
+# ./case.clean_build
+# ./case.build
+#
+#*******************************************************************************
+
+# ==============================================================================
+# case options:
+#
+# case          The value of "case" will be used many ways; directory and file
+#               names both locally and on HPSS, and script names; so consider
+#               its length and information content.
+# compset       Defines the vertical resolution and physics packages to be used.
+#               Must be a standard CESM compset; see the CESM documentation.
+# resolution    Defines the horizontal resolution and dynamics; see CESM docs.
+#                  T85           ... eulerian at ~ 1 degree
+#                  ne30np4_gx1v6 ... SE core at ~ 1 degree
+#                  f09_f09       ... FV core at ~ 1 degree
+#               BUG 1384 may apply, check if ocean and atm/land must be at same resolution.
+#               Notes about the creation of the 0.25x0.25 ocean + 1deg FV  resolution are in
+#               /glade/p/work/raeder/Models/CAM_init/SST/README
+# cesmtag       The version of the CESM source code to use when building the code.
+#               A directory with this name must exist in your home directory,
+#               and have SourceMods in it. See the SourceMods section.
+#               http://www.image.ucar.edu/pub/DART/CESM/README
+# num_instances The number of ensemble members.
+#
+# Guidelines on what to change for an SE or WACCM run are described in the
+# models/cam/model_mod.html documentation.
+# ==============================================================================
+# AMIP_CAM5_CLM40%SP_CICE%PRES_DOCN%DOM_RTM_SGLC_SWAV (F_AMIP_CAM5) (FAMIPC5)
+
+# ? Do I want to use CESM standard name format for actual runs?
+# E.g. f.e14.FC54L45BGC.f09_025.CAMassim.000a
+
+setenv case            CESM1_5_setup_ATM
+# 'compset' can be a shorthand name.  Farther down COMPSET will have the long name.
+# if -user_compset is set, then '-user_pes_setby cam' must be specified
+setenv compset         FC5L45BGC
+setenv compset_args    "-compset $compset"
+# setenv compset         "AMIP_CAM5_CLM50%BGC_CICE%PRES_DOCN%DOM_MOSART_CISM1%NOEVOLVE_SWAV"
+# setenv compset_args    "-user_compset $compset -user_pes_setby cam"
+
+# NOTE; this uses a non-standard resolution, which required adding a line to create_newcase.
+#       That 'user_grid_file' line should be removed for standard resolutions.
+#       If the glc/CISM resolution is changed, also change GLC_GRID below.
+# setenv resolution       0.9x1.25_0.25x0.25_gland20
+# setenv user_grid        "-user_grid_file /glade/p/work/raeder/Exp/GWD_TMS_CLUBB/config_grids+fv1deg_oi0.25_gland20.xml"
+setenv resolution           0.9x1.25_0.25x0.25
+setenv user_grid "-user_grid_file /glade/p/work/raeder/Exp/ATM_forcXX/config_grids+fv1deg_oi0.25.xml"
+# set user_grid = ' '
+
+# WARNING: CSEG says that alpha tags should not be used for any science.
+# setenv cesmtag              cesm1_5_beta03_clm_r162_pop_20151215_mosart1_0_14
+setenv cesmtag              cesm1_5_alpha02d
+
+setenv num_instances        4
+
+# ==============================================================================
+# machines and directories:
+#
+# mach            Computer name
+# cesmdata        Location of some supporting CESM data files.
+# cesmroot        Location of the CESM code base.  This version of the script
+#                 only supports version cesm1_2_1.
+# caseroot        Will create the CESM case directory here, where the CESM+DART
+#                 configuration files will be stored.  This should probably not
+#                 be in scratch (on yellowstone, your 'work' partition is suggested).
+#                 This script will delete any existing caseroot, so this script,
+#                 and other useful things should be kept elsewhere.
+# rundir          Will create the CESM run directory here.  Will need large
+#                 amounts of disk space, generally on a scratch partition.
+# exeroot         Will create the CESM executable directory here, where the
+#                 CESM executables will be built.  Medium amount of space
+#                 needed, generally on a scratch partition.
+# archdir         Will create the CESM short-term archive directories here.
+#                 Large, generally on a scratch partition.  Files will remain
+#                 here until the long-term archiver moves it to permanent storage.
+# dartroot        Location of the root of _your_ DART installation
+# baseobsdir      Part of the directory name containing the obs_seq.out files to be used in the 
+#                 assimilation.  The year, month, and filename will be provided in assimilate.csh.
+#                 Will be inherited by CESM#_#_DART_config and inserted into assimilate.csh
+# ==============================================================================
+
+setenv mach         yellowstone
+setenv cesmdata     /glade/p/cesm/cseg/inputdata
+# setenv cesmroot     /glade/p/cesm/cseg/collections/${cesmtag}
+setenv cesmroot     /glade/p/work/${USER}/Models/${cesmtag}
+# setenv cesmroot     /glade/p/work/juliob/cesm_tags/cesm1_5_beta03_clm_r162_pop_20151215_mosart1_0_14
+setenv caseroot     /glade/p/work/${USER}/Exp/${case}
+setenv rundir       /glade/scratch/${USER}/${case}/run
+setenv exeroot      /glade/scratch/${USER}/${case}/bld
+setenv archdir      /glade/scratch/${USER}/${case}/archive
+setenv dartroot     /glade/u/home/${USER}/DART/Trunk
+
+# For 2010-08 testing;  Note that assimilate1_5.csh looks for $base_obs_dir/YYYYMM_6H_CESM.
+setenv baseobsdir   /glade/p/image/Observations/NCEP+ACARS
+# For 2010-01
+# setenv baseobsdir   /glade/p/image/Observations/NCEP+ACARS+GPS
+# setenv baseobsdir   /glade/p/image/Observations/GPS/local
+# setenv baseobsdir   /glade/p/image/Observations/ACARS
+
+# ==============================================================================
+# configure settings:
+#
+# refcase    The name of the existing reference case that this run will
+#            start from.
+#
+# refyear    The specific date/time-of-day in the reference case that this
+# refmon     run will start from.  (Also see 'runtime settings' below for
+# refday     start_year, start_mon, start_day and start_tod.)
+# reftod
+# NOTE:      all the ref* variables must be treated like strings and have
+#            the appropriate number of leading zeros
+#
+# stagedir   The directory location of the reference case files.
+# ==============================================================================
+
+# GWD:
+# setenv refcase     cesm_hybrid
+# setenv refyear     2004
+# setenv refmon      01
+# setenv refday      10
+
+setenv refcase     FV1deg_sst.25_spinup
+setenv refyear     2010
+setenv refmon      08
+# setenv refcase     f.e15b03ch.FAMIP.f09_f09.CLB00
+# setenv refyear     2005
+# setenv refmon      01
+
+setenv refday      01
+setenv reftod      00000
+
+# useful combinations of time that we use below
+setenv refdate      $refyear-$refmon-$refday
+setenv reftimestamp $refyear-$refmon-$refday-$reftod
+
+# setenv stagedir /glade/p/image/CESM_initial_ensemble/rest/${reftimestamp}
+# Alternative reference cases for different times may be available here:
+# setenv stagedir /glade/scratch/${USER}/${refcase}/archive/rest/${reftimestamp}
+setenv stagedir /glade/p/work/raeder/Models/CAM_init/${refcase}/${reftimestamp}
+# or on the HPSS:
+# /CCSM/dart/FV0.9x1.25x30_cesm1_1_1/{Mon}1         for 1-degree FV ensembles
+
+# ==============================================================================
+# runtime settings: This script will find usable files for years 19mumble-2010.
+#    Years after that (or before) may require searching $cesmdata for more 
+#    up-to-date files and adding them to the user_nl_cam_#### in the code below.
+#
+# start_year    generally this is the same as the reference case date, but it can
+# start_month   be different if you want to start this run as if it were a different time.
+# start_day
+# start_tod
+#
+# sst_use_defaults Controls what data ocean files are used.
+#                  'true' makes CESM use default files,
+#                  'false' requires you to supply a set of files
+# sst_dataset     Data ocean file
+# sst_grid        Supporting (consistent) grid file
+# sst_year_start  Years included in the sst files.
+# sst_year_end
+#                 The default SST (as of 2015-3) goes through 2012.
+#                 Don't use the last few months, since they are incomplete.
+#
+# short_term_archiver  Copies the files from each job step to a 'rest' directory.
+# long_term_archiver   Puts the files from all completed steps on tape storage.
+#
+# stop_option   Units for determining the forecast length between assimilations
+# stop_n        Number of time units in each forecast
+#
+# If the long-term archiver is off, you get a chance to examine the files before
+# they get moved to long-term storage. You can always submit $CASE.l_archive
+# whenever you want to free up space in the short-term archive directory.
+# ==============================================================================
+
+# setenv start_year    2005
+# setenv start_month   12
+# setenv start_day     15
+setenv start_year    2010
+setenv stop_year     2010
+setenv start_month   08
+setenv start_day     01
+setenv start_tod     00000
+
+setenv sst_use_defaults 'false'
+
+if ($sst_use_defaults == 'false') then
+   # Daily, 1/4-degree SSTs from Reynolds,...,Tomas
+   setenv sst_dataset \
+      "/glade/p/work/raeder/Models/CAM_init/SST/avhrr-only-v2.20100101_cat_20101231_filled_c130829.nc"
+   #   /glade/p/work/raeder/Models/CAM_init/SST/avhrr-only-v2.20110101_cat_20111231_filled_c130829.nc
+   setenv sst_grid    /glade/p/work/raeder/Models/CAM_init/SST/domain.ocn.025.120821.nc
+   setenv sst_year_start $start_year
+   setenv sst_year_end   $stop_year
+#    setenv sst_dataset ${cesmdata}/atm/cam/sst/sst_HadOIBl_bc_0.9x1.25_1850_2013_c140701.nc
+#    setenv sst_grid ${cesmdata}/share/domains/domain.ocn.fv0.9x1.25_gx1v6.130409.nc
+#    setenv sst_year_start 1850
+#    setenv sst_year_end   2011
+endif
+
+
+# Archiving during runs is handled by CESM's st_archive, outside of the data assimilation jobs.
+# It's best to leave long_term_archiver off, so that all the desired cleanup can be done
+# before using TB of tape space.
+setenv short_term_archiver off
+setenv long_term_archiver  off
+
+setenv stop_option         nhours
+setenv stop_n              6
+
+# ==============================================================================
+# job settings:
+#
+# queue      can be changed during a series with xmlchange calls on env_batch.xml
+# timewall   can be changed during a series with xmlchange calls on env_batch.xml
+#
+# TJH: Advancing 30 instances for 6 hours and assimilating took
+#      less than 10 minutes on yellowstone using 1800 pes (120 nodes)
+# ==============================================================================
+
+setenv ACCOUNT      P86850054
+setenv queue        premium
+setenv timewall     1:00
+
+# ==============================================================================
+# standard commands:
+#
+# If you are running on a machine where the standard commands are not in the
+# expected location, add a case for them below.
+# ==============================================================================
+
+set nonomatch       # suppress "rm" warnings if wildcard does not match anything
+
+# The FORCE options are not optional.
+# The VERBOSE options are useful for debugging, though
+# some systems don't like the -v option to any of the following.
+switch ("`hostname`")
+   case be*:
+      # NCAR "bluefire"
+      set   MOVE = '/usr/local/bin/mv -fv'
+      set   COPY = '/usr/local/bin/cp -fv --preserve=timestamps'
+      set   LINK = '/usr/local/bin/ln -fvs'
+      set REMOVE = '/usr/local/bin/rm -fr'
+
+   breaksw
+   default:
+      # NERSC "hopper", NWSC "yellowstone"
+      set   MOVE = '/bin/mv -fv'
+      set   COPY = '/bin/cp -fv --preserve=timestamps'
+      set   LINK = '/bin/ln -fvs'
+      set REMOVE = '/bin/rm -fr'
+
+   breaksw
+endsw
+
+# ==============================================================================
+# ==============================================================================
+# By setting the values above you should be able to execute this script and
+# have it run.  However, for running a real experiment there are still many
+# settings below this point (e.g. component namelists, history file options,
+# the processor layout, xml file options, etc.) that you will almost certainly
+# want to change before doing a real science run.
+# ==============================================================================
+# ==============================================================================
+
+if ($?LS_SUBCWD) then
+   cd $LS_SUBCWD
+endif
+
+# ==============================================================================
+# Make sure the CESM directories exist.
+# VAR is the shell variable name, DIR is the value
+# ==============================================================================
+
+foreach VAR ( cesmroot dartroot stagedir )
+   set DIR = `eval echo \${$VAR}`
+   if ( ! -d $DIR ) then
+      echo "ERROR: directory '$DIR' not found"
+      echo " In the setup script check the setting of: $VAR "
+      exit 10
+   endif
+end
+
+# ==============================================================================
+# Create the case - this creates the CASEROOT directory.
+#
+# For a list of the pre-defined component sets: ./create_newcase -list
+# To create a variant compset, see the CESM documentation and carefully
+# incorporate any needed changes into this script.
+# ==============================================================================
+
+if ($?LSB_JOBNAME) then
+   # This only works if the job name in the BSUB directives is the name of this script.
+   setenv SetupFileName $LSB_JOBNAME
+else
+   setenv SetupFileName $0:t
+endif
+   
+# It is a fatal mistake to make caseroot the directory containing this setup
+# script, since the build process removes all files in the caseroot dir before
+# populating it.  Try to prevent shooting yourself in the foot.
+
+if ( $caseroot == `pwd` ) then
+   echo "ERROR: the setup script should not be located in the caseroot"
+   echo "directory, because all files in the caseroot dir will be removed"
+   echo "before creating the new case.  move the script to a safer place."
+   exit 11
+endif
+
+echo "removing old files from ${caseroot}"
+echo "removing old files from ${exeroot}"
+echo "removing old files from ${rundir}"
+${REMOVE} ${caseroot}
+${REMOVE} ${exeroot}
+${REMOVE} ${rundir}
+
+${cesmroot}/cime/scripts/create_newcase -case ${caseroot} -mach ${mach} \
+    -res ${resolution} ${compset_args} $user_grid
+    
+
+if ( $status != 0 ) then
+   echo "ERROR: Case could not be created."
+   exit 12
+endif
+
+# ==============================================================================
+# Record the DARTROOT directory and copy the DART setup script to CASEROOT.
+# CESM_DART_config can be run at some later date if desired, but it presumes
+# to be run from a CASEROOT directory. If CESM_DART_config does not exist locally,
+# then it better exist in the expected part of the DARTROOT tree.
+# ==============================================================================
+
+# Preserve a copy of this script as it was run.
+# Use setenv so that CESM_DART_config can access it, 
+# in particular for help with removing unneeded restart sets.
+${COPY} $SetupFileName ${caseroot}/${SetupFileName}.original
+
+if (   -e CESM1_5_DART_config ) then
+   sed -e "s#BOGUS_DART_ROOT_STRING#${dartroot}#" < CESM1_5_DART_config \
+       >! ${caseroot}/CESM_DART_config  || exit 20
+   chmod 755       ${caseroot}/CESM_DART_config
+else
+   echo "ERROR: the script to configure for data assimilation is not available."
+   echo "       CESM1_5_DART_config MUST be present in ${dartroot}/models/cam/shell_scripts/"
+   exit 21
+endif
+
+
+# FIXME; DReaD suggested merging our env_archive needs into their file,
+# rather than replacing theirs.  Do this with a 'ex' call instead of the copy above.
+# Replace ".h\w*" with ".*\.h.*" 
+#  \w = [a-zA-Z0-9_], which doesn't include '.', which is part of the names.
+#  '.*' matches any character except newline 0 or more times.
+#  Change * to + if you want '1 or more times'.
+#  Also, '.' at the start says "add more file name pieces before this suffix",
+#     and '.' will be part of the suffix.
+#  '.\.h' doesn't work because it's looking for [anychar].h,
+#     but there's usually no anychar before the '.' except 
+#     what's in the pieces being added, which don't count.
+# Also replace ".r\..*" with ".*\.r\..*"
+# Look for the address of "cpl", then continue through address of ".h\w*" .
+if (! -f ${caseroot}/env_archive.xml.original) then
+   ${COPY} ${caseroot}/env_archive.xml  ${caseroot}/env_archive.xml.original
+ex ${caseroot}/env_archive.xml <<ex_end  
+/"dart"
+/<files>
+append
+      <file_extension suffix="da\.log\..+">
+        <subdir>logs</subdir>
+        <keep_last_in_rundir>false</keep_last_in_rundir>
+      </file_extension> 
+.
+/"cpl"
+/\.r
+s/\.r\\\./.*.r./
+/\.h\\w*/
+s/\.h\\w*/.*.h.*/
+wq
+ex_end
+endif
+
+# Need \\ because '\' actually appears in the file and we need to escape it and '.'
+grep  'da\\\.log' ${caseroot}/env_archive.xml  || exit 30
+
+
+# ==============================================================================
+# Configure the case.
+# ==============================================================================
+
+cd ${caseroot}
+
+# Get a bunch of environment variables.
+# ./Tools/ccsm_getenv no longer exists.
+# If any of these are changed by xmlchange calls in this program,
+# then they must be explicitly changed with setenv calls too.
+setenv TEST_MPI           `./xmlquery MPI_RUN_COMMAND    -value`
+setenv CLM_CONFIG_OPTS    `./xmlquery CLM_CONFIG_OPTS    -value`
+# old? setenv CCSM_COMPSET       `./xmlquery CCSM_COMPSET       -value`
+setenv COMPSET            `./xmlquery COMPSET            -value`
+setenv CAM_DYCORE         `./xmlquery CAM_DYCORE         -value`
+setenv BATCHSUBMIT        `./xmlquery BATCHSUBMIT        -value`
+setenv MAX_TASKS_PER_NODE `./xmlquery MAX_TASKS_PER_NODE -value`
+setenv COMP_OCN           `./xmlquery COMP_OCN           -value`
+setenv CIMEROOT           `./xmlquery CIMEROOT           -value`
+setenv CASEROOT           `./xmlquery CASEROOT           -value`
+
+# Make sure the case is configured with a data ocean.
+
+if ( ${COMP_OCN} != docn ) then
+   echo " "
+   echo "ERROR: This setup script is not appropriate for active ocean compsets."
+   echo "ERROR: Please use the models/CESM/shell_scripts examples for that case."
+   echo " "
+   exit 40
+endif
+
+# Extract pieces of the COMPSET for choosing correct setup parameters.
+# "AMIP_CAM5_CLM50%BGC_CICE%PRES_DOCN%DOM_MOSART_CISM1%NOEVOLVE_SWAV"
+set list = `echo $COMPSET   | sed -e "s/_/ /g"`
+
+# Land ice, aka glacier, aka glc.
+set glc  = `echo "$list[7]" | sed -e "s/%/ /g"`
+set glacier = "$glc[1]"
+if ($glacier !~ 'CISM*' && $glacier != 'DGLC'  && $glacier != 'SGLC') then
+   echo "glacier is $glacier, which is not supported"
+   exit 45
+endif
+
+# River runoff status (2016-2-23 CESM1_5_beta03)
+# There are 2 choices: the older River Transport Model and the new Model for Scale Adaptive River Transport.
+# They are separate CESM components, and are/need to be specified in the compset.
+# It may be that RTM can be turned off via namelists, but I don't know about MOSART.
+# The river runoff model is read from the compset; supported values: 'RTM', 'MOSART', 'DROF', 'SROF'.
+set roff = `echo "$list[6]" | sed -e "s/%/ /g"`
+set river_runoff = "$roff[1]"
+if ($river_runoff != 'RTM'  && $river_runoff != 'MOSART' && \
+    $river_runoff != 'DROF' && $river_runoff != 'SROF') then
+   echo "river_runoff is $river_runoff, which is not supported"
+   exit 45
+endif
+
+# MAX_TASKS_PER_NODE comes from $case/Tools/mkbatch.$machine
+@ ptile = $MAX_TASKS_PER_NODE / 2
+@ nthreads = 1
+
+# Save a copy for debug purposes
+foreach FILE ( *xml )
+   if ( ! -e        ${FILE}.original ) then
+      ${COPY} $FILE ${FILE}.original
+   endif
+end
+
+# NOTE: If you require bit-for-bit agreement between different runs,
+#  in particular, between pmo (single instance) and assimilations (NINST > 1),
+#  or if you need to change the number of nodes/member due to changing memory needs,
+#  then env_run.xml:BFBFLAG must be set to TRUE, so that the coupler will
+#  generate bit-for-bit identical results, regardless of the number of tasks
+#  given to it.  The time penalty appears to be ~ 0.5% in the forecast.

@@ Diff output truncated at 40000 characters. @@