[Dart-dev] [7735] DART/trunk/models/cam/shell_scripts: new scripts which run multiple forecast cycles in a single
nancy at ucar.edu
Mon Mar 16 14:54:24 MDT 2015
Revision: 7735
Author: nancy
Date: 2015-03-16 14:54:24 -0600 (Mon, 16 Mar 2015)
Log Message:
-----------
new scripts which run multiple forecast cycles in a single
batch job.
Added Paths:
-----------
DART/trunk/models/cam/shell_scripts/README
DART/trunk/models/cam/shell_scripts/archive_cycles.csh
DART/trunk/models/cam/shell_scripts/submit_cycles.csh
-------------- next part --------------
Copied: DART/trunk/models/cam/shell_scripts/README (from rev 7729, DART/branches/cam/models/cam/shell_scripts/README)
===================================================================
--- DART/trunk/models/cam/shell_scripts/README (rev 0)
+++ DART/trunk/models/cam/shell_scripts/README 2015-03-16 20:54:24 UTC (rev 7735)
@@ -0,0 +1,92 @@
+# DART software - Copyright 2004 - 2013 UCAR. This open source software is
+# provided by UCAR, "as is", without charge, subject to all terms of use at
+# http://www.image.ucar.edu/DAReS/DART/DART_download
+#
+# DART $Id$
+
+This README describes how to use the scripts in this directory
+to set up a CESM (active atmospheric component) assimilation.
+
+This includes changes to CESM and DART scripts which enable
+multiple CAM forecasts and assimilations ("cycles") in a single
+LSF job on yellowstone. Because only one job per group of cycles
+waits in the queue, the total queue wait time is greatly reduced.
+It also moves the short-term archiver tasks into a separate,
+single-task job, which saves ~30% of the core hours used in the
+standard workflow and saves even more by eliminating most of the
+copies that are made in st_archive.csh.
+
+The "traditional", single-cycle-per-job way of running CESM+DART is to
+1) Build the DART executables in ../work, as described elsewhere.
+2) Set assimilation parameters in ../work/input.nml according to your needs,
+ as described elsewhere.
+3) Set set-up parameters in CESM{version}_setup_hybrid and locate the files
+ referenced in that script (obs_seq, initial ensemble, ...),
+ as described in that script.
+4) Run CESM{version}_setup_hybrid interactively to set up the
+ $CASEROOT and $RUNDIR directories and put/modify all the
+ necessary scripts and other files there.
+5) Follow the directions printed at the end of the CESM{version}_setup_hybrid run.
+6) Run $CASE.submit to submit a single cycle job with
+ CONTINUE_RUN = FALSE
+7) Change
+ CONTINUE_RUN = TRUE
+ RESUBMIT = as many cycles as desired.
+ $CASE.run will resubmit itself to the queue RESUBMIT times.
+8) Short- and/or long-term archiving may be done at the end of each cycle
+ (that is, whenever a job finishes), as set in env_run.xml.
+
+The new, multi-cycle way of running CESM+DART uses a different
+workflow starting with step 7). Instead of $CASE.run resubmitting
+itself to the queue RESUBMIT times, the variant script
+$CASE.run_cycles will recursively run itself RESUBMIT times
+within a single job. RESUBMIT will be decremented as usual
+at the end of each assimilation cycle, and will be -1 at the end
+of the job.
+
+There is a new submit script as well, $CASE.submit_cycles
+("submit_cycles.csh" in this directory),
+which is run interactively in $CASEROOT, just like
+$CASE.submit in the standard CESM workflow.
+It can submit any number of multi-cycle jobs ($CASE.run_cycles),
+each dependent on the previous one. These are submitted in groups,
+separated by single-processor archiving jobs,
+to prevent the disk from filling up.
+The user organizes this series of jobs in $CASE.submit_cycles with 3 parameters:
+ RESUBMIT = the number of cycles which will fit in the wall clock limit, minus 1
+ jobs_loop = the number of multi-cycle jobs which can be run
+ before the disk fills up
+ archive_loop = the outermost loop; the number of times archive_cycles.csh
+ will be run
+The total number of assimilation cycles is therefore
+($RESUBMIT + 1) * jobs_loop * archive_loop.
+In the multi-cycle workflow, steps 7) and 8) become
+ 7) Edit $CASE.submit_cycles
+ 8) Run $CASE.submit_cycles interactively
+
+
+
+Another way to describe the multi-cycle workflow follows.
+After the new CESM{version}_setup_cycles and CESM_DART_config scripts
+set up the case, the calling tree of these scripts is
+-> $CASE.submit_cycles (run this interactively in $CASEROOT)
+ -> $CASE.run_cycles #1 (created by CESM{version}_setup_cycles)
+ -> the next $CASE.run_cycles (NOT as a batch job; a recursion)
+ ... repeats until RESUBMIT has been reduced to 0.
+
+ [-> $CASE.run_cycles #2,....] (after previous set of $CASE.run_cycles finishes)
+ (Resets RESUBMIT to initial value)
+ -> archive_cycles.csh (waits for the series of $CASE.run_cycles)
+ archives selected restarts
+ archives selected history files
+ -> lt_archive.sh -m copy_dirs_hsi
+
+ -> $CASE.run_cycles #($jobs_loop +1) (waits until the archive_cycles.csh is done)
+ ...
+
+
+# <next few lines under version control, do not edit>
+# $URL$
+# $Revision$
+# $Date$
+
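The README's three parameters (RESUBMIT, jobs_loop, archive_loop) determine both the order of the batch jobs and the total number of cycles. The following is a minimal Python sketch of that bookkeeping, not part of the committed scripts. The function name `plan_jobs` and the tuple layout are hypothetical and exist only for illustration.

```python
def plan_jobs(resubmit, jobs_loop, archive_loop):
    """Return (jobs, total_cycles) for the README's three loop parameters.

    jobs is the ordered list of batch jobs, each a (script, cycles) tuple.
    The names here are illustrative; they are not taken from the DART scripts.
    """
    jobs = []
    for _ in range(archive_loop):
        # Each multi-cycle job runs RESUBMIT + 1 forecast/assimilation cycles.
        for _ in range(jobs_loop):
            jobs.append(("run_cycles", resubmit + 1))
        # One single-task archiving job closes each group of multi-cycle jobs.
        jobs.append(("archive_cycles", 0))
    total_cycles = sum(cycles for _, cycles in jobs)
    return jobs, total_cycles


if __name__ == "__main__":
    jobs, total = plan_jobs(resubmit=9, jobs_loop=2, archive_loop=2)
    for script, cycles in jobs:
        print(script, cycles)
    print("total cycles:", total)
```

With RESUBMIT = 9, jobs_loop = 2 and archive_loop = 2, the sketch lists two run_cycles jobs, one archive job, then the same group again. The total is (9 + 1) * 2 * 2 = 40 cycles, matching the formula in the README.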
Copied: DART/trunk/models/cam/shell_scripts/archive_cycles.csh (from rev 7729, DART/branches/cam/models/cam/shell_scripts/archive_cycles.csh)
===================================================================
--- DART/trunk/models/cam/shell_scripts/archive_cycles.csh (rev 0)
+++ DART/trunk/models/cam/shell_scripts/archive_cycles.csh 2015-03-16 20:54:24 UTC (rev 7735)
@@ -0,0 +1,214 @@
+#!/bin/csh -f
+#
+# DART software - Copyright 2004 - 2013 UCAR. This open source software is
+# provided by UCAR, "as is", without charge, subject to all terms of use at
+# http://www.image.ucar.edu/DAReS/DART/DART_download
+#
+# DART $Id$
+
+# This script is used to archive DART and CESM files when running multiple
+# CESM/DART cycles in a single batch job. See the README file for more details.
+
+#BSUB -o archive_cycles.%J
+#BSUB -e archive_cycles.%J
+#BSUB -W 11:00
+#BSUB -N
+#BSUB -q caldera
+#BSUB -n 1
+#BSUB -P ZZZZZZZZ
+#BSUB -J archive_cycles.csh
+
+# ==============================================================================
+# Load environment variables from CESM
+# ==============================================================================
+
+cd BOGUS_CASE
+
+source ./Tools/ccsm_getenv || exit -1
+
+# Set the archive frequency for restart sets. The others will be removed.
+set archive_Nth_days = 4
+
+# ==============================================================================
+# standard commands:
+#
+# If you are running on a machine where the standard commands are not in the
+# expected location, add a case for them below.
+# ==============================================================================
+
+set nonomatch # suppress "rm" warnings if wildcard does not match anything
+
+# The FORCE options are not optional.
+# The VERBOSE options are useful for debugging, though
+# some systems don't like the -v option to any of the following.
+switch ("`hostname`")
+ case be*:
+ # NCAR "bluefire"
+ set MOVE = '/usr/local/bin/mv -fv'
+ set COPY = '/usr/local/bin/cp -fv --preserve=timestamps'
+ set REMOVE = '/usr/local/bin/rm -fr'
+ set LINK = '/usr/local/bin/ln -fvs'
+ breaksw
+
+ default:
+ # NERSC "hopper", NWSC "yellowstone"
+ set MOVE = '/bin/mv -fv'
+ set COPY = '/bin/cp -fv --preserve=timestamps'
+ set REMOVE = '/bin/rm -fr'
+ set LINK = '/bin/ln -fvs'
+ breaksw
+endsw
+
+# ==============================================================================
+# Use the CESM coupler files to make a file list; options are to
+# delete them, archive them, or leave them in place.
+# ==============================================================================
+
+cd $RUNDIR
+set most_recent = `ls -1t *cpl*.r* | head -1`
+set most_recent_date = `echo $most_recent | sed "s/\.nc//; s/^.*\.r\.//;"`
+echo THE MOST RECENT DATE, $most_recent_date, RESTARTS WILL be saved
+
+foreach cplfile ( *cpl*.r* )
+
+ # Extract the date from the cpl restart file name
+ set date = `echo $cplfile | sed "s/\.nc//; s/^.*\.r\.//;"`
+
+ if ($date != $most_recent_date) then
+ set temp = `echo $date | sed -e "s#-# #g"`
+ set day = $temp[3]
+ set secs = $temp[4]
+ if ( $secs == 00000 && (`expr $day % $archive_Nth_days` == 0 || $day == 28) ) then
+
+ echo $date : archiving files from this date
+ if ( -e $DOUT_S_ROOT/rest/${date} ) then
+ echo directory already exists
+ else
+ echo creating directory
+ mkdir -p $DOUT_S_ROOT/rest/${date}
+ endif
+ $MOVE *.r*.${date}* $DOUT_S_ROOT/rest/${date}
+ $MOVE *.i.${date}* $DOUT_S_ROOT/rest/${date}
+ $MOVE *prior_inflate_restart.${date} $DOUT_S_ROOT/rest/${date}
+ $MOVE *post_inflate_restart.${date} $DOUT_S_ROOT/rest/${date}
+ $COPY *cam*.h*.${date}* $DOUT_S_ROOT/rest/${date}
+ $COPY *clm*.h*.${date}* $DOUT_S_ROOT/rest/${date}
+ else
+ echo $date : deleting files from this date
+ $REMOVE *.r*.${date}*
+ $REMOVE *.i.${date}*
+ $REMOVE *prior_inflate_restart.${date}
+ $REMOVE *post_inflate_restart.${date}
+ endif
+
+ else
+ echo $date : preserving files from this date
+ endif
+
+end
+
+# ==============================================================================
+# archive history files
+# ==============================================================================
+
+# Save most recent day's worth of history files, for potential continuing accumulation.
+# This assumes you are doing 6 hour assimilation cycles.
+set times = (00000 21600 43200 64800)
+foreach t ($times)
+ set most_recent = `ls -1t *cpl*.r*-$t* | head -1`
+ if ($status != 0 || $#most_recent == 0) then
+ # No restart file for this time of day (the status of the pipeline alone may not show it); go to the next t.
+ continue
+ endif
+
+ set most_recent_date = `echo $most_recent | sed "s/\.nc//; s/^.*\.r\.//;"`
+ echo THE MOST RECENT DATE, $most_recent_date, HISTORY FILES WILL be saved
+
+ if ( -e TEMP ) then
+ echo TEMP directory already exists to hold current history files
+ else
+ echo creating TEMP directory to hold current history files
+ mkdir TEMP
+ endif
+
+ #move current history files into this directory
+ $MOVE *cam*.h*.${most_recent_date}* TEMP
+ $MOVE *pop*.h*.${most_recent_date}* TEMP
+ $MOVE *pop*.d*.${most_recent_date}* TEMP
+ $MOVE *rtm*.h*.${most_recent_date}* TEMP
+ $MOVE *clm2*.h*.${most_recent_date}* TEMP
+ $MOVE *cice*.h*.${most_recent_date}* TEMP
+ # Move CAM-SE grid files out of the way
+ $MOVE *Mapping*.nc TEMP
+end
+
+# Now archive all other history files.
+# All times (except for those hidden in TEMP) will be moved to the archive directory.
+# Excluding history files based on times requires extra code here.
+
+# Each of these entries will have * prepended and appended to it.
+set files = ('cpl.log.' 'cesm.log.' 'dart_log.' 'P*Diag' 'True' \
+ 'obs_seq' "${CASE}.cam*.h" 'atm_0001*.log.' "${CASE}.clm*.h" 'lnd_0001*.log.' \
+ 'ice_0001*.log.' 'atm.log.' 'lnd.log.')
+
+# These are parallel lists; entries here must correspond exactly to the file list directly above.
+set dests = ('cpl/logs' 'cpl/logs' 'dart/logs' 'dart/hist' 'dart/hist' \
+ 'dart/hist' 'atm/hist' 'atm/logs' 'lnd/hist' 'lnd/logs' \
+ 'ice/logs' 'atm/logs' 'lnd/logs')
+
+
+if ($#files != $#dests) then
+ echo "Wordlists 'files' and 'dests' must have the same number of words in them"
+ exit 89
+endif
+
+# Make copies of the obs_seq.final files in an unarchived place,
+# to make obs space diagnostics easier.
+set o_s_finals = $RUNDIR:h/Obs_seqs
+if (! -d ${o_s_finals}) mkdir ${o_s_finals}
+
+set f = 1
+while ($f <= $#files)
+ set file_set = "*$files[$f]*"
+ ls $file_set >& /dev/null
+ if ($status != 0) then
+ echo "Finished with all files matching $files[$f]"
+ else
+ if ("$files[$f]" == 'obs_seq' ) then
+ $COPY $file_set ${o_s_finals}
+ endif
+
+ if (! -d $DOUT_S_ROOT/$dests[$f] ) then
+ echo "Making $DOUT_S_ROOT/$dests[$f] "
+ mkdir -p $DOUT_S_ROOT/$dests[$f]
+ endif
+
+ $MOVE $file_set $DOUT_S_ROOT/$dests[$f]
+ endif
+ @ f++
+end
+
+# Remove all log files that haven't been archived.
+
+$REMOVE *log*
+
+
+# move the history files in TEMP back into RUNDIR:
+$MOVE TEMP/* .
+
+# ==============================================================================
+# run the long term archiver if requested
+# ==============================================================================
+
+if ($DOUT_L_MS == 'TRUE') then
+ cd $DOUT_S_ROOT
+ $CASEROOT/Tools/lt_archive.sh -m copy_dirs_hsi
+endif
+
+exit 0
+
+# <next few lines under version control, do not edit>
+# $URL$
+# $Revision$
+# $Date$
+
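The restart-handling loop in archive_cycles.csh decides, for each coupler restart file, whether that date's files are preserved, archived, or deleted. The following Python sketch restates that rule so it can be checked against sample dates. It is an illustration only: the function names and the example file name `mycase.cpl.r.2015-03-16-00000.nc` are hypothetical, and the date layout YYYY-MM-DD-SSSSS is assumed from the way the script splits the date on "-".

```python
import re


def restart_date(filename):
    """Extract the date string, mirroring sed "s/\\.nc//; s/^.*\\.r\\.//"."""
    name = filename.replace(".nc", "", 1)   # drop the first ".nc"
    return re.sub(r"^.*\.r\.", "", name)    # drop everything through ".r."


def disposition(date, most_recent_date, archive_nth_days=4):
    """Return 'preserve', 'archive', or 'delete' for one restart date."""
    if date == most_recent_date:
        return "preserve"                   # newest restart set stays in RUNDIR
    _year, _month, day, secs = date.split("-")
    # Archive midnight restarts from every Nth day of the month, and from day 28.
    if secs == "00000" and (int(day) % archive_nth_days == 0 or int(day) == 28):
        return "archive"
    return "delete"


if __name__ == "__main__":
    newest = "2015-03-18-21600"
    for d in ("2015-03-16-00000", "2015-03-17-00000", "2015-03-16-21600", newest):
        print(d, disposition(d, newest))
```

The default of 4 for `archive_nth_days` corresponds to `archive_Nth_days = 4` in the script. With that value the `day == 28` test is redundant, but it matters for values that do not divide 28.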
Copied: DART/trunk/models/cam/shell_scripts/submit_cycles.csh (from rev 7729, DART/branches/cam/models/cam/shell_scripts/submit_cycles.csh)
===================================================================
--- DART/trunk/models/cam/shell_scripts/submit_cycles.csh (rev 0)
+++ DART/trunk/models/cam/shell_scripts/submit_cycles.csh 2015-03-16 20:54:24 UTC (rev 7735)
@@ -0,0 +1,111 @@
+#!/bin/csh -f
+
+# DART software - Copyright 2004 - 2013 UCAR. This open source software is
+# provided by UCAR, "as is", without charge, subject to all terms of use at
+# http://www.image.ucar.edu/DAReS/DART/DART_download
+#
+# DART $Id$
+
+# This interactive script builds and submits a series of dependent jobs that run
+# a multi-cycle DART experiment, in which multiple model advances and assimilations
+# are done in a single batch job and the 'job dependency' feature of LSF is used
+# to sequence multiple batch jobs.
+#
+# Running this way is a bit more complicated than the basic scripts but
+# for a long experiment will be much cheaper and should have faster throughput.
+# See the README file for more details on how to use these scripts.
+#
+# This script is intended to work only with the LSF batch system. It would
+# need to be heavily adapted for other batch systems.
+
+# (to actually use this script, remove the following lines once you
+# have reviewed how these scripts work.)
+echo "HEY! This is a script intended only for advanced users "
+echo " on NCAR's yellowstone computer."
+echo " Please consult us before using it, since it does not have the level of "
+echo " error checking and commenting which other DART scripts have. "
+exit 218
+
+# ==============================================================================
+
+# Run this from the CESM $CASE directory
+
+# Each 'job' runs RESUBMIT+1 forecast+assimilation cycles.
+# This is defined for CESM by the RESUBMIT variable in env_run.xml,
+# and is set in this script, and then reset by each job at its beginning.
+set RESUBMIT = 9
+
+# 'jobs_loop' jobs can be run before the $RUNDIR disk fills up
+# and the output must be archived/purged.
+set jobs_loop = 1
+
+# Long assimilations may require several archivings.
+# This is defined by archive_loop.
+set archive_loop = 1
+
+@ cycles = ($RESUBMIT + 1) * $jobs_loop * $archive_loop
+echo "The total number of forecasts submitted by this script is $cycles"
+
+# Set RESUBMIT for the first job, to make $CASE.run_cycles start with the
+# right number of forecasts. It will take care of itself after this.
+./xmlchange RESUBMIT=$RESUBMIT
+./xmlchange CONTINUE_RUN=TRUE
+
+source ./Tools/ccsm_getenv || exit 14
+echo RESUBMIT is now $RESUBMIT
+
+# First job has no dependency built in.
+setenv BATCHSUBMIT_DEP "$BATCHSUBMIT"
+
+# Loop over the number of archivings which will be required.
+set a_l = 1
+while ($a_l <= $archive_loop)
+ echo "archive loop $a_l"
+
+ # Loop over the number of jobs which fill up the scratch space,
+ # and necessitate archiving.
+ # Each job will start with the RESUBMIT value set in this script.
+ set j_l = 1
+ while ($j_l <= $jobs_loop)
+ echo " job loop $j_l"
+ echo " ${BATCHSUBMIT_DEP} ./${CASE}.run_cycles"
+
+ echo "${BATCHSUBMIT_DEP} ./${CASE}.run_cycles" >! templar
+ source templar >! out
+
+ set i = `grep Job out`
+ set jobid = `echo $i | cut -d'<' -f2 | cut -d'>' -f1`
+
+ # The next job will start if the previous job state is DONE (not EXIT).
+ setenv BATCHSUBMIT_DEP "bsub -w 'done(${jobid})' <"
+ @ j_l++
+ end
+
+ # ONLY submit the post run cleanup, archive and long_term archive
+ # if completion of the last job (all of its RESUBMITs) is successful.
+
+ setenv BATCHSUBMIT_DEP "bsub -w 'done(${jobid})' <"
+ echo " ${BATCHSUBMIT_DEP} ./archive_cycles.csh"
+
+ echo "${BATCHSUBMIT_DEP} ./archive_cycles.csh" >! templar
+ source templar >! out
+
+ set i = `grep Job out`
+ set jobid = `echo $i | cut -d'<' -f2 | cut -d'>' -f1`
+
+ # submit the next $CASE.run_cycles only if the current archive_cycles.csh is "done"
+ setenv BATCHSUBMIT_DEP "bsub -w 'done(${jobid})' <"
+
+ @ a_l++
+end
+
+if ($status == 0) then
+ rm templar out
+endif
+
+exit 0
+
+# <next few lines under version control, do not edit>
+# $URL$
+# $Revision$
+# $Date$
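
submit_cycles.csh chains jobs by reading the job ID out of the bsub output (`grep Job`, then `cut` on `<` and `>`) and building a `bsub -w "done(...)"` prefix for the next submission. The following Python sketch shows the same parsing and string construction. The function names are hypothetical, and the sample line `Job <12345> is submitted to queue <regular>.` uses a made-up job number and queue name in the usual LSF reply format.

```python
def parse_job_id(bsub_output):
    """Return the text between the first '<' and '>' on the first 'Job' line."""
    for line in bsub_output.splitlines():
        if "Job" in line and "<" in line and ">" in line:
            # Same effect as: cut -d'<' -f2 | cut -d'>' -f1
            return line.split("<")[1].split(">")[0]
    raise ValueError("no job ID found in bsub output")


def dependent_submit_prefix(job_id):
    """Build the submit prefix that waits for job_id to reach state DONE."""
    return f'bsub -w "done({job_id})" <'


if __name__ == "__main__":
    reply = "Job <12345> is submitted to queue <regular>.\n"
    job_id = parse_job_id(reply)
    print(dependent_submit_prefix(job_id), "./mycase.run_cycles")
```

Because the dependency uses `done(...)`, a job that ends in state EXIT does not release the jobs queued behind it, which is the behavior the script's comments describe.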