[Tgcmgroup] mss dispose
Ben Foster
Ben Foster <foster@hao.ucar.edu>
Wed, 16 Jul 2003 10:08:45 -0600 (MDT)
Hi tgcmgroup:
I have prepared modifications to tiegcm1 and tgcm24 to allow batch jobs
to dispose history files to the mss in a separate job step after model
execution rather than during execution.
Normally, when namelist parameter DISPOSE=1, the model disposes history
files to the mss from the Fortran code during model execution. This is
inefficient because only the master task performs the dispose, leaving all
other tasks idle. Also, if the mss is down or heavily loaded, the model can
exceed the wallclock limit while waiting for the mss, before the run finishes.
The DISPOSE namelist parameter can now have one of 3 settings:
  DISPOSE=0 -> do not dispose to mss.
  DISPOSE=1 -> dispose to mss during model execution.
  DISPOSE=2 -> dispose to mss after model execution.
To enable the DISPOSE=2 option, you must use the mods in the following
directories, depending on the model you are running:
/home/tgcm/tgcm24/modsrc.dispose (mods to fixed code in /home/tgcm/tgcm24)
/home/tgcm/tiegcm1/modsrc.dispose (mods to fixed code in /home/tgcm/tiegcm1)
Let me know if you need help using or merging these mods. You must also use
a new job script. There are sample job scripts in the above modsrc.dispose
directories (*.job files). These scripts use 3 job steps as follows:
Step 0: Build and execute the model (parallel, regular queue)
Step 1: Dispose history files to mss if DISPOSE was 2 (serial, share queue)
Step 2: Return output listing (serial, interactive queue)
If the model crashes or times out, the dispose step will still be executed,
correctly disposing the disk files that were written before the crash, so
this method preserves our save/restart capability.
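The control flow of the three job steps can be sketched as follows. This is a Python illustration of the logic only, not the actual job script (the real scripts are the *.job files in the modsrc.dispose directories). The names run_job, run_model, dispose_files, and return_listing are hypothetical. The point it tries to show is that the dispose step does not depend on the model step succeeding.

```python
# Illustrative sketch of the three-step job flow; not the real job script.
# All function names here are hypothetical.

def run_job(dispose, run_model, dispose_files, return_listing):
    """Run the three job steps in order.

    dispose        -- value of the DISPOSE namelist parameter (0, 1, or 2)
    run_model      -- callable for step 0; returns True on success and
                      False on a crash or timeout
    dispose_files  -- callable that sends history files to the mss
    return_listing -- callable for step 2; returns the output listing

    Returns (model_ok, disposed), where disposed says whether the
    post-execution dispose was performed.
    """
    # Step 0: build and execute the model (parallel, regular queue).
    model_ok = run_model()

    # Step 1: dispose history files (serial, share queue).
    # This deliberately ignores model_ok, so files written to disk before
    # a crash or timeout are still disposed.
    disposed = False
    if dispose == 2:
        dispose_files()
        disposed = True

    # Step 2: return the output listing (serial, interactive queue).
    return_listing()

    return model_ok, disposed
```

With DISPOSE=0 or DISPOSE=1 the sketch skips the post-execution dispose, since in those cases nothing is left for the separate step to do.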
Note that the mss dispose step (Step 1, the second job step) uses the share
queue. The wallclock limit in this queue is 6 hours. If you are disposing
several history files, set the wallclock limit for this step fairly high to
allow time for the mss. If the dispose step exceeds the wallclock limit (or
the mss was down), you can go to the execution directory (assuming you had
POSTCLEAN=0 in the job script) and interactively dispose the disk files by
executing the dispose.csh shell script. (This script is built by the model
during execution if DISPOSE is 2.) If some of your files were disposed before
the timeout, you can first edit the dispose.csh script so that it disposes
only the remaining files.
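One way to do that edit is sketched below in Python. It rests on an assumption that may not hold for your script: that dispose.csh has one command per line and that each command names the history file it sends. The function skip_disposed and the command name send_to_mss in the usage example are hypothetical placeholders, so check the actual contents of your dispose.csh before relying on anything like this.

```python
import os

def skip_disposed(script_lines, already_disposed):
    """Comment out script lines that name an already disposed file.

    Assumes (hypothetically) one command per line, with the history file
    appearing as a whitespace-separated argument. Paths are compared by
    base name, and lines that are already comments are left alone.
    """
    done = set(already_disposed)
    edited = []
    for line in script_lines:
        names = {os.path.basename(token) for token in line.split()}
        is_comment = line.lstrip().startswith("#")
        if not is_comment and names & done:
            edited.append("# " + line)
        else:
            edited.append(line)
    return edited

# Usage with made-up script contents (send_to_mss is a placeholder command):
lines = [
    "#!/bin/csh",
    "send_to_mss hist_001.nc /MSS/hist_001.nc",
    "send_to_mss hist_002.nc /MSS/hist_002.nc",
]
for line in skip_disposed(lines, ["hist_001.nc"]):
    print(line)
```

Commenting lines out rather than deleting them keeps a record of what was already sent, which can help if the dispose has to be retried more than once.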
--Ben
-----------------------------------------------------------------------
Ben Foster High Altitude Observatory (HAO)
foster@ucar.edu phone: 303-497-1595 fax: 303-497-1589
Nat. Center for Atmos. Res. P.O. Box 3000 Boulder CO 80307 USA
-----------------------------------------------------------------------