[Dart-dev] [4898] DART/trunk/observations: Moved a chunk of discussion about time, type, location, error from the text

nancy at ucar.edu nancy at ucar.edu
Thu May 5 10:23:46 MDT 2011


Revision: 4898
Author:   nancy
Date:     2011-05-05 10:23:45 -0600 (Thu, 05 May 2011)
Log Message:
-----------
Moved a chunk of discussion about time,type,location,error from the text
converter page into the main observations page since it applies to any
of the observation creation programs.

Modified Paths:
--------------
    DART/trunk/observations/observations.html
    DART/trunk/observations/text/text_to_obs.html

Modified: DART/trunk/observations/observations.html
===================================================================
--- DART/trunk/observations/observations.html	2011-05-04 21:30:19 UTC (rev 4897)
+++ DART/trunk/observations/observations.html	2011-05-05 16:23:45 UTC (rev 4898)
@@ -27,6 +27,7 @@
 
 <A HREF="#Overview">OVERVIEW</A> /
 <A HREF="#DataSources">DATA SOURCES</A> /
+<A HREF="#Decisions">DECISIONS</A> /
 <A HREF="#Programs">PROGRAMS</A> /
 <A HREF="#KnownBugs">KNOWN BUGS</A> /
 <A HREF="#FuturePlans">FUTURE PLANS</A> /
@@ -49,7 +50,7 @@
 type of observation from its state space values.
 <br>
 <br>
-In many cases, a single, self-contained program can convert 
+In many cases a single, self-contained program can convert 
 directly from the observation
 location, time, value, and error into the DART format.  In other cases,
 especially those linking with a complicated external library (e.g. BUFR),
@@ -86,29 +87,199 @@
 </P>
 
 <P>
-This directory is a work in progress.  There are currently about
-10 other observation sources and types which we are in the process
+This directory is a work in progress.  There are currently 
+some additional observation sources and types which we are in the process
 of collecting information and conversion programs for and which will
 eventually be added to this directory.  In the meantime, if you have
 converters for data or interest in something that is not in the
-repository, please email the DART group.
+repository, please <a href="mailto: dart at ucar.edu">email the DART group</a>.
 </P>
 
 <!--==================================================================-->
 
 <A NAME="DataSources"></A>
 <HR>
-<H2>DATA SOURCES</H2>
+<H2>DATA SOURCES AND FORMATS</H2>
 <P>
-&nbsp;
+See the various subdirectories here, which generally include
+information on where the example data was obtained and in what
+format it is distributed. Most data is available for download off
+the web.  The Data Support Section (DSS) at NCAR has large data
+repositories, the MADIS data center distributes observations in
+NetCDF format, and GTS real-time weather data is available from various
+sources. For new converters, 
+if you can find what format the data is distributed in
+you may be able to adapt one of the existing converters 
+here for your own use.  
+Formats read by the existing converters
+include NetCDF, HDF, little-r, text, Prepbufr,
+amongst others.
 </P>
 
 <!--==================================================================-->
 
+<A NAME="Decisions"></A>
+<HR>
+<H2>DECISIONS YOU MIGHT NEED TO MAKE</H2>
+
+<H4>Time</H4>
+
+<P>
+Time enters into
+the assimilation system in 3 places: the timestamp of the
+state vector data (the current model time when this data was produced),
+the time of each observation, and the minimum time period the model
+should be called to advance (the assimilation window size).  The
+internal timestepping of the model is unrelated to any of these 
+times and is outside the scope of the assimilation system.
+</P><P>
+The basic time type in DART is a pair of integers; one for
+the day number and one for the number of seconds.  Generally the
+low order models, which aren't direct geophysical models, use
+time directly as a sequence of days starting at 0 and incrementing
+in any appropriate number of seconds or days.  The observations
+assimilated into these systems do not need to use a calendar.
+</P><P>
+Observations of a real-world system
+usually are distributed with a year/month/day,
+hour/min/seconds timestamp.  There are routines in DART
+to convert back and forth between the (day-number/seconds) 
+format and a variety of (year/month/day) calendars. 
+See <a href="../time_manager/time_manager_mod.html#time_type">here</a>
+for more details on how DART stores time information and
+the types of available calendars.  Some climate models which do
+long runs (100s or 1000s of years) use a modified calendar
+for simplicity in computation, e.g.  months which always 
+have 30 days, or no leap years.  When trying to assimilate real
+observations into these models there may be calendar issues to solve.
+</P><P>
+The smallest resolvable unit of time in DART is a second.
+To model a system which operates on sub-second time scales
+the time can be scaled up by some factor. As long as the
+observation time, the state data time, and the minimum model
+advance time are expressed in the same scaled time units,
+there is no problem.
+</P>
+
+<H4>Error</H4>
+
+<P>
+Each observation must specify an associated expected error.
+Each individual observation stores its own value,
+so it can be a constant value for all observations or it
+can vary by location, by height, by magnitude of the
+observed value, etc.  
+This value is the expected instrument error plus
+the representativeness error of the model.  
+The model error includes deficiencies
+in the equations representing the processes of the system
+as well as errors introduced by representing a continuous
+system as a series of discrete points.
+While the instrument error and the representativeness error
+could be specified separately, they each have the same
+impact on the assimilation and can be difficult to determine
+with any real accuracy.  For simplicity, in DART (and
+most current assimilation software) they are
+combined and specified as a single value.
+</P><P>
+The instrument error is generally supplied by the instrument
+maker.  Sadly, it is frequently surprisingly
+difficult to find these values.
+For the representativeness error, a set of artificial
+observations could be generated with the 
+<a href="../perfect_model_obs/perfect_model_obs.html">perfect_model_obs</a>
+program and an assimilation experiment could be run to
+generate an estimate of the error in the model.
+In practice, however, most people make an educated guess
+on the values of the error and then start with a larger than
+expected value and decrease it based on the results of
+running some test assimilations.
+For these tests the namelist setting for the
+<a href="../filter/filter.html#Namelist">outlier threshold</a> 
+should be disabled by setting it to -1 (the default value is 3). 
+This value controls whether the observation is rejected
+because the observed value is too far from the ensemble mean.
+</P><P>
+If the diagnostics show that the difference between the mean of
+the forward operators and the observed value is consistently 
+smaller than the specified observation error, then the error 
+is probably too large.  
+A too-large error reduces the impact of an
+observation on the state.
+If the specified observation error is too small it is
+likely the observation will be rejected when the outlier
+threshold is enabled, and the observation will not be 
+assimilated.  It is important
+to look at the output observation sequence files after an assimilation
+to see how many observations were assimilated or rejected, and also at
+the RMSE versus the total spread.  
+DART includes Matlab diagnostic routines
+to create these types of plots.  
+The observation RMSE and total spread should be roughly
+commensurate.  The total spread includes contributions from both
+the ensemble variance and the observational error variance, so it
+can be adjusted by changing the error values on the incoming observations.
+There are other ways to adjust the ensemble spread, including
+<A HREf="../filter/filter.html#Inflation">inflation</a>,
+so the observation error is not the only factor to consider.
+</P><P>
+One last recommendation: if possible, the Prior forward operator values
+should be compared against the observations after several assimilation
+cycles.  If you plot results using the Posterior values
+it is always possible for the assimilation to overfit the
+observations and look good on the diagnostic plots.  But the
+actual test is to then advance the model and look at how the
+forecast of the state compares to the observations.
+</P>
+
+<H4>Types</H4>
+
+<P>
+All observations have to have a specific 'type'.  
+There are namelist controls
+to turn on and off the assimilation of observations at run-time
+by type, or to only evaluate the forward operator for an observation
+but have no impact on the state.  
+Several of the diagnostics also group observations by
+type to give aggregate statistics after an assimilation.
+Generally types are based on both the observing platform or
+instrument as well as the kind of observation, 
+e.g. RADIOSONDE_TEMPERATURE, ARGO_SALINITY, etc.
+Each type is associated with a single underlying generic 'kind',
+which controls what forward operator code is called inside the
+model, e.g. KIND_TEMPERATURE, KIND_DENSITY, etc.
+</P><P>
+See <a href="../obs_def/obs_def_mod.html">here</a> for more
+details on how to use and add new DART types.
+The DART obs_kind_mod.f90 defines a list of already defined
+observation kinds, and users can either use existing observation
+types in 'obs_def_xxx_mod.f90' files, or define their own.
+</P>
+
+<H4>Locations</H4>
+
+<P>
+The two most common choices for specifying the location of an
+observation are the 
+<a href="../location/threed_sphere/location_mod.html">threed_sphere</a>
+and the
+<a href="../location/oned/location_mod.html">oned</a>
+locations.  For observations of a real-world system, the 3D Sphere
+is generally the best choice.  For low-order, 1D models, the 1D
+locations are the most commonly used.  The observation locations
+need to match the type of locations used in the model.
+</P> 
+
+<!--==================================================================-->
+
 <A NAME="Programs"></A>
 <HR>
 <H2>PROGRAMS</H2>
 <P>
+The DART/observations directory contains a variety of converter programs
+to read various external formats and convert the observations into the
+format required by DART.  In addition, the following external program
+produces DART observation sequence files:
 <ul>
     <li><a href="http://code.google.com/p/opaws/"
          >Observation Processing And Wind Synthesis (OPAWS)</a>:
@@ -139,7 +310,11 @@
 <HR>
 <H2>FUTURE PLANS</H2>
 <P>
-&nbsp;
+Contact the
+<a href="mailto: dart at ucar.edu">DART development group</a>
+if you have observations in a different format that you want
+to convert.  We can give you advice and pointers on how to
+approach writing the code.
 </P>
 
 <!--==================================================================-->
@@ -151,7 +326,7 @@
 <H2>Terms of Use</H2>
 
 <P>
-DART software - Copyright &#169; 2004 - 2010 UCAR.<br>
+DART software - Copyright &copy; 2004 - 2010 UCAR.<br>
 This open source software is provided by UCAR, "as is",<br>
 without charge, subject to all terms of use at<br>
 <a href="http://www.image.ucar.edu/DAReS/DART/DART_download">

Modified: DART/trunk/observations/text/text_to_obs.html
===================================================================
--- DART/trunk/observations/text/text_to_obs.html	2011-05-04 21:30:19 UTC (rev 4897)
+++ DART/trunk/observations/text/text_to_obs.html	2011-05-05 16:23:45 UTC (rev 4898)
@@ -134,125 +134,14 @@
 <HR>
 <H2>DECISIONS YOU MIGHT NEED TO MAKE</H2>
 
-<H4>Time</H4>
-
 <P>
-The basic time type in DART is a pair of integers; one for
-the day number and one for the number of seconds.  For some
-of the simple models which aren't direct geophysical models
-time can start at day 0 and increment in any appropriate
-number of seconds or days.  For a model of a real-world system
-it is likely to have observations with a year/month/day,
-hour/min/seconds timestamp.  There are routines in DART
-to convert back and forth between the (day-number/seconds) 
-format and a variety of (year/month/day) calendars. 
-See <a href="../../time_manager/time_manager_mod.html#time_type">here</a>
-for more details on how DART stores time information and
-the types of available calendars.
+See the discussion in the
+<a href="../observations.html#Decisions">observations introduction</a> 
+page about what options are available for the things you need to
+specify.  These include setting a time, specifying an expected error,
+setting a location, and an observation type.
 </P>
 
-<H4>Error</H4>
-
-<P>
-Each observation must specify an associated error.
-There is an error value per individual observation.
-It can be a constant value for all observations
-of that type, or it
-can vary by location, by height, by magnitude of the
-observed value, etc.  
-This value is the expected instrument error plus
-the representativeness error of the model.  
-The model error includes deficiencies
-in the equations representing the processes of the system
-as well as errors introduced by representing a continuous
-system as a series of discrete points.
-While the instrument error and the representativeness error
-could be specified separately, they each have the same
-impact on the assimilation and can be difficult to determine
-with any real accuracy.  For simplicity, in DART (and
-most current assimilation software) they are
-combined and specified as a single value.
-</P><P>
-The instrument error is generally supplied by the instrument
-maker.  Sadly, however, it is frequently surprisingly
-difficult to find these values.
-For the representativeness error, a set of artificial
-observations could be generated with the 
-<a href="../../perfect_model_obs/perfect_model_obs.html">perfect_model_obs</a>
-program and an assimilation experiment could be run to
-generate an estimate of the error in the model.
-In practice however most people make an educated guess
-on the values of the error and then start with a larger than
-expected value and decrease it based on the results of
-running some test assimilations.
-For these tests the namelist for the
-<a href="../../filter/filter.html#Namelist">outlier threshold</a> 
-should be disabled by setting it to -1 (the default value is 3). 
-This value controls whether the observation is rejected
-because the observed value is too far from the ensemble mean.
-</P><P>
-If the diagnostics show that the difference between the mean of
-the forward operators and the observed value is consistently 
-smaller than the specified observation error, then the error 
-is probably too large.  
-A too-large error reduces the impact of an
-observation on the state.
-If the specified observation error is too small it is
-likely the observation will be rejected when the outlier
-threshold is enabled, and the observation will not be 
-assimilated.  It is important
-to look at the output observation sequence files after an assimilation
-to see how many observations were assimilated or rejected, and at
-the error and spread of the ensemble mean compared to the observation
-values.
-</P><P>
-One last recommendation: if possible, the Prior forward operator values
-should be compared against the observations after several assimilation
-cycles.  If you plot results using the Posterior values
-it is always possible for the assimilation to overfit the
-observations and look good on the diagnostic plots.  But the
-actual test is to then advance the model and look at how the
-forecast of the state compares to the observations.
-</P>
-
-<H4>Types</H4>
-
-<P>
-All observations have to have a specific 'type'.  
-There are namelist controls
-to turn on and off the assimilation of observations at run-time
-by type, or to only evaluate the forward operator for an observation
-but have no impact on the state.  
-Several of the diagnostics also group observations by
-type to give aggregate statistics after an assimilation.
-Generally types are based on both the observing platform or
-instrument as well as the kind of observation, 
-e.g. RADIOSONDE_TEMPERATURE, ARGO_SALINITY, etc.
-Each type is associated with a single underlying generic 'kind',
-which controls what forward operator code is called inside the
-model, e.g. KIND_TEMPERATURE, KIND_DENSITY, etc.
-</P><P>
-See <a href="../../obs_def/obs_def_mod.html">here</a> for more
-details on how to use and add new DART types.
-The DART obs_kind_mod.f90 defines a list of already defined
-observation kinds, and users can either use existing observation
-types in 'obs_def_xxx_mod.f90' files, or define their own.
-</P>
-
-<H4>Locations</H4>
-
-<P>
-The two most common choices for specifying the location of an
-observation are the 
-<a href="../../location/threed_sphere/location_mod.html">threed_sphere</a>
-and the
-<a href="../../location/oned/location_mod.html">oned</a>
-locations.  For observations of a real-world system, the 3D Sphere
-is generally the best choice.  For low-order, 1D models, the oned
-locations are the most commonly used.  The observation locations
-need to match the type of locations used in the model.
-</P> 
-
 <!--==================================================================-->
 <!-- Describe the bugs.                                               -->
 <!--==================================================================-->

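A note on the Time discussion in the patch above: the (day number, seconds)
pair and the sub-second scaling idea can be sketched in a few lines of
Python. This is only an illustration, not DART code. The function names and
the 1601-01-01 base date are assumptions made for the example; the
time_manager_mod documentation describes what DART actually does.

```python
from datetime import datetime

SECONDS_PER_DAY = 86400

# Assumed base date for the example calendar; not taken from the patch.
ASSUMED_BASE = datetime(1601, 1, 1)


def to_days_seconds(stamp, base=ASSUMED_BASE):
    """Convert a year/month/day hour:min:sec timestamp to (days, seconds)."""
    delta = stamp - base
    # timedelta already splits into whole days and leftover seconds.
    return delta.days, delta.seconds


def scaled_days_seconds(elapsed_seconds, factor):
    """Express a sub-second elapsed time as (days, seconds) in scaled units.

    With factor=1000, one scaled "second" stands for one millisecond.
    Observation times, state times, and the minimum model advance time
    must all use the same factor.
    """
    units = round(elapsed_seconds * factor)
    return divmod(units, SECONDS_PER_DAY)
```

For example, an observation taken 0.25 s into a run, scaled by 1000, would be
stored as day 0, second 250, provided the model state time and the
assimilation window are scaled by the same factor.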

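Similarly, for the Error discussion in the patch above, here is a rough
Python sketch of how the specified observation error feeds into the total
spread and into an outlier check. It assumes the total spread is the square
root of the ensemble variance plus the observation error variance, and that
the outlier threshold is applied in units of that total spread. Both are
assumptions for illustration, and the function names are hypothetical; check
the filter documentation for the exact test DART applies.

```python
import math
from statistics import mean, variance


def total_spread(ensemble, obs_error_var):
    """sqrt(ensemble variance + observation error variance) (assumed form)."""
    return math.sqrt(variance(ensemble) + obs_error_var)


def is_outlier(obs, ensemble, obs_error_var, threshold=3.0):
    """Return True if the observation would be rejected in this sketch.

    A negative threshold (e.g. -1) disables the check, as suggested in the
    text for test assimilations.
    """
    if threshold < 0:
        return False
    distance = abs(obs - mean(ensemble))
    return distance > threshold * total_spread(ensemble, obs_error_var)
```

In this sketch a smaller obs_error_var shrinks the total spread, so the same
observation is more likely to be rejected, which matches the behavior
described in the text for a too-small error value.
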