[Dart-dev] [4811] DART/trunk/utilities: Utility to compute the 'distance' between a group of ensemble members and

nancy at ucar.edu nancy at ucar.edu
Wed Mar 23 15:58:20 MDT 2011


Revision: 4811
Author:   nancy
Date:     2011-03-23 15:58:20 -0600 (Wed, 23 Mar 2011)
Log Message:
-----------
Utility to compute the 'distance' between a group of ensemble members and
the ensemble mean.  The output prints the members in sorted order, and
creates a small file which contains the name of the closest member.
There are several choices for how the distance is computed, since different
schemes might be more appropriate for different types of state vectors.

To use this utility you must turn on the 'output_restart_mean' namelist
option in the &filter_nml namelist, so that it will write the ensemble
mean in restart file format when filter exits.

Added Paths:
-----------
    DART/trunk/utilities/closest_member_tool.f90
    DART/trunk/utilities/closest_member_tool.html
    DART/trunk/utilities/closest_member_tool.nml

Property Changed:
----------------
    DART/trunk/utilities/

-------------- next part --------------

Property changes on: DART/trunk/utilities
___________________________________________________________________
Added: svn:mergeinfo
   + /DART/branches/close:4780-4810

Copied: DART/trunk/utilities/closest_member_tool.f90 (from rev 4810, DART/branches/close/closest_member_tool.f90)
===================================================================
--- DART/trunk/utilities/closest_member_tool.f90	                        (rev 0)
+++ DART/trunk/utilities/closest_member_tool.f90	2011-03-23 21:58:20 UTC (rev 4811)
@@ -0,0 +1,333 @@
+! DART software - Copyright \xA9 2004 - 2010 UCAR. This open source software is
+! provided by UCAR, "as is", without charge, subject to all terms of use at
+! http://www.image.ucar.edu/DAReS/DART/DART_download
+
+program closest_member_tool
+
+! <next few lines under version control, do not edit>
+! $URL$
+! $Id$
+! $Revision$
+! $Date$
+
+! Program to overwrite the time on each ensemble in a restart file.
+
+use types_mod,           only : r8
+use time_manager_mod,    only : time_type, set_time_missing,               &
+                                operator(==), print_time, print_date,      &
+                                set_calendar_type, GREGORIAN, NO_CALENDAR, &
+                                get_calendar_type
+
+use utilities_mod,       only : register_module, do_output,                &
+                                error_handler, nmlfileunit, E_MSG, E_ERR,  &
+                                timestamp, find_namelist_in_file,          &
+                                check_namelist_read, logfileunit,          &
+                                do_nml_file, do_nml_term, open_file, close_file
+                                
+use       sort_mod,      only : slow_index_sort
+
+use assim_model_mod,     only : static_init_assim_model, get_model_size,   &
+                                open_restart_read, open_restart_write,     &
+                                awrite_state_restart, aread_state_restart, &
+                                close_restart
+
+use mpi_utilities_mod,    only : initialize_mpi_utilities, task_count,     &
+                                 finalize_mpi_utilities
+
+
+implicit none
+
+! version controlled file description for error handling, do not edit
+character(len=128), parameter :: &
+   source   = "$URL$", &
+   revision = "$Revision$", &
+   revdate  = "$Date$"
+
+integer               :: iunit, model_size, io, ens
+integer, allocatable  :: index_list(:)
+character(len = 128)  :: ifile, msgstring
+real(r8), allocatable :: mean(:), member(:), diffs(:)
+type(time_type)       :: mean_time, member_time, mean_advance_time, advance_time
+
+character(len=64)     :: method_name(4) = (/     &
+   "Simple Difference    ", &
+   "Normalized Difference", &
+   "Simple RMSE          ", &
+   "Normalized RMSE      "  /)
+
+!----------------------------------------------------------------
+! These variables are namelist-controllable.
+!
+character(len = 128) :: input_file_name        = "filter_restart"
+character(len = 128) :: output_file_name       = "closest_restart"
+integer              :: ens_size               = 1
+logical              :: single_restart_file_in = .true.
+integer              :: difference_method      = 4
+
+!----------------------------------------------------------------
+! different methods to compute 'distance' from mean:
+!  1 = simple absolute difference
+!  2 = normalized absolute difference
+!  3 = simple rmse difference
+!  4 = normalized rmse difference
+!
+!  (suggest more...)
+!----------------------------------------------------------------
+
+namelist /closest_member_tool_nml/  &
+   input_file_name,              &
+   output_file_name,             &
+   ens_size,                     &
+   single_restart_file_in,       &
+   difference_method
+
+
+logical :: input_is_model_advance_file  = .false.
+! FIXME: could add this to namelist:
+!      input_is_model_advance_file 
+! right now we don't output model_advance means, so for now
+! it stays out.  but if you did have the mean, you could
+! use this tool on it.
+
+
+!----------------------------------------------------------------
+!----------------------------------------------------------------
+
+! This program should only be run with a single process
+call initialize_mpi_utilities('closest_member_tool')
+
+if(task_count() > 1) &
+   call error_handler(E_ERR,'closest_member_tool','Only use single process', &
+                      source,revision,revdate)
+
+call register_module(source,revision,revdate)
+
+! Read the namelist entry and print it
+call find_namelist_in_file("input.nml", "closest_member_tool_nml", iunit)
+read(iunit, nml = closest_member_tool_nml, iostat = io)
+call check_namelist_read(iunit, io, "closest_member_tool_nml")
+
+if (do_nml_file()) write(nmlfileunit, nml=closest_member_tool_nml)
+if (do_nml_term()) write(     *     , nml=closest_member_tool_nml)
+
+
+! Initialize the model so we can get the size.
+call static_init_assim_model()
+model_size = get_model_size()
+
+write(msgstring, *) 'Model size/restart data length =', model_size
+call error_handler(E_MSG,'',msgstring)
+write(msgstring, *) 'Ensemble member count = ', ens_size
+call error_handler(E_MSG,'',msgstring)
+write(msgstring, *) 'Computing difference using method: '//trim(method_name(difference_method))
+call error_handler(E_MSG,'',msgstring)
+
+
+! make space for the mean and a single member, plus place to sort list for output
+allocate(mean(model_size), member(model_size), index_list(ens_size), diffs(ens_size))
+
+mean_time         = set_time_missing()
+member_time       = set_time_missing()
+mean_advance_time = set_time_missing()
+mean_advance_time = set_time_missing()
+
+! read in the mean - always in a separate file
+iunit = open_restart_read(trim(input_file_name)//'.mean')
+! Read in the advance time if present
+if (input_is_model_advance_file) then
+   call aread_state_restart(mean_time, mean, iunit, mean_advance_time)
+else
+   call aread_state_restart(mean_time, mean, iunit)
+endif
+call close_restart(iunit)
+
+
+! either loop over individual files or open a single file and read a member
+! at a time.  same functionality; where the file open/close happens differs.
+if (single_restart_file_in) then
+
+   ! One restart file - open once and loop on the read
+
+   iunit = open_restart_read(trim(input_file_name))
+
+   ! Read in the advance time if present
+   do ens=1, ens_size
+      if (input_is_model_advance_file) then
+         call aread_state_restart(member_time, member, iunit, advance_time)
+      else
+         call aread_state_restart(member_time, member, iunit)
+      endif
+
+      ! FIXME: should put in tests for time matching mean to avoid comparing
+      ! against the wrong file.
+
+      !------------------- Compute scaled RMSE   -----------------------
+
+      diffs(ens) = compute_diff(mean, member, model_size) 
+     
+      !------------------- Compute scaled RMSE   -----------------------
+
+   enddo
+   call close_restart(iunit)
+
+else
+   do ens=1, ens_size
+ 
+      ! add member number as a suffix: e.g. base.0000
+      write(ifile, "(a,a,i4.4)") trim(input_file_name), '.', ens
+
+      !------------------- Read restart from file ----------------------
+      iunit = open_restart_read(ifile)
+      ! Read in the advance time if present
+      if (input_is_model_advance_file) then
+         call aread_state_restart(member_time, member, iunit, advance_time)
+      else
+         call aread_state_restart(member_time, member, iunit)
+      endif
+      call close_restart(iunit)
+      !------------------- Read restart from file ----------------------
+      
+      !------------------- Compute scaled RMSE   -----------------------
+
+      diffs(ens) = compute_diff(mean, member, model_size) 
+     
+      !------------------- Compute scaled RMSE   -----------------------
+
+   enddo
+
+endif
+
+!------------------- Print out results     -----------------------
+
+call slow_index_sort(diffs, index_list, ens_size)
+call error_handler(E_MSG, '', ' ')
+write(msgstring, "(A,I5)") 'Member with the minimum difference from the mean is ', index_list(1)
+call error_handler(E_MSG, '', msgstring)
+call error_handler(E_MSG, '', ' ')
+
+do ens=1, ens_size
+   write(msgstring, "(A,I5,A,F18.6)") "Member ", index_list(ens), " difference ", diffs(index_list(ens))
+   call error_handler(E_MSG, '', msgstring)
+enddo
+
+!------------------- Print out results     -----------------------
+
+!------------------- Write results to file -----------------------
+
+! if the input is a single file, write the ensemble member number to a file.
+! if the input is separate files, write the full filename to a file.
+
+iunit = open_file(output_file_name, 'formatted', 'write')
+
+if (single_restart_file_in) then
+   write(iunit, "(I4)") index_list(1)
+else
+   write(iunit, "(A,A,I4.4)") trim(input_file_name), '.', index_list(1)
+endif
+
+call close_file(iunit)
+  
+call error_handler(E_MSG, '', ' ')
+write(msgstring, *) 'Writing closest member information to file: ', trim(output_file_name)
+call error_handler(E_MSG, '', msgstring)
+
+!------------------- Write results to file -----------------------
+
+deallocate(mean, member, index_list, diffs)
+
+call finalize_mpi_utilities()   ! now closes log file, too
+
+!----------------------------------------------------------------
+!----------------------------------------------------------------
+
+contains
+
+function compute_diff(target, candidate, arraysize)
+ real(r8), intent(in) :: target(:)
+ real(r8), intent(in) :: candidate(:)
+ integer,  intent(in) :: arraysize
+
+ real(r8) :: compute_diff
+
+integer  :: i
+real(r8) :: val, r, diff, biggest
+
+select case (difference_method)
+
+   ! simple absolute difference
+   case (1)
+
+      val = 0.0
+      do i = 1, arraysize  
+         val = val + abs(target(i) - candidate(i))
+      enddo 
+
+      compute_diff = val
+
+   ! normalized absolute difference
+   case (2)
+
+      val = 0.0
+      do i = 1, arraysize  
+         if (target(i) == 0.0) then
+            if (candidate(i) == 0.0) then
+               diff = 0.0
+            else
+               diff = candidate(i)
+            endif
+         else
+            diff = abs((target(i) - candidate(i)) / target(i))
+         endif
+         val = val + diff
+      enddo
+      
+      compute_diff = val
+
+   ! simple rmse difference
+   case (3)
+
+      val = 0.0
+      do i = 1, arraysize  
+         diff = target(i) - candidate(i)
+         val = val + ( diff * diff )
+      enddo
+      
+      if (val > 0) then
+         compute_diff = sqrt(val)
+      else
+         compute_diff = val
+      endif
+
+   ! normalized rmse difference
+   case (4)
+
+      val = 0.0
+      do i = 1, arraysize  
+         if (target(i) == 0.0) then
+            if (candidate(i) == 0.0) then
+               diff = 0.0
+            else
+               diff = candidate(i)
+            endif
+         else
+            diff = (target(i) - candidate(i)) / target(i)
+         endif
+         val = val + (diff*diff)
+      enddo
+      
+      if (val > 0) then
+         compute_diff = sqrt(val)
+      else
+         compute_diff = val
+      endif
+
+   case default
+      write(msgstring, *) 'Valid values for difference_method are 1-4, value is', difference_method
+      call error_handler(E_ERR,'closest_member_tool','Bad value for difference_method', &
+                         source,revision,revdate, text2=msgstring)
+end select
+
+
+end function compute_diff
+
+end program closest_member_tool

Copied: DART/trunk/utilities/closest_member_tool.html (from rev 4810, DART/branches/close/closest_member_tool.html)
===================================================================
--- DART/trunk/utilities/closest_member_tool.html	                        (rev 0)
+++ DART/trunk/utilities/closest_member_tool.html	2011-03-23 21:58:20 UTC (rev 4811)
@@ -0,0 +1,278 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
+          "http://www.w3.org/TR/html4/strict.dtd">
+<HTML>
+<HEAD>
+<TITLE>program closest_member_tool</TITLE>
+<link rel="stylesheet" type="text/css" href="../doc/html/doc.css">
+<link href="../doc/html/dart.ico" rel="shortcut icon" />
+</HEAD>
+<BODY>
+<A NAME="TOP"></A>
+
+<H1>PROGRAM <em class=program>closest_member_tool</em></H1>
+
+<table border=0 summary="" cellpadding=5>
+<tr>
+    <td valign=middle>
+    <img src="../doc/html/Dartboard9.png" alt="DART project logo" height=70 />
+    </td>
+    <td>
+       <P>Jump to <a href="../index.html">DART Documentation Main Index</a><br />
+          <small><small>version information for this file: <br />
+          <!-- version tag follows, do not edit -->
+          $Id$</small></small>
+       </P></td>
+</tr>
+</table>
+
+<A HREF="#Namelist">NAMELIST</A> /
+<A HREF="#Modules">MODULES</A> /
+<A HREF="#FilesUsed">FILES</A> /
+<A HREF="#References">REFERENCES</A> /
+<A HREF="#Errors">ERRORS</A> /
+<A HREF="#FuturePlans">PLANS</A> /
+<A HREF="#Legalese">TERMS OF USE</A>
+
+<H2>Overview</H2>
+
+<P>
+Utility program to compare the ensemble mean to a group of ensemble
+member restart files.  The program prints out a sorted order of which
+members are 'closest' to the mean, where the method used to determine 
+'close' is selectable by namelist option.  It also creates a file with
+a single number or character string in it, for ease in scripting, which
+identifies the closest member.
+</P><P>
+The ensemble mean is optionally written out from filter by setting a namelist
+option in the <a href="../filter/filter.html#Namelist">filter namelist</a>
+to output the ensemble mean in restart file format at the end of a run.
+Restart files contain only the bare state vector, so no information
+about the location or variable type is available.  The difference is computed
+point by point across the ensemble members.
+</P><P>
+Available methods are:
+<dl>
+<dt>1 - simple absolute difference:</dt>
+<dd>
+The absolute value of the difference between each item in the 
+mean vector and the corresponding item in each ensemble member, 
+accumulated over the entire state vector.
+</dd>
+<dt>2 - normalized absolute difference:</dt>
+<dd>
+The absolute value of the difference between each item in the 
+mean vector and the corresponding item in each ensemble member
+normalized by the mean value,
+accumulated over the entire state vector.
+</dd>
+<dt>3 - simple RMSE difference:</dt>
+<dd>
+The square root of the accumulated sum of the 
+square of the difference between each item in the mean vector 
+and the corresponding item in each ensemble member.
+</dd>
+<dt>4 - normalized RMSE difference:</dt>
+<dd>
+The square root of the accumulated sum of the 
+square of the normalized difference between each item in the mean 
+vector and the corresponding item in each ensemble member.
+</dd>
+</dl>
+</P><P>
+This program could be used to select one or more ensemble
+members to run a free model forecast forward in time after
+the assimilation is finished.  Each member is an equally likely
+representation of the model state.  Using the ensemble mean
+is not usually a good choice since the mean may not have
+self-consistent structures in the data.
+</P><P>
+In addition to printing out data about all members to both
+the console and to the dart log file, this program creates
+a single output file containing information about the closest member.
+If the input restart data is in a single file, the output
+file 'closest_restart' contains a single number which is
+the ensemble member number.  If the
+input restart data is in separate files, the output file
+contains the full filename of the closest member, e.g.
+'filter_restart.0004' if member 4 is closest.  For scripting
+the contents of this file can be used to copy the corresponding
+member data and convert it to the model input format for a 
+free forecast, for example.
+</P>
+
+Namelist interface
+<A HREF="#Namelist"><em class=code>&amp;closest_member_tool_nml</em> </A>
+is read from file <em class=file>input.nml</em>.
+</P>
+
+<!--==================================================================-->
+<!--=================== DESCRIPTION OF A NAMELIST ====================-->
+<!--==================================================================-->
+
+<A NAME="Namelist"></A>
+<HR>
+<H2>NAMELIST</H2>
+<P>We adhere to the F90 standard of starting a namelist with an ampersand
+'&amp;' and terminating with a slash '/' for all our namelist input.
+</P>
+<div class=namelist>
+<pre>
+<em class=call>namelist / closest_member_tool_nml / </em> 
+   input_file_name, output_file_name, 
+   ens_size, single_restart_file_in, difference_method
+</pre>
+</div>
+
+<div class=indent1>
+<!-- Description -->
+
+<P>
+The namelist sets the behavior of this tool.
+<br>
+<br>
+This namelist is read in a file called <em class=file>input.nml</em>
+</P>
+
+<TABLE border=0 cellpadding=10 width=100%>
+<TR><TH align=left>Contents    </TH>
+    <TH align=left>Type        </TH>
+    <TH align=left>Description </TH></TR>
+<TR><!--contents--><TD valign=top>input_file_name</TD>
+    <!--  type  --><TD valign=top>character</TD>
+    <!--descript--><TD valign=top>File containing the DART restart data.  If
+                  'single_restart_file_in' is .true., this is the exact filename.
+                  If 'single_restart_file_in' is .false. this is the base name and
+                  '.NNNN' will be appended to each name before being opened.
+                  Default "filter_restart"</TD></TR>
+<TR><!--contents--><TD valign=top>output_file_name</TD>
+    <!--  type  --><TD valign=top>character</TD>
+    <!--descript--><TD valign=top>File created by this program with either
+                  the ensemble number or full restart filename for the closest member.
+                  If 'single_restart_file_in' is .true., the file contains the ensemble number.
+                  If 'single_restart_file_in' is .false. the file contains the base name and
+                  '.NNNN' appended for the closest member.
+                  Default "closest_restart"</TD></TR>
+<TR><!--contents--><TD valign=top>ens_size</TD>
+    <!--  type  --><TD valign=top>integer</TD>
+    <!--descript--><TD valign=top>Total number of ensemble members.
+                  Default 20</TD></TR>
+<TR><!--contents--><TD valign=top>single_restart_file_in</TD>
+    <!--  type  --><TD valign=top>logical</TD>
+    <!--descript--><TD valign=top>Whether the input filename contains a single ensemble 
+                  member per file or multiple members concatinated in the same file.
+                  Default .true.</TD></TR>
+<TR><!--contents--><TD valign=top>difference_method</TD>
+    <!--  type  --><TD valign=top>integer</TD>
+    <!--descript--><TD valign=top>Select which method is used
+                  to compute 'distance' from mean:
+<table>
+<tr><td>1 &nbsp;</td><td>simple absolute difference</td></tr>
+<tr><td>2</td><td>normalized absolute difference</td></tr>
+<tr><td>3</td><td>simple RMSE difference</td></tr>
+<tr><td>4</td><td>normalized RMSE difference</td></tr>
+</table>
+
+                  Default 4</TD></TR>
+</TABLE>
+
+</div>
+<br>
+
+<!--==================================================================-->
+
+<A NAME="Modules"></A>
+<HR>
+<H2>MODULES USED</H2>
+<PRE>
+types_mod
+time_manager_mod
+utilities_mod
+sort_mod
+assim_model_mod
+mpi_utilities_mod
+</PRE>
+
+<!--==================================================================-->
+<!-- Describe the Files Used by this module.                          -->
+<!--==================================================================-->
+
+<A NAME="FilesUsed"></A>
+<HR>
+<H2>FILES</H2>
+<UL><LI>inputfile (filter_restart, filter_restart.mean)
+    <LI>closest_member_tool.nml 
+</UL>
+
+<!--==================================================================-->
+<!-- Cite references, if need be.                                     -->
+<!--==================================================================-->
+
+<A NAME="References"></A>
+<HR>
+<H2>REFERENCES</H2>
+<ul>
+<li> none </li>
+</ul>
+
+<!--==================================================================-->
+<!-- Describe all the error conditions and codes.                     -->
+<!--==================================================================-->
+
+<A NAME="Errors"></A>
+<HR>
+<H2>ERROR CODES and CONDITIONS</H2>
+<div class=errors>
+<TABLE border=1 cellspacing=1 cellpadding=10 width=100%>
+<TR><TH>Routine</TH><TH>Message</TH><TH>Comment</TH></TR>
+
+<TR><!-- routine --><TD VALIGN=top>closest_member_tool</TD>
+    <!-- message --><TD VALIGN=top>Invalid method number</TD>
+    <!-- comment --><TD VALIGN=top>Values 1-4 are supported</TD></TR>
+
+</TABLE>
+</div>
+
+<H2>KNOWN BUGS</H2>
+<P>
+none
+</P>
+
+<!--==================================================================-->
+<!-- Describe Future Plans.                                           -->
+<!--==================================================================-->
+
+<A NAME="FuturePlans"></A>
+<HR>
+<H2>FUTURE PLANS</H2>
+<P>
+none
+</P>
+
+<!--==================================================================-->
+<!-- Legalese & Metadata                                              -->
+<!--==================================================================-->
+
+<A NAME="Legalese"></A>
+<HR>
+<H2>Terms of Use</H2>
+
+<P>
+DART software - Copyright &copy; 2004 - 2010 UCAR.<br>
+This open source software is provided by UCAR, "as is",<br>
+without charge, subject to all terms of use at<br>
+<a href="http://www.image.ucar.edu/DAReS/DART/DART_download">
+http://www.image.ucar.edu/DAReS/DART/DART_download</a>
+</P>
+
+<TABLE border=0 cellpadding=0 width=100% summary="">
+<TR><TD valign=top>Contact:       </TD><TD> DART core group   </TD></TR>
+<TR><TD valign=top>Revision:      </TD><TD> $Revision$ </TD></TR>
+<TR><TD valign=top>Source:        </TD><TD> $URL$ </TD></TR>
+<TR><TD valign=top>Change Date:   </TD><TD> $Date$ </TD></TR>
+<TR><TD valign=top>Change&nbsp;history:&nbsp;</TD><TD> try "svn&nbsp;log" or "svn&nbsp;diff" </TD></TR>
+</TABLE>
+
+<!--==================================================================-->
+
+</BODY>
+</HTML>

Copied: DART/trunk/utilities/closest_member_tool.nml (from rev 4810, DART/branches/close/closest_member_tool.nml)
===================================================================
--- DART/trunk/utilities/closest_member_tool.nml	                        (rev 0)
+++ DART/trunk/utilities/closest_member_tool.nml	2011-03-23 21:58:20 UTC (rev 4811)
@@ -0,0 +1,16 @@
+
+# different methods to compute 'distance' from mean:
+#  1 = simple absolute difference
+#  2 = normalized absolute difference
+#  3 = simple rmse difference
+#  4 = normalized rmse difference
+
+namelist /closest_member_tool_nml/  
+   input_file_name        = 'filter_restart',
+   output_file_name       = 'closest_restart',
+   ens_size               = 20,
+   single_restart_file_in = .true.,
+   difference_method      = 4,
+ /
+
+


More information about the Dart-dev mailing list