[Dart-dev] [7002] DART/trunk/obs_sequence: change the 'list' option to require N separate lists
nancy at ucar.edu
nancy at ucar.edu
Thu May 22 10:17:32 MDT 2014
Revision: 7002
Author: nancy
Date: 2014-05-22 10:17:32 -0600 (Thu, 22 May 2014)
Log Message:
-----------
change the 'list' option to require N separate lists
if you are going to compare more than a single set
of obs_seq.final files. each list contains the filenames
for each experiment. this is easier for the user to
generate and less confusing all around. fixes issue DART-30.
also support a new namelist item: eval_and_assim_can_match
which defaults to false for backwards compatibility
but can allow obs which are sucessfully assimilated in
one experiment and sucessfully evaluated in another to
match. the implementation is 0 and 1 match, and 2 and 3 match
(assuming the dart_qc_threshold is at the default value
of 3). fixes issue DART-31.
Modified Paths:
--------------
DART/trunk/obs_sequence/obs_common_subset.f90
DART/trunk/obs_sequence/obs_common_subset.html
DART/trunk/obs_sequence/obs_common_subset.nml
-------------- next part --------------
Modified: DART/trunk/obs_sequence/obs_common_subset.f90
===================================================================
--- DART/trunk/obs_sequence/obs_common_subset.f90 2014-05-21 23:15:14 UTC (rev 7001)
+++ DART/trunk/obs_sequence/obs_common_subset.f90 2014-05-22 16:17:32 UTC (rev 7002)
@@ -4,44 +4,47 @@
!
! $Id$
-program obs_common_subset
-! This program takes as input one or more sets of N obs_seq filenames.
-! N is usually 2, but can be any number of files to compare together.
-! It opens the obs_seq files N at a time and does an obs-by-obs comparison.
-! It is intended to pull out a common subset of obs which were successfully
-! assimilated in all experiments so the subsequent diagnostics can give
-! an apples-to-apples comparison of the relative error or fit to obs
-! between different experiments.
-!
-! This tool does not search in the input obs_seq files to try to match obs
-! in different orders or to skip over obs present in only one input file,
-! so all experiments should be done with identical input obs_seq files.
-! The obs to be assimilated or evaluated can be selected by namelist
-! at run time if they are different, and different experiment settings
-! can be selected by other namelist values.
-!
-! If the next input obs don't match exactly (identical types, times, locations,
-! and DART QC) they are discarded. If they match, and if the DART QC indicates
-! they have been successfully assimilated in all experiments, then each of the
-! N input obs are copied through to the N output obs_seq files. At the end of
-! running this tool it creates N output obs_seq files which contain a consistent
-! set of assimilated obs across the set.
+!> This program takes as input one or more sets of N obs_seq filenames.
+!> N is usually 2, but can be any number of files to compare together.
+!> It opens the obs_seq files N at a time and does an obs-by-obs comparison.
+!> It is intended to pull out a common subset of obs which were successfully
+!> assimilated in all experiments so the subsequent diagnostics can give
+!> an apples-to-apples comparison of the relative error or fit to obs
+!> between different experiments.
+!>
+!> This tool does not search in the input obs_seq files to try to match obs
+!> in different orders or to skip over obs present in only one input file,
+!> so all experiments should be done with identical input obs_seq files.
+!> The obs to be assimilated or evaluated can be selected by namelist
+!> at run time if they are different, and different experiment settings
+!> can be selected by other namelist values.
+!>
+!> If the next input obs don't match exactly (identical types, times, locations,
+!> and DART QC) they are discarded. If they match, and if the DART QC indicates
+!> they have been successfully assimilated in all experiments, then each of the
+!> N input obs are copied through to the N output obs_seq files. At the end of
+!> running this tool it creates sets of N output obs_seq files which contain a
+!> consistent set of assimilated obs across the set.
+!>
+!> The outputs are suitable for input to the obs_diag or obs_seq_to_netcdf tools
+!> and there are matlab scripts for comparing the results of N experiments.
+!> Differences in the output diagnostics will now be because of the experiment
+!> differences, not differences in the number of obs assimilated.
+!>
+!> Running this program creates sets of N new output files corresponding to each
+!> of N input files. The output names are the input filenames with a suffix appended.
+!> It can take a list of obs_seq files, either given directly in the namelist or
+!> a list of N filenames where each file contains one input filename per line.
+!> If the lists contain more than 1 filename, this program will loop over each
+!> set of filenames in the lists until the end of the file.If more than N files
+!>
+!> There is an option to allow observations which have been either successfully
+!> assimilated or evaluated to match each other. The default requires them to
+!> have been treated identically in all experiments.
-! The outputs are suitable for input to the obs_diag or obs_seq_to_netcdf tools
-! and there are matlab scripts for comparing the results of N experiments.
-! Differences in the output diagnostics will now be because of the experiment
-! differences, not differences in the number of obs assimilated.
+program obs_common_subset
-! Running this program creates sets of N new output files corresponding to each
-! of N input files. The output names are the input filenames with a suffix appended.
-! It can take a list of obs_seq files, either given directly in the namelist or
-! from a separate ascii file with one input filename per line. If more than N files
-! are specified it will process them N at a time and loop until the end of the list.
-! The filenames should be listed so all files from experiment 1 are first, then all
-! files from experiment 2, up to N experiments.
-
-
use types_mod, only : r8, metadatalength
use utilities_mod, only : register_module, initialize_utilities, &
find_namelist_in_file, check_namelist_read, &
@@ -73,30 +76,31 @@
character(len=32 ), parameter :: revision = "$Revision$"
character(len=128), parameter :: revdate = "$Date$"
-! max number of files to compare against each other. this program will loop
-! over sets of files for as long as there are input files in the list. if you
-! need to compare more than 50 files together, let us know what you are doing
-! because i'd be very interested to hear. but you can make the maxseq larger
-! in that case.
+! max_num_input_files : maximum total number of input sequence files to be processed.
+integer, parameter :: max_num_input_files = 5000
-integer, parameter :: maxseq = 50 ! max number of files compared at once
+! max number of files to compare against each other. if you need to compare
+! more than 50 files together, let us know what you are doing because i'd be
+! very interested to hear. but you can make the maxcomp larger in that case.
-type(obs_sequence_type) :: seq_in(maxseq)
-type(obs_sequence_type) :: seq_out(maxseq)
-type(obs_type) :: obs_in(maxseq), next_obs_in(maxseq)
-type(obs_type) :: obs_out(maxseq), prev_obs_out(maxseq)
-logical :: is_this_last(maxseq)
+integer, parameter :: maxcomp = 50 ! max number of files compared at once
+
+type(obs_sequence_type) :: seq_in(maxcomp)
+type(obs_sequence_type) :: seq_out(maxcomp)
+type(obs_type) :: obs_in(maxcomp), next_obs_in(maxcomp)
+type(obs_type) :: obs_out(maxcomp), prev_obs_out(maxcomp)
+logical :: is_this_last(maxcomp)
logical :: wanted
-integer :: size_seq_in(maxseq), num_copies_in(maxseq), num_qc_in(maxseq)
-integer :: size_seq_out, num_inserted, iunit, io, i, j, k
-integer :: max_num_obs(maxseq), file_id, atonce, nsets, offset
+integer :: size_seq_in(maxcomp), num_copies_in(maxcomp), num_qc_in(maxcomp)
+integer :: size_seq_out, num_inserted, iunit, io, i, j, k, nextfile
+integer :: max_num_obs(maxcomp), file_id, atonce, nsets
integer :: num_rejected_badqc, num_rejected_diffqc, num_rejected_other
integer :: num_mismatch_loc, num_mismatch_time, num_mismatch_type
character(len = 129) :: read_format
logical :: pre_I_format, cal
-character(len = 256) :: msgstring, msgstring1, msgstring2, msgstring3
-character(len = 256) :: filename_in(maxseq)
-character(len = 300) :: filename_out(maxseq) ! filename_in + . + suffix
+character(len = 512) :: msgstring, msgstring1, msgstring2, msgstring3
+character(len = 256) :: filename_in(maxcomp)
+character(len = 384) :: filename_out(maxcomp) ! filename_in + . + suffix
character(len = metadatalength), parameter :: dart_qc_meta_data = 'DART quality control'
character(len = metadatalength) :: meta_data
@@ -104,29 +108,22 @@
integer :: qc_index
integer :: num_input_sets = 0
+character(len = 256) :: temp_filelist(max_num_input_files) = ''
+
!----------------------------------------------------------------
! Namelist input with default values
-! max_num_input_files : maximum number of input sequence files to be processed
-! if filenames are specified in the namelist. if filenames are specified in a
-! separate ascii input file, the list length can be infinite. for names in the
-! namelist, the len must be > num_to_compare_at_once * number of sets of files
-! to do. e.g. if you are comparing 3 files at a time, and want to compare a
-! series of 100 sets of 3 files, that would be 300 filenames. make these
-! numbers bigger if these values are too small, and recompile.
-
-integer, parameter :: max_num_input_files = 5000
-
character(len = 256) :: filename_seq(max_num_input_files) = ''
-character(len = 256) :: filename_seq_list = ''
+character(len = 256) :: filename_seq_list(maxcomp) = ''
character(len = 32) :: filename_out_suffix = '.common'
integer :: num_to_compare_at_once = 2
logical :: print_only = .false.
! not expected to be changed often, but its here if needed.
-integer :: print_every = 1000
+integer :: print_every = 10000
integer :: dart_qc_threshold = 3 ! ok if DART QC <= this
+logical :: eval_and_assim_can_match = .false.
character(len=32) :: calendar = 'Gregorian'
@@ -134,7 +131,8 @@
num_to_compare_at_once, &
filename_seq, filename_seq_list, &
filename_out_suffix, print_only, &
- print_every, dart_qc_threshold, calendar
+ print_every, dart_qc_threshold, &
+ calendar, eval_and_assim_can_match
!----------------------------------------------------------------
! Start of the program:
@@ -161,19 +159,19 @@
! as in the other tools, it is an error to specify both an explicit
! list and the name of file for input.
-write(msgstring, '(A,I3,A)') "Comparing ", num_to_compare_at_once, " obs_seq files at a time"
+write(msgstring, '(A,I3,A)') "Asking to compare ", num_to_compare_at_once, " obs_seq files at a time"
call error_handler(E_MSG, "obs_common_subset", msgstring, source,revision,revdate)
-if ((num_to_compare_at_once < 2) .or. (num_to_compare_at_once > maxseq)) then
- write(msgstring, *) "num_to_compare_at_once must be >= 2 and <= ", maxseq
+if ((num_to_compare_at_once < 2) .or. (num_to_compare_at_once > maxcomp)) then
+ write(msgstring, *) "num_to_compare_at_once must be >= 2 and <= ", maxcomp
call error_handler(E_ERR, "obs_common_subset", msgstring, source,revision,revdate)
endif
! if there is only a single file from each experiment to compare, the order
-! is obvious. if each experiment has a list of input obs_seq.final files from
-! a multi-step assimilation, list all the files from experiment 1, then all
-! files from experiment 2, etc.
-call handle_filenames(num_to_compare_at_once, filename_seq, filename_seq_list, &
+! doesn't matter. if each experiment has a list of input obs_seq.final files from
+! a multi-step assimilation, list all the files from experiment 1 in list-file 1,
+! all files from experiment 2 in list-file 2, etc.
+call handle_filenames(num_to_compare_at_once, maxcomp, filename_seq, filename_seq_list, &
num_input_sets)
if (num_input_sets > 1) then
@@ -202,42 +200,33 @@
! if they aren't present.
! count of input sets was set in the handle_filenames() routine above.
+nextfile = 1
NUMSETS: do j = 1, nsets
! if num_to_compare_at_once = 3 and num_input_sets = 100, then this
! assumes the input file list is organized in this order:
- ! a1,b1,c1, a2,b2,c2, ... a100,b100,c100
+ ! a1,b1,c1, a2,b2,c2, ..., a100,b100,c100
! where a, b, c are the 3 separate experiments to compare,
! and 1, 2, ... 100 are the obs_seq.final output files from 100 steps
- ! of a multi-step assimilation.
+ ! of a multi-step assimilation. handle_filenames() arranges this
+ ! order if list files were used.
+ ! fill in the names of the current set of obs_seq files
do i = 1, atonce
-
- offset = (i-1)*num_input_sets + j
-
- ! fill in the names of the current set of obs_seq files
- filename_in(i) = filename_seq(offset)
-
+ filename_in(i) = filename_seq(nextfile)
+ nextfile = nextfile + 1
enddo
- ! now filename_in(:) has all the names for this set
+ ! now filename_in(:) has all the names for this set. read the headers in.
do i = 1, atonce
- if ((len(filename_in(i)) == 0) .or. (filename_in(i) == "")) then
- write(msgstring, *) 'sequence name ', i, ' in set ', j, ' is empty or null'
- call error_handler(E_ERR,'obs_common_subset', msgstring, &
- source,revision,revdate) ! shouldn't happen because already tested for this
- endif
-
- ! read in the next set of files.
-
call read_obs_seq_header(filename_in(i), num_copies_in(i), num_qc_in(i), &
size_seq_in(i), max_num_obs(i), file_id, read_format, pre_I_format, &
close_the_file = .true.)
if (size_seq_in(i) == 0) then
write(msgstring1, *) 'This tool cannot process empty obs_seq files'
- write(msgstring,*) 'Obs in input sequence file ', trim(filename_in(i))
+ write(msgstring,*) 'No obs found in input sequence file ', trim(filename_in(i))
call error_handler(E_ERR,'obs_common_subset',msgstring, &
source,revision,revdate, text2=msgstring1)
endif
@@ -479,20 +468,22 @@
!---------------------------------------------------------------------
-subroutine handle_filenames(setsize, filename_seq, filename_seq_list, &
+subroutine handle_filenames(setsize, maxsets, filename_seq, filename_seq_list, &
num_input_sets)
! sort out the input lists, set the length as a return in num_input_sets,
! fill in filename_seq if list was given, and make sure what's specified is consistent.
integer, intent(in) :: setsize
+integer, intent(in) :: maxsets
character(len=*), intent(inout) :: filename_seq(:)
-character(len=*), intent(in) :: filename_seq_list
+character(len=*), intent(in) :: filename_seq_list(:)
integer, intent(out) :: num_input_sets
-integer :: indx, num_input_files
+integer :: indx, num_input_files, i, ifiles, fcount, this_set_filecount
+integer :: files_per_set, nlists, iset, src, dst
logical :: from_file
-character(len=32) :: source
+character(len=32) :: nname
! if the user specifies neither filename_seq nor filename_seq_list, error
! if the user specifies both, error.
@@ -501,7 +492,7 @@
! maxfiles and read it into the explicit list and continue.
! set num_input_sets to the count of sets in the list
-if (filename_seq(1) == '' .and. filename_seq_list == '') then
+if (filename_seq(1) == '' .and. filename_seq_list(1) == '') then
msgstring2='One of filename_seq or filename_seq_list must be specified in namelist'
call error_handler(E_ERR,'handle_filenames', &
'no filenames specified as input', &
@@ -509,64 +500,143 @@
endif
! make sure the namelist specifies one or the other but not both
-if (filename_seq(1) /= '' .and. filename_seq_list /= '') then
+if (filename_seq(1) /= '' .and. filename_seq_list(1) /= '') then
call error_handler(E_ERR,'handle_filenames', &
'cannot specify both filename_seq and filename_seq_list in namelist', &
source,revision,revdate)
endif
-! if they have specified a file which contains a list, read it into
-! the filename_seq array and set the count.
-if (filename_seq_list /= '') then
- source = 'filename_seq_list'
- from_file = .true.
-else
- source = 'filename_seq'
+! count of all filenames, either explicitly in the namelist
+! or specified by a list of files which contain lists of names.
+num_input_files = -1
+nlists = -1
+
+if (filename_seq(1) /= '') then
+
+ ! if the files are given in the namelist, just need to make sure
+ ! there are at least N given (where N == num_to_compare_at_once).
+
+ nname = 'filename_seq'
from_file = .false.
-endif
+ FILELOOP: do indx = 1, max_num_input_files
+ ! an empty name ends the list
+ if (filename_seq(indx) == '') then
+ if (indx < num_to_compare_at_once) then
+ write(msgstring, '(A,I5,A)') trim(nname)//' must contain at least ', &
+ num_to_compare_at_once, ' obs_seq filenames'
+ call error_handler(E_ERR,'handle_filenames', msgstring, &
+ source,revision,revdate)
+ endif
+ num_input_files = indx - 1
+ exit FILELOOP
+ endif
+ enddo FILELOOP
-! debug?
-if (.false.) write(*,*)'filename_seq(1) ',trim(filename_seq(1)),' ', trim(source), ' ',from_file
+ ! make sure the list is an even multiple of setsize
+ if (modulo(num_input_files, setsize) /= 0) then
+ write(msgstring, *) 'number of input files must be an even multiple of ', setsize
+ write(msgstring1, *) 'found ', num_input_files, ' input files'
+ call error_handler(E_ERR,'handle_filenames', msgstring, source,revision,revdate, &
+ text2=msgstring1)
+ endif
-! count of filenames in list
-num_input_files = 0
+ num_input_sets = num_input_files / setsize
+else
-! the point of this loop is to count up how many sets of input seq files we have,
-! and to fill in the filename_seq(:) array if a list file was given.
-FILELOOP: do indx = 1, max_num_input_files
- if (from_file) &
- filename_seq(indx) = get_next_filename(filename_seq_list, indx)
+ ! if they have specified a file which contains a list, read it into
+ ! a temp array, set the counts, and do some initial error checks.
- ! an empty name ends the list
- if (filename_seq(indx) == '') then
- if (indx == 1) then
- call error_handler(E_ERR,'handle_filenames', &
- trim(source)//' contains no input obs_seq filenames', &
- source,revision,revdate)
- endif
+ nname = 'filename_seq_list'
+ from_file = .true.
- exit FILELOOP
+ ! this is the total running count of files across all lists
+ fcount = 1
- endif
+ EXPLOOP: do iset = 1, maxsets
+
+ ! an empty name ends the list of lists
+ if (filename_seq_list(iset) == '') then
+ num_input_files = fcount - 1
+ nlists = iset - 1
+ if (nlists /= setsize) then
+ write(msgstring1, '(A)') 'number of input file lists in "'//trim(nname)// &
+ '" must equal "num_to_compare_at_once"'
+ write(msgstring2, '(2(A,I5))') trim(nname)//' has ', nlists, &
+ ' items while num_to_compare_at_once is ', setsize
+
+ call error_handler(E_ERR,'handle_filenames', msgstring1, &
+ source,revision,revdate, text2=msgstring2)
+ endif
+ exit EXPLOOP
+ endif
+
+ LISTLOOP: do ifiles=1, max_num_input_files
+ if (fcount >= max_num_input_files) then
+ write(msgstring, *) 'cannot specify more than ',max_num_input_files,' total files'
+ write(msgstring1, *) 'for more, change "max_num_input_files" in the source and recompile'
+ call error_handler(E_ERR,'handle_filenames', msgstring, source,revision,revdate, &
+ text2=msgstring1)
+ endif
- num_input_files = indx
+ temp_filelist(fcount) = get_next_filename(filename_seq_list(iset), ifiles)
+
+ ! an empty name ends the list
+ if (temp_filelist(fcount) == '') then
+ if (ifiles == 1) then
+ call error_handler(E_ERR,'handle_filenames', &
+ trim(filename_seq_list(iset))//' contains no input obs_seq filenames', &
+ source,revision,revdate)
+ endif
+ this_set_filecount = ifiles-1
+ exit LISTLOOP
+ endif
+ fcount = fcount + 1
+ enddo LISTLOOP
-enddo FILELOOP
+ ! make sure each file in the list contains the same number
+ ! of filenames to be compared. set the target the first time
+ ! through and then compare subsequent counts to be sure they match.
+ if (iset == 1) then
+ files_per_set = this_set_filecount
+ else
+ if (files_per_set /= this_set_filecount) then
+ write(msgstring1, '(A)') trim(filename_seq_list(iset))//' does not contain ' // &
+ 'the same number of obs_seq filenames as previous lists'
+ write(msgstring2, '(3(A,I5),A)') 'list 1 contains ', files_per_set, ' filenames while list ', iset, &
+ ' contains ', this_set_filecount, ' filenames'
+ call error_handler(E_ERR,'handle_filenames', msgstring1, &
+ source,revision,revdate, text2=msgstring2)
+ endif
+ endif
+
+ enddo EXPLOOP
+
+ num_input_sets = files_per_set
+endif
-if (num_input_files == max_num_input_files) then
- write(msgstring, *) 'cannot specify more than ',max_num_input_files,' files'
- call error_handler(E_ERR,'handle_filenames', msgstring, source,revision,revdate)
+if (num_input_files >= max_num_input_files) then
+ write(msgstring, *) 'cannot specify more than ',max_num_input_files,' files'
+ write(msgstring1, *) 'for more, change "max_num_input_files" in the source and recompile'
+ call error_handler(E_ERR,'handle_filenames', msgstring, source,revision,revdate, &
+ text2=msgstring1)
endif
-! make sure the list is an even multiple of setsize
-if (modulo(num_input_files, setsize) /= 0) then
- write(msgstring, *) 'number of input files must be an even multiple of ', setsize
- call error_handler(E_ERR,'handle_filenames', msgstring, source,revision,revdate)
+if (from_file) then
+ ! now that we know we have the right number of filenames, copy them
+ ! from the temp list into filename_seq in the right order for processing:
+ ! first all file 1s from each experiment, then file 2s, etc.
+ ! the temp list is all files from exp1, all files from exp2, etc
+ dst = 0
+ do i=1, num_input_sets
+ do j=1, setsize
+ src = (j-1)*num_input_sets + i
+ dst = dst + 1
+ filename_seq(dst) = temp_filelist(src)
+ enddo
+ enddo
endif
-num_input_sets = num_input_files / setsize
-
end subroutine handle_filenames
!---------------------------------------------------------------------
@@ -732,8 +802,6 @@
else
type_count(this_obs_kind) = type_count(this_obs_kind) + 1
endif
-! print *, 'obs kind index = ', this_obs_kind
-! if(this_obs_kind > 0)print *, 'obs name = ', get_obs_kind_name(this_obs_kind)
call get_next_obs(seq_in, obs, next_obs, is_this_last)
if (.not. is_this_last) then
@@ -988,17 +1056,19 @@
return
endif
- ! here is the first test we expect to be the reason most obs fail:
- ! if they've been assimilated in one experiment but not another.
- if (test1_qc /= testN_qc) then
- num_rejected_diffqc = num_rejected_diffqc + 1
+ ! check first to see if either of the DART QCs indicate
+ ! that this obs was rejected for some reason. this code
+ ! is testing the first QC multiple times if N > 2 but
+ ! it makes the logic simpler so just repeat the test.
+ if (test1_qc > threshold .or. testN_qc > threshold) then
+ num_rejected_badqc = num_rejected_badqc + 1
return
endif
- ! and here is the second:
- ! if they were not assimilated in any experiment.
- if (test1_qc > threshold) then
- num_rejected_badqc = num_rejected_badqc + 1
+ ! this is the test we expect to be the reason most obs fail:
+ ! if they've been assimilated in one experiment but not another.
+ if (.not. match_qc(test1_qc, testN_qc)) then
+ num_rejected_diffqc = num_rejected_diffqc + 1
return
endif
@@ -1011,6 +1081,39 @@
!---------------------------------------------------------------------
+function match_qc(qc1, qc2)
+
+! normal behavior is to return true if the values are equal,
+! and false if not. but if the "it's ok for evaluated and
+! assimilated obs to match", then allow DART QC value 0 to
+! match 1, and 2 to match 3, but no other combinations are valid.
+
+integer, intent(in) :: qc1
+integer, intent(in) :: qc2
+logical :: match_qc
+
+! if the combination of values is ok, return early
+match_qc = .true.
+
+if (qc1 == qc2) return
+
+if (eval_and_assim_can_match) then
+ if (qc1 == 0 .and. qc2 == 1) return
+ if (qc1 == 1 .and. qc2 == 0) return
+
+ if (qc1 == 2 .and. qc2 == 3) return
+ if (qc1 == 3 .and. qc2 == 2) return
+endif
+
+! not one of the allowed matches. these qcs really
+! are not compatible and the match failed.
+
+match_qc = .false.
+
+end function match_qc
+
+
+!---------------------------------------------------------------------
end program obs_common_subset
! <next few lines under version control, do not edit>
Modified: DART/trunk/obs_sequence/obs_common_subset.html
===================================================================
--- DART/trunk/obs_sequence/obs_common_subset.html 2014-05-21 23:15:14 UTC (rev 7001)
+++ DART/trunk/obs_sequence/obs_common_subset.html 2014-05-22 16:17:32 UTC (rev 7002)
@@ -36,40 +36,95 @@
<H2>Overview</H2>
<P>
-This specialized tool allow you to select sets of observations
-from two or more input observation sequence files
-where the obs were successfully assimilated in all files.
-It creates a new set of observation sequence files containing only that
-subset of the original observations.
+This specialized tool allows you to select subsets of observations
+from two or more observation sequence files output from <em class=file>filter</em>.
+It creates a new set of output observation sequence files containing only
+the observations which were successfully assimilated in all experiments.
</P>
<P>
-The intended use of this tool is as part of the process of comparing the results
-from a group of related experiments in which the same observation sequence file
-is used as input for all runs. The tool cannot process observation sequence
+Experiments using the same input observation sequence file but
+with different configurations (e.g. different inflation values,
+different localization radii, etc) can assimilate different
+numbers of the available observations.
+In that case there will be differences in the diagnostic plots
+which are not directly relatable to the differences in the quality
+of the assimilation. If this tool is run on the <em class=file>obs_seq.final</em>
+files from all the experiments and then the diagnostics are generated,
+only the observations which were assimilated in all experiments will
+contribute to the summary statistics. A more direct comparison can be
+made and improvements can be correctly attributed to the differences
+in the experimental parameters.
+</P>
+<P>
+This tool is intended to be used when comparing the results
+from a group of related experiments in
+which <strong>the exact same input observation sequence file</strong>
+is used for all runs. The tool cannot process observation sequence
files which differ in anything other than whether an observation was successfully
-assimilated/evaluated or not.
+assimilated/evaluated or not. Note that it is fine to add or remove observation
+types from the <em class=code>assimilate_these_obs_types</em> or
+<em class=code>evaluate_these_obs_types</em> namelist items for different
+experiments. The output observation
+sequence files will still contain an identical list of observations, with some
+marked with a DART QC indicating 'not assimilated because of namelist control'.
</P>
<P>
-Experiments with different configurations can assimilate different
-numbers of the available observations. In that case there will be differences
-in the diagnostic plots
-which are not directly relatable to the differences in the quality
-of the assimilation. If this tool is run on the obs_seq.final files from
-all the experiments and then the diagnostics are generated,
-only the observations which were assimilated in all experiments will
-contribute to the summary statistics. A more direct comparison can be
-made and improvements attributed to the differences in the experimental
-parameters.
-</P><P>
See the <a href="http://www.image.ucar.edu/DAReS/DART/DART_Documentation.php#obs_diagnostics">
two experiment diagnostic plot</a> documentation for Matlab scripts
supplied with DART to directly compare the observation diagnostic output
-from multiple experiments (more than two, the script has a poor name).
+from multiple experiments (it does more than two, the script has a poor name).
</P><P>
This is one of a set of tools which operate on observation sequence files.
-For a more general purpose tool, see the
-<a href="obs_sequence_tool.hmtl">obs_sequence_tool</a>.
+For a more general purpose tool see the
+<a href="obs_sequence_tool.html">obs_sequence_tool</a>,
+and for a more flexible selection tool see the
+<a href="obs_selection_tool.html">obs_selection_tool</a>.
</P>
+<H4>Creating an Input Filelist</H4>
+<P>
+One of the inputs to this tool is a list of filenames to compare.
+The filenames can be directly in the namelist file, or they can
+be in a set of separate text files. The latter may be easier when there
+are more than just a few files to compare.
+</P>
+<P>
+For experiments where there are multiple job steps, and so multiple
+output observation sequence files per experiment, the input to this
+tool would then be a list of lists of filenames.
+Each set of names must be put into a text file with
+each filename on a separate line.
+</P><P>
+If each experiment was run in a different set of directories,
+and if a list of observation sequence filenames was made
+with the <em class=file>ls</em> command:
+</P>
+<pre>
+> ls exp1/*/obs_seq.final > exp1list
+> cat exp1list
+exp1/step1/obs_seq.final
+exp1/step2/obs_seq.final
+exp1/step3/obs_seq.final
+exp1/step4/obs_seq.final
+> ls exp2/*/obs_seq.final > exp2list
+> cat exp2list
+exp2/step1/obs_seq.final
+exp2/step2/obs_seq.final
+exp2/step3/obs_seq.final
+exp2/step4/obs_seq.final
+> ls exp3/*/obs_seq.final > exp3list
+> cat exp2list
+exp3/step1/obs_seq.final
+exp3/step2/obs_seq.final
+exp3/step3/obs_seq.final
+exp3/step4/obs_seq.final
+</pre>
+<P>Then the namelist entries would be:
+<pre>
+ filename_seq = ''
+ filename_seq_list = 'exp1list', 'exp2list', exp3list'
+ num_to_compare_at_once = 3
+</pre>
+</P>
<!--==================================================================-->
<!--=================== DESCRIPTION OF A NAMELIST ====================-->
@@ -94,10 +149,11 @@
filename_seq = '',
filename_seq_list = '',
filename_out_suffix = '.common' ,
- print_every = 1000,
+ print_every = 10000,
dart_qc_threshold = 3,
calendar = 'Gregorian',
print_only = .false.,
+ eval_and_assim_can_match = .false.,
/
</pre>
</div>
@@ -115,7 +171,7 @@
<TBODY valign=top>
-<TR><TD>num_to_process_at_once</TD>
+<TR><TD>num_to_compare_at_once</TD>
<TD>integer</TD>
<TD>Number of observation sequence files to compare together at a
time. Most commonly the value is 2, but can be any number.
@@ -127,20 +183,20 @@
<TD>character(len=256), dimension(5000)</TD>
<TD>The array of names of the observation sequence files to process.
If more than N files (where N is num_to_compare_at_once) are listed,
-they should be ordered so all files from the first experiment are together,
-followed by all files from the second experiment, up to N experiments.
+they should be ordered so the first N files are compared together,
+followed by the next set of N files, etc.
+You can only specify one of filename_seq OR filename_seq_list, not both.
</TD></TR>
<TR><TD>filename_seq_list</TD>
- <TD>character(len=256)</TD>
+ <TD>character(len=256), dimension(100)</TD>
<TD>An alternative way to specify the list of input
-observation sequence files. Give the name of a text file which contains, one per
-line, the names of the observation sequence files to process. You can only
-specify one of filename_seq OR filename_seq_list, not both. As with the
-<em class=code>filename_seq</em> item,
-if more than N files (where N is num_to_compare_at_once) are listed,
-they should be ordered so all files from the first experiment are together,
-followed by all files from the second experiment, up to N experiments.
+observation sequence files. Give a list of N filenames which contain,
+one per line, the names of the observation sequence files to process.
+There should be N files specified (where N is num_to_compare_at_once), and
+the first observation sequence filename listed in each file will be compared
+together, then the second, until the lists are exhausted.
+You can only specify one of filename_seq OR filename_seq_list, not both.
</TD></TR>
<TR><TD>filename_out_suffix</TD>
@@ -158,15 +214,17 @@
<TR><TD>dart_qc_threshold</TD>
<TD>integer</TD>
- <TD>The limit for the DART QC value. For an observation which was
+ <TD>Observations with a DART QC value larger than this threshold
+will be discarded.
+Note that this is the QC value set by <em class=file>filter</em>
+to indicate the outcome of trying to assimilate an observation.
+This is not related to the incoming data QC.
+For an observation which was
successfully assimilated or evaluated in both the Prior and Posterior
this should be set to 1. To also include observations which were successfully
-processed in the Prior but not the Posterior, set to 3. It is not
-recommended to set it to a higher value than 3.
-Note that this is the QC value set by <em class=file>filter</em>
-to indicate the outcome of
-trying to assimilate an observation.
-This is not related to the incoming data QC.
+processed in the Prior but not the Posterior, set to 3. To ignore
+the magnitude of the DART QC values and keep observations only if
+the DART QCs match, set this to any value higher than 7.
</TD></TR>
<TR><TD>calendar</TD>
@@ -184,6 +242,12 @@
the input and output files.
</TD></TR>
+<TR><TD>eval_and_assim_can_match</TD>
+ <TD>logical</TD>
+ <TD>Normally .FALSE. . If .TRUE. then observations which were
+either successfully evaluated OR assimilated will match and are kept.
+</TD></TR>
+
</TBODY>
</TABLE>
</div>
Modified: DART/trunk/obs_sequence/obs_common_subset.nml
===================================================================
--- DART/trunk/obs_sequence/obs_common_subset.nml 2014-05-21 23:15:14 UTC (rev 7001)
+++ DART/trunk/obs_sequence/obs_common_subset.nml 2014-05-22 16:17:32 UTC (rev 7002)
@@ -4,8 +4,9 @@
filename_seq_list = '',
filename_out_suffix = '.common' ,
calendar = 'Gregorian',
- print_every = 1000,
+ print_every = 10000,
dart_qc_threshold = 3,
print_only = .false.,
+ eval_and_assim_can_match = .false.
/
More information about the Dart-dev
mailing list