[Dart-dev] [4517] DART/trunk/diagnostics/matlab: Add the observation-space rank histogram routine.

nancy at ucar.edu nancy at ucar.edu
Fri Oct 8 16:44:24 MDT 2010


Revision: 4517
Author:   thoar
Date:     2010-10-08 16:44:24 -0600 (Fri, 08 Oct 2010)
Log Message:
-----------
Add the observation-space rank histogram routine.
usage notes are included via the normal matlab 'help'
mechanism.  The observation sequence file must be processed
by obs_diag.f90 in the normal way. If there are ensemble members
present in the obs_seq.final files, the rank histogram is automatically
created. Simply use plot_rank_histogram.m to view the results ...

The other changes at this time are to better determine
the unique variable names. As a consequence, they are now
also listed alphabetically (which was not the intent, but does no harm).

Modified Paths:
--------------
    DART/trunk/diagnostics/matlab/plot_bias_xxx_profile.m
    DART/trunk/diagnostics/matlab/plot_evolution.m
    DART/trunk/diagnostics/matlab/plot_profile.m
    DART/trunk/diagnostics/matlab/plot_rmse_xxx_evolution.m
    DART/trunk/diagnostics/matlab/plot_rmse_xxx_profile.m

Added Paths:
-----------
    DART/trunk/diagnostics/matlab/plot_rank_histogram.m

-------------- next part --------------
Modified: DART/trunk/diagnostics/matlab/plot_bias_xxx_profile.m
===================================================================
--- DART/trunk/diagnostics/matlab/plot_bias_xxx_profile.m	2010-10-08 20:56:37 UTC (rev 4516)
+++ DART/trunk/diagnostics/matlab/plot_bias_xxx_profile.m	2010-10-08 22:44:24 UTC (rev 4517)
@@ -383,10 +383,10 @@
 y       = struct([]);
 ydims   = struct([]);
 
-for j = 1:length(i)
-   fprintf('%2d is %s\n',j,basenames{j})
-    y{j} = basenames{j};
-ydims{j} = basedims{j};
+for k = 1:length(i)
+   fprintf('%2d is %s\n',k,basenames{i(k)})
+    y{k} = basenames{i(k)};
+ydims{k} = basedims{i(k)};
 end
 
 

Modified: DART/trunk/diagnostics/matlab/plot_evolution.m
===================================================================
--- DART/trunk/diagnostics/matlab/plot_evolution.m	2010-10-08 20:56:37 UTC (rev 4516)
+++ DART/trunk/diagnostics/matlab/plot_evolution.m	2010-10-08 22:44:24 UTC (rev 4517)
@@ -429,10 +429,10 @@
 [b,i,j] = unique(basenames);
 y     = cell(length(i),1);
 ydims = cell(length(i),1);
-for j = 1:length(i)
-   disp(sprintf('%2d is %s',j,basenames{j}))
-    y{j} = basenames{j};
-ydims{j} = basedims{j};
+for k = 1:length(i)
+   disp(sprintf('%2d is %s',k,basenames{i(k)}))
+    y{k} = basenames{i(k)};
+ydims{k} = basedims{i(k)};
 end
 
 

Modified: DART/trunk/diagnostics/matlab/plot_profile.m
===================================================================
--- DART/trunk/diagnostics/matlab/plot_profile.m	2010-10-08 20:56:37 UTC (rev 4516)
+++ DART/trunk/diagnostics/matlab/plot_profile.m	2010-10-08 22:44:24 UTC (rev 4517)
@@ -377,10 +377,10 @@
 y       = struct([]);
 ydims   = struct([]);
 
-for j = 1:length(i)
-   fprintf('%2d is %s\n',j,basenames{j})
-    y{j} = basenames{j};
-ydims{j} = basedims{j};
+for k = 1:length(i)
+   fprintf('%2d is %s\n',k,basenames{i(k)})
+    y{k} = basenames{i(k)};
+ydims{k} = basedims{i(k)};
 end
 
 

Added: DART/trunk/diagnostics/matlab/plot_rank_histogram.m
===================================================================
--- DART/trunk/diagnostics/matlab/plot_rank_histogram.m	                        (rev 0)
+++ DART/trunk/diagnostics/matlab/plot_rank_histogram.m	2010-10-08 22:44:24 UTC (rev 4517)
@@ -0,0 +1,403 @@
+function plotdat = plot_rank_histogram(fname, timeindex, varargin)
+%% plot_rank_histogram plots the rank histogram of the observations - segregated by level and variable.
+% Part of the observation-space diagnostics routines.
+%
+% 'obs_diag' produces a netcdf file containing the diagnostics.
+% 'obs_diag' condenses the 'obs_seq.final' information into summaries 
+%            for a few specified regions - on a level-by-level basis.
+%
+% USAGE: plotdat = plot_rank_histogram(fname, timeindex, [,obstypestring]);
+%
+% fname         : netcdf file produced by 'obs_diag'
+% timeindex     : the index of the timestep of interest
+% obstypestring : 'observation type' string. Optional.
+%                 Must match something in the 'ObservationTypes' variable.
+%                 (ncdump -v ObservationTypes obs_diag_output.nc)
+%                 If specified, only this observation type will be plotted.
+%                 If not specified, ALL observation types incluced in the 
+%                 netCDF file will be plotted.
+%
+% OUTPUT: two files will result for each observation type plotted. One is a 
+%         postscript file containing a page for each level - all regions.
+%         The other file is a simple text file containing summary information
+%         about how many obs were assimilated, how many were available, etc.
+%         Both filenames contain the observation type as part of the name.
+%
+% EXAMPLE 1 - plot the rank histogram for ALL observation types, ALL levels
+%
+% fname     = 'obs_diag_output.nc'; % netcdf file produced by 'obs_diag'
+% timeindex = 1;                    % plot the histogram for the first timestep 
+% plotdat   = plot_rank_histogram(fname, timeindex);
+%
+%
+% EXAMPLE 2 - rank histogram for the radiosonde temperature observations.
+%             This requires knowledge that 'RADIOSONDE_TEMPERATURE' is an
+%             observation type in the netCDF file.
+%
+% fname     = 'obs_diag_output.nc'; % netcdf file produced by 'obs_diag'
+% timeindex = 3;                    % plot the histogram for the third timestep 
+% plotdat   = plot_rank_histogram(fname, timeindex, 'RADIOSONDE_TEMPERATURE');
+
+%% DART software - Copyright � 2004 - 2010 UCAR. This open source software is
+% provided by UCAR, "as is", without charge, subject to all terms of use at
+% http://www.image.ucar.edu/DAReS/DART/DART_download
+%
+% <next few lines under version control, do not edit>
+% $URL: $
+% $Id: $
+% $Revision: $
+% $Date: $
+
+if nargin == 2
+   nvars = 0;
+elseif nargin == 3
+   varname = varargin{1};
+   nvars = 1;
+else
+   error('wrong number of arguments ... ')
+end
+
+% Make sure the file exists.
+
+if (exist(fname,'file') ~= 2)
+   error('file/fname <%s> does not exist',fname)
+end
+
+% Make sure the variable exists.
+
+if (nvars > 0)
+   fullvarname = BuildFullVarname(varname);
+   if ( ~ nc_isvar(fname,fullvarname) )
+      error('There is no %s variable in %s', fullvarname, fname)
+   end
+end
+
+% Harvest plotting info/metadata from netcdf file.
+
+plotdat.fname         = fname;
+plotdat.timeindex     = timeindex;
+plotdat.timecenters   = nc_varget(fname,'time');
+plotdat.timeedges     = nc_varget(fname,'time_bounds');
+plotdat.mlevel        = nc_varget(fname,'mlevel');
+plotdat.plevel        = nc_varget(fname,'plevel');
+plotdat.plevel_edges  = nc_varget(fname,'plevel_edges');
+plotdat.hlevel        = nc_varget(fname,'hlevel');
+plotdat.hlevel_edges  = nc_varget(fname,'hlevel_edges');
+plotdat.Nrhbins       = length(nc_varget(fname,'rank_bins'));
+plotdat.ncopies       = length(nc_varget(fname,'copy'));
+plotdat.nregions      = length(nc_varget(fname,'region'));
+plotdat.region_names  = strtrim(nc_varget(fname,'region_names'));
+
+plotdat.timeseparation     = nc_attget(fname, nc_global, 'bin_separation');
+plotdat.timewidth          = nc_attget(fname, nc_global, 'bin_width');
+time_to_skip               = nc_attget(fname, nc_global, 'time_to_skip');
+plotdat.rat_cri            = nc_attget(fname, nc_global, 'rat_cri');
+plotdat.input_qc_threshold = nc_attget(fname, nc_global, 'input_qc_threshold');
+plotdat.lonlim1            = nc_attget(fname, nc_global, 'lonlim1');
+plotdat.lonlim2            = nc_attget(fname, nc_global, 'lonlim2');
+plotdat.latlim1            = nc_attget(fname, nc_global, 'latlim1');
+plotdat.latlim2            = nc_attget(fname, nc_global, 'latlim2');
+plotdat.biasconv           = nc_attget(fname, nc_global, 'bias_convention');
+
+% Make sure the time index makes sense.
+
+if (timeindex > length(plotdat.timecenters))
+   error('There are only %d time indices in the file, you asked for # %d', ...
+         length(plotdat.timecenters), timeindex)
+end
+
+% Matlab does a crazy thing with the strings for 1 region
+
+if (plotdat.nregions == 1 && (size(plotdat.region_names,2) == 1) )
+   plotdat.region_names = deblank(plotdat.region_names');
+end
+
+% Coordinate between time types and dates
+
+calendar     = nc_attget(fname,'time','calendar');
+timeunits    = nc_attget(fname,'time','units');
+timebase     = sscanf(timeunits,'%*s%*s%d%*c%d%*c%d'); % YYYY MM DD
+timeorigin   = datenum(timebase(1),timebase(2),timebase(3));
+skip_seconds = time_to_skip(4)*3600 + time_to_skip(5)*60 + time_to_skip(6);
+iskip        = time_to_skip(3) + skip_seconds/86400;
+
+plotdat.timecenters = plotdat.timecenters + timeorigin;
+plotdat.timeedges   = plotdat.timeedges   + timeorigin;
+plotdat.Ntimes      = length(plotdat.timecenters);
+plotdat.toff        = plotdat.timecenters(1) + iskip;
+
+plotdat.timespan    = sprintf('%s -- %s', datestr(plotdat.timeedges(timeindex,1),21), ...
+                                       datestr(plotdat.timeedges(timeindex,2),21));
+
+% set up a structure with all static plotting components
+
+plotdat.linewidth = 2.0;
+
+if (nvars == 0)
+   [plotdat.allvarnames, plotdat.allvardims] = get_varsNdims(fname);
+   [plotdat.varnames,    plotdat.vardims]    = FindTemporalVars(plotdat);
+   plotdat.nvars       = length(plotdat.varnames);
+else
+   plotdat.varnames{1} = varname;
+   plotdat.nvars       = nvars;
+end
+
+plotdat.Npossindex  = get_copy_index(fname,'Nposs');
+plotdat.Nusedindex  = get_copy_index(fname,'Nused');
+plotdat.NQC4index   = get_copy_index(fname,'N_DARTqc_4');
+plotdat.NQC5index   = get_copy_index(fname,'N_DARTqc_5');
+plotdat.NQC6index   = get_copy_index(fname,'N_DARTqc_6');
+plotdat.NQC7index   = get_copy_index(fname,'N_DARTqc_7');
+
+%----------------------------------------------------------------------
+% Loop around (time-copy-level-region) observation types
+%----------------------------------------------------------------------
+
+for ivar = 1:plotdat.nvars
+    
+   % create the variable names of interest.
+   % netCDF can only support variable names less than 41 chars.
+    
+   plotdat.myvarname = plotdat.varnames{ivar};  
+   plotdat.guessvar  = sprintf('%s_guess',plotdat.varnames{ivar});
+
+   plotdat.rhistvar = BuildFullVarname(plotdat.varnames{ivar});
+
+   % remove any existing postscript file - will simply append each
+   % level as another 'page' in the .ps file.
+   
+   psfname = sprintf('%s_rank_hist.ps',plotdat.varnames{ivar});
+   fprintf('Removing %s from the current directory.\n',psfname)
+   system(sprintf('rm %s',psfname));
+
+   % remove any existing log file - 
+   
+   lgfname = sprintf('%s_rank_hist_obscount.txt',plotdat.varnames{ivar});
+   fprintf('Removing %s from the current directory.\n',lgfname)
+   system(sprintf('rm %s',lgfname));
+   logfid = fopen(lgfname,'wt');
+   fprintf(logfid,'%s\n',lgfname);
+
+   % get appropriate vertical coordinate variable
+
+   guessdims = nc_var_dims(fname, plotdat.guessvar);
+   rhistdims = nc_var_dims(fname, plotdat.rhistvar);
+
+   if ( findstr('surface',guessdims{3}) > 0 )
+      plotdat.level       = 1;
+      plotdat.level_units = 'surface';
+   elseif ( findstr('undef',guessdims{3}) > 0 )
+      plotdat.level       = 1;
+      plotdat.level_units = 'undefined';
+   else
+      plotdat.level       = nc_varget(fname, guessdims{3});
+      plotdat.level_units = nc_attget(fname, guessdims{3}, 'units');
+   end
+   plotdat.nlevels = length(plotdat.level);
+
+   % Here is the tricky part. Singleton dimensions are auto-squeezed ... 
+   % single levels, single regions ...
+
+   guess_raw = nc_varget(fname, plotdat.guessvar);  
+   guess = reshape(guess_raw, plotdat.Ntimes,  plotdat.ncopies, ...
+                              plotdat.nlevels, plotdat.nregions);
+
+   rhist_raw = nc_varget(fname, plotdat.rhistvar); 
+   rhist = reshape(rhist_raw, plotdat.Ntimes,  plotdat.Nrhbins, ...
+                              plotdat.nlevels, plotdat.nregions);
+
+   % check to see if there is anything to plot
+   nposs = sum(guess(plotdat.timeindex,plotdat.Npossindex,:,:));
+
+   if ( sum(nposs(:)) < 1 )
+      fprintf('%s no obs for %s...  skipping\n', plotdat.varnames{ivar})
+      continue
+   end
+   
+   for ilevel = 1:plotdat.nlevels
+
+      fprintf(logfid,'\nlevel %d %f %s\n',ilevel,plotdat.level(ilevel),plotdat.level_units);
+      
+      plotdat.ges_Nqc4  = squeeze(guess(plotdat.timeindex,plotdat.NQC4index  ,ilevel,:));
+      fprintf(logfid,'DART QC == 4, prior %d\n',sum(plotdat.ges_Nqc4(:)));
+
+      plotdat.ges_Nqc5  = squeeze(guess(plotdat.timeindex,plotdat.NQC5index  ,ilevel,:));
+      fprintf(logfid,'DART QC == 5, prior %d\n',sum(plotdat.ges_Nqc5(:)));
+
+      plotdat.ges_Nqc6  = squeeze(guess(plotdat.timeindex,plotdat.NQC6index  ,ilevel,:));
+      fprintf(logfid,'DART QC == 6, prior %d\n',sum(plotdat.ges_Nqc6(:)));
+
+      plotdat.ges_Nqc7  = squeeze(guess(plotdat.timeindex,plotdat.NQC7index  ,ilevel,:));
+      fprintf(logfid,'DART QC == 7, prior %d\n',sum(plotdat.ges_Nqc7(:)));
+
+      plotdat.ges_Nposs = squeeze(guess(plotdat.timeindex,plotdat.Npossindex, ilevel,:));
+      fprintf(logfid,'# obs poss,   prior %d\n',sum(plotdat.ges_Nposs(:)));
+
+      plotdat.ges_Nused = squeeze(guess(plotdat.timeindex,plotdat.Nusedindex, ilevel,:));
+      fprintf(logfid,'# obs used,   prior %d\n',sum(plotdat.ges_Nused(:)));
+
+      % plot by region
+
+      if (plotdat.nregions > 1)
+         clf; orient tall
+      else 
+         clf; orient landscape
+      end
+
+      for iregion = 1:plotdat.nregions
+
+         plotdat.region   = iregion;  
+         plotdat.myregion = deblank(plotdat.region_names(iregion,:));
+         plotdat.title    = sprintf('%s @ %d %s',    ...
+                              plotdat.myvarname,     ...
+                              plotdat.level(ilevel), ...
+                              plotdat.level_units);
+                          
+         plotdat.rank_hist = squeeze(rhist(plotdat.timeindex, :, ilevel,iregion));
+         
+         myplot(plotdat);
+      end
+
+      % create a postscript file
+      print(gcf,'-dpsc','-append',psfname);
+
+      % block to go slow and look at each one ...
+        disp('Pausing, hit any key to continue ...')
+        pause
+
+   end
+end
+
+%----------------------------------------------------------------------
+% 'Helper' functions
+%----------------------------------------------------------------------
+
+function myplot(plotdat)
+
+   % 6	 Rejected because of incoming data qc.
+   % 5	 Not used because of namelist control.
+
+   nobs_poss = plotdat.ges_Nposs(plotdat.region) - ...
+               plotdat.ges_Nqc5( plotdat.region) - ...
+               plotdat.ges_Nqc6( plotdat.region);
+
+   % nobs_used = plotdat.ges_Nused(plotdat.region);
+   nobs_used = sum(plotdat.rank_hist(:));
+   obsstring = sprintf('%d obs possible, %d obs binned',nobs_poss, nobs_used);
+
+   % Determine some quantities for the legend
+   % Plot 
+   
+   subplot(plotdat.nregions,1,plotdat.region);
+   bar(plotdat.rank_hist);
+
+   axlims = axis;
+   axlims = [0 plotdat.Nrhbins+1 0 axlims(4)];
+   axis(axlims)
+
+   h = text(plotdat.Nrhbins/2, 0.9*axlims(4),plotdat.myregion);
+   set(h,'FontSize',14,'FontWeight','Bold')
+
+   xlabel({'Observation Rank (among ensemble members)',obsstring})
+   ylabel('count')
+
+   % more annotation ...
+
+   if (plotdat.region == 1)
+      title({plotdat.title, plotdat.timespan}, ...
+         'Interpreter', 'none', 'Fontsize', 12, 'FontWeight', 'bold')
+      BottomAnnotation(plotdat.fname)
+   else
+      title(plotdat.timespan, 'Interpreter', 'none', ...
+         'Fontsize', 12, 'FontWeight', 'bold')
+   end
+   
+ 
+
+
+function BottomAnnotation(main)
+% annotates the directory containing the data being plotted
+subplot('position',[0.48 0.01 0.04 0.04])
+axis off
+fullname = which(main);   % Could be in MatlabPath
+if( isempty(fullname) )
+   if ( main(1) == '/' )  % must be a absolute pathname
+      string1 = sprintf('data file: %s',main);
+   else                   % must be a relative pathname
+      mydir = pwd;
+      string1 = sprintf('data file: %s/%s',mydir,main);
+   end
+else
+   string1 = sprintf('data file: %s',fullname);
+end
+
+h = text(0.0, 0.5, string1);
+set(h,'HorizontalAlignment','center', ...
+      'VerticalAlignment','middle',...
+      'Interpreter','none',...
+      'FontSize',10)
+
+
+
+function [y,ydims] = FindTemporalVars(x)
+% Returns UNIQUE (i.e. base) temporal variable names
+if ( ~(isfield(x,'allvarnames') && isfield(x,'allvardims')))
+   error('Doh! no ''allvarnames'' and ''allvardims'' components')
+end
+
+j = 0;
+
+for i = 1:length(x.allvarnames)
+   indx = findstr('time',x.allvardims{i});
+   if (indx > 0) 
+      j = j + 1;
+
+      basenames{j} = ReturnBase(x.allvarnames{i});
+      basedims{j}  = x.allvardims{i};
+   end
+end
+
+[b,i,j] = unique(basenames);
+y     = cell(length(i),1);
+ydims = cell(length(i),1);
+for k = 1:length(i)
+   fprintf('%2d is %s\n',k,basenames{i(k)})
+    y{k} = basenames{i(k)};
+ydims{k} = basedims{i(k)};
+end
+
+
+
+function s = ReturnBase(string1)
+inds = findstr('_guess',string1);
+if (inds > 0 )
+   s = string1(1:inds-1);
+end
+
+inds = findstr('_analy',string1);
+if (inds > 0 )
+   s = string1(1:inds-1);
+end
+
+inds = findstr('_VPguess',string1);
+if (inds > 0 )
+   s = string1(1:inds-1);
+end
+
+inds = findstr('_VPanaly',string1);
+if (inds > 0 )
+   s = string1(1:inds-1);
+end
+
+
+
+function s = BuildFullVarname(varname)
+% netCDF restricts the length of a variable name to
+% be 40 characters. 
+
+rankvar = sprintf('%s_guess_RankHist',varname);
+if (length(rankvar) > 40 ) 
+   s = rankvar(1:40);
+else
+   s = rankvar;
+end

Modified: DART/trunk/diagnostics/matlab/plot_rmse_xxx_evolution.m
===================================================================
--- DART/trunk/diagnostics/matlab/plot_rmse_xxx_evolution.m	2010-10-08 20:56:37 UTC (rev 4516)
+++ DART/trunk/diagnostics/matlab/plot_rmse_xxx_evolution.m	2010-10-08 22:44:24 UTC (rev 4517)
@@ -441,10 +441,10 @@
 [b,i,j] = unique(basenames);
 y     = cell(length(i),1);
 ydims = cell(length(i),1);
-for j = 1:length(i)
-   disp(sprintf('%2d is %s',j,basenames{j}))
-    y{j} = basenames{j};
-ydims{j} = basedims{j};
+for k = 1:length(i)
+   disp(sprintf('%2d is %s',k,basenames{i(k)}))
+    y{k} = basenames{i(k)};
+ydims{k} = basedims{i(k)};
 end
 
 

Modified: DART/trunk/diagnostics/matlab/plot_rmse_xxx_profile.m
===================================================================
--- DART/trunk/diagnostics/matlab/plot_rmse_xxx_profile.m	2010-10-08 20:56:37 UTC (rev 4516)
+++ DART/trunk/diagnostics/matlab/plot_rmse_xxx_profile.m	2010-10-08 22:44:24 UTC (rev 4517)
@@ -381,10 +381,10 @@
 y       = struct([]);
 ydims   = struct([]);
 
-for j = 1:length(i)
-   fprintf('%2d is %s\n',j,basenames{j})
-    y{j} = basenames{j};
-ydims{j} = basedims{j};
+for k = 1:length(i)
+   fprintf('%2d is %s\n',k,basenames{i(k)})
+    y{k} = basenames{i(k)};
+ydims{k} = basedims{i(k)};
 end
 
 


More information about the Dart-dev mailing list