[Met_help] [rt.rap.ucar.edu #99817] History for stat_analysis and observation identifiers

John Halley Gotway via RT met_help at ucar.edu
Mon Jul 12 11:29:45 MDT 2021


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Hi,

I'm running stat_analysis on a marine dataset consisting of about 660 
ARGO profiles of temperature and salinity.  Since I have already matched 
the observation with the RTOFS model data for the day for all of the 
profiles, I'm using a python script to read the data into the 
appropriate format and feeding it into stat_analysis.  The stat_analysis 
command currently looks like this:

stat_analysis -lookin python read_godae_matched_pairs_v2.py $DATASET \
   -job aggregate_stat -line_type MPR -out_line_type CNT \
   -by FCST_VAR,OBS_SID,FCST_LEAD \
   -out_stat HYCOM_CNT.stat \
   -log run_stat_analysis.log \
   -tmp_dir /scratch2/NCEPDEV/marine/Todd.Spindler/save/MET/GODAE-MET/tmp \
   -dump_row HYCOM_MPR.stat -v 3

The run log is attached.

I'm trying to compute statistics on each profile, broken out by forecast 
lead time, using all depths for the rmse, bias, that sort of thing.  I 
was hoping that using the -by switch would break out the statistics as I 
need.  However, although stat_analysis does not throw any errors, the 
.stat file it created has no OBS_SID or any location data (OBS_LON, 
OBS_LAT) to allow me to identify the profile line.  I tried adding 
FCST_VAR even though it has only a single entry (temperature).  Or is it 
that putting OBS_SID in the -by line is blocking it from being 
written out?  Is there a way to assign a unique ID to the MPR data 
set and/or the -job section of the command so that stat_analysis will 
retain the identifier?

-- 
Dr. Todd Spindler
IMSG at NOAA/NWS/NCEP/EMC
5830 University Research Ct., #2118
College Park, MD 20740
Todd.Spindler at noaa.gov
301-683-3757



----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: stat_analysis and observation identifiers
From: John Halley Gotway
Time: Thu May 06 21:02:53 2021

Todd,

Sounds like you have your python embedded script set up to serve up MPR
lines to Stat-Analysis. I really wish I could see what your output file
"HYCOM_CNT.stat" actually contains. Does it contain a single CNT output
line, or many output lines, one for each unique combination of input
FCST_VAR,OBS_SID,FCST_LEAD values?

For now, I'll assume it's the latter... that you have many CNT output
lines, but just can't distinguish between them right now. If that's not
the case, please let me know.

Good news, I have a solution for you. Please try adding the "-set_hdr" job
command option, like this:

stat_analysis -lookin python read_godae_matched_pairs_v2.py $DATASET \
   -job aggregate_stat -line_type MPR -out_line_type CNT \
   -by FCST_VAR,OBS_SID,FCST_LEAD \
   -out_stat HYCOM_CNT.stat \
   -set_hdr VX_MASK OBS_SID \
   -log run_stat_analysis.log \
   -tmp_dir /scratch2/NCEPDEV/marine/Todd.Spindler/save/MET/GODAE-MET/tmp \
   -dump_row HYCOM_MPR.stat -v 3

That tells stat_analysis that, when writing output, it should populate the
VX_MASK output column with the current value of OBS_SID. In earlier
versions of MET, you could only use "-set_hdr" to set the output header
columns to a constant string. But the changes for this issue
(https://github.com/dtcenter/MET/issues/1102) enable us to reference the
"-by" column values in the arguments to "-set_hdr". This should result in
the same number of output CNT lines, but the VX_MASK column will now
contain the OBS_SID string for each aggregation. Another option would be
writing it to the DESC output column instead, with "-set_hdr DESC OBS_SID",
but personally, I think VX_MASK makes more sense.
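
For intuition only, here is a rough Python sketch of what the "-by"
grouping plus the "-set_hdr VX_MASK OBS_SID" relabeling amounts to. This
is not MET's code. The function name aggregate_by, the dictionary layout,
and the sample pairs are all made up for illustration, and only a count,
mean error, and RMSE are computed rather than the full set of CNT
statistics. The point is that OBS_SID is a column of the MPR line type but
not of the CNT line type, so the group's OBS_SID has to be copied into a
header column such as VX_MASK to survive into the output.

```python
import math
from collections import defaultdict

def aggregate_by(pairs, by=("FCST_VAR", "OBS_SID", "FCST_LEAD")):
    """Group matched pairs by the 'by' columns and summarize each group.

    Hypothetical helper: mimics the idea of '-by' plus
    '-set_hdr VX_MASK OBS_SID', not the actual stat_analysis logic.
    """
    groups = defaultdict(list)
    for pair in pairs:
        key = tuple(pair[col] for col in by)
        groups[key].append(pair["FCST"] - pair["OBS"])

    lines = []
    for key, errors in sorted(groups.items()):
        key_vals = dict(zip(by, key))
        n = len(errors)
        lines.append({
            "FCST_VAR": key_vals["FCST_VAR"],
            "FCST_LEAD": key_vals["FCST_LEAD"],
            # The output line has no OBS_SID column, so the group's
            # station ID is carried in VX_MASK instead.
            "VX_MASK": key_vals["OBS_SID"],
            "TOTAL": n,
            "ME": sum(errors) / n,                               # mean error (bias)
            "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        })
    return lines

# Made-up matched pairs: two depths for profile P001, one for P002.
pairs = [
    {"FCST_VAR": "temperature", "OBS_SID": "P001", "FCST_LEAD": "240000",
     "FCST": 11.0, "OBS": 10.0},
    {"FCST_VAR": "temperature", "OBS_SID": "P001", "FCST_LEAD": "240000",
     "FCST": 9.0, "OBS": 10.0},
    {"FCST_VAR": "temperature", "OBS_SID": "P002", "FCST_LEAD": "240000",
     "FCST": 5.0, "OBS": 4.0},
]

cnt_lines = aggregate_by(pairs)
for line in cnt_lines:
    print(line["VX_MASK"], line["FCST_LEAD"], line["TOTAL"], line["ME"], line["RMSE"])
```

Each output line corresponds to one unique FCST_VAR,OBS_SID,FCST_LEAD
combination, and the profile can be identified from its VX_MASK value.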

You can search for "set_hdr" on this page to see what documentation exists
for this option:
https://met.readthedocs.io/en/latest/Users_Guide/data_io.html

Hope that does it!

Thanks,
John

On Thu, May 6, 2021 at 4:49 PM Todd Spindler via RT <met_help at ucar.edu>
wrote:

>
> Thu May 06 16:49:18 2021: Request 99817 was acted upon.
> Transaction: Ticket created by todd.spindler at noaa.gov
>        Queue: met_help
>      Subject: stat_analysis and observation identifiers
>        Owner: Nobody
>   Requestors: todd.spindler at noaa.gov
>       Status: new
>  Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=99817 >
>

------------------------------------------------

