[Met_help] [rt.rap.ucar.edu #99408] History for Forecast point data

John Halley Gotway via RT met_help at ucar.edu
Mon Jul 12 11:25:20 MDT 2021


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Hi Todd,

I am moving this over to the MET Help ticketing system so I can get input from other team members.

Thanks,
George

---------

Hi George,

One last question.  The 11-column format

# Read and format the input 11-column observations:
# (1) string: Message_Type
# (2) string: Station_ID
# (3) string: Valid_Time(YYYYMMDD_HHMMSS)
# (4) numeric: Lat(Deg North)
# (5) numeric: Lon(Deg East)
# (6) numeric: Elevation(msl)
# (7) string: Var_Name(or GRIB_Code)
# (8) numeric: Level
# (9) numeric: Height(msl or agl)
# (10) string: QC_String
# (11) numeric: Observation_Value

works for loading obs data from ascii files, but is there a way to load model forecast data similarly?  The GODAE data sets are all matched pair data of a sort, with model forecasts and obs, complete with climo data included, everything interpolated to the obs locations and the same vertical depths, but missing the stats data computed in point_stat (the usual precursor to create MPR).  From reading the docs, there's a way to read in MPR ascii data, but that's data that's already passed through point_stat, and this is data that needs to be fed into point_stat.  Any ideas on how to proceed?
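
For readers unfamiliar with the layout, here is a minimal Python sketch of one record in the 11-column order listed above. The variable name `point_data` is what MET's ascii2nc Python embedding is understood to look for; treat that, the helper `make_obs`, and the sample values as assumptions for illustration only.

```python
# Sketch only: build 11-column observation records as a list of lists.
# `point_data` is the assumed name read by ascii2nc Python embedding.

def make_obs(msg_type, sid, valid, lat, lon, elv, var, lvl, hgt, qc, value):
    """Return one observation in the 11-column order (hypothetical helper)."""
    return [
        str(msg_type),   # (1)  Message_Type
        str(sid),        # (2)  Station_ID
        str(valid),      # (3)  Valid_Time as YYYYMMDD_HHMMSS
        float(lat),      # (4)  Lat (deg north)
        float(lon),      # (5)  Lon (deg east)
        float(elv),      # (6)  Elevation (msl)
        str(var),        # (7)  Var_Name or GRIB code
        float(lvl),      # (8)  Level
        float(hgt),      # (9)  Height (msl or agl)
        str(qc),         # (10) QC_String
        float(value),    # (11) Observation_Value
    ]

# Hypothetical sample values, not taken from the GODAE file.
point_data = [
    make_obs("PROFILE", "STN001", "20210309_000000",
             35.0, -70.0, 0.0, "TEMP", 5.0, 5.0, "NA", 18.2),
    make_obs("PROFILE", "STN001", "20210309_000000",
             35.0, -70.0, 0.0, "TEMP", 10.0, 10.0, "NA", 17.9),
]
```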

-- 
Dr. Todd Spindler
IMSG at NOAA/NWS/NCEP/EMC
5830 University Research Ct., #2118
College Park, MD 20740
Todd.Spindler at noaa.gov
301-683-3757

----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Forecast point data
From: John Halley Gotway
Time: Mon Apr 19 11:25:33 2021

Hi Todd,

I'm on the MET-Help desk on Mondays and was looking back through
existing tickets. I see this one that George created last week based
on your question.

I see you're asking about providing point forecasts to the Point-Stat
tool. Unfortunately, that does NOT work. Point-Stat processes gridded
forecast data and interpolates those values to the point observation
locations.

There are no tools currently in MET that help in creating matched
pairs using point forecasts and point observations. But if you already
have the fcst/obs data paired up, you can pass it to the Stat-Analysis
tool and derive statistics from it.

You'd do that using Python embedding in Stat-Analysis. Really, all
we're doing is passing the paired data to Stat-Analysis as if it were
the MPR output line type from Point-Stat. If you need help with the
specifics, you could send us some sample paired data and I could take
a look at the formatting to get it ingested into Stat-Analysis.
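
As a rough sketch of what "passing paired data as if it were MPR lines" could look like in Python: the column order below is recalled from the MET 10.x MPR line type with the leading VERSION column omitted, and the variable name `mpr_data` is the name Stat-Analysis Python embedding is understood to read. Both are assumptions to verify against the User's Guide for your MET version; `make_mpr_row` and the sample values are hypothetical.

```python
# Assumed MPR column order (MET 10.x era, VERSION column omitted).
HEADER_COLUMNS = [
    "MODEL", "DESC", "FCST_LEAD", "FCST_VALID_BEG", "FCST_VALID_END",
    "OBS_LEAD", "OBS_VALID_BEG", "OBS_VALID_END", "FCST_VAR", "FCST_UNITS",
    "FCST_LEV", "OBS_VAR", "OBS_UNITS", "OBS_LEV", "OBTYPE", "VX_MASK",
    "INTERP_MTHD", "INTERP_PNTS", "FCST_THRESH", "OBS_THRESH", "COV_THRESH",
    "ALPHA", "LINE_TYPE",
]
MPR_COLUMNS = [
    "TOTAL", "INDEX", "OBS_SID", "OBS_LAT", "OBS_LON", "OBS_LVL", "OBS_ELV",
    "FCST", "OBS", "OBS_QC", "CLIMO_MEAN", "CLIMO_STDEV", "CLIMO_CDF",
]
COLUMNS = HEADER_COLUMNS + MPR_COLUMNS


def make_mpr_row(**values):
    """Build one row of strings; columns not supplied are filled with "NA"."""
    unknown = set(values) - set(COLUMNS)
    if unknown:
        raise KeyError(f"unexpected column(s): {sorted(unknown)}")
    row = {name: "NA" for name in COLUMNS}
    row["LINE_TYPE"] = "MPR"
    # Everything is written as a string, which is the safest formatting.
    row.update({name: str(value) for name, value in values.items()})
    return [row[name] for name in COLUMNS]


# Hypothetical forecast/observation pair, for illustration only.
mpr_data = [
    make_mpr_row(MODEL="HYCOM", FCST_LEAD="240000",
                 FCST_VAR="TEMP", OBS_VAR="TEMP", OBTYPE="PROFILE",
                 OBS_SID="STN001", OBS_LAT=35.0, OBS_LON=-70.0,
                 FCST=18.5, OBS=18.2),
]
```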

Am I answering your question? Or do you have others?

Thanks,
John Halley Gotway

------------------------------------------------
Subject: Forecast point data
From: Todd Spindler
Time: Tue Apr 20 11:13:06 2021

Hi John,

Thanks for looking at this.  I have a sample data set up on our ftp
server:

ftp://polar.ncep.noaa.gov/rtofs/for_DTC/class4_20210309_HYCOM_RTOFS_2.0_profile.nc

My reader script pulls out obs data from the RTOFS GODAE profile
dataset, which has an unusual format.  The details on the file format
are given in the intercomparisonproposalv25.pdf (attached).

I can use the reader as it's currently configured to read observations
at ground level, but I don't know how to set up the Pandas DataFrame
for forecasts.

Here's what I've used so far:

 > ascii2nc -format python "read_godae_point_notab.py class4_20210309_HYCOM_RTOFS_2.0_profile.nc" sample_godae_obs.nc

 > plot_point_obs sample_godae_obs.nc sample_godae_obs.ps

Any pointers on how to extend the reader to handle forecasts from the
dataset (it has obs and forecasts) would be appreciated.

--
Dr. Todd Spindler
IMSG at NOAA/NWS/NCEP/EMC
5830 University Research Ct., #2118
College Park, MD 20740
Todd.Spindler at noaa.gov
301-683-3757



------------------------------------------------
Subject: Forecast point data
From: John Halley Gotway
Time: Wed Apr 21 11:44:59 2021

Todd,

Thanks for sending this sample data and python script. I adapted your
python script to serve up matched pair data to Stat-Analysis instead
of observation data to ascii2nc. Here's a call to Stat-Analysis for
this data:

stat_analysis \
  -lookin python read_godae_matched_pairs.py class4_20210309_HYCOM_RTOFS_2.0_profile.nc \
  -job aggregate_stat -line_type MPR -out_line_type CNT \
  -by FCST_VAR,OBTYPE,FCST_LEAD \
  -out_stat HYCOM_CNT.stat \
  -log run_stat_analysis.log \
  -dump_row HYCOM_MPR.stat \
  -v 3

This command is doing the following:
- Run the python script indicated by the -lookin option to read input
data into memory.
- Process input MPR lines and convert them to continuous statistic
output lines (CNT).
- Do this for each unique combination of forecast variable (FCST_VAR),
message type (OBTYPE), and lead time (FCST_LEAD) found in the input.
- Write the resulting stats to an output file (-out_stat
HYCOM_CNT.stat)
- Write a log file named run_stat_analysis.log (-log
run_stat_analysis.log)
- Write the input MPR lines read via python embedding (-dump_row
HYCOM_MPR.stat)
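
To make the -by grouping concrete, here is a small standard-library sketch that groups matched pairs by the same three columns and computes a few continuous statistics per group. It is not how Stat-Analysis is implemented, and a real CNT line holds many more statistics than these; the function names and sample rows are hypothetical.

```python
import math
from collections import defaultdict


def continuous_stats(pairs):
    """Mean error, mean absolute error and RMSE for (fcst, obs) pairs."""
    n = len(pairs)
    errors = [fcst - obs for fcst, obs in pairs]
    return {
        "TOTAL": n,
        "ME": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
    }


def aggregate_by(rows, keys=("FCST_VAR", "OBTYPE", "FCST_LEAD")):
    """Group rows on the key columns, then score each group separately."""
    groups = defaultdict(list)
    for row in rows:
        case = tuple(row[k] for k in keys)
        groups[case].append((float(row["FCST"]), float(row["OBS"])))
    return {case: continuous_stats(pairs) for case, pairs in groups.items()}


# Hypothetical matched pairs, stored as strings like the MPR columns.
rows = [
    {"FCST_VAR": "TEMP", "OBTYPE": "PROFILE", "FCST_LEAD": "240000",
     "FCST": "2.0", "OBS": "1.0"},
    {"FCST_VAR": "TEMP", "OBTYPE": "PROFILE", "FCST_LEAD": "240000",
     "FCST": "4.0", "OBS": "5.0"},
    {"FCST_VAR": "SALT", "OBTYPE": "PROFILE", "FCST_LEAD": "240000",
     "FCST": "35.0", "OBS": "35.0"},
]
stats = aggregate_by(rows)
```

Each key of `stats` corresponds to one output line in the grouped job, so the number of keys plays the role of the CNT line count.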

This results in 108 output CNT lines. Please find the python script and
output from this job attached.

And note...
- I'm not very good at python and am guessing there are ways to speed
up this script considerably.
- The input file format for MET's python embedding is pretty darn
particular. Seems that formatting everything as a string works well.
- I have the python script print 3 lines to the screen as a sanity
check to make sure there aren't any surprises.
- I hard-coded "HYCOM" since I couldn't find that in NetCDF metadata
anywhere.

This Stat-Analysis job is just one example. You could select many other
output line types from Stat-Analysis. Here's a link to the docs for
that job type:
https://met.readthedocs.io/en/latest/Users_Guide/stat-analysis.html#job-aggregate-stat

I'm not really sure why Stat-Analysis used only 3432 of the 3960 input
lines:
   DEBUG 2: Job 1 used 3432 out of 3960 STAT lines.

Perhaps some of the lines have bad data values in the FCST or OBS
columns?
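
One way to test that guess before handing rows to Stat-Analysis is to count how many pairs have usable values. The sketch below treats NaN, non-numeric strings, and -9999 (MET's usual bad data value) as unusable. Whether this accounts for the skipped lines in this case is unconfirmed, and the function names and sample rows are hypothetical.

```python
import math

BAD_DATA = -9999.0  # MET's conventional missing-data value


def is_bad(text):
    """True if the string is not a usable number."""
    try:
        value = float(text)
    except ValueError:
        return True  # e.g. "NA", "--" or an empty string
    return math.isnan(value) or value == BAD_DATA


def count_usable(rows, fcst_col, obs_col):
    """Count rows whose forecast and observation are both usable."""
    return sum(
        1 for row in rows
        if not is_bad(row[fcst_col]) and not is_bad(row[obs_col])
    )


# Hypothetical (fcst, obs) string pairs for illustration.
sample = [
    ["1.0", "2.0"],
    ["nan", "2.0"],
    ["-9999", "1.0"],
    ["NA", "3.0"],
    ["3.5", "3.0"],
]
usable = count_usable(sample, fcst_col=0, obs_col=1)
```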

Thanks,
John


------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #99408] Forecast point data
From: Todd Spindler
Time: Thu Apr 22 11:30:40 2021

Thanks for that, John.  I have another, unrelated question.

I've been playing with setting up RTOFS gridded data and GHRSST OSPO
satellite SST obs data to process through grid_stat and stat_analysis.
I've been regridding the RTOFS with CDO to a 1/12 deg cylindrical
projection to allow grid_stat to ingest it, which seems to work ok.
The problem I'm getting, though, is in picking up a climatology for
the SST analysis. I have a copy of the World Ocean Atlas 2018
climatology data set of salinity and temperature, but when I try to
read it in a met/grid_stat run it balks at getting a time stamp from
the climo file. I'm not exactly sure what sort of time stamp ought to
be there (and I've had issues with WOA2018 before), but in python it's
easier to bypass than in met.

Here's the setup I'm using on Hera:

configs and scripts are here:
/scratch2/NCEPDEV/marine/Todd.Spindler/save/MET/RTOFS_GRID/Test3

initial setup using "source METsetup.linux.sh"

test run with ./run_test.sh

RTOFS data is in
/scratch2/NCEPDEV/marine/Todd.Spindler/save/MET/RTOFS_GRID/standard
OBS data is in
/scratch2/NCEPDEV/marine/Todd.Spindler/noscrub/GHRSST/GHRSST-OSPO-L4-GLOB_20210322.nc
Climo data is in
/scratch2/NCEPDEV/marine/Todd.Spindler/noscrub/Global/climo/WOA2018/woa18_decav_t00_04.nc

The test uses the file GridStatConfig in the configs and scripts
directory.  I tried setting up the climo_mean and climo_stddev
dictionaries, and met threw a date exception:

DEBUG 1: Default Config File: /contrib/met/10.0.0-beta3/share/met/config/GridStatConfig_default
DEBUG 1: User Config File: /scratch2/NCEPDEV/marine/Todd.Spindler/save/MET/RTOFS_GRID/Test3/GridStatConfig
DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_NcCF".
DEBUG 4: NcCfFile::open() -> parsing units for the time variable "days since 1900-12-31 00:00:00"
DEBUG 4: parse_cf_time_string() -> parsed NetCDF CF convention time unit string "days since 1900-12-31 00:00:00"
DEBUG 4:         as a reference time of 19001231_000000 and 86400 second(s) per time step.
DEBUG 4: NcCfFile::open() -> could not extract init time from the "forecast_reference_time" variable.
DEBUG 4: NcCfFile::open() -> could not extract init time from file name.
DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_NcCF".
DEBUG 4: NcCfFile::open() -> parsing units for the time variable "seconds since 1981-01-01 00:00:00"
DEBUG 4: parse_cf_time_string() -> parsed NetCDF CF convention time unit string "seconds since 1981-01-01 00:00:00"
DEBUG 4:         as a reference time of 19810101_000000 and 1 second(s) per time step.
DEBUG 4: get_nc_data(NcVar *, double *) add_offset = 0, scale_factor=1, cell_count=1, is_unsigned_value: 0
DEBUG 4: apply_scale_factor(double) unpacked data: count=0 out of 1. FillValue(int)=-9999 data range [1.26926e+09 - 1.26926e+09] raw data: [1269259200 - 1269259200] Positive count: 1
DEBUG 4: NcCfFile::open() -> could not extract init time from the "forecast_reference_time" variable.
DEBUG 4: NcCfFile::open() -> could not extract init time from file name.
ERROR  :
ERROR  : DictionaryEntry::dict_value() -> bad type
ERROR  :

Any suggestions?
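
For reference, the DEBUG lines above show a CF units string being split into a reference time and a number of seconds per time step. A minimal Python sketch of that idea follows. It is not MET's parser; the function names and the table of supported units are assumptions, and any unit missing from the table (months, for example) simply yields None.

```python
from datetime import datetime, timedelta

# Assumed set of fixed-length units; calendar units such as months
# have no fixed length in seconds and are left out on purpose.
SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def parse_cf_time_units(units):
    """Split "<unit> since <YYYY-MM-DD HH:MM:SS>" into (ref_time, seconds)."""
    unit, _, reference = units.partition(" since ")
    ref_time = datetime.strptime(reference.strip(), "%Y-%m-%d %H:%M:%S")
    return ref_time, SECONDS_PER_UNIT.get(unit.strip().lower())


def decode_time(value, units):
    """Convert a raw time value to a YYYYMMDD_HHMMSS string, or None."""
    ref_time, seconds = parse_cf_time_units(units)
    if seconds is None:
        return None  # unsupported time step
    return (ref_time + timedelta(seconds=value * seconds)).strftime("%Y%m%d_%H%M%S")
```

Applied to the raw value 1269259200 and the units "seconds since 1981-01-01 00:00:00" from the log, `decode_time` gives 20210322_120000.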

--
Dr. Todd Spindler
IMSG at NOAA/NWS/NCEP/EMC
5830 University Research Ct., #2118
College Park, MD 20740
Todd.Spindler at noaa.gov
301-683-3757





------------------------------------------------
Subject: Forecast point data
From: John Halley Gotway
Time: Thu Apr 22 11:52:09 2021

Whenever you're getting going with a new gridded dataset in MET,
whether it be a gridded forecast file, gridded analysis file, or
gridded climo file, I always recommend starting by running the
plot_data_plane tool. That's a great way to figure out if MET can read
it.

If plot_data_plane works fine, I suspect it'll be fine in the other
tools, like Grid-Stat, as well. If it does not, consider using python
embedding to serve up the data instead.

As for the precise source of the error message you're seeing:
   ERROR  : DictionaryEntry::dict_value() -> bad type

I'd really have to dive into the details. We're scrambling right now
trying to get the next development version out for the next release.
So I don't have the cycles right now to take a look.

So I'd start with:
(1) Plot the climo data with plot_data_plane
(2) That should error out with the same error message
(3) Send us the exact command you ran to produce the error
(4) Post the sample data file required to demonstrate the problem to
our anonymous ftp site
(http://dtcenter.org/community-code/model-evaluation-tools-met/met-help-desk#ftp)
(5) I'll reassign this ticket to another developer, Howard Soh, and he
can take a closer look.

I suspect one of the following outcomes:
- Your config string is misconfigured and it's an easy fix.
- Howard will find a bug in MET, write up a GitHub issue, and fix it.
- Howard will find that there is a problem with the NetCDF file and
the fix should be made to the data rather than MET.
- The data is a NetCDF format not currently supported and the only way
to read it is via python-embedding.

Thanks,
John


------------------------------------------------
Subject: Forecast point data
From: John Halley Gotway
Time: Thu Apr 22 14:22:03 2021

Todd,

Oh great, I'm glad that worked. I logged on to hera to take a closer
look, and I think I have it sorted out. I copied over your setup and
am working in:
   /scratch1/BMC/dtc/John.H.Gotway/MET/MET_Help/spindler_data_20210422

Here are some relevant changes in GridStatConfig:
(1) In "fcst" and "obs" set level = "(0,*,*)"; The 0 means use the
first (and only) timestep from that file, and the "*,*" indicates
which dimensions contain the gridded data.

(2) Just for testing purposes, I switched 'regrid.to_grid = OBS;' to
'regrid.to_grid = "G001";' and used nearest neighbor interpolation.
That makes it run MUCH faster. This is just for testing, and you can
switch back to whatever grid you'd like.

(3) You had climo_mean.field misconfigured, and that is the source of
the actual runtime error you reported. Instead use:
   field     = [ { name = "t_an"; level = "(0,0,*,*)"; } ];

The number of climo_mean fields must either be 0 or match the number
of fcst.field and obs.field entries. The 0,0 indexes into the time and
depth dimensions while *,* are the gridded dimensions. I assume the
first depth level is just the surface for SSTs? But please correct as
needed.
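
To illustrate how a level string such as "(0,0,*,*)" picks one 2D plane out of a 4D variable, here is a small Python sketch. It is only an analogy for the indexing described above, not MET's own code; `parse_level` and `select_plane` are hypothetical helpers, and `select_plane` assumes the fixed indices come before the "*" dimensions.

```python
def parse_level(level):
    """Turn "(0,0,*,*)" into (0, 0, slice(None), slice(None))."""
    parts = level.strip().strip("()").split(",")
    return tuple(slice(None) if p.strip() == "*" else int(p) for p in parts)


def select_plane(data, level):
    """Apply the leading fixed indices to nested lists, keeping the grid."""
    seen_star = False
    for index in parse_level(level):
        if isinstance(index, slice):
            seen_star = True          # remaining dimensions are the grid
        elif seen_star:
            raise ValueError("fixed index after '*' is not handled here")
        else:
            data = data[index]        # step into time, then depth, etc.
    return data


# Hypothetical variable with shape (time=1, depth=2, y=2, x=2).
t_an = [[[[1.0, 2.0], [3.0, 4.0]],
         [[5.0, 6.0], [7.0, 8.0]]]]
surface = select_plane(t_an, "(0,0,*,*)")
```

With a NumPy array the tuple from `parse_level` could be used directly as an index, for example `array[parse_level("(0,0,*,*)")]`.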

(4) More climo_mean changes... I hear that you just want to use the
climo data regardless of its timestamp. Selecting the RIGHT climo data
from multiple available times is the purpose of several config file
options. But you essentially want to disable those and use whatever is
passed as input. Do that by setting:
   time_interp_method = NEAREST;
   match_month        = FALSE;
   day_interval       = NA;
   hour_interval      = NA;

So now when I run Grid-Stat configured this way, it finds 1 climo mean
field and produces output. Note that it still produces this warning
message:

WARNING: process_scores() -> Forecast and observation valid times do not match 20210323_000000 != 20210322_120000 for sst(0,*,*) versus analysed_sst(0,*,*).

So there's a mismatch in the times of the fcst and obs data being
compared. But this is one great reason to make use of METplus. Once
you understand the naming conventions of the data files and encode
that into the METplus filename templates in the config file, it can
figure out which forecast file corresponds to which observation file.
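
As a quick check on the size of that mismatch, the two timestamps in the warning can be compared directly. This is a plain Python sketch using the YYYYMMDD_HHMMSS format shown in the message; the function name is hypothetical.

```python
from datetime import datetime

MET_TIME = "%Y%m%d_%H%M%S"  # timestamp layout used in the warning


def valid_time_offset(fcst_valid, obs_valid):
    """Return forecast minus observation valid time, in seconds."""
    delta = (datetime.strptime(fcst_valid, MET_TIME)
             - datetime.strptime(obs_valid, MET_TIME))
    return int(delta.total_seconds())


offset = valid_time_offset("20210323_000000", "20210322_120000")
```

Here `offset` is 43200 seconds, so the forecast valid time is 12 hours later than the observation valid time.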

If you have questions about setting up a METplus use case, please send
a new email to create a new ticket. We try to avoid really long
tickets with lots of back and forths. Those are more difficult to
follow.

Hope that helps... let me know if you're still experiencing problems.

Thanks,
John

On Thu, Apr 22, 2021 at 12:11 PM Todd Spindler via RT <met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=99408 >
>
> Thanks for the suggestions. I've plotted up the WOA 2018 temp field
> here:
>
> > plot_data_plane woa18_decav_t00_04.nc woa.ps 'name="t_an";level="(0,0,*,*)";'
> DEBUG 1: Opening data file: woa18_decav_t00_04.nc
> WARNING:
> WARNING: parse_cf_time_string() -> Unsupported time step in the CF convention time unit "months since 1955-01-01 00:00:00"
> WARNING:
> DEBUG 1: Creating postscript file: woa.ps
>
> plot_data_plane also complains about the date field, but can plot it.
> Since it's a single field for climatology, I'm not sure why met needs
> to process the date at all.
>
> --
> Dr. Todd Spindler
> IMSG at NOAA/NWS/NCEP/EMC
> 5830 University Research Ct., #2118
> College Park, MD 20740
> Todd.Spindler at noaa.gov
> 301-683-3757
> On 4/22/21 1:52 PM, John Halley Gotway via RT wrote:
>
>   Whenever getting going with a new gridded dataset in MET, whether it
>   be a gridded forecast file, gridded analysis file, or gridded climo
>   file, I always recommend that folks start by running the
>   plot_data_plane tool. That's a great way to figure out if MET can
>   read it.
>
>   If plot_data_plane works fine, I suspect it'll be fine in the other
>   tools, like Grid-Stat, as well. If it does not, consider using python
>   embedding to serve up the data instead.
>
>   As for the precise source of the error message you're seeing:
>      ERROR  : DictionaryEntry::dict_value() -> bad type
>
>   I'd really have to dive into the details. We're scrambling right now
>   trying to get the next development version out for the next release.
>   So I don't have the cycles right now to take a look.
>
>   So I'd start with:
>   (1) Plot the climo data with plot_data_plane
>   (2) That should error out with the same error message
>   (3) Send us the exact command you ran to produce the error
>   (4) Post the sample data file required to demonstrate the problem to
>   our anonymous ftp site (
>   http://dtcenter.org/community-code/model-evaluation-tools-met/met-help-desk#ftp
>   )
>   (5) I'll reassign this ticket to another developer, Howard Soh, and
>   he can take a closer look.
>
>   I suspect one of the following outcomes:
>   - Your config string is misconfigured and it's an easy fix.
>   - Howard will find a bug in MET, write up a GitHub issue, and fix it.
>   - Howard will find that there is a problem with the NetCDF file and
>   the fix should be made to the data rather than MET.
>   - The data is a NetCDF format not currently supported and the only
>   way to read it is via python-embedding.
>
>   Thanks,
>   John
>
>   On Thu, Apr 22, 2021 at 11:31 AM Todd Spindler via RT
>   <met_help at ucar.edu> wrote:
>
>     <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=99408 >
>
>     Thanks for that, John.  I have another, unrelated question.
>
>     I've been playing with setting up RTOFS gridded data and GHRSST OSPO
>     satellite SST obs data to process through grid_stat and stat_anal.
>     I've been regridding the RTOFS with CDO to a 1/12 deg cylindrical
>     projection to allow grid_stat to ingest it, which seems to work ok.
>     The problem I'm getting, though, is in picking up a climatology for
>     the SST analysis. I have a copy of the World Ocean Atlas 2018
>     climatology data set of salinity and temperature, but when I try to
>     read it in a met/grid_stat run it balks at getting a time stamp from
>     the climo file.  I'm not sure exactly what sort of time stamp ought
>     to be there (and I've had issues with WOA2018 before), but in python
>     it's easier to bypass than in met.
>
>     Here's the setup I'm using on Hera:
>
>     configs and scripts are here:
>     /scratch2/NCEPDEV/marine/Todd.Spindler/save/MET/RTOFS_GRID/Test3
>
>     initial setup using "source METsetup.linux.sh"
>
>     test run with ./run_test/sh
>
>     RTOFS data is in
>     /scratch2/NCEPDEV/marine/Todd.Spindler/save/MET/RTOFS_GRID/standard
>     OBS data is in
>     /scratch2/NCEPDEV/marine/Todd.Spindler/noscrub/GHRSST/GHRSST-OSPO-L4-GLOB_20210322.nc
>     Climo data is in
>     /scratch2/NCEPDEV/marine/Todd.Spindler/noscrub/Global/climo/WOA2018/woa18_decav_t00_04.nc
>
>     The test uses the file GridStatConfig in the configs and scripts
>     directory.  I tried setting up the climo_mean and climo_stddev
>     dictionaries, and met threw a date exception:
>
>     DEBUG 1: Default Config File: /contrib/met/10.0.0-beta3/share/met/config/GridStatConfig_default
>     DEBUG 1: User Config File: /scratch2/NCEPDEV/marine/Todd.Spindler/save/MET/RTOFS_GRID/Test3/GridStatConfig
>     DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_NcCF".
>     DEBUG 4: NcCfFile::open() -> parsing units for the time variable "days since 1900-12-31 00:00:00"
>     DEBUG 4: parse_cf_time_string() -> parsed NetCDF CF convention time unit string "days since 1900-12-31 00:00:00"
>     DEBUG 4:         as a reference time of 19001231_000000 and 86400 second(s) per time step.
>     DEBUG 4: NcCfFile::open() -> could not extract init time from the "forecast_reference_time" variable.
>     DEBUG 4: NcCfFile::open() -> could not extract init time from file name.
>     DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_NcCF".
>     DEBUG 4: NcCfFile::open() -> parsing units for the time variable "seconds since 1981-01-01 00:00:00"
>     DEBUG 4: parse_cf_time_string() -> parsed NetCDF CF convention time unit string "seconds since 1981-01-01 00:00:00"
>     DEBUG 4:         as a reference time of 19810101_000000 and 1 second(s) per time step.
>     DEBUG 4: get_nc_data(NcVar *, double *) add_offset = 0, scale_factor=1, cell_count=1, is_unsigned_value: 0
>     DEBUG 4: apply_scale_factor(double) unpacked data: count=0 out of 1. FillValue(int)=-9999 data range [1.26926e+09 - 1.26926e+09] raw data: [1269259200 - 1269259200] Positive count: 1
>     DEBUG 4: NcCfFile::open() -> could not extract init time from the "forecast_reference_time" variable.
>     DEBUG 4: NcCfFile::open() -> could not extract init time from file name.
>     ERROR  :
>     ERROR  : DictionaryEntry::dict_value() -> bad type
>     ERROR  :
>
>     Any suggestions?
>
>     --
>     Dr. Todd Spindler
>     IMSG at NOAA/NWS/NCEP/EMC
>     5830 University Research Ct., #2118
>     College Park, MD 20740
>     Todd.Spindler at noaa.gov
>     301-683-3757
>
>     On 4/21/21 1:45 PM, John Halley Gotway via RT wrote:
>
>       Todd,
>
>       Thanks for sending this sample data and python script. I adapted
>       your python script to serve up matched pair data to Stat-Analysis
>       instead of observation data to ascii2nc. Here's a call to
>       Stat-Analysis for this data:
>
>       stat_analysis -lookin python read_godae_matched_pairs.py \
>         class4_20210309_HYCOM_RTOFS_2.0_profile.nc -job aggregate_stat \
>         -line_type MPR -out_line_type CNT -by FCST_VAR,OBTYPE,FCST_LEAD \
>         -out_stat HYCOM_CNT.stat -log run_stat_analysis.log \
>         -dump_row HYCOM_MPR.stat -v 3
>
>       This command is doing the following:
>       - Run the python script indicated by the -lookin option to read
>       input data into memory.
>       - Process input MPR lines and convert them to continuous statistic
>       output lines (CNT).
>       - Do this for each unique combination of forecast variable
>       (FCST_VAR), message type (OBTYPE), and lead time (FCST_LEAD) found
>       in the input.
>       - Write the resulting stats to an output file (-out_stat
>       HYCOM_CNT.stat)
>       - Write a log file named run_stat_analysis.log (-log
>       run_stat_analysis.log)
>       - Write the input MPR lines read via python embedding (-dump_row
>       HYCOM_MPR.stat)
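
To make the "-by FCST_VAR,OBTYPE,FCST_LEAD" step more concrete, here is a
small Python sketch of the general idea: group matched pairs by those three
keys and compute a few continuous statistics per group. This is not what
Stat-Analysis runs internally, the sample pairs are made up, and the real CNT
line type holds many more columns (confidence intervals and so on) than the
three statistics shown here.

```python
from collections import defaultdict
from math import sqrt

# Made-up matched pairs: (FCST_VAR, OBTYPE, FCST_LEAD, forecast, observation)
pairs = [
    ("temperature", "profile", "000000", 10.0, 9.0),
    ("temperature", "profile", "000000", 12.0, 13.0),
    ("temperature", "profile", "240000", 11.0, 9.0),
    ("salinity", "profile", "000000", 35.0, 34.5),
]


def continuous_stats(pairs):
    """Group pairs by (FCST_VAR, OBTYPE, FCST_LEAD) and summarize the errors."""
    groups = defaultdict(list)
    for var, obtype, lead, fcst, obs in pairs:
        groups[(var, obtype, lead)].append(fcst - obs)

    stats = {}
    for key, errors in groups.items():
        n = len(errors)
        stats[key] = {
            "TOTAL": n,                                   # number of pairs
            "ME": sum(errors) / n,                        # mean error (bias)
            "MAE": sum(abs(e) for e in errors) / n,       # mean absolute error
            "RMSE": sqrt(sum(e * e for e in errors) / n), # root mean squared error
        }
    return stats


for key, values in continuous_stats(pairs).items():
    print(key, values)
```

Each unique key combination yields one summary, which is the same reason the
real job produces one CNT line per combination found in the input.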
>
>       This results in 108 output CNT lines. Please find the python script
>       and output from this job attached.
>
>       And note...
>       - I'm not very good at python and am guessing there are ways to
>       speed up this script considerably.
>       - The input file format for MET's python embedding is pretty darn
>       particular. Seems that formatting everything as a string works
>       well.
>       - I have the python script print 3 lines to the screen as a sanity
>       check to make sure there aren't any surprises.
>       - I hard-coded "HYCOM" since I couldn't find that in NetCDF
>       metadata anywhere.
>
>       This Stat-Analysis job is just one example. You could select many
>       other output line types from Stat-Analysis. Here's a link to the
>       docs for that job type:
>
>       https://met.readthedocs.io/en/latest/Users_Guide/stat-analysis.html#job-aggregate-stat
>
>
>       I'm not really sure why Stat-Analysis only used 3432 of the 3960
>       input lines:
>           DEBUG 2: Job 1 used 3432 out of 3960 STAT lines.
>
>       Perhaps some of the lines have bad data values in the FCST or OBS
>       columns?
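
One way to test that guess before handing the pairs to Stat-Analysis is to
count how many of them have usable values on both sides. The sketch below is
only an illustration: it assumes that -9999, "NA", and NaN mark missing data,
and it does not establish that this is why lines were skipped in the run
above. The sample pairs are made up.

```python
import math

# Assumed missing-value flag; adjust it if your data uses something else.
BAD_DATA = -9999.0


def is_usable(value):
    """Return True if value parses as a finite number other than BAD_DATA."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return False  # e.g. the string "NA"
    return math.isfinite(number) and number != BAD_DATA


def count_usable(pairs):
    """Count (forecast, observation) pairs where both values are usable."""
    return sum(1 for fcst, obs in pairs if is_usable(fcst) and is_usable(obs))


# Made-up pairs, formatted as strings like the python embedding input.
sample = [("10.5", "10.1"), ("NA", "9.0"), ("-9999", "8.0"), ("nan", "7"), ("3", "4")]
print(count_usable(sample), "of", len(sample), "pairs are usable")
```

If the usable count from a check like this matches the number of lines the
job reports using, missing values are the likely explanation.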
>
>       Thanks,
>       John
>
>       On Tue, Apr 20, 2021 at 11:13 AM Todd Spindler via RT
>       <met_help at ucar.edu> wrote:
>
>         <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=99408 >
>
>         Hi John,
>
>         Thanks for looking at this.  I have a sample data set up on our
>         ftp server:
>
>         ftp://polar.ncep.noaa.gov/rtofs/for_DTC/class4_20210309_HYCOM_RTOFS_2.0_profile.nc
>
>         My reader script pulls out obs data from the RTOFS GODAE profile
>         dataset, which has an unusual format.  The details on the file
>         format are given in the intercomparisonproposalv25.pdf (attached).
>
>         I can use the reader as it's currently configured to read
>         observations at ground level, but I don't know how to set up the
>         Pandas DataFrame for forecasts.
>
>         Here's what I've used so far:
>
>           > ascii2nc -format python "read_godae_point_notab.py class4_20210309_HYCOM_RTOFS_2.0_profile.nc" sample_godae_obs.nc
>
>           > plot_point_obs sample_godae_obs.nc sample_godae_obs.ps
>
>         Any pointers on how to extend the reader to handle forecasts from
>         the dataset (it has obs and forecasts) would be appreciated.
>
>         --
>         Dr. Todd Spindler
>         IMSG at NOAA/NWS/NCEP/EMC
>         5830 University Research Ct., #2118
>         College Park, MD 20740
>         Todd.Spindler at noaa.gov
>         301-683-3757
>
>         On 4/19/21 1:25 PM, John Halley Gotway via RT wrote:
>
>           Hi Todd,
>
>           I'm on the MET-Help desk on Mondays and was looking back through
>           existing tickets. I see this one that George created last week
>           based on your question.
>
>           I see you're asking about providing point forecasts to the
>           Point-Stat tool. Unfortunately, that does NOT work. Point-Stat
>           processes gridded forecast data and interpolates those values to
>           the point observation locations.
>
>           There are no tools currently in MET that help in creating matched
>           pairs using point forecasts and point observations. But if you
>           already have the fcst/obs data paired up, you can pass it to the
>           Stat-Analysis tool and derive statistics from it.
>
>           And you'd do that using python-embedding in Stat-Analysis.
>           Really, all we're doing is passing the paired data to
>           Stat-Analysis as if it were the MPR output line type from
>           Point-Stat. If you need help on the specifics, you could send us
>           some sample paired data and I could take a look at the formatting
>           to get it ingested into Stat-Analysis.
>
>           Am I answering your question? Or do you have others?
>
>           Thanks,
>           John Halley Gotway
