[Met_help] [rt.rap.ucar.edu #76361] History for Statanalysis Question

John Halley Gotway via RT met_help at ucar.edu
Tue Jun 7 11:11:33 MDT 2016


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

John, I am getting the following error when running stat_analysis.  The forecast times and valid times seem to be correct to me, so I am not sure what is causing the error.

/h/WXQC/met-5.1/bin/stat_analysis -lookin /h/data/global/WXQC/data/met/mdlob_pairs -out /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z -config /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated -v 6
DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z"
DEBUG 1: Default Config File: /home/qcteam/met-5.1/share/met/config/STATAnalysisConfig_default
DEBUG 1: User Config File: /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
DEBUG 4: Default Job from the config file: "-model GALWEM -fcst_lead 120000 -fcst_init_beg 20160501_000000 -fcst_init_end 20160501_000000 -fcst_init_hour 120000 -fcst_var APCP -fcst_thresh >=50 -line_type MPR -vif_flag 1 "
DEBUG 4: Amending default job with command line options: "(nul)"
DEBUG 3: Processing STAT file "/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat" ... 1 of 2
ERROR  : 
ERROR  : DataLine::get_item(int) -> range check error
ERROR  :

The config file and data file are on your server.



----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Statanalysis Question
From: John Halley Gotway
Time: Fri May 13 17:27:18 2016

Bob,

The problem is coming from the first line of the file you sent to me.
It contains a comma-separated list of header column names.

I'm not exactly sure where you pulled those header column names, but
that's the problem.  MET expects data to be separated by whitespace...
so it interprets that long string with a bunch of commas as a single
column.  The error comes when it tries to read the "second" column.
If you just remove that first line, it should run fine.
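
To illustrate the whitespace-splitting behavior described above, here
is a minimal Python sketch.  It is not MET code; the function names
and the shortened sample lines are hypothetical and only show why a
comma-separated header collapses into a single column, and how such a
line could be detected and dropped before running stat_analysis.

```python
def looks_like_csv_header(line):
    """True if the line is one whitespace-delimited token that contains commas."""
    tokens = line.split()
    return len(tokens) == 1 and "," in tokens[0]


def strip_csv_header(lines):
    """Return the lines without a leading comma-separated header, if there is one."""
    if lines and looks_like_csv_header(lines[0]):
        return lines[1:]
    return lines


# Hypothetical, heavily shortened sample: real .stat lines have many more columns.
sample = [
    "VERSION,MODEL,FCST_LEAD,LINE_TYPE",
    "V5.1 GALWEM 120000 MPR",
]

print(len(sample[0].split()))    # the header splits into a single token
print(strip_csv_header(sample))  # only the data row remains
```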

If you do want header columns, here's a trick.  Run the following job:

stat_analysis \
   -lookin point_stat_3_galwem_120000L_20160501_120000V.stat \
   -job filter -line_type MPR -dump_row out.stat

The file out.stat will now contain the full header for the MPR line
type.  When you select a single LINE_TYPE value, stat-analysis will
write the full header for that line type to the output.

Have a good weekend.

John



------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #76361] Statanalysis Question
From: robert.craig.2 at us.af.mil
Time: Mon May 16 14:47:17 2016

Thanks John, I knew the data was space delimited but forgot to check
the header.  As usual with MET, I progressed further but am hitting a
new error.  See below.  I pushed the config file to the ftp directory.
As you can see, the -tmp_dir is set to
/h/data/global/WXQC/data/met/tmp.  This directory's permissions are
wide open - in fact stat_analysis temp files are in there.  Does the
config*.temp file get written somewhere else?

Also, notice in the command line options, there are three thresholds.
MET kept telling me that I had to have three since this is probability
data.  Also, the latest MPR files (.stat) are in the ftp dir.  As you
can see, I generated model/ob pairs using different thresholds for the
forecast and observation data.  So this is where I get confused: I
assume the fcst thresh in the config file is a filter to pull those
lines that have the threshold I want.  I am not sure what the
-out_fcst_thresh in the command line is doing.  If it is filtering the
MPR line fcst data, then I would think I would set it to ge0 for the
fcst and ob since the fcst and ob data range from 0 to 1.  Am I
handling this correctly?

Thanks
Bob

['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
'/h/data/global/WXQC/data/met/mdlob_pairs', '-out',
'/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z',
'-config',
'/h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated',
'-out_fcst_thresh ge0,ge0.5,ge1', '-out_obs_thresh ge0', '-v', '6']
DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z"
DEBUG 1: Default Config File: /home/qcteam/met-5.1/share/met/config/STATAnalysisConfig_default
DEBUG 1: User Config File: /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
DEBUG 4: Default Job from the config file: "-model GALWEM -fcst_lead 120000 -fcst_init_beg 20160501_000000 -fcst_init_end 20160502_000000 -fcst_init_hour 000000 -fcst_var APCP -fcst_thresh >=1 -line_type MPR -vif_flag 1 "
DEBUG 4: Amending default job with command line options: "-out_fcst_thresh ge0,ge0.5,ge1 -out_obs_thresh ge0"
DEBUG 3: Processing STAT file "/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat" ... 1 of 10
ERROR  :
ERROR  :
ERROR  :   MetConfig::read_string(const char *) -> unable to open temp file "config_23325_0_.temp"
ERROR  :




------------------------------------------------
Subject: Statanalysis Question
From: John Halley Gotway
Time: Mon May 16 15:53:39 2016

Bob,

Thanks for sending the sample data.  I agree that STAT-Analysis can get
pretty confusing.  It has a lot of flexibility, but we really need to
think through what you're trying to do.

First, regarding the error you're getting.  Unfortunately, the config
file string parser is writing a temp file in the current "runtime"
directory.  The error is from the fact that you don't have permission
to write the file "config_23325_0_.temp" in the current directory.
Ultimately, we should change that to use the temp directory instead.
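
Since the temp file lands in the current working directory, one
possible workaround is to launch the tool from a directory that is
known to be writable.  The sketch below assumes stat_analysis is
started from Python; the helper names are hypothetical, and any
relative paths in the command would then be resolved against the
chosen directory.

```python
import os
import subprocess
import tempfile


def first_writable_dir(candidates):
    """Return the first existing directory the current user can write to, else None."""
    for path in candidates:
        if os.path.isdir(path) and os.access(path, os.W_OK):
            return path
    return None


def run_in_writable_dir(cmd, candidates):
    """Run cmd with its working directory set to a writable directory."""
    run_dir = first_writable_dir(candidates)
    if run_dir is None:
        raise RuntimeError("no writable directory found among: %r" % (candidates,))
    # cwd= changes only the child's working directory, not this script's.
    return subprocess.run(cmd, cwd=run_dir, check=False)


# Example: prefer the MET temp directory from the ticket, fall back to the system one.
print(first_writable_dir(["/h/data/global/WXQC/data/met/tmp", tempfile.gettempdir()]))
```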

Next, I looked at the data you sent to me.  Listed below are the unique
combinations of just a few of the header columns:

FCST_VAR  FCST_THRESH  LINE_TYPE  TOTAL
APCP      >=1          MPR        14666
APCP      >=25         MPR        14666
APCP      >=50         MPR        14666
CEIL      <=1000       MPR        11926
CEIL      <=100        MPR        11926
CEIL      <=300        MPR        11926
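
A summary like the one above can be produced with a few lines of
Python.  This is only a sketch: it assumes the .stat file starts with
a whitespace-delimited header line (for example one written by the
-dump_row job), and the sample rows are hypothetical and shortened to
five columns.

```python
from collections import Counter


def count_header_combos(lines, columns=("FCST_VAR", "FCST_THRESH", "LINE_TYPE")):
    """Count data rows for each unique combination of the named header columns."""
    header = lines[0].split()
    idx = [header.index(name) for name in columns]
    counts = Counter()
    for line in lines[1:]:
        fields = line.split()
        if len(fields) <= max(idx):
            continue  # skip blank or truncated rows
        counts[tuple(fields[i] for i in idx)] += 1
    return counts


# Hypothetical sample; a real MPR header has many more columns.
sample = [
    "FCST_VAR FCST_THRESH LINE_TYPE FCST OBS",
    "APCP >=1 MPR 0.11 0",
    "APCP >=1 MPR 0.33 1",
    "APCP >=25 MPR 0.00 0",
    "CEIL <=1000 MPR 0.22 1",
]

for combo, total in sorted(count_header_combos(sample).items()):
    print(*combo, total)
```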

Based on this, it looks like you have a lot of duplicate matched pair
(MPR) output lines... We have the same 14666 pairs for APCP repeated 3
times followed by the same 11926 pairs for CEIL repeated 3 times.  This
isn't necessary.  Instead, the FCST_THRESH and OBS_THRESH columns for
the MPR line type should be set to "NA".  The MPR line type that
Point-Stat creates just contains the paired forecast and observation
values.  Thresholds do not apply to this line type.

I posted an updated version of your file to the ftp site.  I stripped
it down to 14666 APCP lines and 11926 CEIL lines with NA in the
FCST_THRESH and OBS_THRESH columns:

ftp://ftp.rap.ucar.edu/incoming/irap/met_help/craig_data/point_stat_3_galwem_120000L_20160501_120000V_JHG.stat

Looking at the values in the FCST column, I see numbers between 0 and 1
(0, 0.11, 0.22, 0.33, ...).  Looking in the OBS column, I see 2 numbers
(0 or 1).  And looking at your config file, it looks like you want to
use these MPR lines to compute probabilistic output.  MET verifies
probabilities using an Nx2 contingency table.  You use
"-out_fcst_thresh" to select the probabilistic thresholds to be applied
and "-out_obs_thresh" to select the observation threshold to be
applied.

Here's a stat-analysis job you could run to read the MPR lines, define
the probabilistic forecast thresholds, define the single observation
threshold, and compute a PSTD output line.  Using "-by FCST_VAR" tells
it to run the job separately for each unique entry found in the
FCST_VAR column.

/usr/local/met-5.1/bin/stat_analysis \
   -lookin point_stat_3_galwem_120000L_20160501_120000V_JHG.stat \
   -job aggregate_stat -line_type MPR -out_line_type PSTD \
   -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
   -out_obs_thresh eq1.0 \
   -by FCST_VAR \
   -out_stat out_pstd.txt

The output statistics are written to "out_pstd.txt".
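
For anyone driving the tool from a script, the same job could be
assembled as an argument list.  This is a sketch only: the function
names are hypothetical, the executable is assumed to be on the PATH as
"stat_analysis", and the options are copied from the command above
rather than taken from any other documentation.

```python
def prob_thresholds():
    """Build the threshold string ge0,ge0.1,...,ge1.0 (edges in tenths)."""
    edges = ["ge0"] + ["ge%.1f" % (i / 10) for i in range(1, 11)]
    return ",".join(edges)


def build_pstd_job(stat_file, out_file):
    """Return the stat_analysis argument list for the MPR -> PSTD job."""
    return [
        "stat_analysis",  # assumed to be on the PATH
        "-lookin", stat_file,
        "-job", "aggregate_stat",
        "-line_type", "MPR",
        "-out_line_type", "PSTD",
        "-out_fcst_thresh", prob_thresholds(),
        "-out_obs_thresh", "eq1.0",
        "-by", "FCST_VAR",
        "-out_stat", out_file,
    ]


cmd = build_pstd_job(
    "point_stat_3_galwem_120000L_20160501_120000V_JHG.stat", "out_pstd.txt"
)
print(" ".join(cmd))
```

The list can then be handed to subprocess.run(cmd); each option and its
value are separate list items so that no shell quoting is needed.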

Hope that helps.

John








------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #76361] Statanalysis Question
From: robert.craig.2 at us.af.mil
Time: Thu May 19 14:23:04 2016

Thanks John, the directory problem was due to some corruption in the
MET installation on one of the systems.  On another system the problem
doesn't come up, so we are hoping a recompile of MET on that system
will clear up the issue.

As far as the second comment, I don't think you interpreted what I am
doing correctly.  In each file, there are three sets of data for each
variable.  They are not identical, since the first set is the ob
neighborhood data for precip > 1, the next set is the ob neighborhood
data for precip > 25, and the third is the same for precip > 50.  If
you compare the model and ob data, the model data should be different
(for some obs) than the model data for the previous category.  The
neighborhoods around each ob site follow the HiRA method.  All the
data in the MPR file lines are probabilities, so I want to create PSTD
from these data.  So I was using the fcst_thresh to filter for the
HiRA thresholds I am interested in.  I tried your code and added -by
FCST_THRESH and got out three different sets of values.  So my
question is why did you have the -out_fcst_thresh set to 10 prob
thresholds?  Why wouldn't ge0 pick up all probabilities?  I am not
understanding how these thresholds are being used.

Thanks
Bob




------------------------------------------------
Subject: Statanalysis Question
From: John Halley Gotway
Time: Thu May 19 15:31:29 2016

Bob,

It's funny, I was just talking to a colleague today about doing
something very similar to this on a different dataset.

I think I understand what you're saying about the thresholds >1, >25,
and >50.  These are used to define the "event" which is used in
computing the fractional coverage fields.  My confusion comes from the
fact that in MET currently only Grid-Stat is computing these fractional
coverage fields, not Point-Stat.  But I see now what you're doing.

One suggestion would be to change the contents of the INTERP_MTHD and
INTERP_PNTS header columns.  You currently have NEAREST, 1 which would
indicate that each observation value was matched to the forecast value
at the nearest grid point.  Instead, I would suggest writing NBRHD and
N, where N indicates the number of points in the neighborhood.  For
example, the NBRHD output from Grid-Stat would write 49 if we were
using a 7x7 box.

The -fcst_thresh and -obs_thresh options are used to filter the input
MPR lines, as you already know.  The -out_fcst_thresh and
-out_obs_thresh options define the thresholds to be applied when
computing the output for the job.  In MET, probabilities are not
processed in a "continuous" way.  Instead, they are put into
probability bins.  Those bins are used to create an Nx2 contingency
table from which probabilistic statistics are computed.

Setting "-out_fcst_thresh gt0,gt0.25,gt0.5,gt0.75,gt1.0" defines 4
probability bins which yields a 4x2 contingency table, from which stats
are computed.
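
To make the binning idea concrete, here is a small Python sketch of
how forecast probabilities and binary observations could be tallied
into an Nx2 table.  It is not MET's implementation: the function name
and sample pairs are hypothetical, and the edge handling (each bin
includes its lower edge, with 1.0 placed in the last bin) is an
assumption that may differ from what MET does.

```python
from bisect import bisect_right


def nx2_table(pairs, edges, obs_event=1.0):
    """Tally (forecast probability, observation) pairs into an Nx2 table.

    Row i covers probabilities from edges[i] up to edges[i + 1].
    Column 0 counts observed events, column 1 counts non-events.
    """
    n_bins = len(edges) - 1
    table = [[0, 0] for _ in range(n_bins)]
    for prob, obs in pairs:
        if prob < edges[0] or prob > edges[-1]:
            raise ValueError("probability %r is outside the bin range" % prob)
        # Index of the bin whose lower edge is <= prob; clamp 1.0 into the last bin.
        row = min(bisect_right(edges, prob) - 1, n_bins - 1)
        col = 0 if obs == obs_event else 1
        table[row][col] += 1
    return table


# Five edges define four bins, giving a 4x2 table.
edges = [0.0, 0.25, 0.5, 0.75, 1.0]
pairs = [(0.0, 0), (0.11, 0), (0.33, 1), (0.5, 0), (0.78, 1), (1.0, 1)]

for lower, row in zip(edges, nx2_table(pairs, edges)):
    print(lower, row)
```

With a single threshold such as ge0 there would be only one bin
holding every pair, so the table could not separate low forecast
probabilities from high ones.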

Hope that helps.

Thanks,
John

------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #76361] Statanalysis Question
From: robert.craig.2 at us.af.mil
Time: Fri May 20 15:33:54 2016

John, I might have found more information on the error below:

MetConfig::read_string(const char *) -> unable to open temp file
"config_1943_0_.temp"

When I run stat_anal on MPR files, there are no problems if I process 2
days of data, but when I increase it to three days, the error occurs.  I
also played with the dates to try to rule out a particular data file as
the cause.  So, it seems to be related to how many data files it has to
process.  I posted all the data files on the ftp server and my command
line is below.  The config file on the server is still representative.
This error does not seem related to directory permissions.

/h/WXQC/met-5.1/bin/stat_analysis \
   -lookin /h/data/global/WXQC/data/met/mdlob_pairs \
   -out /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0 \
   -config /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated \
   -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
   -out_obs_thresh eq1 -by FCST_VAR -by FCST_THRESH -v 6

This is concerning since I have to run this on many more than two days'
worth of data.  Any idea what might be happening?

Thanks
Bob

------------------------------------------------
Subject: Statanalysis Question
From: John Halley Gotway
Time: Mon May 23 10:56:15 2016

Bob,

Surprisingly, I was able to replicate the same error you're seeing!  I see
where the error is occurring, but I don't yet understand why it's
happening.  However, I do have a workaround for you.

Try editing your config file by emptying out the "fcst_thresh" setting:
   fcst_thresh = [];

In your job, you're using "-by fcst_thresh" anyway, so STAT-Analysis will
group the data by the FCST_THRESH column.  After I removed the
"fcst_thresh" setting, the job completed.  I'll continue looking into the
reason for that error.

How many of these files are you planning to pass to STAT-Analysis at any
given time?  I see each sample file contains about 80,000 MPR lines.
When I pass it 5 files (about 400,000 lines), the job takes about 62
seconds to run on my machine.  I worry that as you increase the number of
files, it'll run very slowly.

Here's some alternative logic you might consider.  Run STAT-Analysis once
for each .stat file you're generating.  Instead of writing the PSTD line
type, write the PCT line type (that's just the counts of that probabilistic
Nx2 table).  Then run jobs to aggregate the PCT line types and compute
PSTD lines (-job aggregate_stat -line_type PCT -out_line_type PSTD).  That
would make the processing for each STAT-Analysis job much more manageable.
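
The two-step approach works because the Nx2 counts are additive: summing
the per-file tables bin by bin gives the same table as counting over all
the pairs at once.  Here is a small sketch of that in POSIX sh and awk.
The row layout "bin_lower bin_upper event_count nonevent_count" and the
name sum_tables are made up for this illustration and are NOT the real PCT
line format.

```shell
# sum_tables: read rows of "bin_lower bin_upper event_count nonevent_count"
# from stdin (a hypothetical simplified layout, not the MET PCT line type)
# and print the per-bin totals in the order each bin was first seen.
sum_tables() {
    awk '
    {
        key = $1 " " $2
        if (!(key in yes)) order[++n] = key   # remember first-seen order
        yes[key] += $3
        no[key]  += $4
    }
    END {
        for (i = 1; i <= n; i++)
            printf "%s %d %d\n", order[i], yes[order[i]], no[order[i]]
    }'
}

# Example: two made-up per-file tables with the same two bins.
{
    printf '0 0.5 1 4\n0.5 1.0 3 2\n'     # counts from file 1
    printf '0 0.5 0 5\n0.5 1.0 4 1\n'     # counts from file 2
} | sum_tables
```

Statistics that depend only on the table can then be computed once from
the summed counts.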

BUT in order to do this in 2 steps, you would really need to put the
neighborhood size information into the INTERP_MTHD and INTERP_PNTS
columns.  Otherwise, the threshold information won't be retained in the PCT
output lines.  And I also found an issue in the formatting of the
OBS_THRESH output column ("=1" in the output should really be "==1").  For
now I just switched to "ge1", but I need to fix this formatting issue.

Listed below are the c-shell commands I used to loop over your sample files
and run stat_analysis in this way...

# Loop through MPR files and compute PCT output lines
# (the stat_pct directory should already exist)
foreach mpr_file (`ls stat_mpr/*.stat`)
   # Escape the dot so sed only rewrites the ".stat" extension and not
   # the "_stat" inside names like point_stat_3_...
   set pct_file = `echo $mpr_file | sed 's/stat_mpr/stat_pct/g' | sed 's/\.stat/_pct.stat/g'`
   /usr/local/met-5.1/bin/stat_analysis -lookin $mpr_file -out_stat $pct_file \
      -job aggregate_stat -line_type MPR -out_line_type PCT \
      -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,FCST_THRESH \
      -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
      -out_obs_thresh ge1
end

# Aggregate PCT lines and compute PSTD stats
/usr/local/met-5.1/bin/stat_analysis -lookin stat_pct -out_stat pstd.stat \
   -job aggregate_stat -line_type PCT -out_line_type PSTD \
   -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,INTERP_MTHD,INTERP_PNTS

Note that this job is combining all of the neighborhood sizes because
INTERP_MTHD and INTERP_PNTS are set the same in the PCT lines.
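
If c-shell is not available, the first loop could be sketched in POSIX sh
roughly as follows.  The stat_analysis options are the same ones used in
the c-shell commands above.  The helper names pct_path and mpr_to_pct, the
STAT_ANALYSIS variable, and the stat_mpr/stat_pct directory layout are
assumptions for this example.

```shell
# Path to the executable as used in this thread; override if yours differs.
STAT_ANALYSIS=${STAT_ANALYSIS:-/usr/local/met-5.1/bin/stat_analysis}

# pct_path FILE: map stat_mpr/NAME.stat to stat_pct/NAME_pct.stat
# (hypothetical helper; only the trailing ".stat" extension is replaced).
pct_path() {
    base=$(basename "$1" .stat)
    printf 'stat_pct/%s_pct.stat\n' "$base"
}

# mpr_to_pct: run one aggregate_stat job per MPR file found in stat_mpr/,
# writing PCT lines for each file into stat_pct/.
mpr_to_pct() {
    mkdir -p stat_pct
    for mpr_file in stat_mpr/*.stat; do
        [ -e "$mpr_file" ] || continue    # the glob matched nothing
        "$STAT_ANALYSIS" -lookin "$mpr_file" \
            -out_stat "$(pct_path "$mpr_file")" \
            -job aggregate_stat -line_type MPR -out_line_type PCT \
            -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,FCST_THRESH \
            -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
            -out_obs_thresh ge1
    done
}
```

Calling mpr_to_pct from the directory that contains stat_mpr/ would run
the per-file jobs.  The PCT to PSTD aggregation command above can then be
run unchanged.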

Thanks,
John



On Fri, May 20, 2016 at 3:33 PM, robert.craig.2 at us.af.mil via RT <
met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>
> John, I might have found more information on the error below:
>
> MetConfig::read_string(const char *) -> unable to open temp file
> "config_1943_0_.temp"
>
> When I run stat_anal on MPR files, no problems if I  process 2 days
of
> data, but when I increase it to three days, the error occurs.  I
also
> played with the dates to try to eliminate a particular data file as
the
> cause.  So, it seems to be related to how many data files it has to
> process.   I posted all the data files on the ftp server and my
command
> line is below.  The config file on the server is still
representative.
> This error does not seem related to directory permissions.
>
> /h/WXQC/met-5.1/bin/stat_analysis  -lookin
> /h/data/global/WXQC/data/met/mdlob_pairs -out
> /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0
-config
> /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> -out_fcst_thresh
> ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0
> -out_obs_thresh eq1 -by FCST_VAR -by FCST_THRESH -v 6
>
> This is concerning since I have to run this on many more than two
days
> worth of data.  Any idea what be happening?
>
> Thanks
> Bob
>
> -----Original Message-----
> From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> Sent: Thursday, May 19, 2016 4:31 PM
> To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
<robert.craig.2 at us.af.mil>
> Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
> Bob,
>
> It's funny, I was just talking to a colleague today about doing
something
> very similar to this on a different dataset.
>
> I think I understand what you're saying about the thresholds >1,
>25, and
> >50.  These are used to define the "event" which is used in
computing
> >the
> fractional coverage fields.  My confusion comes from the fact that
in MET
> currently only Grid-Stat is computing these fractional coverage
fields, not
> Point-Stat.  But I see now what you're doing.
>
> One suggestion would be to change the contents of the INTERP_MTHD
and
> INTERP_PNTS header columns.  You currently have NEAREST, 1 which
would
> indicate that each observation value was matched to the forecast
value at
> the nearest grid point.  Instead, I would suggest writing NBRHD and
N,
> where N indicates the number of points in the neighborhood.  For
example,
> the NBRHD output from Grid-Stat would write 49 if we were using a
7x7 box.
>
> The -fcst_thresh and -obs_thresh options are used to filter the
input MPR
> lines, as you already know.  The -out_fcst_thresh and
-out_obs_thresh
> options define the thresholds to be applied when computing the
output for
> the job.  In MET, probabilities are not processed in a "continuous"
way.
> Instead, they are put into probability bins.  Those bins are used to
> create an Nx2 contingency table from which probabilistic statistics
are
> computed.
>
> Setting "-out_fcst_thresh gt0,gt0.25,gt0.5,gt0.75,gt1.0" defines 4
> probability bins which yields a 4x2 contingency table, from which
stats are
> computed.
>
> Hope that helps.
>
> Thanks,
> Johhn
>
> On Thu, May 19, 2016 at 2:23 PM, robert.craig.2 at us.af.mil via RT <
> met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
> >
> > Thanks John, the directory problem was due to come corruption on
the
> > MET we had on one if the systems.  On another system the problem
> > doesn't come up so we are hoping a recompile of MET on said system
> > will clear up the issue.
> >
> > As far as the second comment, I don't think you interpreted what I
am
> > doing correctly.  In each file, there is three sets of data for
each
> > variable.  They are not identical since the first set is the ob
> > neighborhood data for precip > 1.  The next set is the ob
neighborhood
> > data for precip > 25, and the same for precip > 50.  I you compare
the
> > model and ob data, the model data should be different (for some
obs)
> > then the model data for the previous category.  The neighborhoods
> around each ob site
> > follow the HiRA method.   All the data in the mpr file lines are
> > probabilities, so I want to create PSTD from these data.  So I was
using
> > the fcst_thresh to filter for the HiRA thresholds I am interested
in.   I
> > tried your code and added -by FCST_THRESH and got out three
different
> sets
> > of values.   So my question is why did you have the
-out_fcst_thresh set
> to
> > 10 prob thresolds?  Why wouldn't ge0 pick  up all probabilities?
I am
> > not understanding how these thresholds are being used.
> >
> > Thanks
> > Bob
> >
> > -----Original Message-----
> > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> > Sent: Monday, May 16, 2016 4:54 PM
> > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> > <robert.craig.2 at us.af.mil>
> > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
> >
> > Bob,
> >
> > Thanks for sending the sample data.  I agree that STAT-Analysis
can
> > get pretty confusing.  It has a lot of flexibility, but we really
need
> > to think through what you're trying to do.
> >
> > First, regarding the error you're getting.  Unfortunately, the
config
> > file string parser is writing a temp file in the current "runtime"
> directory.
> > The error is from the fact that you don't have permission to write
the
> > file "config_23325_0_.temp" in the current directory.  Ultimately,
we
> > should change that to use the temp directory instead.
> >
> > Next, I looked at the data you sent to me.  Listed below are the
> > unique combinations of just a few of the header columns:
> >
> > FCST_VAR  FCST_THRESH  LINE_TYPE  TOTAL
> > APCP      >=1          MPR        14666
> > APCP      >=25         MPR        14666
> > APCP      >=50         MPR        14666
> > CEIL      <=1000       MPR        11926
> > CEIL      <=100        MPR        11926
> > CEIL      <=300        MPR        11926
> >
> > Based on this, it looks like you have a lot of duplicate matched
> > pair (MPR) output lines... We have the same 14666 pairs for APCP
> > repeated 3 times followed by the same 11926 pairs for CEIL repeated
> > 3 times.  This isn't necessary.  Instead, the FCST_THRESH and
> > OBS_THRESH columns for the MPR line type should be set to "NA".  The
> > MPR line type that Point-Stat creates just contains the paired
> > forecast and observation values.  Thresholds do not apply to this
> > line type.
> >
> > I posted an updated version of your file to the ftp site.  I
> > stripped it down to 14666 APCP lines and 11926 CEIL lines with NA in
> > the FCST_THRESH and OBS_THRESH columns:
> >
> > ftp://ftp.rap.ucar.edu/incoming/irap/met_help/craig_data/point_stat_3_galwem_120000L_20160501_120000V_JHG.stat
> >
> > Looking at the values in the FCST column, I see numbers between 0
> > and 1 (0, 0.11, 0.22, 0.33, ...).  Looking in the OBS column, I see
> > 2 numbers (0 or 1).  And looking at your config file, it looks like
> > you want to use these MPR lines to compute probabilistic output.
> > MET verifies probabilities using an Nx2 contingency table.  You use
> > "-out_fcst_thresh" to select the probabilistic thresholds to be
> > applied and "-out_obs_thresh" to select the observation threshold to
> > be applied.
> >
> > Here's a stat-analysis job you could run to read the MPR lines,
> > define the probabilistic forecast thresholds, define the single
> > observation threshold, and compute a PSTD output line.  Using
> > "-by FCST_VAR" tells it to run the job separately for each unique
> > entry found in the FCST_VAR column.
> >
> > /usr/local/met-5.1/bin/stat_analysis \
> >    -lookin point_stat_3_galwem_120000L_20160501_120000V_JHG.stat \
> >    -job aggregate_stat -line_type MPR -out_line_type PSTD \
> >    -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
> >    -out_obs_thresh eq1.0 \
> >    -by FCST_VAR \
> >    -out_stat out_pstd.txt
> >
> > The output statistics are written to "out_pstd.txt".
> >
> > Hope that helps.
> >
> > John
> >
> >
> >
> >
> >
> >
> >
> > On Mon, May 16, 2016 at 2:47 PM, robert.craig.2 at us.af.mil via RT <
> > met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
> > >
> > > Thanks John, I knew the data was space delimited but forgot to
> > > check the header.  As usual with MET, I progressed further but am
> > > hitting a new error.  See below.  I pushed the config file to the
> > > ftp directory.  As you can see, the -tmp_dir is set to
> > > /h/data/global/WXQC/data/met/tmp.  This directory's permissions
> > > are wide open - in fact stat_anal temp files are in there.  Does
> > > the config*.temp try to write somewhere else?
> > >
> > > Also, notice in the command line options, there are three
> > > thresholds.  MET kept telling me that I had to have three since
> > > this is probability data.  Also, the latest MPR files (.stat) are
> > > in the ftp dir.  As you can see I generated model/ob pairs using
> > > different thresholds for the forecast and observation data.  So
> > > this is where I get confused: I assume the fcst thresh in the
> > > config file is a filter to pull those lines that have the
> > > threshold I want.  I am not sure what the -out_fcst_thresh in the
> > > command line is doing.  If it is filtering the MPR line fcst data,
> > > then I would think I would set it to ge 0 for the fcst and ob
> > > since the fcst and ob data range from 0 to 1.  Am I handling this
> > > correctly?
> > >
> > > Thanks
> > > Bob
> > >
> > > ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
> > > '/h/data/global/WXQC/data/met/mdlob_pairs', '-out',
> > > '/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z',
> > > '-config',
> > > '/h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated',
> > > '-out_fcst_thresh ge0,ge0.5,ge1', '-out_obs_thresh ge0', '-v', '6']
> > > DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z"
> > > DEBUG 1: Default Config File: /home/qcteam/met-5.1/share/met/config/STATAnalysisConfig_default
> > > DEBUG 1: User Config File: /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> > > DEBUG 4: Default Job from the config file: "-model GALWEM -fcst_lead 120000 -fcst_init_beg 20160501_000000 -fcst_init_end 20160502_000000 -fcst_init_hour 000000 -fcst_var APCP -fcst_thresh >=1 -line_type MPR -vif_flag 1 "
> > > DEBUG 4: Amending default job with command line options: "-out_fcst_thresh ge0,ge0.5,ge1 -out_obs_thresh ge0"
> > > DEBUG 3: Processing STAT file "/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat" ... 1 of 10
> > > ERROR  :
> > > ERROR  :
> > > ERROR  :   MetConfig::read_string(const char *) -> unable to open temp file "config_23325_0_.temp"
> > > ERROR  :
> > >
> > > -----Original Message-----
> > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> > > Sent: Friday, May 13, 2016 6:27 PM
> > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> > > <robert.craig.2 at us.af.mil>
> > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
> > >
> > > Bob,
> > >
> > > The problem is coming from the first line of the file you sent to
> > > me.  It contains a comma-separated list of header column names.
> > >
> > > I'm not exactly sure where you pulled those header column names,
> > > but that's the problem.  MET expects data to be separated by
> > > whitespace... so it interprets that long string with a bunch of
> > > commas as a single column.  The error comes when it tries to read
> > > the "second" column.  If you just remove that first line, it
> > > should run fine.
> > >
> > > If you do want header columns, here's a trick.  Run the following
> > > job:
> > >
> > > stat_analysis -lookin point_stat_3_galwem_120000L_20160501_120000V.stat \
> > >    -job filter -line_type MPR -dump_row out.stat
> > >
> > > The file out.stat will now contain the full header for the MPR
> > > line type.  When you select a single LINE_TYPE value,
> > > stat-analysis will write the full header for that line type to the
> > > output.
> > >
> > > Have a good weekend.
> > >
> > > John
> > >
> > >
> > > On Fri, May 13, 2016 at 10:26 AM, robert.craig.2 at us.af.mil via
RT <
> > > met_help at ucar.edu> wrote:
> > >
> > > >
> > > > Fri May 13 10:26:26 2016: Request 76361 was acted upon.
> > > > Transaction: Ticket created by robert.craig.2 at us.af.mil
> > > >        Queue: met_help
> > > >      Subject: Statanalysis Question
> > > >        Owner: Nobody
> > > >   Requestors: robert.craig.2 at us.af.mil
> > > >       Status: new
> > > >  Ticket <URL:
> > > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361
> > > > >
> > > >
> > > >
> > > > John, I am getting the following error when running
> > > > statanalysis.  The forecast times and valid times seem to be
> > > > correct to me, so I am not sure of the cause of the error.
> > > >
> > > > /h/WXQC/met-5.1/bin/stat_analysis -lookin /h/data/global/WXQC/data/met/mdlob_pairs -out /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z -config /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated -v 6
> > > > DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z"
> > > > DEBUG 1: Default Config File: /home/qcteam/met-5.1/share/met/config/STATAnalysisConfig_default
> > > > DEBUG 1: User Config File: /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> > > > DEBUG 4: Default Job from the config file: "-model GALWEM -fcst_lead 120000 -fcst_init_beg 20160501_000000 -fcst_init_end 20160501_000000 -fcst_init_hour 120000 -fcst_var APCP -fcst_thresh >=50 -line_type MPR -vif_flag 1 "
> > > > DEBUG 4: Amending default job with command line options: "(nul)"
> > > > DEBUG 3: Processing STAT file "/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat" ... 1 of 2
> > > > ERROR  :
> > > > ERROR  : DataLine::get_item(int) -> range check error
> > > > ERROR  :
> > > >
> > > > The config file and data file are on your server.
> > > >
> > > >
> > > >
> > >
> > >
> > >
> > >
> >
> >
> >
> >
>
>
>
>
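
The quoted replies above explain that the values given to
-out_fcst_thresh act as probability bin edges for an Nx2 contingency
table rather than as a filter, which is why ge0 alone is not enough to
define the bins.  Here is a minimal Python sketch of that binning.  The
function name nx2_table and the sample pairs are made up for
illustration, and the edge handling (bins closed on the left, with the
top edge folded into the last bin) is an assumption of this sketch, not
a statement of how MET treats edge values.

```python
from bisect import bisect_right

def nx2_table(pairs, edges, is_event):
    """Count (forecast probability, observation) pairs into an Nx2 table.

    edges    -- ascending bin edges, e.g. [0.0, 0.25, 0.5, 0.75, 1.0] gives 4 bins
    is_event -- predicate on the observation (the role of -out_obs_thresh)
    Returns one [event_count, non_event_count] row per probability bin.
    """
    n_bins = len(edges) - 1
    table = [[0, 0] for _ in range(n_bins)]
    for prob, obs in pairs:
        if prob < edges[0] or prob > edges[-1]:
            raise ValueError(f"probability {prob} outside {edges[0]}..{edges[-1]}")
        # Bin i holds edges[i] <= prob < edges[i+1]; the top edge goes in the last bin.
        i = min(bisect_right(edges, prob) - 1, n_bins - 1)
        table[i][0 if is_event(obs) else 1] += 1
    return table

# Made-up pairs in the style of the thread: FCST in 0..1, OBS is 0 or 1.
pairs = [(0.0, 0), (0.11, 0), (0.22, 1), (0.33, 0), (0.56, 1), (0.78, 1), (1.0, 1)]
edges = [0.0, 0.25, 0.5, 0.75, 1.0]
table = nx2_table(pairs, edges, is_event=lambda obs: obs == 1)
for lo, hi, (yes, no) in zip(edges, edges[1:], table):
    print(f"{lo:.2f}-{hi:.2f}  event={yes}  non_event={no}")
```

Five edges give four rows, matching the 4x2 table described for five
thresholds in the quoted reply.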

------------------------------------------------
Subject: Statanalysis Question
From: John Halley Gotway
Time: Mon May 23 12:03:36 2016

Bob,

I wanted to give you an update on the two issues you found...

(1) I found and fixed a pesky little bug in STAT-Analysis.  It was
writing "=1" to the output stat files when it should be writing "==1".
That was a one line fix.

(2) I now understand the original problem you wrote about.  It shows
up when you use the -fcst_thresh, -obs_thresh, or -cov_thresh job
command options to filter the input data.  When STAT-Analysis
processes those threshold-type columns, it's actually writing a tiny
temp file and reading that back in.  After doing that about 65,528
times (at least on my machine), I get that error.  Writing that many
little temp files is wreaking havoc.

We'll look into (2) some more to see if we can find an alternative way
of processing this data.  I'll hold off on posting the (1) bugfix to
see if I can bundle a bugfix for (2) with it.

Thanks,
John
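
To make the two issues above concrete, here is a small Python sketch of
parsing and writing threshold strings entirely in memory.  It is not
MET's code: the names parse_thresh, format_thresh and check are
hypothetical, and the only behaviour taken from this thread is that
both forms such as "ge0.5" and ">=50" appear, and that an equality
threshold should be written as "==1" rather than "=1".

```python
import operator
import re

# Abbreviation -> (symbol, comparison function). "eq" is written as "==", not "=".
_OPS = {
    "lt": ("<", operator.lt), "le": ("<=", operator.le),
    "eq": ("==", operator.eq), "ne": ("!=", operator.ne),
    "gt": (">", operator.gt), "ge": (">=", operator.ge),
}
_BY_SYMBOL = {symbol: abbr for abbr, (symbol, _) in _OPS.items()}
_PATTERN = re.compile(r"^(lt|le|eq|ne|gt|ge|<=|>=|==|!=|<|>)\s*(-?\d+(?:\.\d+)?)$")

def parse_thresh(text):
    """Parse 'ge0.5' or '>=0.5' into ('ge', 0.5) without touching the disk."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a threshold: {text!r}")
    op, value = match.groups()
    return _BY_SYMBOL.get(op, op), float(value)

def format_thresh(abbr, value):
    """Write the threshold back in symbol form, e.g. ('eq', 1.0) -> '==1'."""
    return f"{_OPS[abbr][0]}{value:g}"

def check(abbr, value, x):
    """Apply the parsed threshold to a number."""
    return _OPS[abbr][1](x, value)

for text in ["eq1", ">=50", "<=1000", "ge0.5"]:
    abbr, value = parse_thresh(text)
    print(text, "->", format_thresh(abbr, value))
```

A malformed string such as "=1" is rejected by parse_thresh instead of
being passed through.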

On Mon, May 23, 2016 at 10:55 AM, John Halley Gotway <johnhg at ucar.edu>
wrote:

> Bob,
>
> Surprisingly, I was able to replicate the same error you're seeing!
> I see where the error is occurring, but I don't yet understand why
> it's happening.  However, I do have a workaround for you.
>
> Try editing your config file by emptying out the "fcst_thresh"
> setting:
>    fcst_thresh = [];
>
> In your job, you're using "-by fcst_thresh" anyway, so STAT-Analysis
> will group the data by the FCST_THRESH column.  After I removed the
> "fcst_thresh" setting, the job completed.  I'll continue looking into
> the reason for that error.
>
> How many of these files are you planning to pass to STAT-Analysis at
> any given time?  I see each sample file contains about 80,000 MPR
> lines.  Passing it 5 files to process about 400,000 lines, that job
> takes about 62 seconds to run on my machine.  I worry that as you
> increase the number of files, it'll run very slowly.
>
> Here's some alternative logic you might consider.  Run STAT-Analysis
> once for each .stat file you're generating.  Instead of writing the
> PSTD line type, write the PCT line type (that's just the counts of
> that probabilistic Nx2 table).  Then run jobs to aggregate the PCT
> line types and compute PSTD lines (-job aggregate_stat -line_type PCT
> -out_line_type PSTD).  That would make the processing for each
> STAT-Analysis job much more manageable.
>
> BUT in order to do this in 2 steps, you would really need to put the
> neighborhood size information into the INTERP_MTHD and INTERP_PNTS
> columns.  Otherwise, the threshold information won't be retained in
> the PCT output lines.  And I also found an issue in the formatting of
> the OBS_THRESH output column ("=1" in the output should really be
> "==1").  For now I just switched to "ge1", but I need to fix this
> formatting issue.
>
> Listed below are the c-shell commands I used to loop over your sample
> files and run stat_analysis in this way...
>
> # Loop through MPR files and compute PCT output lines
> foreach mpr_file (`ls stat_mpr/*.stat`)
>    set pct_file = `echo $mpr_file | sed 's/stat_mpr/stat_pct/g' | sed 's/\.stat/_pct.stat/g'`
>    /usr/local/met-5.1/bin/stat_analysis -lookin $mpr_file -out_stat $pct_file \
>    -job aggregate_stat -line_type MPR -out_line_type PCT \
>    -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,FCST_THRESH \
>    -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
>    -out_obs_thresh ge1
> end
>
> # Aggregate PCT lines and compute PSTD stats
> /usr/local/met-5.1/bin/stat_analysis -lookin stat_pct -out_stat pstd.stat \
>    -job aggregate_stat -line_type PCT -out_line_type PSTD \
>    -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,INTERP_MTHD,INTERP_PNTS
>
> Note that this job is combining all of the neighborhood sizes because
> INTERP_MTHD and INTERP_PNTS are set the same in the PCT lines.
>
> Thanks,
> John
>
>
>
> On Fri, May 20, 2016 at 3:33 PM, robert.craig.2 at us.af.mil via RT <
> met_help at ucar.edu> wrote:
>
>>
>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>>
>> John, I might have found more information on the error below:
>>
>> MetConfig::read_string(const char *) -> unable to open temp file
>> "config_1943_0_.temp"
>>
>> When I run stat_anal on MPR files, there are no problems if I
>> process 2 days of data, but when I increase it to three days, the
>> error occurs.  I also played with the dates to try to eliminate a
>> particular data file as the cause.  So, it seems to be related to how
>> many data files it has to process.  I posted all the data files on
>> the ftp server and my command line is below.  The config file on the
>> server is still representative.  This error does not seem related to
>> directory permissions.
>>
>> /h/WXQC/met-5.1/bin/stat_analysis \
>>    -lookin /h/data/global/WXQC/data/met/mdlob_pairs \
>>    -out /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0 \
>>    -config /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated \
>>    -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
>>    -out_obs_thresh eq1 -by FCST_VAR -by FCST_THRESH -v 6
>>
>> This is concerning since I have to run this on many more than two
>> days' worth of data.  Any idea what might be happening?
>>
>> Thanks
>> Bob
>>
>> -----Original Message-----
>> From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>> Sent: Thursday, May 19, 2016 4:31 PM
>> To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
<robert.craig.2 at us.af.mil>
>> Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>>
>> Bob,
>>
>> It's funny, I was just talking to a colleague today about doing
>> something very similar to this on a different dataset.
>>
>> I think I understand what you're saying about the thresholds >1, >25,
>> and >50.  These are used to define the "event" which is used in
>> computing the fractional coverage fields.  My confusion comes from
>> the fact that in MET currently only Grid-Stat is computing these
>> fractional coverage fields, not Point-Stat.  But I see now what
>> you're doing.
>>
>> One suggestion would be to change the contents of the INTERP_MTHD and
>> INTERP_PNTS header columns.  You currently have NEAREST, 1 which
>> would indicate that each observation value was matched to the
>> forecast value at the nearest grid point.  Instead, I would suggest
>> writing NBRHD and N, where N indicates the number of points in the
>> neighborhood.  For example, the NBRHD output from Grid-Stat would
>> write 49 if we were using a 7x7 box.
>>
>> The -fcst_thresh and -obs_thresh options are used to filter the input
>> MPR lines, as you already know.  The -out_fcst_thresh and
>> -out_obs_thresh options define the thresholds to be applied when
>> computing the output for the job.  In MET, probabilities are not
>> processed in a "continuous" way.  Instead, they are put into
>> probability bins.  Those bins are used to create an Nx2 contingency
>> table from which probabilistic statistics are computed.
>>
>> Setting "-out_fcst_thresh gt0,gt0.25,gt0.5,gt0.75,gt1.0" defines 4
>> probability bins which yields a 4x2 contingency table, from which
>> stats are computed.
>>
>> Hope that helps.
>>
>> Thanks,
>> John
>
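
Since the stat_analysis invocation earlier in this ticket was printed
as a Python argument list, the two-step c-shell workflow quoted above
can also be sketched in Python.  The option names, the by-column lists
and the install path are copied from the quoted commands.  The helper
names pct_job and pstd_job and the stat_mpr/stat_pct directory defaults
are assumptions of this sketch.  The script only prints the commands,
so it runs without MET installed.

```python
import shlex
from pathlib import Path

STAT_ANALYSIS = "/usr/local/met-5.1/bin/stat_analysis"  # install path quoted in the thread
PROB_THRESH = "ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0"

def pct_job(mpr_file, pct_dir="stat_pct"):
    """Step 1: argv that turns one MPR .stat file into PCT lines."""
    mpr_file = Path(mpr_file)
    pct_file = Path(pct_dir) / f"{mpr_file.stem}_pct.stat"
    return [
        STAT_ANALYSIS, "-lookin", str(mpr_file), "-out_stat", str(pct_file),
        "-job", "aggregate_stat", "-line_type", "MPR", "-out_line_type", "PCT",
        "-by", "MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,FCST_THRESH",
        "-out_fcst_thresh", PROB_THRESH, "-out_obs_thresh", "ge1",
    ]

def pstd_job(pct_dir="stat_pct", out_file="pstd.stat"):
    """Step 2: argv that aggregates every PCT file into PSTD lines."""
    return [
        STAT_ANALYSIS, "-lookin", str(pct_dir), "-out_stat", str(out_file),
        "-job", "aggregate_stat", "-line_type", "PCT", "-out_line_type", "PSTD",
        "-by", "MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,INTERP_MTHD,INTERP_PNTS",
    ]

if __name__ == "__main__":
    # Print the commands; replace print(...) with subprocess.run(argv, check=True)
    # on a machine where MET is installed.
    for mpr_file in sorted(Path("stat_mpr").glob("*.stat")):
        print(shlex.join(pct_job(mpr_file)))
    print(shlex.join(pstd_job()))
```

Each option and its value is a separate list element here, so no shell
quoting is needed when the list is handed to subprocess.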

------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #76361] Statanalysis Question
From: robert.craig.2 at us.af.mil
Time: Mon May 23 12:24:44 2016

Thanks for looking into this.  I am currently checking out your PCT
recommendation.  I will let you know if I have more questions.

Bob
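
The PCT recommendation relies on the Nx2 counts being additive: adding
the per-file tables bin by bin gives the same table as counting all of
the pairs at once.  A minimal Python sketch with made-up pairs follows.
The helper names and the four equal-width bins are assumptions of this
sketch, not MET behaviour.

```python
def count_table(pairs, n_bins):
    """Nx2 counts for probabilities in 0..1 split into n_bins equal-width bins."""
    table = [[0, 0] for _ in range(n_bins)]
    for prob, obs in pairs:
        i = min(int(prob * n_bins), n_bins - 1)  # a probability of 1.0 goes in the top bin
        table[i][0 if obs == 1 else 1] += 1
    return table

def add_tables(a, b):
    """Add two Nx2 tables bin by bin."""
    return [[x1 + x2, y1 + y2] for (x1, y1), (x2, y2) in zip(a, b)]

# Two hypothetical "files" of (forecast probability, observation) pairs.
day1 = [(0.0, 0), (0.22, 0), (0.56, 1), (1.0, 1)]
day2 = [(0.11, 0), (0.33, 1), (0.78, 1), (0.89, 0)]

per_file = add_tables(count_table(day1, 4), count_table(day2, 4))
pooled = count_table(day1 + day2, 4)
print(per_file)
print(per_file == pooled)
```

The same caveat from the quoted reply applies: tables should only be
added when they belong to the same group, which is why the neighborhood
size needs to be carried in INTERP_MTHD and INTERP_PNTS.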

-----Original Message-----
From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
Sent: Monday, May 23, 2016 1:04 PM
To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
<robert.craig.2 at us.af.mil>
Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question

Bob,

I wanted to give you an update on the two issues you found...

(1) I found and fixed a pesky little bug in STAT-Analysis.  It was
writing "=1" to the output stat files when it should be writing "==1".
That was a one line fix.

(2) I now understand the original problem you wrote about.  It shows
up when you use the -fcst_thresh, -obs_thresh, or -cov_thresh job
command options to filter the input data.  When STAT-Analysis
processes those threshold-type columns, it's actually writing a tiny
temp file and reading that back in.  After doing that about 65,528
times (at least on my machine), I get that error.  Writing that many
little temp files is wreaking havoc.

We'll look into (2) some more to see if we can find an alternative way
of processing this data.  I'll hold off on posting the (1) bugfix to
see if I can bundle a bugfix for (2) with it.

Thanks,
John

On Mon, May 23, 2016 at 10:55 AM, John Halley Gotway <johnhg at ucar.edu>
wrote:

> Bob,
>
> Surprisingly, I was able to replicate the same error you're seeing!
> I see where the error is occurring, but I don't yet understand why
> it's happening.  However, I do have a workaround for you.
>
> Try editing your config file by emptying out the "fcst_thresh"
> setting:
>    fcst_thresh = [];
>
> In your job, you're using "-by fcst_thresh" anyway, so STAT-Analysis
> will group the data by the FCST_THRESH column.  After I removed the
> "fcst_thresh" setting, the job completed.  I'll continue looking into
> the reason for that error.
>
> How many of these files are you planning to pass to STAT-Analysis at
> any given time?  I see each sample file contains about 80,000 MPR
> lines.  Passing it 5 files to process about 400,000 lines, that job
> takes about 62 seconds to run on my machine.  I worry that as you
> increase the number of files, it'll run very slowly.
>
> Here's some alternative logic you might consider.  Run STAT-Analysis
> once for each .stat file you're generating.  Instead of writing the
> PSTD line type, write the PCT line type (that's just the counts of
> that probabilistic Nx2 table).  Then run jobs to aggregate the PCT
> line types and compute PSTD lines (-job aggregate_stat -line_type PCT
> -out_line_type PSTD).  That would make the processing for each
> STAT-Analysis job much more manageable.
>
> BUT in order to do this in 2 steps, you would really need to put the
> neighborhood size information into the INTERP_MTHD and INTERP_PNTS
> columns.  Otherwise, the threshold information won't be retained in
> the PCT output lines.  And I also found an issue in the formatting of
> the OBS_THRESH output column ("=1" in the output should really be
> "==1").  For now I just switched to "ge1", but I need to fix this
> formatting issue.
>
> Listed below are the c-shell commands I used to loop over your sample
> files and run stat_analysis in this way...
>
> # Loop through MPR files and compute PCT output lines
> foreach mpr_file (`ls stat_mpr/*.stat`)
>    set pct_file = `echo $mpr_file | sed 's/stat_mpr/stat_pct/g' | sed 's/\.stat/_pct.stat/g'`
>    /usr/local/met-5.1/bin/stat_analysis -lookin $mpr_file -out_stat $pct_file \
>    -job aggregate_stat -line_type MPR -out_line_type PCT \
>    -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,FCST_THRESH \
>    -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
>    -out_obs_thresh ge1
> end
>
> # Aggregate PCT lines and compute PSTD stats
> /usr/local/met-5.1/bin/stat_analysis -lookin stat_pct -out_stat pstd.stat \
>    -job aggregate_stat -line_type PCT -out_line_type PSTD \
>    -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,INTERP_MTHD,INTERP_PNTS
>
> Note that this job is combining all of the neighborhood sizes because
> INTERP_MTHD and INTERP_PNTS are set the same in the PCT lines.
>
> Thanks,
> John
>
>
>
> On Fri, May 20, 2016 at 3:33 PM, robert.craig.2 at us.af.mil via RT <
> met_help at ucar.edu> wrote:
>
>>
>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>>
>> John, I might have found more information on the error below:
>>
>> MetConfig::read_string(const char *) -> unable to open temp file
>> "config_1943_0_.temp"
>>
>> When I run stat_anal on MPR files, no problems if I process 2 days
>> of data, but when I increase it to three days, the error occurs.  I
>> also played with the dates to try to eliminate a particular data
>> file as the cause.  So, it seems to be related to how many data
>> files it has to process.  I posted all the data files on the ftp
>> server and my command line is below.  The config file on the server
>> is still representative.  This error does not seem related to
>> directory permissions.
>>
>> /h/WXQC/met-5.1/bin/stat_analysis  -lookin
>> /h/data/global/WXQC/data/met/mdlob_pairs -out
>> /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0
>> -config
>> /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
>> -out_fcst_thresh
>> ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0
>> -out_obs_thresh eq1 -by FCST_VAR -by FCST_THRESH -v 6
>>
>> This is concerning since I have to run this on many more than two
>> days' worth of data.  Any idea what might be happening?
>>
>> Thanks
>> Bob
>>
>> -----Original Message-----
>> From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>> Sent: Thursday, May 19, 2016 4:31 PM
>> To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
>> <robert.craig.2 at us.af.mil>
>> Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>>
>> Bob,
>>
>> It's funny, I was just talking to a colleague today about doing
>> something very similar to this on a different dataset.
>>
>> I think I understand what you're saying about the thresholds >1, >25,
>> and >50.  These are used to define the "event" which is used in
>> computing the fractional coverage fields.  My confusion comes from
>> the fact that in MET currently only Grid-Stat is computing these
>> fractional coverage fields, not Point-Stat.  But I see now what
>> you're doing.
>>
>> One suggestion would be to change the contents of the INTERP_MTHD and
>> INTERP_PNTS header columns.  You currently have NEAREST, 1 which
>> would indicate that each observation value was matched to the
>> forecast value at the nearest grid point.  Instead, I would suggest
>> writing NBRHD and N, where N indicates the number of points in the
>> neighborhood.  For example, the NBRHD output from Grid-Stat would
>> write 49 if we were using a 7x7 box.
>>
>> The -fcst_thresh and -obs_thresh options are used to filter the input
>> MPR lines, as you already know.  The -out_fcst_thresh and
>> -out_obs_thresh options define the thresholds to be applied when
>> computing the output for the job.  In MET, probabilities are not
>> processed in a "continuous" way.  Instead, they are put into
>> probability bins.  Those bins are used to create an Nx2 contingency
>> table from which probabilistic statistics are computed.
>>
>> Setting "-out_fcst_thresh gt0,gt0.25,gt0.5,gt0.75,gt1.0" defines 4
>> probability bins which yields a 4x2 contingency table, from which
>> stats are computed.
>>
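The binning described above can be illustrated with a short Python sketch. It is not MET code: the function name `build_nx2_table`, the sample pairs, and the convention that a probability equal to the top threshold falls in the last bin are assumptions made for illustration. Thresholds 0, 0.25, 0.5, 0.75, 1.0 give 4 bins, and each forecast/observation pair adds one count to the event or non-event column of its bin.

```python
from bisect import bisect_right


def build_nx2_table(pairs, edges):
    """Count (forecast probability, observed event) pairs into an Nx2 table.

    edges: ascending bin boundaries, e.g. [0, 0.25, 0.5, 0.75, 1.0] -> 4 bins.
    Returns one [event_count, non_event_count] row per bin.
    """
    n_bins = len(edges) - 1
    table = [[0, 0] for _ in range(n_bins)]
    for prob, event in pairs:
        if not edges[0] <= prob <= edges[-1]:
            raise ValueError(f"probability {prob} is outside the bin range")
        # Assumption: bins are [lo, hi); a value equal to the top edge
        # is counted in the last bin.
        row = min(bisect_right(edges, prob) - 1, n_bins - 1)
        table[row][0 if event else 1] += 1
    return table


# Hypothetical matched pairs: (forecast probability, observation 0 or 1).
pairs = [(0.0, 0), (0.11, 0), (0.33, 1), (0.56, 0), (0.78, 1), (1.0, 1)]
edges = [0, 0.25, 0.5, 0.75, 1.0]
table = build_nx2_table(pairs, edges)
for lo, hi, row in zip(edges, edges[1:], table):
    print(f"{lo:.2f}-{hi:.2f}: events={row[0]} non_events={row[1]}")
```

A single threshold such as ge0 would place every pair in one bin, which leaves nothing to distinguish high-probability forecasts from low-probability ones.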
>> Hope that helps.
>>
>> Thanks,
>> John
>>
>> On Thu, May 19, 2016 at 2:23 PM, robert.craig.2 at us.af.mil via RT <
>> met_help at ucar.edu> wrote:
>>
>> >
>> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>> >
>> > Thanks John, the directory problem was due to some corruption on
>> > the MET we had on one of the systems.  On another system the
>> > problem doesn't come up so we are hoping a recompile of MET on said
>> > system will clear up the issue.
>> >
>> > As far as the second comment, I don't think you interpreted what I
>> > am doing correctly.  In each file, there are three sets of data for
>> > each variable.  They are not identical since the first set is the
>> > ob neighborhood data for precip > 1.  The next set is the ob
>> > neighborhood data for precip > 25, and the same for precip > 50.
>> > If you compare the model and ob data, the model data should be
>> > different (for some obs) than the model data for the previous
>> > category.  The neighborhoods around each ob site follow the HiRA
>> > method.  All the data in the mpr file lines are probabilities, so I
>> > want to create PSTD from these data.  So I was using the
>> > fcst_thresh to filter for the HiRA thresholds I am interested in.
>> > I tried your code and added -by FCST_THRESH and got out three
>> > different sets of values.  So my question is why did you have the
>> > -out_fcst_thresh set to 10 prob thresholds?  Why wouldn't ge0 pick
>> > up all probabilities?  I am not understanding how these thresholds
>> > are being used.
>> >
>> > Thanks
>> > Bob
>> >
>> > -----Original Message-----
>> > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>> > Sent: Monday, May 16, 2016 4:54 PM
>> > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
>> > <robert.craig.2 at us.af.mil>
>> > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>> >
>> > Bob,
>> >
>> > Thanks for sending the sample data.  I agree that STAT-Analysis can
>> > get pretty confusing.  It has a lot of flexibility, but we really
>> > need to think through what you're trying to do.
>> >
>> > First, regarding the error you're getting.  Unfortunately, the
>> > config file string parser is writing a temp file in the current
>> > "runtime" directory.  The error is from the fact that you don't
>> > have permission to write the file "config_23325_0_.temp" in the
>> > current directory.  Ultimately, we should change that to use the
>> > temp directory instead.
>> >
>> > Next, I looked at the data you sent to me.  Listed below are the
>> > unique combinations of just a few of the header columns:
>> >
>> > FCST_VAR  FCST_THRESH  LINE_TYPE  TOTAL
>> > APCP      >=1          MPR        14666
>> > APCP      >=25         MPR        14666
>> > APCP      >=50         MPR        14666
>> > CEIL      <=1000       MPR        11926
>> > CEIL      <=100        MPR        11926
>> > CEIL      <=300        MPR        11926
>> >
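A summary like the table above can be produced by counting lines per unique combination of header columns. The Python sketch below shows one way to do that; it is an illustration, not the method used in this reply. The function name and the five-column sample lines are hypothetical (real .stat files carry many more columns), and it assumes the first line is a whitespace-delimited header.

```python
from collections import Counter


def count_combinations(lines, names):
    """Count data lines per unique combination of the named header columns."""
    header = lines[0].split()
    idx = [header.index(name) for name in names]
    counts = Counter()
    for line in lines[1:]:
        cols = line.split()
        if cols:  # skip blank lines
            counts[tuple(cols[i] for i in idx)] += 1
    return counts


# Hypothetical, heavily trimmed sample; not taken from the ticket's data files.
sample = [
    "FCST_VAR FCST_THRESH LINE_TYPE FCST OBS",
    "APCP >=1 MPR 0.11 0",
    "APCP >=1 MPR 0.33 1",
    "APCP >=25 MPR 0.00 0",
    "CEIL <=300 MPR 0.22 1",
]
counts = count_combinations(sample, ["FCST_VAR", "FCST_THRESH", "LINE_TYPE"])
for key, n in sorted(counts.items()):
    print(*key, n)
```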
>> > Based on this, it looks like you have a lot of duplicate matched
>> > pair (MPR) output lines... We have the same 14666 pairs for APCP
>> > repeated 3 times followed by the same 11926 pairs for CEIL
>> > repeated 3 times.  This isn't necessary.  Instead, the FCST_THRESH
>> > and OBS_THRESH columns for the MPR line type should be set to
>> > "NA".  The MPR line type that Point-Stat creates just contains the
>> > paired forecast and observation values.  Thresholds do not apply
>> > to this line type.
>> >
>> > I posted an updated version of your file to the ftp site.  I
>> > stripped it down to 14666 APCP lines and 11926 CEIL lines with NA
>> > in the FCST_THRESH and OBS_THRESH columns:
>> >
>> > ftp://ftp.rap.ucar.edu/incoming/irap/met_help/craig_data/point_stat_3_galwem_120000L_20160501_120000V_JHG.stat
>> >
>> > Looking at the values in the FCST column, I see numbers between 0
>> > and 1 (0, 0.11, 0.22, 0.33, ...).  Looking in the OBS column, I see
>> > 2 numbers (0 or 1).  And looking at your config file, it looks like
>> > you want to use these MPR lines to compute probabilistic output.
>> > MET verifies probabilities using an Nx2 contingency table.  You use
>> > "-out_fcst_thresh" to select the probabilistic thresholds to be
>> > applied and "-out_obs_thresh" to select the observation threshold
>> > to be applied.
>> >
>> > Here's a stat-analysis job you could run to read the MPR lines,
>> > define the probabilistic forecast thresholds, define the single
>> > observation threshold, and compute a PSTD output line.  Using
>> > "-by FCST_VAR" tells it to run the job separately for each unique
>> > entry found in the FCST_VAR column.
>> >
>> > /usr/local/met-5.1/bin/stat_analysis \
>> >    -lookin point_stat_3_galwem_120000L_20160501_120000V_JHG.stat \
>> >    -job aggregate_stat -line_type MPR -out_line_type PSTD \
>> >    -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
>> >    -out_obs_thresh eq1.0 \
>> >    -by FCST_VAR \
>> >    -out_stat out_pstd.txt
>> >
>> > The output statistics are written to "out_pstd.txt".
>> >
>> > Hope that helps.
>> >
>> > John
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > On Mon, May 16, 2016 at 2:47 PM, robert.craig.2 at us.af.mil via RT <
>> > met_help at ucar.edu> wrote:
>> >
>> > >
>> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>> > >
>> > > Thanks John, I knew the data was space delimited but forgot to
>> > > check the header.  As usual with MET, I progressed further but am
>> > > hitting a new error.  See below.  I pushed the config file to the
>> > > ftp directory.  As you can see, the -tmp_dir is set to
>> > > /h/data/global/WXQC/data/met/tmp.  This directory's permissions
>> > > are wide open - in fact stat_anal temp files are in there.  Does
>> > > the config*.temp try to write somewhere else?
>> > >
>> > > Also, notice in the command line options, there are three
>> > > thresholds.  MET kept telling me that I had to have three since
>> > > this is probability data.  Also, the latest MPR files (.stat) are
>> > > in the ftp dir.  As you can see I generated model/ob pairs using
>> > > different thresholds for the forecast and observation data.  So
>> > > this is where I get confused: I assume the fcst thresh in the
>> > > config file is a filter to pull those lines that have the
>> > > threshold I want.  I am not sure what the -out_fcst_thresh in the
>> > > command line is doing.  If it is filtering the mpr line fcst
>> > > data, then I would think I would set it to ge 0 for the fcst and
>> > > ob since the fcst and ob data range from 0 to 1.  Am I handling
>> > > this correctly?
>> > >
>> > > Thanks
>> > > Bob
>> > >
>> > > ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
>> > > '/h/data/global/WXQC/data/met/mdlob_pairs', '-out',
>> > > '/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z',
>> > > '-config',
>> > > '/h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated',
>> > > '-out_fcst_thresh ge0,ge0.5,ge1', '-out_obs_thresh ge0', '-v', '6']
>> > > DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z"
>> > > DEBUG 1: Default Config File: /home/qcteam/met-5.1/share/met/config/STATAnalysisConfig_default
>> > > DEBUG 1: User Config File: /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
>> > > DEBUG 4: Default Job from the config file: "-model GALWEM -fcst_lead 120000 -fcst_init_beg 20160501_000000 -fcst_init_end 20160502_000000 -fcst_init_hour 000000 -fcst_var APCP -fcst_thresh >=1 -line_type MPR -vif_flag 1 "
>> > > DEBUG 4: Amending default job with command line options: "-out_fcst_thresh ge0,ge0.5,ge1 -out_obs_thresh ge0"
>> > > DEBUG 3: Processing STAT file "/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat" ... 1 of 10
>> > > ERROR  :
>> > > ERROR  :
>> > > ERROR  :   MetConfig::read_string(const char *) -> unable to open temp file "config_23325_0_.temp"
>> > > ERROR  :
>> > >
>> > > -----Original Message-----
>> > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>> > > Sent: Friday, May 13, 2016 6:27 PM
>> > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
>> > > <robert.craig.2 at us.af.mil>
>> > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>> > >
>> > > Bob,
>> > >
>> > > The problem is coming from the first line of the file you sent to
>> > > me.  It contains a comma-separated list of header column names.
>> > >
>> > > I'm not exactly sure where you pulled those header column names,
>> > > but that's the problem.  MET expects data to be separated by
>> > > whitespace... so it interprets that long string with a bunch of
>> > > commas as a single column.  The error comes when it tries to read
>> > > the "second" column.  If you just remove that first line, it
>> > > should run fine.
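The failure mode described above can be reproduced with a small Python sketch. It only mimics the behaviour: the `get_item` function here is a hypothetical stand-in for the MET routine named in the error message, and the four column names are a short sample rather than the full MPR header. Splitting on whitespace turns a comma-separated header into a single column, so asking for the second column fails a range check.

```python
def get_item(line, index):
    """Return the whitespace-delimited column at `index`, or raise IndexError."""
    columns = line.split()
    if not 0 <= index < len(columns):
        raise IndexError(f"range check error: column {index} of {len(columns)}")
    return columns[index]


comma_header = "VERSION,MODEL,FCST_LEAD,FCST_VALID_BEG"
space_header = "VERSION MODEL FCST_LEAD FCST_VALID_BEG"

# The comma-separated header contains no whitespace, so it is one column.
print(len(comma_header.split()), len(space_header.split()))
print(get_item(space_header, 1))
try:
    get_item(comma_header, 1)
except IndexError as err:
    print(err)
```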
>> > >
>> > > If you do want header columns, here's a trick.  Run the following
>> > > job:
>> > >
>> > > stat_analysis -lookin point_stat_3_galwem_120000L_20160501_120000V.stat \
>> > >    -job filter -line_type MPR -dump_row out.stat
>> > >
>> > > The file out.stat will now contain the full header for the MPR
>> > > line type.  When you select a single LINE_TYPE value,
>> > > stat-analysis will write the full header for that line type to
>> > > the output.
>> > >
>> > > Have a good weekend.
>> > >
>> > > John
>> > >
>> > >
>> > > On Fri, May 13, 2016 at 10:26 AM, robert.craig.2 at us.af.mil via RT
>> > > < met_help at ucar.edu> wrote:
>> > >
>> > > >
>> > > > Fri May 13 10:26:26 2016: Request 76361 was acted upon.
>> > > > Transaction: Ticket created by robert.craig.2 at us.af.mil
>> > > >        Queue: met_help
>> > > >      Subject: Statanalysis Question
>> > > >        Owner: Nobody
>> > > >   Requestors: robert.craig.2 at us.af.mil
>> > > >       Status: new
>> > > >  Ticket <URL:
>> > > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361
>> > > > >
>> > > >
>> > > >
>> > > > John, I am getting the following error when running
>> > > > statanalysis.  The forecast times and valid times seem to be
>> > > > correct to me, so I am not sure of the cause of the error.
>> > > >
>> > > > /h/WXQC/met-5.1/bin/stat_analysis -lookin /h/data/global/WXQC/data/met/mdlob_pairs -out /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z -config /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated -v 6
>> > > > DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z"
>> > > > DEBUG 1: Default Config File: /home/qcteam/met-5.1/share/met/config/STATAnalysisConfig_default
>> > > > DEBUG 1: User Config File: /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
>> > > > DEBUG 4: Default Job from the config file: "-model GALWEM -fcst_lead 120000 -fcst_init_beg 20160501_000000 -fcst_init_end 20160501_000000 -fcst_init_hour 120000 -fcst_var APCP -fcst_thresh >=50 -line_type MPR -vif_flag 1 "
>> > > > DEBUG 4: Amending default job with command line options: "(nul)"
>> > > > DEBUG 3: Processing STAT file "/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat" ... 1 of 2
>> > > > ERROR  :
>> > > > ERROR  : DataLine::get_item(int) -> range check error
>> > > > ERROR  :
>> > > >
>> > > > The config file and data file are on your server.
>> > > >
>> > > >
>> > > >
>> > >
>> > >
>> > >
>> > >
>> >
>> >
>> >
>> >
>>
>>
>>
>>
>



------------------------------------------------
Subject: Statanalysis Question
From: John Halley Gotway
Time: Mon May 23 14:50:36 2016

Bob,

Good news, we found and fixed the problem.  The vx_config library was
opening up files but failing to close them.  After we hit the upper
limit (getconf OPEN_MAX = 65536 on my machine) for the number of open
files, we get that error message.  It's a simple fix to make the
vx_config library close the files that it opens.  And I'm now able to
process all of the sample MPR lines you sent me.

Unfortunately, these 2 fixes will require that MET be recompiled with
the set of patches I just posted:
   http://www.dtcenter.org/met/users/support/known_issues/METv5.1/index.php

Please let me know if you have more questions or issues.

John
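
The leak described above can be illustrated outside of MET. The following Python sketch is not the vx_config code: the function names, the lowered limit of 128, and the file contents are assumptions chosen for illustration, and it relies on the POSIX-only `resource` module. It leaves temp files open until the process runs out of descriptors, then repeats the loop while closing each file.

```python
import errno
import os
import resource
import tempfile


def write_temp_files(count, close_each):
    """Create up to `count` tiny temp files; return how many were created."""
    held = []  # descriptors left open on purpose when close_each is False
    created = 0
    try:
        for _ in range(count):
            try:
                fd, path = tempfile.mkstemp(suffix=".temp")
            except OSError as err:
                if err.errno == errno.EMFILE:  # "too many open files"
                    break
                raise
            os.write(fd, b"ge0.5\n")
            os.unlink(path)  # POSIX: the file goes away once the descriptor closes
            if close_each:
                os.close(fd)
            else:
                held.append(fd)
            created += 1
    finally:
        for fd in held:
            os.close(fd)
    return created


def demo(limit=128, count=300):
    """Lower the soft open-file limit, then compare leaking with closing."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY and hard < limit:
        limit = hard
    resource.setrlimit(resource.RLIMIT_NOFILE, (limit, hard))
    try:
        leaked = write_temp_files(count, close_each=False)
        closed = write_temp_files(count, close_each=True)
    finally:
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
    return leaked, closed


leaked, closed = demo()
print(f"without close: {leaked} files before failure; with close: {closed} files")
```

With the descriptors closed after each use, the number of temp files written is no longer bounded by the open-file limit, which matches the behaviour reported after the fix.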


On Mon, May 23, 2016 at 12:24 PM, robert.craig.2 at us.af.mil via RT <
met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>
> Thanks for looking into this.  I am currently checking out your PCT
> recommendation.  I will let you know if I have more questions.
>
> Bob
>
> -----Original Message-----
> From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> Sent: Monday, May 23, 2016 1:04 PM
> To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> <robert.craig.2 at us.af.mil>
> Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
> Bob,
>
> I wanted to give you an update on the two issues you found...
>
> (1) I found and fixed a pesky little bug in STAT-Analysis.  It was
> writing "=1" to the output stat files when it should be writing "==1".
> That was a one line fix.
>
> (2) I now understand the original problem you wrote about.  It shows
> up when you use the -fcst_thresh, -obs_thresh, or -cov_thresh job
> command options to filter the input data.  When STAT-Analysis
> processes those threshold-type columns, it's actually writing a tiny
> temp file and reading that back in.  After doing that about 65,528
> times (at least on my machine), I get that error.  Writing that many
> little temp files is wreaking havoc.
>
> We'll look into (2) some more to see if we can find an alternative way
> of processing this data.  I'll hold off on posting the (1) bugfix to
> see if I can bundle a bugfix for (2) with it.
>
> Thanks,
> John
>
> On Mon, May 23, 2016 at 10:55 AM, John Halley Gotway <johnhg at ucar.edu>
> wrote:
>
> > Bob,
> >
> > Surprisingly, I was able to replicate the same error you're seeing!
> > I see where the error is occurring, but I don't yet understand why
> > it's happening.  However, I do have a workaround for you.
> >
> > Try editing your config file by emptying out the "fcst_thresh"
> > setting:
> >    fcst_thresh = [];
> >
> > In your job, you're using "-by fcst_thresh" anyway, so
> > STAT-Analysis will group the data by the FCST_THRESH column.  After
> > I removed the "fcst_thresh" setting, the job completed.  I'll
> > continue looking into the reason for that error.
> >
> > How many of these files are you planning to pass to STAT-Analysis at
> > any given time?  I see each sample file contains about 80,000 MPR
> > lines.  Passing it 5 files to process about 400,000 lines, that job
> > takes about 62 seconds to run on my machine.  I worry that as you
> > increase the number of files, it'll run very slowly.
> >
> > Here's some alternative logic you might consider.  Run STAT-Analysis
> > once for each .stat file you're generating.  Instead of writing the
> > PSTD line type, write the PCT line type (that's just the counts of
> > that probabilistic Nx2 table).  Then run jobs to aggregate the PCT
> > line types and compute PSTD lines (-job aggregate_stat -line_type PCT
> > -out_line_type PSTD).  That would make the processing for each
> > STAT-Analysis job much more manageable.
> >
> > BUT in order to do this in 2 steps, you would really need to put the
> > neighborhood size information into the INTERP_MTHD and INTERP_PNTS
> > columns.  Otherwise, the threshold information won't be retained in
> > the PCT output lines.  And I also found an issue in the formatting of
> > the OBS_THRESH output column ("=1" in the output should really be
> > "==1").  For now I just switched to "ge1", but I need to fix this
> > formatting issue.
> >
> > Listed below are the c-shell commands I used to loop over your
> > sample files and run stat_analysis in this way...
> >
> > # Loop through MPR files and compute PCT output lines
> > foreach mpr_file (`ls stat_mpr/*.stat`)
> >    set pct_file = `echo $mpr_file | sed 's/stat_mpr/stat_pct/g' | sed 's/.stat/_pct.stat/g'`
> >    /usr/local/met-5.1/bin/stat_analysis -lookin $mpr_file -out_stat $pct_file \
> >    -job aggregate_stat -line_type MPR -out_line_type PCT \
> >    -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,FCST_THRESH \
> >    -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
> >    -out_obs_thresh ge1
> > end
> >
> > # Aggregate PCT lines and compute PSTD stats
> > /usr/local/met-5.1/bin/stat_analysis -lookin stat_pct -out_stat pstd.stat \
> >    -job aggregate_stat -line_type PCT -out_line_type PSTD \
> >    -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,INTERP_MTHD,INTERP_PNTS
> >
> > Note that this job is combining all of the neighborhood sizes
> > because INTERP_MTHD and INTERP_PNTS are set the same in the PCT lines.
> >
> > Thanks,
> > John
> >
> >
> >
> > On Fri, May 20, 2016 at 3:33 PM, robert.craig.2 at us.af.mil via RT <
> > met_help at ucar.edu> wrote:
> >
> >>
> >> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
> >>
> >> John, I might have found more information on the error below:
> >>
> >> MetConfig::read_string(const char *) -> unable to open temp file
> >> "config_1943_0_.temp"
> >>
> >> When I run stat_anal on MPR files, no problems if I process 2 days
> >> of data, but when I increase it to three days, the error occurs.  I
> >> also played with the dates to try to eliminate a particular data
> >> file as the cause.  So, it seems to be related to how many data
> >> files it has to process.  I posted all the data files on the ftp
> >> server and my command line is below.  The config file on the server
> >> is still representative.  This error does not seem related to
> >> directory permissions.
> >>
> >> /h/WXQC/met-5.1/bin/stat_analysis  -lookin
> >> /h/data/global/WXQC/data/met/mdlob_pairs -out
> >> /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0
> >> -config
> >> /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> >> -out_fcst_thresh
> >> ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0
> >> -out_obs_thresh eq1 -by FCST_VAR -by FCST_THRESH -v 6
> >>
> >> This is concerning since I have to run this on many more than two
> >> days' worth of data.  Any idea what might be happening?
> >>
> >> Thanks
> >> Bob
> >>
> >> -----Original Message-----
> >> From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> >> Sent: Thursday, May 19, 2016 4:31 PM
> >> To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> >> <robert.craig.2 at us.af.mil>
> >> Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
> >>
> >> Bob,
> >>
> >> It's funny, I was just talking to a colleague today about doing
> >> something very similar to this on a different dataset.
> >>
> >> I think I understand what you're saying about the thresholds >1,
> >> >25, and >50.  These are used to define the "event" which is used
> >> in computing the fractional coverage fields.  My confusion comes
> >> from the fact that in MET currently only Grid-Stat is computing
> >> these fractional coverage fields, not Point-Stat.  But I see now
> >> what you're doing.
> >>
> >> One suggestion would be to change the contents of the INTERP_MTHD
> >> and INTERP_PNTS header columns.  You currently have NEAREST, 1 which
> >> would indicate that each observation value was matched to the
> >> forecast value at the nearest grid point.  Instead, I would suggest
> >> writing NBRHD and N, where N indicates the number of points in the
> >> neighborhood.  For example, the NBRHD output from Grid-Stat would
> >> write 49 if we were using a 7x7 box.
> >>
> >> The -fcst_thresh and -obs_thresh options are used to filter the
> >> input MPR lines, as you already know.  The -out_fcst_thresh and
> >> -out_obs_thresh options define the thresholds to be applied when
> >> computing the output for the job.  In MET, probabilities are not
> >> processed in a "continuous" way.  Instead, they are put into
> >> probability bins.  Those bins are used to create an Nx2 contingency
> >> table from which probabilistic statistics are computed.
> >>
> >> Setting "-out_fcst_thresh gt0,gt0.25,gt0.5,gt0.75,gt1.0" defines 4
> >> probability bins which yields a 4x2 contingency table, from which
> >> stats are computed.
> >>
> >> Hope that helps.
> >>
> >> Thanks,
> >> John
> >>
> >> On Thu, May 19, 2016 at 2:23 PM, robert.craig.2 at us.af.mil via RT <
> >> met_help at ucar.edu> wrote:
> >>
> >> >
> >> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
> >> >
> >> > Thanks John, the directory problem was due to some corruption on
> >> > the MET we had on one of the systems.  On another system the
> >> > problem doesn't come up so we are hoping a recompile of MET on
> >> > said system will clear up the issue.
> >> >
> >> > As far as the second comment, I don't think you interpreted what
> >> > I am doing correctly.  In each file, there are three sets of data
> >> > for each variable.  They are not identical since the first set is
> >> > the ob neighborhood data for precip > 1.  The next set is the ob
> >> > neighborhood data for precip > 25, and the same for precip > 50.
> >> > If you compare the model and ob data, the model data should be
> >> > different (for some obs) than the model data for the previous
> >> > category.  The neighborhoods around each ob site follow the HiRA
> >> > method.  All the data in the mpr file lines are probabilities, so
> >> > I want to create PSTD from these data.  So I was using the
> >> > fcst_thresh to filter for the HiRA thresholds I am interested in.
> >> > I tried your code and added -by FCST_THRESH and got out three
> >> > different sets of values.  So my question is why did you have the
> >> > -out_fcst_thresh set to 10 prob thresholds?  Why wouldn't ge0
> >> > pick up all probabilities?  I am not understanding how these
> >> > thresholds are being used.
> >> >
> >> > Thanks
> >> > Bob
> >> >
> >> > -----Original Message-----
> >> > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> >> > Sent: Monday, May 16, 2016 4:54 PM
> >> > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> >> > <robert.craig.2 at us.af.mil>
> >> > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
> >> >
> >> > Bob,
> >> >
> >> > Thanks for sending the sample data.  I agree that STAT-Analysis
> >> > can get pretty confusing.  It has a lot of flexibility, but we
> >> > really need to think through what you're trying to do.
> >> >
> >> > First, regarding the error you're getting.  Unfortunately, the
> >> > config file string parser is writing a temp file in the current
> >> > "runtime" directory.  The error is from the fact that you don't
> >> > have permission to write the file "config_23325_0_.temp" in the
> >> > current directory.  Ultimately, we should change that to use the
> >> > temp directory instead.
> >> >
> >> > Next, I looked at the data you sent to me.  Listed below are
the
> >> > unique combinations of just a few of the header columns:
> >> >
> >> > FCST_VAR  FCST_THRESH  LINE_TYPE  TOTAL
> >> > APCP      >=1          MPR        14666
> >> > APCP      >=25         MPR        14666
> >> > APCP      >=50         MPR        14666
> >> > CEIL      <=1000       MPR        11926
> >> > CEIL      <=100        MPR        11926
> >> > CEIL      <=300        MPR        11926
> >> >
> >> > Based on this, it looks like you have a lot of duplicate matched
> >> > pair (MPR) output lines... We have the same 14666 pairs for APCP
> >> > repeated 3 times followed by the same 11926 pairs for CEIL
> >> > repeated 3 times.  This isn't necessary.  Instead, the FCST_THRESH
> >> > and OBS_THRESH columns for the MPR line type should be set to
> >> > "NA".  The MPR line type that Point-Stat creates just contains
> >> > the paired forecast and observation values.  Thresholds do not
> >> > apply to this line type.
> >> >
> >> > I posted an updated version of your file to the ftp site.  I
> >> > stripped it down to 14666 APCP lines and 11926 CEIL lines with
NA
> >> > in the FCST_THRESH and OBS_THRESH columns:
> >> >
> >> >
> >> > ftp://ftp.rap.ucar.edu/incoming/irap/met_help/craig_data/point_stat_3_galwem_120000L_20160501_120000V_JHG.stat
> >> >
> >> > Looking at the values in the FCST column, I see numbers between 0
> >> > and 1 (0, 0.11, 0.22, 0.33, ...).  Looking in the OBS column, I
> >> > see 2 numbers (0 or 1).  And looking at your config file, it
> >> > looks like you want to use these MPR lines to compute
> >> > probabilistic output.  MET verifies probabilities using an Nx2
> >> > contingency table.  You use "-out_fcst_thresh" to select the
> >> > probabilistic thresholds to be applied and "-out_obs_thresh" to
> >> > select the observation threshold to be applied.
> >> >
> >> > Here's a stat-analysis job you could run to read the MPR lines,
> >> > define the probabilistic forecast thresholds, define the single
> >> > observation threshold, and compute a PSTD output line.  Using
> >> > "-by FCST_VAR" tells it to run the job separately for each unique
> >> > entry found in the FCST_VAR column.
> >> >
> >> > /usr/local/met-5.1/bin/stat_analysis \
> >> >    -lookin point_stat_3_galwem_120000L_20160501_120000V_JHG.stat \
> >> >    -job aggregate_stat -line_type MPR -out_line_type PSTD \
> >> >    -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
> >> >    -out_obs_thresh eq1.0 \
> >> >    -by FCST_VAR \
> >> >    -out_stat out_pstd.txt
> >> >
> >> > The output statistics are written to "out_pstd.txt".
> >> >
> >> > Hope that helps.
> >> >
> >> > John
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > On Mon, May 16, 2016 at 2:47 PM, robert.craig.2 at us.af.mil via RT <
> >> > met_help at ucar.edu> wrote:
> >> >
> >> > >
> >> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
> >> > >
> >> > > Thanks John, I knew the data was space delimited but forgot to
> >> > > check the header.  As usual with MET, I progressed further but
> >> > > am hitting a new error.  See below.  I pushed the config file
> >> > > to the ftp directory.  As you can see, the -tmp_dir is set to
> >> > > /h/data/global/WXQC/data/met/tmp.  This directory's permissions
> >> > > are wide open - in fact stat_anal temp files are in there.
> >> > > Does the config*.temp try to write somewhere else?
> >> > >
> >> > > Also, notice in the command line options, there are three
> >> > > thresholds.  MET kept telling me that I had to have three since
> >> > > this is probability data.  Also, the latest MPR files (.stat)
> >> > > are in the ftp dir.  As you can see I generated model/ob pairs
> >> > > using different thresholds for the forecast and observation
> >> > > data.  So this is where I get confused: I assume the fcst
> >> > > thresh in the config file is a filter to pull those lines that
> >> > > have the threshold I want.  I am not sure what the
> >> > > -out_fcst_thresh in the command line is doing.  If it is
> >> > > filtering the mpr line fcst data, then I would think I would
> >> > > set it to ge 0 for the fcst and ob since the fcst and ob data
> >> > > range from 0 to 1.  Am I handling this correctly?
> >> > >
> >> > > Thanks
> >> > > Bob
> >> > >
> >> > > ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
> >> > > '/h/data/global/WXQC/data/met/mdlob_pairs', '-out',
> >> > > '/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z',
> >> > > '-config',
> >> > > '/h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated',
> >> > > '-out_fcst_thresh ge0,ge0.5,ge1', '-out_obs_thresh ge0', '-v', '6']
> >> > > DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z"
> >> > > DEBUG 1: Default Config File: /home/qcteam/met-5.1/share/met/config/STATAnalysisConfig_default
> >> > > DEBUG 1: User Config File: /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> >> > > DEBUG 4: Default Job from the config file: "-model GALWEM -fcst_lead 120000 -fcst_init_beg 20160501_000000 -fcst_init_end 20160502_000000 -fcst_init_hour 000000 -fcst_var APCP -fcst_thresh >=1 -line_type MPR -vif_flag 1 "
> >> > > DEBUG 4: Amending default job with command line options: "-out_fcst_thresh ge0,ge0.5,ge1 -out_obs_thresh ge0"
> >> > > DEBUG 3: Processing STAT file "/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat" ... 1 of 10
> >> > > ERROR  :
> >> > > ERROR  :
> >> > > ERROR  :   MetConfig::read_string(const char *) -> unable to open temp file "config_23325_0_.temp"
> >> > > ERROR  :
> >> > >
> >> > > -----Original Message-----
> >> > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> >> > > Sent: Friday, May 13, 2016 6:27 PM
> >> > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> >> > > <robert.craig.2 at us.af.mil>
> >> > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
> >> > >
> >> > > Bob,
> >> > >
> >> > > The problem is coming from the first line of the file you sent to
> >> > > me.  It contains a comma-separated list of header column names.
> >> > >
> >> > > I'm not exactly sure where you pulled those header column names,
> >> > > but that's the problem.  MET expects data to be separated by
> >> > > whitespace... so it interprets that long string with a bunch of
> >> > > commas as a single column.  The error comes when it tries to read
> >> > > the "second" column.  If you just remove that first line, it
> >> > > should run fine.
> >> > >
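The failure mode described above can be reproduced outside MET.  The
sketch below is plain Python, not MET code; the function names and the
sample header strings are illustrative assumptions.  It shows why a
comma-separated header collapses into a single column when a line is
split on whitespace, which is the situation in which asking for a
"second" column fails.

```python
def whitespace_columns(line):
    """Split a line the way a whitespace-delimited reader would."""
    return line.split()


def first_line_is_whitespace_delimited(path, min_columns=2):
    """Return True if the first line of the file has at least min_columns columns."""
    with open(path) as handle:
        first_line = handle.readline()
    return len(whitespace_columns(first_line)) >= min_columns


# Hypothetical header fragments, for illustration only.
comma_header = "VERSION,MODEL,FCST_LEAD,FCST_VALID_BEG"
space_header = "VERSION MODEL FCST_LEAD FCST_VALID_BEG"

# No whitespace to split on, so the whole header is one column.
print(len(whitespace_columns(comma_header)))
# Whitespace-separated names split into one column each.
print(len(whitespace_columns(space_header)))
```

A check like first_line_is_whitespace_delimited() could be run on each
.stat file before handing it to stat_analysis, so that a malformed
header is caught early.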
> >> > > If you do want header columns, here's a trick.  Run the following
> >> > > job:
> >> > >
> >> > > stat_analysis \
> >> > >    -lookin point_stat_3_galwem_120000L_20160501_120000V.stat \
> >> > >    -job filter -line_type MPR -dump_row out.stat
> >> > >
> >> > > The file out.stat will now contain the full header for the MPR
> >> > > line type.  When you select a single LINE_TYPE value, stat-analysis
> >> > > will write the full header for that line type to the output.
> >> > >
> >> > > Have a good weekend.
> >> > >
> >> > > John
> >> > >
> >> > >

------------------------------------------------
Subject: Statanalysis Question
From: John Halley Gotway
Time: Tue May 24 09:31:55 2016

Ah yes, I always forget about that.  The patches are now on the ftp
site:

MET 5.1 patch file ...

ftp://ftp.rap.ucar.edu/incoming/irap/met_help/met-5.1_patches/met-5.1_patches_20160523.tar.gz

Full MET 5.1 release plus latest patches ...

ftp://ftp.rap.ucar.edu/incoming/irap/met_help/met-5.1_patches/met-5.1_bugfix.20160523.tar.gz

I also attached a screenshot of the met-5.1 known issues page with
notes that describe the two recent patches.

Thanks,
John


On Mon, May 23, 2016 at 2:50 PM, John Halley Gotway <johnhg at ucar.edu>
wrote:

> Bob,
>
> Good news, we found and fixed the problem.  The vx_config library was
> opening up files but failing to close them.  After we hit the upper
> limit (getconf OPEN_MAX = 65536 on my machine) for the number of open
> files, we get that error message.  It's a simple fix to make the
> vx_config library close the files that it opens.  And I'm now able to
> process all of the sample MPR lines you sent me.
>
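To illustrate the class of bug described above, here is a minimal
Python sketch.  It is not the vx_config code, which is not shown in this
thread; the helper names and the loop size are assumptions chosen for
illustration.  It round-trips a short string through a temp file and
closes every handle it opens, so repeated calls do not accumulate open
file descriptors.

```python
import os
import tempfile


def read_string_via_temp_file(text, tmp_dir=None):
    """Write text to a temp file, read it back, and leave no handle open."""
    fd, path = tempfile.mkstemp(prefix="config_", suffix=".temp", dir=tmp_dir)
    try:
        # os.fdopen takes ownership of fd; the with-block closes it.
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        # The second handle is also closed when its with-block exits.
        with open(path) as handle:
            return handle.read()
    finally:
        # Remove the temp file whether or not the round trip succeeded.
        os.remove(path)


def process_many(thresholds, repeat):
    """Round-trip each threshold string `repeat` times and collect the results."""
    results = []
    for _ in range(repeat):
        for thresh in thresholds:
            results.append(read_string_via_temp_file(thresh))
    return results


# Hypothetical workload: 3 threshold strings, 1000 passes each.
round_trips = process_many([">=1", ">=25", ">=50"], 1000)
print(len(round_trips))
```

If the with-blocks were replaced by bare open() calls whose handles are
kept alive (for example, appended to a list), a long enough loop would
eventually fail once the per-process open-file limit is reached, which
is the pattern described in the message above.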
> Unfortunately, these 2 fixes will require that MET be recompiled with
> the set of patches I just posted:
>
> http://www.dtcenter.org/met/users/support/known_issues/METv5.1/index.php
>
> Please let me know if you have more questions/issues.
>
> John
>
>
> On Mon, May 23, 2016 at 12:24 PM, robert.craig.2 at us.af.mil via RT <
> met_help at ucar.edu> wrote:
>
>>
>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>>
>> Thanks for looking into this.  I am currently checking out your PCT
>> recommendation.  I will let you know if I have more questions.
>>
>> Bob
>>
>> -----Original Message-----
>> From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>> Sent: Monday, May 23, 2016 1:04 PM
>> To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
<robert.craig.2 at us.af.mil>
>> Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>>
>> Bob,
>>
>> I wanted to give you an update on the two issues you found...
>>
>> (1) I found and fixed a pesky little bug in STAT-Analysis.  It was
>> writing "=1" to the output stat files when it should be writing "==1".
>> That was a one line fix.
>>
>> (2) I now understand the original problem you wrote about.  It shows
>> up when you use the -fcst_thresh, -obs_thresh, or -cov_thresh job
>> command options to filter the input data.  When STAT-Analysis processes
>> those threshold-type columns, it's actually writing a tiny temp file
>> and reading that back in.  After doing that about 65,528 times (at
>> least on my machine), I get that error.  Writing that many little temp
>> files is wreaking havoc.
>>
>> We'll look into (2) some more to see if we can find an alternative way
>> of processing this data.  I'll hold off on posting the (1) bugfix to
>> see if I can bundle a bugfix for (2) with it.
>>
>> Thanks,
>> John
>>
>> On Mon, May 23, 2016 at 10:55 AM, John Halley Gotway
<johnhg at ucar.edu>
>> wrote:
>>
>> > Bob,
>> >
>> > Surprisingly, I was able to replicate the same error you're seeing!
>> > I see where the error is occurring, but I don't yet understand why
>> > it's happening.  However, I do have a workaround for you.
>> >
>> > Try editing your config file by emptying out the "fcst_thresh"
>> > setting:
>> >    fcst_thresh = [];
>> >
>> > In your job, you're using "-by fcst_thresh" anyway, so STAT-Analysis
>> > will group the data by the FCST_THRESH column.  After I removed the
>> > "fcst_thresh" setting, the job completed.  I'll continue looking into
>> > the reason for that error.
>> >
>> > How many of these files are you planning to pass to STAT-Analysis at
>> > any given time?  I see each sample file contains about 80,000 MPR
>> > lines.  Passing it 5 files to process about 400,000 lines, that job
>> > takes about 62 seconds to run on my machine.  I worry that as you
>> > increase the number of files, it'll run very slowly.
>> >
>> > Here's some alternative logic you might consider.  Run STAT-Analysis
>> > once for each .stat file you're generating.  Instead of writing the
>> > PSTD line type, write the PCT line type (that's just the counts of
>> > that probabilistic Nx2 table).  Then run jobs to aggregate the PCT
>> > line types and compute PSTD lines (-job aggregate_stat -line_type PCT
>> > -out_line_type PSTD).  That would make the processing for each
>> > STAT-Analysis job much more manageable.
>> >
>> > BUT in order to do this in 2 steps, you would really need to put the
>> > neighborhood size information into the INTERP_MTHD and INTERP_PNTS
>> > columns.  Otherwise, the threshold information won't be retained in
>> > the PCT output lines.  And I also found an issue in the formatting of
>> > the OBS_THRESH output column ("=1" in the output should really be
>> > "==1").  For now I just switched to "ge1", but I need to fix this
>> > formatting issue.
>> >
>> > Listed below are the c-shell commands I used to loop over your sample
>> > files and run stat_analysis in this way...
>> >
>> > # Loop through MPR files and compute PCT output lines
>> > foreach mpr_file (`ls stat_mpr/*.stat`)
>> >    set pct_file = `echo $mpr_file | sed 's/stat_mpr/stat_pct/g' | sed 's/.stat/_pct.stat/g'`
>> >    /usr/local/met-5.1/bin/stat_analysis -lookin $mpr_file -out_stat $pct_file \
>> >    -job aggregate_stat -line_type MPR -out_line_type PCT \
>> >    -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,FCST_THRESH \
>> >    -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
>> >    -out_obs_thresh ge1
>> > end
>> >
>> > # Aggregate PCT lines and compute PSTD stats
>> > /usr/local/met-5.1/bin/stat_analysis -lookin stat_pct -out_stat pstd.stat \
>> >    -job aggregate_stat -line_type PCT -out_line_type PSTD \
>> >    -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,INTERP_MTHD,INTERP_PNTS
>> >
>> > Note that this job is combining all of the neighborhood sizes because
>> > INTERP_MTHD and INTERP_PNTS are set the same in the PCT lines.
>> >
>> > Thanks,
>> > John
>> >
>> >
>> >
>> > On Fri, May 20, 2016 at 3:33 PM, robert.craig.2 at us.af.mil via RT
<
>> > met_help at ucar.edu> wrote:
>> >
>> >>
>> >> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>> >>
>> >> John, I might have found more information on the error below:
>> >>
>> >> MetConfig::read_string(const char *) -> unable to open temp file
>> >> "config_1943_0_.temp"
>> >>
>> >> When I run stat_anal on MPR files, no problems if I process 2 days
>> >> of data, but when I increase it to three days, the error occurs.  I
>> >> also played with the dates to try to eliminate a particular data file
>> >> as the cause.  So, it seems to be related to how many data files it
>> >> has to process.  I posted all the data files on the ftp server and my
>> >> command line is below.  The config file on the server is still
>> >> representative.  This error does not seem related to directory
>> >> permissions.
>> >>
>> >> /h/WXQC/met-5.1/bin/stat_analysis -lookin
>> >> /h/data/global/WXQC/data/met/mdlob_pairs -out
>> >> /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0
>> >> -config
>> >> /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
>> >> -out_fcst_thresh
>> >> ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0
>> >> -out_obs_thresh eq1 -by FCST_VAR -by FCST_THRESH -v 6
>> >>
>> >> This is concerning since I have to run this on many more than two
>> >> days' worth of data.  Any idea what might be happening?
>> >>
>> >> Thanks
>> >> Bob
>> >>
>> >> -----Original Message-----
>> >> From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>> >> Sent: Thursday, May 19, 2016 4:31 PM
>> >> To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
>> >> <robert.craig.2 at us.af.mil>
>> >> Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>> >>
>> >> Bob,
>> >>
>> >> It's funny, I was just talking to a colleague today about doing
>> >> something very similar to this on a different dataset.
>> >>
>> >> I think I understand what you're saying about the thresholds >1, >25,
>> >> and >50.  These are used to define the "event" which is used in
>> >> computing the fractional coverage fields.  My confusion comes from
>> >> the fact that in MET currently only Grid-Stat is computing these
>> >> fractional coverage fields, not Point-Stat.  But I see now what
>> >> you're doing.
>> >>
>> >> One suggestion would be to change the contents of the INTERP_MTHD and
>> >> INTERP_PNTS header columns.  You currently have NEAREST, 1 which
>> >> would indicate that each observation value was matched to the
>> >> forecast value at the nearest grid point.  Instead, I would suggest
>> >> writing NBRHD and N, where N indicates the number of points in the
>> >> neighborhood.  For example, the NBRHD output from Grid-Stat would
>> >> write 49 if we were using a 7x7 box.
>> >>
>> >> The -fcst_thresh and -obs_thresh options are used to filter the input
>> >> MPR lines, as you already know.  The -out_fcst_thresh and
>> >> -out_obs_thresh options define the thresholds to be applied when
>> >> computing the output for the job.  In MET, probabilities are not
>> >> processed in a "continuous" way.  Instead, they are put into
>> >> probability bins.  Those bins are used to create an Nx2 contingency
>> >> table from which probabilistic statistics are computed.
>> >>
>> >> Setting "-out_fcst_thresh gt0,gt0.25,gt0.5,gt0.75,gt1.0" defines 4
>> >> probability bins which yields a 4x2 contingency table, from which
>> >> stats are computed.
>> >>
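The binning idea described above can be sketched in a few lines of
Python.  This is an illustration, not MET's implementation: the function
names are hypothetical, and it assumes half-open bins [edge_i,
edge_i+1) with the last bin also including its upper edge, and a Brier
score in which each bin is represented by its midpoint.  The sample
pairs are made up.

```python
from bisect import bisect_right


def build_nx2_table(pairs, edges):
    """Count (event, non-event) observations in each forecast-probability bin.

    pairs: iterable of (forecast_probability, observed) with observed 0 or 1.
    edges: ascending bin edges, e.g. [0.0, 0.25, 0.5, 0.75, 1.0] for 4 bins.
    """
    n_bins = len(edges) - 1
    table = [[0, 0] for _ in range(n_bins)]
    for prob, observed in pairs:
        if not edges[0] <= prob <= edges[-1]:
            raise ValueError(f"probability {prob} is outside the bin range")
        # Bin i covers [edges[i], edges[i+1]); clamp so the top edge
        # falls in the last bin (an assumption of this sketch).
        index = min(bisect_right(edges, prob) - 1, n_bins - 1)
        column = 0 if observed else 1
        table[index][column] += 1
    return table


def brier_score(table, edges):
    """Brier score from the Nx2 table, using each bin's midpoint as its probability."""
    total = sum(events + non_events for events, non_events in table)
    if total == 0:
        raise ValueError("the table contains no pairs")
    score = 0.0
    for i, (events, non_events) in enumerate(table):
        midpoint = 0.5 * (edges[i] + edges[i + 1])
        score += events * (midpoint - 1.0) ** 2 + non_events * midpoint ** 2
    return score / total


# Four bins, matching the 4x2 table in the example above.
edges = [0.0, 0.25, 0.5, 0.75, 1.0]
# Made-up (forecast probability, observed event) pairs.
pairs = [(0.0, 0), (0.11, 0), (0.25, 0), (0.33, 1), (0.6, 1), (1.0, 1)]
table = build_nx2_table(pairs, edges)
print(table)
print(brier_score(table, edges))
```

With a single bin covering 0 to 1, every pair lands in the same row and
the table carries no information about how forecast probability relates
to the observed outcome, which is why more than one threshold is needed.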
>> >> Hope that helps.
>> >>
>> >> Thanks,
>> >> John
>> >>
>> >> On Thu, May 19, 2016 at 2:23 PM, robert.craig.2 at us.af.mil via RT
<
>> >> met_help at ucar.edu> wrote:
>> >>
>> >> >
>> >> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361
>
>> >> >
>> >> > Thanks John, the directory problem was due to some corruption on
>> >> > the MET we had on one of the systems.  On another system the
>> >> > problem doesn't come up so we are hoping a recompile of MET on said
>> >> > system will clear up the issue.
>> >> >
>> >> > As far as the second comment, I don't think you interpreted what I
>> >> > am doing correctly.  In each file, there are three sets of data for
>> >> > each variable.  They are not identical since the first set is the
>> >> > ob neighborhood data for precip > 1.  The next set is the ob
>> >> > neighborhood data for precip > 25, and the same for precip > 50.
>> >> > If you compare the model and ob data, the model data should be
>> >> > different (for some obs) than the model data for the previous
>> >> > category.  The neighborhoods around each ob site follow the HiRA
>> >> > method.  All the data in the mpr file lines are probabilities, so I
>> >> > want to create PSTD from these data.  So I was using the
>> >> > fcst_thresh to filter for the HiRA thresholds I am interested in.
>> >> > I tried your code and added -by FCST_THRESH and got out three
>> >> > different sets of values.  So my question is why did you have the
>> >> > -out_fcst_thresh set to 10 prob thresholds?  Why wouldn't ge0 pick
>> >> > up all probabilities?  I am not understanding how these thresholds
>> >> > are being used.
>> >> >
>> >> > Thanks
>> >> > Bob
>> >> >
>> >> > -----Original Message-----
>> >> > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>> >> > Sent: Monday, May 16, 2016 4:54 PM
>> >> > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
>> >> > <robert.craig.2 at us.af.mil>
>> >> > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>> >> >
>> >> > Bob,
>> >> >
>> >> > Thanks for sending the sample data.  I agree that STAT-Analysis can
>> >> > get pretty confusing.  It has a lot of flexibility, but we really
>> >> > need to think through what you're trying to do.
>> >> >
>> >> > First, regarding the error you're getting.  Unfortunately, the
>> >> > config file string parser is writing a temp file in the current
>> >> > "runtime" directory.  The error is from the fact that you don't
>> >> > have permission to write the file "config_23325_0_.temp" in the
>> >> > current directory.  Ultimately, we should change that to use the
>> >> > temp directory instead.
>> >> >
>> >> > Next, I looked at the data you sent to me.  Listed below are the
>> >> > unique combinations of just a few of the header columns:
>> >> >
>> >> > FCST_VAR  FCST_THRESH  LINE_TYPE  TOTAL
>> >> > APCP      >=1          MPR        14666
>> >> > APCP      >=25         MPR        14666
>> >> > APCP      >=50         MPR        14666
>> >> > CEIL      <=1000       MPR        11926
>> >> > CEIL      <=100        MPR        11926
>> >> > CEIL      <=300        MPR        11926
>> >> >
>> >> > Based on this, it looks like you have a lot of duplicate matched
>> >> > pair (MPR) output lines... We have the same 14666 pairs for APCP
>> >> > repeated 3 times followed by the same 11926 pairs for CEIL repeated
>> >> > 3 times.  This isn't necessary.  Instead, the FCST_THRESH and
>> >> > OBS_THRESH columns for the MPR line type should be set to "NA".
>> >> > The MPR line type that Point-Stat creates just contains the paired
>> >> > forecast and observation values.  Thresholds do not apply to this
>> >> > line type.
>> >> >
>> >> > I posted an updated version of your file to the ftp site.  I
>> >> > stripped it down to 14666 APCP lines and 11926 CEIL lines with NA
>> >> > in the FCST_THRESH and OBS_THRESH columns:
>> >> >
>> >> >
>> >> >
>> >> > ftp://ftp.rap.ucar.edu/incoming/irap/met_help/craig_data/point_stat_3_galwem_120000L_20160501_120000V_JHG.stat
>> >> >
>> >> > Looking at the values in the FCST column, I see numbers between 0
>> >> > and 1 (0, 0.11, 0.22, 0.33, ...).  Looking in the OBS column, I see
>> >> > 2 numbers (0 or 1).  And looking at your config file, it looks like
>> >> > you want to use these MPR lines to compute probabilistic output.
>> >> > MET verifies probabilities using an Nx2 contingency table.  You use
>> >> > "-out_fcst_thresh" to select the probabilistic thresholds to be
>> >> > applied and "-out_obs_thresh" to select the observation threshold
>> >> > to be applied.
>> >> >
>> >> > Here's a stat-analysis job you could run to read the MPR lines,
>> >> > define the probabilistic forecast thresholds, define the single
>> >> > observation threshold, and compute a PSTD output line.  Using "-by
>> >> > FCST_VAR" tells it to run the job separately for each unique entry
>> >> > found in the FCST_VAR column.
>> >> >
>> >> > /usr/local/met-5.1/bin/stat_analysis \
>> >> >    -lookin point_stat_3_galwem_120000L_20160501_120000V_JHG.stat \
>> >> >    -job aggregate_stat -line_type MPR -out_line_type PSTD \
>> >> >    -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
>> >> >    -out_obs_thresh eq1.0 \
>> >> >    -by FCST_VAR \
>> >> >    -out_stat out_pstd.txt
>> >> >
>> >> > The output statistics are written to "out_pstd.txt".
>> >> >
>> >> > Hope that helps.
>> >> >
>> >> > John
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Mon, May 16, 2016 at 2:47 PM, robert.craig.2 at us.af.mil via
RT <
>> >> > met_help at ucar.edu> wrote:
>> >> >
>> >> > >
>> >> > > <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>> >> > >
>> >> > > Thanks John, I knew the data was space delimited but forgot to
>> >> > > check the header.  As usual with MET, I progressed further but am
>> >> > > hitting a new error.  See below.  I pushed the config file to the
>> >> > > ftp directory.  As you can see, the -tmp_dir is set to
>> >> > > /h/data/global/WXQC/data/met/tmp.  This directory's permissions
>> >> > > are wide open - in fact stat_anal temp files are in there.  Does
>> >> > > the config*.temp try to write somewhere else?
>> >> > >
>> >> > > Also, notice in the command line options, there are three
>> >> > > thresholds.  MET kept telling me that I had to have three since
>> >> > > this is probability data.  Also, the latest MPR files (.stat) are
>> >> > > in the ftp dir.  As you can see I generated model/ob pairs using
>> >> > > different thresholds for the forecast and observation data.  So
>> >> > > this is where I get confused: I assume the fcst thresh in the
>> >> > > config file is a filter to pull those lines that have the
>> >> > > threshold I want.  I am not sure what the -out_fcst_thresh in the
>> >> > > command line is doing.  If it is filtering the mpr line fcst
>> >> > > data, then I would think I would set it to ge 0 for the fcst and
>> >> > > ob since the fcst and ob data range from 0 to 1.  Am I handling
>> >> > > this correctly?
>> >> > >
>> >> > > Thanks
>> >> > > Bob
>> >> > >
>> >> > > ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
>> >> > > '/h/data/global/WXQC/data/met/mdlob_pairs', '-out',
>> >> > > '/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z',
>> >> > > '-config',
>> >> > > '/h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated',
>> >> > > '-out_fcst_thresh ge0,ge0.5,ge1', '-out_obs_thresh ge0', '-v', '6']
>> >> > > DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z"
>> >> > > DEBUG 1: Default Config File: /home/qcteam/met-5.1/share/met/config/STATAnalysisConfig_default
>> >> > > DEBUG 1: User Config File: /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
>> >> > > DEBUG 4: Default Job from the config file: "-model GALWEM -fcst_lead 120000 -fcst_init_beg 20160501_000000 -fcst_init_end 20160502_000000 -fcst_init_hour 000000 -fcst_var APCP -fcst_thresh >=1 -line_type MPR -vif_flag 1 "
>> >> > > DEBUG 4: Amending default job with command line options: "-out_fcst_thresh ge0,ge0.5,ge1 -out_obs_thresh ge0"
>> >> > > DEBUG 3: Processing STAT file "/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat" ... 1 of 10
>> >> > > ERROR  :
>> >> > > ERROR  :
>> >> > > ERROR  :   MetConfig::read_string(const char *) -> unable to open temp file "config_23325_0_.temp"
>> >> > > ERROR  :
>> >> > >

------------------------------------------------
Subject: Statanalysis Question
From: robert.craig.2 at us.af.mil
Time: Thu May 26 09:55:08 2016

Hi John, I am to the point where I am trying to read the pct.txt files
to generate the PSTD files.  When I do this using

/h/WXQC/met-5.1/bin/stat_analysis -lookin
/h/data/global/WXQC/data/met/ens_cont_tbl -out
/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z
-line_type PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH -v 6

I get the following:



['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
'/h/data/global/WXQC/data/met/ens_cont_tbl/', '-out',
'/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z',
'-line_type PCT', '-out_line_type PSTD', '-by FCST_VAR',
'-by FCST_THRESH', '-v', '6']
DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z"
DEBUG 4: Amending default job with command line options: "-line_type PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH"
ERROR  :
ERROR  : process_search_dirs() -> no STAT files found in the directories specified!
ERROR  :
ERROR  :
ERROR  : main() -> encountered an error value of 1.  Calling clean_up() and usage() before exiting.
ERROR  :

*** Model Evaluation Tools (METV5.1) ***



The ens_cont_tbl directory contains 3 files and I placed them on your
ftp server.  I didn’t think the file naming convention was important
here as long as they end in _pct.txt.  I did try renaming one to the
more conventional naming scheme – point_stat … but that didn’t make a
difference.  So why is it not finding my files?
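
One thing that may be worth checking here, offered as an assumption
based on the MET documentation for -lookin rather than anything
confirmed in this message: when -lookin is given a directory,
STAT-Analysis is described as searching it for files ending in ".stat".
If that applies, files ending in "_pct.txt" would be skipped.  The
Python sketch below is not part of MET; the helper names and file names
are hypothetical.  It lists which files under a directory carry a given
suffix, so a directory can be checked before it is passed to -lookin.

```python
import os
import tempfile


def find_files_with_suffix(directory, suffix=".stat"):
    """Recursively list files under directory whose names end with suffix."""
    matches = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if name.endswith(suffix):
                matches.append(os.path.join(root, name))
    return sorted(matches)


def demo():
    """Create a scratch directory with two hypothetical files and search it."""
    with tempfile.TemporaryDirectory() as scratch:
        for name in ("day1_pct.txt", "day2_pct.stat"):
            with open(os.path.join(scratch, name), "w") as handle:
                handle.write("placeholder\n")
        found = find_files_with_suffix(scratch)
        # Return base names only so the result does not depend on the temp path.
        return [os.path.basename(path) for path in found]


print(demo())
```

Only the file whose name ends in ".stat" is reported by the search.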



Thanks

Bob



-----Original Message-----
From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
Sent: Monday, May 23, 2016 11:56 AM
To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
<robert.craig.2 at us.af.mil>
Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question



Bob,



Surprisingly, I was able to replicate the same error you're seeing!  I
see where the error is occurring, but I don't yet understand why it's
happening.  However, I do have a workaround for you.



Try editing your config file by emptying out the "fcst_thresh"
setting:

   fcst_thresh = [];



In your job, you're using "-by fcst_thresh" anyway, so STAT-Analysis
will group the data by the FCST_THRESH column.  After I removed the
"fcst_thresh" setting, the job completed.  I'll continue looking into
the reason for that error.



How many of these files are you planning to pass to STAT-Analysis at
any given time?  I see each sample file contains about 80,000 MPR
lines.

Passing it 5 files to process about 400,000 lines, that job takes
about 62 seconds to run on my machine.  I worry that as you increase
the number of files, it'll run very slowly.



Here's some alternative logic you might consider.  Run STAT-Analysis
once for each .stat file you're generating.  Instead of writing the
PSTD line type, write the PCT line type (that's just the counts of
that probabilistic Nx2 table).  Then run jobs to aggregate the PCT
line types and compute PSTD lines (-job aggregate_stat -line_type PCT
-out_line_type PSTD).  That would make the processing for each
STAT-Analysis job much more manageable.



BUT in order to do this in 2 steps, you would really need to put the
neighborhood size information into the INTERP_MTHD and INTERP_PNTS
columns.  Otherwise, the threshold information won't be retained in
the PCT output lines.  And I also found an issue in the formatting of
the OBS_THRESH output column ("=1" in the output should really be
"==1").  For now I just switched to "ge1", but I need to fix this
formatting issue.



Listed below are the c-shell commands I used to loop over your sample
files and run stat_analysis in this way...



# Loop through MPR files and compute PCT output lines
foreach mpr_file (`ls stat_mpr/*.stat`)
   set pct_file = `echo $mpr_file | sed 's/stat_mpr/stat_pct/g' | sed 's/.stat/_pct.stat/g'`
   /usr/local/met-5.1/bin/stat_analysis -lookin $mpr_file -out_stat $pct_file \
   -job aggregate_stat -line_type MPR -out_line_type PCT \
   -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,FCST_THRESH \
   -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
   -out_obs_thresh ge1
end

# Aggregate PCT lines and compute PSTD stats
/usr/local/met-5.1/bin/stat_analysis -lookin stat_pct -out_stat pstd.stat \
   -job aggregate_stat -line_type PCT -out_line_type PSTD \
   -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,INTERP_MTHD,INTERP_PNTS



Note that this job is combining all of the neighborhood sizes because
INTERP_MTHD and INTERP_PNTS are set the same in the PCT lines.



Thanks,

John







On Fri, May 20, 2016 at 3:33 PM, robert.craig.2 at us.af.mil via RT <
met_help at ucar.edu> wrote:



>

> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >

>

> John, I might have found more information on the error below:

>

> MetConfig::read_string(const char *) -> unable to open temp file

> "config_1943_0_.temp"

>

> When I run stat_anal on MPR files, no problems if I  process 2 days
of

> data, but when I increase it to three days, the error occurs.  I
also

> played with the dates to try to eliminate a particular data file as

> the cause.  So, it seems to be related to how many data files it has
to

> process.   I posted all the data files on the ftp server and my
command

> line is below.  The config file on the server is still
representative.

> This error does not seem related to directory permissions.

>

> /h/WXQC/met-5.1/bin/stat_analysis  -lookin
> /h/data/global/WXQC/data/met/mdlob_pairs -out
> /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0
> -config /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> -out_fcst_thresh
> ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0
> -out_obs_thresh eq1 -by FCST_VAR -by FCST_THRESH -v 6

>

> This is concerning since I have to run this on many more than two days'
> worth of data.  Any idea what might be happening?

>

> Thanks

> Bob

>

> -----Original Message-----

> From: John Halley Gotway via RT [mailto:met_help at ucar.edu]

> Sent: Thursday, May 19, 2016 4:31 PM

> To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN

> <robert.craig.2 at us.af.mil>

> Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question

>

> Bob,

>

> It's funny, I was just talking to a colleague today about doing

> something very similar to this on a different dataset.

>

> I think I understand what you're saying about the thresholds >1, >25, and >50.
> These are used to define the "event" which is used in computing the
> fractional coverage fields.  My confusion comes from the fact that in
> MET currently only Grid-Stat is computing these fractional coverage
> fields, not Point-Stat.  But I see now what you're doing.

>

> One suggestion would be to change the contents of the INTERP_MTHD and
> INTERP_PNTS header columns.  You currently have NEAREST, 1 which would
> indicate that each observation value was matched to the forecast value
> at the nearest grid point.  Instead, I would suggest writing NBRHD and
> N, where N indicates the number of points in the neighborhood.  For
> example, the NBRHD output from Grid-Stat would write 49 if we were
> using a 7x7 box.

>

> The -fcst_thresh and -obs_thresh options are used to filter the input
> MPR lines, as you already know.  The -out_fcst_thresh and
> -out_obs_thresh options define the thresholds to be applied when
> computing the output for the job.  In MET, probabilities are not
> processed in a "continuous" way.  Instead, they are put into probability
> bins.  Those bins are used to create an Nx2 contingency table from which
> probabilistic statistics are computed.

>

> Setting "-out_fcst_thresh gt0,gt0.25,gt0.5,gt0.75,gt1.0" defines 4

> probability bins which yields a 4x2 contingency table, from which

> stats are computed.

>

> Hope that helps.

>

> Thanks,

> John

>

> On Thu, May 19, 2016 at 2:23 PM, robert.craig.2 at us.af.mil via RT <

> met_help at ucar.edu> wrote:

>

> >

> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >

> >

> > Thanks John, the directory problem was due to some corruption on the
> > MET we had on one of the systems.  On another system the problem
> > doesn't come up so we are hoping a recompile of MET on said system
> > will clear up the issue.

> >

> > As far as the second comment, I don't think you interpreted what I
> > am doing correctly.  In each file, there are three sets of data for
> > each variable.  They are not identical since the first set is the ob
> > neighborhood data for precip > 1.  The next set is the ob
> > neighborhood data for precip > 25, and the same for precip > 50.  If
> > you compare the model and ob data, the model data should be
> > different (for some obs) than the model data for the previous
> > category.  The neighborhoods around each ob site
> > follow the HiRA method.  All the data in the mpr file lines are
> > probabilities, so I want to create PSTD from these data.  So I was using
> > the fcst_thresh to filter for the HiRA thresholds I am interested in.  I
> > tried your code and added -by FCST_THRESH and got out three different sets
> > of values.  So my question is why did you have the -out_fcst_thresh set to
> > 10 prob thresholds?  Why wouldn't ge0 pick up all probabilities?  I
> > am not understanding how these thresholds are being used.

> >

> > Thanks

> > Bob

> >

> > -----Original Message-----

> > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]

> > Sent: Monday, May 16, 2016 4:54 PM

> > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN

> > <robert.craig.2 at us.af.mil>

> > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question

> >

> > Bob,

> >

> > Thanks for sending the sample data.  I agree that STAT-Analysis can
> > get pretty confusing.  It has a lot of flexibility, but we really
> > need to think through what you're trying to do.
> >
> > First, regarding the error you're getting.  Unfortunately, the
> > config file string parser is writing a temp file in the current
> > "runtime" directory.  The error is from the fact that you don't have
> > permission to write the file "config_23325_0_.temp" in the current
> > directory.  Ultimately, we should change that to use the temp directory
> > instead.

> >

> > Next, I looked at the data you sent to me.  Listed below are the

> > unique combinations of just a few of the header columns:

> >

> > FCST_VAR  FCST_THRESH  LINE_TYPE  TOTAL
> > APCP      >=1          MPR        14666
> > APCP      >=25         MPR        14666
> > APCP      >=50         MPR        14666
> > CEIL      <=1000       MPR        11926
> > CEIL      <=100        MPR        11926
> > CEIL      <=300        MPR        11926

> >

> > Based on this, it looks like you have a lot of duplicate matched pair
> > (MPR) output lines... We have the same 14666 pairs for APCP repeated
> > 3 times followed by the same 11926 pairs for CEIL repeated 3 times.
> > This isn't necessary.  Instead, the FCST_THRESH and OBS_THRESH
> > columns for the MPR line type should be set to "NA".  The MPR line
> > type that Point-Stat creates just contains the paired forecast and
> > observation values.  Thresholds do not apply to this line type.

> >

> > I posted an updated version of your file to the ftp site.  I
> > stripped it down to 14666 APCP lines and 11926 CEIL lines with NA in
> > the FCST_THRESH and OBS_THRESH columns:
> >
> > ftp://ftp.rap.ucar.edu/incoming/irap/met_help/craig_data/point_stat_3_galwem_120000L_20160501_120000V_JHG.stat

> >

> > Looking at the values in the FCST column, I see numbers between 0 and
> > 1 (0, 0.11, 0.22, 0.33, ...).  Looking in the OBS column, I see 2
> > numbers (0 or 1).  And looking at your config file, it looks like
> > you want to use these MPR lines to compute probabilistic output.
> > MET verifies probabilities using an Nx2 contingency table.  You use
> > "-out_fcst_thresh" to select the probabilistic thresholds to be applied and
> > "-out_obs_thresh" to select the observation threshold to be applied.

> >

> > Here's a stat-analysis job you could run to read the MPR lines,
> > define the probabilistic forecast thresholds, define the single
> > observation threshold, and compute a PSTD output line.  Using "-by
> > FCST_VAR" tells it to run the job separately for each unique entry
> > found in the FCST_VAR column.
> >
> > /usr/local/met-5.1/bin/stat_analysis \
> >    -lookin point_stat_3_galwem_120000L_20160501_120000V_JHG.stat \
> >    -job aggregate_stat -line_type MPR -out_line_type PSTD \
> >    -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
> >    -out_obs_thresh eq1.0 \
> >    -by FCST_VAR \
> >    -out_stat out_pstd.txt

> >

> > The output statistics are written to "out_pstd.txt".

> >

> > Hope that helps.

> >

> > John

> >

> >

> >

> >

> >

> >

> >

> > On Mon, May 16, 2016 at 2:47 PM, robert.craig.2 at us.af.mil via RT <

> > met_help at ucar.edu> wrote:

> >

> > >

> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >

> > >

> > > Thanks John, I knew the data was space delimited but forgot to
> > > check the header.  As usual with MET, I progressed further but am
> > > hitting a new error.  See below.  I pushed the config file to the ftp
> > > directory.  As you can see, the -tmp_dir is set to
> > > /h/data/global/WXQC/data/met/tmp.  This directory's permissions are
> > > wide open - in fact stat_anal temp files are in there.  Does the
> > > config*.temp try to write somewhere else?
> > >
> > > Also, notice in the command line options, there are three thresholds.
> > > MET kept telling me that I had to have three since this is probability
> > > data.  Also, the latest MPR files (.stat) are in the ftp dir.  As you can
> > > see I generated model/ob pairs using different thresholds for the
> > > forecast and observation data.  So this is where I get confused: I
> > > assume the fcst thresh in the config file is a filter to pull
> > > those lines that have the threshold I want.  I am not sure what
> > > the -out_fcst_thresh in the command line is doing.  If it is
> > > filtering the mpr line fcst data, then I would think I would set
> > > it to ge 0 for the fcst and ob since the fcst and ob data range
> > > from 0 to 1.  Am I handling this correctly?

> > >

> > > Thanks

> > > BOb

> > >

> > > ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
> > > '/h/data/global/WXQC/data/met/mdlob_pairs', '-out',
> > > '/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z',
> > > '-config',
> > > '/h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated',
> > > '-out_fcst_thresh ge0,ge0.5,ge1', '-out_obs_thresh ge0', '-v', '6']
> > > DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z"
> > > DEBUG 1: Default Config File: /home/qcteam/met-5.1/share/met/config/STATAnalysisConfig_default
> > > DEBUG 1: User Config File: /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> > > DEBUG 4: Default Job from the config file: "-model GALWEM -fcst_lead 120000 -fcst_init_beg 20160501_000000 -fcst_init_end 20160502_000000 -fcst_init_hour 000000 -fcst_var APCP -fcst_thresh >=1 -line_type MPR -vif_flag 1 "
> > > DEBUG 4: Amending default job with command line options: "-out_fcst_thresh ge0,ge0.5,ge1 -out_obs_thresh ge0"
> > > DEBUG 3: Processing STAT file "/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat" ... 1 of 10
> > > ERROR  :
> > > ERROR  :
> > > ERROR  :   MetConfig::read_string(const char *) -> unable to open temp file "config_23325_0_.temp"

> > > ERROR  :

> > >

> > > -----Original Message-----

> > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]

> > > Sent: Friday, May 13, 2016 6:27 PM

> > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN

> > > <robert.craig.2 at us.af.mil>

> > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question

> > >

> > > Bob,

> > >

> > > The problem is coming from the first line of the file you sent to me.
> > > It contains a comma-separated list of header column names.
> > >
> > > I'm not exactly sure where you pulled those header column names,
> > > but that's the problem.  MET expects data to be separated by whitespace...
> > > so it interprets that long string with a bunch of commas as a single
> > > column.  The error comes when it tries to read the "second" column.  If you
> > > just remove that first line, it should run fine.
> > >
> > > If you do want header columns, here's a trick.  Run the following job:
> > >
> > > stat_analysis -lookin point_stat_3_galwem_120000L_20160501_120000V.stat \
> > >    -job filter -line_type MPR -dump_row out.stat
> > >
> > > The file out.stat will now contain the full header for the MPR line
> > > type.  When you select a single LINE_TYPE value, stat-analysis will write
> > > the full header for that line type to the output.

> > >

> > > Have a good weekend.

> > >

> > > John

> > >

> > >

> > > On Fri, May 13, 2016 at 10:26 AM, robert.craig.2 at us.af.mil via
RT

> > > < met_help at ucar.edu> wrote:

> > >

> > > >

> > > > Fri May 13 10:26:26 2016: Request 76361 was acted upon.

> > > > Transaction: Ticket created by robert.craig.2 at us.af.mil

> > > >        Queue: met_help

> > > >      Subject: Statanalysis Question

> > > >        Owner: Nobody

> > > >   Requestors: robert.craig.2 at us.af.mil

> > > >       Status: new

> > > >  Ticket <URL:

> > > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361

> > > > >

> > > >

> > > >

> > > > John, I am getting the following error when running statanalysis.
> > > > The forecast times and valid times seem to be correct to me, so I
> > > > am not sure of the cause of the error.
> > > >
> > > > /h/WXQC/met-5.1/bin/stat_analysis -lookin
> > > > /h/data/global/WXQC/data/met/mdlob_pairs -out
> > > > /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z
> > > > -config
> > > > /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> > > > -v 6
> > > > DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z"
> > > > DEBUG 1: Default Config File: /home/qcteam/met-5.1/share/met/config/STATAnalysisConfig_default
> > > > DEBUG 1: User Config File: /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> > > > DEBUG 4: Default Job from the config file: "-model GALWEM -fcst_lead 120000 -fcst_init_beg 20160501_000000 -fcst_init_end 20160501_000000 -fcst_init_hour 120000 -fcst_var APCP -fcst_thresh >=50 -line_type MPR -vif_flag 1 "
> > > > DEBUG 4: Amending default job with command line options: "(nul)"
> > > > DEBUG 3: Processing STAT file "/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat" ... 1 of 2
> > > > ERROR  :
> > > > ERROR  : DataLine::get_item(int) -> range check error
> > > > ERROR  :

> > > >

> > > > The config file and data file are on your server.

> > > >

> > > >

> > > >

> > >

> > >

> > >

> > >

> >

> >

> >

> >

>

>

>

>



------------------------------------------------
Subject: Statanalysis Question
From: John Halley Gotway
Time: Thu May 26 10:31:09 2016

Bob,

If you tell STAT-Analysis to "-lookin" a directory, it will recursively
search that directory looking for files ending in ".stat".  If you instead
tell STAT-Analysis to "-lookin" one or more specific file names, it will
process them regardless of their suffix.

So you can either name your PCT output files as "_pct.stat" or you can keep
them as "_pct.txt" and change your STAT-Analysis job command to something
like:
  -lookin /h/data/global/WXQC/data/met/ens_cont_tbl/*_pct.txt
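
If it helps to check ahead of time which files a given "-lookin" list would pick up, the rule above can be mimicked with a few lines of Python.  This is only a sketch of the behavior as described in this message, not STAT-Analysis code, and the function name preview_lookin is made up for illustration.

```python
from pathlib import Path


def preview_lookin(paths):
    """List the files a -lookin argument list would select under the rule above."""
    selected = []
    for p in map(Path, paths):
        if p.is_dir():
            # Directories are searched recursively, keeping only names ending in ".stat"
            selected.extend(sorted(f for f in p.rglob("*.stat") if f.is_file()))
        elif p.is_file():
            # Explicitly named files are kept whatever their suffix
            selected.append(p)
    return selected


if __name__ == "__main__":
    # A directory holding only *_pct.txt files yields an empty list here, which
    # corresponds to the "no STAT files found" case
    for f in preview_lookin(["/h/data/global/WXQC/data/met/ens_cont_tbl"]):
        print(f)
```

An empty result for a directory is the cue to either rename the files to end in ".stat" or pass them by name.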

Make sense?

Thanks,
John

On Thu, May 26, 2016 at 9:55 AM, robert.craig.2 at us.af.mil via RT <
met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>
> Hi John, I am to the point where I am trying to read the pct.txt files to
> generate the PSTD files.  When I do this using
> /h/WXQC/met-5.1/bin/stat_analysis  -lookin
> /h/data/global/WXQC/data/met/ens_cont_tbl -out
> /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z
> -line_type PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH -v 6,
> I get the following:
>
>
>
> ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
> '/h/data/global/WXQC/data/met/ens_cont_tbl/', '-out',
> '/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z',
> '-line_type PCT', '-out_line_type PSTD', '-by FCST_VAR', '-by FCST_THRESH',
> '-v', '6']
>
> DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z"
> DEBUG 4: Amending default job with command line options: "-line_type PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH"
> ERROR  :
> ERROR  : process_search_dirs() -> no STAT files found in the directories specified!
> ERROR  :
> ERROR  :
> ERROR  : main() -> encountered an error value of 1.  Calling clean_up() and usage() before exiting.
> ERROR  :
>
>
>
> *** Model Evaluation Tools (METV5.1) ***
>
>
>
> The ens_cont_tbl directory contains 3 files and I placed them on your ftp
> server.  I didn’t think the file naming convention was important here as
> long as they end in _pct.txt.  I did try renaming one to the more
> conventional naming scheme – point_stat … but that didn’t make a
> difference.  So why is it not finding my files?
>
>
>
> Thanks
>
> Bob
>
>
>
> -----Original Message-----
> From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> Sent: Monday, May 23, 2016 11:56 AM
> To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
<robert.craig.2 at us.af.mil>
> Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
>
>
> Bob,
>
>
>
> Surprisingly, I was able to replicate the same error you're seeing!
I see
> where the error is occurring, but I don't yet understand why its
> happening.  However, I do have a workaround for you.
>
>
>
> Try editing your config file by emptying out the "fcst_thresh"
setting:
>
>    fcst_thresh = [];
>
>
>
> In your job, you're using "-by fcst_thresh" anyway, so STAT-Analysis
will
> group the data by the FCST_THRESH column.  After I removed the
> "fcst_thresh" setting, the job completed.  I'll continue looking
into the
> reason for that error.
>
>
>
> How many of these files are you planning to pass to STAT-Analysis at
any
> given time?  I see each sample file contains about 80,000 MPR lines.
>
> Passing it 5 files to process about 400,000 lines, that job takes
about 62
> seconds to run on my machine.  I worry that as you increase the
number of
> files, it'll run very slowly.
>
>
>
> Here's some alternative logic you might consider.  Run STAT-Analysis
once
> for each .stat file you're generating.  Instead of writing the PSTD
line
> type, write the PCT line type (that's just the counts of that
probabilistic
>
> Nx2 table).  Then run jobs to aggregate the PCT lines types and
compute
> PSTD lines (-job aggregate_stat -line_type PCT -out_line_type PSTD).
That
> would make the processing for each STAT-Analysis job much more
manageable.
>
>
>
> BUT in order to do this in 2 steps, you would really need to put the
> neighborhood size information into the INTERP_MTHD and INTERP_PNTS
> columns.  Otherwise, the threshold information won't be retained in
the PCT
> output lines.  And I also found an issue in the formatting of the
> OBS_THRESH output column ("=1" in the output should really be
"==1").  For
> now I just switched to "ge1", but I need to fix this formatting
issue.
>
>
>
> Listed below are the c-shell command I used to loop over your sample
files
> and run stat_analysis in this way...
>
>
>
> # Loop through MPR files and compute PCT output lines foreach
mpr_file
> (`ls stat_mpr/*.stat`)
>
>    set pct_file = `echo $mpr_file | sed 's/stat_mpr/stat_pct/g' |
sed
> 's/.stat/_pct.stat/g'`
>
>    /usr/local/met-5.1/bin/stat_analysis -lookin $mpr_file -out_stat
> $pct_file \
>
>    -job aggregate_stat -line_type MPR -out_line_type PCT \
>
>    -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,FCST_THRESH \
>
>    -out_fcst_thresh
>
> ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0
>
> -out_obs_thresh ge1
>
> end
>
>
>
> # Aggregate PCT lines and compute PSTD stats
> /usr/local/met-5.1/bin/stat_analysis -lookin stat_pct -out_stat
pstd.stat \
>
>    -job aggregate_stat -line_type PCT -out_line_type PSTD \
>
>    -by
MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,INTERP_MTHD,INTERP_PNTS
>
>
>
> Note that this job is combining all of the neighborhood sizes
because
> INTERP_MTHD and INTERP_PNTS is set the same in the PCT lines.
>
>
>
> Thanks,
>
> John
>
>
>
>
>
>
>
> On Fri, May 20, 2016 at 3:33 PM, robert.craig.2 at us.af.mil via RT <
> met_help at ucar.edu> wrote:
>
>
>
> >
>
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>
> >
>
> > John, I might have found more information on the error below:
>
> >
>
> > MetConfig::read_string(const char *) -> unable to open temp file
>
> > "config_1943_0_.temp"
>
> >
>
> > When I run stat_anal on MPR files, no problems if I  process 2
days of
>
> > data, but when I increase it to three days, the error occurs.  I
also
>
> > played with the dates to try to eliminate a particular data file
as
>
> > the cause.  So, it seems to be related to how many data files it
has to
>
> > process.   I posted all the data files on the ftp server and my
command
>
> > line is below.  The config file on the server is still
representative.
>
> > This error does not seem related to directory permissions.
>
> >
>
> > /h/WXQC/met-5.1/bin/stat_analysis  -lookin
>
> > /h/data/global/WXQC/data/met/mdlob_pairs -out
>
> > /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0
>
> > -config /h/WXQC/met-
5.1/data/config/STATAnalysisConfig_hira_bs_updated
>
> > -out_fcst_thresh
>
> > ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0
>
> > -out_obs_thresh eq1 -by FCST_VAR -by FCST_THRESH -v 6
>
> >
>
> > This is concerning since I have to run this on many more than two
days
>
> > worth of data.  Any idea what be happening?
>
> >
>
> > Thanks
>
> > Bob
>
> >
>
> > -----Original Message-----
>
> > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>
> > Sent: Thursday, May 19, 2016 4:31 PM
>
> > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
>
> > <robert.craig.2 at us.af.mil>
>
> > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
> >
>
> > Bob,
>
> >
>
> > It's funny, I was just talking to a colleague today about doing
>
> > something very similar to this on a different dataset.
>
> >
>
> > I think I understand what you're saying about the thresholds >1,
>25,
>
> > and
>
> > >50.  These are used to define the "event" which is used in
computing
>
> > >the
>
> > fractional coverage fields.  My confusion comes from the fact that
in
>
> > MET currently only Grid-Stat is computing these fractional
coverage
>
> > fields, not Point-Stat.  But I see now what you're doing.
>
> >
>
> > One suggestion would be to change the contents of the INTERP_MTHD
and
>
> > INTERP_PNTS header columns.  You currently have NEAREST, 1 which
would
>
> > indicate that each observation value was matched to the forecast
value
>
> > at the nearest grid point.  Instead, I would suggest writing NBRHD
and
>
> > N, where N indicates the number of points in the neighborhood.
For
>
> > example, the NBRHD output from Grid-Stat would write 49 if we were
using
> a 7x7 box.
>
> >
>
> > The -fcst_thresh and -obs_thresh options are used to filter the
input
>
> > MPR lines, as you already know.  The -out_fcst_thresh and
>
> > -out_obs_thresh options define the thresholds to be applied when
>
> > computing the output for the job.  In MET, probabilities are not
> processed in a "continuous" way.
>
> > Instead, they are put into probability bins.  Those bins are used
to
>
> > create an Nx2 contingency table from which probabilistic
statistics
>
> > are computed.
>
> >
>
> > Setting "-out_fcst_thresh gt0,gt0.25,gt0.5,gt0.75,gt1.0" defines 4
>
> > probability bins which yields a 4x2 contingency table, from which
>
> > stats are computed.
>
> >
>
> > Hope that helps.
>
> >
>
> > Thanks,
>
> > Johhn
>
> >
>
> > On Thu, May 19, 2016 at 2:23 PM, robert.craig.2 at us.af.mil via RT <
>
> > met_help at ucar.edu> wrote:
>
> >
>
> > >
>
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>
> > >
>
> > > Thanks John, the directory problem was due to come corruption on
the
>
> > > MET we had on one if the systems.  On another system the problem
>
> > > doesn't come up so we are hoping a recompile of MET on said
system
>
> > > will clear up the issue.
>
> > >
>
> > > As far as the second comment, I don't think you interpreted what
I
>
> > > am doing correctly.  In each file, there is three sets of data
for
>
> > > each variable.  They are not identical since the first set is
the ob
>
> > > neighborhood data for precip > 1.  The next set is the ob
>
> > > neighborhood data for precip > 25, and the same for precip > 50.
I
>
> > > you compare the model and ob data, the model data should be
>
> > > different (for some obs) then the model data for the previous
>
> > > category.  The neighborhoods
>
> > around each ob site
>
> > > follow the HiRA method.   All the data in the mpr file lines are
>
> > > probabilities, so I want to create PSTD from these data.  So I
was
> using
>
> > > the fcst_thresh to filter for the HiRA thresholds I am
interested in.
>  I
>
> > > tried your code and added -by FCST_THRESH and got out three
>
> > > different
>
> > sets
>
> > > of values.   So my question is why did you have the
-out_fcst_thresh
> set
>
> > to
>
> > > 10 prob thresolds?  Why wouldn't ge0 pick  up all probabilities?
I
>
> > > am not understanding how these thresholds are being used.
>
> > >
>
> > > Thanks
>
> > > Bob
>
> > >
>
> > > -----Original Message-----
>
> > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>
> > > Sent: Monday, May 16, 2016 4:54 PM
>
> > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
>
> > > <robert.craig.2 at us.af.mil>
>
> > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
> > >
>
> > > Bob,
>
> > >
>
> > > Thanks for sending the sample data.  I agree that STAT-Analysis
can
>
> > > get pretty confusing.  It has a lot of flexibility, but we
really
>
> > > need to think through what you're trying to do.
>
> > >
>
> > > First, regarding the error you're getting.  Unfortunately, the
>
> > > config file string parser is writing a temp file in the current
> "runtime"
>
> > directory.
>
> > > The error is from the fact that you don't have permission to
write
>
> > > the file "config_23325_0_.temp" in the current directory.
>
> > > Ultimately, we should change that to use the temp directory
instead.
>
> > >
>
> > > Next, I looked at the data you sent to me.  Listed below are the
>
> > > unique combinations of just a few of the header columns:
>
> > >
>
> > > FCST_VAR FCST_THRESH LINE_TYPE TOTAL
>
> > > APCP         >=1                  MPR           14666
>
> > > APCP         >=25                MPR           14666
>
> > > APCP         >=50                MPR           14666
>
> > > CEIL           <=1000             MPR           11926
>
> > > CEIL           <=100               MPR           11926
>
> > > CEIL           <=300               MPR           11926
>
> > >
>
> > > Based on this, it looks like you have a lot of duplicate matched
>
> > > pair
>
> > > (MPR) output lines... We have the same 14666 pairs for APCP
repeated
>
> > > 3 times followed by the same 11926 pairs for CEIL repeated 3
times.
>
> > > This isn't necessary.  Instead, the FCST_THRESH and OBS_THRESH
>
> > > columns for the MPR line type should be set to "NA".  The MPR
line
>
> > > type that Point-Stat creates just contains the paired forecast
and
>
> > > observation
>
> > values.
>
> > > Thresholds do not apply to this line type.
>
> > >
>
> > > I posted an updated version of your file to the ftp site.  I
>
> > > stripped it down to 14666 APCP lines and 11926 CEIL lines with
NA in
>
> > > the FCST_THRESH and OBS_THRESH columns:
>
> > >
>
> > >
>
> > >
ftp://ftp.rap.ucar.edu/incoming/irap/met_help/craig_data/point_stat_
>
> > > 3_ galwem_120000L_20160501_120000V_JHG.stat
>
> > >
>
> > > Looking at the values in the FCST column, I see numbers between
0
>
> > > and
>
> > > 1 (0, 0.11, 0.22, 0.33, ...).  Looking in the OBS column, I see
2
>
> > > numbers (0 or 1).  And looking at your config file, it looks
like
>
> > > you want to use these MPR lines to compute probabilistic output.
>
> > > MET verifies probabilities using an Nx2 contingency table.  You
use
>
> > "-out_fcst_thresh"
>
> > > to select the probabilistic thresholds to be applied and
>
> > "-out_obs_thresh"
>
> > > to select the observation threshold to be applied.
>
> > >
>
> > > Here's a stat-analysis job you could run to read the MPR lines,
>
> > > define the probabilistic forecast thresholds, define the single
>
> > > observation threshold, and compute a PSTD output line.  Using "-
by
>
> > > FCST_VAR" tells it to run the job separately for each unique
entry
>
> > > found in the FCST_VAR
>
> > column.
>
> > >
>
> > > /usr/local/met-5.1/bin/stat_analysis \
>
> > >    -lookin point_stat_3_galwem_120000L_20160501_120000V_JHG.stat
\
>
> > >    -job aggregate_stat -line_type MPR -out_line_type PSTD \
>
> > >    -out_fcst_thresh
>
> > > ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0
\
>
> > >    -out_obs_thresh eq1.0 \
>
> > >    -by FCST_VAR \
>
> > >    -out_stat out_pstd.txt
>
> > >
>
> > > The output statistics are written to "out_pstd.txt".
>
> > >
>
> > > Hope that helps.
>
> > >
>
> > > John
>
> > >
>
> > >
>
> > >
>
> > >
>
> > >
>
> > >
>
> > >
>
> > > On Mon, May 16, 2016 at 2:47 PM, robert.craig.2 at us.af.mil via RT
<
>
> > > met_help at ucar.edu> wrote:
>
> > >
>
> > > >
>
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361
>
>
> > > >
>
> > > > Thanks John, I knew the data was space delimited but forgot to
>
> > > > check the header.  As usual with MET, I progressed further but
am
>
> > > > hitting a
>
> > new
>
> > > > error.   See below.   I pushed the the config file to the ftp
>
> > directory.
>
> > > > As you can see, the -tmp_dir is set to
>
> > /h/data/global/WXQC/data/met/tmp.
>
> > > > This directory permissions are wide open - infact stat_anal
temp
>
> > > > files
>
> > > are
>
> > > > in there.   Does the config*.temp try to write somewhere else?
>
> > > >
>
> > > > Also, notice in the command line options, there are three
thesholds.
>
> > > > MET kept telling me that I had to have three since this is
>
> > > > probability
>
> > > data.
>
> > > > Also, the latest MPR files (.stat) are in the ftp dir.  As you
can
>
> > > > see I generated model/ob pairs using different thresholds for
the
>
> > > > forecast and observation data.  So this is where I get
confused: I
>
> > > > assume the fcst thresh in the config file is a filter to pull
>
> > > > those lines that have the threshold I want.  I am not sure
what
>
> > > > the -out_fcst_thresh in the command line is doing.  If it is
>
> > > > filtering the mpr line fcst data, then I would think I would
set
>
> > > > it to ge 0 for the fcst and ob since the fcst and ob data
range
>
> > > > from 0 to 1.  Am
>
> > I handling this correctly?
>
> > > >
>
> > > > Thanks
>
> > > > BOb
>
> > > >
>
> > > > ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
>
> > > > '/h/data/global/WXQC/data/met/mdlob_pairs', '-out',
>
> > > >
'/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z',
>
> > > > '-config',
>
> > > > '/h/WXQC/met-
5.1/data/config/STATAnalysisConfig_hira_bs_updated',
>
> > > > '-out_fcst_thresh ge0,ge0.5,ge1', '-out_obs_thresh ge0', '-v',
>
> > > > '6'] DEBUG 1: Creating STAT-Analysis output file
>
> > > >
"/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z"
>
> > > > DEBUG 1: Default Config File:
>
> > > > /home/qcteam/met-
5.1/share/met/config/STATAnalysisConfig_default
>
> > > > DEBUG 1: User Config File:
>
> > > > /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
>
> > > > DEBUG 4: Default Job from the config file: "-model GALWEM
>
> > > > -fcst_lead
>
> > > > 120000 -fcst_init_beg 20160501_000000 -fcst_init_end
>
> > > > 20160502_000000 -fcst_init_hour 000000 -fcst_var APCP
-fcst_thresh
>
> > > > >=1 -line_type MPR -vif_flag 1 "
>
> > > > DEBUG 4: Amending default job with command line options:
>
> > > > "-out_fcst_thresh
>
> > > > ge0,ge0.5,ge1 -out_obs_thresh ge0"
>
> > > > DEBUG 3: Processing STAT file
>
> > > >
>
> > >
>
> >
>
"/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat"
>
> > > > ... 1 of 10
>
> > > > ERROR  :
>
> > > > ERROR  :
>
> > > > ERROR  :   MetConfig::read_string(const char *) -> unable to
open
> temp
>
> > > > file "config_23325_0_.temp"
>
> > > > ERROR  :
>
> > > >
>
> > > > -----Original Message-----
>
> > > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>
> > > > Sent: Friday, May 13, 2016 6:27 PM
>
> > > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
>
> > > > <robert.craig.2 at us.af.mil>
>
> > > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
> > > >
>
> > > > Bob,
>
> > > >
>
> > > > The problem is coming from the first line of the file you sent to
> > > > me.  It contains a comma-separated list of header column names.
> > > >
> > > > I'm not exactly sure where you pulled those header column names,
> > > > but that's the problem.  MET expects data to be separated by
> > > > whitespace... so it interprets that long string with a bunch of
> > > > commas as a single column.  The error comes when it tries to read
> > > > the "second" column.  If you just remove that first line, it
> > > > should run fine.
>
> > > >
>
> > > > If you do want header columns, here's a trick.  Run the following
> > > > job:
> > > >
> > > > stat_analysis -lookin point_stat_3_galwem_120000L_20160501_120000V.stat \
> > > >    -job filter -line_type MPR -dump_row out.stat
> > > >
> > > > The file out.stat will now contain the full header for the MPR
> > > > line type.  When you select a single LINE_TYPE value, stat-analysis
> > > > will write the full header for that line type to the output.
>
> > > >
>
> > > > Have a good weekend.
>
> > > >
>
> > > > John
>
> > > >
>
> > > >
>
> > > > On Fri, May 13, 2016 at 10:26 AM, robert.craig.2 at us.af.mil via
RT
>
> > > > < met_help at ucar.edu> wrote:
>
> > > >
>
> > > > >
>
> > > > > Fri May 13 10:26:26 2016: Request 76361 was acted upon.
>
> > > > > Transaction: Ticket created by robert.craig.2 at us.af.mil
>
> > > > >        Queue: met_help
>
> > > > >      Subject: Statanalysis Question
>
> > > > >        Owner: Nobody
>
> > > > >   Requestors: robert.craig.2 at us.af.mil
>
> > > > >       Status: new
>
> > > > >  Ticket <URL:
>
> > > > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361
>
> > > > > >
>
> > > > >
>
> > > > >
>
> > > > > John, I am getting the following error when running statanalysis.
> > > > > The forecast times and valid times seem to be correct to me, so I
> > > > > am not sure of the cause of the error.
> > > > >
> > > > > /h/WXQC/met-5.1/bin/stat_analysis -lookin /h/data/global/WXQC/data/met/mdlob_pairs -out /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z -config /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated -v 6
> > > > > DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z"
> > > > > DEBUG 1: Default Config File: /home/qcteam/met-5.1/share/met/config/STATAnalysisConfig_default
> > > > > DEBUG 1: User Config File: /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> > > > > DEBUG 4: Default Job from the config file: "-model GALWEM -fcst_lead 120000 -fcst_init_beg 20160501_000000 -fcst_init_end 20160501_000000 -fcst_init_hour 120000 -fcst_var APCP -fcst_thresh >=50 -line_type MPR -vif_flag 1 "
> > > > > DEBUG 4: Amending default job with command line options: "(nul)"
> > > > > DEBUG 3: Processing STAT file "/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat" ... 1 of 2
> > > > > ERROR  :
> > > > > ERROR  : DataLine::get_item(int) -> range check error
> > > > > ERROR  :
>
> > > > >
>
> > > > > The config file and data file are on your server.

------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #76361] Statanalysis Question
From: robert.craig.2 at us.af.mil
Time: Thu May 26 11:01:09 2016

Okay, makes sense.

Thanks

-----Original Message-----
From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
Sent: Thursday, May 26, 2016 11:31 AM
To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
<robert.craig.2 at us.af.mil>
Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question

Bob,

If you tell STAT-Analysis to "-lookin" a directory, it will
recursively search that directory looking for files ending in ".stat".
If you instead tell STAT-Analysis to "-lookin" one or more specific
file names, it will process them regardless of their suffix.

So you can either name your PCT output files as "_pct.stat" or you can
keep them as "_pct.txt" and change your STAT-Analysis job command to
something like:
  -lookin /h/data/global/WXQC/data/met/ens_cont_tbl/*_pct.txt
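
The rule described above can be sketched in a few lines of Python. This is only an illustration of the behavior as described in this message, not MET's actual implementation; the helper name `resolve_lookin` and the demo file names are made up for the example.

```python
import os
import tempfile


def resolve_lookin(paths, suffix=".stat"):
    """Hypothetical helper mimicking the -lookin rule described above.

    Directories are searched recursively for files ending in `suffix`;
    explicitly named files are kept regardless of their suffix.
    """
    found = []
    for path in paths:
        if os.path.isdir(path):
            for root, _dirs, files in os.walk(path):
                for name in sorted(files):
                    if name.endswith(suffix):
                        found.append(os.path.join(root, name))
        else:
            found.append(path)
    return found


# Demo with made-up file names in a scratch directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a_pct.txt", "b_pct.stat"):
        open(os.path.join(tmp, name), "w").close()

    from_dir = [os.path.basename(p) for p in resolve_lookin([tmp])]
    from_file = [os.path.basename(p)
                 for p in resolve_lookin([os.path.join(tmp, "a_pct.txt")])]

print(from_dir)   # only the .stat file is picked up from the directory
print(from_file)  # the .txt file is used when it is named explicitly
```

Note that a wildcard such as *_pct.txt is normally expanded into explicit file names by the shell. If the command is launched without a shell, the wildcard would have to be expanded first, for example with Python's glob module.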

Make sense?

Thanks,
John

On Thu, May 26, 2016 at 9:55 AM, robert.craig.2 at us.af.mil via RT <
met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>
> Hi John, I am to the point where I am trying to read the pct.txt
> files to generate the PSTD files.  When I do this using
> /h/WXQC/met-5.1/bin/stat_analysis  -lookin
> /h/data/global/WXQC/data/met/ens_cont_tbl -out
> /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z
> -line_type PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH -v 6,
> I get the following:
>
>
>
> ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
> '/h/data/global/WXQC/data/met/ens_cont_tbl/', '-out',
> '/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z',
> '-line_type PCT', '-out_line_type PSTD', '-by FCST_VAR',
> '-by FCST_THRESH', '-v', '6']
>
> DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z"
> DEBUG 4: Amending default job with command line options: "-line_type PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH"
> ERROR  :
> ERROR  : process_search_dirs() -> no STAT files found in the directories specified!
> ERROR  :
> ERROR  :
> ERROR  : main() -> encountered an error value of 1.  Calling clean_up() and usage() before exiting.
> ERROR  :
>
>
>
> *** Model Evaluation Tools (METV5.1) ***
>
>
>
> The ens_cont_tbl directory contains 3 files and I placed them on your
> ftp server.  I didn’t think the file naming convention was important
> here as long as they end in _pct.txt.  I did try renaming one to the
> more conventional naming scheme – point_stat … but that didn’t make a
> difference.  So why is it not finding my files?
>
>
>
> Thanks
>
> Bob
>
>
>
> -----Original Message-----
> From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> Sent: Monday, May 23, 2016 11:56 AM
> To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> <robert.craig.2 at us.af.mil>
> Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
>
>
> Bob,
>
>
>
> Surprisingly, I was able to replicate the same error you're seeing!
> I see where the error is occurring, but I don't yet understand why
> it's happening.  However, I do have a workaround for you.
>
>
>
> Try editing your config file by emptying out the "fcst_thresh"
setting:
>
>    fcst_thresh = [];
>
>
>
> In your job, you're using "-by fcst_thresh" anyway, so STAT-Analysis
> will group the data by the FCST_THRESH column.  After I removed the
> "fcst_thresh" setting, the job completed.  I'll continue looking into
> the reason for that error.
>
>
>
> How many of these files are you planning to pass to STAT-Analysis at
> any given time?  I see each sample file contains about 80,000 MPR
> lines.  Passing it 5 files to process about 400,000 lines, that job
> takes about 62 seconds to run on my machine.  I worry that as you
> increase the number of files, it'll run very slowly.
>
>
>
> Here's some alternative logic you might consider.  Run STAT-Analysis
> once for each .stat file you're generating.  Instead of writing the
> PSTD line type, write the PCT line type (that's just the counts of
> that probabilistic Nx2 table).  Then run jobs to aggregate the PCT
> line types and compute PSTD lines (-job aggregate_stat -line_type PCT
> -out_line_type PSTD).  That would make the processing for each
> STAT-Analysis job much more manageable.
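
The two-step approach relies on the PCT line holding plain counts, so tables from separate files can be added together before any statistics are computed. A minimal sketch of that idea follows; the function name `add_tables` and all of the counts are invented for illustration and are not taken from MET or from the sample data.

```python
def add_tables(tables):
    """Sum Nx2 count tables element by element (all must share the same bins)."""
    tables = list(tables)
    n_bins = len(tables[0])
    if any(len(t) != n_bins for t in tables):
        raise ValueError("all tables must use the same probability bins")
    total = [[0, 0] for _ in range(n_bins)]
    for table in tables:
        for i, (events, non_events) in enumerate(table):
            total[i][0] += events
            total[i][1] += non_events
    return total


# Made-up per-file counts: rows are probability bins,
# columns are [observed events, observed non-events].
day1 = [[1, 40], [5, 12], [9, 3]]
day2 = [[0, 35], [7, 10], [11, 2]]

combined = add_tables([day1, day2])
print(combined)
```

Because only the small count tables are carried into the second step, the aggregation job no longer has to re-read every matched pair.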
>
>
>
> BUT in order to do this in 2 steps, you would really need to put the
> neighborhood size information into the INTERP_MTHD and INTERP_PNTS
> columns.  Otherwise, the threshold information won't be retained in
> the PCT output lines.  And I also found an issue in the formatting of
> the OBS_THRESH output column ("=1" in the output should really be
> "==1").  For now I just switched to "ge1", but I need to fix this
> formatting issue.
>
>
>
> Listed below are the c-shell commands I used to loop over your sample
> files and run stat_analysis in this way...
>
>
>
> # Loop through MPR files and compute PCT output lines
> foreach mpr_file (`ls stat_mpr/*.stat`)
>
>    # Map stat_mpr/NAME.stat to stat_pct/NAME_pct.stat.  The second
>    # pattern is anchored so only the trailing ".stat" suffix changes
>    # (an unanchored ".stat" also matches the "_stat" in "point_stat").
>    set pct_file = `echo $mpr_file | sed 's/stat_mpr/stat_pct/g' | sed 's/\.stat$/_pct.stat/'`
>
>    /usr/local/met-5.1/bin/stat_analysis -lookin $mpr_file -out_stat $pct_file \
>       -job aggregate_stat -line_type MPR -out_line_type PCT \
>       -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,FCST_THRESH \
>       -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
>       -out_obs_thresh ge1
>
> end
>
>
>
> # Aggregate PCT lines and compute PSTD stats
> /usr/local/met-5.1/bin/stat_analysis -lookin stat_pct -out_stat pstd.stat \
>    -job aggregate_stat -line_type PCT -out_line_type PSTD \
>    -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,INTERP_MTHD,INTERP_PNTS
>
>
>
> Note that this job is combining all of the neighborhood sizes because
> INTERP_MTHD and INTERP_PNTS are set the same in the PCT lines.
>
>
>
> Thanks,
>
> John
>
>
>
>
>
>
>
> On Fri, May 20, 2016 at 3:33 PM, robert.craig.2 at us.af.mil via RT <
> met_help at ucar.edu> wrote:
>
>
>
> >
>
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>
> >
>
> > John, I might have found more information on the error below:
>
> >
>
> > MetConfig::read_string(const char *) -> unable to open temp file
>
> > "config_1943_0_.temp"
>
> >
>
> > When I run stat_anal on MPR files, no problems if I process 2 days
> > of data, but when I increase it to three days, the error occurs.  I
> > also played with the dates to try to eliminate a particular data
> > file as the cause.  So, it seems to be related to how many data
> > files it has to process.  I posted all the data files on the ftp
> > server and my command line is below.  The config file on the server
> > is still representative.  This error does not seem related to
> > directory permissions.
> >
> > /h/WXQC/met-5.1/bin/stat_analysis  -lookin
> > /h/data/global/WXQC/data/met/mdlob_pairs -out
> > /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0
> > -config /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> > -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0
> > -out_obs_thresh eq1 -by FCST_VAR -by FCST_THRESH -v 6
> >
> > This is concerning since I have to run this on many more than two
> > days worth of data.  Any idea what might be happening?
>
> >
>
> > Thanks
>
> > Bob
>
> >
>
> > -----Original Message-----
>
> > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>
> > Sent: Thursday, May 19, 2016 4:31 PM
>
> > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
>
> > <robert.craig.2 at us.af.mil>
>
> > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
> >
>
> > Bob,
>
> >
>
> > It's funny, I was just talking to a colleague today about doing
>
> > something very similar to this on a different dataset.
>
> >
>
> > I think I understand what you're saying about the thresholds >1, >25,
> > and >50.  These are used to define the "event" which is used in
> > computing the fractional coverage fields.  My confusion comes from
> > the fact that in MET currently only Grid-Stat is computing these
> > fractional coverage fields, not Point-Stat.  But I see now what
> > you're doing.
>
> >
>
> > One suggestion would be to change the contents of the INTERP_MTHD
> > and INTERP_PNTS header columns.  You currently have NEAREST, 1 which
> > would indicate that each observation value was matched to the
> > forecast value at the nearest grid point.  Instead, I would suggest
> > writing NBRHD and N, where N indicates the number of points in the
> > neighborhood.  For example, the NBRHD output from Grid-Stat would
> > write 49 if we were using a 7x7 box.
>
> >
>
> > The -fcst_thresh and -obs_thresh options are used to filter the
> > input MPR lines, as you already know.  The -out_fcst_thresh and
> > -out_obs_thresh options define the thresholds to be applied when
> > computing the output for the job.  In MET, probabilities are not
> > processed in a "continuous" way.  Instead, they are put into
> > probability bins.  Those bins are used to create an Nx2 contingency
> > table from which probabilistic statistics are computed.
>
> >
>
> > Setting "-out_fcst_thresh gt0,gt0.25,gt0.5,gt0.75,gt1.0" defines 4
>
> > probability bins which yields a 4x2 contingency table, from which
>
> > stats are computed.
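
To make the binning concrete, here is a small sketch of how five thresholds give four probability bins and a 4x2 table of counts. It is an illustration only: the function name `nx2_table`, the sample pairs, and the bin-edge convention (each bin includes its lower edge, and the last bin also includes 1.0) are assumptions, and MET's own edge handling may differ.

```python
from bisect import bisect_right


def nx2_table(pairs, thresholds, obs_event=lambda obs: obs >= 1.0):
    """Count (forecast probability, observation) pairs into an Nx2 table.

    thresholds: ascending bin edges, e.g. [0, 0.25, 0.5, 0.75, 1.0] -> 4 bins.
    Returns one [event_count, non_event_count] row per probability bin.
    Assumed convention: bin i covers [edge_i, edge_i+1), and the last bin
    also includes the top edge.
    """
    n_bins = len(thresholds) - 1
    table = [[0, 0] for _ in range(n_bins)]
    for prob, obs in pairs:
        if prob < thresholds[0] or prob > thresholds[-1]:
            raise ValueError(f"probability {prob} is outside the bin range")
        i = min(bisect_right(thresholds, prob) - 1, n_bins - 1)
        table[i][0 if obs_event(obs) else 1] += 1
    return table


# Made-up matched pairs: forecast probability and a 0/1 observation.
pairs = [(0.0, 0), (0.11, 0), (0.33, 1), (0.55, 1), (0.78, 0), (1.0, 1)]
table = nx2_table(pairs, [0, 0.25, 0.5, 0.75, 1.0])
print(table)
```

A single threshold such as ge0 would leave no bin boundaries inside the 0 to 1 range, so there would be no table from which to compute probabilistic statistics.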
>
> >
>
> > Hope that helps.
>
> >
>
> > Thanks,
>
> > John
>
> >
>
> > On Thu, May 19, 2016 at 2:23 PM, robert.craig.2 at us.af.mil via RT <
>
> > met_help at ucar.edu> wrote:
>
> >
>
> > >
>
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>
> > >
>
> > > Thanks John, the directory problem was due to some corruption on
> > > the MET we had on one of the systems.  On another system the
> > > problem doesn't come up so we are hoping a recompile of MET on said
> > > system will clear up the issue.
>
> > >
>
> > > As far as the second comment, I don't think you interpreted what I
> > > am doing correctly.  In each file, there are three sets of data for
> > > each variable.  They are not identical since the first set is the
> > > ob neighborhood data for precip > 1.  The next set is the ob
> > > neighborhood data for precip > 25, and the same for precip > 50.
> > > If you compare the model and ob data, the model data should be
> > > different (for some obs) than the model data for the previous
> > > category.  The neighborhoods around each ob site follow the HiRA
> > > method.  All the data in the mpr file lines are probabilities, so I
> > > want to create PSTD from these data.  So I was using the
> > > fcst_thresh to filter for the HiRA thresholds I am interested in.
> > > I tried your code and added -by FCST_THRESH and got out three
> > > different sets of values.  So my question is why did you have the
> > > -out_fcst_thresh set to 10 prob thresholds?  Why wouldn't ge0 pick
> > > up all probabilities?  I am not understanding how these thresholds
> > > are being used.
>
> > >
>
> > > Thanks
>
> > > Bob
>
> > >
>
> > > -----Original Message-----
>
> > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>
> > > Sent: Monday, May 16, 2016 4:54 PM
>
> > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
>
> > > <robert.craig.2 at us.af.mil>
>
> > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
> > >
>
> > > Bob,
>
> > >
>
> > > Thanks for sending the sample data.  I agree that STAT-Analysis
> > > can get pretty confusing.  It has a lot of flexibility, but we
> > > really need to think through what you're trying to do.
>
> > >
>
> > > First, regarding the error you're getting.  Unfortunately, the
> > > config file string parser is writing a temp file in the current
> > > "runtime" directory.  The error is from the fact that you don't
> > > have permission to write the file "config_23325_0_.temp" in the
> > > current directory.  Ultimately, we should change that to use the
> > > temp directory instead.
>
> > >
>
> > > Next, I looked at the data you sent to me.  Listed below are the
>
> > > unique combinations of just a few of the header columns:
>
> > >
>
> > > FCST_VAR  FCST_THRESH  LINE_TYPE  TOTAL
> > > APCP      >=1          MPR        14666
> > > APCP      >=25         MPR        14666
> > > APCP      >=50         MPR        14666
> > > CEIL      <=1000       MPR        11926
> > > CEIL      <=100        MPR        11926
> > > CEIL      <=300        MPR        11926
>
> > >
>
> > > Based on this, it looks like you have a lot of duplicate matched
> > > pair (MPR) output lines... We have the same 14666 pairs for APCP
> > > repeated 3 times followed by the same 11926 pairs for CEIL repeated
> > > 3 times.  This isn't necessary.  Instead, the FCST_THRESH and
> > > OBS_THRESH columns for the MPR line type should be set to "NA".
> > > The MPR line type that Point-Stat creates just contains the paired
> > > forecast and observation values.  Thresholds do not apply to this
> > > line type.
>
> > >
>
> > > I posted an updated version of your file to the ftp site.  I
> > > stripped it down to 14666 APCP lines and 11926 CEIL lines with NA
> > > in the FCST_THRESH and OBS_THRESH columns:
> > >
> > > ftp://ftp.rap.ucar.edu/incoming/irap/met_help/craig_data/point_stat_3_galwem_120000L_20160501_120000V_JHG.stat
>
> > >
>
> > > Looking at the values in the FCST column, I see numbers between 0
> > > and 1 (0, 0.11, 0.22, 0.33, ...).  Looking in the OBS column, I see
> > > 2 numbers (0 or 1).  And looking at your config file, it looks like
> > > you want to use these MPR lines to compute probabilistic output.
> > > MET verifies probabilities using an Nx2 contingency table.  You use
> > > "-out_fcst_thresh" to select the probabilistic thresholds to be
> > > applied and "-out_obs_thresh" to select the observation threshold
> > > to be applied.
>
> > >
>
> > > Here's a stat-analysis job you could run to read the MPR lines,
> > > define the probabilistic forecast thresholds, define the single
> > > observation threshold, and compute a PSTD output line.  Using
> > > "-by FCST_VAR" tells it to run the job separately for each unique
> > > entry found in the FCST_VAR column.
>
> > >
>
> > > /usr/local/met-5.1/bin/stat_analysis \
> > >    -lookin point_stat_3_galwem_120000L_20160501_120000V_JHG.stat \
> > >    -job aggregate_stat -line_type MPR -out_line_type PSTD \
> > >    -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
> > >    -out_obs_thresh eq1.0 \
> > >    -by FCST_VAR \
> > >    -out_stat out_pstd.txt
>
> > >
>
> > > The output statistics are written to "out_pstd.txt".
>
> > >
>
> > > Hope that helps.
>
> > >
>
> > > John
>
> > >
>
> > >
>
> > >
>
> > >
>
> > >
>
> > >
>
> > >
>
> > > On Mon, May 16, 2016 at 2:47 PM, robert.craig.2 at us.af.mil via RT
<
>
> > > met_help at ucar.edu> wrote:
>
> > >
>
> > > >
>
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361
>
>
> > > >
>
> > > > Thanks John, I knew the data was space delimited but forgot to
> > > > check the header.  As usual with MET, I progressed further but am
> > > > hitting a new error.  See below.  I pushed the config file to the
> > > > ftp directory.  As you can see, the -tmp_dir is set to
> > > > /h/data/global/WXQC/data/met/tmp.  This directory's permissions
> > > > are wide open; in fact, stat_anal temp files are in there.  Does
> > > > the config*.temp try to write somewhere else?
>
> > > >
>
> > > > Also, notice in the command line options, there are three
> > > > thresholds.  MET kept telling me that I had to have three since
> > > > this is probability data.  Also, the latest MPR files (.stat) are
> > > > in the ftp dir.  As you can see I generated model/ob pairs using
> > > > different thresholds for the forecast and observation data.  So
> > > > this is where I get confused: I assume the fcst thresh in the
> > > > config file is a filter to pull those lines that have the
> > > > threshold I want.  I am not sure what the -out_fcst_thresh in the
> > > > command line is doing.  If it is filtering the mpr line fcst data,
> > > > then I would think I would set it to ge 0 for the fcst and ob
> > > > since the fcst and ob data range from 0 to 1.  Am I handling this
> > > > correctly?
>
> > > >
>
> > > > Thanks
>
> > > > Bob
>
> > > >
>
> > > > ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
> > > > '/h/data/global/WXQC/data/met/mdlob_pairs', '-out',
> > > > '/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z',
> > > > '-config',
> > > > '/h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated',
> > > > '-out_fcst_thresh ge0,ge0.5,ge1', '-out_obs_thresh ge0', '-v', '6']
> > > > DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z"
> > > > DEBUG 1: Default Config File: /home/qcteam/met-5.1/share/met/config/STATAnalysisConfig_default
> > > > DEBUG 1: User Config File: /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> > > > DEBUG 4: Default Job from the config file: "-model GALWEM -fcst_lead 120000 -fcst_init_beg 20160501_000000 -fcst_init_end 20160502_000000 -fcst_init_hour 000000 -fcst_var APCP -fcst_thresh >=1 -line_type MPR -vif_flag 1 "
> > > > DEBUG 4: Amending default job with command line options: "-out_fcst_thresh ge0,ge0.5,ge1 -out_obs_thresh ge0"
> > > > DEBUG 3: Processing STAT file "/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat" ... 1 of 10
> > > > ERROR  :
> > > > ERROR  :
> > > > ERROR  :   MetConfig::read_string(const char *) -> unable to open temp file "config_23325_0_.temp"
> > > > ERROR  :
>
> > > >
>
> > > > -----Original Message-----
>
> > > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>
> > > > Sent: Friday, May 13, 2016 6:27 PM
>
> > > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
>
> > > > <robert.craig.2 at us.af.mil>
>
> > > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
> > > >
>
> > > > Bob,
>
> > > >
>
> > > > The problem is coming from the first line of the file you sent to
> > > > me.  It contains a comma-separated list of header column names.
> > > >
> > > > I'm not exactly sure where you pulled those header column names,
> > > > but that's the problem.  MET expects data to be separated by
> > > > whitespace... so it interprets that long string with a bunch of
> > > > commas as a single column.  The error comes when it tries to read
> > > > the "second" column.  If you just remove that first line, it
> > > > should run fine.
>
> > > >
>
> > > > If you do want header columns, here's a trick.  Run the following
> > > > job:
> > > >
> > > > stat_analysis -lookin point_stat_3_galwem_120000L_20160501_120000V.stat \
> > > >    -job filter -line_type MPR -dump_row out.stat
> > > >
> > > > The file out.stat will now contain the full header for the MPR
> > > > line type.  When you select a single LINE_TYPE value, stat-analysis
> > > > will write the full header for that line type to the output.
>
> > > >
>
> > > > Have a good weekend.
>
> > > >
>
> > > > John
>
> > > >
>
> > > >
>
> > > > On Fri, May 13, 2016 at 10:26 AM, robert.craig.2 at us.af.mil via
> > > > RT
>
> > > > < met_help at ucar.edu> wrote:
>
> > > >
>
> > > > >
>
> > > > > Fri May 13 10:26:26 2016: Request 76361 was acted upon.
>
> > > > > Transaction: Ticket created by robert.craig.2 at us.af.mil
>
> > > > >        Queue: met_help
>
> > > > >      Subject: Statanalysis Question
>
> > > > >        Owner: Nobody
>
> > > > >   Requestors: robert.craig.2 at us.af.mil
>
> > > > >       Status: new
>
> > > > >  Ticket <URL:
>
> > > > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361
>
> > > > > >
>
> > > > >
>
> > > > >
>
> > > > > John, I am getting the following error when running statanalysis.
> > > > > The forecast times and valid times seem to be correct to me, so I
> > > > > am not sure of the cause of the error.
> > > > >
> > > > > /h/WXQC/met-5.1/bin/stat_analysis -lookin /h/data/global/WXQC/data/met/mdlob_pairs -out /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z -config /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated -v 6
> > > > > DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z"
> > > > > DEBUG 1: Default Config File: /home/qcteam/met-5.1/share/met/config/STATAnalysisConfig_default
> > > > > DEBUG 1: User Config File: /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> > > > > DEBUG 4: Default Job from the config file: "-model GALWEM -fcst_lead 120000 -fcst_init_beg 20160501_000000 -fcst_init_end 20160501_000000 -fcst_init_hour 120000 -fcst_var APCP -fcst_thresh >=50 -line_type MPR -vif_flag 1 "
> > > > > DEBUG 4: Amending default job with command line options: "(nul)"
> > > > > DEBUG 3: Processing STAT file "/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat" ... 1 of 2
> > > > > ERROR  :
> > > > > ERROR  : DataLine::get_item(int) -> range check error
> > > > > ERROR  :
>
> > > > >
>
> > > > > The config file and data file are on your server.



------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #76361] Statanalysis Question
From: robert.craig.2 at us.af.mil
Time: Thu May 26 11:51:28 2016

Well, that fixed that issue and it got me to my next error:

/gpfs/h/data/global/WXQC/data/temp
['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
'/h/data/global/WXQC/data/met/ens_cont_tbl', '-out',
'/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_25_3_PSTD_0Z',
'-out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0',
'-out_obs_thresh eq1', '-line_type PCT', '-out_line_type PSTD',
'-by FCST_VAR', '-by FCST_THRESH', '-v', '6']
DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_25_3_PSTD_0Z"
DEBUG 4: Amending default job with command line options: "-out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 -out_obs_thresh eq1 -line_type PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH"
DEBUG 3: Processing STAT file "/h/data/global/WXQC/data/met/ens_cont_tbl/GALWEM_APCP_12hr_1_3_PCT_0Z_pct.stat" ... 1 of 3
ERROR  :
ERROR  : DataLine::get_item(int) -> range check error
ERROR  :

Any ideas?
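
Earlier in this ticket the same "range check error" came from a comma-separated header line that MET read as a single column. Whether that is the cause here is not established, but a quick column count can rule it in or out. The sketch below is a generic diagnostic, not part of MET; the function name `find_short_lines`, the sample lines, and the `min_cols` value are all made up for illustration.

```python
def find_short_lines(lines, min_cols):
    """Return (line_number, column_count) for lines with too few columns.

    Columns are split on whitespace, which is how this thread says MET
    expects STAT data to be separated.  `min_cols` is whatever column
    count the caller expects for the line type being read.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        n_cols = len(line.split())
        if n_cols < min_cols:
            problems.append((number, n_cols))
    return problems


# Made-up sample: a comma-separated header followed by a normal data row.
sample = [
    "VERSION,MODEL,FCST_LEAD,LINE_TYPE",
    "V5.1 GALWEM 120000 PCT",
]
bad = find_short_lines(sample, min_cols=4)
print(bad)
```

In practice the lines would come from the file named in the "Processing STAT file" message, for example by passing an open file object as `lines`.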

-----Original Message-----
From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
Sent: Thursday, May 26, 2016 11:31 AM
To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
<robert.craig.2 at us.af.mil>
Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question

Bob,

If you tell STAT-Analysis to "-lookin" a directory, it will
recursively search that directory looking for files ending in ".stat".
If you instead tell STAT-Analysis to "-lookin" one or more specific
file names, it will process them regardless of their suffix.

So you can either name your PCT output files as "_pct.stat" or you can
keep them as "_pct.txt" and change your STAT-Analysis job command to
something
like:
  -lookin /h/data/global/WXQC/data/met/ens_cont_tbl/*_pct.txt

Make sense?

Thanks,
John

On Thu, May 26, 2016 at 9:55 AM, robert.craig.2 at us.af.mil via RT <
met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>
> Hi John, I am to the point where I am trying to read the pct.txt
files
> to generate the PSTD files.  When I do this using
> /h/WXQC/met-5.1/bin/stat_analysis  -lookin
> /h/data/global/WXQC/data/met/ens_cont_tbl -out
> /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z
> -line_type PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH -v
6,
> I get the following:
>
>
>
> ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
> '/h/data/global/WXQC/data/met/ens_cont_tbl/', '-out',
>
'/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z',
> '-line_type PCT', '-out_line_type PSTD', '-by FCST_VAR', '-by
> FCST_THRESH', '-v', '6']
>
> DEBUG 1: Creating STAT-Analysis output file
> "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z"
>
> DEBUG 4: Amending default job with command line options: "-line_type
> PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH"
>
> ERROR  :
>
> ERROR  : process_search_dirs() -> no STAT files found in the
> directories specified!
>
> ERROR  :
>
> ERROR  :
>
> ERROR  : main() -> encountered an error value of 1.  Calling
> clean_up() and usage() before exiting.
>
> ERROR  :
>
>
>
> *** Model Evaluation Tools (METV5.1) ***
>
>
>
> The ens_cont_tbl directory contains 3 files and I placed them on
your
> ftp server.  I didn’t think the file naming convention was important
> here as long as they end in _pct.txt.  I did try renaming one to the
> more conventional naming scheme – point_stat … but that didn’t make
a
> difference.  So why is it not finding my files?
>
>
>
> Thanks
>
> Bob
>
>
>
> -----Original Message-----
> From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> Sent: Monday, May 23, 2016 11:56 AM
> To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> <robert.craig.2 at us.af.mil>
> Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
>
>
> Bob,
>
>
>
> Surprisingly, I was able to replicate the same error you're seeing!
> I see where the error is occurring, but I don't yet understand why
> it's happening.  However, I do have a workaround for you.
>
>
>
> Try editing your config file by emptying out the "fcst_thresh"
setting:
>
>    fcst_thresh = [];
>
>
>
> In your job, you're using "-by fcst_thresh" anyway, so STAT-Analysis
> will group the data by the FCST_THRESH column.  After I removed the
> "fcst_thresh" setting, the job completed.  I'll continue looking
into
> the reason for that error.
>
>
>
> How many of these files are you planning to pass to STAT-Analysis at
> any given time?  I see each sample file contains about 80,000 MPR
lines.
>
> Passing it 5 files to process about 400,000 lines, that job takes
> about 62 seconds to run on my machine.  I worry that as you increase
> the number of files, it'll run very slowly.
>
>
>
> Here's some alternative logic you might consider.  Run STAT-Analysis
> once for each .stat file you're generating.  Instead of writing the
> PSTD line type, write the PCT line type (that's just the counts of
> that probabilistic Nx2 table).  Then run jobs to aggregate the PCT
> line types and compute PSTD lines (-job aggregate_stat -line_type PCT
> -out_line_type PSTD).  That would make the processing for each
> STAT-Analysis job much more manageable.
>
>
>
> BUT in order to do this in 2 steps, you would really need to put the
> neighborhood size information into the INTERP_MTHD and INTERP_PNTS
> columns.  Otherwise, the threshold information won't be retained in
> the PCT output lines.  And I also found an issue in the formatting
of
> the OBS_THRESH output column ("=1" in the output should really be
> "==1").  For now I just switched to "ge1", but I need to fix this
formatting issue.
>
>
>
> Listed below are the c-shell commands I used to loop over your sample
> files and run stat_analysis in this way...
>
>
>
> # Loop through MPR files and compute PCT output lines
> foreach mpr_file (`ls stat_mpr/*.stat`)
>
>    # Map stat_mpr/NAME.stat to stat_pct/NAME_pct.stat.  The second
>    # pattern is anchored so only the trailing ".stat" suffix changes
>    # (an unanchored ".stat" also matches the "_stat" in "point_stat").
>    set pct_file = `echo $mpr_file | sed 's/stat_mpr/stat_pct/g' | sed 's/\.stat$/_pct.stat/'`
>
>    /usr/local/met-5.1/bin/stat_analysis -lookin $mpr_file -out_stat $pct_file \
>       -job aggregate_stat -line_type MPR -out_line_type PCT \
>       -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,FCST_THRESH \
>       -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
>       -out_obs_thresh ge1
>
> end
>
>
>
> # Aggregate PCT lines and compute PSTD stats
> /usr/local/met-5.1/bin/stat_analysis -lookin stat_pct -out_stat
> pstd.stat \
>
>    -job aggregate_stat -line_type PCT -out_line_type PSTD \
>
>    -by
MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,INTERP_MTHD,INTERP_PNTS
>
>
>
> Note that this job is combining all of the neighborhood sizes
because
> INTERP_MTHD and INTERP_PNTS is set the same in the PCT lines.
>
>
>
> Thanks,
>
> John
>
>
>
>
>
>
>
> On Fri, May 20, 2016 at 3:33 PM, robert.craig.2 at us.af.mil via RT <
> met_help at ucar.edu> wrote:
>
>
>
> >
>
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>
> >
>
> > John, I might have found more information on the error below:
>
> >
>
> > MetConfig::read_string(const char *) -> unable to open temp file
>
> > "config_1943_0_.temp"
>
> >
>
> > When I run stat_anal on MPR files, no problems if I  process 2
days
> > of
>
> > data, but when I increase it to three days, the error occurs.  I
> > also
>
> > played with the dates to try to eliminate a particular data file
as
>
> > the cause.  So, it seems to be related to how many data files it
has
> > to
>
> > process.   I posted all the data files on the ftp server and my
command
>
> > line is below.  The config file on the server is still
representative.
>
> > This error does not seem related to directory permissions.
>
> >
>
> > /h/WXQC/met-5.1/bin/stat_analysis  -lookin
>
> > /h/data/global/WXQC/data/met/mdlob_pairs -out
>
> > /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0
>
> > -config
> > /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
>
> > -out_fcst_thresh
>
> > ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0
>
> > -out_obs_thresh eq1 -by FCST_VAR -by FCST_THRESH -v 6
>
> >
>
> > This is concerning since I have to run this on many more than two
> > days
>
> > worth of data.  Any idea what might be happening?
>
> >
>
> > Thanks
>
> > Bob
>
> >
>
> > -----Original Message-----
>
> > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>
> > Sent: Thursday, May 19, 2016 4:31 PM
>
> > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
>
> > <robert.craig.2 at us.af.mil>
>
> > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
> >
>
> > Bob,
>
> >
>
> > It's funny, I was just talking to a colleague today about doing
>
> > something very similar to this on a different dataset.
>
> >
>
> > I think I understand what you're saying about the thresholds >1, >25,
> > and >50.  These are used to define the "event" which is used in
> > computing the fractional coverage fields.  My confusion comes from the
> > fact that in MET currently only Grid-Stat is computing these
> > fractional coverage fields, not Point-Stat.  But I see now what you're
> > doing.
> >
> > One suggestion would be to change the contents of the INTERP_MTHD and
> > INTERP_PNTS header columns.  You currently have NEAREST, 1 which would
> > indicate that each observation value was matched to the forecast value
> > at the nearest grid point.  Instead, I would suggest writing NBRHD and
> > N, where N indicates the number of points in the neighborhood.  For
> > example, the NBRHD output from Grid-Stat would write 49 if we were
> > using a 7x7 box.
> >
> > The -fcst_thresh and -obs_thresh options are used to filter the input
> > MPR lines, as you already know.  The -out_fcst_thresh and
> > -out_obs_thresh options define the thresholds to be applied when
> > computing the output for the job.  In MET, probabilities are not
> > processed in a "continuous" way.  Instead, they are put into
> > probability bins.  Those bins are used to create an Nx2 contingency
> > table from which probabilistic statistics are computed.
> >
> > Setting "-out_fcst_thresh gt0,gt0.25,gt0.5,gt0.75,gt1.0" defines 4
> > probability bins which yields a 4x2 contingency table, from which
> > stats are computed.
>
> >
>
> > Hope that helps.
>
> >
>
> > Thanks,
>
> > John
>
> >
>
> > On Thu, May 19, 2016 at 2:23 PM, robert.craig.2 at us.af.mil via RT <
>
> > met_help at ucar.edu> wrote:
>
> >
>
> > >
>
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>
> > >
>
> > > Thanks John, the directory problem was due to some corruption in
> > > the MET we had on one of the systems.  On another system the problem
> > > doesn't come up, so we are hoping a recompile of MET on said system
> > > will clear up the issue.
>
> > >
>
> > > As far as the second comment, I don't think you interpreted what I
> > > am doing correctly.  In each file, there are three sets of data for
> > > each variable.  They are not identical, since the first set is the ob
> > > neighborhood data for precip > 1, the next set is the ob neighborhood
> > > data for precip > 25, and the same for precip > 50.  If you compare
> > > the model and ob data, the model data should be different (for some
> > > obs) than the model data for the previous category.  The neighborhoods
> > > around each ob site follow the HiRA method.  All the data in the MPR
> > > file lines are probabilities, so I want to create PSTD from these
> > > data.  So I was using the fcst_thresh to filter for the HiRA
> > > thresholds I am interested in.  I tried your code and added -by
> > > FCST_THRESH and got out three different sets of values.  So my
> > > question is why did you have the -out_fcst_thresh set to 10 prob
> > > thresholds?  Why wouldn't ge0 pick up all probabilities?  I am not
> > > understanding how these thresholds are being used.
>
> > >
>
> > > Thanks
>
> > > Bob
>
> > >
>
> > > -----Original Message-----
>
> > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>
> > > Sent: Monday, May 16, 2016 4:54 PM
>
> > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
>
> > > <robert.craig.2 at us.af.mil>
>
> > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
> > >
>
> > > Bob,
>
> > >
>
> > > Thanks for sending the sample data.  I agree that STAT-Analysis
> > > can
>
> > > get pretty confusing.  It has a lot of flexibility, but we
really
>
> > > need to think through what you're trying to do.
>
> > >
>
> > > First, regarding the error you're getting.  Unfortunately, the
>
> > > config file string parser is writing a temp file in the current
> "runtime"
>
> > directory.
>
> > > The error is from the fact that you don't have permission to
write
>
> > > the file "config_23325_0_.temp" in the current directory.
>
> > > Ultimately, we should change that to use the temp directory
instead.
>
> > >
>
> > > Next, I looked at the data you sent to me.  Listed below are the
>
> > > unique combinations of just a few of the header columns:
>
> > >
>
> > > FCST_VAR FCST_THRESH LINE_TYPE TOTAL
>
> > > APCP         >=1                  MPR           14666
>
> > > APCP         >=25                MPR           14666
>
> > > APCP         >=50                MPR           14666
>
> > > CEIL           <=1000             MPR           11926
>
> > > CEIL           <=100               MPR           11926
>
> > > CEIL           <=300               MPR           11926
>
> > >
>
> > > Based on this, it looks like you have a lot of duplicate matched
>
> > > pair
>
> > > (MPR) output lines... We have the same 14666 pairs for APCP
> > > repeated
>
> > > 3 times followed by the same 11926 pairs for CEIL repeated 3
times.
>
> > > This isn't necessary.  Instead, the FCST_THRESH and OBS_THRESH
>
> > > columns for the MPR line type should be set to "NA".  The MPR
line
>
> > > type that Point-Stat creates just contains the paired forecast
and
>
> > > observation
>
> > values.
>
> > > Thresholds do not apply to this line type.
>
> > >
>
> > > I posted an updated version of your file to the ftp site.  I
>
> > > stripped it down to 14666 APCP lines and 11926 CEIL lines with
NA
> > > in
>
> > > the FCST_THRESH and OBS_THRESH columns:
>
> > >
>
> > >
>
> > >
> > > ftp://ftp.rap.ucar.edu/incoming/irap/met_help/craig_data/point_stat_3_galwem_120000L_20160501_120000V_JHG.stat
>
> > >
>
> > > Looking at the values in the FCST column, I see numbers between
0
>
> > > and
>
> > > 1 (0, 0.11, 0.22, 0.33, ...).  Looking in the OBS column, I see
2
>
> > > numbers (0 or 1).  And looking at your config file, it looks
like
>
> > > you want to use these MPR lines to compute probabilistic output.
>
> > > MET verifies probabilities using an Nx2 contingency table.  You
> > > use
>
> > "-out_fcst_thresh"
>
> > > to select the probabilistic thresholds to be applied and
>
> > "-out_obs_thresh"
>
> > > to select the observation threshold to be applied.
>
> > >
>
> > > Here's a stat-analysis job you could run to read the MPR lines,
> > > define the probabilistic forecast thresholds, define the single
> > > observation threshold, and compute a PSTD output line.  Using
> > > "-by FCST_VAR" tells it to run the job separately for each unique
> > > entry found in the FCST_VAR column.
> > >
> > > /usr/local/met-5.1/bin/stat_analysis \
> > >    -lookin point_stat_3_galwem_120000L_20160501_120000V_JHG.stat \
> > >    -job aggregate_stat -line_type MPR -out_line_type PSTD \
> > >    -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
> > >    -out_obs_thresh eq1.0 \
> > >    -by FCST_VAR \
> > >    -out_stat out_pstd.txt
>
> > >
>
> > > The output statistics are written to "out_pstd.txt".
>
> > >
>
> > > Hope that helps.
>
> > >
>
> > > John
>
> > >
>
> > >
>
> > >
>
> > >
>
> > >
>
> > >
>
> > >
>
> > > On Mon, May 16, 2016 at 2:47 PM, robert.craig.2 at us.af.mil via RT
<
>
> > > met_help at ucar.edu> wrote:
>
> > >
>
> > > >
>
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361
>
>
> > > >
>
> > > > Thanks John, I knew the data was space delimited but forgot to
> > > > check the header.  As usual with MET, I progressed further but am
> > > > hitting a new error.  See below.  I pushed the config file to the
> > > > ftp directory.  As you can see, the -tmp_dir is set to
> > > > /h/data/global/WXQC/data/met/tmp.  This directory's permissions are
> > > > wide open - in fact stat_anal temp files are in there.  Does the
> > > > config*.temp try to write somewhere else?
> > > >
> > > > Also, notice in the command line options, there are three
> > > > thresholds.  MET kept telling me that I had to have three since this
> > > > is probability data.  Also, the latest MPR files (.stat) are in the
> > > > ftp dir.  As you can see I generated model/ob pairs using different
> > > > thresholds for the forecast and observation data.  So this is where
> > > > I get confused: I assume the fcst thresh in the config file is a
> > > > filter to pull those lines that have the threshold I want.  I am not
> > > > sure what the -out_fcst_thresh in the command line is doing.  If it
> > > > is filtering the mpr line fcst data, then I would think I would set
> > > > it to ge 0 for the fcst and ob since the fcst and ob data range
> > > > from 0 to 1.  Am I handling this correctly?
>
> > > >
>
> > > > Thanks
>
> > > > Bob
>
> > > >
>
> > > > ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
>
> > > > '/h/data/global/WXQC/data/met/mdlob_pairs', '-out',
>
> > > >
'/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z
> > > > ',
>
> > > > '-config',
>
> > > > '/h/WXQC/met-
5.1/data/config/STATAnalysisConfig_hira_bs_updated'
> > > > ,
>
> > > > '-out_fcst_thresh ge0,ge0.5,ge1', '-out_obs_thresh ge0', '-v',
>
> > > > '6'] DEBUG 1: Creating STAT-Analysis output file
>
> > > >
"/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z"
>
> > > > DEBUG 1: Default Config File:
>
> > > > /home/qcteam/met-
5.1/share/met/config/STATAnalysisConfig_default
>
> > > > DEBUG 1: User Config File:
>
> > > > /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
>
> > > > DEBUG 4: Default Job from the config file: "-model GALWEM
>
> > > > -fcst_lead
>
> > > > 120000 -fcst_init_beg 20160501_000000 -fcst_init_end
>
> > > > 20160502_000000 -fcst_init_hour 000000 -fcst_var APCP
> > > > -fcst_thresh
>
> > > > >=1 -line_type MPR -vif_flag 1 "
>
> > > > DEBUG 4: Amending default job with command line options:
>
> > > > "-out_fcst_thresh
>
> > > > ge0,ge0.5,ge1 -out_obs_thresh ge0"
>
> > > > DEBUG 3: Processing STAT file
>
> > > >
>
> > >
>
> >
>
"/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat"
>
> > > > ... 1 of 10
>
> > > > ERROR  :
>
> > > > ERROR  :
>
> > > > ERROR  :   MetConfig::read_string(const char *) -> unable to
open
> temp
>
> > > > file "config_23325_0_.temp"
>
> > > > ERROR  :
>
> > > >
>
> > > > -----Original Message-----
>
> > > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>
> > > > Sent: Friday, May 13, 2016 6:27 PM
>
> > > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
>
> > > > <robert.craig.2 at us.af.mil>
>
> > > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
> > > >
>
> > > > Bob,
>
> > > >
>
> > > > The problem is coming from the first line of the file you sent
to me.
>
> > > > It contains a comma-separated list of header column names.
>
> > > >
>
> > > > I'm not exactly sure where you pulled those header column
names,
>
> > > > but that's the problem.  MET expects data to be separated by
> whitespace...
>
> > > > so it interprets that long string with a bunch of commas as a
>
> > > > single
>
> > > column.
>
> > > > The
>
> > > > error comes when it tries to read the "second" column.   If
you just
>
> > > remove
>
> > > > that first line, it should run fine.
>
> > > >
>
> > > > If you do want header columns, here's a trick.  Run the
> > > > following
> job:
>
> > > >
>
> > > > stat_analysis -lookin
>
> > point_stat_3_galwem_120000L_20160501_120000V.stat \
>
> > > >    -job filter -line_type MPR -dump_row out.stat
>
> > > >
>
> > > > The file out.stat, will now contain the full header for the
MPR
>
> > > > line
>
> > > type.
>
> > > > When you select a single LINE_TYPE value, stat-analysis will
> > > > write
>
> > > > the full header for that line type to the output.
>
> > > >
>
> > > > Have a good weekend.
>
> > > >
>
> > > > John
>
> > > >
>
> > > >
>
> > > > On Fri, May 13, 2016 at 10:26 AM, robert.craig.2 at us.af.mil via
> > > > RT
>
> > > > < met_help at ucar.edu> wrote:
>
> > > >
>
> > > > >
>
> > > > > Fri May 13 10:26:26 2016: Request 76361 was acted upon.
>
> > > > > Transaction: Ticket created by robert.craig.2 at us.af.mil
>
> > > > >        Queue: met_help
>
> > > > >      Subject: Statanalysis Question
>
> > > > >        Owner: Nobody
>
> > > > >   Requestors: robert.craig.2 at us.af.mil
>
> > > > >       Status: new
>
> > > > >  Ticket <URL:
>
> > > > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361
>
> > > > > >
>
> > > > >
>
> > > > >
>
> > > > > John, I am getting the following error when running stat_analysis.
> > > > > The forecast times and valid times seem to be correct to me, so I
> > > > > am not sure of the cause of the error.
>
> > > > >
>
> > > > > /h/WXQC/met-5.1/bin/stat_analysis -lookin /h/data/global/WXQC/data/met/mdlob_pairs -out /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z -config /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated -v 6
> > > > > DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z"
> > > > > DEBUG 1: Default Config File: /home/qcteam/met-5.1/share/met/config/STATAnalysisConfig_default
> > > > > DEBUG 1: User Config File: /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> > > > > DEBUG 4: Default Job from the config file: "-model GALWEM -fcst_lead 120000 -fcst_init_beg 20160501_000000 -fcst_init_end 20160501_000000 -fcst_init_hour 120000 -fcst_var APCP -fcst_thresh >=50 -line_type MPR -vif_flag 1 "
> > > > > DEBUG 4: Amending default job with command line options: "(nul)"
> > > > > DEBUG 3: Processing STAT file "/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat" ... 1 of 2
> > > > > ERROR  :
> > > > > ERROR  : DataLine::get_item(int) -> range check error
> > > > > ERROR  :
>
> > > > >
>
> > > > > The config file and data file are on your server.
>



------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #76361] Statanalysis Question
From: robert.craig.2 at us.af.mil
Time: Thu May 26 12:02:50 2016

John, here is the data in the files I am trying to read.  Is this
error telling me that one of the values has exceeded the allowable
size for an integer variable?

JOB_LIST:      -job aggregate_stat -model GALWEM -fcst_lead 120000 -fcst_init_beg 20160501_000000 -fcst_init_end 20160503_000000 -fcst_init_hour 000000 -fcst_var APCP -interp_mthd NEAREST -interp_pnts 3 -line_type MPR -by FCST_VAR -by FCST_THRESH -dump_row /h/data/global/WXQC/data/met/filter_job.stat -out_line_type PCT -out_fcst_thresh >=0 -out_fcst_thresh >=0.1 -out_fcst_thresh >=0.2 -out_fcst_thresh >=0.3 -out_fcst_thresh >=0.4 -out_fcst_thresh >=0.5 -out_fcst_thresh >=0.6 -out_fcst_thresh >=0.7 -out_fcst_thresh >=0.8 -out_fcst_thresh >=0.9 -out_fcst_thresh >=1.0 -out_obs_thresh =1 -vif_flag 1
COL_NAME: FCST_VAR FCST_THRESH TOTAL N_THRESH THRESH_1 OY_1  ON_1 THRESH_2 OY_2 ON_2 THRESH_3 OY_3 ON_3 THRESH_4 OY_4 ON_4 THRESH_5 OY_5 ON_5 THRESH_6 OY_6 ON_6 THRESH_7 OY_7 ON_7 THRESH_8 OY_8 ON_8 THRESH_9 OY_9 ON_9 THRESH_10 OY_10 ON_10 THRESH_11
     PCT: APCP     >=1         39072       11        0  472 24899      0.1   85 1172      0.2   68  824      0.3  120  869      0.4   88  640      0.5   91  685      0.6  124  661      0.7  129  649      0.8  189  741       0.9  1962  4604         1
     PCT: APCP     >=25        39072       11        0  408 37879      0.1   25  162      0.2   16  104      0.3   16  120      0.4   11   36      0.5   11   49      0.6   14   50      0.7    5   37      0.8   16   42       0.9    20    51         1
     PCT: APCP     >=50        39072       11        0  170 38702      0.1    6   59      0.2    3   35      0.3    0   31      0.4    4   16      0.5    3   18      0.6    2    9      0.7    1    8      0.8    1    3       0.9     1     0         1
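
As a quick sanity check on PCT rows like the ones above, here is a minimal Python sketch (not part of MET; the helper name parse_pct_row is hypothetical) that splits a row into its Nx2 bins and confirms that the OY and ON counts add up to TOTAL.  It assumes the column order listed in COL_NAME, i.e. N_THRESH thresholds bounding N_THRESH - 1 bins.

```python
"""Sanity-check a PCT row from STAT-Analysis output (illustrative sketch)."""


def parse_pct_row(row):
    """Split a 'PCT:' row into its header fields and probability bins.

    Assumed column order (taken from the COL_NAME line in this message):
    FCST_VAR FCST_THRESH TOTAL N_THRESH, then THRESH_i OY_i ON_i for each
    bin, ending with the final THRESH_n.
    Returns (fcst_var, fcst_thresh, total, bins) where each bin is
    (lower_thresh, upper_thresh, oy_count, on_count).
    """
    tokens = row.split()
    if not tokens or tokens[0] != "PCT:":
        raise ValueError("not a PCT row")
    fcst_var, fcst_thresh = tokens[1], tokens[2]
    total, n_thresh = int(tokens[3]), int(tokens[4])
    rest = tokens[5:]
    # n_thresh thresholds bound n_thresh - 1 bins, each with an OY and ON count.
    expected = 3 * (n_thresh - 1) + 1
    if len(rest) != expected:
        raise ValueError(f"expected {expected} bin tokens, found {len(rest)}")
    bins = []
    for i in range(n_thresh - 1):
        lower = float(rest[3 * i])
        oy = int(rest[3 * i + 1])
        on = int(rest[3 * i + 2])
        upper = float(rest[3 * i + 3])
        bins.append((lower, upper, oy, on))
    return fcst_var, fcst_thresh, total, bins


# The ">=1" row copied from the output above (whitespace collapsed).
ROW_GE1 = (
    "PCT: APCP >=1 39072 11 0 472 24899 0.1 85 1172 0.2 68 824 "
    "0.3 120 869 0.4 88 640 0.5 91 685 0.6 124 661 0.7 129 649 "
    "0.8 189 741 0.9 1962 4604 1"
)

if __name__ == "__main__":
    var, thresh, total, bins = parse_pct_row(ROW_GE1)
    counted = sum(oy + on for _, _, oy, on in bins)
    print(var, thresh, "TOTAL =", total, "sum of OY+ON =", counted)
```

For the three rows shown, the OY and ON counts do sum to the TOTAL of 39072, so the table itself looks internally consistent.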

Thanks
Bob

-----Original Message-----
From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
Sent: Thursday, May 26, 2016 11:31 AM
To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
<robert.craig.2 at us.af.mil>
Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question

Bob,

If you tell STAT-Analysis to "-lookin" a directory, it will
recursively search that directory looking for files ending in ".stat".
If you instead tell STAT-Analysis to "-lookin" one or more specific
file names, it will process them regardless of their suffix.

So you can either name your PCT output files as "_pct.stat" or you can
keep them as "_pct.txt" and change your STAT-Analysis job command to
something like:
  -lookin /h/data/global/WXQC/data/met/ens_cont_tbl/*_pct.txt
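
To make the two "-lookin" cases concrete, here is a small Python sketch of the rule as described above.  It is only an illustration of the described behaviour, not MET's actual implementation; the function name resolve_lookin and the demo file names are made up for the example.

```python
"""Illustrate the -lookin rule described above (not MET's own code)."""
import os
import tempfile


def resolve_lookin(paths, suffix=".stat"):
    """Return the files a job would read for the given -lookin arguments.

    Directories are searched recursively and only files ending in `suffix`
    are kept; explicitly named files are accepted whatever their suffix.
    """
    found = []
    for path in paths:
        if os.path.isdir(path):
            for root, dirs, files in os.walk(path):
                dirs.sort()  # visit subdirectories in a stable order
                for name in sorted(files):
                    if name.endswith(suffix):
                        found.append(os.path.join(root, name))
        else:
            # A specific file name is taken as-is, regardless of suffix.
            found.append(path)
    return found


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        txt_file = os.path.join(top, "day1_pct.txt")
        stat_file = os.path.join(top, "day2_pct.stat")
        for name in (txt_file, stat_file):
            open(name, "w").close()
        # Directory argument: only the .stat file is picked up.
        print(resolve_lookin([top]))
        # Explicit file argument: the .txt file is accepted.
        print(resolve_lookin([txt_file]))
```

With a wildcard such as *_pct.txt, the shell expands the pattern into specific file names before stat_analysis runs, which is why that form falls under the second case.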

Make sense?

Thanks,
John

On Thu, May 26, 2016 at 9:55 AM, robert.craig.2 at us.af.mil via RT <
met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>
> Hi John, I am to the point where I am trying to read the pct.txt files
> to generate the PSTD files.  When I do this using
> /h/WXQC/met-5.1/bin/stat_analysis -lookin
> /h/data/global/WXQC/data/met/ens_cont_tbl -out
> /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z
> -line_type PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH -v 6,
> I get the following:
>
>
>
> ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
> '/h/data/global/WXQC/data/met/ens_cont_tbl/', '-out',
>
'/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z',
> '-line_type PCT', '-out_line_type PSTD', '-by FCST_VAR', '-by
> FCST_THRESH', '-v', '6']
>
> DEBUG 1: Creating STAT-Analysis output file
> "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z"
>
> DEBUG 4: Amending default job with command line options: "-line_type
> PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH"
>
> ERROR  :
>
> ERROR  : process_search_dirs() -> no STAT files found in the
> directories specified!
>
> ERROR  :
>
> ERROR  :
>
> ERROR  : main() -> encountered an error value of 1.  Calling
> clean_up() and usage() before exiting.
>
> ERROR  :
>
>
>
> *** Model Evaluation Tools (METV5.1) ***
>
>
>
> The ens_cont_tbl directory contains 3 files and I placed them on your
> ftp server.  I didn’t think the file naming convention was important
> here as long as they end in _pct.txt.  I did try renaming one to the
> more conventional naming scheme – point_stat … but that didn’t make a
> difference.  So why is it not finding my files?
>
>
>
> Thanks
>
> Bob
>
>
>
> -----Original Message-----
> From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> Sent: Monday, May 23, 2016 11:56 AM
> To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> <robert.craig.2 at us.af.mil>
> Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
>
>
> Bob,
>
>
>
> Surprisingly, I was able to replicate the same error you're seeing!  I
> see where the error is occurring, but I don't yet understand why it's
> happening.  However, I do have a workaround for you.
>
>
>
> Try editing your config file by emptying out the "fcst_thresh" setting:
>
>    fcst_thresh = [];
>
> In your job, you're using "-by fcst_thresh" anyway, so STAT-Analysis
> will group the data by the FCST_THRESH column.  After I removed the
> "fcst_thresh" setting, the job completed.  I'll continue looking into
> the reason for that error.
>
> How many of these files are you planning to pass to STAT-Analysis at
> any given time?  I see each sample file contains about 80,000 MPR lines.
> Passing it 5 files to process about 400,000 lines, that job takes
> about 62 seconds to run on my machine.  I worry that as you increase
> the number of files, it'll run very slowly.
>
> Here's some alternative logic you might consider.  Run STAT-Analysis
> once for each .stat file you're generating.  Instead of writing the
> PSTD line type, write the PCT line type (that's just the counts of
> that probabilistic Nx2 table).  Then run jobs to aggregate the PCT line
> types and compute PSTD lines (-job aggregate_stat -line_type PCT
> -out_line_type PSTD).  That would make the processing for each
> STAT-Analysis job much more manageable.
>
> BUT in order to do this in 2 steps, you would really need to put the
> neighborhood size information into the INTERP_MTHD and INTERP_PNTS
> columns.  Otherwise, the threshold information won't be retained in
> the PCT output lines.  And I also found an issue in the formatting of
> the OBS_THRESH output column ("=1" in the output should really be
> "==1").  For now I just switched to "ge1", but I need to fix this
> formatting issue.
>
>
>
> Listed below are the c-shell commands I used to loop over your sample
> files and run stat_analysis in this way...
>
> # Loop through MPR files and compute PCT output lines
> # (the dot is escaped so only the .stat suffix is rewritten)
> foreach mpr_file (`ls stat_mpr/*.stat`)
>    set pct_file = `echo $mpr_file | sed 's/stat_mpr/stat_pct/g' | sed 's/\.stat/_pct.stat/'`
>    /usr/local/met-5.1/bin/stat_analysis -lookin $mpr_file -out_stat $pct_file \
>    -job aggregate_stat -line_type MPR -out_line_type PCT \
>    -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,FCST_THRESH \
>    -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
>    -out_obs_thresh ge1
> end
>
> # Aggregate PCT lines and compute PSTD stats
> /usr/local/met-5.1/bin/stat_analysis -lookin stat_pct -out_stat pstd.stat \
>    -job aggregate_stat -line_type PCT -out_line_type PSTD \
>    -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,INTERP_MTHD,INTERP_PNTS
>
> Note that this job is combining all of the neighborhood sizes because
> INTERP_MTHD and INTERP_PNTS are set the same in the PCT lines.
>
>
>
> Thanks,
>
> John
>
>
>
>
>
>
>
> On Fri, May 20, 2016 at 3:33 PM, robert.craig.2 at us.af.mil via RT <
> met_help at ucar.edu> wrote:
>
>
>
> >
>
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>
> >
>
> > John, I might have found more information on the error below:
>
> >
>
> > MetConfig::read_string(const char *) -> unable to open temp file
>
> > "config_1943_0_.temp"
>
> >
>
> > When I run stat_anal on MPR files, no problems if I  process 2
days
> > of
>
> > data, but when I increase it to three days, the error occurs.  I
> > also
>
> > played with the dates to try to eliminate a particular data file
as
>
> > the cause.  So, it seems to be related to how many data files it
has
> > to
>
> > process.   I posted all the data files on the ftp server and my
command
>
> > line is below.  The config file on the server is still
representative.
>
> > This error does not seem related to directory permissions.
>
> >
>
> > /h/WXQC/met-5.1/bin/stat_analysis  -lookin
>
> > /h/data/global/WXQC/data/met/mdlob_pairs -out
>
> > /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0
>
> > -config
> > /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
>
> > -out_fcst_thresh
>
> > ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0
>
> > -out_obs_thresh eq1 -by FCST_VAR -by FCST_THRESH -v 6
>
> >
>
> > This is concerning since I have to run this on many more than two
> > days
>
> > worth of data.  Any idea what might be happening?
>
> >
>
> > Thanks
>
> > Bob
>
> >
>
> > -----Original Message-----
>
> > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>
> > Sent: Thursday, May 19, 2016 4:31 PM
>
> > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
>
> > <robert.craig.2 at us.af.mil>
>
> > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
> >
>
> > Bob,
>
> >
>
> > It's funny, I was just talking to a colleague today about doing
>
> > something very similar to this on a different dataset.
>
> >
>
> > I think I understand what you're saying about the thresholds >1, >25,
> > and >50.  These are used to define the "event" which is used in
> > computing the fractional coverage fields.  My confusion comes from the
> > fact that in MET currently only Grid-Stat is computing these
> > fractional coverage fields, not Point-Stat.  But I see now what you're
> > doing.
> >
> > One suggestion would be to change the contents of the INTERP_MTHD and
> > INTERP_PNTS header columns.  You currently have NEAREST, 1 which would
> > indicate that each observation value was matched to the forecast value
> > at the nearest grid point.  Instead, I would suggest writing NBRHD and
> > N, where N indicates the number of points in the neighborhood.  For
> > example, the NBRHD output from Grid-Stat would write 49 if we were
> > using a 7x7 box.
> >
> > The -fcst_thresh and -obs_thresh options are used to filter the input
> > MPR lines, as you already know.  The -out_fcst_thresh and
> > -out_obs_thresh options define the thresholds to be applied when
> > computing the output for the job.  In MET, probabilities are not
> > processed in a "continuous" way.  Instead, they are put into
> > probability bins.  Those bins are used to create an Nx2 contingency
> > table from which probabilistic statistics are computed.
> >
> > Setting "-out_fcst_thresh gt0,gt0.25,gt0.5,gt0.75,gt1.0" defines 4
> > probability bins which yields a 4x2 contingency table, from which
> > stats are computed.
>
> >
>
> > Hope that helps.
>
> >
>
> > Thanks,
>
> > John
>
> >
>
> > On Thu, May 19, 2016 at 2:23 PM, robert.craig.2 at us.af.mil via RT <
>
> > met_help at ucar.edu> wrote:
>
> >
>
> > >
>
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>
> > >
>
> > > Thanks John, the directory problem was due to some corruption in
> > > the MET we had on one of the systems.  On another system the problem
> > > doesn't come up, so we are hoping a recompile of MET on said system
> > > will clear up the issue.
>
> > >
>
> > > As far as the second comment, I don't think you interpreted what I
> > > am doing correctly.  In each file, there are three sets of data for
> > > each variable.  They are not identical, since the first set is the ob
> > > neighborhood data for precip > 1, the next set is the ob neighborhood
> > > data for precip > 25, and the same for precip > 50.  If you compare
> > > the model and ob data, the model data should be different (for some
> > > obs) than the model data for the previous category.  The neighborhoods
> > > around each ob site follow the HiRA method.  All the data in the MPR
> > > file lines are probabilities, so I want to create PSTD from these
> > > data.  So I was using the fcst_thresh to filter for the HiRA
> > > thresholds I am interested in.  I tried your code and added -by
> > > FCST_THRESH and got out three different sets of values.  So my
> > > question is why did you have the -out_fcst_thresh set to 10 prob
> > > thresholds?  Why wouldn't ge0 pick up all probabilities?  I am not
> > > understanding how these thresholds are being used.
>
> > >
>
> > > Thanks
>
> > > Bob
>
> > >
>
> > > -----Original Message-----
>
> > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>
> > > Sent: Monday, May 16, 2016 4:54 PM
>
> > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
>
> > > <robert.craig.2 at us.af.mil>
>
> > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
> > >
>
> > > Bob,
>
> > >
>
> > > Thanks for sending the sample data.  I agree that STAT-Analysis
> > > can
>
> > > get pretty confusing.  It has a lot of flexibility, but we
really
>
> > > need to think through what you're trying to do.
>
> > >
>
> > > First, regarding the error you're getting.  Unfortunately, the
>
> > > config file string parser is writing a temp file in the current
> "runtime"
>
> > directory.
>
> > > The error is from the fact that you don't have permission to
write
>
> > > the file "config_23325_0_.temp" in the current directory.
>
> > > Ultimately, we should change that to use the temp directory
instead.
>
> > >
>
> > > Next, I looked at the data you sent to me.  Listed below are the
>
> > > unique combinations of just a few of the header columns:
>
> > >
>
> > > FCST_VAR FCST_THRESH LINE_TYPE TOTAL
>
> > > APCP         >=1                  MPR           14666
>
> > > APCP         >=25                MPR           14666
>
> > > APCP         >=50                MPR           14666
>
> > > CEIL           <=1000             MPR           11926
>
> > > CEIL           <=100               MPR           11926
>
> > > CEIL           <=300               MPR           11926
>
> > >
>
> > > Based on this, it looks like you have a lot of duplicate matched
>
> > > pair
>
> > > (MPR) output lines... We have the same 14666 pairs for APCP
> > > repeated
>
> > > 3 times followed by the same 11926 pairs for CEIL repeated 3
times.
>
> > > This isn't necessary.  Instead, the FCST_THRESH and OBS_THRESH
>
> > > columns for the MPR line type should be set to "NA".  The MPR
line
>
> > > type that Point-Stat creates just contains the paired forecast
and
>
> > > observation
>
> > values.
>
> > > Thresholds do not apply to this line type.
>
> > >
>
> > > I posted an updated version of your file to the ftp site.  I
>
> > > stripped it down to 14666 APCP lines and 11926 CEIL lines with
NA
> > > in
>
> > > the FCST_THRESH and OBS_THRESH columns:
>
> > >
>
> > >
>
> > >
> > > ftp://ftp.rap.ucar.edu/incoming/irap/met_help/craig_data/point_stat_3_galwem_120000L_20160501_120000V_JHG.stat
>
> > >
>
> > > Looking at the values in the FCST column, I see numbers between
0
>
> > > and
>
> > > 1 (0, 0.11, 0.22, 0.33, ...).  Looking in the OBS column, I see
2
>
> > > numbers (0 or 1).  And looking at your config file, it looks
like
>
> > > you want to use these MPR lines to compute probabilistic output.
>
> > > MET verifies probabilities using an Nx2 contingency table.  You
> > > use
>
> > "-out_fcst_thresh"
>
> > > to select the probabilistic thresholds to be applied and
>
> > "-out_obs_thresh"
>
> > > to select the observation threshold to be applied.
>
> > >
>
> > > Here's a stat-analysis job you could run to read the MPR lines,
>
> > > define the probabilistic forecast thresholds, define the single
>
> > > observation threshold, and compute a PSTD output line.  Using "-
by
>
> > > FCST_VAR" tells it to run the job separately for each unique
entry
>
> > > found in the FCST_VAR
>
> > column.
>
> > >
>
> > > /usr/local/met-5.1/bin/stat_analysis \
> > >    -lookin point_stat_3_galwem_120000L_20160501_120000V_JHG.stat \
> > >    -job aggregate_stat -line_type MPR -out_line_type PSTD \
> > >    -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
> > >    -out_obs_thresh eq1.0 \
> > >    -by FCST_VAR \
> > >    -out_stat out_pstd.txt
>
> > >
>
> > > The output statistics are written to "out_pstd.txt".
>
> > >
>
> > > Hope that helps.
>
> > >
>
> > > John
>
> > >
>
> > >
>
> > >
>
> > >
>
> > >
>
> > >
>
> > >
>
> > > On Mon, May 16, 2016 at 2:47 PM, robert.craig.2 at us.af.mil via RT
<
>
> > > met_help at ucar.edu> wrote:
>
> > >
>
> > > >
>
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361
>
>
> > > >
>
> > > > Thanks John, I knew the data was space delimited but forgot to
>
> > > > check the header.  As usual with MET, I progressed further but
> > > > am
>
> > > > hitting a
>
> > new
>
> > > > error.   See below.   I pushed the the config file to the ftp
>
> > directory.
>
> > > > As you can see, the -tmp_dir is set to
>
> > /h/data/global/WXQC/data/met/tmp.
>
> > > > This directory permissions are wide open - infact stat_anal
temp
>
> > > > files
>
> > > are
>
> > > > in there.   Does the config*.temp try to write somewhere else?
>
> > > >
>
> > > > Also, notice in the command line options, there are three
thesholds.
>
> > > > MET kept telling me that I had to have three since this is
>
> > > > probability
>
> > > data.
>
> > > > Also, the latest MPR files (.stat) are in the ftp dir.  As you
> > > > can
>
> > > > see I generated model/ob pairs using different thresholds for
> > > > the
>
> > > > forecast and observation data.  So this is where I get
confused:
> > > > I
>
> > > > assume the fcst thresh in the config file is a filter to pull
>
> > > > those lines that have the threshold I want.  I am not sure
what
>
> > > > the -out_fcst_thresh in the command line is doing.  If it is
>
> > > > filtering the mpr line fcst data, then I would think I would
set
>
> > > > it to ge 0 for the fcst and ob since the fcst and ob data
range
>
> > > > from 0 to 1.  Am
>
> > I handling this correctly?
>
> > > >
>
> > > > Thanks
>
> > > > BOb
>
> > > >
>
> > > > ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
>
> > > > '/h/data/global/WXQC/data/met/mdlob_pairs', '-out',
>
> > > >
'/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z
> > > > ',
>
> > > > '-config',
>
> > > > '/h/WXQC/met-
5.1/data/config/STATAnalysisConfig_hira_bs_updated'
> > > > ,
>
> > > > '-out_fcst_thresh ge0,ge0.5,ge1', '-out_obs_thresh ge0', '-v',
>
> > > > '6'] DEBUG 1: Creating STAT-Analysis output file
>
> > > >
"/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z"
>
> > > > DEBUG 1: Default Config File:
>
> > > > /home/qcteam/met-
5.1/share/met/config/STATAnalysisConfig_default
>
> > > > DEBUG 1: User Config File:
>
> > > > /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
>
> > > > DEBUG 4: Default Job from the config file: "-model GALWEM
>
> > > > -fcst_lead
>
> > > > 120000 -fcst_init_beg 20160501_000000 -fcst_init_end
>
> > > > 20160502_000000 -fcst_init_hour 000000 -fcst_var APCP
> > > > -fcst_thresh
>
> > > > >=1 -line_type MPR -vif_flag 1 "
>
> > > > DEBUG 4: Amending default job with command line options:
>
> > > > "-out_fcst_thresh
>
> > > > ge0,ge0.5,ge1 -out_obs_thresh ge0"
>
> > > > DEBUG 3: Processing STAT file
>
> > > >
>
> > >
>
> >
>
"/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat"
>
> > > > ... 1 of 10
>
> > > > ERROR  :
>
> > > > ERROR  :
>
> > > > ERROR  :   MetConfig::read_string(const char *) -> unable to
open
> temp
>
> > > > file "config_23325_0_.temp"
>
> > > > ERROR  :
>
> > > >
>
> > > > -----Original Message-----
>
> > > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>
> > > > Sent: Friday, May 13, 2016 6:27 PM
>
> > > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
>
> > > > <robert.craig.2 at us.af.mil>
>
> > > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
> > > >
>
> > > > Bob,
>
> > > >
>
> > > > The problem is coming from the first line of the file you sent
to me.
>
> > > > It contains a comma-separated list of header column names.
>
> > > >
>
> > > > I'm not exactly sure where you pulled those header column
names,
>
> > > > but that's the problem.  MET expects data to be separated by
> whitespace...
>
> > > > so it interprets that long string with a bunch of commas as a
>
> > > > single
>
> > > column.
>
> > > > The
>
> > > > error comes when it tries to read the "second" column.   If
you just
>
> > > remove
>
> > > > that first line, it should run fine.
>
> > > >
>
> > > > If you do want header columns, here's a trick.  Run the
> > > > following
> job:
>
> > > >
>
> > > > stat_analysis -lookin
>
> > point_stat_3_galwem_120000L_20160501_120000V.stat \
>
> > > >    -job filter -line_type MPR -dump_row out.stat
>
> > > >
>
> > > > The file out.stat, will now contain the full header for the
MPR
>
> > > > line
>
> > > type.
>
> > > > When you select a single LINE_TYPE value, stat-analysis will
> > > > write
>
> > > > the full header for that line type to the output.
>
> > > >
>
> > > > Have a good weekend.
>
> > > >
>
> > > > John
>
> > > >
>
> > > >
>
> > > > On Fri, May 13, 2016 at 10:26 AM, robert.craig.2 at us.af.mil via
> > > > RT
>
> > > > < met_help at ucar.edu> wrote:
>
> > > >
>
> > > > >
>
> > > > > Fri May 13 10:26:26 2016: Request 76361 was acted upon.
>
> > > > > Transaction: Ticket created by robert.craig.2 at us.af.mil
>
> > > > >        Queue: met_help
>
> > > > >      Subject: Statanalysis Question
>
> > > > >        Owner: Nobody
>
> > > > >   Requestors: robert.craig.2 at us.af.mil
>
> > > > >       Status: new
>
> > > > >  Ticket <URL:
>
> > > > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361
>
> > > > > >
>
> > > > >
>
> > > > >
>
> > > > > John, I am getting the following error when running
statanalysis.
>
> > > > > The forcast times and valid times seem to be correct to me,
so
> > > > > I
>
> > > > > am not sure the cause of the error.
>
> > > > >
>
> > > > > /h/WXQC/met-5.1/bin/stat_analysis -lookin
>
> > > > > /h/data/global/WXQC/data/met/mdlob_pairs -out
>
> > > > >
/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_
> > > > > 0Z
>
> > > > > -config
>
> > > > > /h/WXQC/met-
5.1/data/config/STATAnalysisConfig_hira_bs_updated
>
> > > > > -v 6 DEBUG 1: Creating STAT-Analysis output file
>
> > > > >
"/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z"
>
> > > > > DEBUG 1: Default Config File:
>
> > > > > /home/qcteam/met-
5.1/share/met/config/STATAnalysisConfig_defau
> > > > > lt
>
> > > > > DEBUG 1: User Config File:
>
> > > > > /h/WXQC/met-
5.1/data/config/STATAnalysisConfig_hira_bs_updated
>
> > > > > DEBUG 4: Default Job from the config file: "-model GALWEM
>
> > > > > -fcst_lead
>
> > > > > 120000 -fcst_init_beg 20160501_000000 -fcst_init_end
>
> > > > > 20160501_000000 -fcst_init_hour 120000 -fcst_var APCP
>
> > > > > -fcst_thresh
>
> > > > > >=50 -line_type MPR -vif_flag 1 "
>
> > > > > DEBUG 4: Amending default job with command line options:
"(nul)"
>
> > > > > DEBUG 3: Processing STAT file
>
> > > > >
>
> > > >
>
> > >
>
> >
>
"/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat"
>
> > > > > ... 1 of 2
>
> > > > > ERROR  :
>
> > > > > ERROR  : DataLine::get_item(int) -> range check error ERROR
:
>
> > > > >
>
> > > > > The config file and data file are on your server.



------------------------------------------------
Subject: Statanalysis Question
From: John Halley Gotway
Time: Thu May 26 12:06:34 2016

Bob,

Ah OK, I see the problem here.  The short answer is that you should use
the "-out_stat" job command option instead of "-out".

And here's the long answer...

Prior to MET version 5.1, STAT-Analysis wrote a very simple ASCII output
file.  For your MPR data, it's reading MPR lines, putting them into
probabilistic contingency tables, and writing output PCT information.
That output is normally written to the screen in a pretty basic way.
Notice that the output just contains the PCT counts without the 21 header
columns common to all the .stat line types.  Using the "-out" job command
option redirects that output from the screen to an output file, but it's
still formatted in exactly the same way.  And looking at the format of the
files you sent, I see that's what you've done.

In MET version 5.1, we added a new output option for STAT-Analysis named
"-out_stat".  That tells STAT-Analysis to write the same numbers to an
output file but add in those 21 header columns.  Using the true .stat
format enables STAT-Analysis to read its own output... which is what you
want to do.  We kept the "-out" job command option available for backwards
compatibility.

Please try rerunning your MPR -> PCT jobs using the "-out_stat" option and
then pass that output back in for the PCT -> PSTD jobs.
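
As a rough sketch of that two-step rerun, something like the script below
could work.  The job options and "-by" columns are taken from the commands
earlier in this ticket; the names MET_BIN, MPR_DIR, PCT_DIR, PSTD_FILE and
DRY_RUN are placeholders made up for this example, and by default it only
prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch only: MPR -> PCT -> PSTD in two steps, writing true .stat files.
# MET_BIN, MPR_DIR, PCT_DIR, PSTD_FILE and DRY_RUN are hypothetical names.
MET_BIN=${MET_BIN:-stat_analysis}
MPR_DIR=${MPR_DIR:-stat_mpr}
PCT_DIR=${PCT_DIR:-stat_pct}
PSTD_FILE=${PSTD_FILE:-pstd.stat}
DRY_RUN=${DRY_RUN:-1}   # 1 = print the commands only, anything else = run them

PROB_THRESH="ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0"

# Print a command and, unless this is a dry run, execute it.
run() {
    printf '%s\n' "$*"
    if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

# Step 1: MPR -> PCT.  -out_stat (not -out) keeps the .stat header columns.
mpr_to_pct() {
    run "$MET_BIN" -lookin "$1" -out_stat "$2" \
        -job aggregate_stat -line_type MPR -out_line_type PCT \
        -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,FCST_THRESH \
        -out_fcst_thresh "$PROB_THRESH" -out_obs_thresh ge1
}

# Step 2: PCT -> PSTD, reading the step 1 output back in.
pct_to_pstd() {
    run "$MET_BIN" -lookin "$1" -out_stat "$2" \
        -job aggregate_stat -line_type PCT -out_line_type PSTD \
        -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,INTERP_MTHD,INTERP_PNTS
}

# The PCT directory must exist before a real run writes into it.
[ "$DRY_RUN" = "1" ] || mkdir -p "$PCT_DIR"

for mpr_file in "$MPR_DIR"/*.stat; do
    [ -e "$mpr_file" ] || continue          # no MPR files found
    base=$(basename "$mpr_file" .stat)
    mpr_to_pct "$mpr_file" "$PCT_DIR/${base}_pct.stat"
done

pct_to_pstd "$PCT_DIR" "$PSTD_FILE"
```

Because the step 1 output ends in "_pct.stat", pointing "-lookin" at the
PCT directory in step 2 should pick the files up.  Set DRY_RUN=0 once the
printed commands look right.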

Thanks,
John

On Thu, May 26, 2016 at 11:51 AM, robert.craig.2 at us.af.mil via RT <
met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>
> Well, that fixed that issue and it got me to my next error:
>
> /gpfs/h/data/global/WXQC/data/temp
> ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
> '/h/data/global/WXQC/data/met/ens_cont_tbl', '-out',
> '/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_25_3_PSTD_0Z',
> '-out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0',
> '-out_obs_thresh eq1', '-line_type PCT', '-out_line_type PSTD',
> '-by FCST_VAR', '-by FCST_THRESH', '-v', '6']
> DEBUG 1: Creating STAT-Analysis output file
> "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_25_3_PSTD_0Z"
> DEBUG 4: Amending default job with command line options:
> "-out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0
> -out_obs_thresh eq1 -line_type PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH"
> DEBUG 3: Processing STAT file
> "/h/data/global/WXQC/data/met/ens_cont_tbl/GALWEM_APCP_12hr_1_3_PCT_0Z_pct.stat" ... 1 of 3
> ERROR  :
> ERROR  : DataLine::get_item(int) -> range check error
> ERROR  :
>
> Any ideas?
>
> -----Original Message-----
> From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> Sent: Thursday, May 26, 2016 11:31 AM
> To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
<robert.craig.2 at us.af.mil>
> Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
> Bob,
>
> If you tell STAT-Analysis to "-lookin" a directory, it will recursively
> search that directory looking for files ending in ".stat".  If you instead
> tell STAT-Analysis to "-lookin" one or more specific file names, it will
> process them regardless of their suffix.
>
> So you can either name your PCT output files as "_pct.stat" or you can
> keep them as "_pct.txt" and change your STAT-Analysis job command to
> something like:
>   -lookin /h/data/global/WXQC/data/met/ens_cont_tbl/*_pct.txt
>
> Make sense?
>
> Thanks,
> John
>
> On Thu, May 26, 2016 at 9:55 AM, robert.craig.2 at us.af.mil via RT <
> met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
> >
> > Hi John, I am at the point where I am trying to read the pct.txt files
> > to generate the PSTD files.  When I do this using
> > /h/WXQC/met-5.1/bin/stat_analysis -lookin
> > /h/data/global/WXQC/data/met/ens_cont_tbl -out
> > /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z
> > -line_type PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH -v 6,
> > I get the following:
> >
> >
> >
> > ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
> > '/h/data/global/WXQC/data/met/ens_cont_tbl/', '-out',
> > '/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z',
> > '-line_type PCT', '-out_line_type PSTD', '-by FCST_VAR',
> > '-by FCST_THRESH', '-v', '6']
> >
> > DEBUG 1: Creating STAT-Analysis output file
> > "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z"
> >
> > DEBUG 4: Amending default job with command line options:
> > "-line_type PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH"
> >
> > ERROR  :
> >
> > ERROR  : process_search_dirs() -> no STAT files found in the
> > directories specified!
> >
> > ERROR  :
> >
> > ERROR  :
> >
> > ERROR  : main() -> encountered an error value of 1.  Calling
> > clean_up() and usage() before exiting.
> >
> > ERROR  :
> >
> >
> >
> > *** Model Evaluation Tools (METV5.1) ***
> >
> >
> >
> > The ens_cont_tbl directory contains 3 files and I placed them on your
> > ftp server.  I didn’t think the file naming convention was important
> > here as long as they end in _pct.txt.  I did try renaming one to the
> > more conventional naming scheme – point_stat … but that didn’t make a
> > difference.  So why is it not finding my files?
> >
> >
> >
> > Thanks
> >
> > Bob
> >
> >
> >
> > -----Original Message-----
> > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> > Sent: Monday, May 23, 2016 11:56 AM
> > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> > <robert.craig.2 at us.af.mil>
> > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
> >
> >
> >
> > Bob,
> >
> >
> >
> > Surprisingly, I was able to replicate the same error you're seeing!  I
> > see where the error is occurring, but I don't yet understand why it's
> > happening.  However, I do have a workaround for you.
> >
> >
> >
> > Try editing your config file by emptying out the "fcst_thresh" setting:
> >
> >    fcst_thresh = [];
> >
> >
> >
> > In your job, you're using "-by fcst_thresh" anyway, so STAT-Analysis
> > will group the data by the FCST_THRESH column.  After I removed the
> > "fcst_thresh" setting, the job completed.  I'll continue looking into
> > the reason for that error.
> >
> >
> >
> > How many of these files are you planning to pass to STAT-Analysis at
> > any given time?  I see each sample file contains about 80,000 MPR lines.
> > Passing it 5 files to process about 400,000 lines, that job takes
> > about 62 seconds to run on my machine.  I worry that as you increase
> > the number of files, it'll run very slowly.
> >
> >
> >
> > Here's some alternative logic you might consider.  Run STAT-Analysis
> > once for each .stat file you're generating.  Instead of writing the
> > PSTD line type, write the PCT line type (that's just the counts of
> > that probabilistic Nx2 table).  Then run jobs to aggregate the PCT
> > line types and compute PSTD lines (-job aggregate_stat -line_type PCT
> > -out_line_type PSTD).  That would make the processing for each
> > STAT-Analysis job much more manageable.
> >
> >
> >
> > BUT in order to do this in 2 steps, you would really need to put the
> > neighborhood size information into the INTERP_MTHD and INTERP_PNTS
> > columns.  Otherwise, the threshold information won't be retained in
> > the PCT output lines.  And I also found an issue in the formatting of
> > the OBS_THRESH output column ("=1" in the output should really be
> > "==1").  For now I just switched to "ge1", but I need to fix this
> > formatting issue.
> >
> >
> >
> > Listed below are the c-shell commands I used to loop over your sample
> > files and run stat_analysis in this way...
> >
> >
> >
> > # Loop through MPR files and compute PCT output lines
> > foreach mpr_file (`ls stat_mpr/*.stat`)
> >
> >    # Swap the directory name, then add "_pct" before the ".stat" suffix only
> >    set pct_file = `echo $mpr_file | sed 's/stat_mpr/stat_pct/' | sed 's/\.stat$/_pct.stat/'`
> >
> >    /usr/local/met-5.1/bin/stat_analysis -lookin $mpr_file -out_stat $pct_file \
> >    -job aggregate_stat -line_type MPR -out_line_type PCT \
> >    -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,FCST_THRESH \
> >    -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
> >    -out_obs_thresh ge1
> >
> > end
> >
> >
> >
> > # Aggregate PCT lines and compute PSTD stats
> > /usr/local/met-5.1/bin/stat_analysis -lookin stat_pct -out_stat pstd.stat \
> >    -job aggregate_stat -line_type PCT -out_line_type PSTD \
> >    -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,INTERP_MTHD,INTERP_PNTS
> >
> >
> >
> > Note that this job is combining all of the neighborhood sizes because
> > INTERP_MTHD and INTERP_PNTS are set the same in the PCT lines.
> >
> >
> >
> > Thanks,
> >
> > John
> >
> >
> >
> >
> >
> >
> >
> > On Fri, May 20, 2016 at 3:33 PM, robert.craig.2 at us.af.mil via RT <
> > met_help at ucar.edu> wrote:
> >
> >
> >
> > >
> >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
> >
> > >
> >
> > > John, I might have found more information on the error below:
> >
> > >
> >
> > > MetConfig::read_string(const char *) -> unable to open temp file
> >
> > > "config_1943_0_.temp"
> >
> > >
> >
> > > When I run stat_anal on MPR files, there are no problems if I process
> > > 2 days of data, but when I increase it to three days, the error occurs.
> > > I also played with the dates to try to eliminate a particular data file
> > > as the cause.  So, it seems to be related to how many data files it has
> > > to process.  I posted all the data files on the ftp server and my
> > > command line is below.  The config file on the server is still
> > > representative.  This error does not seem related to directory
> > > permissions.
> >
> > >
> >
> > > /h/WXQC/met-5.1/bin/stat_analysis  -lookin
> >
> > > /h/data/global/WXQC/data/met/mdlob_pairs -out
> >
> > > /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0
> >
> > > -config
> > > /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> >
> > > -out_fcst_thresh
> >
> > > ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0
> >
> > > -out_obs_thresh eq1 -by FCST_VAR -by FCST_THRESH -v 6
> >
> > >
> >
> > > This is concerning since I have to run this on many more than two
> > > days' worth of data.  Any idea what might be happening?
> >
> > >
> >
> > > Thanks
> >
> > > Bob
> >
> > >
> >
> > > -----Original Message-----
> >
> > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> >
> > > Sent: Thursday, May 19, 2016 4:31 PM
> >
> > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> >
> > > <robert.craig.2 at us.af.mil>
> >
> > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
> >
> > >
> >
> > > Bob,
> >
> > >
> >
> > > It's funny, I was just talking to a colleague today about doing
> >
> > > something very similar to this on a different dataset.
> >
> > >
> >
> > > I think I understand what you're saying about the thresholds >1, >25,
> > > and >50.  These are used to define the "event" which is used in
> > > computing the fractional coverage fields.  My confusion comes from the
> > > fact that in MET currently only Grid-Stat is computing these fractional
> > > coverage fields, not Point-Stat.  But I see now what you're doing.
> >
> > >
> >
> > > One suggestion would be to change the contents of the INTERP_MTHD and
> > > INTERP_PNTS header columns.  You currently have NEAREST, 1 which would
> > > indicate that each observation value was matched to the forecast value
> > > at the nearest grid point.  Instead, I would suggest writing NBRHD and
> > > N, where N indicates the number of points in the neighborhood.  For
> > > example, the NBRHD output from Grid-Stat would write 49 if we were
> > > using a 7x7 box.
> >
> > >
> >
> > > The -fcst_thresh and -obs_thresh options are used to filter the input
> > > MPR lines, as you already know.  The -out_fcst_thresh and
> > > -out_obs_thresh options define the thresholds to be applied when
> > > computing the output for the job.  In MET, probabilities are not
> > > processed in a "continuous" way.  Instead, they are put into
> > > probability bins.  Those bins are used to create an Nx2 contingency
> > > table from which probabilistic statistics are computed.
> >
> > >
> >
> > > Setting "-out_fcst_thresh gt0,gt0.25,gt0.5,gt0.75,gt1.0" defines 4
> > > probability bins which yields a 4x2 contingency table, from which
> > > stats are computed.
> >
> > >
> >
> > > Hope that helps.
> >
> > >
> >
> > > Thanks,
> >
> > > John
> >
> > >
> >
> > > On Thu, May 19, 2016 at 2:23 PM, robert.craig.2 at us.af.mil via RT
<
> >
> > > met_help at ucar.edu> wrote:
> >
> > >
> >
> > > >
> >
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361
>
> >
> > > >
> >
> > > > Thanks John, the directory problem was due to some corruption in the
> > > > MET installation on one of the systems.  On another system the problem
> > > > doesn't come up, so we are hoping a recompile of MET on said system
> > > > will clear up the issue.
> >
> > > >
> >
> > > > As far as the second comment, I don't think you interpreted what I
> > > > am doing correctly.  In each file, there are three sets of data for
> > > > each variable.  They are not identical since the first set is the ob
> > > > neighborhood data for precip > 1.  The next set is the ob
> > > > neighborhood data for precip > 25, and the same for precip > 50.  If
> > > > you compare the model and ob data, the model data should be
> > > > different (for some obs) than the model data for the previous
> > > > category.  The neighborhoods around each ob site follow the HiRA
> > > > method.  All the data in the mpr file lines are probabilities, so I
> > > > want to create PSTD from these data.  So I was using the fcst_thresh
> > > > to filter for the HiRA thresholds I am interested in.  I tried your
> > > > code and added -by FCST_THRESH and got out three different sets of
> > > > values.  So my question is why did you have the -out_fcst_thresh set
> > > > to 10 prob thresholds?  Why wouldn't ge0 pick up all probabilities?
> > > > I am not understanding how these thresholds are being used.
> >
> > > >
> >
> > > > Thanks
> >
> > > > Bob
> >
> > > >
> >
> > > > -----Original Message-----
> >
> > > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> >
> > > > Sent: Monday, May 16, 2016 4:54 PM
> >
> > > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> >
> > > > <robert.craig.2 at us.af.mil>
> >
> > > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
> >
> > > >
> >
> > > > Bob,
> >
> > > >
> >
> > > > Thanks for sending the sample data.  I agree that STAT-
Analysis
> > > > can
> >
> > > > get pretty confusing.  It has a lot of flexibility, but we
really
> >
> > > > need to think through what you're trying to do.
> >
> > > >
> >
> > > > First, regarding the error you're getting.  Unfortunately, the
> >
> > > > config file string parser is writing a temp file in the
current
> > "runtime"
> >
> > > directory.
> >
> > > > The error is from the fact that you don't have permission to
write
> >
> > > > the file "config_23325_0_.temp" in the current directory.
> >
> > > > Ultimately, we should change that to use the temp directory
instead.
> >
> > > >
> >
> > > > Next, I looked at the data you sent to me.  Listed below are
the
> >
> > > > unique combinations of just a few of the header columns:
> >
> > > >
> >
> > > > FCST_VAR  FCST_THRESH  LINE_TYPE  TOTAL
> > > > APCP      >=1          MPR        14666
> > > > APCP      >=25         MPR        14666
> > > > APCP      >=50         MPR        14666
> > > > CEIL      <=1000       MPR        11926
> > > > CEIL      <=100        MPR        11926
> > > > CEIL      <=300        MPR        11926
> >
> > > >
> >
> > > > Based on this, it looks like you have a lot of duplicate
matched
> >
> > > > pair
> >
> > > > (MPR) output lines... We have the same 14666 pairs for APCP
> > > > repeated
> >
> > > > 3 times followed by the same 11926 pairs for CEIL repeated 3
times.
> >
> > > > This isn't necessary.  Instead, the FCST_THRESH and OBS_THRESH
> >
> > > > columns for the MPR line type should be set to "NA".  The MPR
line
> >
> > > > type that Point-Stat creates just contains the paired forecast
and
> >
> > > > observation
> >
> > > values.
> >
> > > > Thresholds do not apply to this line type.
> >
> > > >
> >
> > > > I posted an updated version of your file to the ftp site.  I
> >
> > > > stripped it down to 14666 APCP lines and 11926 CEIL lines with
NA
> > > > in
> >
> > > > the FCST_THRESH and OBS_THRESH columns:
> >
> > > >
> >
> > > >
> >
> > > >
> > > > ftp://ftp.rap.ucar.edu/incoming/irap/met_help/craig_data/point_stat_3_galwem_120000L_20160501_120000V_JHG.stat
> >
> > > >
> >
> > > > Looking at the values in the FCST column, I see numbers
between 0
> >
> > > > and
> >
> > > > 1 (0, 0.11, 0.22, 0.33, ...).  Looking in the OBS column, I
see 2
> >
> > > > numbers (0 or 1).  And looking at your config file, it looks
like
> >
> > > > you want to use these MPR lines to compute probabilistic
output.
> >
> > > > MET verifies probabilities using an Nx2 contingency table.
You
> > > > use
> >
> > > "-out_fcst_thresh"
> >
> > > > to select the probabilistic thresholds to be applied and
> >
> > > "-out_obs_thresh"
> >
> > > > to select the observation threshold to be applied.
> >
> > > >
> >
> > > > Here's a stat-analysis job you could run to read the MPR
lines,
> >
> > > > define the probabilistic forecast thresholds, define the
single
> >
> > > > observation threshold, and compute a PSTD output line.  Using
"-by
> >
> > > > FCST_VAR" tells it to run the job separately for each unique
entry
> >
> > > > found in the FCST_VAR
> >
> > > column.
> >
> > > >
> >
> > > > /usr/local/met-5.1/bin/stat_analysis \
> > > >    -lookin point_stat_3_galwem_120000L_20160501_120000V_JHG.stat \
> > > >    -job aggregate_stat -line_type MPR -out_line_type PSTD \
> > > >    -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
> > > >    -out_obs_thresh eq1.0 \
> > > >    -by FCST_VAR \
> > > >    -out_stat out_pstd.txt
> >
> > > >
> >
> > > > The output statistics are written to "out_pstd.txt".
> >
> > > >
> >
> > > > Hope that helps.
> >
> > > >
> >
> > > > John
> >
> > > >
> >
> > > >
> >
> > > >
> >
> > > >
> >
> > > >
> >
> > > >
> >
> > > >
> >
> > > > On Mon, May 16, 2016 at 2:47 PM, robert.craig.2 at us.af.mil via
RT <
> >
> > > > met_help at ucar.edu> wrote:
> >
> > > >
> >
> > > > >
> >
> > > > > <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
> >
> > > > >
> >
> > > > > Thanks John, I knew the data was space delimited but forgot
to
> >
> > > > > check the header.  As usual with MET, I progressed further
but
> > > > > am
> >
> > > > > hitting a
> >
> > > new
> >
> > > > > error.   See below.   I pushed the the config file to the
ftp
> >
> > > directory.
> >
> > > > > As you can see, the -tmp_dir is set to
> >
> > > /h/data/global/WXQC/data/met/tmp.
> >
> > > > > This directory permissions are wide open - infact stat_anal
temp
> >
> > > > > files
> >
> > > > are
> >
> > > > > in there.   Does the config*.temp try to write somewhere
else?
> >
> > > > >
> >
> > > > > Also, notice in the command line options, there are three
> thesholds.
> >
> > > > > MET kept telling me that I had to have three since this is
> >
> > > > > probability
> >
> > > > data.
> >
> > > > > Also, the latest MPR files (.stat) are in the ftp dir.  As
you
> > > > > can
> >
> > > > > see I generated model/ob pairs using different thresholds
for
> > > > > the
> >
> > > > > forecast and observation data.  So this is where I get
confused:
> > > > > I
> >
> > > > > assume the fcst thresh in the config file is a filter to
pull
> >
> > > > > those lines that have the threshold I want.  I am not sure
what
> >
> > > > > the -out_fcst_thresh in the command line is doing.  If it is
> >
> > > > > filtering the mpr line fcst data, then I would think I would
set
> >
> > > > > it to ge 0 for the fcst and ob since the fcst and ob data
range
> >
> > > > > from 0 to 1.  Am
> >
> > > I handling this correctly?
> >
> > > > >
> >
> > > > > Thanks
> >
> > > > > BOb
> >
> > > > >
> >
> > > > > ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
> >
> > > > > '/h/data/global/WXQC/data/met/mdlob_pairs', '-out',
> >
> > > > >
'/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z
> > > > > ',
> >
> > > > > '-config',
> >
> > > > > '/h/WXQC/met-
5.1/data/config/STATAnalysisConfig_hira_bs_updated'
> > > > > ,
> >
> > > > > '-out_fcst_thresh ge0,ge0.5,ge1', '-out_obs_thresh ge0', '-
v',
> >
> > > > > '6'] DEBUG 1: Creating STAT-Analysis output file
> >
> > > > >
"/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z"
> >
> > > > > DEBUG 1: Default Config File:
> >
> > > > > /home/qcteam/met-
5.1/share/met/config/STATAnalysisConfig_default
> >
> > > > > DEBUG 1: User Config File:
> >
> > > > > /h/WXQC/met-
5.1/data/config/STATAnalysisConfig_hira_bs_updated
> >
> > > > > DEBUG 4: Default Job from the config file: "-model GALWEM
> >
> > > > > -fcst_lead
> >
> > > > > 120000 -fcst_init_beg 20160501_000000 -fcst_init_end
> >
> > > > > 20160502_000000 -fcst_init_hour 000000 -fcst_var APCP
> > > > > -fcst_thresh
> >
> > > > > >=1 -line_type MPR -vif_flag 1 "
> >
> > > > > DEBUG 4: Amending default job with command line options:
> >
> > > > > "-out_fcst_thresh
> >
> > > > > ge0,ge0.5,ge1 -out_obs_thresh ge0"
> >
> > > > > DEBUG 3: Processing STAT file
> >
> > > > >
> >
> > > >
> >
> > >
> >
>
"/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat"
> >
> > > > > ... 1 of 10
> >
> > > > > ERROR  :
> >
> > > > > ERROR  :
> >
> > > > > ERROR  :   MetConfig::read_string(const char *) -> unable to
open
> > temp
> >
> > > > > file "config_23325_0_.temp"
> >
> > > > > ERROR  :
> >
> > > > >
> >
> > > > > -----Original Message-----
> >
> > > > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> >
> > > > > Sent: Friday, May 13, 2016 6:27 PM
> >
> > > > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> >
> > > > > <robert.craig.2 at us.af.mil>
> >
> > > > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
> >
> > > > >
> >
> > > > > Bob,
> >
> > > > >
> >
> > > > > The problem is coming from the first line of the file you
sent to
> me.
> >
> > > > > It contains a comma-separated list of header column names.
> >
> > > > >
> >
> > > > > I'm not exactly sure where you pulled those header column
names,
> >
> > > > > but that's the problem.  MET expects data to be separated by
> > whitespace...
> >
> > > > > so it interprets that long string with a bunch of commas as
a
> >
> > > > > single
> >
> > > > column.
> >
> > > > > The
> >
> > > > > error comes when it tries to read the "second" column.   If
you
> just
> >
> > > > remove
> >
> > > > > that first line, it should run fine.
> >
> > > > >
> >
> > > > > If you do want header columns, here's a trick.  Run the
> > > > > following
> > job:
> >
> > > > >
> >
> > > > > stat_analysis -lookin
> >
> > > point_stat_3_galwem_120000L_20160501_120000V.stat \
> >
> > > > >    -job filter -line_type MPR -dump_row out.stat
> >
> > > > >
> >
> > > > > The file out.stat, will now contain the full header for the
MPR
> >
> > > > > line
> >
> > > > type.
> >
> > > > > When you select a single LINE_TYPE value, stat-analysis will
> > > > > write
> >
> > > > > the full header for that line type to the output.
> >
> > > > >
> >
> > > > > Have a good weekend.
> >
> > > > >
> >
> > > > > John
> >
> > > > >
> >
> > > > >
> >
> > > > > On Fri, May 13, 2016 at 10:26 AM, robert.craig.2 at us.af.mil
via
> > > > > RT
> >
> > > > > < met_help at ucar.edu> wrote:
> >
> > > > >
> >
> > > > > >
> >
> > > > > > Fri May 13 10:26:26 2016: Request 76361 was acted upon.
> >
> > > > > > Transaction: Ticket created by robert.craig.2 at us.af.mil
> >
> > > > > >        Queue: met_help
> >
> > > > > >      Subject: Statanalysis Question
> >
> > > > > >        Owner: Nobody
> >
> > > > > >   Requestors: robert.craig.2 at us.af.mil
> >
> > > > > >       Status: new
> >
> > > > > >  Ticket <URL:
> >
> > > > > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361
> >
> > > > > > >
> >
> > > > > >
> >
> > > > > >
> >
> > > > > > > John, I am getting the following error when running
> > > > > > > statanalysis.  The forecast times and valid times seem to
> > > > > > > be correct to me, so I am not sure of the cause of the
> > > > > > > error.
> >
> > > > > >
> >
> > > > > > /h/WXQC/met-5.1/bin/stat_analysis -lookin
> >
> > > > > > /h/data/global/WXQC/data/met/mdlob_pairs -out
> >
> > > > > >
/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_
> > > > > > 0Z
> >
> > > > > > -config
> >
> > > > > > /h/WXQC/met-
5.1/data/config/STATAnalysisConfig_hira_bs_updated
> >
> > > > > > -v 6 DEBUG 1: Creating STAT-Analysis output file
> >
> > > > > >
> "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z"
> >
> > > > > > DEBUG 1: Default Config File:
> >
> > > > > > /home/qcteam/met-
5.1/share/met/config/STATAnalysisConfig_defau
> > > > > > lt
> >
> > > > > > DEBUG 1: User Config File:
> >
> > > > > > /h/WXQC/met-
5.1/data/config/STATAnalysisConfig_hira_bs_updated
> >
> > > > > > DEBUG 4: Default Job from the config file: "-model GALWEM
> >
> > > > > > -fcst_lead
> >
> > > > > > 120000 -fcst_init_beg 20160501_000000 -fcst_init_end
> >
> > > > > > 20160501_000000 -fcst_init_hour 120000 -fcst_var APCP
> >
> > > > > > -fcst_thresh
> >
> > > > > > >=50 -line_type MPR -vif_flag 1 "
> >
> > > > > > DEBUG 4: Amending default job with command line options:
"(nul)"
> >
> > > > > > DEBUG 3: Processing STAT file
> >
> > > > > >
> >
> > > > >
> >
> > > >
> >
> > >
> >
>
"/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat"
> >
> > > > > > ... 1 of 2
> >
> > > > > > ERROR  :
> >
> > > > > > ERROR  : DataLine::get_item(int) -> range check error
ERROR  :
> >
> > > > > >
> >
> > > > > > The config file and data file are on your server.

------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #76361] Statanalysis Question
From: robert.craig.2 at us.af.mil
Time: Thu May 26 12:30:32 2016

John, that got me past that error and into a new one:

['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
'/h/data/global/WXQC/data/met/ens_cont_tbl', '-out',
'/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z',
'-line_type PCT', '-out_line_type PSTD', '-by FCST_VAR', '-by
FCST_THRESH', '-v', '6']
DEBUG 1: Creating STAT-Analysis output file
"/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z"
DEBUG 4: Amending default job with command line options: "-line_type
PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH"
DEBUG 3: Processing STAT file
"/h/data/global/WXQC/data/met/ens_cont_tbl/GALWEM_APCP_12hr_1_3_PCT_0Z_pct.stat"
.... 1 of 3
DEBUG 3: Processing STAT file
"/h/data/global/WXQC/data/met/ens_cont_tbl/GALWEM_APCP_12hr_25_3_PCT_0Z_pct.stat"
... 2 of 3
DEBUG 3: Processing STAT file
"/h/data/global/WXQC/data/met/ens_cont_tbl/GALWEM_APCP_12hr_50_3_PCT_0Z_pct.stat"
... 3 of 3
DEBUG 2: STAT Lines read     = 9
DEBUG 2: STAT Lines retained = 9
DEBUG 4:
DEBUG 4: Initializing Job 1 to default job: "-line_type PCT -by
FCST_VAR -by FCST_THRESH -out_line_type PSTD -out_alpha 0.05000 "
DEBUG 4:
DEBUG 4: Amending Job 1 with options: "-line_type PCT -out_line_type
PSTD -by FCST_VAR -by FCST_THRESH"
DEBUG 4:
DEBUG 4: Amending Job 1 with command line options: "-line_type PCT
-out_line_type PSTD -by FCST_VAR -by FCST_THRESH"
DEBUG 2:
DEBUG 2: Processing Job 1: -line_type PCT -by FCST_VAR -by FCST_THRESH
-out_line_type PSTD -out_alpha 0.05000
WARNING:
WARNING: The -by option is ignored for the "NA" job type.
WARNING:
ERROR  :
ERROR  : do_job() -> jobtype value of 7 not currently supported!
ERROR  :
ERROR  :
ERROR  : main() -> encountered an error value of 1.  Calling
clean_up() and usage() before exiting.
ERROR  :

The data files it is reading after the previous fix are:

VERSION MODEL  FCST_LEAD FCST_VALID_BEG  FCST_VALID_END  OBS_LEAD
OBS_VALID_BEG   OBS_VALID_END   FCST_VAR FCST_LEV OBS_VAR OBS_LEV
OBTYPE VX_MASK INTERP_MTHD INTERP_PNTS FCST_THRESH
                                  OBS_THRESH COV_THRESH ALPHA
LINE_TYPE TOTAL N_THRESH THRESH_1 OY_1 ON_1  THRESH_2 OY_2 ON_2
THRESH_3 OY_3 ON_3 THRESH_4 OY_4 ON_4 THRESH_5 OY_5 ON_5 THRESH_6 OY_6
ON_6
 THRESH_7 OY_7 ON_7 THRESH_8 OY_8 ON_8 THRESH_9 OY_9 ON_9 THRESH_10
OY_10 ON_10 THRESH_11
V5.1    GALWEM 120000    20160501_120000 20160503_120000 000000
20160501_120000 20160503_120000 APCP     L0       APCP    L0
ADPSFC FULL    NEAREST     4           >=0,>=0.1,>=0.2,>=0.3,>=0.4,>=
0.5,>=0.6,>=0.7,>=0.8,>=0.9,>=1.0 =1         NA         NA    PCT
39072       11        0  472 24899      0.1   85 1172      0.2   68
824      0.3  120  869      0.4   88  640      0.5   91  685
      0.6  124  661      0.7  129  649      0.8  189  741       0.9
1962  4604         1
V5.1    GALWEM 120000    20160501_120000 20160503_120000 000000
20160501_120000 20160503_120000 APCP     L0       APCP    L0
ADPSFC FULL    NEAREST     4           >=0,>=0.1,>=0.2,>=0.3,>=0.4,>=
0.5,>=0.6,>=0.7,>=0.8,>=0.9,>=1.0 =1         NA         NA    PCT
39072       11        0  408 37879      0.1   25  162      0.2   16
104      0.3   16  120      0.4   11   36      0.5   11   49
      0.6   14   50      0.7    5   37      0.8   16   42       0.9
20    51         1
V5.1    GALWEM 120000    20160501_120000 20160503_120000 000000
20160501_120000 20160503_120000 APCP     L0       APCP    L0
ADPSFC FULL    NEAREST     4           >=0,>=0.1,>=0.2,>=0.3,>=0.4,>=
0.5,>=0.6,>=0.7,>=0.8,>=0.9,>=1.0 =1         NA         NA    PCT
39072       11        0  170 38702      0.1    6   59      0.2    3
35      0.3    0   31      0.4    4   16      0.5    3   18
      0.6    2    9      0.7    1    8      0.8    1    3       0.9
1     0         1

Bob

-----Original Message-----
From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
Sent: Thursday, May 26, 2016 1:07 PM
To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
<robert.craig.2 at us.af.mil>
Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question

Bob,

Ah OK, I see the problem here.  The short answer is that you should
use the "-out_stat" job command option instead of "-out".

And here's the long answer...

Prior to MET version 5.1, STAT-Analysis wrote a very simple ASCII
output file.  For your MPR data, it's reading MPR lines, putting them
into probabilistic contingency tables, and writing output PCT
information.  That output is normally written to the screen in a
pretty basic way.  Notice that the output just contains the PCT counts
without the 21 header columns common to all the .stat line types.
Using the "-out" job command option redirects that output from the
screen to an output file, but it's still formatted in exactly the same
way.  And looking at the format of the files you sent, I see that's
what you've done.

In MET version 5.1, we added a new output option for STAT-Analysis
named "-out_stat".  That tells STAT-Analysis to write the same numbers
to an output file but add in those 21 header columns.  Using the true
.stat format enables STAT-Analysis to read its own output... which is
what you want to do.  We kept the "-out" job command option available
for backwards compatibility.

Please try rerunning your MPR -> PCT jobs using the "-out_stat" option
and then pass that output back in for the PCT -> PSTD jobs.
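
The two-step workflow described here (MPR -> PCT written with
"-out_stat", then PCT -> PSTD) could be scripted roughly as follows.
This is a sketch only: the helper names, the example file names, and
the choice of "-by" columns are assumptions made for illustration, and
"-job aggregate_stat" is borrowed from John's earlier example in this
thread rather than from the failing command.

```python
from typing import List

# Path to the executable as it appears in this thread.
STAT_ANALYSIS = "/h/WXQC/met-5.1/bin/stat_analysis"

# Probability thresholds used in the thread's examples.
PROB_THRESH = "ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0"


def mpr_to_pct_cmd(mpr_file: str, pct_file: str) -> List[str]:
    """Step 1 (hypothetical helper): aggregate MPR lines into PCT lines.

    Uses -out_stat so the output carries the full .stat header columns.
    """
    return [
        STAT_ANALYSIS, "-lookin", mpr_file,
        "-out_stat", pct_file,
        "-job", "aggregate_stat",
        "-line_type", "MPR", "-out_line_type", "PCT",
        "-by", "FCST_VAR,FCST_THRESH",
        "-out_fcst_thresh", PROB_THRESH,
        "-out_obs_thresh", "ge1",
    ]


def pct_to_pstd_cmd(pct_dir: str, pstd_file: str) -> List[str]:
    """Step 2 (hypothetical helper): aggregate PCT lines into PSTD lines."""
    return [
        STAT_ANALYSIS, "-lookin", pct_dir,
        "-out_stat", pstd_file,
        "-job", "aggregate_stat",
        "-line_type", "PCT", "-out_line_type", "PSTD",
        "-by", "FCST_VAR,FCST_THRESH",
    ]
```

Each list can be handed to subprocess.run(cmd, check=True).  Every
option and its value is a separate list element here; that is a choice
of this sketch, not something the thread prescribes.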

Thanks,
John

On Thu, May 26, 2016 at 11:51 AM, robert.craig.2 at us.af.mil via RT <
met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>
> Well, that fixed that issue and it got me to my next error:
>
> /gpfs/h/data/global/WXQC/data/temp
> ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
> '/h/data/global/WXQC/data/met/ens_cont_tbl', '-out',
> '/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_25_3_PSTD_0Z',
> '-out_fcst_thresh
> ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0',
> '-out_obs_thresh eq1', '-line_type PCT', '-out_line_type PSTD', '-by
> FCST_VAR', '-by FCST_THRESH', '-v', '6']
> DEBUG 1: Creating STAT-Analysis output file
> "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_25_3_PSTD_0Z"
> DEBUG 4: Amending default job with command line options:
> "-out_fcst_thresh
> ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0
> -out_obs_thresh eq1 -line_type PCT -out_line_type PSTD -by FCST_VAR
> -by FCST_THRESH"
> DEBUG 3: Processing STAT file
> "/h/data/global/WXQC/data/met/ens_cont_tbl/GALWEM_APCP_12hr_1_3_PCT_0Z_pct.stat"
> .... 1 of 3
> ERROR  :
> ERROR  : DataLine::get_item(int) -> range check error
> ERROR  :
>
> Any ideas?
>
> -----Original Message-----
> From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> Sent: Thursday, May 26, 2016 11:31 AM
> To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> <robert.craig.2 at us.af.mil>
> Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
> Bob,
>
> If you tell STAT-Analysis to "-lookin" a directory, it will
> recursively search that directory looking for files ending in
> ".stat".  If you instead tell STAT-Analysis to "-lookin" one or more
> specific file names, it will process them regardless of their suffix.
>
> So you can either name your PCT output files as "_pct.stat" or you
> can keep them as "_pct.txt" and change your STAT-Analysis job command
> to something like:
>   -lookin /h/data/global/WXQC/data/met/ens_cont_tbl/*_pct.txt
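
The "-lookin" rule described here can be mimicked in a few lines of
Python, which may help when checking which files a job would pick up.
This is only an illustration of the behavior as stated in this reply,
not MET's actual implementation; the function name resolve_lookin is
made up for the sketch.

```python
import os
from typing import Iterable, List


def resolve_lookin(paths: Iterable[str]) -> List[str]:
    """Mimic the described -lookin rule (illustrative only).

    Directories are searched recursively for files ending in ".stat";
    anything else is treated as an explicit file name and kept
    regardless of its suffix.
    """
    found = []
    for path in paths:
        if os.path.isdir(path):
            for root, _dirs, files in os.walk(path):
                for name in sorted(files):
                    if name.endswith(".stat"):
                        found.append(os.path.join(root, name))
        else:
            # Explicit file name: accepted whatever the suffix is.
            found.append(path)
    return found
```

Under this rule, a directory holding only "_pct.txt" files resolves to
an empty list, which matches the "no STAT files found" error reported
earlier in the thread.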
>
> Make sense?
>
> Thanks,
> John
>
> On Thu, May 26, 2016 at 9:55 AM, robert.craig.2 at us.af.mil via RT <
> met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
> >
> > Hi John, I am to the point where I am trying to read the pct.txt
> > files to generate the PSTD files.  When I do this using
> > /h/WXQC/met-5.1/bin/stat_analysis  -lookin
> > /h/data/global/WXQC/data/met/ens_cont_tbl -out
> > /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z
> > -line_type PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH -v
> > 6, I get the following:
> >
> >
> >
> > ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
> > '/h/data/global/WXQC/data/met/ens_cont_tbl/', '-out',
> > '/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z',
> > '-line_type PCT', '-out_line_type PSTD', '-by FCST_VAR', '-by
> > FCST_THRESH', '-v', '6']
> >
> > DEBUG 1: Creating STAT-Analysis output file
> > "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z"
> >
> > DEBUG 4: Amending default job with command line options:
> > "-line_type PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH"
> >
> > ERROR  :
> >
> > ERROR  : process_search_dirs() -> no STAT files found in the
> > directories specified!
> >
> > ERROR  :
> >
> > ERROR  :
> >
> > ERROR  : main() -> encountered an error value of 1.  Calling
> > clean_up() and usage() before exiting.
> >
> > ERROR  :
> >
> >
> >
> > *** Model Evaluation Tools (METV5.1) ***
> >
> >
> >
> > The ens_cont_tbl directory contains 3 files and I placed them on
> > your ftp server.  I didn’t think the file naming convention was
> > important here as long as they end in _pct.txt.  I did try renaming
> > one to the more conventional naming scheme – point_stat … but that
> > didn’t make a difference.  So why is it not finding my files?
> >
> >
> >
> > Thanks
> >
> > Bob
> >
> >
> >
> > -----Original Message-----
> > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> > Sent: Monday, May 23, 2016 11:56 AM
> > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> > <robert.craig.2 at us.af.mil>
> > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
> >
> >
> >
> > Bob,
> >
> >
> >
> > Surprisingly, I was able to replicate the same error you're seeing!
> > I see where the error is occurring, but I don't yet understand why
> > it's happening.  However, I do have a workaround for you.
> >
> >
> >
> > Try editing your config file by emptying out the "fcst_thresh"
> > setting:
> >
> >    fcst_thresh = [];
> >
> >
> >
> > In your job, you're using "-by fcst_thresh" anyway, so STAT-Analysis
> > will group the data by the FCST_THRESH column.  After I removed the
> > "fcst_thresh" setting, the job completed.  I'll continue looking
> > into the reason for that error.
> >
> >
> >
> > How many of these files are you planning to pass to STAT-Analysis
> > at any given time?  I see each sample file contains about 80,000
> > MPR lines.
> >
> > Passing it 5 files to process about 400,000 lines, that job takes
> > about 62 seconds to run on my machine.  I worry that as you
> > increase the number of files, it'll run very slowly.
> >
> >
> >
> > Here's some alternative logic you might consider.  Run
> > STAT-Analysis once for each .stat file you're generating.  Instead
> > of writing the PSTD line type, write the PCT line type (that's just
> > the counts of that probabilistic Nx2 table).  Then run jobs to
> > aggregate the PCT line types and compute PSTD lines (-job
> > aggregate_stat -line_type PCT -out_line_type PSTD).  That would
> > make the processing for each STAT-Analysis job much more
> > manageable.
> >
> >
> >
> > BUT in order to do this in 2 steps, you would really need to put
> > the neighborhood size information into the INTERP_MTHD and
> > INTERP_PNTS columns.  Otherwise, the threshold information won't be
> > retained in the PCT output lines.  And I also found an issue in the
> > formatting of the OBS_THRESH output column ("=1" in the output
> > should really be "==1").  For now I just switched to "ge1", but I
> > need to fix this formatting issue.
> >
> >
> >
> > Listed below are the c-shell commands I used to loop over your
> > sample files and run stat_analysis in this way...
> >
> >
> >
> > # Loop through MPR files and compute PCT output lines
> > foreach mpr_file (`ls stat_mpr/*.stat`)
> >
> >    # Map stat_mpr/NAME.stat to stat_pct/NAME_pct.stat.  The dot is
> >    # escaped so that "_stat" inside a file name is left alone.
> >    set pct_file = `echo $mpr_file | sed 's/stat_mpr/stat_pct/g' | sed 's/\.stat/_pct.stat/'`
> >
> >    /usr/local/met-5.1/bin/stat_analysis -lookin $mpr_file -out_stat $pct_file \
> >       -job aggregate_stat -line_type MPR -out_line_type PCT \
> >       -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,FCST_THRESH \
> >       -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
> >       -out_obs_thresh ge1
> >
> > end
> >
> >
> >
> > # Aggregate PCT lines and compute PSTD stats
> > /usr/local/met-5.1/bin/stat_analysis -lookin stat_pct -out_stat pstd.stat \
> >    -job aggregate_stat -line_type PCT -out_line_type PSTD \
> >    -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,INTERP_MTHD,INTERP_PNTS
> >
> >
> >
> > Note that this job is combining all of the neighborhood sizes
> > because INTERP_MTHD and INTERP_PNTS are set the same in the PCT
> > lines.
> >
> >
> >
> > Thanks,
> >
> > John
> >
> >
> >
> >
> >
> >
> >
> > On Fri, May 20, 2016 at 3:33 PM, robert.craig.2 at us.af.mil via RT <
> > met_help at ucar.edu> wrote:
> >
> >
> >
> > >
> >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
> >
> > >
> >
> > > John, I might have found more information on the error below:
> >
> > >
> >
> > > MetConfig::read_string(const char *) -> unable to open temp file
> >
> > > "config_1943_0_.temp"
> >
> > >
> >
> > > When I run stat_anal on MPR files, there are no problems if I
> > > process 2 days of data, but when I increase it to three days, the
> > > error occurs.  I also played with the dates to try to eliminate a
> > > particular data file as the cause.  So, it seems to be related to
> > > how many data files it has to process.  I posted all the data
> > > files on the ftp server and my command line is below.  The config
> > > file on the server is still representative.  This error does not
> > > seem related to directory permissions.
> >
> > >
> >
> > > /h/WXQC/met-5.1/bin/stat_analysis  -lookin
> >
> > > /h/data/global/WXQC/data/met/mdlob_pairs -out
> >
> > > /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0
> >
> > > -config
> > > /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> >
> > > -out_fcst_thresh
> >
> > > ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0
> >
> > > -out_obs_thresh eq1 -by FCST_VAR -by FCST_THRESH -v 6
> >
> > >
> >
> > > This is concerning since I have to run this on many more than two
> > > days worth of data.  Any idea what might be happening?
> >
> > >
> >
> > > Thanks
> >
> > > Bob
> >
> > >
> >
> > > -----Original Message-----
> >
> > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> >
> > > Sent: Thursday, May 19, 2016 4:31 PM
> >
> > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> >
> > > <robert.craig.2 at us.af.mil>
> >
> > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
> >
> > >
> >
> > > Bob,
> >
> > >
> >
> > > It's funny, I was just talking to a colleague today about doing
> >
> > > something very similar to this on a different dataset.
> >
> > >
> >
> > > I think I understand what you're saying about the thresholds >1,
> > > >25, and >50.  These are used to define the "event" which is used
> > > in computing the fractional coverage fields.  My confusion comes
> > > from the fact that in MET currently only Grid-Stat is computing
> > > these fractional coverage fields, not Point-Stat.  But I see now
> > > what you're doing.
> > >
> > > One suggestion would be to change the contents of the INTERP_MTHD
> > > and INTERP_PNTS header columns.  You currently have NEAREST, 1
> > > which would indicate that each observation value was matched to
> > > the forecast value at the nearest grid point.  Instead, I would
> > > suggest writing NBRHD and N, where N indicates the number of
> > > points in the neighborhood.  For example, the NBRHD output from
> > > Grid-Stat would write 49 if we were using a 7x7 box.
> >
> > >
> >
> > > The -fcst_thresh and -obs_thresh options are used to filter the
> > > input MPR lines, as you already know.  The -out_fcst_thresh and
> > > -out_obs_thresh options define the thresholds to be applied when
> > > computing the output for the job.  In MET, probabilities are not
> > > processed in a "continuous" way.  Instead, they are put into
> > > probability bins.  Those bins are used to create an Nx2
> > > contingency table from which probabilistic statistics are
> > > computed.
> > >
> > > Setting "-out_fcst_thresh gt0,gt0.25,gt0.5,gt0.75,gt1.0" defines 4
> > > probability bins which yields a 4x2 contingency table, from which
> > > stats are computed.
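
The binning idea can be illustrated with a short Python sketch that
turns forecast probabilities and 0/1 observations into an Nx2 table of
(OY, ON) counts per bin, the same column names that appear in the PCT
header quoted in this thread.  The function name and the exact
bin-edge handling (each bin closed on the left, the last bin also
including its upper edge) are assumptions of this sketch, not a
statement of how MET treats the edges.

```python
from typing import List, Sequence, Tuple


def nx2_table(probs: Sequence[float],
              obs_events: Sequence[int],
              edges: Sequence[float]) -> List[Tuple[int, int]]:
    """Count (OY, ON) per probability bin; len(edges) - 1 bins.

    Assumed binning: edges[i] <= p < edges[i + 1], with the final bin
    also accepting p == edges[-1].
    """
    n_bins = len(edges) - 1
    table = [[0, 0] for _ in range(n_bins)]
    for p, event in zip(probs, obs_events):
        for i in range(n_bins):
            is_last = i == n_bins - 1
            if edges[i] <= p < edges[i + 1] or (is_last and p == edges[-1]):
                # Column 0 counts observed events (OY), column 1 non-events (ON).
                table[i][0 if event else 1] += 1
                break
    return [tuple(row) for row in table]


# Five edges define four bins, giving a 4x2 table.
example = nx2_table(
    probs=[0.0, 0.11, 0.33, 0.5, 1.0, 0.9],
    obs_events=[0, 1, 0, 1, 1, 0],
    edges=[0.0, 0.25, 0.5, 0.75, 1.0],
)
```

With the eleven thresholds ge0 through ge1.0 used elsewhere in the
thread, the same function would yield ten bins, consistent with the
ten OY/ON pairs in the PCT lines shown.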
> >
> > >
> >
> > > Hope that helps.
> >
> > >
> >
> > > Thanks,
> >
> > > John
> >
> > >
> >
> > > On Thu, May 19, 2016 at 2:23 PM, robert.craig.2 at us.af.mil via RT
<
> >
> > > met_help at ucar.edu> wrote:
> >
> > >
> >
> > > >
> >
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361
>
> >
> > > >
> >
> > > > Thanks John, the directory problem was due to some corruption on
> > > > the MET we had on one of the systems.  On another system the
> > > > problem doesn't come up so we are hoping a recompile of MET on
> > > > said system will clear up the issue.
> > > >
> > > > As far as the second comment, I don't think you interpreted what
> > > > I am doing correctly.  In each file, there are three sets of data
> > > > for each variable.  They are not identical since the first set is
> > > > the ob neighborhood data for precip > 1.  The next set is the ob
> > > > neighborhood data for precip > 25, and the same for precip > 50.
> > > > If you compare the model and ob data, the model data should be
> > > > different (for some obs) than the model data for the previous
> > > > category.  The neighborhoods around each ob site follow the HiRA
> > > > method.  All the data in the mpr file lines are probabilities, so
> > > > I want to create PSTD from these data.  So I was using the
> > > > fcst_thresh to filter for the HiRA thresholds I am interested in.
> > > > I tried your code and added -by FCST_THRESH and got out three
> > > > different sets of values.  So my question is why did you have the
> > > > -out_fcst_thresh set to 10 prob thresholds?  Why wouldn't ge0
> > > > pick up all probabilities?  I am not understanding how these
> > > > thresholds are being used.
> >
> > > >
> >
> > > > Thanks
> >
> > > > Bob
> >
> > > >
> >
> > > > -----Original Message-----
> >
> > > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> >
> > > > Sent: Monday, May 16, 2016 4:54 PM
> >
> > > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> >
> > > > <robert.craig.2 at us.af.mil>
> >
> > > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
> >
> > > >
> >
> > > > Bob,
> >
> > > >
> >
> > > > Thanks for sending the sample data.  I agree that STAT-Analysis
> > > > can get pretty confusing.  It has a lot of flexibility, but we
> > > > really need to think through what you're trying to do.
> > > >
> > > > First, regarding the error you're getting.  Unfortunately, the
> > > > config file string parser is writing a temp file in the current
> > > > "runtime" directory.  The error is from the fact that you don't
> > > > have permission to write the file "config_23325_0_.temp" in the
> > > > current directory.  Ultimately, we should change that to use the
> > > > temp directory instead.
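
Going by the diagnosis given at this point in the thread (later
messages suggest the number of input files also plays a role), one
quick check before launching a job is whether a temp file can actually
be created in the directory the job will run from.  The helper below
is a hypothetical sketch; its name and the file prefix/suffix are
chosen only to resemble the "config_*.temp" name in the error message.

```python
import os
import tempfile


def can_create_files(directory: str) -> bool:
    """Return True if a temporary file can be created in `directory`.

    The file is removed again automatically when the context exits.
    """
    try:
        with tempfile.NamedTemporaryFile(dir=directory,
                                         prefix="config_",
                                         suffix=".temp"):
            return True
    except OSError:
        # Covers missing directories and permission errors alike.
        return False


# Check the directory the job would run from.
cwd_ok = can_create_files(os.getcwd())
```

If the check fails, one option is to start the job from a writable
location instead, for example with subprocess.run(cmd, cwd=some_dir),
where some_dir is any directory for which the check passes.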
> >
> > > >
> >
> > > > Next, I looked at the data you sent to me.  Listed below are the
> > > > unique combinations of just a few of the header columns:
> > > >
> > > > FCST_VAR  FCST_THRESH  LINE_TYPE  TOTAL
> > > > APCP      >=1          MPR        14666
> > > > APCP      >=25         MPR        14666
> > > > APCP      >=50         MPR        14666
> > > > CEIL      <=1000       MPR        11926
> > > > CEIL      <=100        MPR        11926
> > > > CEIL      <=300        MPR        11926
> > > >
> > > > Based on this, it looks like you have a lot of duplicate matched
> > > > pair (MPR) output lines... We have the same 14666 pairs for APCP
> > > > repeated 3 times followed by the same 11926 pairs for CEIL
> > > > repeated 3 times.  This isn't necessary.  Instead, the
> > > > FCST_THRESH and OBS_THRESH columns for the MPR line type should
> > > > be set to "NA".  The MPR line type that Point-Stat creates just
> > > > contains the paired forecast and observation values.  Thresholds
> > > > do not apply to this line type.
> >
> > > >
> >
> > > > I posted an updated version of your file to the ftp site.  I
> > > > stripped it down to 14666 APCP lines and 11926 CEIL lines with
> > > > NA in the FCST_THRESH and OBS_THRESH columns:
> > > >
> > > > ftp://ftp.rap.ucar.edu/incoming/irap/met_help/craig_data/point_stat_3_galwem_120000L_20160501_120000V_JHG.stat
> >
> > > >
> >
> > > > Looking at the values in the FCST column, I see numbers between
> > > > 0 and 1 (0, 0.11, 0.22, 0.33, ...).  Looking in the OBS column,
> > > > I see 2 numbers (0 or 1).  And looking at your config file, it
> > > > looks like you want to use these MPR lines to compute
> > > > probabilistic output.  MET verifies probabilities using an Nx2
> > > > contingency table.  You use "-out_fcst_thresh" to select the
> > > > probabilistic thresholds to be applied and "-out_obs_thresh" to
> > > > select the observation threshold to be applied.
> > > >
> > > > Here's a stat-analysis job you could run to read the MPR lines,
> > > > define the probabilistic forecast thresholds, define the single
> > > > observation threshold, and compute a PSTD output line.  Using
> > > > "-by FCST_VAR" tells it to run the job separately for each
> > > > unique entry found in the FCST_VAR column.
> >
> > > >
> >
> > > > /usr/local/met-5.1/bin/stat_analysis \
> > > >    -lookin point_stat_3_galwem_120000L_20160501_120000V_JHG.stat \
> > > >    -job aggregate_stat -line_type MPR -out_line_type PSTD \
> > > >    -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 \
> > > >    -out_obs_thresh eq1.0 \
> > > >    -by FCST_VAR \
> > > >    -out_stat out_pstd.txt
> >
> > > >
> >
> > > > The output statistics are written to "out_pstd.txt".
> >
> > > >
> >
> > > > Hope that helps.
> >
> > > >
> >
> > > > John
> >
> > > >
> >
> > > >
> >
> > > >
> >
> > > >
> >
> > > >
> >
> > > >
> >
> > > >
> >
> > > > On Mon, May 16, 2016 at 2:47 PM, robert.craig.2 at us.af.mil via
RT
> > > > <
> >
> > > > met_help at ucar.edu> wrote:
> >
> > > >
> >
> > > > >
> >
> > > > > <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361
> > > > > >
> >
> > > > >
> >
> > > > > Thanks John, I knew the data was space delimited but forgot to
> > > > > check the header.  As usual with MET, I progressed further but
> > > > > am hitting a new error.  See below.  I pushed the config file
> > > > > to the ftp directory.  As you can see, the -tmp_dir is set to
> > > > > /h/data/global/WXQC/data/met/tmp.  This directory's permissions
> > > > > are wide open - in fact stat_anal temp files are in there.
> > > > > Does the config*.temp try to write somewhere else?
> > > > >
> > > > > Also, notice in the command line options, there are three
> > > > > thresholds.  MET kept telling me that I had to have three since
> > > > > this is probability data.  Also, the latest MPR files (.stat)
> > > > > are in the ftp dir.  As you can see I generated model/ob pairs
> > > > > using different thresholds for the forecast and observation
> > > > > data.  So this is where I get confused: I assume the fcst
> > > > > thresh in the config file is a filter to pull those lines that
> > > > > have the threshold I want.  I am not sure what the
> > > > > -out_fcst_thresh in the command line is doing.  If it is
> > > > > filtering the mpr line fcst data, then I would think I would
> > > > > set it to ge 0 for the fcst and ob since the fcst and ob data
> > > > > range from 0 to 1.  Am I handling this correctly?
> > > > >
> > > > > Thanks
> > > > > Bob
> >
> > > > >
> >
> > > > > ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
> > > > > '/h/data/global/WXQC/data/met/mdlob_pairs', '-out',
> > > > > '/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z',
> > > > > '-config',
> > > > > '/h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated',
> > > > > '-out_fcst_thresh ge0,ge0.5,ge1', '-out_obs_thresh ge0', '-v',
> > > > > '6']
> > > > > DEBUG 1: Creating STAT-Analysis output file
> > > > > "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z"
> > > > > DEBUG 1: Default Config File:
> > > > > /home/qcteam/met-5.1/share/met/config/STATAnalysisConfig_default
> > > > > DEBUG 1: User Config File:
> > > > > /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> > > > > DEBUG 4: Default Job from the config file: "-model GALWEM
> > > > > -fcst_lead 120000 -fcst_init_beg 20160501_000000 -fcst_init_end
> > > > > 20160502_000000 -fcst_init_hour 000000 -fcst_var APCP
> > > > > -fcst_thresh >=1 -line_type MPR -vif_flag 1 "
> > > > > DEBUG 4: Amending default job with command line options:
> > > > > "-out_fcst_thresh ge0,ge0.5,ge1 -out_obs_thresh ge0"
> > > > > DEBUG 3: Processing STAT file
> > > > > "/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat"
> > > > > ... 1 of 10
> > > > > ERROR  :
> > > > > ERROR  :
> > > > > ERROR  :   MetConfig::read_string(const char *) -> unable to open
> > > > > temp file "config_23325_0_.temp"
> > > > > ERROR  :
> >
> > > > >
> >
> > > > > -----Original Message-----
> >
> > > > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> >
> > > > > Sent: Friday, May 13, 2016 6:27 PM
> >
> > > > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> >
> > > > > <robert.craig.2 at us.af.mil>
> >
> > > > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
> >
> > > > >
> >
> > > > > Bob,
> >
> > > > >
> >
> > > > > The problem is coming from the first line of the file you sent
> > > > > to me.  It contains a comma-separated list of header column
> > > > > names.
> > > > >
> > > > > I'm not exactly sure where you pulled those header column
> > > > > names, but that's the problem.  MET expects data to be
> > > > > separated by whitespace... so it interprets that long string
> > > > > with a bunch of commas as a single column.  The error comes
> > > > > when it tries to read the "second" column.  If you just remove
> > > > > that first line, it should run fine.
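
The failure mode described here is easy to reproduce outside MET with
a whitespace split.  The sketch below is illustrative only: get_item
is a stand-in loosely named after the DataLine::get_item(int) error
message, not MET's code, and the comma-separated header string is a
made-up example built from column names seen elsewhere in this thread.

```python
def get_item(line: str, index: int) -> str:
    """Return the whitespace-delimited column at `index`.

    Raises IndexError when the column does not exist, analogous to the
    "range check error" reported in the thread.
    """
    columns = line.split()
    if not 0 <= index < len(columns):
        raise IndexError("range check error")
    return columns[index]


# Hypothetical comma-separated header: whitespace splitting sees ONE column.
header = "VERSION,MODEL,FCST_LEAD,FCST_VALID_BEG"
n_header_columns = len(header.split())

# A whitespace-delimited data line splits into separate columns as expected.
model = get_item("V5.1 GALWEM 120000 20160501_120000", 1)
```

Asking for column 1 of the comma-separated header raises the
IndexError, since the whole string counts as a single column.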
> >
> > > > >
> >
> > > > > If you do want header columns, here's a trick.  Run the
> > > > > following job:
> > > > >
> > > > > stat_analysis -lookin point_stat_3_galwem_120000L_20160501_120000V.stat \
> > > > >    -job filter -line_type MPR -dump_row out.stat
> > > > >
> > > > > The file out.stat will now contain the full header for the MPR
> > > > > line type.  When you select a single LINE_TYPE value,
> > > > > stat-analysis will write the full header for that line type to
> > > > > the output.
> >
> > > > >
> >
> > > > > Have a good weekend.
> >
> > > > >
> >
> > > > > John
> >
> > > > >
> >
> > > > >
> >
> > > > > On Fri, May 13, 2016 at 10:26 AM, robert.craig.2 at us.af.mil
via
> > > > > RT
> >
> > > > > < met_help at ucar.edu> wrote:
> >
> > > > >
> >
> > > > > >
> >
> > > > > > Fri May 13 10:26:26 2016: Request 76361 was acted upon.
> >
> > > > > > Transaction: Ticket created by robert.craig.2 at us.af.mil
> >
> > > > > >        Queue: met_help
> >
> > > > > >      Subject: Statanalysis Question
> >
> > > > > >        Owner: Nobody
> >
> > > > > >   Requestors: robert.craig.2 at us.af.mil
> >
> > > > > >       Status: new
> >
> > > > > >  Ticket <URL:
> >
> > > > > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361
> >
> > > > > > >
> >
> > > > > >
> >
> > > > > >
> >
> > > > > > John, I am getting the following error when running
statanalysis.
> >
> > > > > > The forecast times and valid times seem to be correct to
me,
> > > > > > so I
> >
> > > > > > am not sure the cause of the error.
> >
> > > > > >
> >
> > > > > > /h/WXQC/met-5.1/bin/stat_analysis -lookin
> >
> > > > > > /h/data/global/WXQC/data/met/mdlob_pairs -out
> >
> > > > > > /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z
> >
> > > > > > -config
> >
> > > > > > /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> >
> > > > > > -v 6 DEBUG 1: Creating STAT-Analysis output file
> >
> > > > > >
> "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z"
> >
> > > > > > DEBUG 1: Default Config File:
> >
> > > > > > /home/qcteam/met-5.1/share/met/config/STATAnalysisConfig_default
> >
> > > > > > DEBUG 1: User Config File:
> >
> > > > > > /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> >
> > > > > > DEBUG 4: Default Job from the config file: "-model GALWEM
> >
> > > > > > -fcst_lead
> >
> > > > > > 120000 -fcst_init_beg 20160501_000000 -fcst_init_end
> >
> > > > > > 20160501_000000 -fcst_init_hour 120000 -fcst_var APCP
> >
> > > > > > -fcst_thresh
> >
> > > > > > >=50 -line_type MPR -vif_flag 1 "
> >
> > > > > > DEBUG 4: Amending default job with command line options:
"(nul)"
> >
> > > > > > DEBUG 3: Processing STAT file
> >
> > > > > >
> >
> > > > >
> >
> > > >
> >
> > >
> >
>
"/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat"
> >
> > > > > > ... 1 of 2
> >
> > > > > > ERROR  :
> >
> > > > > > ERROR  : DataLine::get_item(int) -> range check error
ERROR  :
> >
> > > > > >
> >
> > > > > > The config file and data file are on your server.
> >
> > > > > >
> >
> > > > > >
> >
> > > > > >
> >
> > > > >
> >
> > > > >
> >
> > > > >
> >
> > > > >
> >
> > > >
> >
> > > >
> >
> > > >
> >
> > > >
> >
> > >
> >
> > >
> >
> > >
> >
> > >
> >
> >
> >
> >
>
>
>
>



------------------------------------------------
Subject: Statanalysis Question
From: John Halley Gotway
Time: Thu May 26 13:38:08 2016

Bob,

You have failed to define the type of "job" you want STAT-Analysis to
perform.  The "aggregate_stat" job type aggregates many input lines
(PCT in your case) and uses them to derive a new output line type
(PSTD in your case).  You just need to add "-job aggregate_stat" as
shown below:

/h/WXQC/met-5.1/bin/stat_analysis \
   -lookin /h/data/global/WXQC/data/met/ens_cont_tbl \
   -out /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z \
   -job aggregate_stat -line_type PCT -out_line_type PSTD \
   -by FCST_VAR,FCST_THRESH -v 6
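
If the job is launched from a script instead of being typed at a shell
prompt, the same command can be assembled as an argument list.  The
following is only a sketch in Python: the helper name
build_aggregate_stat_args and its parameters are hypothetical and are not
part of MET.  The flags are the ones used in the command above, and the
sketch assumes each option and its value are passed as separate list
items, which is what a shell produces when it splits that command.

```python
from typing import List, Sequence


def build_aggregate_stat_args(
    stat_analysis: str,
    lookin: str,
    out: str,
    line_type: str,
    out_line_type: str,
    by: Sequence[str],
    verbosity: int = 6,
) -> List[str]:
    """Build an argv list for an aggregate_stat job (hypothetical helper).

    Only flags that appear in the thread are used: -lookin, -out, -job,
    -line_type, -out_line_type, -by and -v.
    """
    args = [
        stat_analysis,
        "-lookin", lookin,
        "-out", out,
        # The job type must be given explicitly.
        "-job", "aggregate_stat",
        "-line_type", line_type,
        "-out_line_type", out_line_type,
    ]
    if by:
        # A single comma-separated -by list, as in the command above.
        args += ["-by", ",".join(by)]
    args += ["-v", str(verbosity)]
    return args


if __name__ == "__main__":
    argv = build_aggregate_stat_args(
        stat_analysis="/h/WXQC/met-5.1/bin/stat_analysis",
        lookin="/h/data/global/WXQC/data/met/ens_cont_tbl",
        out="/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z",
        line_type="PCT",
        out_line_type="PSTD",
        by=["FCST_VAR", "FCST_THRESH"],
    )
    # Print the command for inspection; pass argv to subprocess.run() to execute it.
    print(" ".join(argv))
```

Printing the list before running it makes it easy to confirm that
"-job aggregate_stat" is present.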

Thanks,
John

On Thu, May 26, 2016 at 12:30 PM, robert.craig.2 at us.af.mil via RT <
met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
>
> John, that got me past that error and into a new one:
>
> ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
> '/h/data/global/WXQC/data/met/ens_cont_tbl', '-out',
>
'/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z',
> '-line_type PCT', '-out_line_type PSTD', '-by FCST_VAR', '-by
FCST_THRESH',
> '-v', '6']
> DEBUG 1: Creating STAT-Analysis output file
> "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z"
> DEBUG 4: Amending default job with command line options: "-line_type
PCT
> -out_line_type PSTD -by FCST_VAR -by FCST_THRESH"
> DEBUG 3: Processing STAT file
>
"/h/data/global/WXQC/data/met/ens_cont_tbl/GALWEM_APCP_12hr_1_3_PCT_0Z_pct.stat"
> .... 1 of 3
> DEBUG 3: Processing STAT file
>
"/h/data/global/WXQC/data/met/ens_cont_tbl/GALWEM_APCP_12hr_25_3_PCT_0Z_pct.stat"
> ... 2 of 3
> DEBUG 3: Processing STAT file
>
"/h/data/global/WXQC/data/met/ens_cont_tbl/GALWEM_APCP_12hr_50_3_PCT_0Z_pct.stat"
> ... 3 of 3
> DEBUG 2: STAT Lines read     = 9
> DEBUG 2: STAT Lines retained = 9
> DEBUG 4:
> DEBUG 4: Initializing Job 1 to default job: "-line_type PCT -by
FCST_VAR
> -by FCST_THRESH -out_line_type PSTD -out_alpha 0.05000 "
> DEBUG 4:
> DEBUG 4: Amending Job 1 with options: "-line_type PCT -out_line_type
PSTD
> -by FCST_VAR -by FCST_THRESH"
> DEBUG 4:
> DEBUG 4: Amending Job 1 with command line options: "-line_type PCT
> -out_line_type PSTD -by FCST_VAR -by FCST_THRESH"
> DEBUG 2:
> DEBUG 2: Processing Job 1: -line_type PCT -by FCST_VAR -by
FCST_THRESH
> -out_line_type PSTD -out_alpha 0.05000
> WARNING:
> WARNING: The -by option is ignored for the "NA" job type.
> WARNING:
> ERROR  :
> ERROR  : do_job() -> jobtype value of 7 not currently supported!
> ERROR  :
> ERROR  :
> ERROR  : main() -> encountered an error value of 1.  Calling
clean_up()
> and usage() before exiting.
> ERROR  :
>
> The data files it is reading after the previous fix are:
>
> VERSION MODEL  FCST_LEAD FCST_VALID_BEG  FCST_VALID_END  OBS_LEAD
> OBS_VALID_BEG   OBS_VALID_END   FCST_VAR FCST_LEV OBS_VAR OBS_LEV
OBTYPE
> VX_MASK INTERP_MTHD INTERP_PNTS FCST_THRESH
>                                   OBS_THRESH COV_THRESH ALPHA
LINE_TYPE
> TOTAL N_THRESH THRESH_1 OY_1 ON_1  THRESH_2 OY_2 ON_2 THRESH_3 OY_3
ON_3
> THRESH_4 OY_4 ON_4 THRESH_5 OY_5 ON_5 THRESH_6 OY_6 ON_6
>  THRESH_7 OY_7 ON_7 THRESH_8 OY_8 ON_8 THRESH_9 OY_9 ON_9 THRESH_10
OY_10
> ON_10 THRESH_11
> V5.1    GALWEM 120000    20160501_120000 20160503_120000 000000
>  20160501_120000 20160503_120000 APCP     L0       APCP    L0
ADPSFC
> FULL    NEAREST     4           >=0,>=0.1,>=0.2,>=0.3,>=0.4,>=
> 0.5,>=0.6,>=0.7,>=0.8,>=0.9,>=1.0 =1         NA         NA    PCT
>  39072       11        0  472 24899      0.1   85 1172      0.2   68
824
>     0.3  120  869      0.4   88  640      0.5   91  685
>       0.6  124  661      0.7  129  649      0.8  189  741       0.9
1962
> 4604         1
> V5.1    GALWEM 120000    20160501_120000 20160503_120000 000000
>  20160501_120000 20160503_120000 APCP     L0       APCP    L0
ADPSFC
> FULL    NEAREST     4           >=0,>=0.1,>=0.2,>=0.3,>=0.4,>=
> 0.5,>=0.6,>=0.7,>=0.8,>=0.9,>=1.0 =1         NA         NA    PCT
>  39072       11        0  408 37879      0.1   25  162      0.2   16
104
>     0.3   16  120      0.4   11   36      0.5   11   49
>       0.6   14   50      0.7    5   37      0.8   16   42       0.9
20
>   51         1
> V5.1    GALWEM 120000    20160501_120000 20160503_120000 000000
>  20160501_120000 20160503_120000 APCP     L0       APCP    L0
ADPSFC
> FULL    NEAREST     4           >=0,>=0.1,>=0.2,>=0.3,>=0.4,>=
> 0.5,>=0.6,>=0.7,>=0.8,>=0.9,>=1.0 =1         NA         NA    PCT
>  39072       11        0  170 38702      0.1    6   59      0.2    3
35
>     0.3    0   31      0.4    4   16      0.5    3   18
>       0.6    2    9      0.7    1    8      0.8    1    3       0.9
1
>    0         1
>
> Bob
>
> -----Original Message-----
> From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> Sent: Thursday, May 26, 2016 1:07 PM
> To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
<robert.craig.2 at us.af.mil>
> Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
>
> Bob,
>
> Ah OK, I see the problem here.  The short answer is that you should
use
> the "-out_stat" job command option instead of "-out".
>
> And here's the long answer...
>
> Prior to MET version 5.1, STAT-Analysis wrote a very simple ASCII
output
> file.  For your MPR data, it's reading MPR lines, putting them into
> probabilistic contingency tables, and writing output PCT
information.
> That output is normally written to the screen in a pretty basic way.
> Notice that the output just contains the PCT counts without the 21
header
> columns common to all the .stat line types.  Using the "-out" job
command
> option redirects that output from the screen to an output file, but
it's
> still formatted in exactly the same way.  And looking at the format
of the
> files you sent, I see that's what you've done.
>
> In MET version 5.1, we added a new output option for STAT-Analysis
named
> "-out_stat".  That tells STAT-Analysis to write the same numbers to
an
> output file but add in those 21 header columns.  Using the true
.stat
> format enables STAT-Analysis to read its own output... which is what
you
> want to do.  We kept the "-out" job command option available for
backwards
> compatibility.
>
> Please try rerunning your MPR -> PCT jobs using the "-out_stat"
option and
> then pass that output back in for the PCT -> PSTD jobs.
>
> Thanks,
> John
>
> On Thu, May 26, 2016 at 11:51 AM, robert.craig.2 at us.af.mil via RT <
> met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
> >
> > Well, that fixed that issue and it got me to my next error:
> >
> > /gpfs/h/data/global/WXQC/data/temp
> > ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
> > '/h/data/global/WXQC/data/met/ens_cont_tbl', '-out',
> >
'/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_25_3_PSTD_0Z',
> > '-out_fcst_thresh
> > ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0',
> > '-out_obs_thresh eq1', '-line_type PCT', '-out_line_type PSTD', '-
by
> > FCST_VAR', '-by FCST_THRESH', '-v', '6'] DEBUG 1: Creating
> > STAT-Analysis output file
> >
"/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_25_3_PSTD_0Z"
> > DEBUG 4: Amending default job with command line options:
> > "-out_fcst_thresh
> > ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0
> > -out_obs_thresh eq1 -line_type PCT -out_line_type PSTD -by
FCST_VAR
> > -by FCST_THRESH"
> > DEBUG 3: Processing STAT file
> >
>
"/h/data/global/WXQC/data/met/ens_cont_tbl/GALWEM_APCP_12hr_1_3_PCT_0Z_pct.stat"
> > .... 1 of 3
> > ERROR  :
> > ERROR  : DataLine::get_item(int) -> range check error
> > ERROR  :
> >
> > Any ideas?
> >
> > -----Original Message-----
> > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> > Sent: Thursday, May 26, 2016 11:31 AM
> > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> > <robert.craig.2 at us.af.mil>
> > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
> >
> > Bob,
> >
> > If you tell STAT-Analysis to "-lookin" a directory, it will
> > recursively search that directory looking for files ending in
".stat".
> > If you instead tell STAT-Analysis to "-lookin" one or more
specific
> > file names, it will process them regardless of their suffix.
> >
> > So you can either name your PCT output files as "_pct.stat" or you
can
> > keep them as "_pct.txt" and change your STAT-Analysis job command
to
> > something
> > like:
> >   -lookin /h/data/global/WXQC/data/met/ens_cont_tbl/*_pct.txt
> >
> > Make sense?
> >
> > Thanks,
> > John
> >
> > On Thu, May 26, 2016 at 9:55 AM, robert.craig.2 at us.af.mil via RT <
> > met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
> > >
> > > Hi John, I am to the point where I am trying to read the pct.txt
> > > files to generate the PSTD files.  When I do this using
> > > /h/WXQC/met-5.1/bin/stat_analysis  -lookin
> > > /h/data/global/WXQC/data/met/ens_cont_tbl -out
> > >
/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z
> > > -line_type PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH
-v
> > > 6, I get the following:
> > >
> > >
> > >
> > > ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
> > > '/h/data/global/WXQC/data/met/ens_cont_tbl/', '-out',
> > >
'/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z'
> > > , '-line_type PCT', '-out_line_type PSTD', '-by FCST_VAR', '-by
> > > FCST_THRESH', '-v', '6']
> > >
> > > DEBUG 1: Creating STAT-Analysis output file
> > >
"/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_3_PSTD_0Z"
> > >
> > > DEBUG 4: Amending default job with command line options: "-
line_type
> > > PCT -out_line_type PSTD -by FCST_VAR -by FCST_THRESH"
> > >
> > > ERROR  :
> > >
> > > ERROR  : process_search_dirs() -> no STAT files found in the
> > > directories specified!
> > >
> > > ERROR  :
> > >
> > > ERROR  :
> > >
> > > ERROR  : main() -> encountered an error value of 1.  Calling
> > > clean_up() and usage() before exiting.
> > >
> > > ERROR  :
> > >
> > >
> > >
> > > *** Model Evaluation Tools (METV5.1) ***
> > >
> > >
> > >
> > > The ens_cont_tbl directory contains 3 files and I placed them on
> > > your ftp server.  I didn’t think the file naming convention was
> > > important here as long as they end in _pct.txt.  I did try
renaming
> > > one to the more conventional naming scheme – point_stat … but
that
> > > didn’t make a difference.  So why is it not finding my files?
> > >
> > >
> > >
> > > Thanks
> > >
> > > Bob
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> > > Sent: Monday, May 23, 2016 11:56 AM
> > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> > > <robert.craig.2 at us.af.mil>
> > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
> > >
> > >
> > >
> > > Bob,
> > >
> > >
> > >
> > > Surprisingly, I was able to replicate the same error you're
seeing!
> > > I see where the error is occurring, but I don't yet understand
why
> > > it's happening.  However, I do have a workaround for you.
> > >
> > >
> > >
> > > Try editing your config file by emptying out the "fcst_thresh"
setting:
> > >
> > >    fcst_thresh = [];
> > >
> > >
> > >
> > > In your job, you're using "-by fcst_thresh" anyway, so STAT-
Analysis
> > > will group the data by the FCST_THRESH column.  After I removed
the
> > > "fcst_thresh" setting, the job completed.  I'll continue looking
> > > into the reason for that error.
> > >
> > >
> > >
> > > How many of these files are you planning to pass to STAT-
Analysis at
> > > any given time?  I see each sample file contains about 80,000
MPR
> lines.
> > >
> > > Passing it 5 files to process about 400,000 lines, that job
takes
> > > about 62 seconds to run on my machine.  I worry that as you
increase
> > > the number of files, it'll run very slowly.
> > >
> > >
> > >
> > > Here's some alternative logic you might consider.  Run STAT-
Analysis
> > > once for each .stat file you're generating.  Instead of writing
the
> > > PSTD line type, write the PCT line type (that's just the counts
of
> > > that probabilistic
> > >
> > > Nx2 table).  Then run jobs to aggregate the PCT line types and
> > > compute PSTD lines (-job aggregate_stat -line_type PCT
> > > -out_line_type PSTD).  That would make the processing for each
> > > STAT-Analysis job much
> > more manageable.
> > >
> > >
> > >
> > > BUT in order to do this in 2 steps, you would really need to put
the
> > > neighborhood size information into the INTERP_MTHD and
INTERP_PNTS
> > > columns.  Otherwise, the threshold information won't be retained
in
> > > the PCT output lines.  And I also found an issue in the
formatting
> > > of the OBS_THRESH output column ("=1" in the output should
really be
> > > "==1").  For now I just switched to "ge1", but I need to fix
this
> > formatting issue.
> > >
> > >
> > >
> > > Listed below are the c-shell commands I used to loop over your
sample
> > > files and run stat_analysis in this way...
> > >
> > >
> > >
> > > # Loop through MPR files and compute PCT output lines
> > > foreach mpr_file (`ls stat_mpr/*.stat`)
> > >
> > >    set pct_file = `echo $mpr_file | sed 's/stat_mpr/stat_pct/g' | sed 's/.stat/_pct.stat/g'`
> > >
> > >    /usr/local/met-5.1/bin/stat_analysis -lookin $mpr_file -out_stat $pct_file \
> > >    -job aggregate_stat -line_type MPR -out_line_type PCT \
> > >    -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,FCST_THRESH \
> > >    -out_fcst_thresh ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0 -out_obs_thresh ge1
> > >
> > > end
> > >
> > >
> > >
> > > # Aggregate PCT lines and compute PSTD stats
> > > /usr/local/met-5.1/bin/stat_analysis -lookin stat_pct -out_stat pstd.stat \
> > >    -job aggregate_stat -line_type PCT -out_line_type PSTD \
> > >    -by MODEL,FCST_VALID_BEG,FCST_LEAD,FCST_VAR,INTERP_MTHD,INTERP_PNTS
> > >
> > >
> > >
> > > Note that this job is combining all of the neighborhood sizes
> > > because INTERP_MTHD and INTERP_PNTS are set the same in the PCT
lines.
> > >
> > >
> > >
> > > Thanks,
> > >
> > > John
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Fri, May 20, 2016 at 3:33 PM, robert.craig.2 at us.af.mil via RT
<
> > > met_help at ucar.edu> wrote:
> > >
> > >
> > >
> > > >
> > >
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361
>
> > >
> > > >
> > >
> > > > John, I might have found more information on the error below:
> > >
> > > >
> > >
> > > > MetConfig::read_string(const char *) -> unable to open temp
file
> > >
> > > > "config_1943_0_.temp"
> > >
> > > >
> > >
> > > > When I run stat_anal on MPR files, no problems if I  process 2
> > > > days of
> > >
> > > > data, but when I increase it to three days, the error occurs.
I
> > > > also
> > >
> > > > played with the dates to try to eliminate a particular data
file
> > > > as
> > >
> > > > the cause.  So, it seems to be related to how many data files
it
> > > > has to
> > >
> > > > process.   I posted all the data files on the ftp server and
my
> command
> > >
> > > > line is below.  The config file on the server is still
> representative.
> > >
> > > > This error does not seem related to directory permissions.
> > >
> > > >
> > >
> > > > /h/WXQC/met-5.1/bin/stat_analysis  -lookin
> > >
> > > > /h/data/global/WXQC/data/met/mdlob_pairs -out
> > >
> > > >
/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0
> > >
> > > > -config
> > > > /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> > >
> > > > -out_fcst_thresh
> > >
> > > >
ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0
> > >
> > > > -out_obs_thresh eq1 -by FCST_VAR -by FCST_THRESH -v 6
> > >
> > > >
> > >
> > > > This is concerning since I have to run this on many more than
two
> > > > days
> > >
> > > > worth of data.  Any idea what might be happening?
> > >
> > > >
> > >
> > > > Thanks
> > >
> > > > Bob
> > >
> > > >
> > >
> > > > -----Original Message-----
> > >
> > > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> > >
> > > > Sent: Thursday, May 19, 2016 4:31 PM
> > >
> > > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> > >
> > > > <robert.craig.2 at us.af.mil>
> > >
> > > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
> > >
> > > >
> > >
> > > > Bob,
> > >
> > > >
> > >
> > > > It's funny, I was just talking to a colleague today about
doing
> > >
> > > > something very similar to this on a different dataset.
> > >
> > > >
> > >
> > > > I think I understand what you're saying about the thresholds
>1,
> > > > >25,
> > >
> > > > and
> > >
> > > > >50.  These are used to define the "event" which is used in
> > > > >computing
> > >
> > > > >the
> > >
> > > > fractional coverage fields.  My confusion comes from the fact
that
> > > > in
> > >
> > > > MET currently only Grid-Stat is computing these fractional
> > > > coverage
> > >
> > > > fields, not Point-Stat.  But I see now what you're doing.
> > >
> > > >
> > >
> > > > One suggestion would be to change the contents of the
INTERP_MTHD
> > > > and
> > >
> > > > INTERP_PNTS header columns.  You currently have NEAREST, 1
which
> > > > would
> > >
> > > > indicate that each observation value was matched to the
forecast
> > > > value
> > >
> > > > at the nearest grid point.  Instead, I would suggest writing
NBRHD
> > > > and
> > >
> > > > N, where N indicates the number of points in the neighborhood.
> > > > For
> > >
> > > > example, the NBRHD output from Grid-Stat would write 49 if we
were
> > > > using
> > > a 7x7 box.
> > >
> > > >
> > >
> > > > The -fcst_thresh and -obs_thresh options are used to filter
the
> > > > input
> > >
> > > > MPR lines, as you already know.  The -out_fcst_thresh and
> > >
> > > > -out_obs_thresh options define the thresholds to be applied
when
> > >
> > > > computing the output for the job.  In MET, probabilities are
not
> > > processed in a "continuous" way.
> > >
> > > > Instead, they are put into probability bins.  Those bins are
used
> > > > to
> > >
> > > > create an Nx2 contingency table from which probabilistic
> > > > statistics
> > >
> > > > are computed.
> > >
> > > >
> > >
> > > > Setting "-out_fcst_thresh gt0,gt0.25,gt0.5,gt0.75,gt1.0"
defines 4
> > >
> > > > probability bins which yields a 4x2 contingency table, from
which
> > >
> > > > stats are computed.
> > >
> > > >
> > >
> > > > Hope that helps.
> > >
> > > >
> > >
> > > > Thanks,
> > >
> > > > John
> > >
> > > >
> > >
> > > > On Thu, May 19, 2016 at 2:23 PM, robert.craig.2 at us.af.mil via
RT <
> > >
> > > > met_help at ucar.edu> wrote:
> > >
> > > >
> > >
> > > > >
> > >
> > > > > <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
> > >
> > > > >
> > >
> > > > > Thanks John, the directory problem was due to some corruption on
> > > > > the MET we had on one of the systems.  On another system the
> > > > > problem
> > >
> > > > > doesn't come up so we are hoping a recompile of MET on said
> > > > > system
> > >
> > > > > will clear up the issue.
> > >
> > > > >
> > >
> > > > > As far as the second comment, I don't think you interpreted
what
> > > > > I
> > >
> > > > > am doing correctly.  In each file, there are three sets of
data
> > > > > for
> > >
> > > > > each variable.  They are not identical since the first set
is
> > > > > the ob
> > >
> > > > > neighborhood data for precip > 1.  The next set is the ob
> > >
> > > > > neighborhood data for precip > 25, and the same for precip >
50.
> > > > > If
> > >
> > > > > you compare the model and ob data, the model data should be
> > >
> > > > > different (for some obs) than the model data for the
previous
> > >
> > > > > category.  The neighborhoods
> > >
> > > > around each ob site
> > >
> > > > > follow the HiRA method.   All the data in the mpr file lines
are
> > >
> > > > > probabilities, so I want to create PSTD from these data.  So
I
> > > > > was
> > > using
> > >
> > > > > the fcst_thresh to filter for the HiRA thresholds I am
interested
> in.
> > >  I
> > >
> > > > > tried your code and added -by FCST_THRESH and got out three
> > >
> > > > > different
> > >
> > > > sets
> > >
> > > > > of values.   So my question is why did you have the
> -out_fcst_thresh
> > > set
> > >
> > > > to
> > >
> > > > > 10 prob thresholds?  Why wouldn't ge0 pick up all
probabilities?
> > > > > I
> > >
> > > > > am not understanding how these thresholds are being used.
> > >
> > > > >
> > >
> > > > > Thanks
> > >
> > > > > Bob
> > >
> > > > >
> > >
> > > > > -----Original Message-----
> > >
> > > > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> > >
> > > > > Sent: Monday, May 16, 2016 4:54 PM
> > >
> > > > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN
> > >
> > > > > <robert.craig.2 at us.af.mil>
> > >
> > > > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
> > >
> > > > >
> > >
> > > > > Bob,
> > >
> > > > >
> > >
> > > > > Thanks for sending the sample data.  I agree that STAT-
Analysis
> > > > > can
> > >
> > > > > get pretty confusing.  It has a lot of flexibility, but we
> > > > > really
> > >
> > > > > need to think through what you're trying to do.
> > >
> > > > >
> > >
> > > > > First, regarding the error you're getting.  Unfortunately,
the
> > >
> > > > > config file string parser is writing a temp file in the
current
> > > "runtime"
> > >
> > > > directory.
> > >
> > > > > The error is from the fact that you don't have permission to
> > > > > write
> > >
> > > > > the file "config_23325_0_.temp" in the current directory.
> > >
> > > > > Ultimately, we should change that to use the temp directory
> instead.
> > >
> > > > >
> > >
> > > > > Next, I looked at the data you sent to me.  Listed below are
the
> > >
> > > > > unique combinations of just a few of the header columns:
> > >
> > > > >
> > >
> > > > > FCST_VAR  FCST_THRESH  LINE_TYPE  TOTAL
> > > > > APCP      >=1          MPR        14666
> > > > > APCP      >=25         MPR        14666
> > > > > APCP      >=50         MPR        14666
> > > > > CEIL      <=1000       MPR        11926
> > > > > CEIL      <=100        MPR        11926
> > > > > CEIL      <=300        MPR        11926
> > >
> > > > >
> > >
> > > > > Based on this, it looks like you have a lot of duplicate
matched
> > >
> > > > > pair
> > >
> > > > > (MPR) output lines... We have the same 14666 pairs for APCP
> > > > > repeated
> > >
> > > > > 3 times followed by the same 11926 pairs for CEIL repeated 3
times.
> > >
> > > > > This isn't necessary.  Instead, the FCST_THRESH and
OBS_THRESH
> > >
> > > > > columns for the MPR line type should be set to "NA".  The
MPR
> > > > > line
> > >
> > > > > type that Point-Stat creates just contains the paired
forecast
> > > > > and
> > >
> > > > > observation
> > >
> > > > values.
> > >
> > > > > Thresholds do not apply to this line type.
> > >
> > > > >
> > >
> > > > > I posted an updated version of your file to the ftp site.  I
> > >
> > > > > stripped it down to 14666 APCP lines and 11926 CEIL lines
with
> > > > > NA in
> > >
> > > > > the FCST_THRESH and OBS_THRESH columns:
> > >
> > > > >
> > >
> > > > >
> > >
> > > > >
> > > > > ftp://ftp.rap.ucar.edu/incoming/irap/met_help/craig_data/point_stat_3_galwem_120000L_20160501_120000V_JHG.stat
> > >
> > > > >
> > >
> > > > > Looking at the values in the FCST column, I see numbers
between
> > > > > 0
> > >
> > > > > and
> > >
> > > > > 1 (0, 0.11, 0.22, 0.33, ...).  Looking in the OBS column, I
see
> > > > > 2
> > >
> > > > > numbers (0 or 1).  And looking at your config file, it looks
> > > > > like
> > >
> > > > > you want to use these MPR lines to compute probabilistic
output.
> > >
> > > > > MET verifies probabilities using an Nx2 contingency table.
You
> > > > > use
> > >
> > > > "-out_fcst_thresh"
> > >
> > > > > to select the probabilistic thresholds to be applied and
> > >
> > > > "-out_obs_thresh"
> > >
> > > > > to select the observation threshold to be applied.
> > >
> > > > >
> > >
> > > > > Here's a stat-analysis job you could run to read the MPR
lines,
> > >
> > > > > define the probabilistic forecast thresholds, define the
single
> > >
> > > > > observation threshold, and compute a PSTD output line.
Using
> > > > > "-by
> > >
> > > > > FCST_VAR" tells it to run the job separately for each unique
> > > > > entry
> > >
> > > > > found in the FCST_VAR
> > >
> > > > column.
> > >
> > > > >
> > >
> > > > > /usr/local/met-5.1/bin/stat_analysis \
> > >
> > > > >    -lookin
point_stat_3_galwem_120000L_20160501_120000V_JHG.stat
> > > > > \
> > >
> > > > >    -job aggregate_stat -line_type MPR -out_line_type PSTD \
> > >
> > > > >    -out_fcst_thresh
> > >
> > > > >
ge0,ge0.1,ge0.2,ge0.3,ge0.4,ge0.5,ge0.6,ge0.7,ge0.8,ge0.9,ge1.0
> > > > > \
> > >
> > > > >    -out_obs_thresh eq1.0 \
> > >
> > > > >    -by FCST_VAR \
> > >
> > > > >    -out_stat out_pstd.txt
> > >
> > > > >
> > >
> > > > > The output statistics are written to "out_pstd.txt".
> > >
> > > > >
> > >
> > > > > Hope that helps.
> > >
> > > > >
> > >
> > > > > John
> > >
> > > > >
> > >
> > > > >
> > >
> > > > >
> > >
> > > > >
> > >
> > > > >
> > >
> > > > >
> > >
> > > > >
> > >
> > > > > On Mon, May 16, 2016 at 2:47 PM, robert.craig.2 at us.af.mil
via RT
> > > > > <
> > >
> > > > > met_help at ucar.edu> wrote:
> > >
> > > > >
> > >
> > > > > >
> > >
> > > > > > <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361
> > > > > > >
> > >
> > > > > >
> > >
> > > > > > Thanks John, I knew the data was space delimited but
forgot to
> > >
> > > > > > check the header.  As usual with MET, I progressed further
but
> > > > > > am
> > >
> > > > > > hitting a
> > >
> > > > new
> > >
> > > > > > error.   See below.   I pushed the config file to the
ftp
> > >
> > > > directory.
> > >
> > > > > > As you can see, the -tmp_dir is set to
> > >
> > > > /h/data/global/WXQC/data/met/tmp.
> > >
> > > > > > This directory's permissions are wide open - in fact
stat_anal
> > > > > > temp
> > >
> > > > > > files
> > >
> > > > > are
> > >
> > > > > > in there.   Does the config*.temp try to write somewhere
else?
> > >
> > > > > >
> > >
> > > > > > Also, notice in the command line options, there are three
> > thresholds.
> > >
> > > > > > MET kept telling me that I had to have three since this is
> > >
> > > > > > probability
> > >
> > > > > data.
> > >
> > > > > > Also, the latest MPR files (.stat) are in the ftp dir.  As
you
> > > > > > can
> > >
> > > > > > see I generated model/ob pairs using different thresholds
for
> > > > > > the
> > >
> > > > > > forecast and observation data.  So this is where I get
confused:
> > > > > > I
> > >
> > > > > > assume the fcst thresh in the config file is a filter to
pull
> > >
> > > > > > those lines that have the threshold I want.  I am not sure
> > > > > > what
> > >
> > > > > > the -out_fcst_thresh in the command line is doing.  If it
is
> > >
> > > > > > filtering the mpr line fcst data, then I would think I
would
> > > > > > set
> > >
> > > > > > it to ge 0 for the fcst and ob since the fcst and ob data
> > > > > > range
> > >
> > > > > > from 0 to 1.  Am
> > >
> > > > I handling this correctly?
> > >
> > > > > >
> > >
> > > > > > Thanks
> > >
> > > > > > BOb
> > >
> > > > > >
> > >
> > > > > > ['/h/WXQC/met-5.1/bin/stat_analysis', '-lookin',
> > >
> > > > > > '/h/data/global/WXQC/data/met/mdlob_pairs', '-out',
> > >
> > > > > >
> > > > > > '/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z',
> > >
> > > > > > '-config',
> > >
> > > > > > '/h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated',
> > >
> > > > > > '-out_fcst_thresh ge0,ge0.5,ge1', '-out_obs_thresh ge0',
'-v',
> > >
> > > > > > '6'] DEBUG 1: Creating STAT-Analysis output file
> > >
> > > > > >
"/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_1_PSTD_0Z"
> > >
> > > > > > DEBUG 1: Default Config File:
> > >
> > > > > > /home/qcteam/met-5.1/share/met/config/STATAnalysisConfig_default
> > >
> > > > > > DEBUG 1: User Config File:
> > >
> > > > > > /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> > >
> > > > > > DEBUG 4: Default Job from the config file: "-model GALWEM
> > >
> > > > > > -fcst_lead
> > >
> > > > > > 120000 -fcst_init_beg 20160501_000000 -fcst_init_end
> > >
> > > > > > 20160502_000000 -fcst_init_hour 000000 -fcst_var APCP
> > > > > > -fcst_thresh
> > >
> > > > > > >=1 -line_type MPR -vif_flag 1 "
> > >
> > > > > > DEBUG 4: Amending default job with command line options:
> > >
> > > > > > "-out_fcst_thresh
> > >
> > > > > > ge0,ge0.5,ge1 -out_obs_thresh ge0"
> > >
> > > > > > DEBUG 3: Processing STAT file
> > >
> > > > > >
> > >
> > > > >
> > >
> > > >
> > >
> >
>
"/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat"
> > >
> > > > > > ... 1 of 10
> > >
> > > > > > ERROR  :
> > >
> > > > > > ERROR  :
> > >
> > > > > > ERROR  :   MetConfig::read_string(const char *) -> unable
to open
> > > temp
> > >
> > > > > > file "config_23325_0_.temp"
> > >
> > > > > > ERROR  :
> > >
> > > > > >
> > >
> > > > > > -----Original Message-----
> > > > > > From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> > > > > > Sent: Friday, May 13, 2016 6:27 PM
> > > > > > To: CRAIG, ROBERT J GS-12 USAF ACC 16 WS/WXN <robert.craig.2 at us.af.mil>
> > > > > > Subject: Re: [rt.rap.ucar.edu #76361] Statanalysis Question
> > > > > >
> > > > > > Bob,
> > > > > >
> > > > > > The problem is coming from the first line of the file you sent to me.
> > > > > > It contains a comma-separated list of header column names.
> > > > > >
> > > > > > I'm not exactly sure where you pulled those header column names,
> > > > > > but that's the problem.  MET expects data to be separated by whitespace...
> > > > > > so it interprets that long string with a bunch of commas as a single column.
> > > > > > The error comes when it tries to read the "second" column.  If you just
> > > > > > remove that first line, it should run fine.
> > > > > >
> > > > > > If you do want header columns, here's a trick.  Run the following job:
> > > > > >
> > > > > > stat_analysis -lookin point_stat_3_galwem_120000L_20160501_120000V.stat \
> > > > > >    -job filter -line_type MPR -dump_row out.stat
> > > > > >
> > > > > > The file out.stat, will now contain the full header for the MPR line type.
> > > > > > When you select a single LINE_TYPE value, stat-analysis will write
> > > > > > the full header for that line type to the output.
> > > > > >
> > > > > > Have a good weekend.
> > > > > >
> > > > > > John
> > > > > >
> > > > > > On Fri, May 13, 2016 at 10:26 AM, robert.craig.2 at us.af.mil via RT
> > > > > > < met_help at ucar.edu> wrote:
> > > > > >
> > > > > > >
> > > > > > > Fri May 13 10:26:26 2016: Request 76361 was acted upon.
> > > > > > > Transaction: Ticket created by robert.craig.2 at us.af.mil
> > > > > > >        Queue: met_help
> > > > > > >      Subject: Statanalysis Question
> > > > > > >        Owner: Nobody
> > > > > > >   Requestors: robert.craig.2 at us.af.mil
> > > > > > >       Status: new
> > > > > > >  Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=76361 >
> > > > > > >
> > > > > > > John, I am getting the following error when running statanalysis.
> > > > > > > The forecast times and valid times seem to be correct to me, so I
> > > > > > > am not sure the cause of the error.
> > > > > > >
> > > > > > > /h/WXQC/met-5.1/bin/stat_analysis -lookin /h/data/global/WXQC/data/met/mdlob_pairs -out /h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z -config /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated -v 6
> > > > > > > DEBUG 1: Creating STAT-Analysis output file "/h/data/global/WXQC/data/met/summary/GALWEM_APCP_12hr_50_PSTD_0Z"
> > > > > > > DEBUG 1: Default Config File: /home/qcteam/met-5.1/share/met/config/STATAnalysisConfig_default
> > > > > > > DEBUG 1: User Config File: /h/WXQC/met-5.1/data/config/STATAnalysisConfig_hira_bs_updated
> > > > > > > DEBUG 4: Default Job from the config file: "-model GALWEM -fcst_lead 120000 -fcst_init_beg 20160501_000000 -fcst_init_end 20160501_000000 -fcst_init_hour 120000 -fcst_var APCP -fcst_thresh >=50 -line_type MPR -vif_flag 1 "
> > > > > > > DEBUG 4: Amending default job with command line options: "(nul)"
> > > > > > > DEBUG 3: Processing STAT file "/h/data/global/WXQC/data/met/mdlob_pairs/point_stat_3_galwem_120000L_20160501_120000V.stat" ... 1 of 2
> > > > > > > ERROR  :
> > > > > > > ERROR  : DataLine::get_item(int) -> range check error
> > > > > > > ERROR  :
> > > > > > >
> > > > > > > The config file and data file are on your server.
> > > > > > >
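
The header problem described in the quoted reply above can be checked before
running stat_analysis. MET splits each line of a .stat file on whitespace, so
a comma-separated header line collapses into a single field. The Python sketch
below flags such lines. The function names and the two-column minimum are
illustrative assumptions, not part of MET, and the check only catches lines
that contain no whitespace between the commas.

```python
def find_suspect_lines(lines, min_columns=2):
    """Return (line_number, text) for non-empty lines that split into
    fewer than min_columns whitespace-separated fields."""
    suspects = []
    for number, text in enumerate(lines, start=1):
        fields = text.split()
        if fields and len(fields) < min_columns:
            suspects.append((number, text.rstrip("\n")))
    return suspects


def strip_suspect_lines(lines, min_columns=2):
    """Return the lines with the suspect ones removed; blank lines are kept."""
    kept = []
    for text in lines:
        fields = text.split()
        if not fields or len(fields) >= min_columns:
            kept.append(text)
    return kept


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python check_stat.py some_file.stat
    with open(sys.argv[1]) as handle:
        for number, text in find_suspect_lines(handle.readlines()):
            print(f"line {number} has a single field: {text[:60]}")
```

Removing the flagged first line by hand, as suggested in the reply, has the
same effect as writing the output of strip_suspect_lines back to the file.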

------------------------------------------------


More information about the Met_help mailing list