[Met_help] [rt.rap.ucar.edu #84822] History for question on regenerating data

Mon May 7 15:06:46 MDT 2018

----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Hi,

I'm running point-stat using ASCAT and GFS data to verify surface wind
speeds.  I found an error in my ASCAT input data that goes back to Mar 7.
I had switched the input source of the data, and within the new data files,
it was allowing very small values (< 1 m/s) to be used as data points in
the verification.  I imagine that this is an issue, since point-stat is
using these very small values as matched pairs with the GFS, correct?

Is there a way to regenerate the point-stat statistics without using the
original GFS data?  I do have the *stat and the *mpr files, and it is
pretty easy to identify where the bad values are located.

Thanks,
Roz

-- 
Rosalyn MacCracken
Support Scientist

Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD  20740-3818

(p) 301-683-1551
rosalyn.maccracken at noaa.gov

----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: question on regenerating data
From: Julie Prestopnik
Time: Thu Apr 19 09:23:32 2018

Hi Roz.  My apologies for the delay in responding.

Unfortunately, John is out of the office this week, and I do not know
the answers to your questions.  As you said, I would also imagine that
point-stat is using those small values as matched pairs.  Also, I do
not believe there is a way to regenerate the point-stat statistics
without using the original GFS data.  I cannot say with certainty,
however.  Thank you in advance for your patience.  We'll get a definite
response to you as soon as we can.

Thanks,
Julie

------------------------------------------------
Subject: question on regenerating data
From: John Halley Gotway
Time: Mon Apr 23 12:18:17 2018

Hi Roz,

I read that you've run Point-Stat and saved off the matched pairs (MPR)
output line type, and that you'd like to (1) filter those MPR lines to
discard some of them and then (2) use the filtered data to regenerate
summary statistics.  Yes, this is easily done using the STAT-Analysis
tool in MET.

You wrote that you're verifying wind speeds against ASCAT and that you'd
like to exclude pairs where the observed wind speed is less than 1 m/s.
I'm just guessing here, but I'll presume that you want to produce both
SL1L2 and CNT output line types.  Here's what the STAT-Analysis job
would look like:

# Filter MPRs and write SL1L2 output lines.
#   -lookin         a .stat filename, or a directory containing them
#   -job            job type is aggregate_stat
#   -line_type      input line type = MPR
#   -out_line_type  output line type = SL1L2 partial sums
#   -fcst_var       only process lines where FCST_VAR column = WIND
#   -column_thresh  only use MPR lines where OBS column > 1
#   -by             run this same job for each unique combination of
#                   these columns
stat_analysis \
   -lookin input.stat \
   -job aggregate_stat \
   -line_type MPR \
   -out_line_type SL1L2 \
   -fcst_var WIND \
   -column_thresh OBS gt1 \
   -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
   -out_stat MPR_to_SL1L2.stat

This will produce an output .stat file containing an SL1L2 line for
each unique combination of the header columns listed after the "-by"
option.  To generate CNT output lines instead, you'd run a second job
where you replace SL1L2 with CNT.  You could run these jobs on the
command line or group them together into a STAT-Analysis config file,
if you prefer.  Both would work.

You could run this once for each input .stat file you're processing, or
you could pass many input .stat files to the job.  Since FCST_INIT_BEG
and FCST_LEAD are included in the "-by" option, you'll get separate
output lines for each unique time.
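
As a minimal sketch of the "second job" idea, the loop below prints one
aggregate_stat command per output line type instead of running it.  It
reuses only the options shown in the job above; input.stat and the
MPR_to_<type>.stat output names are placeholders, and build_job is a
hypothetical helper name.

```shell
#!/bin/sh
# Dry run: print one STAT-Analysis command per output line type.
# input.stat and the MPR_to_<type>.stat names are placeholders.
BY_COLS="MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS"

build_job() {
  # $1 = output line type (SL1L2 or CNT)
  printf '%s\n' "stat_analysis -lookin input.stat -job aggregate_stat \
-line_type MPR -out_line_type $1 -fcst_var WIND -column_thresh OBS gt1 \
-by $BY_COLS -out_stat MPR_to_$1.stat"
}

for type in SL1L2 CNT; do
  build_job "$type"   # printed only; nothing is executed here
done
```

Printing first makes it easy to inspect each command before running it
against real data.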

Hope that helps get you going.

Thanks,
John

------------------------------------------------
Subject: question on regenerating data
From: Rosalyn MacCracken - NOAA Affiliate
Time: Mon Apr 23 13:01:45 2018

Hi John,

That's actually only partially correct.  It's not that I want to use
part of the MPR lines and discard the rest, and I do need to regenerate
statistics.  Let me try to re-explain.

Back in early March we switched from getting our ASCAT obs from the
prepbufr data to getting them from the MGDRLITE data.  So, processing
didn't change.  I was producing statistics at certain threshold levels
for both GFS and ASCAT.  I had this set with the cat_thresh list, at
levels of 0,6,17, etc.  We found out after processing for a couple of
weeks that the ASCAT data included these really small values, <1.0 m/s,
and that these small wind speeds were being included in the statistics
processing.

So, a couple of questions.
1) Do I have to regenerate all of my statistics (*.cts, *.cnt and *.ctc
files) because of this error?  Or, since I have threshold levels set,
will those small values be among the statistics in the lowest
thresholds?
2) I have the *.stat files, but they are spread out into separate
directories like:
/GFS/data/hourly/${YYYYMMDDHH}/*.stat
Can I tell stat-analysis to "lookin" directories with a wildcard (like
201803*)?  If so, how?  Or, if I tell it to look in /GFS/data/hourly,
will it look in all the directories recursively under hourly?  And, if
that's the case, can I give it a date range, so that it only processes
data from March?

Roz

------------------------------------------------
Subject: question on regenerating data
From: John Halley Gotway
Time: Mon Apr 23 14:01:57 2018

Roz,

It is ultimately up to you to decide which matched pairs you want to
include in your processing.  Do you consider those small (<1.0 m/s)
observation values to be corrupt and incorrect in some way or just not
very interesting?  If they really are BAD data values, I agree that you
should exclude them from your analysis.  But if they're just
uninteresting values of low wind speed, then there's no reason why you
should exclude them.  For example, *most* of the time it isn't raining,
but we often include observations of 0 precip.

There are three configurable options in Point-Stat that may be useful
here:
(1) You already know and use the "cat_thresh" option.  This threshold
defines the events and non-events for a 2x2 contingency table.  This
threshold affects the contents of the FHO, CTC, CTS, MCTC, and MCTS
line types that Point-Stat writes.
(2) The "cnt_thresh" option is a more recent addition.  Perhaps this
was a poor name choice, but instead of defining categories, it's really
a *filtering* threshold.  This threshold affects the contents of the
SL1L2, SAL1L2, and CNT line types that Point-Stat writes.  For example,
setting "cnt_thresh = [ ge6, ge17 ];" will produce 2 CNT and 2 SL1L2
output lines containing only those points where the wind speed was >=6
and >=17, respectively.
(3) The "wind_thresh" option is very similar to the "cnt_thresh" option
but affects the contents of the VL1L2, VAL1L2, and VCNT (new in
met-7.0) line types.  Only those U/V pairs that meet the specified wind
speed threshold are included in the output.

For both "cnt_thresh" and "wind_thresh", the default value in the
config file is "NA", meaning, do not apply any filtering threshold
criteria.
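
To make the difference between a filtering threshold and a categorical
threshold concrete, here is a toy calculation in plain shell and awk.
The five forecast/observation pairs are made-up numbers, not MET output,
and mean_error and ctc_counts are hypothetical helper names.  The first
argument is the observation cutoff (pairs with obs at or below it are
dropped); the second argument of ctc_counts is the event threshold.

```shell
#!/bin/sh
# Toy forecast/observation pairs (made-up numbers, not MET output).
pairs='5.0 0.4
7.0 6.5
18.0 17.5
3.0 4.0
8.0 5.0'

# Filtering: keep pairs with obs > cutoff, then average (fcst - obs).
mean_error() {
  printf '%s\n' "$pairs" |
    awk -v min="$1" '$2 > min { n++; s += $1 - $2 }
                     END { printf "%.2f", s / n }'
}

# Categorizing: each kept pair lands in one cell of the 2x2 table.
# Output order: FY_OY FY_ON FN_OY FN_ON.
ctc_counts() {
  printf '%s\n' "$pairs" |
    awk -v min="$1" -v t="$2" '$2 > min {
          if ($1 >= t && $2 >= t) fy_oy++
          else if ($1 >= t)       fy_on++
          else if ($2 >= t)       fn_oy++
          else                    fn_on++ }
        END { printf "%d %d %d %d", fy_oy, fy_on, fn_oy, fn_on }'
}

echo "mean error, all pairs: $(mean_error -1)"
echo "mean error, obs > 1:   $(mean_error 1)"
echo "ge6 table, all pairs:  $(ctc_counts -1 6)"
echo "ge6 table, obs > 1:    $(ctc_counts 1 6)"
```

In this toy data, dropping the single pair with a tiny observation
changes both the mean error and the correct-negative count of the ge6
table, so a bad pair is not confined to the lowest category.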

You have the flexibility to run STAT-Analysis on the MPR output lines
to recompute any of these output line types, applying whatever
filtering criteria you'd like.
Here's the MET user's guide:
https://dtcenter.org/met/users/docs/users_guide/MET_Users_Guide_v7.0.pdf
Look on page 98 for the job command options for the "aggregate_stat"
job type when the input line type is "MPR".

For your second question, the "-lookin PATH" option is *VERY* flexible.
You can set PATH to either a single value or multiple values.  If you
use wildcards, then the shell expands those wildcards to multiple
values.  Each value you pass in can either be a filename or a directory
name.  If you pass in a filename, STAT-Analysis will read it
*REGARDLESS* of the file extension.  If you pass in a directory name,
STAT-Analysis will search that directory *RECURSIVELY* for files ending
in ".stat".  For example, either of the following settings would tell
STAT-Analysis to read the same list of files:
   -lookin /GFS/data/hourly/*/*.stat
   ... or ...
   -lookin /GFS/data/hourly

Be aware though that the more data you pass to STAT-Analysis, the
longer it'll take to process it.  You can decide how much data you pass
it for each job.  I'd suggest starting with what is most convenient for
you.  If it's too slow, change the logic to pass it less data (e.g.
only 1 day of data rather than 1 month of data).

Yes, you can give it a date range.  Use -fcst_init_beg and
-fcst_init_end to specify beginning/ending model initialization times,
or -fcst_valid_beg and -fcst_valid_end to specify beginning/ending
valid times.
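
A hedged sketch of a March-only job follows.  It only prints the
command.  The YYYYMMDD_HHMMSS time strings are an assumption about the
accepted format, so check the user's guide for your MET version; the
output filename and the march_job helper name are placeholders.

```shell
#!/bin/sh
# Dry run: print a job limited to March 2018 initialization times.
# The YYYYMMDD_HHMMSS strings are an assumed time format.
INIT_BEG="20180301_000000"
INIT_END="20180331_180000"
BY_COLS="MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS"

march_job() {
  printf '%s\n' "stat_analysis -lookin /GFS/data/hourly \
-job aggregate_stat -line_type MPR -out_line_type CNT \
-fcst_var WIND -column_thresh OBS gt1 \
-fcst_init_beg $INIT_BEG -fcst_init_end $INIT_END \
-by $BY_COLS -out_stat MPR_to_CNT_201803.stat"
}

march_job   # printed only; nothing is executed here
```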

If you find that you're running multiple jobs on the same subset of
data (e.g. process MPR to CNT, MPR to SL1L2, MPR to CTC, MPR to CTS),
it'd be more efficient to group those jobs into a config file.  That'll
do the filtering ONCE and write the filtered data to a temp file.  Then
all the jobs read data from the temp file instead of starting over from
scratch.
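
As a rough sketch of that grouping, the script below writes a small
STAT-Analysis config file with the common filter at the top and two
jobs beneath it, then prints the command that would use it.  It assumes
that settings left out of the file fall back to the tool's defaults,
and the shortened -by list and output filenames are illustrative only.

```shell
#!/bin/sh
# Write a small STAT-Analysis config file: one shared filter, two jobs.
# Assumes settings omitted here fall back to the tool's defaults.
conf=$(mktemp)
cat > "$conf" <<'EOF'
// Common filtering, applied once before any job runs
fcst_var  = [ "WIND" ];
line_type = [ "MPR" ];

// Each job then works on the already-filtered lines
jobs = [
   "-job aggregate_stat -out_line_type SL1L2 -column_thresh OBS gt1 -by FCST_INIT_BEG,FCST_LEAD -out_stat MPR_to_SL1L2.stat",
   "-job aggregate_stat -out_line_type CNT -column_thresh OBS gt1 -by FCST_INIT_BEG,FCST_LEAD -out_stat MPR_to_CNT.stat"
];
EOF

job_count=$(grep -c -- '-job aggregate_stat' "$conf")
echo "wrote $job_count jobs to $conf"
# Printed, not run:
echo "stat_analysis -lookin /GFS/data/hourly -config $conf"
```

The temporary config file is left in place so it can be inspected;
remove it once it is no longer needed.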

Make sense?

John

------------------------------------------------
Subject: question on regenerating data
From: Rosalyn MacCracken - NOAA Affiliate
Time: Tue Apr 24 07:48:43 2018

Hi John,

Yes, that makes sense.  Those very small values (<1.0 m/s), are bad
values.  That's why they shouldn't be included in the processing.

So, I need to just regenerate hourly data, one hour at a time.  Would
it make sense to use a shell script and loop stat-analysis?  Something
like:

for day in 11 12
do
  for cycle in 00 06 12 18
  do
    stat_analysis \
      -lookin /GFS/data/hourly/201803${day}${cycle}/*.stat \
      -job aggregate_stat \
      -line_type MPR \
      -out_line_type CTC,CTS,CNT \
      -fcst_var WIND \
      -column_thresh OBS gt1 \
      -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
      -out_stat /new_rerun_stat_files/MPR_to_CTC_CTS_CNT.stat
  done
done

or, something like that?  And, will this regenerate the hourly
forecasts at each forecast and lead hour?  I guess it will see the
forecast and lead hour from the *.stat file, and whatever *.stat file
is in the directory, it will regenerate those hours, right?

So, I need to regenerate the CTC, CNT and CTS files.  That's why I did:
 -out_line_type CTC,CTS,CNT
but, will that make 3 separate files, or just another *.stat file?

Roz

On Mon, Apr 23, 2018 at 4:01 PM, John Halley Gotway via RT <
met_help at ucar.edu> wrote:

> Roz,
>
> It is ultimately up to you to decide which matched pairs you want to
> include in your processing.  Do you consider those small (<1.0 m/s)
> observation values to be corrupt and incorrect in some way or just
not very
> interesting?  If they really are BAD data values, I agree that you
should
> exclude them from your analysis.  But if they're just uninteresting
values
> of low wind speed, then there's no reason why you should exclude
them.  For
> example, *most* of the time it ins't raining, but we often included
> observations of 0 precip.
>
> There are three configurable options in Point-Stat that may be
useful here:
> (1) You already know and use the "cat_thresh" option.  This
threshold
> defines the events and non-events for a 2x2 contingency table.  This
> threshold affects the contents of FHO, CTC, CTS, MCTC, and MCTS line
types
> that Point-Stat writes.
> (2) The "cnt_thresh" option is a more recent addition.  Perhaps this
was a
> poor name choice, but instead of defining categories, it's really a
> *filtering* threshold.  This threshold affects the contents of the
SL1L2,
> SAL1L2, and CNT line types that Point-Stat writes.  For example,
setting
> "cnt_thresh = [ ge6, ge17 ];" will produce 2 CNT and 2 SL1L2 output
lines
> containing only those points where the wind speed was >=6 and >=17,
> respectively.
> (3) The "wind_thresh" option is very similar to the "cnt_thresh"
option but
> affects the contents of teh VL1L2, VAL1L2, and VCNT (new in met-7.0)
line
> types.  Only those U/V pairs that meet the specified wind speed
threshold
> are included in the output.
>
> For both "cnt_thresh" and "wind_thresh", the default value in the
config
> file is "NA", meaning, do not apply any filtering threshold
criteria.
>
> You have the flexibility to run STAT-Analysis on the MPR output
lines to
> recompute any of these output line types applying whatever filtering
> criteria you'd like.
> Here's the MET user's guide:
>
https://dtcenter.org/met/users/docs/users_guide/MET_Users_Guide_v7.0.pdf
> Look on page 98 for the job command options for the "aggregate_stat"
line
> type when the input line type is "MPR".
>
> For your second question, the "-lookin PATH" option is *VERY*
flexible.
> You can set PATH to either a single value or multiple values.  If
you use
> wildcards, then the shell expands those wildcards to multiple
values.  Each
> value you pass in can either be a filename or a directory name.  If
you
> pass in a filename, STAT-Analysis will read it *REGARDLESS* of the
file
> extension.  If you pass in a directory name, STAT-Analysis will
search that
> directory *RECURSIVELY* for files ending in ".stat".  For example,
either
> of the following settings would tell STAT-Analysis to read the same
list of
> files:
>    -lookin /GFS/data/hourly/*/*.stat
>    ... or ...
>    -lookin /GFS/data/hourly
>
> Be aware though that the more data you pass to STAT-Analysis, the
longer
> it'll take for it to process it.  You can decide how much data you
pass it
> for each job.  I'd suggest starting with what is most convenient for
you.
> If it's too slow, change the logic to pass it less data (e.g. only 1
day of
> data rather than 1 month of data).
>
> Yes, you can give it a date range.  Use -fcst_init_beg and
-fcst_init_end
> to specify beginning/ending model initialization times or
-fcst_valid_beg
> and -fcst_valid_end to specify beginning/ending valid times.
>
> If you find that you're running multiple jobs on the same subset of
data
> (e.g. process MPR to CNT, MPR to SL1L2, MPR to CTC, MPR to CTS),
it'd be
> more efficient to group those jobs into a config file.  That'll do
the
> filtering ONCE and write the filtered data to a temp file.  Then all
the
> jobs read data from the temp instead of starting over from scratch.
>
> Make sense?
>
> John
>
>
>
> On Mon, Apr 23, 2018 at 1:01 PM, Rosalyn MacCracken - NOAA Affiliate
via RT
> <met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> >
> > Hi John,
> >
> > That's actually only partially correct.  It's not that I want to
use part
> > of the MPR lines and discard the rest, and I do need to regenerate
> > statistics.  Let me try to re-explain.
> >
> > Back in early March we switched from getting our ASCAT obs from
the
> > prepbufr data, to getting it from the MGDRLITE data. So,
processing
> didn't
> > change.  I was producing statistics at certain threshold levels
for both
> > GFS and ASCAT.  I had this set with the cat_thresh list, at levels
of
> > 0,6,17, etc.  We found out after processing for a couple of weeks
that
> the
> > ASCAT data included these really small values, <1.0 m/s, and that
these
> > small wind speeds were being included into the statistics
processing.
> >
> > So, a couple of questions.
> > 1) Do I have to regenerate all of my statistics (*.cts, *.cnt and
*ctc
> > files) because of this error? Or, since I have threshold levels
set, will
> > those small values be amoung the statistics in the lowest
thresholds?
> > 2) I have the *.stat files, but, they are spread out into separate
> > directories like:
> > /GFS/data/hourly/${YYYYMMDDHH}/*.stat
> > Can I tell stat-analysis to "lookin" directories with a wildcard
(like
> > 201803*)?  If so, how?  Or, is I tell it to look in
/GFS/data/hourly,
> will
> > it look in all the directories recursively under hourly?  And, it
that's
> > the case, can I give it a date range, so, that it only processes
data
> from
> > March?
> >
> > Roz
> >

--
Rosalyn MacCracken
Support Scientist

Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD  20740-3818

(p) 301-683-1551
rosalyn.maccracken at noaa.gov

------------------------------------------------
Subject: question on regenerating data
From: John Halley Gotway
Time: Tue Apr 24 09:42:06 2018

Roz,

Each "-job aggregate_stat" only generates a single output line type, so
using "-out_line_type CTC,CTS,CNT" will not work.

You'll need to run separate jobs for each output line type you want to
generate.  That's why I'd recommend grouping those multiple jobs together
into a single STAT-Analysis config file.  Then you'd call STAT-Analysis
once using the "-config" command line option.

Another issue is that if you set "-out_stat" to the same filename, it'll
get overwritten by each job.  STAT-Analysis will overwrite that output
file rather than appending to it.
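
A minimal sketch of the separate-jobs approach, written as a bash loop
rather than a config file.  It uses only the options already shown in this
thread and gives each job its own "-out_stat" file so nothing gets
overwritten.  The input directory and the output file naming are
assumptions for illustration, and DRY_RUN is a hypothetical switch (on by
default) that prints each command instead of running it.  Only SL1L2 and
CNT are looped over; CTC and CTS jobs would likely also need event
thresholds, which this thread does not show.

```shell
#!/usr/bin/env bash
# Sketch only: one aggregate_stat job per output line type, each writing to
# its own -out_stat file so that no job overwrites another job's output.

in_dir="/GFS/data/hourly/2018031100"   # hypothetical input directory
out_dir="/new_rerun_stat_files"        # output directory used in the thread
by_cols="MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS"

run_job() {
  # $1 = output line type, e.g. SL1L2 or CNT
  local type="$1"
  local cmd=(stat_analysis
    -lookin "$in_dir"
    -job aggregate_stat
    -line_type MPR
    -out_line_type "$type"
    -fcst_var WIND
    -column_thresh OBS gt1
    -by "$by_cols"
    -out_stat "${out_dir}/MPR_to_${type}.stat")

  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "${cmd[*]}"      # dry run: print the command only
  else
    "${cmd[@]}"           # real run: requires MET's stat_analysis on PATH
  fi
}

for type in SL1L2 CNT; do
  run_job "$type"
done
```

Run it as-is to review the two commands, then set DRY_RUN=0 to execute
them.  A config file with the same jobs would avoid re-reading the input
for each job, which this loop does not.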

You could send me a day's worth of .stat output files
(/GFS/data/hourly/20180305*) and I could send you some suggestions.  Or if
you have access to theia you could copy them up there and point me to them.

Thanks,
John


------------------------------------------------
Subject: question on regenerating data
From: Rosalyn MacCracken - NOAA Affiliate
Time: Tue Apr 24 11:57:43 2018

Hi John,

Yes, it does seem that the -config option is the way to go to recreate
those 3 files.  I'll be sure to use a unique file name, or mv the output
file to a different name before running the command again.  Thanks for
pointing that out.
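
A rough sketch of the "mv the output file" safeguard, assuming the output
ends in ".stat".  The function name preserve_output and the tag argument
(for example, the cycle being processed) are hypothetical and only for
illustration.

```shell
#!/usr/bin/env bash
# Sketch only: before rerunning a job that writes to a fixed -out_stat name,
# move any existing output aside under a unique name.

preserve_output() {
  # $1 = path the next job will write to
  # $2 = tag that makes the saved name unique, e.g. 2018031100
  local out="$1" tag="$2"
  if [ -e "$out" ]; then
    mv -- "$out" "${out%.stat}_${tag}.stat"
  fi
}
```

For example, "preserve_output /new_rerun_stat_files/MPR_to_CNT.stat
2018031100" would rename an existing file to MPR_to_CNT_2018031100.stat
and do nothing if the file is not there yet.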

I'm teleworking for the next couple of weeks, so I can't download and send
you *.stat files like I can when I'm at my computer at work.  I don't have
access to theia or wcoss anymore.  You have an ftp server that I can
upload data to, right?  If not, I can try and fiddle around with this
tomorrow and see if I can get this to work the way I want it to.

Roz

> > --
> > Rosalyn MacCracken
> > Support Scientist
> >
> > Ocean Applications Branch
> > NOAA/NWS Ocean Prediction Center
> > NCWCP
> > 5830 University Research Ct
> > College Park, MD  20740-3818
> >
> > (p) 301-683-1551
> > rosalyn.maccracken at noaa.gov
> >
> >
>
>

--
Rosalyn MacCracken
Support Scientist

Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD  20740-3818

(p) 301-683-1551
rosalyn.maccracken at noaa.gov

------------------------------------------------
Subject: question on regenerating data
From: John Halley Gotway
Time: Tue Apr 24 12:49:47 2018

Roz,

Yes, we do have an ftp site you can upload data to.  Follow the
instructions here:
   https://dtcenter.org/met/users/support/met_help.php#ftp

I'd suggest making a tar file for one day and posting it to the ftp
site:
   tar -cvzf sample.tar.gz /GFS/data/hourly/20180305*
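
If it helps, here is a rough sketch of the same step wrapped in a small
shell function.  It assumes the /GFS/data/hourly/${YYYYMMDDHH} layout
described earlier in this ticket; the function name make_sample_tar and
its arguments are made up for illustration and are not part of MET.  It
changes into the data directory first so the archive holds relative
paths, then lists the archive so you can confirm that every hourly
directory for that day made it in.

```shell
#!/bin/sh
# Sketch only: bundle one day's worth of hourly output directories.
# make_sample_tar is a hypothetical helper name, not a MET tool.
make_sample_tar() {
    data_dir=$1   # e.g. /GFS/data/hourly
    day=$2        # e.g. 20180305 (matches 2018030500, 2018030506, ...)
    out=$3        # e.g. sample.tar.gz

    # Resolve a relative output name before changing directory.
    case $out in
        /*) ;;
        *)  out=$(pwd)/$out ;;
    esac

    # Subshell keeps the caller's working directory unchanged.
    (
        cd "$data_dir" || exit 1
        # Relative paths keep the leading /GFS/... out of the archive.
        tar -czf "$out" "$day"*
    ) || return 1

    # List the contents to check that nothing is missing.
    tar -tzf "$out"
}
```

For the case above, the call would be something like:
   make_sample_tar /GFS/data/hourly 20180305 sample.tar.gz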

Thanks,
John


------------------------------------------------
Subject: question on regenerating data
From: Rosalyn MacCracken - NOAA Affiliate
Time: Tue Apr 24 12:53:08 2018

Ok, I'll get that over to the ftp site.  I have to make sure that I
find a day that has all the data in it.  Sometimes the data isn't
available when the script runs.  A little annoying, but that's
operations...

I'll let you know when I get the file to the ftp site.

Thanks!

Roz

On Tue, Apr 24, 2018 at 2:49 PM, John Halley Gotway via RT <
met_help at ucar.edu> wrote:

> Roz,
>
> Yes, we do.  Follow the instructions here:
>    https://dtcenter.org/met/users/support/met_help.php#ftp
>
> I'd suggest making a tar file for one day and posting them to the
ftp site:
>    tar -cvzf sample.tar.gz /GFS/data/hourly/20180305*
>
> Thanks,
> John
>
> On Tue, Apr 24, 2018 at 11:57 AM, Rosalyn MacCracken - NOAA
Affiliate via
> RT <met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> >
> > HI John,
> >
> > Yes, it does seem that the -config option is the way to go to
recreate
> > those 3 files. I'll be sure to have a unique file name, or, mv the
output
> > file to a different name before running the command again.  Thanks
for
> > pointing that out.
> >
> > I'm teleworking for the next couple of weeks, so, download and
send you
> > *.stat files like I can when I'm at my computer at work.  I don't
have
> > access to theia or wcoss anymore.  You have an ftp server that I
can
> upload
> > data to, right?  If not, I can try and fiddle around with this
tomorrow
> and
> > see if I can't get this to work the way I want to.
> >
> > Roz
> >
> > On Tue, Apr 24, 2018 at 11:42 AM, John Halley Gotway via RT <
> > met_help at ucar.edu> wrote:
> >
> > > Roz,
> > >
> > > Each "-job aggregate_stat" only generates a single output line
type.
> So
> > > using "-out_line_type CTC,CTS,CNT" will not work.
> > >
> > > You'll need to run separate jobs for each output line type you
want to
> > > generate.  That's why I'd recommend grouping those multiple jobs
> together
> > > into a single STAT-Analysis config file.  Then you'd call STAT-
Analysis
> > > once using the "-config" command line option.
> > >
> > > Another issue is that if you set "-out_stat" to the same
filename,
> it'll
> > > get overridden by each job.  STAT-Analysis will overwrite that
output
> > file
> > > rather than appending to it.
> > >
> > > You could send me a day's worth of .stat output files
> > > (/GFS/data/hourly/20180305*) and I could send you some
suggestions.  Or
> > if
> > > you have access to theia you could copy them up there and point
me to
> it.
> > >
> > > Thanks,
> > > John
> > >
> > > On Tue, Apr 24, 2018 at 7:48 AM, Rosalyn MacCracken - NOAA
Affiliate
> via
> > RT
> > > <met_help at ucar.edu> wrote:
> > >
> > > >
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822
>
> > > >
> > > > Hi John,
> > > >
> > > > Yes, that makes sense.  Those very small values (<1.0 m/s),
are bad
> > > > values.  That's why they shouldn't be included in the
processing.
> > > >
> > > > So, I need to just regenerate hourly data, one hour at a time.
Would
> > it
> > > > make sense to use a shell script and loop stat-analysis?
Something
> > like:
> > > >
> > > > for day in 11 12
> > > > do
> > > >   for cycle in 00 06 12 18
> > > >   do
> > > > stat_analysis -lookin
/GFS/data/hourly/201803${day}${hour}/*.stat \
> > > > -job aggregate_stat \
> > > >    -line_type MPR \
> > > >    -out_line_type CTC,CTS,CNT \
> > > >   -fcst_var WIND \
> > > > -column_thresh OBS gt1 \
> > > >  -by
> > > > MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,
> INTERP_PNTS
> > > > -out_stat /new_rerun_stat_files/MPR_to_CTC_CTS_CNT.stat
> > > >   done
> > > > done
> > > >
> > > > or, something like that?  And, will this regenerate hour
forecasts,
> at
> > > each
> > > > forecast and lead hour?  I guess it will see the forecast and
lead
> hour
> > > > from the *.stat file, and whatever *stat file is in the
directory, it
> > > will
> > > > regenerate those hours, right?
> > > >
> > > > So, I need to regenerate the CTC, CNT and CTS files.  That's
why I
> did:
> > > >  -out_line_type CTC,CTS,CNT
> > > > but, will that make 3 separate files, or just another *.stat
file?
> > > >
> > > > Roz
> > > >
> > > >
> > > > On Mon, Apr 23, 2018 at 4:01 PM, John Halley Gotway via RT <
> > > > met_help at ucar.edu> wrote:
> > > >
> > > > > Roz,
> > > > >
> > > > > It is ultimately up to you to decide which matched pairs you
want
> to
> > > > > include in your processing.  Do you consider those small
(<1.0 m/s)
> > > > > observation values to be corrupt and incorrect in some way
or just
> > not
> > > > very
> > > > > interesting?  If they really are BAD data values, I agree
that you
> > > should
> > > > > exclude them from your analysis.  But if they're just
uninteresting
> > > > values
> > > > > of low wind speed, then there's no reason why you should
exclude
> > them.
> > > > For
> > > > > example, *most* of the time it ins't raining, but we often
included
> > > > > observations of 0 precip.
> > > > >
> > > > > There are three configurable options in Point-Stat that may
be
> useful
> > > > here:
> > > > > (1) You already know and use the "cat_thresh" option.  This
> threshold
> > > > > defines the events and non-events for a 2x2 contingency
table.
> This
> > > > > threshold affects the contents of FHO, CTC, CTS, MCTC, and
MCTS
> line
> > > > types
> > > > > that Point-Stat writes.
> > > > > (2) The "cnt_thresh" option is a more recent addition.
Perhaps
> this
> > > was
> > > > a
> > > > > poor name choice, but instead of defining categories, it's
really a
> > > > > *filtering* threshold.  This threshold affects the contents
of the
> > > SL1L2,
> > > > > SAL1L2, and CNT line types that Point-Stat writes.  For
example,
> > > setting
> > > > > "cnt_thresh = [ ge6, ge17 ];" will produce 2 CNT and 2 SL1L2
output
> > > lines
> > > > > containing only those points where the wind speed was >=6
and >=17,
> > > > > respectively.
> > > > > (3) The "wind_thresh" option is very similar to the
"cnt_thresh"
> > option
> > > > but
> > > > > affects the contents of teh VL1L2, VAL1L2, and VCNT (new in
> met-7.0)
> > > line
> > > > > types.  Only those U/V pairs that meet the specified wind
speed
> > > threshold
> > > > > are included in the output.
> > > > >
> > > > > For both "cnt_thresh" and "wind_thresh", the default value
in the
> > > config
> > > > > file is "NA", meaning, do not apply any filtering threshold
> criteria.
> > > > >
> > > > > You have the flexibility to run STAT-Analysis on the MPR
output
> lines
> > > to
> > > > > recompute any of these output line types applying whatever
> filtering
> > > > > criteria you'd like.
> > > > > Here's the MET user's guide:
> > > > > https://dtcenter.org/met/users/docs/users_guide/MET_
> > > Users_Guide_v7.0.pdf
> > > > > Look on page 98 for the job command options for the
> "aggregate_stat"
> > > line
> > > > > type when the input line type is "MPR".
> > > > >
> > > > > For your second question, the "-lookin PATH" option is
*VERY*
> > flexible.
> > > > > You can set PATH to either a single value or multiple
values.  If
> you
> > > use
> > > > > wildcards, then the shell expands those wildcards to
multiple
> values.
> > > > Each
> > > > > value you pass in can either be a filename or a directory
name.  If
> > you
> > > > > pass in a filename, STAT-Analysis will read it *REGARDLESS*
of the
> > file
> > > > > extension.  If you pass in a directory name, STAT-Analysis
will
> > search
> > > > that
> > > > > directory *RECURSIVELY* for files ending in ".stat".  For
example,
> > > either
> > > > > of the following settings would tell STAT-Analysis to read
the same
> > > list
> > > > of
> > > > > files:
> > > > >    -lookin /GFS/data/hourly/*/*.stat
> > > > >    ... or ...
> > > > >    -lookin /GFS/data/hourly
> > > > >
> > > > > Be aware though that the more data you pass to STAT-
Analysis, the
> > > longer
> > > > > it'll take for it to process it.  You can decide how much
data you
> > pass
> > > > it
> > > > > for each job.  I'd suggest starting with what is most
convenient
> for
> > > you.
> > > > > If it's too slow, change the logic to pass it less data
(e.g. only
> 1
> > > day
> > > > of
> > > > > data rather than 1 month of data).
> > > > >
> > > > > Yes, you can give it a date range.  Use -fcst_init_beg and
> > > -fcst_init_end
> > > > > to specify beginning/ending model initialization times or
> > > -fcst_valid_beg
> > > > > and -fcst_valid_end to specify beginning/ending valid times.
> > > > >
> > > > > If you find that you're running multiple jobs on the same
subset of
> > > data
> > > > > (e.g. process MPR to CNT, MPR to SL1L2, MPR to CTC, MPR to
CTS),
> it'd
> > > be
> > > > > more efficient to group those jobs into a config file.
That'll do
> > the
> > > > > filtering ONCE and write the filtered data to a temp file.
Then
> all
> > > the
> > > > > jobs read data from the temp instead of starting over from
scratch.
> > > > >
> > > > > Make sense?
> > > > >
> > > > > John
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Apr 23, 2018 at 1:01 PM, Rosalyn MacCracken - NOAA
> Affiliate
> > > via
> > > > RT
> > > > > <met_help at ucar.edu> wrote:
> > > > >
> > > > > >
> > > > > > <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > >
> > > > > > Hi John,
> > > > > >
> > > > > > That's actually only partially correct.  It's not that I
want to
> > use
> > > > part
> > > > > > of the MPR lines and discard the rest, and I do need to
> regenerate
> > > > > > statistics.  Let me try to re-explain.
> > > > > >
> > > > > > Back in early March we switched from getting our ASCAT obs
from
> the
> > > > > > prepbufr data, to getting it from the MGDRLITE data. So,
> processing
> > > > > didn't
> > > > > > change.  I was producing statistics at certain threshold
levels
> for
> > > > both
> > > > > > GFS and ASCAT.  I had this set with the cat_thresh list,
at
> levels
> > of
> > > > > > 0,6,17, etc.  We found out after processing for a couple
of weeks
> > > that
> > > > > the
> > > > > > ASCAT data included these really small values, <1.0 m/s,
and that
> > > these
> > > > > > small wind speeds were being included into the statistics
> > processing.
> > > > > >
> > > > > > So, a couple of questions.
> > > > > > 1) Do I have to regenerate all of my statistics (*.cts,
*.cnt and
> > > *ctc
> > > > > > files) because of this error? Or, since I have threshold
levels
> > set,
> > > > will
> > > > > > those small values be amoung the statistics in the lowest
> > thresholds?
> > > > > > 2) I have the *.stat files, but, they are spread out into
> separate
> > > > > > directories like:
> > > > > > /GFS/data/hourly/${YYYYMMDDHH}/*.stat
> > > > > > Can I tell stat-analysis to "lookin" directories with a
wildcard
> > > (like
> > > > > > 201803*)?  If so, how?  Or, is I tell it to look in
> > /GFS/data/hourly,
> > > > > will
> > > > > > it look in all the directories recursively under hourly?
And, it
> > > > that's
> > > > > > the case, can I give it a date range, so, that it only
processes
> > data
> > > > > from
> > > > > > March?
> > > > > >
> > > > > > Roz
> > > > > >
> > > > > > On Mon, Apr 23, 2018 at 2:18 PM, John Halley Gotway via RT
<
> > > > > > met_help at ucar.edu> wrote:
> > > > > >
> > > > > > > Hi Roz,
> > > > > > >
> > > > > > > I read that you've run Point-Stat and saved off the
matched
> pairs
> > > > (MPR)
> > > > > > > output line type.  And you'd like to (1) filter those
MPR lines
> > to
> > > > > > discard
> > > > > > > some of them and then (2) use the filtered data to
regenerate
> > > summary
> > > > > > > statistics.  Yes, this is easily done using the STAT-
Analysis
> > tool
> > > in
> > > > > > MET.
> > > > > > >
> > > > > > > You wrote that you're verifying wind speeds against
ASCAT and
> > that
> > > > > you'd
> > > > > > > like to exclude pairs where the observed wind speed is
less
> than
> > 1
> > > > m/s.
> > > > > > > I'm just guessing here, but I'll presume that you want
to
> produce
> > > > both
> > > > > > > SL1L2 and CNT output line types.  Here's what the STAT-
Analysis
> > job
> > > > > would
> > > > > > > look like:
> > > > > > >
> > > > > > > # Filter MPRs and write SL1L2 output lines.
> > > > > > > #   -lookin         a .stat filename or a directory containing them
> > > > > > > #   -job            job type is aggregate_stat
> > > > > > > #   -line_type      input line type = MPR
> > > > > > > #   -out_line_type  output line type = SL1L2 partial sums
> > > > > > > #   -fcst_var       only process lines where FCST_VAR column = WIND
> > > > > > > #   -column_thresh  only use MPR lines where OBS column > 1
> > > > > > > #   -by             run this same job for each unique combination of
> > > > > > > #                   these columns
> > > > > > > stat_analysis \
> > > > > > >    -lookin input.stat \
> > > > > > >    -job aggregate_stat \
> > > > > > >    -line_type MPR \
> > > > > > >    -out_line_type SL1L2 \
> > > > > > >    -fcst_var WIND \
> > > > > > >    -column_thresh OBS gt1 \
> > > > > > >    -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > > > >    -out_stat MPR_to_SL1L2.stat
> > > > > > >
> > > > > > > This will produce an output .stat file containing an SL1L2 line
> > > > > > > for each unique combination of the header columns listed after the
> > > > > > > "-by" option.  To generate CNT output lines instead, you'd run a
> > > > > > > second job where you replace SL1L2 with CNT.  You could run these
> > > > > > > jobs on the command line or group them together into a
> > > > > > > STAT-Analysis config file, if you prefer.  Both would work.
> > > > > > >
> > > > > > > You could run this once for each input .stat file you're
> > > > > > > processing... or you could pass many input .stat files to the job.
> > > > > > > Since FCST_INIT_BEG and FCST_LEAD are included in the "-by" option,
> > > > > > > you'll get separate output lines for each unique time.
> > > > > > >
> > > > > > > Hope that helps get you going.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > John

--
Rosalyn MacCracken
Support Scientist

Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD  20740-3818

(p) 301-683-1551
rosalyn.maccracken at noaa.gov

------------------------------------------------
Subject: question on regenerating data
From: Rosalyn MacCracken - NOAA Affiliate
Time: Tue Apr 24 14:09:37 2018

Hi John,

I put my file on the ftp site.  Let me know what you find.  You'll see
those really low OBS values (0.01, 0.02, and so on).

Thanks!

Roz


--
Rosalyn MacCracken
Support Scientist

Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD  20740-3818

(p) 301-683-1551
rosalyn.maccracken at noaa.gov

------------------------------------------------
Subject: question on regenerating data
From: John Halley Gotway
Time: Tue Apr 24 17:06:57 2018

Hi Roz,

Thanks for sending the sample data.  I grabbed it and used it to run some
sample jobs:

time /d1/johnhg/MET/MET_releases/met-6.0/bin/stat_analysis \
-lookin /d1/johnhg/MET/MET_Help/maccracken_data_20180424/opc_test/home/opc_test/data/met_verif/GFS/data/hourly \
-config STATAnalysisConfig \
-log run_sa.log -v 3

I used the "-lookin" option to point to all the data you sent.

I've attached the...
(1) config file I used
(2) log file that was generated
(3) output .stat files

Looking at the jobs, you'll see that I've included 5 of them...
- Generate CNT output
- Generate CTC >= 0.0 output
- Generate CTS >= 0.0 output
- Generate CTC >= 5.5689 output
- Generate CTS >= 5.5689 output

Unfortunately, you'll need to define separate jobs for each threshold you'd
like to use.  Although, you shouldn't use >=0.0 since that's always true.
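
For anyone following along, here is a minimal sketch of how those
per-threshold jobs could be laid out.  It only prints job strings and does
not run MET.  The thresholds (ge6, ge17), the output directory, and the
build_jobs helper are assumptions for illustration, not the contents of the
attached config file.  The -out_fcst_thresh and -out_obs_thresh options are,
as far as I know, the ones the MET user's guide lists for deriving CTC/CTS
lines from MPR input, so please check them against your MET version.

```shell
#!/bin/sh
# Sketch only: print one aggregate_stat job string per output line type and
# threshold.  Each job gets its own -out_stat file so that no job overwrites
# the output of another.

out_dir="./rerun_stat"   # hypothetical output directory

build_jobs() {
  # Options shared by every job: read MPR lines for WIND and drop pairs
  # whose observed value is not greater than 1.
  common="-job aggregate_stat -line_type MPR -fcst_var WIND -column_thresh OBS gt1"

  # CNT is a continuous line type, so it needs no categorical threshold.
  echo "${common} -out_line_type CNT -out_stat ${out_dir}/MPR_to_CNT.stat"

  # CTC and CTS need one job per categorical threshold (assumed values).
  for thresh in ge6 ge17; do
    for lt in CTC CTS; do
      echo "${common} -out_line_type ${lt}" \
           "-out_fcst_thresh ${thresh} -out_obs_thresh ${thresh}" \
           "-out_stat ${out_dir}/MPR_to_${lt}_${thresh}.stat"
    done
  done
}

build_jobs
```

Each printed string could go into the jobs list of a STAT-Analysis config
file, or be appended to a "stat_analysis -lookin ..." command line.  Giving
every job a distinct -out_stat name sidesteps the overwrite problem mentioned
earlier in this thread.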

Also unfortunately, this is pretty slow.  On my machine, it took like 18
minutes for these 5 jobs!

Thanks,
John

> >> > > > On Mon, Apr 23, 2018 at 4:01 PM, John Halley Gotway via RT
<
> >> > > > met_help at ucar.edu> wrote:
> >> > > >
> >> > > > > Roz,
> >> > > > >
> >> > > > > It is ultimately up to you to decide which matched pairs you want to
> >> > > > > include in your processing.  Do you consider those small (<1.0 m/s)
> >> > > > > observation values to be corrupt and incorrect in some way or just not very
> >> > > > > interesting?  If they really are BAD data values, I agree that you should
> >> > > > > exclude them from your analysis.  But if they're just uninteresting values
> >> > > > > of low wind speed, then there's no reason why you should exclude them.  For
> >> > > > > example, *most* of the time it isn't raining, but we often include
> >> > > > > observations of 0 precip.
> >> > > > >
> >> > > > > There are three configurable options in Point-Stat that may be useful here:
> >> > > > > (1) You already know and use the "cat_thresh" option.  This threshold
> >> > > > > defines the events and non-events for a 2x2 contingency table.  This
> >> > > > > threshold affects the contents of FHO, CTC, CTS, MCTC, and MCTS line types
> >> > > > > that Point-Stat writes.
> >> > > > > (2) The "cnt_thresh" option is a more recent addition.  Perhaps this was a
> >> > > > > poor name choice, but instead of defining categories, it's really a
> >> > > > > *filtering* threshold.  This threshold affects the contents of the SL1L2,
> >> > > > > SAL1L2, and CNT line types that Point-Stat writes.  For example, setting
> >> > > > > "cnt_thresh = [ ge6, ge17 ];" will produce 2 CNT and 2 SL1L2 output lines
> >> > > > > containing only those points where the wind speed was >=6 and >=17,
> >> > > > > respectively.
> >> > > > > (3) The "wind_thresh" option is very similar to the "cnt_thresh" option but
> >> > > > > affects the contents of the VL1L2, VAL1L2, and VCNT (new in met-7.0) line
> >> > > > > types.  Only those U/V pairs that meet the specified wind speed threshold
> >> > > > > are included in the output.
> >> > > > >
> >> > > > > For both "cnt_thresh" and "wind_thresh", the default
value in
> the
> >> > > config
> >> > > > > file is "NA", meaning, do not apply any filtering
threshold
> >> criteria.
> >> > > > >
> >> > > > > You have the flexibility to run STAT-Analysis on the MPR output lines to
> >> > > > > recompute any of these output line types applying whatever filtering
> >> > > > > criteria you'd like.
> >> > > > > Here's the MET user's guide:
> >> > > > > https://dtcenter.org/met/users/docs/users_guide/MET_Users_Guide_v7.0.pdf
> >> > > > > Look on page 98 for the job command options for the "aggregate_stat" job
> >> > > > > type when the input line type is "MPR".
> >> > > > >
> >> > > > > For your second question, the "-lookin PATH" option is *VERY* flexible.
> >> > > > > You can set PATH to either a single value or multiple values.  If you use
> >> > > > > wildcards, then the shell expands those wildcards to multiple values.  Each
> >> > > > > value you pass in can either be a filename or a directory name.  If you
> >> > > > > pass in a filename, STAT-Analysis will read it *REGARDLESS* of the file
> >> > > > > extension.  If you pass in a directory name, STAT-Analysis will search that
> >> > > > > directory *RECURSIVELY* for files ending in ".stat".  For example, either
> >> > > > > of the following settings would tell STAT-Analysis to read the same list of
> >> > > > > files:
> >> > > > >    -lookin /GFS/data/hourly/*/*.stat
> >> > > > >    ... or ...
> >> > > > >    -lookin /GFS/data/hourly
> >> > > > >
> >> > > > > Be aware though that the more data you pass to STAT-Analysis, the longer
> >> > > > > it'll take for it to process it.  You can decide how much data you pass it
> >> > > > > for each job.  I'd suggest starting with what is most convenient for you.
> >> > > > > If it's too slow, change the logic to pass it less data (e.g. only 1 day of
> >> > > > > data rather than 1 month of data).
> >> > > > >
> >> > > > > Yes, you can give it a date range.  Use -fcst_init_beg
and
> >> > > -fcst_init_end
> >> > > > > to specify beginning/ending model initialization times or
> >> > > -fcst_valid_beg
> >> > > > > and -fcst_valid_end to specify beginning/ending valid
times.
> >> > > > >
> >> > > > > If you find that you're running multiple jobs on the same subset of data
> >> > > > > (e.g. process MPR to CNT, MPR to SL1L2, MPR to CTC, MPR to CTS), it'd be
> >> > > > > more efficient to group those jobs into a config file.  That'll do the
> >> > > > > filtering ONCE and write the filtered data to a temp file.  Then all the
> >> > > > > jobs read data from the temp file instead of starting over from scratch.
> >> > > > >
> >> > > > > Make sense?
> >> > > > >
> >> > > > > John
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > On Mon, Apr 23, 2018 at 1:01 PM, Rosalyn MacCracken -
NOAA
> >> Affiliate
> >> > > via
> >> > > > RT
> >> > > > > <met_help at ucar.edu> wrote:
> >> > > > >
> >> > > > > >
> >> > > > > > <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822
> >
> >> > > > > >
> >> > > > > > Hi John,
> >> > > > > >
> >> > > > > > That's actually only partially correct.  It's not that
I want
> to
> >> > use
> >> > > > part
> >> > > > > > of the MPR lines and discard the rest, and I do need to
> >> regenerate
> >> > > > > > statistics.  Let me try to re-explain.
> >> > > > > >
> >> > > > > > Back in early March we switched from getting our ASCAT
obs
> from
> >> the
> >> > > > > > prepbufr data, to getting it from the MGDRLITE data.
So,
> >> processing
> >> > > > > didn't
> >> > > > > > change.  I was producing statistics at certain
threshold
> levels
> >> for
> >> > > > both
> >> > > > > > GFS and ASCAT.  I had this set with the cat_thresh
list, at
> >> levels
> >> > of
> >> > > > > > 0,6,17, etc.  We found out after processing for a
couple of
> >> weeks
> >> > > that
> >> > > > > the
> >> > > > > > ASCAT data included these really small values, <1.0
m/s, and
> >> that
> >> > > these
> >> > > > > > small wind speeds were being included into the
statistics
> >> > processing.
> >> > > > > >
> >> > > > > > So, a couple of questions.
> >> > > > > > 1) Do I have to regenerate all of my statistics (*.cts, *.cnt and *.ctc
> >> > > > > > files) because of this error?  Or, since I have threshold levels set, will
> >> > > > > > those small values be among the statistics in the lowest thresholds?
> >> > > > > > 2) I have the *.stat files, but they are spread out into separate
> >> > > > > > directories like:
> >> > > > > > /GFS/data/hourly/${YYYYMMDDHH}/*.stat
> >> > > > > > Can I tell stat-analysis to "lookin" directories with a wildcard (like
> >> > > > > > 201803*)?  If so, how?  Or, if I tell it to look in /GFS/data/hourly, will
> >> > > > > > it look in all the directories recursively under hourly?  And, if that's
> >> > > > > > the case, can I give it a date range, so that it only processes data from
> >> > > > > > March?
> >> > > > > >
> >> > > > > > Roz
> >> > > > > >
> >> > > > > > On Mon, Apr 23, 2018 at 2:18 PM, John Halley Gotway via
RT <
> >> > > > > > met_help at ucar.edu> wrote:
> >> > > > > >
> >> > > > > > > Hi Roz,
> >> > > > > > >
> >> > > > > > > I read that you've run Point-Stat and saved off the
matched
> >> pairs
> >> > > > (MPR)
> >> > > > > > > output line type.  And you'd like to (1) filter those
MPR
> >> lines
> >> > to
> >> > > > > > discard
> >> > > > > > > some of them and then (2) use the filtered data to
> regenerate
> >> > > summary
> >> > > > > > > statistics.  Yes, this is easily done using the
> STAT-Analysis
> >> > tool
> >> > > in
> >> > > > > > MET.
> >> > > > > > >
> >> > > > > > > You wrote that you're verifying wind speeds against
ASCAT
> and
> >> > that
> >> > > > > you'd
> >> > > > > > > like to exclude pairs where the observed wind speed
is less
> >> than
> >> > 1
> >> > > > m/s.
> >> > > > > > > I'm just guessing here, but I'll presume that you
want to
> >> produce
> >> > > > both
> >> > > > > > > SL1L2 and CNT output line types.  Here's what the
> >> STAT-Analysis
> >> > job
> >> > > > > would
> >> > > > > > > look like:
> >> > > > > > >
> >> > > > > > > # Filter MPRs and write SL1L2 output lines.
> >> > > > > > > #   -lookin         a .stat filename, or a directory containing them
> >> > > > > > > #   -job            job type is aggregate_stat
> >> > > > > > > #   -line_type      input line type = MPR
> >> > > > > > > #   -out_line_type  output line type = SL1L2 partial sums
> >> > > > > > > #   -fcst_var       only process lines where the FCST_VAR column = WIND
> >> > > > > > > #   -column_thresh  only use MPR lines where the OBS column > 1
> >> > > > > > > #   -by             run this same job for each unique combination of
> >> > > > > > > #                   these columns
> >> > > > > > > stat_analysis \
> >> > > > > > >    -lookin input.stat \
> >> > > > > > >    -job aggregate_stat \
> >> > > > > > >    -line_type MPR \
> >> > > > > > >    -out_line_type SL1L2 \
> >> > > > > > >    -fcst_var WIND \
> >> > > > > > >    -column_thresh OBS gt1 \
> >> > > > > > >    -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> >> > > > > > >    -out_stat MPR_to_SL1L2.stat
> >> > > > > > >
> >> > > > > > > This will produce an output .stat file containing an SL1L2 line for
> >> > > > > > > each unique combination of the header columns listed after the "-by"
> >> > > > > > > option.  To generate CNT output lines instead, you'd run a second job where
> >> > > > > > > you replace SL1L2 with CNT.  You could run these jobs on the command line
> >> > > > > > > or group them together into a STAT-Analysis config file, if you prefer.
> >> > > > > > > Both would work.
> >> > > > > > >
> >> > > > > > > You could run this once for each input .stat file
you're
> >> > > > processing...
> >> > > > > or
> >> > > > > > > you could pass many input .stat files to the job.
Since
> >> > > > FCST_INIT_BEG
> >> > > > > > and
> >> > > > > > > FCST_LEAD are included in the "-by" option, you'll
get
> >> separate
> >> > > > output
> >> > > > > > > lines for each unique time.
> >> > > > > > >
> >> > > > > > > Hope that helps get you going.
> >> > > > > > >
> >> > > > > > > Thanks,
> >> > > > > > > John
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > On Thu, Apr 19, 2018 at 9:23 AM, Julie Prestopnik via
RT <
> >> > > > > > > met_help at ucar.edu>
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > > >
> >> > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Tic
> >> ket/Display.html?id=84822
> >> > >
> >> > > > > > > >
> >> > > > > > > > Hi Roz.  My apologies for the delay in responding.
> >> > > > > > > >
> >> > > > > > > > Unfortunately, John is out of the office this week,
and I
> do
> >> > not
> >> > > > know
> >> > > > > > the
> >> > > > > > > > answers to your questions.  As you said, I would
also
> >> imagine
> >> > > that
> >> > > > > > > > point-stat is using those small values as matched
pairs.
> >> > Also, I
> >> > > > do
> >> > > > > > not
> >> > > > > > > > believe there is a way to regenerate the point-stat
> >> statistics
> >> > > > > without
> >> > > > > > > > using the original GFS data.  I cannot say with
certainty,
> >> > > however.
> >> > > > > > > Thank
> >> > > > > > > > you for your patience in advance.  We'll get a
definite
> >> > response
> >> > > to
> >> > > > > you
> >> > > > > > > as
> >> > > > > > > > soon as we can.
> >> > > > > > > >
> >> > > > > > > > Thanks,
> >> > > > > > > > Julie
> >> > > > > > > >
> >> > > > > > > > On Wed, Apr 18, 2018 at 6:31 AM, Rosalyn MacCracken
- NOAA
> >> > > > Affiliate
> >> > > > > > via
> >> > > > > > > RT
> >> > > > > > > > <met_help at ucar.edu> wrote:
> >> > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > > Wed Apr 18 06:31:39 2018: Request 84822 was acted
upon.
> >> > > > > > > > > Transaction: Ticket created by
> >> rosalyn.maccracken at noaa.gov
> >> > > > > > > > >        Queue: met_help
> >> > > > > > > > >      Subject: question on regenerating data
> >> > > > > > > > >        Owner: Nobody
> >> > > > > > > > >   Requestors: rosalyn.maccracken at noaa.gov
> >> > > > > > > > >       Status: new
> >> > > > > > > > >  Ticket <URL: https://rt.rap.ucar.edu/rt/
> >> > > > > > Ticket/Display.html?id=84822
> >> > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > > Hi,
> >> > > > > > > > >
> >> > > > > > > > > I'm running point-stat using ASCAT and GFS data
to
> verify
> >> > > surface
> >> > > > > > wind
> >> > > > > > > > > speeds.  I found an error in my ASCAT input data
that
> goes
> >> > back
> >> > > > to
> >> > > > > > Mar
> >> > > > > > > 7.
> >> > > > > > > > > I had switched the input source of the data, and
within
> >> the
> >> > new
> >> > > > > data
> >> > > > > > > > files,
> >> > > > > > > > > it was allowing very small values (< 1 m/s) to be
used
> as
> >> > data
> >> > > > > points
> >> > > > > > > in
> >> > > > > > > > > the verification.  I imagine that this is an
issue,
> since
> >> > > > > point-stat
> >> > > > > > is
> >> > > > > > > > > using these very small values as matched pairs
with the
> >> GFS,
> >> > > > > correct?
> >> > > > > > > > >
> >> > > > > > > > > Is there a way to regenerate the point-stat
statistics
> >> > without
> >> > > > > using
> >> > > > > > > the
> >> > > > > > > > > original GFS data?  I do have the *stat and the
*mpr
> >> files,
> >> > and
> >> > > > it
> >> > > > > is
> >> > > > > > > > > pretty easy to identify where the bad values are
> located.
> >> > > > > > > > >
> >> > > > > > > > > Thanks,
> >> > > > > > > > > Roz
> >> > > > > > > > >
> >> > > > > > > > > --
> >> > > > > > > > > Rosalyn MacCracken
> >> > > > > > > > > Support Scientist
> >> > > > > > > > >
> >> > > > > > > > > Ocean Applications Branch
> >> > > > > > > > > NOAA/NWS Ocean Prediction Center
> >> > > > > > > > > NCWCP
> >> > > > > > > > > 5830 University Research Ct
> >> > > > > > > > > College Park, MD  20740-3818
> >> > > > > > > > >
> >> > > > > > > > > (p) 301-683-1551
> >> > > > > > > > > rosalyn.maccracken at noaa.gov
> >> > > > > > > > >

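A minimal sketch of the "separate job per output line type" advice quoted
above, assuming the directory layout and "-by" column list used earlier in
this thread.  It is a dry run: it only prints the stat_analysis commands so
that the output file names can be checked before anything is executed.  The
helper name print_jobs, the ge6 event threshold, and the output file naming
are illustrative assumptions and not something John or Roz specified; the
"-out_fcst_thresh" and "-out_obs_thresh" job options should be checked
against the user's guide for the MET version in use.

```shell
#!/bin/sh
# Dry run: print one stat_analysis command per (day, cycle, output line type).
# IN_ROOT and OUT_DIR default to the paths quoted in the thread.
IN_ROOT="${IN_ROOT:-/GFS/data/hourly}"
OUT_DIR="${OUT_DIR:-/new_rerun_stat_files}"
BY_COLS="MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS"
EVENT_THRESH="${EVENT_THRESH:-ge6}"   # placeholder event threshold (assumption)

# print_jobs DAYS CYCLES
# Hypothetical helper; each argument is a space-separated list.
print_jobs() {
  for day in $1; do
    for cycle in $2; do
      for out_type in CTC CTS CNT; do
        # Contingency-table output needs an event definition; CNT does not.
        case "$out_type" in
          CTC|CTS) thresh_opts=" -out_fcst_thresh ${EVENT_THRESH} -out_obs_thresh ${EVENT_THRESH}" ;;
          *)       thresh_opts="" ;;
        esac
        # One output line type and one unique -out_stat file per job,
        # so no job overwrites another job's output.
        echo "stat_analysis -lookin ${IN_ROOT}/201803${day}${cycle}" \
             "-job aggregate_stat -line_type MPR -out_line_type ${out_type}" \
             "-fcst_var WIND -column_thresh OBS gt1${thresh_opts}" \
             "-by ${BY_COLS}" \
             "-out_stat ${OUT_DIR}/201803${day}${cycle}_MPR_to_${out_type}.stat"
      done
    done
  done
}

print_jobs "11 12" "00 06 12 18"
```

Once the printed commands look right, piping the output to "sh" would run
them one at a time.  Grouping the jobs in a STAT-Analysis config file, as
suggested in the thread, avoids re-reading the input for every job.
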
------------------------------------------------
Subject: question on regenerating data
From: Rosalyn MacCracken - NOAA Affiliate
Time: Wed Apr 25 09:18:42 2018

Hi John,

Thanks for doing that for me.  I'll take a look at the info you sent me
this afternoon.  I'm in the middle of doing something right now...trying to
make a different program work.  ;-/

I wonder if it will be quicker than 18 minutes for some of the thresholds
that have higher wind speeds, and not as many instances (or 0 instances).
Or, will it take just as long, since it still needs to read through the
entire *.stat file anyway?

Roz

On Tue, Apr 24, 2018 at 7:06 PM, John Halley Gotway via RT <
met_help at ucar.edu> wrote:

> Hi Roz,
>
> Thanks for sending the sample data.  I grabbed it and used it to run some
> sample jobs:
>
> time /d1/johnhg/MET/MET_releases/met-6.0/bin/stat_analysis \
>   -lookin /d1/johnhg/MET/MET_Help/maccracken_data_20180424/opc_test/home/opc_test/data/met_verif/GFS/data/hourly \
>   -config STATAnalysisConfig \
>   -log run_sa.log -v 3
>
> I used the "-lookin" option to point to all the data you sent.
>
> I've attached the...
> (1) config file I used
> (2) log file that was generated
> (3) output .stat files
>
> Looking at the jobs, you'll see that I've included 5 of them...
> - Generate CNT output
> - Generate CTC >= 0.0 output
> - Generate CTS >= 0.0 output
> - Generate CTC >= 5.5689 output
> - Generate CTS >= 5.5689 output
>
> Unfortunately, you'll need to define separate jobs for each threshold you'd
> like to use.  Although, you shouldn't use >=0.0 since that's always true.
>
> Also unfortunately, this is pretty slow.  On my machine, it took about 18
> minutes for these 5 jobs!
>
> Thanks,
> John
>
>
> On Tue, Apr 24, 2018 at 2:09 PM, Rosalyn MacCracken - NOAA Affiliate
via RT
> <met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> >
> > Hi John,
> >
> > I put my file on the ftp site.  Let me know what you find.  You'll see
> > those really low OBS values (0.01, 0.02, and so on).
> >
> > Thanks!
> >
> > Roz
> >
> > On Tue, Apr 24, 2018 at 2:53 PM, Rosalyn MacCracken - NOAA
Affiliate <
> > rosalyn.maccracken at noaa.gov> wrote:
> >
> > > Ok, I'll get that over to the ftp site.  I have to make sure that I find a
> > > day that has all the data in it.  Sometimes the data isn't available when
> > > the script runs.  A little annoying, but, that's operations...
> > >
> > > I'll let you know when I get the file to the ftp site.
> > >
> > > Thanks!
> > >
> > > Roz
> > >
> > > On Tue, Apr 24, 2018 at 2:49 PM, John Halley Gotway via RT <
> > > met_help at ucar.edu> wrote:
> > >
> > >> Roz,
> > >>
> > >> Yes, we do.  Follow the instructions here:
> > >>    https://dtcenter.org/met/users/support/met_help.php#ftp
> > >>
> > >> I'd suggest making a tar file for one day and posting it to the ftp
> > >> site:
> > >>    tar -cvzf sample.tar.gz /GFS/data/hourly/20180305*
> > >>
> > >> Thanks,
> > >> John
> > >>
> > >> On Tue, Apr 24, 2018 at 11:57 AM, Rosalyn MacCracken - NOAA
Affiliate
> > via
> > >> RT <met_help at ucar.edu> wrote:
> > >>
> > >> >
> > >> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822
>
> > >> >
> > >> > Hi John,
> > >> >
> > >> > Yes, it does seem that the -config option is the way to go to recreate
> > >> > those 3 files.  I'll be sure to have a unique file name, or, mv the output
> > >> > file to a different name before running the command again.  Thanks for
> > >> > pointing that out.
> > >> >
> > >> > I'm teleworking for the next couple of weeks, so I can't download and send
> > >> > you *.stat files like I can when I'm at my computer at work.  I don't have
> > >> > access to theia or wcoss anymore.  You have an ftp server that I can upload
> > >> > data to, right?  If not, I can try and fiddle around with this tomorrow and
> > >> > see if I can't get this to work the way I want to.
> > >> >
> > >> > Roz
> > >> >
> > >> > On Tue, Apr 24, 2018 at 11:42 AM, John Halley Gotway via RT <
> > >> > met_help at ucar.edu> wrote:
> > >> >
> > >> > > Roz,
> > >> > >
> > >> > > Each "-job aggregate_stat" only generates a single output line type.  So
> > >> > > using "-out_line_type CTC,CTS,CNT" will not work.
> > >> > >
> > >> > > You'll need to run separate jobs for each output line type you want to
> > >> > > generate.  That's why I'd recommend grouping those multiple jobs together
> > >> > > into a single STAT-Analysis config file.  Then you'd call STAT-Analysis
> > >> > > once using the "-config" command line option.
> > >> > >
> > >> > > Another issue is that if you set "-out_stat" to the same filename, it'll
> > >> > > get overwritten by each job.  STAT-Analysis will overwrite that output file
> > >> > > rather than appending to it.
> > >> > >
> > >> > > You could send me a day's worth of .stat output files
> > >> > > (/GFS/data/hourly/20180305*) and I could send you some suggestions.  Or if
> > >> > > you have access to theia you could copy them up there and point me to them.
> > >> > >
> > >> > > Thanks,
> > >> > > John
> > >> > >
> > >> > > On Tue, Apr 24, 2018 at 7:48 AM, Rosalyn MacCracken - NOAA
> Affiliate
> > >> via
> > >> > RT
> > >> > > <met_help at ucar.edu> wrote:
> > >> > >
> > >> > > >
> > >> > > > <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > >> > > >
> > >> > > > Hi John,
> > >> > > >
> > >> > > > Yes, that makes sense.  Those very small values (<1.0 m/s) are bad
> > >> > > > values.  That's why they shouldn't be included in the processing.
> > >> > > >
> > >> > > > So, I need to just regenerate hourly data, one hour at a time.  Would it
> > >> > > > make sense to use a shell script and loop stat-analysis?  Something like:
> > >> > > >
> > >> > > > for day in 11 12
> > >> > > > do
> > >> > > >   for cycle in 00 06 12 18
> > >> > > >   do
> > >> > > >     stat_analysis -lookin /GFS/data/hourly/201803${day}${cycle}/*.stat \
> > >> > > >       -job aggregate_stat \
> > >> > > >       -line_type MPR \
> > >> > > >       -out_line_type CTC,CTS,CNT \
> > >> > > >       -fcst_var WIND \
> > >> > > >       -column_thresh OBS gt1 \
> > >> > > >       -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > >> > > >       -out_stat /new_rerun_stat_files/MPR_to_CTC_CTS_CNT.stat
> > >> > > >   done
> > >> > > > done
> > >> > > >
> > >> > > > or, something like that?  And, will this regenerate hour
> > forecasts,
> > >> at
> > >> > > each
> > >> > > > forecast and lead hour?  I guess it will see the forecast
and
> lead
> > >> hour
> > >> > > > from the *.stat file, and whatever *stat file is in the
> directory,
> > >> it
> > >> > > will
> > >> > > > regenerate those hours, right?
> > >> > > >
> > >> > > > So, I need to regenerate the CTC, CNT and CTS files.
That's
> why I
> > >> did:
> > >> > > >  -out_line_type CTC,CTS,CNT
> > >> > > > but, will that make 3 separate files, or just another
*.stat
> file?
> > >> > > >
> > >> > > > Roz
> > >> > > >
> > >> > > >
> > >> > > > On Mon, Apr 23, 2018 at 4:01 PM, John Halley Gotway via
RT <
> > >> > > > met_help at ucar.edu> wrote:
> > >> > > >
> > >> > > > > Roz,
> > >> > > > >
> > >> > > > > It is ultimately up to you to decide which matched pairs you want to
> > >> > > > > include in your processing.  Do you consider those small (<1.0 m/s)
> > >> > > > > observation values to be corrupt and incorrect in some way or just not very
> > >> > > > > interesting?  If they really are BAD data values, I agree that you should
> > >> > > > > exclude them from your analysis.  But if they're just uninteresting values
> > >> > > > > of low wind speed, then there's no reason why you should exclude them.  For
> > >> > > > > example, *most* of the time it isn't raining, but we often include
> > >> > > > > observations of 0 precip.
> > >> > > > >
> > >> > > > > There are three configurable options in Point-Stat that may be useful here:
> > >> > > > > (1) You already know and use the "cat_thresh" option.  This threshold
> > >> > > > > defines the events and non-events for a 2x2 contingency table.  This
> > >> > > > > threshold affects the contents of FHO, CTC, CTS, MCTC, and MCTS line types
> > >> > > > > that Point-Stat writes.
> > >> > > > > (2) The "cnt_thresh" option is a more recent addition.  Perhaps this was a
> > >> > > > > poor name choice, but instead of defining categories, it's really a
> > >> > > > > *filtering* threshold.  This threshold affects the contents of the SL1L2,
> > >> > > > > SAL1L2, and CNT line types that Point-Stat writes.  For example, setting
> > >> > > > > "cnt_thresh = [ ge6, ge17 ];" will produce 2 CNT and 2 SL1L2 output lines
> > >> > > > > containing only those points where the wind speed was >=6 and >=17,
> > >> > > > > respectively.
> > >> > > > > (3) The "wind_thresh" option is very similar to the "cnt_thresh" option but
> > >> > > > > affects the contents of the VL1L2, VAL1L2, and VCNT (new in met-7.0) line
> > >> > > > > types.  Only those U/V pairs that meet the specified wind speed threshold
> > >> > > > > are included in the output.
> > >> > > > >
> > >> > > > > For both "cnt_thresh" and "wind_thresh", the default
value in
> > the
> > >> > > config
> > >> > > > > file is "NA", meaning, do not apply any filtering
threshold
> > >> criteria.
> > >> > > > >
> > >> > > > > You have the flexibility to run STAT-Analysis on the MPR output lines to
> > >> > > > > recompute any of these output line types applying whatever filtering
> > >> > > > > criteria you'd like.
> > >> > > > > Here's the MET user's guide:
> > >> > > > > https://dtcenter.org/met/users/docs/users_guide/MET_Users_Guide_v7.0.pdf
> > >> > > > > Look on page 98 for the job command options for the "aggregate_stat" job
> > >> > > > > type when the input line type is "MPR".
> > >> > > > >
> > >> > > > > For your second question, the "-lookin PATH" option is *VERY* flexible.
> > >> > > > > You can set PATH to either a single value or multiple values.  If you use
> > >> > > > > wildcards, then the shell expands those wildcards to multiple values.  Each
> > >> > > > > value you pass in can either be a filename or a directory name.  If you
> > >> > > > > pass in a filename, STAT-Analysis will read it *REGARDLESS* of the file
> > >> > > > > extension.  If you pass in a directory name, STAT-Analysis will search that
> > >> > > > > directory *RECURSIVELY* for files ending in ".stat".  For example, either
> > >> > > > > of the following settings would tell STAT-Analysis to read the same list of
> > >> > > > > files:
> > >> > > > >    -lookin /GFS/data/hourly/*/*.stat
> > >> > > > >    ... or ...
> > >> > > > >    -lookin /GFS/data/hourly
> > >> > > > >
> > >> > > > > Be aware though that the more data you pass to STAT-Analysis, the longer
> > >> > > > > it'll take for it to process it.  You can decide how much data you pass it
> > >> > > > > for each job.  I'd suggest starting with what is most convenient for you.
> > >> > > > > If it's too slow, change the logic to pass it less data (e.g. only 1 day of
> > >> > > > > data rather than 1 month of data).
> > >> > > > >
> > >> > > > > Yes, you can give it a date range.  Use -fcst_init_beg
and
> > >> > > -fcst_init_end
> > >> > > > > to specify beginning/ending model initialization times
or
> > >> > > -fcst_valid_beg
> > >> > > > > and -fcst_valid_end to specify beginning/ending valid
times.
> > >> > > > >
> > >> > > > > If you find that you're running multiple jobs on the same subset of data
> > >> > > > > (e.g. process MPR to CNT, MPR to SL1L2, MPR to CTC, MPR to CTS), it'd be
> > >> > > > > more efficient to group those jobs into a config file.  That'll do the
> > >> > > > > filtering ONCE and write the filtered data to a temp file.  Then all the
> > >> > > > > jobs read data from the temp file instead of starting over from scratch.
> > >> > > > >
> > >> > > > > Make sense?
> > >> > > > >
> > >> > > > > John
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > > On Mon, Apr 23, 2018 at 1:01 PM, Rosalyn MacCracken -
NOAA
> > >> Affiliate
> > >> > > via
> > >> > > > RT
> > >> > > > > <met_help at ucar.edu> wrote:
> > >> > > > >
> > >> > > > > >
> > >> > > > > > <URL: https://rt.rap.ucar.edu/rt/
> Ticket/Display.html?id=84822
> > >
> > >> > > > > >
> > >> > > > > > Hi John,
> > >> > > > > >
> > >> > > > > > That's actually only partially correct.  It's not
that I
> want
> > to
> > >> > use
> > >> > > > part
> > >> > > > > > of the MPR lines and discard the rest, and I do need
to
> > >> regenerate
> > >> > > > > > statistics.  Let me try to re-explain.
> > >> > > > > >
> > >> > > > > > Back in early March we switched from getting our
ASCAT obs
> > from
> > >> the
> > >> > > > > > prepbufr data, to getting it from the MGDRLITE data.
So,
> > >> processing
> > >> > > > > didn't
> > >> > > > > > change.  I was producing statistics at certain
threshold
> > levels
> > >> for
> > >> > > > both
> > >> > > > > > GFS and ASCAT.  I had this set with the cat_thresh
list, at
> > >> levels
> > >> > of
> > >> > > > > > 0,6,17, etc.  We found out after processing for a
couple of
> > >> weeks
> > >> > > that
> > >> > > > > the
> > >> > > > > > ASCAT data included these really small values, <1.0
m/s, and
> > >> that
> > >> > > these
> > >> > > > > > small wind speeds were being included into the
statistics
> > >> > processing.
> > >> > > > > >
> > >> > > > > > So, a couple of questions.
> > >> > > > > > 1) Do I have to regenerate all of my statistics
(*.cts,
> *.cnt
> > >> and
> > >> > > *ctc
> > >> > > > > > files) because of this error? Or, since I have
threshold
> > levels
> > >> > set,
> > >> > > > will
> > >> > > > > > those small values be amoung the statistics in the
lowest
> > >> > thresholds?
> > >> > > > > > 2) I have the *.stat files, but, they are spread out
into
> > >> separate
> > >> > > > > > directories like:
> > >> > > > > > /GFS/data/hourly/${YYYYMMDDHH}/*.stat
> > >> > > > > > Can I tell stat-analysis to "lookin" directories with
a
> > wildcard
> > >> > > (like
> > >> > > > > > 201803*)?  If so, how?  Or, if I tell it to look in
> > >> > /GFS/data/hourly,
> > >> > > > > will
> > >> > > > > > it look in all the directories recursively under
hourly?
> And,
> > >> if
> > >> > > > that's
> > >> > > > > > the case, can I give it a date range, so, that it
only
> > processes
> > >> > data
> > >> > > > > from
> > >> > > > > > March?
> > >> > > > > >
> > >> > > > > > Roz
> > >> > > > > >
> > >> > > > > > On Mon, Apr 23, 2018 at 2:18 PM, John Halley Gotway
via RT <
> > >> > > > > > met_help at ucar.edu> wrote:
> > >> > > > > >
> > >> > > > > > > Hi Roz,
> > >> > > > > > >
> > >> > > > > > > I read that you've run Point-Stat and saved off the
> matched
> > >> pairs
> > >> > > > (MPR)
> > >> > > > > > > output line type.  And you'd like to (1) filter
those MPR
> > >> lines
> > >> > to
> > >> > > > > > discard
> > >> > > > > > > some of them and then (2) use the filtered data to
> > regenerate
> > >> > > summary
> > >> > > > > > > statistics.  Yes, this is easily done using the
> > STAT-Analysis
> > >> > tool
> > >> > > in
> > >> > > > > > MET.
> > >> > > > > > >
> > >> > > > > > > You wrote that you're verifying wind speeds against
ASCAT
> > and
> > >> > that
> > >> > > > > you'd
> > >> > > > > > > like to exclude pairs where the observed wind speed
is
> less
> > >> than
> > >> > 1
> > >> > > > m/s.
> > >> > > > > > > I'm just guessing here, but I'll presume that you
want to
> > >> produce
> > >> > > > both
> > >> > > > > > > SL1L2 and CNT output line types.  Here's what the
> > >> STAT-Analysis
> > >> > job
> > >> > > > > would
> > >> > > > > > > look like:
> > >> > > > > > >
> > >> > > > > > > # Filter MPR lines and write SL1L2 output lines.
> > >> > > > > > > #   -lookin         a .stat filename or a directory containing them
> > >> > > > > > > #   -job            job type is aggregate_stat
> > >> > > > > > > #   -line_type      input line type = MPR
> > >> > > > > > > #   -out_line_type  output line type = SL1L2 partial sums
> > >> > > > > > > #   -fcst_var       only process lines where the FCST_VAR column = WIND
> > >> > > > > > > #   -column_thresh  only use MPR lines where the OBS column > 1
> > >> > > > > > > #   -by             run this same job for each unique combination of these columns
> > >> > > > > > > stat_analysis \
> > >> > > > > > >    -lookin input.stat \
> > >> > > > > > >    -job aggregate_stat \
> > >> > > > > > >    -line_type MPR \
> > >> > > > > > >    -out_line_type SL1L2 \
> > >> > > > > > >    -fcst_var WIND \
> > >> > > > > > >    -column_thresh OBS gt1 \
> > >> > > > > > >    -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > >> > > > > > >    -out_stat MPR_to_SL1L2.stat
> > >> > > > > > >
> > >> > > > > > > This will produce an output .stat file
containing an
> > >> SL1L2
> > >> > > line
> > >> > > > > for
> > >> > > > > > > each unique combination of the header columns
listed after
> > the
> > >> > > "-by"
> > >> > > > > > > option.  To generate CNT output lines instead,
you'd run a
> > >> second
> > >> > > job
> > >> > > > > > where
> > >> > > > > > > you replace SL1L2 with CNT.  You could run these
jobs on
> the
> > >> > > command
> > >> > > > > line
> > >> > > > > > > or group them together into a STAT-Analysis config
file,
> if
> > >> you
> > >> > > > prefer.
> > >> > > > > > > Both would work.
> > >> > > > > > >
> > >> > > > > > > You could run this once for each input .stat file
you're
> > >> > > > processing...
> > >> > > > > or
> > >> > > > > > > you could pass many input .stat files to the job.
Since
> > >> > > > FCST_INIT_BEG
> > >> > > > > > and
> > >> > > > > > > FCST_LEAD are included in the "-by" option, you'll
get
> > >> separate
> > >> > > > output
> > >> > > > > > > lines for each unique time.
> > >> > > > > > >
> > >> > > > > > > Hope that helps get you going.
> > >> > > > > > >
> > >> > > > > > > Thanks,
> > >> > > > > > > John
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > > On Thu, Apr 19, 2018 at 9:23 AM, Julie Prestopnik
via RT <
> > >> > > > > > > met_help at ucar.edu>
> > >> > > > > > > wrote:
> > >> > > > > > >
> > >> > > > > > > >
> > >> > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Tic
> > >> ket/Display.html?id=84822
> > >> > >
> > >> > > > > > > >
> > >> > > > > > > > Hi Roz.  My apologies for the delay in
responding.
> > >> > > > > > > >
> > >> > > > > > > > Unfortunately, John is out of the office this
week, and
> I
> > do
> > >> > not
> > >> > > > know
> > >> > > > > > the
> > >> > > > > > > > answers to your questions.  As you said, I would
also
> > >> imagine
> > >> > > that
> > >> > > > > > > > point-stat is using those small values as matched
pairs.
> > >> > Also, I
> > >> > > > do
> > >> > > > > > not
> > >> > > > > > > > believe there is a way to regenerate the point-
stat
> > >> statistics
> > >> > > > > without
> > >> > > > > > > > using the original GFS data.  I cannot say with
> certainty,
> > >> > > however.
> > >> > > > > > > Thank
> > >> > > > > > > > you for your patience in advance.  We'll get a
definite
> > >> > response
> > >> > > to
> > >> > > > > you
> > >> > > > > > > as
> > >> > > > > > > > soon as we can.
> > >> > > > > > > >
> > >> > > > > > > > Thanks,
> > >> > > > > > > > Julie
> > >> > > > > > > >
> > >> > > > > > > > On Wed, Apr 18, 2018 at 6:31 AM, Rosalyn
MacCracken -
> NOAA
> > >> > > > Affiliate
> > >> > > > > > via
> > >> > > > > > > RT
> > >> > > > > > > > <met_help at ucar.edu> wrote:
> > >> > > > > > > >
> > >> > > > > > > > >
> > >> > > > > > > > > Wed Apr 18 06:31:39 2018: Request 84822 was
acted
> upon.
> > >> > > > > > > > > Transaction: Ticket created by
> > >> rosalyn.maccracken at noaa.gov
> > >> > > > > > > > >        Queue: met_help
> > >> > > > > > > > >      Subject: question on regenerating data
> > >> > > > > > > > >        Owner: Nobody
> > >> > > > > > > > >   Requestors: rosalyn.maccracken at noaa.gov
> > >> > > > > > > > >       Status: new
> > >> > > > > > > > >  Ticket <URL: https://rt.rap.ucar.edu/rt/
> > >> > > > > > Ticket/Display.html?id=84822
> > >> > > > > > > >
> > >> > > > > > > > >
> > >> > > > > > > > >
> > >> > > > > > > > > Hi,
> > >> > > > > > > > >
> > >> > > > > > > > > I'm running point-stat using ASCAT and GFS data
to
> > verify
> > >> > > surface
> > >> > > > > > wind
> > >> > > > > > > > > speeds.  I found an error in my ASCAT input
data that
> > goes
> > >> > back
> > >> > > > to
> > >> > > > > > Mar
> > >> > > > > > > 7.
> > >> > > > > > > > > I had switched the input source of the data,
and
> within
> > >> the
> > >> > new
> > >> > > > > data
> > >> > > > > > > > files,
> > >> > > > > > > > > it was allowing very small values (< 1 m/s) to
be used
> > as
> > >> > data
> > >> > > > > points
> > >> > > > > > > in
> > >> > > > > > > > > the verification.  I imagine that this is an
issue,
> > since
> > >> > > > > point-stat
> > >> > > > > > is
> > >> > > > > > > > > using these very small values as matched pairs
with
> the
> > >> GFS,
> > >> > > > > correct?
> > >> > > > > > > > >
> > >> > > > > > > > > Is there a way to regenerate the point-stat
statistics
> > >> > without
> > >> > > > > using
> > >> > > > > > > the
> > >> > > > > > > > > original GFS data?  I do have the *stat and the
*mpr
> > >> files,
> > >> > and
> > >> > > > it
> > >> > > > > is
> > >> > > > > > > > > pretty easy to identify where the bad values
are
> > located.
> > >> > > > > > > > >
> > >> > > > > > > > > Thanks,
> > >> > > > > > > > > Roz
> > >> > > > > > > > >
> > >> > > > > > > > > --
> > >> > > > > > > > > Rosalyn MacCracken
> > >> > > > > > > > > Support Scientist
> > >> > > > > > > > >
> > >> > > > > > > > > Ocean Applications Branch
> > >> > > > > > > > > NOAA/NWS Ocean Prediction Center
> > >> > > > > > > > > NCWCP
> > >> > > > > > > > > 5830 University Research Ct
> > >> > > > > > > > > College Park, MD  20740-3818
> > >> > > > > > > > >
> > >> > > > > > > > > (p) 301-683-1551
> > >> > > > > > > > > rosalyn.maccracken at noaa.gov

--
Rosalyn MacCracken
Support Scientist

Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD  20740-3818

(p) 301-683-1551
rosalyn.maccracken at noaa.gov

------------------------------------------------
Subject: question on regenerating data
From: John Halley Gotway
Time: Wed Apr 25 09:40:38 2018

Roz,

I think it'd take just as long.  The slow part is reading the data,
not applying a threshold.

John
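
Since reading the data is the slow part, the points made in this thread (one output line type per job, one job per threshold, and a distinct -out_stat name per job so nothing is overwritten) can be collected into a single list of jobs. The following is a minimal dry-run sketch, not a tested MET workflow: it only prints job strings and never calls stat_analysis. The threshold values and output filenames are hypothetical, and the -out_fcst_thresh / -out_obs_thresh option names are an assumption to check against the MET User's Guide for your version.

```shell
#!/bin/sh
# Dry run: print STAT-Analysis job strings, one per output line type and
# threshold. Nothing here calls stat_analysis itself.

# Options shared by every job (taken from the jobs quoted in this thread).
common="-job aggregate_stat -line_type MPR -fcst_var WIND -column_thresh OBS gt1"

# Hypothetical categorical thresholds; substitute your own cat_thresh values.
thresholds="ge6 ge17"

make_jobs() {
    # CNT is a continuous line type, so it takes no categorical threshold.
    echo "$common -out_line_type CNT -out_stat MPR_to_CNT.stat"
    for thresh in $thresholds; do
        for line_type in CTC CTS; do
            # Assumed option names; verify them in your MET User's Guide.
            echo "$common -out_line_type $line_type" \
                 "-out_fcst_thresh $thresh -out_obs_thresh $thresh" \
                 "-out_stat MPR_to_${line_type}_${thresh}.stat"
        done
    done
}

make_jobs
```

Each printed string could then be used as one entry in the jobs list of a STAT-Analysis config file, so that a single "-config" run reads the input data once for all of the jobs.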

On Wed, Apr 25, 2018 at 9:18 AM, Rosalyn MacCracken - NOAA Affiliate
via RT
<met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
>
> Hi John,
>
> Thanks for doing that for me.  I'll take a look at the info you sent
me
> this afternoon.  I'm in the middle of doing something right
now...trying to
> make a different program work.  ;-/
>
> I wonder if it will be quicker than 18 minutes for some of the
thresholds
> that have higher wind speeds, and not as many instances (or 0
instances).
> Or, will it take just as long, since it still needs to read through
the
> entire *.stat file anyway?
>
> Roz
>
> On Tue, Apr 24, 2018 at 7:06 PM, John Halley Gotway via RT <
> met_help at ucar.edu> wrote:
>
> > Hi Roz,
> >
> > Thanks for sending the sample data.  I grabbed it and used it to run
> > some sample jobs:
> >
> > time /d1/johnhg/MET/MET_releases/met-6.0/bin/stat_analysis \
> > -lookin /d1/johnhg/MET/MET_Help/maccracken_data_20180424/opc_test/home/opc_test/data/met_verif/GFS/data/hourly \
> > -config STATAnalysisConfig \
> > -log run_sa.log -v 3
> >
> > I used the "-lookin" option to point to all the data you sent.
> >
> > I've attached the...
> > (1) config file I used
> > (2) log file that was generated
> > (3) output .stat files
> >
> > Looking at the jobs, you'll see that I've included 5 of them...
> > - Generate CNT output
> > - Generate CTC >= 0.0 output
> > - Generate CTS >= 0.0 output
> > - Generate CTC >= 5.5689 output
> > - Generate CTS >= 5.5689 output
> >
> > Unfortunately, you'll need to define separate jobs for each
threshold
> you'd
> > like to use.  Although, you shouldn't use >=0.0 since that's
always true.
> >
> > Also unfortunately, this is pretty slow.  On my machine, it took
like 18
> > minutes for these 5 jobs!
> >
> > Thanks,
> > John
> >
> >
> > On Tue, Apr 24, 2018 at 2:09 PM, Rosalyn MacCracken - NOAA
Affiliate via
> RT
> > <met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > >
> > > Hi John,
> > >
> > > I put my file on the ftp site.  Let me know what you find.
You'll see
> > > those really low OBS values (0.01, 0.02, and so on).
> > >
> > > Thanks!
> > >
> > > Roz
> > >
> > > On Tue, Apr 24, 2018 at 2:53 PM, Rosalyn MacCracken - NOAA
Affiliate <
> > > rosalyn.maccracken at noaa.gov> wrote:
> > >
> > > > Ok, I'll get that over to the ftp site.  I have to make sure
that I
> > find
> > > a
> > > > day that has all the data in it.  Sometimes the data isn't
available
> > when
> > > > the script runs.  A little annoying, but, that's operations...
> > > >
> > > > I'll let you know when I get the file to the ftp site.
> > > >
> > > > Thanks!
> > > >
> > > > Roz
> > > >
> > > > On Tue, Apr 24, 2018 at 2:49 PM, John Halley Gotway via RT <
> > > > met_help at ucar.edu> wrote:
> > > >
> > > >> Roz,
> > > >>
> > > >> Yes, we do.  Follow the instructions here:
> > > >>    https://dtcenter.org/met/users/support/met_help.php#ftp
> > > >>
> > > > >> I'd suggest making a tar file for one day and posting it to the ftp
> > > > >> site:
> > > >>    tar -cvzf sample.tar.gz /GFS/data/hourly/20180305*
> > > >>
> > > >> Thanks,
> > > >> John
> > > >>
> > > >> On Tue, Apr 24, 2018 at 11:57 AM, Rosalyn MacCracken - NOAA
> Affiliate
> > > via
> > > >> RT <met_help at ucar.edu> wrote:
> > > >>
> > > >> >
> > > >> > <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > >> >
> > > > >> > Hi John,
> > > >> >
> > > >> > Yes, it does seem that the -config option is the way to go
to
> > recreate
> > > >> > those 3 files. I'll be sure to have a unique file name, or,
mv the
> > > >> output
> > > >> > file to a different name before running the command again.
Thanks
> > for
> > > >> > pointing that out.
> > > >> >
> > > > >> > I'm teleworking for the next couple of weeks, so I can't download
> > > > >> > and send you *.stat files like I can when I'm at my computer at
> > > > >> > work.  I don't have
> > > >> > access to theia or wcoss anymore.  You have an ftp server
that I
> can
> > > >> upload
> > > >> > data to, right?  If not, I can try and fiddle around with
this
> > > tomorrow
> > > >> and
> > > >> > see if I can't get this to work the way I want to.
> > > >> >
> > > >> > Roz
> > > >> >
> > > >> > On Tue, Apr 24, 2018 at 11:42 AM, John Halley Gotway via RT
<
> > > >> > met_help at ucar.edu> wrote:
> > > >> >
> > > >> > > Roz,
> > > >> > >
> > > >> > > Each "-job aggregate_stat" only generates a single output
line
> > type.
> > > >> So
> > > >> > > using "-out_line_type CTC,CTS,CNT" will not work.
> > > >> > >
> > > >> > > You'll need to run separate jobs for each output line
type you
> > want
> > > to
> > > >> > > generate.  That's why I'd recommend grouping those
multiple jobs
> > > >> together
> > > >> > > into a single STAT-Analysis config file.  Then you'd call
> > > >> STAT-Analysis
> > > >> > > once using the "-config" command line option.
> > > >> > >
> > > >> > > Another issue is that if you set "-out_stat" to the same
> filename,
> > > >> it'll
> > > > >> > > get overwritten by each job.  STAT-Analysis will overwrite
that
> > > output
> > > >> > file
> > > >> > > rather than appending to it.
> > > >> > >
> > > >> > > You could send me a day's worth of .stat output files
> > > >> > > (/GFS/data/hourly/20180305*) and I could send you some
> > suggestions.
> > > >> Or
> > > >> > if
> > > >> > > you have access to theia you could copy them up there and
point
> me
> > > to
> > > >> it.
> > > >> > >
> > > >> > > Thanks,
> > > >> > > John
> > > >> > >
> > > >> > > On Tue, Apr 24, 2018 at 7:48 AM, Rosalyn MacCracken -
NOAA
> > Affiliate
> > > >> via
> > > >> > RT
> > > >> > > <met_help at ucar.edu> wrote:
> > > >> > >
> > > >> > > >
> > > >> > > > <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822
> >
> > > >> > > >
> > > >> > > > Hi John,
> > > >> > > >
> > > >> > > > Yes, that makes sense.  Those very small values (<1.0
m/s),
> are
> > > bad
> > > >> > > > values.  That's why they shouldn't be included in the
> > processing.
> > > >> > > >
> > > >> > > > So, I need to just regenerate hourly data, one hour at
a time.
> > > >> Would
> > > >> > it
> > > >> > > > make sense to use a shell script and loop stat-
analysis?
> > > Something
> > > >> > like:
> > > >> > > >
> > > >> > > > for day in 11 12
> > > >> > > > do
> > > >> > > >   for cycle in 00 06 12 18
> > > >> > > >   do
> > > > >> > > > stat_analysis -lookin /GFS/data/hourly/201803${day}${hour}/*.stat \
> > > >> > > > -job aggregate_stat \
> > > >> > > >    -line_type MPR \
> > > >> > > >    -out_line_type CTC,CTS,CNT \
> > > >> > > >   -fcst_var WIND \
> > > >> > > > -column_thresh OBS gt1 \
> > > > >> > > >  -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS
> > > >> > > > -out_stat /new_rerun_stat_files/MPR_to_CTC_CTS_CNT.stat
> > > >> > > >   done
> > > >> > > > done
> > > >> > > >
> > > >> > > > or, something like that?  And, will this regenerate
hour
> > > forecasts,
> > > >> at
> > > >> > > each
> > > >> > > > forecast and lead hour?  I guess it will see the
forecast and
> > lead
> > > >> hour
> > > >> > > > from the *.stat file, and whatever *stat file is in the
> > directory,
> > > >> it
> > > >> > > will
> > > >> > > > regenerate those hours, right?
> > > >> > > >
> > > >> > > > So, I need to regenerate the CTC, CNT and CTS files.
That's
> > why I
> > > >> did:
> > > >> > > >  -out_line_type CTC,CTS,CNT
> > > >> > > > but, will that make 3 separate files, or just another
*.stat
> > file?
> > > >> > > >
> > > >> > > > Roz
> > > >> > > >
> > > >> > > >
> > > >> > > > On Mon, Apr 23, 2018 at 4:01 PM, John Halley Gotway via
RT <
> > > >> > > > met_help at ucar.edu> wrote:
> > > >> > > >
> > > >> > > > > Roz,
> > > >> > > > >
> > > >> > > > > It is ultimately up to you to decide which matched
pairs you
> > > want
> > > >> to
> > > >> > > > > include in your processing.  Do you consider those
small
> (<1.0
> > > >> m/s)
> > > >> > > > > observation values to be corrupt and incorrect in
some way
> or
> > > just
> > > >> > not
> > > >> > > > very
> > > >> > > > > interesting?  If they really are BAD data values, I
agree
> that
> > > you
> > > >> > > should
> > > >> > > > > exclude them from your analysis.  But if they're just
> > > >> uninteresting
> > > >> > > > values
> > > >> > > > > of low wind speed, then there's no reason why you
should
> > exclude
> > > >> > them.
> > > >> > > > For
> > > > >> > > > > example, *most* of the time it isn't raining, but we often
> > > > >> > > > > include observations of 0 precip.
> > > >> > > > >
> > > >> > > > > There are three configurable options in Point-Stat
that may
> be
> > > >> useful
> > > >> > > > here:
> > > >> > > > > (1) You already know and use the "cat_thresh" option.
This
> > > >> threshold
> > > >> > > > > defines the events and non-events for a 2x2
contingency
> table.
> > > >> This
> > > >> > > > > threshold affects the contents of FHO, CTC, CTS,
MCTC, and
> > MCTS
> > > >> line
> > > >> > > > types
> > > >> > > > > that Point-Stat writes.
> > > >> > > > > (2) The "cnt_thresh" option is a more recent
addition.
> > Perhaps
> > > >> this
> > > >> > > was
> > > >> > > > a
> > > >> > > > > poor name choice, but instead of defining categories,
it's
> > > really
> > > >> a
> > > >> > > > > *filtering* threshold.  This threshold affects the
contents
> of
> > > the
> > > >> > > SL1L2,
> > > >> > > > > SAL1L2, and CNT line types that Point-Stat writes.
For
> > example,
> > > >> > > setting
> > > >> > > > > "cnt_thresh = [ ge6, ge17 ];" will produce 2 CNT and
2 SL1L2
> > > >> output
> > > >> > > lines
> > > >> > > > > containing only those points where the wind speed was
>=6
> and
> > > >> >=17,
> > > >> > > > > respectively.
> > > >> > > > > (3) The "wind_thresh" option is very similar to the
> > "cnt_thresh"
> > > >> > option
> > > >> > > > but
> > > > >> > > > > affects the contents of the VL1L2, VAL1L2, and VCNT
(new in
> > > >> met-7.0)
> > > >> > > line
> > > >> > > > > types.  Only those U/V pairs that meet the specified
wind
> > speed
> > > >> > > threshold
> > > >> > > > > are included in the output.
> > > >> > > > >
> > > >> > > > > For both "cnt_thresh" and "wind_thresh", the default
value
> in
> > > the
> > > >> > > config
> > > >> > > > > file is "NA", meaning, do not apply any filtering
threshold
> > > >> criteria.
> > > >> > > > >
> > > >> > > > > You have the flexibility to run STAT-Analysis on the
MPR
> > output
> > > >> lines
> > > >> > > to
> > > >> > > > > recompute any of these output line types applying
whatever
> > > >> filtering
> > > >> > > > > criteria you'd like.
> > > > >> > > > > Here's the MET user's guide:
> > > > >> > > > > https://dtcenter.org/met/users/docs/users_guide/MET_Users_Guide_v7.0.pdf
> > > > >> > > > > Look on page 98 for the job command options for the
> > > > >> > > > > "aggregate_stat" job type when the input line type is "MPR".
> > > >> > > > >
> > > >> > > > > For your second question, the "-lookin PATH" option
is
> *VERY*
> > > >> > flexible.
> > > >> > > > > You can set PATH to either a single value or multiple
> values.
> > > If
> > > >> you
> > > >> > > use
> > > >> > > > > wildcards, then the shell expands those wildcards to
> multiple
> > > >> values.
> > > >> > > > Each
> > > >> > > > > value you pass in can either be a filename or a
directory
> > name.
> > > >> If
> > > >> > you
> > > >> > > > > pass in a filename, STAT-Analysis will read it
*REGARDLESS*
> of
> > > the
> > > >> > file
> > > >> > > > > extension.  If you pass in a directory name, STAT-
Analysis
> > will
> > > >> > search
> > > >> > > > that
> > > >> > > > > directory *RECURSIVELY* for files ending in ".stat".
For
> > > example,
> > > >> > > either
> > > >> > > > > of the following settings would tell STAT-Analysis to
read
> the
> > > >> same
> > > >> > > list
> > > >> > > > of
> > > >> > > > > files:
> > > >> > > > >    -lookin /GFS/data/hourly/*/*.stat
> > > >> > > > >    ... or ...
> > > >> > > > >    -lookin /GFS/data/hourly
> > > >> > > > >
> > > >> > > > > Be aware though that the more data you pass to
> STAT-Analysis,
> > > the
> > > >> > > longer
> > > >> > > > > it'll take for it to process it.  You can decide how
much
> data
> > > you
> > > >> > pass
> > > >> > > > it
> > > >> > > > > for each job.  I'd suggest starting with what is most
> > convenient
> > > >> for
> > > >> > > you.
> > > >> > > > > If it's too slow, change the logic to pass it less
data
> (e.g.
> > > >> only 1
> > > >> > > day
> > > >> > > > of
> > > >> > > > > data rather than 1 month of data).
> > > >> > > > >
> > > >> > > > > Yes, you can give it a date range.  Use
-fcst_init_beg and
> > > >> > > -fcst_init_end
> > > >> > > > > to specify beginning/ending model initialization
times or
> > > >> > > -fcst_valid_beg
> > > >> > > > > and -fcst_valid_end to specify beginning/ending valid
times.
> > > >> > > > >
> > > >> > > > > If you find that you're running multiple jobs on the
same
> > subset
> > > >> of
> > > >> > > data
> > > >> > > > > (e.g. process MPR to CNT, MPR to SL1L2, MPR to CTC,
MPR to
> > CTS),
> > > >> it'd
> > > >> > > be
> > > >> > > > > more efficient to group those jobs into a config
file.
> > That'll
> > > do
> > > >> > the
> > > >> > > > > filtering ONCE and write the filtered data to a temp
file.
> > Then
> > > >> all
> > > >> > > the
> > > > >> > > > > jobs read data from the temp file instead of starting over
from
> > > >> scratch.
> > > >> > > > >
> > > >> > > > > Make sense?
> > > >> > > > >
> > > >> > > > > John
> > > >> > > > >
> > > >> > > > >
> > > >> > > > >
> > > >> > > > > On Mon, Apr 23, 2018 at 1:01 PM, Rosalyn MacCracken -
NOAA
> > > >> Affiliate
> > > >> > > via
> > > >> > > > RT
> > > >> > > > > <met_help at ucar.edu> wrote:
> > > >> > > > >
> > > >> > > > > >
> > > >> > > > > > <URL: https://rt.rap.ucar.edu/rt/
> > Ticket/Display.html?id=84822
> > > >
> > > >> > > > > >
> > > >> > > > > > Hi John,
> > > >> > > > > >
> > > >> > > > > > That's actually only partially correct.  It's not
that I
> > want
> > > to
> > > >> > use
> > > >> > > > part
> > > >> > > > > > of the MPR lines and discard the rest, and I do
need to
> > > >> regenerate
> > > >> > > > > > statistics.  Let me try to re-explain.
> > > >> > > > > >
> > > >> > > > > > Back in early March we switched from getting our
ASCAT obs
> > > from
> > > >> the
> > > >> > > > > > prepbufr data, to getting it from the MGDRLITE
data. So,
> > > >> processing
> > > >> > > > > didn't
> > > >> > > > > > change.  I was producing statistics at certain
threshold
> > > levels
> > > >> for
> > > >> > > > both
> > > >> > > > > > GFS and ASCAT.  I had this set with the cat_thresh
list,
> at
> > > >> levels
> > > >> > of
> > > >> > > > > > 0,6,17, etc.  We found out after processing for a
couple
> of
> > > >> weeks
> > > >> > > that
> > > >> > > > > the
> > > >> > > > > > ASCAT data included these really small values, <1.0
m/s,
> and
> > > >> that
> > > >> > > these
> > > >> > > > > > small wind speeds were being included into the
statistics
> > > >> > processing.
> > > >> > > > > >
> > > >> > > > > > So, a couple of questions.
> > > >> > > > > > 1) Do I have to regenerate all of my statistics
(*.cts,
> > *.cnt
> > > >> and
> > > >> > > *ctc
> > > >> > > > > > files) because of this error? Or, since I have
threshold
> > > levels
> > > >> > set,
> > > >> > > > will
> > > > >> > > > > > those small values be among the statistics in the
lowest
> > > >> > thresholds?
> > > >> > > > > > 2) I have the *.stat files, but, they are spread
out into
> > > >> separate
> > > >> > > > > > directories like:
> > > >> > > > > > /GFS/data/hourly/${YYYYMMDDHH}/*.stat
> > > >> > > > > > Can I tell stat-analysis to "lookin" directories
with a
> > > wildcard
> > > >> > > (like
> > > > >> > > > > > 201803*)?  If so, how?  Or, if I tell it to look in
> > > >> > /GFS/data/hourly,
> > > >> > > > > will
> > > >> > > > > > it look in all the directories recursively under
hourly?
> > And,
> > > > >> if
> > > >> > > > that's
> > > >> > > > > > the case, can I give it a date range, so, that it
only
> > > processes
> > > >> > data
> > > >> > > > > from
> > > >> > > > > > March?
> > > >> > > > > >
> > > >> > > > > > Roz
> > > >> > > > > >
> > > >> > > > > > On Mon, Apr 23, 2018 at 2:18 PM, John Halley Gotway
via
> RT <
> > > >> > > > > > met_help at ucar.edu> wrote:
> > > >> > > > > >
> > > >> > > > > > > Hi Roz,
> > > >> > > > > > >
> > > >> > > > > > > I read that you've run Point-Stat and saved off
the
> > matched
> > > >> pairs
> > > >> > > > (MPR)
> > > >> > > > > > > output line type.  And you'd like to (1) filter
those
> MPR
> > > >> lines
> > > >> > to
> > > >> > > > > > discard
> > > >> > > > > > > some of them and then (2) use the filtered data
to
> > > regenerate
> > > >> > > summary
> > > >> > > > > > > statistics.  Yes, this is easily done using the
> > > STAT-Analysis
> > > >> > tool
> > > >> > > in
> > > >> > > > > > MET.
> > > >> > > > > > >
> > > >> > > > > > > You wrote that you're verifying wind speeds
against
> ASCAT
> > > and
> > > >> > that
> > > >> > > > > you'd
> > > >> > > > > > > like to exclude pairs where the observed wind
speed is
> > less
> > > >> than
> > > >> > 1
> > > >> > > > m/s.
> > > >> > > > > > > I'm just guessing here, but I'll presume that you
want
> to
> > > >> produce
> > > >> > > > both
> > > >> > > > > > > SL1L2 and CNT output line types.  Here's what the
> > > >> STAT-Analysis
> > > >> > job
> > > >> > > > > would
> > > >> > > > > > > look like:
> > > >> > > > > > >
> > > > >> > > > > > > # Filter MPR lines and write SL1L2 output lines.
> > > > >> > > > > > > #   -lookin         a .stat filename or a directory containing them
> > > > >> > > > > > > #   -job            job type is aggregate_stat
> > > > >> > > > > > > #   -line_type      input line type = MPR
> > > > >> > > > > > > #   -out_line_type  output line type = SL1L2 partial sums
> > > > >> > > > > > > #   -fcst_var       only process lines where the FCST_VAR column = WIND
> > > > >> > > > > > > #   -column_thresh  only use MPR lines where the OBS column > 1
> > > > >> > > > > > > #   -by             run this same job for each unique combination of these columns
> > > > >> > > > > > > stat_analysis \
> > > > >> > > > > > >    -lookin input.stat \
> > > > >> > > > > > >    -job aggregate_stat \
> > > > >> > > > > > >    -line_type MPR \
> > > > >> > > > > > >    -out_line_type SL1L2 \
> > > > >> > > > > > >    -fcst_var WIND \
> > > > >> > > > > > >    -column_thresh OBS gt1 \
> > > > >> > > > > > >    -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > >> > > > > > >    -out_stat MPR_to_SL1L2.stat
> > > >> > > > > > >
> > > > >> > > > > > > This will produce an output .stat file
containing
> an
> > > >> SL1L2
> > > >> > > line
> > > >> > > > > for
> > > >> > > > > > > each unique combination of the header columns
listed
> after
> > > the
> > > >> > > "-by"
> > > >> > > > > > > option.  To generate CNT output lines instead,
you'd
> run a
> > > >> second
> > > >> > > job
> > > >> > > > > > where
> > > >> > > > > > > you replace SL1L2 with CNT.  You could run these
jobs on
> > the
> > > >> > > command
> > > >> > > > > line
> > > >> > > > > > > or group them together into a STAT-Analysis
config file,
> > if
> > > >> you
> > > >> > > > prefer.
> > > >> > > > > > > Both would work.
> > > >> > > > > > >
> > > >> > > > > > > You could run this once for each input .stat file
you're
> > > >> > > > processing...
> > > >> > > > > or
> > > >> > > > > > > you could pass many input .stat files to the job.
Since
> > > >> > > > FCST_INIT_BEG
> > > >> > > > > > and
> > > >> > > > > > > FCST_LEAD are included in the "-by" option,
you'll get
> > > >> separate
> > > >> > > > output
> > > >> > > > > > > lines for each unique time.
> > > >> > > > > > >
> > > >> > > > > > > Hope that helps get you going.
> > > >> > > > > > >
> > > >> > > > > > > Thanks,
> > > >> > > > > > > John
> > > >> > > > > > >
> > > >> > > > > > >
> > > >> > > > > > > On Thu, Apr 19, 2018 at 9:23 AM, Julie Prestopnik
via
> RT <
> > > >> > > > > > > met_help at ucar.edu>
> > > >> > > > > > > wrote:
> > > >> > > > > > >
> > > >> > > > > > > >
> > > >> > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Tic
> > > >> ket/Display.html?id=84822
> > > >> > >
> > > >> > > > > > > >
> > > >> > > > > > > > Hi Roz.  My apologies for the delay in
responding.
> > > >> > > > > > > >
> > > >> > > > > > > > Unfortunately, John is out of the office this
week,
> and
> > I
> > > do
> > > >> > not
> > > >> > > > know
> > > >> > > > > > the
> > > >> > > > > > > > answers to your questions.  As you said, I
would also
> > > >> imagine
> > > >> > > that
> > > >> > > > > > > > point-stat is using those small values as
matched
> pairs.
> > > >> > Also, I
> > > >> > > > do
> > > >> > > > > > not
> > > >> > > > > > > > believe there is a way to regenerate the point-
stat
> > > >> statistics
> > > >> > > > > without
> > > >> > > > > > > > using the original GFS data.  I cannot say with
> > certainty,
> > > >> > > however.
> > > >> > > > > > > Thank
> > > >> > > > > > > > you for your patience in advance.  We'll get a
> definite
> > > >> > response
> > > >> > > to
> > > >> > > > > you
> > > >> > > > > > > as
> > > >> > > > > > > > soon as we can.
> > > >> > > > > > > >
> > > >> > > > > > > > Thanks,
> > > >> > > > > > > > Julie
> > > >> > > > > > > >
> > > >> > > > > > > > On Wed, Apr 18, 2018 at 6:31 AM, Rosalyn
MacCracken -
> > NOAA
> > > >> > > > Affiliate
> > > >> > > > > > via
> > > >> > > > > > > RT
> > > >> > > > > > > > <met_help at ucar.edu> wrote:
> > > >> > > > > > > >
> > > >> > > > > > > > >
> > > >> > > > > > > > > Wed Apr 18 06:31:39 2018: Request 84822 was
acted
> > upon.
> > > >> > > > > > > > > Transaction: Ticket created by
> > > >> rosalyn.maccracken at noaa.gov
> > > >> > > > > > > > >        Queue: met_help
> > > >> > > > > > > > >      Subject: question on regenerating data
> > > >> > > > > > > > >        Owner: Nobody
> > > >> > > > > > > > >   Requestors: rosalyn.maccracken at noaa.gov
> > > >> > > > > > > > >       Status: new
> > > >> > > > > > > > >  Ticket <URL: https://rt.rap.ucar.edu/rt/
> > > >> > > > > > Ticket/Display.html?id=84822
> > > >> > > > > > > >
> > > >> > > > > > > > >
> > > >> > > > > > > > >
> > > >> > > > > > > > > Hi,
> > > >> > > > > > > > >
> > > >> > > > > > > > > I'm running point-stat using ASCAT and GFS
data to
> > > verify
> > > >> > > surface
> > > >> > > > > > wind
> > > >> > > > > > > > > speeds.  I found an error in my ASCAT input
data
> that
> > > goes
> > > >> > back
> > > >> > > > to
> > > >> > > > > > Mar
> > > >> > > > > > > 7.
> > > >> > > > > > > > > I had switched the input source of the data,
and
> > within
> > > >> the
> > > >> > new
> > > >> > > > > data
> > > >> > > > > > > > files,
> > > >> > > > > > > > > it was allowing very small values (< 1 m/s)
to be
> used
> > > as
> > > >> > data
> > > >> > > > > points
> > > >> > > > > > > in
> > > >> > > > > > > > > the verification.  I imagine that this is an
issue,
> > > since
> > > >> > > > > point-stat
> > > >> > > > > > is
> > > >> > > > > > > > > using these very small values as matched
pairs with
> > the
> > > >> GFS,
> > > >> > > > > correct?
> > > >> > > > > > > > >
> > > >> > > > > > > > > Is there a way to regenerate the point-stat
> statistics
> > > >> > without
> > > >> > > > > using
> > > >> > > > > > > the
> > > >> > > > > > > > > original GFS data?  I do have the *stat and
the *mpr
> > > >> files,
> > > >> > and
> > > >> > > > it
> > > >> > > > > is
> > > >> > > > > > > > > pretty easy to identify where the bad values
are
> > > located.
> > > >> > > > > > > > >
> > > >> > > > > > > > > Thanks,
> > > >> > > > > > > > > Roz

------------------------------------------------
Subject: question on regenerating data
From: Rosalyn MacCracken - NOAA Affiliate
Time: Wed Apr 25 10:08:49 2018

Figures.  I just calculated how long it will take me to regenerate the
data for 03072018 - 04122018.  It will take me 912 hours.  ;-(

Ok, I know I asked this already, but say I had an OBS value of 0.01 and a
matched GFS point of 10 m/s, with thresholds of 0-5 m/s, 6-10 m/s and
10-15 m/s, and CSI was calculated.  Which threshold would be used for
the output, the 0-5 or the 6-10?  And would the 10-15 threshold even be
affected?

Roz
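
For reference, the threshold question above can be worked through with a
small sketch.  It assumes each threshold is applied as ">= value" to both
the forecast and the observation, in the style of the cat_thresh levels
(0, 6, 17) mentioned elsewhere in this thread, rather than as a closed
range like 0-5.  The function name classify_pair is hypothetical and is
not part of MET; it only labels which cell of the 2x2 contingency table a
single matched pair would fall into.

```shell
#!/usr/bin/env bash
# Classify one forecast/observation pair into a 2x2 contingency cell.
# Assumption: the event is "value >= threshold" for both forecast and obs.
classify_pair() {
  local fcst="$1" obs="$2" thresh="$3"
  awk -v f="$fcst" -v o="$obs" -v t="$thresh" 'BEGIN {
    fy = (f + 0 >= t + 0)   # forecast event?
    oy = (o + 0 >= t + 0)   # observed event?
    if (fy && oy)       print "hit"
    else if (fy && !oy) print "false_alarm"
    else if (!fy && oy) print "miss"
    else                print "correct_negative"
  }'
}

# The pair from the question: GFS = 10 m/s, OBS = 0.01
for t in 0 6 17; do
  printf '>=%s: %s\n' "$t" "$(classify_pair 10 0.01 "$t")"
done
```

Under that assumption the pair is a hit at >=0, a false alarm at >=6, and
a correct negative at >=17.  Since CSI = hits / (hits + misses + false
alarms), correct negatives do not enter CSI, so the >=17 value would not
change, while the >=6 table would gain one false alarm.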

On Wed, Apr 25, 2018 at 11:40 AM, John Halley Gotway via RT <
met_help at ucar.edu> wrote:

> Roz,
>
> I think it'd take just as long.  The slow part is reading the data... not
> applying a threshold.
>
> John
>
> On Wed, Apr 25, 2018 at 9:18 AM, Rosalyn MacCracken - NOAA Affiliate
via RT
> <met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> >
> > Hi John,
> >
> > Thanks for doing that for me.  I'll take a look at the info you sent me
> > this afternoon.  I'm in the middle of doing something right now...trying
> > to make a different program work.  ;-/
> >
> > I wonder if it will be quicker than 18 minutes for some of the thresholds
> > that have higher wind speeds, and not as many instances (or 0 instances).
> > Or, will it take just as long, since it still needs to read through the
> > entire *.stat file anyway?
> >
> > Roz
> >
> > On Tue, Apr 24, 2018 at 7:06 PM, John Halley Gotway via RT <
> > met_help at ucar.edu> wrote:
> >
> > > Hi Roz,
> > >
> > > > Thanks for sending the sample data.  I grabbed it and used it to run
> > > > some sample jobs:
> > >
> > > > time /d1/johnhg/MET/MET_releases/met-6.0/bin/stat_analysis \
> > > >   -lookin /d1/johnhg/MET/MET_Help/maccracken_data_20180424/opc_test/home/opc_test/data/met_verif/GFS/data/hourly \
> > > >   -config STATAnalysisConfig \
> > > >   -log run_sa.log -v 3
> > >
> > > I used the "-lookin" option to point to all the data you sent.
> > >
> > > I've attached the...
> > > (1) config file I used
> > > > (2) log file that was generated
> > > (3) output .stat files
> > >
> > > Looking at the jobs, you'll see that I've included 5 of them...
> > > - Generate CNT output
> > > - Generate CTC >= 0.0 output
> > > - Generate CTS >= 0.0 output
> > > - Generate CTC >= 5.5689 output
> > > - Generate CTS >= 5.5689 output
> > >
> > > > Unfortunately, you'll need to define separate jobs for each threshold
> > > > you'd like to use.  Although, you shouldn't use >=0.0 since that's
> > > > always true.
> > > >
> > > > Also unfortunately, this is pretty slow.  On my machine, it took like
> > > > 18 minutes for these 5 jobs!
> > >
> > > Thanks,
> > > John
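
The advice in this thread (one job per output line type, one job per
threshold, and a unique -out_stat name per job) can be sketched as a small
generator that only prints job strings and does not call stat_analysis.
The function name print_jobs, the output directory, and the ge6/ge17
thresholds are illustrative assumptions.  The -out_fcst_thresh and
-out_obs_thresh options are assumed from the aggregate_stat job options
described in the MET user's guide; check them against the guide for the
MET version in use.

```shell
#!/usr/bin/env bash
# Print (do not run) one aggregate_stat job per output line type and
# threshold.  Each job gets its own -out_stat file so none is overwritten.
print_jobs() {
  local out_dir="$1"; shift   # hypothetical output directory
  local by="MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS"
  local common="-job aggregate_stat -line_type MPR -fcst_var WIND -column_thresh OBS gt1 -by ${by}"
  local thresh type

  # CNT takes no categorical threshold, so a single job is enough.
  echo "${common} -out_line_type CNT -out_stat ${out_dir}/MPR_to_CNT.stat"

  # CTC and CTS need one job per threshold (remaining arguments).
  for thresh in "$@"; do
    for type in CTC CTS; do
      echo "${common} -out_line_type ${type} -out_fcst_thresh ${thresh} -out_obs_thresh ${thresh} -out_stat ${out_dir}/MPR_to_${type}_${thresh}.stat"
    done
  done
}

print_jobs /new_rerun_stat_files ge6 ge17
```

Each printed line could then be pasted as one quoted entry in the jobs
list of a STAT-Analysis config file and run once with "-config".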
> > >
> > >
> > > On Tue, Apr 24, 2018 at 2:09 PM, Rosalyn MacCracken - NOAA
Affiliate
> via
> > RT
> > > <met_help at ucar.edu> wrote:
> > >
> > > >
> > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > >
> > > > Hi John,
> > > >
> > > > > I put my file on the ftp site.  Let me know what you find.  You'll
> > > > > see those really low OBS values (0.01, 0.02, and so on).
> > > >
> > > > Thanks!
> > > >
> > > > Roz
> > > >
> > > > On Tue, Apr 24, 2018 at 2:53 PM, Rosalyn MacCracken - NOAA
Affiliate
> <
> > > > rosalyn.maccracken at noaa.gov> wrote:
> > > >
> > > > > > Ok, I'll get that over to the ftp site.  I have to make sure that I
> > > > > > find a day that has all the data in it.  Sometimes the data isn't
> > > > > > available when the script runs.  A little annoying, but, that's
> > > > > > operations...
> > > > >
> > > > > I'll let you know when I get the file to the ftp site.
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Roz
> > > > >
> > > > > On Tue, Apr 24, 2018 at 2:49 PM, John Halley Gotway via RT <
> > > > > met_help at ucar.edu> wrote:
> > > > >
> > > > >> Roz,
> > > > >>
> > > > >> Yes, we do.  Follow the instructions here:
> > > > >>    https://dtcenter.org/met/users/support/met_help.php#ftp
> > > > >>
> > > > > >> I'd suggest making a tar file for one day and posting it to the ftp
> > > > > >> site:
> > > > >>    tar -cvzf sample.tar.gz /GFS/data/hourly/20180305*
> > > > >>
> > > > >> Thanks,
> > > > >> John
> > > > >>
> > > > >> On Tue, Apr 24, 2018 at 11:57 AM, Rosalyn MacCracken - NOAA
> > Affiliate
> > > > via
> > > > >> RT <met_help at ucar.edu> wrote:
> > > > >>
> > > > >> >
> > > > > >> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > >> >
> > > > >> > HI John,
> > > > >> >
> > > > > >> > Yes, it does seem that the -config option is the way to go to
> > > > > >> > recreate those 3 files.  I'll be sure to have a unique file name,
> > > > > >> > or mv the output file to a different name before running the
> > > > > >> > command again.  Thanks for pointing that out.
> > > > > >> >
> > > > > >> > I'm teleworking for the next couple of weeks, so I can't download
> > > > > >> > and send you *.stat files like I can when I'm at my computer at
> > > > > >> > work.  I don't have access to theia or wcoss anymore.  You have an
> > > > > >> > ftp server that I can upload data to, right?  If not, I can try and
> > > > > >> > fiddle around with this tomorrow and see if I can't get this to
> > > > > >> > work the way I want to.
> > > > >> >
> > > > >> > Roz
> > > > >> >
> > > > >> > On Tue, Apr 24, 2018 at 11:42 AM, John Halley Gotway via
RT <
> > > > >> > met_help at ucar.edu> wrote:
> > > > >> >
> > > > >> > > Roz,
> > > > >> > >
> > > > > >> > > Each "-job aggregate_stat" only generates a single output line
> > > > > >> > > type.  So using "-out_line_type CTC,CTS,CNT" will not work.
> > > > > >> > >
> > > > > >> > > You'll need to run separate jobs for each output line type you
> > > > > >> > > want to generate.  That's why I'd recommend grouping those
> > > > > >> > > multiple jobs together into a single STAT-Analysis config file.
> > > > > >> > > Then you'd call STAT-Analysis once using the "-config" command
> > > > > >> > > line option.
> > > > > >> > >
> > > > > >> > > Another issue is that if you set "-out_stat" to the same
> > > > > >> > > filename, it'll get overwritten by each job.  STAT-Analysis will
> > > > > >> > > overwrite that output file rather than appending to it.
> > > > > >> > >
> > > > > >> > > You could send me a day's worth of .stat output files
> > > > > >> > > (/GFS/data/hourly/20180305*) and I could send you some
> > > > > >> > > suggestions.  Or if you have access to theia you could copy them
> > > > > >> > > up there and point me to it.
> > > > >> > >
> > > > >> > > Thanks,
> > > > >> > > John
> > > > >> > >
> > > > >> > > On Tue, Apr 24, 2018 at 7:48 AM, Rosalyn MacCracken -
NOAA
> > > Affiliate
> > > > >> via
> > > > >> > RT
> > > > >> > > <met_help at ucar.edu> wrote:
> > > > >> > >
> > > > >> > > >
> > > > > >> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > >> > > >
> > > > >> > > > Hi John,
> > > > >> > > >
> > > > > >> > > > Yes, that makes sense.  Those very small values (<1.0 m/s) are
> > > > > >> > > > bad values.  That's why they shouldn't be included in the
> > > > > >> > > > processing.
> > > > > >> > > >
> > > > > >> > > > So, I need to just regenerate hourly data, one hour at a time.
> > > > > >> > > > Would it make sense to use a shell script and loop
> > > > > >> > > > stat-analysis?  Something like:
> > > > >> > > >
> > > > > >> > > > for day in 11 12
> > > > > >> > > > do
> > > > > >> > > >   for cycle in 00 06 12 18
> > > > > >> > > >   do
> > > > > >> > > >     stat_analysis -lookin /GFS/data/hourly/201803${day}${cycle}/*.stat \
> > > > > >> > > >       -job aggregate_stat \
> > > > > >> > > >       -line_type MPR \
> > > > > >> > > >       -out_line_type CTC,CTS,CNT \
> > > > > >> > > >       -fcst_var WIND \
> > > > > >> > > >       -column_thresh OBS gt1 \
> > > > > >> > > >       -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > > >> > > >       -out_stat /new_rerun_stat_files/MPR_to_CTC_CTS_CNT.stat
> > > > > >> > > >   done
> > > > > >> > > > done
> > > > >> > > >
> > > > > >> > > > or, something like that?  And, will this regenerate hourly
> > > > > >> > > > forecasts, at each forecast and lead hour?  I guess it will see
> > > > > >> > > > the forecast and lead hour from the *.stat file, and whatever
> > > > > >> > > > *stat file is in the directory, it will regenerate those hours,
> > > > > >> > > > right?
> > > > > >> > > >
> > > > > >> > > > So, I need to regenerate the CTC, CNT and CTS files.  That's
> > > > > >> > > > why I did:
> > > > > >> > > >  -out_line_type CTC,CTS,CNT
> > > > > >> > > > but, will that make 3 separate files, or just another *.stat
> > > > > >> > > > file?
> > > > >> > > >
> > > > >> > > > Roz
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > On Mon, Apr 23, 2018 at 4:01 PM, John Halley Gotway
via RT <
> > > > >> > > > met_help at ucar.edu> wrote:
> > > > >> > > >
> > > > >> > > > > Roz,
> > > > >> > > > >
> > > > > >> > > > > It is ultimately up to you to decide which matched pairs you
> > > > > >> > > > > want to include in your processing.  Do you consider those
> > > > > >> > > > > small (<1.0 m/s) observation values to be corrupt and
> > > > > >> > > > > incorrect in some way or just not very interesting?  If they
> > > > > >> > > > > really are BAD data values, I agree that you should exclude
> > > > > >> > > > > them from your analysis.  But if they're just uninteresting
> > > > > >> > > > > values of low wind speed, then there's no reason why you
> > > > > >> > > > > should exclude them.  For example, *most* of the time it
> > > > > >> > > > > isn't raining, but we often include observations of 0 precip.
> > > > >> > > > >
> > > > > >> > > > > There are three configurable options in Point-Stat that may
> > > > > >> > > > > be useful here:
> > > > > >> > > > > (1) You already know and use the "cat_thresh" option.  This
> > > > > >> > > > > threshold defines the events and non-events for a 2x2
> > > > > >> > > > > contingency table.  This threshold affects the contents of
> > > > > >> > > > > the FHO, CTC, CTS, MCTC, and MCTS line types that Point-Stat
> > > > > >> > > > > writes.
> > > > > >> > > > > (2) The "cnt_thresh" option is a more recent addition.
> > > > > >> > > > > Perhaps this was a poor name choice, but instead of defining
> > > > > >> > > > > categories, it's really a *filtering* threshold.  This
> > > > > >> > > > > threshold affects the contents of the SL1L2, SAL1L2, and CNT
> > > > > >> > > > > line types that Point-Stat writes.  For example, setting
> > > > > >> > > > > "cnt_thresh = [ ge6, ge17 ];" will produce 2 CNT and 2 SL1L2
> > > > > >> > > > > output lines containing only those points where the wind
> > > > > >> > > > > speed was >=6 and >=17, respectively.
> > > > > >> > > > > (3) The "wind_thresh" option is very similar to the
> > > > > >> > > > > "cnt_thresh" option but affects the contents of the VL1L2,
> > > > > >> > > > > VAL1L2, and VCNT (new in met-7.0) line types.  Only those
> > > > > >> > > > > U/V pairs that meet the specified wind speed threshold are
> > > > > >> > > > > included in the output.
> > > > > >> > > > >
> > > > > >> > > > > For both "cnt_thresh" and "wind_thresh", the default value
> > > > > >> > > > > in the config file is "NA", meaning, do not apply any
> > > > > >> > > > > filtering threshold criteria.
> > > > >> > > > >
> > > > > >> > > > > You have the flexibility to run STAT-Analysis on the MPR
> > > > > >> > > > > output lines to recompute any of these output line types
> > > > > >> > > > > applying whatever filtering criteria you'd like.
> > > > > >> > > > > Here's the MET user's guide:
> > > > > >> > > > > https://dtcenter.org/met/users/docs/users_guide/MET_Users_Guide_v7.0.pdf
> > > > > >> > > > > Look on page 98 for the job command options for the
> > > > > >> > > > > "aggregate_stat" job type when the input line type is "MPR".
> > > > >> > > > >
> > > > > >> > > > > For your second question, the "-lookin PATH" option is
> > > > > >> > > > > *VERY* flexible.  You can set PATH to either a single value
> > > > > >> > > > > or multiple values.  If you use wildcards, then the shell
> > > > > >> > > > > expands those wildcards to multiple values.  Each value you
> > > > > >> > > > > pass in can either be a filename or a directory name.  If
> > > > > >> > > > > you pass in a filename, STAT-Analysis will read it
> > > > > >> > > > > *REGARDLESS* of the file extension.  If you pass in a
> > > > > >> > > > > directory name, STAT-Analysis will search that directory
> > > > > >> > > > > *RECURSIVELY* for files ending in ".stat".  For example,
> > > > > >> > > > > either of the following settings would tell STAT-Analysis
> > > > > >> > > > > to read the same list of files:
> > > > > >> > > > >    -lookin /GFS/data/hourly/*/*.stat
> > > > > >> > > > >    ... or ...
> > > > > >> > > > >    -lookin /GFS/data/hourly
> > > > >> > > > >
> > > > > >> > > > > Be aware though that the more data you pass to
> > > > > >> > > > > STAT-Analysis, the longer it'll take for it to process it.
> > > > > >> > > > > You can decide how much data you pass it for each job.  I'd
> > > > > >> > > > > suggest starting with what is most convenient for you.  If
> > > > > >> > > > > it's too slow, change the logic to pass it less data (e.g.
> > > > > >> > > > > only 1 day of data rather than 1 month of data).
> > > > > >> > > > >
> > > > > >> > > > > Yes, you can give it a date range.  Use -fcst_init_beg and
> > > > > >> > > > > -fcst_init_end to specify beginning/ending model
> > > > > >> > > > > initialization times or -fcst_valid_beg and -fcst_valid_end
> > > > > >> > > > > to specify beginning/ending valid times.
> > > > > >> > > > >
> > > > > >> > > > > If you find that you're running multiple jobs on the same
> > > > > >> > > > > subset of data (e.g. process MPR to CNT, MPR to SL1L2, MPR
> > > > > >> > > > > to CTC, MPR to CTS), it'd be more efficient to group those
> > > > > >> > > > > jobs into a config file.  That'll do the filtering ONCE and
> > > > > >> > > > > write the filtered data to a temp file.  Then all the jobs
> > > > > >> > > > > read data from the temp file instead of starting over from
> > > > > >> > > > > scratch.
> > > > >> > > > >
> > > > >> > > > > Make sense?
> > > > >> > > > >
> > > > >> > > > > John
> > > > >> > > > >
> > > > >> > > > >
> > > > >> > > > >
> > > > >> > > > > On Mon, Apr 23, 2018 at 1:01 PM, Rosalyn MacCracken
- NOAA
> > > > >> Affiliate
> > > > >> > > via
> > > > >> > > > RT
> > > > >> > > > > <met_help at ucar.edu> wrote:
> > > > >> > > > >
> > > > >> > > > > >
> > > > > >> > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > >> > > > > >
> > > > >> > > > > > Hi John,
> > > > >> > > > > >
> > > > > >> > > > > > That's actually only partially correct.  It's not that I
> > > > > >> > > > > > want to use part of the MPR lines and discard the rest,
> > > > > >> > > > > > and I do need to regenerate statistics.  Let me try to
> > > > > >> > > > > > re-explain.
> > > > > >> > > > > >
> > > > > >> > > > > > Back in early March we switched from getting our ASCAT
> > > > > >> > > > > > obs from the prepbufr data, to getting it from the
> > > > > >> > > > > > MGDRLITE data.  So, processing didn't change.  I was
> > > > > >> > > > > > producing statistics at certain threshold levels for both
> > > > > >> > > > > > GFS and ASCAT.  I had this set with the cat_thresh list,
> > > > > >> > > > > > at levels of 0,6,17, etc.  We found out after processing
> > > > > >> > > > > > for a couple of weeks that the ASCAT data included these
> > > > > >> > > > > > really small values, <1.0 m/s, and that these small wind
> > > > > >> > > > > > speeds were being included into the statistics
> > > > > >> > > > > > processing.
> > > > >> > > > > >
> > > > > >> > > > > > So, a couple of questions.
> > > > > >> > > > > > 1) Do I have to regenerate all of my statistics (*.cts,
> > > > > >> > > > > > *.cnt and *ctc files) because of this error?  Or, since I
> > > > > >> > > > > > have threshold levels set, will those small values be
> > > > > >> > > > > > among the statistics in the lowest thresholds?
> > > > > >> > > > > > 2) I have the *.stat files, but, they are spread out into
> > > > > >> > > > > > separate directories like:
> > > > > >> > > > > > /GFS/data/hourly/${YYYYMMDDHH}/*.stat
> > > > > >> > > > > > Can I tell stat-analysis to "lookin" directories with a
> > > > > >> > > > > > wildcard (like 201803*)?  If so, how?  Or, if I tell it
> > > > > >> > > > > > to look in /GFS/data/hourly, will it look in all the
> > > > > >> > > > > > directories recursively under hourly?  And, if that's the
> > > > > >> > > > > > case, can I give it a date range, so that it only
> > > > > >> > > > > > processes data from March?
> > > > >> > > > > >
> > > > >> > > > > > Roz
> > > > >> > > > > >
> > > > >> > > > > > On Mon, Apr 23, 2018 at 2:18 PM, John Halley
Gotway via
> > RT <
> > > > >> > > > > > met_help at ucar.edu> wrote:
> > > > >> > > > > >
> > > > >> > > > > > > Hi Roz,
> > > > >> > > > > > >
> > > > > >> > > > > > > I read that you've run Point-Stat and saved off the
> > > > > >> > > > > > > matched pairs (MPR) output line type.  And you'd like
> > > > > >> > > > > > > to (1) filter those MPR lines to discard some of them
> > > > > >> > > > > > > and then (2) use the filtered data to regenerate
> > > > > >> > > > > > > summary statistics.  Yes, this is easily done using the
> > > > > >> > > > > > > STAT-Analysis tool in MET.
> > > > > >> > > > > > >
> > > > > >> > > > > > > You wrote that you're verifying wind speeds against
> > > > > >> > > > > > > ASCAT and that you'd like to exclude pairs where the
> > > > > >> > > > > > > observed wind speed is less than 1 m/s.  I'm just
> > > > > >> > > > > > > guessing here, but I'll presume that you want to
> > > > > >> > > > > > > produce both SL1L2 and CNT output line types.  Here's
> > > > > >> > > > > > > what the STAT-Analysis job would look like:
> > > > >> > > > > > >
> > > > > >> > > > > > > # Filter MPRs and write SL1L2 output lines.
> > > > > >> > > > > > > #   -lookin:         a .stat filename or a directory containing them
> > > > > >> > > > > > > #   -job:            job type is aggregate_stat
> > > > > >> > > > > > > #   -line_type:      input line type = MPR
> > > > > >> > > > > > > #   -out_line_type:  output line type = SL1L2 partial sums
> > > > > >> > > > > > > #   -fcst_var:       only process lines where FCST_VAR column = WIND
> > > > > >> > > > > > > #   -column_thresh:  only use MPR lines where OBS column > 1
> > > > > >> > > > > > > #   -by:             run this same job for each unique combination
> > > > > >> > > > > > > #                    of these columns
> > > > > >> > > > > > > stat_analysis \
> > > > > >> > > > > > >    -lookin input.stat \
> > > > > >> > > > > > >    -job aggregate_stat \
> > > > > >> > > > > > >    -line_type MPR \
> > > > > >> > > > > > >    -out_line_type SL1L2 \
> > > > > >> > > > > > >    -fcst_var WIND \
> > > > > >> > > > > > >    -column_thresh OBS gt1 \
> > > > > >> > > > > > >    -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > > >> > > > > > >    -out_stat MPR_to_SL1L2.stat
> > > > >> > > > > > >
> > > > > >> > > > > > > This will produce an output .stat file containing an
> > > > > >> > > > > > > SL1L2 line for each unique combination of the header
> > > > > >> > > > > > > columns listed after the "-by" option.  To generate CNT
> > > > > >> > > > > > > output lines instead, you'd run a second job where you
> > > > > >> > > > > > > replace SL1L2 with CNT.  You could run these jobs on
> > > > > >> > > > > > > the command line or group them together into a
> > > > > >> > > > > > > STAT-Analysis config file, if you prefer.  Both would
> > > > > >> > > > > > > work.
> > > > > >> > > > > > >
> > > > > >> > > > > > > You could run this once for each input .stat file
> > > > > >> > > > > > > you're processing... or you could pass many input .stat
> > > > > >> > > > > > > files to the job.  Since FCST_INIT_BEG and FCST_LEAD
> > > > > >> > > > > > > are included in the "-by" option, you'll get separate
> > > > > >> > > > > > > output lines for each unique time.
> > > > >> > > > > > >
> > > > >> > > > > > > Hope that helps get you going.
> > > > >> > > > > > >
> > > > >> > > > > > > Thanks,
> > > > >> > > > > > > John
> > > > >> > > > > > >
> > > > >> > > > > > >
> > > > >> > > > > > > On Thu, Apr 19, 2018 at 9:23 AM, Julie
Prestopnik via
> > RT <
> > > > >> > > > > > > met_help at ucar.edu>
> > > > >> > > > > > > wrote:
> > > > >> > > > > > >
> > > > >> > > > > > > >
> > > > > >> > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > >> > > > > > > >
> > > > >> > > > > > > > Hi Roz.  My apologies for the delay in
responding.
> > > > >> > > > > > > >
> > > > >> > > > > > > > Unfortunately, John is out of the office this
week,
> > and
> > > I
> > > > do
> > > > >> > not
> > > > >> > > > know
> > > > >> > > > > > the
> > > > >> > > > > > > > answers to your questions.  As you said, I
would
> also
> > > > >> imagine
> > > > >> > > that
> > > > >> > > > > > > > point-stat is using those small values as
matched
> > pairs.
> > > > >> > Also, I
> > > > >> > > > do
> > > > >> > > > > > not
> > > > >> > > > > > > > believe there is a way to regenerate the
point-stat
> > > > >> statistics
> > > > >> > > > > without
> > > > >> > > > > > > > using the original GFS data.  I cannot say
with
> > > certainty,
> > > > >> > > however.
> > > > >> > > > > > > Thank
> > > > >> > > > > > > > you for your patience in advance.  We'll get
a
> > definite
> > > > >> > response
> > > > >> > > to
> > > > >> > > > > you
> > > > >> > > > > > > as
> > > > >> > > > > > > > soon as we can.
> > > > >> > > > > > > >
> > > > >> > > > > > > > Thanks,
> > > > >> > > > > > > > Julie
> > > > >> > > > > > > >
> > > > >> > > > > > > > On Wed, Apr 18, 2018 at 6:31 AM, Rosalyn
MacCracken
> -
> > > NOAA
> > > > >> > > > Affiliate
> > > > >> > > > > > via
> > > > >> > > > > > > RT
> > > > >> > > > > > > > <met_help at ucar.edu> wrote:
> > > > >> > > > > > > >
> > > > >> > > > > > > > >
> > > > >> > > > > > > > > Wed Apr 18 06:31:39 2018: Request 84822 was
acted
> > > upon.
> > > > >> > > > > > > > > Transaction: Ticket created by
> > > > >> rosalyn.maccracken at noaa.gov
> > > > >> > > > > > > > >        Queue: met_help
> > > > >> > > > > > > > >      Subject: question on regenerating data
> > > > >> > > > > > > > >        Owner: Nobody
> > > > >> > > > > > > > >   Requestors: rosalyn.maccracken at noaa.gov
> > > > >> > > > > > > > >       Status: new
> > > > >> > > > > > > > >  Ticket <URL: https://rt.rap.ucar.edu/rt/
> > > > >> > > > > > Ticket/Display.html?id=84822
> > > > >> > > > > > > >
> > > > >> > > > > > > > >
> > > > >> > > > > > > > >
> > > > >> > > > > > > > > Hi,
> > > > >> > > > > > > > >
> > > > >> > > > > > > > > I'm running point-stat using ASCAT and GFS
data to
> > > > verify
> > > > >> > > surface
> > > > >> > > > > > wind
> > > > >> > > > > > > > > speeds.  I found an error in my ASCAT input
data
> > that
> > > > goes
> > > > >> > back
> > > > >> > > > to
> > > > >> > > > > > Mar
> > > > >> > > > > > > 7.
> > > > >> > > > > > > > > I had switched the input source of the
data, and
> > > within
> > > > >> the
> > > > >> > new
> > > > >> > > > > data
> > > > >> > > > > > > > files,
> > > > >> > > > > > > > > it was allowing very small values (< 1 m/s)
to be
> > used
> > > > as
> > > > >> > data
> > > > >> > > > > points
> > > > >> > > > > > > in
> > > > >> > > > > > > > > the verification.  I imagine that this is
an
> issue,
> > > > since
> > > > >> > > > > point-stat
> > > > >> > > > > > is
> > > > >> > > > > > > > > using these very small values as matched
pairs
> with
> > > the
> > > > >> GFS,
> > > > >> > > > > correct?
> > > > >> > > > > > > > >
> > > > >> > > > > > > > > Is there a way to regenerate the point-stat
> > statistics
> > > > >> > without
> > > > >> > > > > using
> > > > >> > > > > > > the
> > > > >> > > > > > > > > original GFS data?  I do have the *stat and
the
> *mpr
> > > > >> files,
> > > > >> > and
> > > > >> > > > it
> > > > >> > > > > is
> > > > >> > > > > > > > > pretty easy to identify where the bad
values are
> > > > located.
> > > > >> > > > > > > > >
> > > > >> > > > > > > > > Thanks,
> > > > >> > > > > > > > > Roz
> > > > >> > > > > > > > >

--
Rosalyn MacCracken
Support Scientist

Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD  20740-3818

(p) 301-683-1551
rosalyn.maccracken at noaa.gov

------------------------------------------------
Subject: question on regenerating data
From: John Halley Gotway
Time: Thu Apr 26 14:14:36 2018

Roz,

The CSI statistic is computed from a 2x2 contingency table.  A 2x2
contingency table is defined by a single threshold.  Looking in the .stat
files you sent, I see that you've applied many thresholds to generate many
2x2 contingency tables and corresponding statistics.  Yes, it is true that
for most of those thresholds, the "bad" observation values will fall into
the "non-event" category.  But those non-event counts are included in the
computation of some stats, including CSI.  So even though the bad
observations aren't very interesting, they really are impacting the
statistics.
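
To make the point above concrete, here is a minimal Python sketch (not MET
code) of how a single threshold turns matched pairs into a 2x2 table and how
CSI follows from it.  The function names and the four sample pairs are
hypothetical and chosen only for illustration.  The sketch assumes an event
means "value >= threshold", as in the CTC jobs discussed in this ticket, and
uses the standard definition CSI = hits / (hits + false alarms + misses).

```python
import math


def contingency_counts(pairs, threshold):
    """Count (hits, false_alarms, misses, correct_negatives).

    Each pair is (forecast, observation); an event is value >= threshold.
    """
    hits = false_alarms = misses = correct_negatives = 0
    for fcst, obs in pairs:
        fcst_event = fcst >= threshold
        obs_event = obs >= threshold
        if fcst_event and obs_event:
            hits += 1
        elif fcst_event and not obs_event:
            false_alarms += 1
        elif obs_event:
            misses += 1
        else:
            correct_negatives += 1
    return hits, false_alarms, misses, correct_negatives


def csi(hits, false_alarms, misses):
    """Critical Success Index; NaN when the denominator is zero."""
    denom = hits + false_alarms + misses
    return hits / denom if denom else math.nan


# Hypothetical (forecast, observation) wind speeds in m/s.
# The first pair mimics a "bad" observation of 0.01 m/s.
pairs = [(10.0, 0.01), (8.0, 7.5), (4.0, 6.5), (3.0, 2.0)]

# Same pairs with observations <= 1 m/s dropped (like "-column_thresh OBS gt1").
filtered = [(f, o) for f, o in pairs if o > 1.0]

for thresh in (6.0, 17.0):
    all_counts = contingency_counts(pairs, thresh)
    kept_counts = contingency_counts(filtered, thresh)
    print(thresh, all_counts, csi(*all_counts[:3]),
          kept_counts, csi(*kept_counts[:3]))
```

In this toy sample, at a threshold of 6 the pair (forecast 10, observation
0.01) is a false alarm, so dropping it changes CSI.  At a threshold of 17 the
same pair is a correct negative, which does not enter CSI but does enter
statistics such as accuracy.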

John

On Wed, Apr 25, 2018 at 10:08 AM, Rosalyn MacCracken - NOAA Affiliate via
RT <met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
>
> Figures.  I just calculated how long it will take me to regenerate data for
> 03072018 - 04122018.  It will take me 912 hours.  ;-(
>
> Ok, I know I asked this, but, if I had an OBS value of 0.01 and a matched
> GFS point of 10 m/s, and I had a low threshold of 0-5 m/s, 6-10 m/s and
> 10-15 m/s, and say, CSI was calculated.  Which threshold would be used for
> the output, the 0-5 or 6-10?  And, would the 10-15 threshold even be
> affected?
>
> Roz
>
> On Wed, Apr 25, 2018 at 11:40 AM, John Halley Gotway via RT <
> met_help at ucar.edu> wrote:
>
> > Roz,
> >
> > I think it'd take just as long.  The slow part is reading the data... not
> > applying a threshold.
> >
> > John
> >
> > On Wed, Apr 25, 2018 at 9:18 AM, Rosalyn MacCracken - NOAA Affiliate via RT
> > <met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > >
> > > Hi John,
> > >
> > > Thanks for doing that for me.  I'll take a look at the info you sent me
> > > this afternoon.  I'm in the middle of doing something right now...trying to
> > > make a different program work.  ;-/
> > >
> > > I wonder if it will be quicker than 18 minutes for some of the thresholds
> > > that have higher wind speeds, and not as many instances (or 0 instances).
> > > Or, will it take just as long, since it still needs to read through the
> > > entire *.stat file anyway?
> > >
> > > Roz
> > >
> > > On Tue, Apr 24, 2018 at 7:06 PM, John Halley Gotway via RT <
> > > met_help at ucar.edu> wrote:
> > >
> > > > Hi Roz,
> > > >
> > > > Thanks for sending the sample data.  I grabbed it and used it to run some
> > > > sample jobs:
> > > >
> > > > time /d1/johnhg/MET/MET_releases/met-6.0/bin/stat_analysis \
> > > > -lookin /d1/johnhg/MET/MET_Help/maccracken_data_20180424/opc_test/home/opc_test/data/met_verif/GFS/data/hourly \
> > > > -config STATAnalysisConfig \
> > > > -log run_sa.log -v 3
> > > >
> > > > I used the "-lookin" option to point to all the data you sent.
> > > >
> > > > I've attached the...
> > > > (1) config file I used
> > > > (2) log file that was generated
> > > > (3) output .stat files
> > > >
> > > > Looking at the jobs, you'll see that I've included 5 of them...
> > > > - Generate CNT output
> > > > - Generate CTC >= 0.0 output
> > > > - Generate CTS >= 0.0 output
> > > > - Generate CTC >= 5.5689 output
> > > > - Generate CTS >= 5.5689 output
> > > >
> > > > Unfortunately, you'll need to define separate jobs for each threshold you'd
> > > > like to use.  Although, you shouldn't use >=0.0 since that's always true.
> > > >
> > > > Also unfortunately, this is pretty slow.  On my machine, it took like 18
> > > > minutes for these 5 jobs!
> > > >
> > > > Thanks,
> > > > John
> > > >
> > > >
> > > > On Tue, Apr 24, 2018 at 2:09 PM, Rosalyn MacCracken - NOAA Affiliate via RT
> > > > <met_help at ucar.edu> wrote:
> > > >
> > > > >
> > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > >
> > > > > Hi John,
> > > > >
> > > > > I put my file on the ftp site.  Let me know what you find.  You'll see
> > > > > those really low OBS values (0.01, 0.02, and so on).
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Roz
> > > > >
> > > > > On Tue, Apr 24, 2018 at 2:53 PM, Rosalyn MacCracken - NOAA Affiliate <
> > > > > rosalyn.maccracken at noaa.gov> wrote:
> > > > >
> > > > > > Ok, I'll get that over to the ftp site.  I have to make sure that I find a
> > > > > > day that has all the data in it.  Sometimes the data isn't available when
> > > > > > the script runs.  A little annoying, but, that's operations...
> > > > > >
> > > > > > I'll let you know when I get the file to the ftp site.
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > Roz
> > > > > >
> > > > > > On Tue, Apr 24, 2018 at 2:49 PM, John Halley Gotway via RT <
> > > > > > met_help at ucar.edu> wrote:
> > > > > >
> > > > > >> Roz,
> > > > > >>
> > > > > >> Yes, we do.  Follow the instructions here:
> > > > > >> https://dtcenter.org/met/users/support/met_help.php#ftp
> > > > > >>
> > > > > >> I'd suggest making a tar file for one day and posting it to the ftp
> > > > > >> site:
> > > > > >>    tar -cvzf sample.tar.gz /GFS/data/hourly/20180305*
> > > > > >>
> > > > > >> Thanks,
> > > > > >> John
> > > > > >>
> > > > > >> On Tue, Apr 24, 2018 at 11:57 AM, Rosalyn MacCracken - NOAA Affiliate via
> > > > > >> RT <met_help at ucar.edu> wrote:
> > > > > >>
> > > > > >> >
> > > > > >> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > >> >
> > > > > >> > Hi John,
> > > > > >> >
> > > > > >> > Yes, it does seem that the -config option is the way to go to recreate
> > > > > >> > those 3 files.  I'll be sure to have a unique file name, or, mv the output
> > > > > >> > file to a different name before running the command again.  Thanks for
> > > > > >> > pointing that out.
> > > > > >> >
> > > > > >> > I'm teleworking for the next couple of weeks, so I can't download and send
> > > > > >> > you *.stat files like I can when I'm at my computer at work.  I don't have
> > > > > >> > access to theia or wcoss anymore.  You have an ftp server that I can upload
> > > > > >> > data to, right?  If not, I can try and fiddle around with this tomorrow and
> > > > > >> > see if I can't get this to work the way I want to.
> > > > > >> >
> > > > > >> > Roz
> > > > > >> >
> > > > > >> > On Tue, Apr 24, 2018 at 11:42 AM, John Halley Gotway via RT <
> > > > > >> > met_help at ucar.edu> wrote:
> > > > > >> >
> > > > > >> > > Roz,
> > > > > >> > >
> > > > > >> > > Each "-job aggregate_stat" only generates a single output line type.  So
> > > > > >> > > using "-out_line_type CTC,CTS,CNT" will not work.
> > > > > >> > >
> > > > > >> > > You'll need to run separate jobs for each output line type you want to
> > > > > >> > > generate.  That's why I'd recommend grouping those multiple jobs together
> > > > > >> > > into a single STAT-Analysis config file.  Then you'd call STAT-Analysis
> > > > > >> > > once using the "-config" command line option.
> > > > > >> > >
> > > > > >> > > Another issue is that if you set "-out_stat" to the same filename, it'll
> > > > > >> > > get overwritten by each job.  STAT-Analysis will overwrite that output
> > > > > >> > > file rather than appending to it.
> > > > > >> > >
> > > > > >> > > You could send me a day's worth of .stat output files
> > > > > >> > > (/GFS/data/hourly/20180305*) and I could send you some suggestions.  Or if
> > > > > >> > > you have access to theia you could copy them up there and point me to it.
> > > > > >> > >
> > > > > >> > > Thanks,
> > > > > >> > > John
> > > > > >> > >
> > > > > >> > > On Tue, Apr 24, 2018 at 7:48 AM, Rosalyn MacCracken - NOAA Affiliate via RT
> > > > > >> > > <met_help at ucar.edu> wrote:
> > > > > >> > >
> > > > > >> > > >
> > > > > >> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > >> > > >
> > > > > >> > > > Hi John,
> > > > > >> > > >
> > > > > >> > > > Yes, that makes sense.  Those very small values (<1.0 m/s) are bad
> > > > > >> > > > values.  That's why they shouldn't be included in the processing.
> > > > > >> > > >
> > > > > >> > > > So, I need to just regenerate hourly data, one hour at a time.  Would it
> > > > > >> > > > make sense to use a shell script and loop stat-analysis?  Something like:
> > > > > >> > > >
> > > > > >> > > > for day in 11 12
> > > > > >> > > > do
> > > > > >> > > >   for cycle in 00 06 12 18
> > > > > >> > > >   do
> > > > > >> > > >     stat_analysis -lookin /GFS/data/hourly/201803${day}${cycle}/*.stat \
> > > > > >> > > >       -job aggregate_stat \
> > > > > >> > > >       -line_type MPR \
> > > > > >> > > >       -out_line_type CTC,CTS,CNT \
> > > > > >> > > >       -fcst_var WIND \
> > > > > >> > > >       -column_thresh OBS gt1 \
> > > > > >> > > >       -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > > >> > > >       -out_stat /new_rerun_stat_files/MPR_to_CTC_CTS_CNT.stat
> > > > > >> > > >   done
> > > > > >> > > > done
> > > > > >> > > >
> > > > > >> > > > or, something like that?  And, will this regenerate hour forecasts, at each
> > > > > >> > > > forecast and lead hour?  I guess it will see the forecast and lead hour
> > > > > >> > > > from the *.stat file, and whatever *stat file is in the directory, it will
> > > > > >> > > > regenerate those hours, right?
> > > > > >> > > >
> > > > > >> > > > So, I need to regenerate the CTC, CNT and CTS files.  That's why I did:
> > > > > >> > > >  -out_line_type CTC,CTS,CNT
> > > > > >> > > > but, will that make 3 separate files, or just another *.stat file?
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > > On Mon, Apr 23, 2018 at 4:01 PM, John Halley Gotway via RT <
> > > > > >> > > > met_help at ucar.edu> wrote:
> > > > > >> > > >
> > > > > >> > > > > Roz,
> > > > > >> > > > >
> > > > > >> > > > > It is ultimately up to you to decide which matched pairs you want to
> > > > > >> > > > > include in your processing.  Do you consider those small (<1.0 m/s)
> > > > > >> > > > > observation values to be corrupt and incorrect in some way or just not
> > > > > >> > > > > very interesting?  If they really are BAD data values, I agree that you
> > > > > >> > > > > should exclude them from your analysis.  But if they're just
> > > > > >> > > > > uninteresting values of low wind speed, then there's no reason why you
> > > > > >> > > > > should exclude them.  For example, *most* of the time it isn't raining,
> > > > > >> > > > > but we often include observations of 0 precip.
> > > > > >> > > > >
> > > > > >> > > > > There are three configurable options in Point-Stat that may be useful
> > > > > >> > > > > here:
> > > > > >> > > > > (1) You already know and use the "cat_thresh" option.  This threshold
> > > > > >> > > > > defines the events and non-events for a 2x2 contingency table.  This
> > > > > >> > > > > threshold affects the contents of FHO, CTC, CTS, MCTC, and MCTS line
> > > > > >> > > > > types that Point-Stat writes.
> > > > > >> > > > > (2) The "cnt_thresh" option is a more recent addition.  Perhaps this was
> > > > > >> > > > > a poor name choice, but instead of defining categories, it's really a
> > > > > >> > > > > *filtering* threshold.  This threshold affects the contents of the
> > > > > >> > > > > SL1L2, SAL1L2, and CNT line types that Point-Stat writes.  For example,
> > > > > >> > > > > setting "cnt_thresh = [ ge6, ge17 ];" will produce 2 CNT and 2 SL1L2
> > > > > >> > > > > output lines containing only those points where the wind speed was >=6
> > > > > >> > > > > and >=17, respectively.
> > > > > >> > > > > (3) The "wind_thresh" option is very similar to the "cnt_thresh" option
> > > > > >> > > > > but affects the contents of the VL1L2, VAL1L2, and VCNT (new in met-7.0)
> > > > > >> > > > > line types.  Only those U/V pairs that meet the specified wind speed
> > > > > >> > > > > threshold are included in the output.
> > > > > >> > > > >
> > > > > >> > > > > For both "cnt_thresh" and "wind_thresh", the default value in the config
> > > > > >> > > > > file is "NA", meaning, do not apply any filtering threshold criteria.
> > > > > >> > > > >
> > > > > >> > > > > You have the flexibility to run STAT-Analysis on the MPR output lines to
> > > > > >> > > > > recompute any of these output line types applying whatever filtering
> > > > > >> > > > > criteria you'd like.
> > > > > >> > > > > Here's the MET user's guide:
> > > > > >> > > > > https://dtcenter.org/met/users/docs/users_guide/MET_Users_Guide_v7.0.pdf
> > > > > >> > > > > Look on page 98 for the job command options for the "aggregate_stat"
> > > > > >> > > > > line type when the input line type is "MPR".
> > > > > >> > > > >
> > > > > >> > > > > For your second question, the "-lookin PATH" option is *VERY* flexible.
> > > > > >> > > > > You can set PATH to either a single value or multiple values.  If you
> > > > > >> > > > > use wildcards, then the shell expands those wildcards to multiple
> > > > > >> > > > > values.  Each value you pass in can either be a filename or a directory
> > > > > >> > > > > name.  If you pass in a filename, STAT-Analysis will read it
> > > > > >> > > > > *REGARDLESS* of the file extension.  If you pass in a directory name,
> > > > > >> > > > > STAT-Analysis will search that directory *RECURSIVELY* for files ending
> > > > > >> > > > > in ".stat".  For example, either of the following settings would tell
> > > > > >> > > > > STAT-Analysis to read the same list of files:
> > > > > >> > > > >    -lookin /GFS/data/hourly/*/*.stat
> > > > > >> > > > >    ... or ...
> > > > > >> > > > >    -lookin /GFS/data/hourly
> > > > > >> > > > >
> > > > > >> > > > > Be aware though that the more data you pass to STAT-Analysis, the longer
> > > > > >> > > > > it'll take for it to process it.  You can decide how much data you pass
> > > > > >> > > > > it for each job.  I'd suggest starting with what is most convenient for
> > > > > >> > > > > you.  If it's too slow, change the logic to pass it less data (e.g.
> > > > > >> > > > > only 1 day of data rather than 1 month of data).
> > > > > >> > > > >
> > > > > >> > > > > Yes, you can give it a date range.  Use -fcst_init_beg and
> > > > > >> > > > > -fcst_init_end to specify beginning/ending model initialization times or
> > > > > >> > > > > -fcst_valid_beg and -fcst_valid_end to specify beginning/ending valid
> > > > > >> > > > > times.
> > > > > >> > > > >
> > > > > >> > > > > If you find that you're running multiple jobs on the same subset of data
> > > > > >> > > > > (e.g. process MPR to CNT, MPR to SL1L2, MPR to CTC, MPR to CTS), it'd be
> > > > > >> > > > > more efficient to group those jobs into a config file.  That'll do the
> > > > > >> > > > > filtering ONCE and write the filtered data to a temp file.  Then all the
> > > > > >> > > > > jobs read data from the temp file instead of starting over from scratch.
> > > > > >> > > > >
> > > > > >> > > > > Make sense?
> > > > > >> > > > >
> > > > > >> > > > > John
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > > On Mon, Apr 23, 2018 at 1:01 PM, Rosalyn MacCracken - NOAA Affiliate via RT
> > > > > >> > > > > <met_help at ucar.edu> wrote:
> > > > > >> > > > >
> > > > > >> > > > > >
> > > > > >> > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > >> > > > > >
> > > > > >> > > > > > Hi John,
> > > > > >> > > > > >
> > > > > >> > > > > > That's actually only partially correct.  It's not that I want to use
> > > > > >> > > > > > part of the MPR lines and discard the rest, and I do need to regenerate
> > > > > >> > > > > > statistics.  Let me try to re-explain.
> > > > > >> > > > > >
> > > > > >> > > > > > Back in early March we switched from getting our ASCAT obs from the
> > > > > >> > > > > > prepbufr data, to getting it from the MGDRLITE data.  So, processing
> > > > > >> > > > > > didn't change.  I was producing statistics at certain threshold levels
> > > > > >> > > > > > for both GFS and ASCAT.  I had this set with the cat_thresh list, at
> > > > > >> > > > > > levels of 0,6,17, etc.  We found out after processing for a couple of
> > > > > >> > > > > > weeks that the ASCAT data included these really small values, <1.0 m/s,
> > > > > >> > > > > > and that these small wind speeds were being included in the statistics
> > > > > >> > > > > > processing.
> > > > > >> > > > > >
> > > > > >> > > > > > So, a couple of questions.
> > > > > >> > > > > > 1) Do I have to regenerate all of my statistics (*.cts, *.cnt and *ctc
> > > > > >> > > > > > files) because of this error?  Or, since I have threshold levels set,
> > > > > >> > > > > > will those small values be among the statistics in the lowest
> > > > > >> > > > > > thresholds?
> > > > > >> > > > > > 2) I have the *.stat files, but, they are spread out into separate
> > > > > >> > > > > > directories like:
> > > > > >> > > > > > /GFS/data/hourly/${YYYYMMDDHH}/*.stat
> > > > > >> > > > > > Can I tell stat-analysis to "lookin" directories with a wildcard (like
> > > > > >> > > > > > 201803*)?  If so, how?  Or, if I tell it to look in /GFS/data/hourly,
> > > > > >> > > > > > will it look in all the directories recursively under hourly?  And, if
> > > > > >> > > > > > that's the case, can I give it a date range, so that it only processes
> > > > > >> > > > > > data from March?
> > > > > >> > > > > >
> > > > > >> > > > > > Roz
> > > > > >> > > > > >
> > > > > >> > > > > > On Mon, Apr 23, 2018 at 2:18 PM, John Halley Gotway via RT <
> > > > > >> > > > > > met_help at ucar.edu> wrote:
> > > > > >> > > > > >
> > > > > >> > > > > > > Hi Roz,
> > > > > >> > > > > > >
> > > > > >> > > > > > > I read that you've run Point-Stat and saved off the matched pairs (MPR)
> > > > > >> > > > > > > output line type.  And you'd like to (1) filter those MPR lines to
> > > > > >> > > > > > > discard some of them and then (2) use the filtered data to regenerate
> > > > > >> > > > > > > summary statistics.  Yes, this is easily done using the STAT-Analysis
> > > > > >> > > > > > > tool in MET.
> > > > > >> > > > > > >
> > > > > >> > > > > > > You wrote that you're verifying wind speeds against ASCAT and that
> > > > > >> > > > > > > you'd like to exclude pairs where the observed wind speed is less than
> > > > > >> > > > > > > 1 m/s.  I'm just guessing here, but I'll presume that you want to
> > > > > >> > > > > > > produce both SL1L2 and CNT output line types.  Here's what the
> > > > > >> > > > > > > STAT-Analysis job would look like:
> > > > > >> > > > > > >
> > > > > >> > > > > > > # Filter MPR's and write SL1L2 output line
> > > > > >> > > > > > > stat_analysis \
> > > > > >> > > > > > >    -lookin input.stat \
> > > > > >> > > > > > >    -job aggregate_stat \
> > > > > >> > > > > > >    -line_type MPR \
> > > > > >> > > > > > >    -out_line_type SL1L2 \
> > > > > >> > > > > > >    -fcst_var WIND \
> > > > > >> > > > > > >    -column_thresh OBS gt1 \
> > > > > >> > > > > > >    -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > > >> > > > > > >    -out_stat MPR_to_SL1L2.stat
> > > > > >> > > > > > >
> > > > > >> > > > > > > # -lookin: List a .stat filename or directory containing them
> > > > > >> > > > > > > # -job: Job type is aggregate_stat
> > > > > >> > > > > > > # -line_type: Input line type = MPR
> > > > > >> > > > > > > # -out_line_type: Output line type = SL1L2 partial sums
> > > > > >> > > > > > > # -fcst_var: Only process lines where FCST_VAR column = WIND
> > > > > >> > > > > > > # -column_thresh: Only use MPR lines where OBS column > 1
> > > > > >> > > > > > > # -by: Run this same job for each unique combination of these columns
> > > > > >> > > > > > >
> > > > > >> > > > > > > This will produce an output .stat file containing an SL1L2 line for
> > > > > >> > > > > > > each unique combination of the header columns listed after the "-by"
> > > > > >> > > > > > > option.  To generate CNT output lines instead, you'd run a second job
> > > > > >> > > > > > > where you replace SL1L2 with CNT.  You could run these jobs on the
> > > > > >> > > > > > > command line or group them together into a STAT-Analysis config file,
> > > > > >> > > > > > > if you prefer.  Both would work.
> > > > > >> > > > > > >
> > > > > >> > > > > > > You could run this once for each input .stat file you're processing...
> > > > > >> > > > > > > or you could pass many input .stat files to the job.  Since
> > > > > >> > > > > > > FCST_INIT_BEG and FCST_LEAD are included in the "-by" option, you'll
> > > > > >> > > > > > > get separate output lines for each unique time.
> > > > > >> > > > > > >
> > > > > >> > > > > > > Hope that helps get you going.
> > > > > >> > > > > > >
> > > > > >> > > > > > > Thanks,
> > > > > >> > > > > > > John
> > > > > >> > > > > > >
> > > > > >> > > > > > >
> > > > > >> > > > > > > On Thu, Apr 19, 2018 at 9:23 AM, Julie Prestopnik via RT <
> > > > > >> > > > > > > met_help at ucar.edu> wrote:
> > > > > >> > > > > > >
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > Hi Roz.  My apologies for the delay in responding.
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > Unfortunately, John is out of the office this week, and I do not know
> > > > > >> > > > > > > > the answers to your questions.  As you said, I would also imagine that
> > > > > >> > > > > > > > point-stat is using those small values as matched pairs.  Also, I do
> > > > > >> > > > > > > > not believe there is a way to regenerate the point-stat statistics
> > > > > >> > > > > > > > without using the original GFS data.  I cannot say with certainty,
> > > > > >> > > > > > > > however.  Thank you for your patience in advance.  We'll get a
> > > > > >> > > > > > > > definite response to you as soon as we can.
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > Thanks,
> > > > > >> > > > > > > > Julie
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > On Wed, Apr 18, 2018 at 6:31 AM, Rosalyn MacCracken - NOAA Affiliate
> > > > > >> > > > > > > > via RT <met_help at ucar.edu> wrote:
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > >
> > > > > >> > > > > > > > > Wed Apr 18 06:31:39 2018: Request 84822 was acted upon.
> > > > > >> > > > > > > > > Transaction: Ticket created by rosalyn.maccracken at noaa.gov
> > > > > >> > > > > > > > >        Queue: met_help
> > > > > >> > > > > > > > >      Subject: question on regenerating data
> > > > > >> > > > > > > > >        Owner: Nobody
> > > > > >> > > > > > > > >   Requestors: rosalyn.maccracken at noaa.gov
> > > > > >> > > > > > > > >       Status: new
> > > > > >> > > > > > > > >  Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > >> > > > > > > > >
> > > > > >> > > > > > > > >
> > > > > > --
> > > > > > Rosalyn MacCracken
> > > > > > Support Scientist
> > > > > >
> > > > > > Ocean Applications Branch
> > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > NCWCP
> > > > > > 5830 University Research Ct
> > > > > > College Park, MD  20740-3818
> > > > > >
> > > > > > (p) 301-683-1551
> > > > > > rosalyn.maccracken at noaa.gov
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Rosalyn MacCracken
> > > > > Support Scientist
> > > > >
> > > > > Ocean Applications Branch
> > > > > NOAA/NWS Ocean Prediction Center
> > > > > NCWCP
> > > > > 5830 University Research Ct
> > > > > College Park, MD  20740-3818
> > > > >
> > > > > (p) 301-683-1551
> > > > > rosalyn.maccracken at noaa.gov
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> > > --
> > > Rosalyn MacCracken
> > > Support Scientist
> > >
> > > Ocean Applications Branch
> > > NOAA/NWS Ocean Prediction Center
> > > NCWCP
> > > 5830 University Research Ct
> > > College Park, MD  20740-3818
> > >
> > > (p) 301-683-1551
> > > rosalyn.maccracken at noaa.gov
> > >
> > >
> >
> >
>
>
> --
> Rosalyn MacCracken
> Support Scientist
>
> Ocean Applications Branch
> NOAA/NWS Ocean Prediction Center
> NCWCP
> 5830 University Research Ct
> College Park, MD  20740-3818
>
> (p) 301-683-1551
> rosalyn.maccracken at noaa.gov
>
>

------------------------------------------------
Subject: question on regenerating data
From: Rosalyn MacCracken - NOAA Affiliate
Time: Sun May 06 14:12:55 2018

Hi John,

Sorry it took me so long to get back to you.  My step-daughter came in to
town, and I thought that I could get some work done while she was here,
but, didn't.  Then, I totally forgot to email you back.  Sorry for leaving
you hanging!

Anyway, I was able to play around with the STATAnalysis config file you
sent me.  I tried it out with only a 1 hour timestep, instead of all the
files for one day.  I wanted to see what kind of time it would take to
process this on my machine.  So, it was quick, 45 seconds.  But, of course,
your run took 18 minutes.  Your run was probably reading 20-some files.
That makes sense.

So, then, I looked at the output, and it wasn't quite what I expected, and
doesn't quite match the stats from the other processing.  This is what I
did:

1)  I copied the 00z only *20180307*.stat file to a temp directory.  Before
I did this, I looked at the matching *.mpr file, and saw that the
OBS_VALID_BEG was 20180307_000000 and the OBS_VALID_END was
20180307_002700.
2)  Ran the run_sa.sh script and generated the CTS, CTC and CNT files.
3)  I looked at the new agg_cts file, and the OBS_VALID_BEG and _END
matched the *.mpr file in step 1.
4)  I looked at the original CTS file, and the OBS_VALID_BEG was
20180307_223000 and the OBS_VALID_END was 20180307_013000.  So, that was
our original way of processing.  I bet if I looked at a more recent file,
it would be more like OBS_VALID_BEG was 20180307_233000 and the
OBS_VALID_END was 20180307_003000.
5)  I looked at the original *mpr for 01z, and the OBS_VALID_BEG was
20180307_003000 and the OBS_VALID_END was 20180307_012100.

So, this tells me that I'm not matching observation times, and I'm not sure
how to fix it to match things up.  First, we use a +/- 30 min window for
ASCAT obs, centered on the hour.  For example, if we are processing the 00z
hour, we will match observations from 233000 from the day before to 003000
the current day.  Actually, we used to do an hour window on either side,
but, we have more observations now at each hour.  (See the explanation in
#4 above)

Anyway, how do I create the CTS, CTC and CNT files for the +/- 30 min
window?  Is there a way to dynamically indicate this 30 min window, so that
I don't have to go into the config file every time I run STAT-Analysis and
change it?
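
One way to avoid hand-editing a time window for every hour is to compute
the bounds in the calling script.  The following is only a minimal sketch,
assuming GNU date is available.  The function name obs_window_edge and the
variable names OBS_WIN_BEG and OBS_WIN_END are made up for illustration and
are not part of MET.  How the values are then handed to STAT-Analysis (for
example through environment variables referenced in the config file, or
through job filter options) should be checked against the MET User's Guide.

```shell
# Hypothetical helper: print one edge of a time window around a valid hour.
#   $1 = valid hour as YYYYMMDDHH (UTC)
#   $2 = offset in minutes (negative for the start of the window)
# Output format matches the YYYYMMDD_HHMMSS strings used in .stat files.
obs_window_edge() {
  # Turn 2018030700 into "2018-03-07 00:00:00" so GNU date can parse it.
  stamp=$(printf '%s\n' "$1" |
    sed 's/^\(....\)\(..\)\(..\)\(..\)$/\1-\2-\3 \4:00:00/')
  # Convert to seconds since the epoch, shift, and format the result.
  center=$(date -u -d "$stamp" +%s) || return 1
  date -u -d "@$(( center + $2 * 60 ))" +%Y%m%d_%H%M%S
}

# Example: a +/- 30 minute window centered on 00z, 7 March 2018.
OBS_WIN_BEG=$(obs_window_edge 2018030700 -30)
OBS_WIN_END=$(obs_window_edge 2018030700 30)
export OBS_WIN_BEG OBS_WIN_END
echo "window: ${OBS_WIN_BEG} to ${OBS_WIN_END}"
```

For the 00z example this gives 20180306_233000 to 20180307_003000, which is
the "233000 from the day before to 003000 the current day" window described
above.  Changing the second argument to -60 and 60 reproduces the older
one-hour window on either side.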

Roz

On Thu, Apr 26, 2018 at 4:14 PM, John Halley Gotway via RT <
met_help at ucar.edu> wrote:

> Roz,
>
> The CSI statistic is computed from a 2x2 contingency table.  A 2x2
> contingency table is defined by a single threshold.  Looking in the .stat
> files you sent, I see that you've applied many thresholds to generate many
> 2x2 contingency tables and corresponding statistics.  Yes, it is true that
> for most of those thresholds, the "bad" observation values will fall into
> the "non-event" category.  But those non-event counts are included in the
> computation of some stats, including CSI.  So even though the bad
> observations aren't very interesting, they really are impacting the
> statistics.
>
> John
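
To make that effect concrete, here is a small worked sketch.  It assumes the
standard definition CSI = hits / (hits + misses + false alarms).  The counts
are invented for illustration and do not come from the .stat files in this
ticket.  An observation of 0.01 m/s paired with a 10 m/s forecast is an
observed non-event but a forecast event for a threshold such as >=5.5689, so
it lands in the false alarm cell and lowers CSI.

```shell
# Hypothetical helper: CSI from the cells of a 2x2 contingency table.
#   $1 = hits         (forecast event,     observed event)
#   $2 = misses       (forecast non-event, observed event)
#   $3 = false alarms (forecast event,     observed non-event)
csi() {
  awk -v h="$1" -v m="$2" -v f="$3" \
    'BEGIN { printf "%.4f\n", h / (h + m + f) }'
}

# Made-up counts: 40 hits, 10 misses, 10 false alarms.
echo "CSI without the bad obs:      $(csi 40 10 10)"

# Same table after 5 bad near-zero observations are matched with
# forecasts above the threshold: each one adds a false alarm.
echo "CSI with 5 bad obs included:  $(csi 40 10 15)"
```

With these counts CSI drops from 0.6667 to 0.6154.  Bad observations paired
with forecasts below the threshold fall in the correct-negative cell, which
CSI does not use, so the size of the effect depends on the threshold.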
>
> On Wed, Apr 25, 2018 at 10:08 AM, Rosalyn MacCracken - NOAA Affiliate via
> RT <met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> >
> > Figures.  I just calculated how long it will take me to regenerate data
> > for 03072018 - 04122018.  It will take me 912 hours.  ;-(
> >
> > Ok, I know I asked this, but, if I had an OBS value of 0.01 and a matched
> > GFS point of 10 m/s, and I had a low threshold of 0-5 m/s, 6-10 m/s and
> > 10-15 m/s, and say, CSI was calculated.  Which threshold would be used
> > for the output, the 0-5 or 6-10?  And, would the 10-15 threshold even be
> > affected?
> >
> > Roz
> >
> > On Wed, Apr 25, 2018 at 11:40 AM, John Halley Gotway via RT <
> > met_help at ucar.edu> wrote:
> >
> > > Roz,
> > >
> > > I think it'd take just as long.  The slow part is reading the data...
> > > not applying a threshold.
> > >
> > > John
> > >
> > > On Wed, Apr 25, 2018 at 9:18 AM, Rosalyn MacCracken - NOAA Affiliate via RT
> > > <met_help at ucar.edu> wrote:
> > >
> > > >
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > >
> > > > Hi John,
> > > >
> > > > Thanks for doing that for me.  I'll take a look at the info you sent me
> > > > this afternoon.  I'm in the middle of doing something right now...trying
> > > > to make a different program work.  ;-/
> > > >
> > > > I wonder if it will be quicker than 18 minutes for some of the thresholds
> > > > that have higher wind speeds, and not as many instances (or 0 instances).
> > > > Or, will it take just as long, since it still needs to read through the
> > > > entire *.stat file anyway?
> > > >
> > > > Roz
> > > >
> > > > On Tue, Apr 24, 2018 at 7:06 PM, John Halley Gotway via RT <
> > > > met_help at ucar.edu> wrote:
> > > >
> > > > > Hi Roz,
> > > > >
> > > > > Thanks for sending the sample data.  I grabbed it and used it to run some
> > > > > sample jobs:
> > > > >
> > > > > time /d1/johnhg/MET/MET_releases/met-6.0/bin/stat_analysis \
> > > > >    -lookin /d1/johnhg/MET/MET_Help/maccracken_data_20180424/opc_test/home/opc_test/data/met_verif/GFS/data/hourly \
> > > > >    -config STATAnalysisConfig \
> > > > >    -log run_sa.log -v 3
> > > > >
> > > > > I used the "-lookin" option to point to all the data you sent.
> > > > >
> > > > > I've attached the...
> > > > > (1) config file I used
> > > > > (2) log file that was generated
> > > > > (3) output .stat files
> > > > >
> > > > > Looking at the jobs, you'll see that I've included 5 of them...
> > > > > - Generate CNT output
> > > > > - Generate CTC >= 0.0 output
> > > > > - Generate CTS >= 0.0 output
> > > > > - Generate CTC >= 5.5689 output
> > > > > - Generate CTS >= 5.5689 output
> > > > >
> > > > > Unfortunately, you'll need to define separate jobs for each threshold you'd
> > > > > like to use.  Although, you shouldn't use >=0.0 since that's always true.
> > > > >
> > > > > Also unfortunately, this is pretty slow.  On my machine, it took like 18
> > > > > minutes for these 5 jobs!
> > > > >
> > > > > Thanks,
> > > > > John
> > > > >
> > > > >
> > > > > On Tue, Apr 24, 2018 at 2:09 PM, Rosalyn MacCracken - NOAA Affiliate via RT
> > > > > <met_help at ucar.edu> wrote:
> > > > >
> > > > > >
> > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > >
> > > > > > Hi John,
> > > > > >
> > > > > > I put my file on the ftp site.  Let me know what you find.  You'll see
> > > > > > those really low OBS values (0.01, 0.02, and so on).
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > Roz
> > > > > >
> > > > > > On Tue, Apr 24, 2018 at 2:53 PM, Rosalyn MacCracken - NOAA Affiliate <
> > > > > > rosalyn.maccracken at noaa.gov> wrote:
> > > > > >
> > > > > > > Ok, I'll get that over to the ftp site.  I have to make sure that I find a
> > > > > > > day that has all the data in it.  Sometimes the data isn't available when
> > > > > > > the script runs.  A little annoying, but, that's operations...
> > > > > > >
> > > > > > > I'll let you know when I get the file to the ftp site.
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > > > Roz
> > > > > > >
> > > > > > > On Tue, Apr 24, 2018 at 2:49 PM, John Halley Gotway via RT <
> > > > > > > met_help at ucar.edu> wrote:
> > > > > > >
> > > > > > >> Roz,
> > > > > > >>
> > > > > > >> Yes, we do.  Follow the instructions here:
> > > > > > >> https://dtcenter.org/met/users/support/met_help.php#ftp
> > > > > > >>
> > > > > > >> I'd suggest making a tar file for one day and posting it to the ftp
> > > > > > >> site:
> > > > > > >>    tar -cvzf sample.tar.gz /GFS/data/hourly/20180305*
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >> John
> > > > > > >>
> > > > > > >> On Tue, Apr 24, 2018 at 11:57 AM, Rosalyn MacCracken - NOAA Affiliate via
> > > > > > >> RT <met_help at ucar.edu> wrote:
> > > > > > >>
> > > > > > >> >
> > > > > > >> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > >> >
> > > > > > >> > Hi John,
> > > > > > >> >
> > > > > > >> > Yes, it does seem that the -config option is the way to go to recreate
> > > > > > >> > those 3 files.  I'll be sure to have a unique file name, or, mv the
> > > > > > >> > output file to a different name before running the command again.
> > > > > > >> > Thanks for pointing that out.
> > > > > > >> >
> > > > > > >> > I'm teleworking for the next couple of weeks, so, I can't download and
> > > > > > >> > send you *.stat files like I can when I'm at my computer at work.  I
> > > > > > >> > don't have access to theia or wcoss anymore.  You have an ftp server
> > > > > > >> > that I can upload data to, right?  If not, I can try and fiddle around
> > > > > > >> > with this tomorrow and see if I can't get this to work the way I want to.
> > > > > > >> >
> > > > > > >> > Roz
> > > > > > >> >
> > > > > > >> > On Tue, Apr 24, 2018 at 11:42 AM, John Halley Gotway via RT <
> > > > > > >> > met_help at ucar.edu> wrote:
> > > > > > >> >
> > > > > > >> > > Roz,
> > > > > > >> > >
> > > > > > >> > > Each "-job aggregate_stat" only generates a single output line type.  So
> > > > > > >> > > using "-out_line_type CTC,CTS,CNT" will not work.
> > > > > > >> > >
> > > > > > >> > > You'll need to run separate jobs for each output line type you want to
> > > > > > >> > > generate.  That's why I'd recommend grouping those multiple jobs together
> > > > > > >> > > into a single STAT-Analysis config file.  Then you'd call STAT-Analysis
> > > > > > >> > > once using the "-config" command line option.
> > > > > > >> > >
> > > > > > >> > > Another issue is that if you set "-out_stat" to the same filename, it'll
> > > > > > >> > > get overwritten by each job.  STAT-Analysis will overwrite that output
> > > > > > >> > > file rather than appending to it.
> > > > > > >> > >
> > > > > > >> > > You could send me a day's worth of .stat output files
> > > > > > >> > > (/GFS/data/hourly/20180305*) and I could send you some suggestions.  Or if
> > > > > > >> > > you have access to theia you could copy them up there and point me to it.
> > > > > > >> > >
> > > > > > >> > > Thanks,
> > > > > > >> > > John
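
Putting that advice together, a loop over the hourly directories could issue
one job per output line type and give each job its own output file.  The
sketch below only prints the commands (a dry run) so the generated names can
be checked before anything is executed.  The option names are the ones that
appear in this thread; the build_job helper and the output file naming scheme
are assumptions made for illustration.  The CTC and CTS jobs would also need
a categorical threshold, which is not shown in the command-line examples
here, so it is left out.  Grouping the jobs in a config file, as recommended
above, remains the more efficient route.

```shell
# Header columns used to split the output, taken from the thread.
by_cols="MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS"

# Hypothetical helper: print one stat_analysis command.
#   $1 = input directory, $2 = output line type, $3 = output directory
# The output file name includes the input directory name and the line
# type, so no job overwrites another job's -out_stat file.
build_job() {
  printf 'stat_analysis -lookin %s -job aggregate_stat -line_type MPR' "$1"
  printf ' -out_line_type %s -fcst_var WIND -column_thresh OBS gt1' "$2"
  printf ' -by %s' "$by_cols"
  printf ' -out_stat %s/%s_MPR_to_%s.stat\n' "$3" "$(basename "$1")" "$2"
}

# Dry run: one command per day, cycle, and output line type.
for day in 11 12; do
  for cycle in 00 06 12 18; do
    in_dir="/GFS/data/hourly/201803${day}${cycle}"
    for line_type in CTC CTS CNT; do
      build_job "$in_dir" "$line_type" "/new_rerun_stat_files"
    done
  done
done
```

Once the printed commands look right, the printf calls could be replaced by
an actual invocation.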
> > > > > > >> > >
> > > > > > >> > > On Tue, Apr 24, 2018 at 7:48 AM, Rosalyn MacCracken - NOAA Affiliate via RT
> > > > > > >> > > <met_help at ucar.edu> wrote:
> > > > > > >> > >
> > > > > > >> > > >
> > > > > > >> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > >> > > >
> > > > > > >> > > > Hi John,
> > > > > > >> > > >
> > > > > > >> > > > Yes, that makes sense.  Those very small values (<1.0 m/s), are bad
> > > > > > >> > > > values.  That's why they shouldn't be included in the processing.
> > > > > > >> > > >
> > > > > > >> > > > So, I need to just regenerate hourly data, one hour at a time.  Would it
> > > > > > >> > > > make sense to use a shell script and loop stat-analysis?  Something like:
> > > > > > >> > > >
> > > > > > >> > > > for day in 11 12
> > > > > > >> > > > do
> > > > > > >> > > >   for cycle in 00 06 12 18
> > > > > > >> > > >   do
> > > > > > >> > > >     stat_analysis -lookin /GFS/data/hourly/201803${day}${cycle}/*.stat \
> > > > > > >> > > >       -job aggregate_stat \
> > > > > > >> > > >       -line_type MPR \
> > > > > > >> > > >       -out_line_type CTC,CTS,CNT \
> > > > > > >> > > >       -fcst_var WIND \
> > > > > > >> > > >       -column_thresh OBS gt1 \
> > > > > > >> > > >       -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > > > >> > > >       -out_stat /new_rerun_stat_files/MPR_to_CTC_CTS_CNT.stat
> > > > > > >> > > >   done
> > > > > > >> > > > done
> > > > > > >> > > >
> > > > > > >> > > > or, something like that?  And, will this regenerate hour forecasts, at
> > > > > > >> > > > each forecast and lead hour?  I guess it will see the forecast and lead
> > > > > > >> > > > hour from the *.stat file, and whatever *stat file is in the directory,
> > > > > > >> > > > it will regenerate those hours, right?
> > > > > > >> > > >
> > > > > > >> > > > So, I need to regenerate the CTC, CNT and CTS files.  That's why I did:
> > > > > > >> > > >  -out_line_type CTC,CTS,CNT
> > > > > > >> > > > but, will that make 3 separate files, or just another *.stat file?
> > > > > > >> > > >
> > > > > > >> > > > Roz
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > > > On Mon, Apr 23, 2018 at 4:01 PM, John Halley Gotway via RT <
> > > > > > >> > > > met_help at ucar.edu> wrote:
> > > > > > >> > > >
> > > > > > >> > > > > Roz,
> > > > > > >> > > > >
> > > > > > >> > > > > It is ultimately up to you to decide which matched pairs you want to
> > > > > > >> > > > > include in your processing.  Do you consider those small (<1.0 m/s)
> > > > > > >> > > > > observation values to be corrupt and incorrect in some way or just not
> > > > > > >> > > > > very interesting?  If they really are BAD data values, I agree that you
> > > > > > >> > > > > should exclude them from your analysis.  But if they're just
> > > > > > >> > > > > uninteresting values of low wind speed, then there's no reason why you
> > > > > > >> > > > > should exclude them.  For example, *most* of the time it isn't raining,
> > > > > > >> > > > > but we often include observations of 0 precip.
> > > > > > >> > > > >
> > > > > > >> > > > > There are three configurable options in Point-Stat that may be useful
> > > > > > >> > > > > here:
> > > > > > >> > > > > (1) You already know and use the "cat_thresh" option.  This threshold
> > > > > > >> > > > > defines the events and non-events for a 2x2 contingency table.  This
> > > > > > >> > > > > threshold affects the contents of FHO, CTC, CTS, MCTC, and MCTS line
> > > > > > >> > > > > types that Point-Stat writes.
> > > > > > >> > > > > (2) The "cnt_thresh" option is a more recent addition.  Perhaps this was
> > > > > > >> > > > > a poor name choice, but instead of defining categories, it's really a
> > > > > > >> > > > > *filtering* threshold.  This threshold affects the contents of the
> > > > > > >> > > > > SL1L2, SAL1L2, and CNT line types that Point-Stat writes.  For example,
> > > > > > >> > > > > setting "cnt_thresh = [ ge6, ge17 ];" will produce 2 CNT and 2 SL1L2
> > > > > > >> > > > > output lines containing only those points where the wind speed was >=6
> > > > > > >> > > > > and >=17, respectively.
> > > > > > >> > > > > (3) The "wind_thresh" option is very similar to the "cnt_thresh" option
> > > > > > >> > > > > but affects the contents of the VL1L2, VAL1L2, and VCNT (new in met-7.0)
> > > > > > >> > > > > line types.  Only those U/V pairs that meet the specified wind speed
> > > > > > >> > > > > threshold are included in the output.
> > > > > > >> > > > >
> > > > > > >> > > > > For both "cnt_thresh" and "wind_thresh", the default value in the config
> > > > > > >> > > > > file is "NA", meaning, do not apply any filtering threshold criteria.
> > > > > > >> > > > >
> > > > > > >> > > > > You have the flexibility to run STAT-Analysis on the MPR output lines to
> > > > > > >> > > > > recompute any of these output line types applying whatever filtering
> > > > > > >> > > > > criteria you'd like.
> > > > > > >> > > > > Here's the MET user's guide:
> > > > > > >> > > > > https://dtcenter.org/met/users/docs/users_guide/MET_Users_Guide_v7.0.pdf
> > > > > > >> > > > > Look on page 98 for the job command options for the "aggregate_stat"
> > > > > > >> > > > > line type when the input line type is "MPR".
> > > > > > >> > > > >
> > > > > > >> > > > > For your second question, the "-lookin PATH" option is *VERY* flexible.
> > > > > > >> > > > > You can set PATH to either a single value or multiple values.  If you
> > > > > > >> > > > > use wildcards, then the shell expands those wildcards to multiple
> > > > > > >> > > > > values.  Each value you pass in can either be a filename or a directory
> > > > > > >> > > > > name.  If you pass in a filename, STAT-Analysis will read it
> > > > > > >> > > > > *REGARDLESS* of the file extension.  If you pass in a directory name,
> > > > > > >> > > > > STAT-Analysis will search that directory *RECURSIVELY* for files ending
> > > > > > >> > > > > in ".stat".  For example, either of the following settings would tell
> > > > > > >> > > > > STAT-Analysis to read the same list of files:
> > > > > > >> > > > >    -lookin /GFS/data/hourly/*/*.stat
> > > > > > >> > > > >    ... or ...
> > > > > > >> > > > >    -lookin /GFS/data/hourly
> > > > > > >> > > > >
> > > > > > >> > > > > Be aware though that the more data you pass to STAT-Analysis, the longer
> > > > > > >> > > > > it'll take for it to process it.  You can decide how much data you pass
> > > > > > >> > > > > it for each job.  I'd suggest starting with what is most convenient for
> > > > > > >> > > > > you.  If it's too slow, change the logic to pass it less data (e.g.
> > > > > > >> > > > > only 1 day of data rather than 1 month of data).
> > > > > > >> > > > >
> > > > > > >> > > > > Yes, you can give it a date range.  Use -fcst_init_beg and
> > > > > > >> > > > > -fcst_init_end to specify beginning/ending model initialization times or
> > > > > > >> > > > > -fcst_valid_beg and -fcst_valid_end to specify beginning/ending valid
> > > > > > >> > > > > times.
> > > > > > >> > > > >
> > > > > > >> > > > > If you find that you're running multiple jobs on the same subset of data
> > > > > > >> > > > > (e.g. process MPR to CNT, MPR to SL1L2, MPR to CTC, MPR to CTS), it'd be
> > > > > > >> > > > > more efficient to group those jobs into a config file.  That'll do the
> > > > > > >> > > > > filtering ONCE and write the filtered data to a temp file.  Then all the
> > > > > > >> > > > > jobs read data from the temp file instead of starting over from scratch.
> > > > > > >> > > > >
> > > > > > >> > > > > Make sense?
> > > > > > >> > > > >
> > > > > > >> > > > > John
> > > > > > >> > > > >
> > > > > > >> > > > >
> > > > > > >> > > > >
> > > > > > >> > > > > On Mon, Apr 23, 2018 at 1:01 PM, Rosalyn MacCracken - NOAA Affiliate via RT
> > > > > > >> > > > > <met_help at ucar.edu> wrote:
> > > > > > >> > > > >
> > > > > > >> > > > > >
> > > > > > >> > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > >> > > > > >
> > > > > > >> > > > > > Hi John,
> > > > > > >> > > > > >
> > > > > > >> > > > > > That's actually only partially correct.  It's not that I want to use
> > > > > > >> > > > > > part of the MPR lines and discard the rest, and I do need to regenerate
> > > > > > >> > > > > > statistics.  Let me try to re-explain.
> > > > > > >> > > > > >
> > > > > > >> > > > > > Back in early March we switched from getting our ASCAT obs from the
> > > > > > >> > > > > > prepbufr data, to getting it from the MGDRLITE data.  So, processing
> > > > > > >> > > > > > didn't change.  I was producing statistics at certain threshold levels
> > > > > > >> > > > > > for both GFS and ASCAT.  I had this set with the cat_thresh list, at
> > > > > > >> > > > > > levels of 0,6,17, etc.  We found out after processing for a couple of
> > > > > > >> > > > > > weeks that the ASCAT data included these really small values, <1.0 m/s,
> > > > > > >> > > > > > and that these small wind speeds were being included into the
> > > > > > >> > > > > > statistics processing.
> > > > > > >> > > > > >
> > > > > > >> > > > > > So, a couple of questions.
> > > > > > >> > > > > > 1) Do I have to regenerate all of my statistics (*.cts, *.cnt and *ctc
> > > > > > >> > > > > > files) because of this error?  Or, since I have threshold levels set,
> > > > > > >> > > > > > will those small values be among the statistics in the lowest
> > > > > > >> > > > > > thresholds?
> > > > > > >> > > > > > 2) I have the *.stat files, but, they are spread out into separate
> > > > > > >> > > > > > directories like:
> > > > > > >> > > > > > /GFS/data/hourly/${YYYYMMDDHH}/*.stat
> > > > > > >> > > > > > Can I tell stat-analysis to "lookin" directories with a wildcard (like
> > > > > > >> > > > > > 201803*)?  If so, how?  Or, if I tell it to look in /GFS/data/hourly,
> > > > > > >> > > > > > will it look in all the directories recursively under hourly?  And, if
> > > > > > >> > > > > > that's the case, can I give it a date range, so, that it only processes
> > > > > > >> > > > > > data from March?
> > > > > > >> > > > > >
> > > > > > >> > > > > > Roz
> > > > > > >> > > > > >
> > > > > > >> > > > > > On Mon, Apr 23, 2018 at 2:18 PM, John Halley Gotway via RT <
> > > > > > >> > > > > > met_help at ucar.edu> wrote:
> > > > > > >> > > > > >
> > > > > > >> > > > > > > Hi Roz,
> > > > > > >> > > > > > >
> > > > > > >> > > > > > > I read that you've run Point-Stat and saved off the matched pairs (MPR)
> > > > > > >> > > > > > > output line type.  And you'd like to (1) filter those MPR lines to
> > > > > > >> > > > > > > discard some of them and then (2) use the filtered data to regenerate
> > > > > > >> > > > > > > summary statistics.  Yes, this is easily done using the STAT-Analysis
> > > > > > >> > > > > > > tool in MET.
> > > > > > >> > > > > > >
> > > > > > >> > > > > > > You wrote that you're verifying wind speeds against ASCAT and that
> > > > > > >> > > > > > > you'd like to exclude pairs where the observed wind speed is less than
> > > > > > >> > > > > > > 1 m/s.  I'm just guessing here, but I'll presume that you want to
> > > > > > >> > > > > > > produce both SL1L2 and CNT output line types.  Here's what the
> > > > > > >> > > > > > > STAT-Analysis job would look like:
> > > > > > >> > > > > > >
> > > > > > >> > > > > > > # Filter MPR's and write SL1L2 output line
> > > > > > >> > > > > > > stat_analysis \
> > > > > > >> > > > > > >    -lookin input.stat \
> > > > > > >> > > > > > >    -job aggregate_stat \
> > > > > > >> > > > > > >    -line_type MPR \
> > > > > > >> > > > > > >    -out_line_type SL1L2 \
> > > > > > >> > > > > > >    -fcst_var WIND \
> > > > > > >> > > > > > >    -column_thresh OBS gt1 \
> > > > > > >> > > > > > >    -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > > > >> > > > > > >    -out_stat MPR_to_SL1L2.stat
> > > > > > >> > > > > > >
> > > > > > >> > > > > > > where:
> > > > > > >> > > > > > >    -lookin         List a .stat filename or directory containing them
> > > > > > >> > > > > > >    -job            Job type is aggregate_stat
> > > > > > >> > > > > > >    -line_type      Input line type = MPR
> > > > > > >> > > > > > >    -out_line_type  Output line type = SL1L2 partial sums
> > > > > > >> > > > > > >    -fcst_var       Only process lines where FCST_VAR column = WIND
> > > > > > >> > > > > > >    -column_thresh  Only use MPR lines where OBS column > 1
> > > > > > >> > > > > > >    -by             Run this same job for each unique combination of
> > > > > > >> > > > > > >                    these columns
> > > > > > >> > > > > > >
> > > > > > >> > > > > > > This will produce an output .stat file containing an SL1L2 line for
> > > > > > >> > > > > > > each unique combination of the header columns listed after the "-by"
> > > > > > >> > > > > > > option.  To generate CNT output lines instead, you'd run a second job
> > > > > > >> > > > > > > where you replace SL1L2 with CNT.  You could run these jobs on the
> > > > > > >> > > > > > > command line or group them together into a STAT-Analysis config file,
> > > > > > >> > > > > > > if you prefer.  Both would work.
> > > > > > >> > > > > > >
> > > > > > >> > > > > > > You could run this once for each input .stat file you're processing...
> > > > > > >> > > > > > > or you could pass many input .stat files to the job.  Since
> > > > > > >> > > > > > > FCST_INIT_BEG and FCST_LEAD are included in the "-by" option, you'll
> > > > > > >> > > > > > > get separate output lines for each unique time.
> > > > > > >> > > > > > >
> > > > > > >> > > > > > > Hope that helps get you going.
> > > > > > >> > > > > > >
> > > > > > >> > > > > > > Thanks,
> > > > > > >> > > > > > > John
> > > > > > >> > > > > > >
> > > > > > >> > > > > > >
> > > > > > >> > > > > > > On Thu, Apr 19, 2018 at 9:23 AM, Julie Prestopnik via RT <
> > > > > > >> > > > > > > met_help at ucar.edu> wrote:
> > > > > > >> > > > > > >
> > > > > > >> > > > > > > >
> > > > > > >> > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > >> > > > > > > >
> > > > > > >> > > > > > > > Hi Roz.  My apologies for the delay in responding.
> > > > > > >> > > > > > > >
> > > > > > >> > > > > > > > Unfortunately, John is out of the office this week, and I do not know
> > > > > > >> > > > > > > > the answers to your questions.  As you said, I would also imagine that
> > > > > > >> > > > > > > > point-stat is using those small values as matched pairs.  Also, I do
> > > > > > >> > > > > > > > not believe there is a way to regenerate the point-stat statistics
> > > > > > >> > > > > > > > without using the original GFS data.  I cannot say with certainty,
> > > > > > >> > > > > > > > however.  Thank you for your patience in advance.  We'll get a
> > > > > > >> > > > > > > > definite response to you as soon as we can.
> > > > > > >> > > > > > > >
> > > > > > >> > > > > > > > Thanks,
> > > > > > >> > > > > > > > Julie
> > > > > > >> > > > > > > >
> > > > > > >> > > > > > > > On Wed, Apr 18, 2018 at 6:31 AM, Rosalyn
> > MacCracken
> > > -
> > > > > NOAA
> > > > > > >> > > > Affiliate
> > > > > > >> > > > > > via
> > > > > > >> > > > > > > RT
> > > > > > >> > > > > > > > <met_help at ucar.edu> wrote:
> > > > > > >> > > > > > > >
> > > > > > >> > > > > > > > >
> > > > > > >> > > > > > > > > Wed Apr 18 06:31:39 2018: Request 84822
was
> > acted
> > > > > upon.
> > > > > > >> > > > > > > > > Transaction: Ticket created by
> > > > > > >> rosalyn.maccracken at noaa.gov
> > > > > > >> > > > > > > > >        Queue: met_help
> > > > > > >> > > > > > > > >      Subject: question on regenerating
data
> > > > > > >> > > > > > > > >        Owner: Nobody
> > > > > > >> > > > > > > > >   Requestors:
rosalyn.maccracken at noaa.gov
> > > > > > >> > > > > > > > >       Status: new
> > > > > > >> > > > > > > > >  Ticket <URL:
https://rt.rap.ucar.edu/rt/
> > > > > > >> > > > > > Ticket/Display.html?id=84822
> > > > > > >> > > > > > > >
> > > > > > >> > > > > > > > >
> > > > > > >> > > > > > > > >
> > > > > > >> > > > > > > > > Hi,
> > > > > > >> > > > > > > > >
> > > > > > >> > > > > > > > > I'm running point-stat using ASCAT and
GFS
> data
> > to
> > > > > > verify
> > > > > > >> > > surface
> > > > > > >> > > > > > wind
> > > > > > >> > > > > > > > > speeds.  I found an error in my ASCAT
input
> data
> > > > that
> > > > > > goes
> > > > > > >> > back
> > > > > > >> > > > to
> > > > > > >> > > > > > Mar
> > > > > > >> > > > > > > 7.
> > > > > > >> > > > > > > > > I had switched the input source of the
data,
> and
> > > > > within
> > > > > > >> the
> > > > > > >> > new
> > > > > > >> > > > > data
> > > > > > >> > > > > > > > files,
> > > > > > >> > > > > > > > > it was allowing very small values (< 1
m/s) to
> > be
> > > > used
> > > > > > as
> > > > > > >> > data
> > > > > > >> > > > > points
> > > > > > >> > > > > > > in
> > > > > > >> > > > > > > > > the verification.  I imagine that this
is an
> > > issue,
> > > > > > since
> > > > > > >> > > > > point-stat
> > > > > > >> > > > > > is
> > > > > > >> > > > > > > > > using these very small values as
matched pairs
> > > with
> > > > > the
> > > > > > >> GFS,
> > > > > > >> > > > > correct?
> > > > > > >> > > > > > > > >
> > > > > > >> > > > > > > > > Is there a way to regenerate the point-
stat
> > > > statistics
> > > > > > >> > without
> > > > > > >> > > > > using
> > > > > > >> > > > > > > the
> > > > > > >> > > > > > > > > original GFS data?  I do have the *stat
and
> the
> > > *mpr
> > > > > > >> files,
> > > > > > >> > and
> > > > > > >> > > > it
> > > > > > >> > > > > is
> > > > > > >> > > > > > > > > pretty easy to identify where the bad
values
> are
> > > > > > located.
> > > > > > >> > > > > > > > >
> > > > > > >> > > > > > > > > Thanks,
> > > > > > >> > > > > > > > > Roz
> > > > > > >> > > > > > > > >
> > > > > > >> > > > > > > > > --
> > > > > > >> > > > > > > > > Rosalyn MacCracken
> > > > > > >> > > > > > > > > Support Scientist
> > > > > > >> > > > > > > > >
> > > > > > >> > > > > > > > > Ocean Applications Branch
> > > > > > >> > > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > >> > > > > > > > > NCWCP
> > > > > > >> > > > > > > > > 5830 University Research Ct
> > > > > > >> > > > > > > > > College Park, MD  20740-3818
> > > > > > >> > > > > > > > >
> > > > > > >> > > > > > > > > (p) 301-683-1551
> > > > > > >> > > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > >> > > > > > > > >
> > > > > > >> > > > > > > > >
> > > > > > >> > > > > > > >
> > > > > > >> > > > > > > >
> > > > > > >> > > > > > >
> > > > > > >> > > > > > >
> > > > > > >> > > > > >
> > > > > > >> > > > > >
> > > > > > >> > > > > > --
> > > > > > >> > > > > > Rosalyn MacCracken
> > > > > > >> > > > > > Support Scientist
> > > > > > >> > > > > >
> > > > > > >> > > > > > Ocean Applications Branch
> > > > > > >> > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > >> > > > > > NCWCP
> > > > > > >> > > > > > 5830 University Research Ct
> > > > > > >> > > > > > College Park, MD  20740-3818
> > > > > > >> > > > > >
> > > > > > >> > > > > > (p) 301-683-1551
> > > > > > >> > > > > > rosalyn.maccracken at noaa.gov
> > > > > > >> > > > > >
> > > > > > >> > > > > >
> > > > > > >> > > > >
> > > > > > >> > > > >
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > > > --
> > > > > > >> > > > Rosalyn MacCracken
> > > > > > >> > > > Support Scientist
> > > > > > >> > > >
> > > > > > >> > > > Ocean Applications Branch
> > > > > > >> > > > NOAA/NWS Ocean Prediction Center
> > > > > > >> > > > NCWCP
> > > > > > >> > > > 5830 University Research Ct
> > > > > > >> > > > College Park, MD  20740-3818
> > > > > > >> > > >
> > > > > > >> > > > (p) 301-683-1551
> > > > > > >> > > > rosalyn.maccracken at noaa.gov
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> >
> > > > > > >> >
> > > > > > >> > --
> > > > > > >> > Rosalyn MacCracken
> > > > > > >> > Support Scientist
> > > > > > >> >
> > > > > > >> > Ocean Applications Branch
> > > > > > >> > NOAA/NWS Ocean Prediction Center
> > > > > > >> > NCWCP
> > > > > > >> > 5830 University Research Ct
> > > > > > >> > College Park, MD  20740-3818
> > > > > > >> >
> > > > > > >> > (p) 301-683-1551
> > > > > > >> > rosalyn.maccracken at noaa.gov
> > > > > > >> >
> > > > > > >> >
> > > > > > >>
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Rosalyn MacCracken
> > > > > > > Support Scientist
> > > > > > >
> > > > > > > Ocean Applications Branch
> > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > NCWCP
> > > > > > > 5830 University Research Ct
> > > > > > > College Park, MD  20740-3818
> > > > > > >
> > > > > > > (p) 301-683-1551
> > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Rosalyn MacCracken
> > > > > > Support Scientist
> > > > > >
> > > > > > Ocean Applications Branch
> > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > NCWCP
> > > > > > 5830 University Research Ct
> > > > > > College Park, MD  20740-3818
> > > > > >
> > > > > > (p) 301-683-1551
> > > > > > rosalyn.maccracken at noaa.gov
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Rosalyn MacCracken
> > > > Support Scientist
> > > >
> > > > Ocean Applications Branch
> > > > NOAA/NWS Ocean Prediction Center
> > > > NCWCP
> > > > 5830 University Research Ct
> > > > College Park, MD  20740-3818
> > > >
> > > > (p) 301-683-1551
> > > > rosalyn.maccracken at noaa.gov
> > > >
> > > >
> > >
> > >
> >
> >
> > --
> > Rosalyn MacCracken
> > Support Scientist
> >
> > Ocean Applications Branch
> > NOAA/NWS Ocean Prediction Center
> > NCWCP
> > 5830 University Research Ct
> > College Park, MD  20740-3818
> >
> > (p) 301-683-1551
> > rosalyn.maccracken at noaa.gov
> >
> >
>
>

--
Rosalyn MacCracken
Support Scientist

Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD  20740-3818

(p) 301-683-1551
rosalyn.maccracken at noaa.gov

------------------------------------------------
Subject: question on regenerating data
From: John Halley Gotway
Time: Mon May 07 09:48:39 2018

Roz,

I understand that you're suspicious about the beginning and ending
timestamps in the OBS_VALID_BEG and OBS_VALID_END columns.  You're
comparing the original output from Point-Stat to the output that you're
getting from STAT-Analysis.  However, those timestamps can differ
without there actually being a problem.  Here's why...

When you run Point-Stat, the obs_window setting in the config file
defines the matching time window.  If your forecast is valid at time T,
the matching time window runs from T+obs_window.beg to T+obs_window.end.
The point observations may actually fall anywhere in that time window...
but it's that time window that's reported in the summary line types
(like CTC, CTS, SL1L2, and CNT).  Since the MPR line type is specific to
each observation value, the *actual* timestamp of that observation is
reported in that line.

When you run STAT-Analysis to process those MPR lines, it reads the
OBS_VALID_BEG and OBS_VALID_END columns and keeps track of the minimum
OBS_VALID_BEG timestamp and the maximum OBS_VALID_END timestamp.  When it
writes output CTC, CTS, SL1L2, or CNT lines, it reports the minimum and
maximum timestamp values it found in the data.
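
To make that bookkeeping concrete, here is a minimal shell sketch.  It is
not MET code.  It takes a few observation timestamps (made-up values,
chosen to fall inside the 00z window discussed in this thread) and picks
out the earliest and latest, which is roughly what ends up in the
STAT-Analysis output.  It relies on YYYYMMDD_HHMMSS strings sorting
chronologically.

```shell
#!/bin/sh
# Hypothetical observation timestamps, one per MPR line (illustrative only).
obs_times="20180307_001200
20180307_000000
20180307_002700
20180307_000930"

# YYYYMMDD_HHMMSS strings sort chronologically, so a plain sort is enough.
actual_beg=$(printf '%s\n' "$obs_times" | sort | head -n 1)
actual_end=$(printf '%s\n' "$obs_times" | sort | tail -n 1)

echo "OBS_VALID_BEG=$actual_beg OBS_VALID_END=$actual_end"
```

With these sample values the reported window is narrower than a full
+/- 30 minute request, simply because no observation happened to fall
near the edges.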

So Point-Stat reports the *REQUESTED TIME WINDOW* in the OBS_VALID_BEG
and OBS_VALID_END columns... while STAT-Analysis reports the *ACTUAL TIME
WINDOW*.  And in general, those won't be the same.  So this isn't
necessarily a problem.

If, for consistency, you'd like to explicitly set the OBS_VALID_BEG and
OBS_VALID_END timestamps in the output, you can use the "-set_hdr" job
command option to do so:
   -set_hdr OBS_VALID_BEG 20180307_003000 -set_hdr OBS_VALID_END 20180307_013000
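
On the question of not editing the config file for every hour: one
possible approach is to compute the window in a wrapper script and pass
it through "-set_hdr".  The sketch below is only an illustration under a
few assumptions: GNU date is available, the window is +/- 30 minutes
around the valid hour, and the variable names are invented.  It prints
the options rather than running stat_analysis.

```shell
#!/bin/sh
# Assumptions: GNU date; valid_time and half_window are hypothetical names.
valid_time="2018-03-07 01:00:00"   # forecast valid time T (UTC)
half_window=1800                   # 30 minutes, in seconds

# Convert T to epoch seconds, then step back and forward by the half window.
t_epoch=$(date -u -d "$valid_time" +%s)
obs_beg=$(date -u -d "@$((t_epoch - half_window))" +%Y%m%d_%H%M%S)
obs_end=$(date -u -d "@$((t_epoch + half_window))" +%Y%m%d_%H%M%S)

# Options to append to a STAT-Analysis job; printed here, not executed.
set_hdr_opts="-set_hdr OBS_VALID_BEG $obs_beg -set_hdr OBS_VALID_END $obs_end"
echo "$set_hdr_opts"
```

Going through epoch seconds avoids any ambiguity in how date parses
relative offsets, and it handles the 00z case where the window starts on
the previous day.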

Thanks,
John

On Sun, May 6, 2018 at 2:12 PM, Rosalyn MacCracken - NOAA Affiliate via RT
<met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
>
> Hi John,
>
> Sorry it took me so long to get back to you.  My step-daughter came in
> to town, and I thought that I could get some work done while she was
> here, but, didn't.  Then, I totally forgot to email you back.  Sorry
> for leaving you hanging!
>
> Anyway, I was able to play around with the STATAnalysis config file you
> sent me.  I tried it out with only a 1 hour timestep, instead of all the
> files for one day.  I wanted to see what kind of time it would take to
> process this on my machine.  So, it was quick, 45 seconds.  But, of
> course, your run took 18 minutes.  The script was probably reading 20
> some files.  That makes sense.
>
> So, then, I looked at the output, and it wasn't quite what I expected,
> and doesn't quite match the stats from the other processing.  This is
> what I did:
>
> 1)  I copied the 00z only *20180307*.stat file to a temp directory.
> Before I did this, I looked at the matching *.mpr file, and saw that the
> OBS_VALID_BEG was 20180307_000000 and the OBS_VALID_END was
> 20180307_002700.
> 2)  Ran the run_sa.sh script and generated the CTS, CTC and CNT files.
> 3)  I looked at the new agg_cts file, and the OBS_VALID_BEG and _END
> matched the *.mpr file in step 1.
> 4)  I looked at the original CTS file, and the OBS_VALID_BEG was
> 20180307_223000 and the OBS_VALID_END was 20180307_013000.  So, that was
> our original way of processing.  I bet if I looked at a more recent
> file, it would be more like OBS_VALID_BEG was 20180307_233000 and the
> OBS_VALID_END was 20180307_003000.
> 5)  I looked at the original *mpr for 01z, and the OBS_VALID_BEG was
> 20180307_003000 and the OBS_VALID_END was 20180307_012100.
>
> So, this tells me that I'm not matching observation times, and I'm not
> sure how to fix it to match things up.  First, we use a +/- 30 min
> window for ASCAT obs, centered on the hour.  For example, if we are
> processing the 00z hour, we will match observations from 233000 from the
> day before to 003000 the current day.  Actually, we used to do an hour
> window on either side, but, we have more observations now at each hour.
> (See the explanation in #4 above)
>
> Anyway, how do I create the CTS, CTC and CNT files for the +/- 30 min
> window?  Is there a way to dynamically indicate this 30 min window, so
> that I don't have to go into the config file every time I run
> STAT-Analysis and change it?
>
> Roz
>
> On Thu, Apr 26, 2018 at 4:14 PM, John Halley Gotway via RT <
> met_help at ucar.edu> wrote:
>
> > Roz,
> >
> > The CSI statistic is computed from a 2x2 contingency table.  A 2x2
> > contingency table is defined by a single threshold.  Looking in the
> > .stat files you sent, I see that you've applied many thresholds to
> > generate many 2x2 contingency tables and corresponding statistics.
> > Yes, it is true that for most of those thresholds, the "bad"
> > observation values will fall into the "non-event" category.  But those
> > non-event counts are included in the computation of some stats,
> > including CSI.  So even though the bad observations aren't very
> > interesting, they really are impacting the statistics.
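
A small worked illustration of this point (the counts below are invented,
not taken from the files in this ticket): CSI is hits / (hits + false
alarms + misses).  A bad observation of 0.01 m/s paired with a 10 m/s
forecast, at a >=6 threshold, is a forecast event with an observed
non-event, so it lands in the false-alarm cell and lowers CSI.

```shell
#!/bin/sh
# csi HITS FALSE_ALARMS MISSES -> hits / (hits + false alarms + misses)
csi() {
  awk -v h="$1" -v fa="$2" -v m="$3" 'BEGIN { printf "%.3f", h / (h + fa + m) }'
}

# Hypothetical 2x2 counts for a single threshold (say >=6 m/s).
hits=50; false_alarms=20; misses=10

# Suppose 20 extra pairs have a bad obs (0.01 m/s) while the forecast is
# 10 m/s: forecast >= 6 but obs < 6, so each one counts as a false alarm.
bad_pairs=20

csi_clean=$(csi "$hits" "$false_alarms" "$misses")
csi_bad=$(csi "$hits" "$((false_alarms + bad_pairs))" "$misses")
echo "CSI without bad obs: $csi_clean, with bad obs: $csi_bad"
```

For a threshold where the forecast is also below the threshold, the same
bad pair would instead be a correct negative, which CSI ignores.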
> >
> > John
> >
> > On Wed, Apr 25, 2018 at 10:08 AM, Rosalyn MacCracken - NOAA Affiliate
> > via RT <met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > >
> > > Figures.  I just calculated how long it will take me to regenerate
> > > data for 03072018 - 04122018.  It will take me 912 hours.  ;-(
> > >
> > > Ok, I know I asked this, but, if I had an OBS value of 0.01 and a
> > > matched GFS point of 10 m/s, and I had thresholds of 0-5 m/s, 6-10
> > > m/s and 10-15 m/s, and say, CSI was calculated.  Which threshold
> > > would be used for the output, the 0-5 or 6-10?  And, would the 10-15
> > > threshold even be affected?
> > >
> > > Roz
> > >
> > > On Wed, Apr 25, 2018 at 11:40 AM, John Halley Gotway via RT <
> > > met_help at ucar.edu> wrote:
> > >
> > > > Roz,
> > > >
> > > > I think it'd take just as long.  The slow part is reading the
> > > > data... not applying a threshold.
> > > >
> > > > John
> > > >
> > > > On Wed, Apr 25, 2018 at 9:18 AM, Rosalyn MacCracken - NOAA Affiliate
> > > > via RT <met_help at ucar.edu> wrote:
> > > >
> > > > >
> > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > >
> > > > > Hi John,
> > > > >
> > > > > Thanks for doing that for me.  I'll take a look at the info you
> > > > > sent me this afternoon.  I'm in the middle of doing something
> > > > > right now...trying to make a different program work.  ;-/
> > > > >
> > > > > I wonder if it will be quicker than 18 minutes for some of the
> > > > > thresholds that have higher wind speeds, and not as many instances
> > > > > (or 0 instances).  Or, will it take just as long, since it still
> > > > > needs to read through the entire *.stat file anyway?
> > > > >
> > > > > Roz
> > > > >
> > > > > On Tue, Apr 24, 2018 at 7:06 PM, John Halley Gotway via RT <
> > > > > met_help at ucar.edu> wrote:
> > > > >
> > > > > > Hi Roz,
> > > > > >
> > > > > > Thanks for sending the sample data.  I grabbed it and used it to
> > > > > > run some sample jobs:
> > > > > >
> > > > > > time /d1/johnhg/MET/MET_releases/met-6.0/bin/stat_analysis \
> > > > > > -lookin /d1/johnhg/MET/MET_Help/maccracken_data_20180424/opc_test/home/opc_test/data/met_verif/GFS/data/hourly \
> > > > > > -config STATAnalysisConfig \
> > > > > > -log run_sa.log -v 3
> > > > > >
> > > > > > I used the "-lookin" option to point to all the data you sent.
> > > > > >
> > > > > > I've attached the...
> > > > > > (1) config file I used
> > > > > > (2) log file that was generated
> > > > > > (3) output .stat files
> > > > > >
> > > > > > Looking at the jobs, you'll see that I've included 5 of them...
> > > > > > - Generate CNT output
> > > > > > - Generate CTC >= 0.0 output
> > > > > > - Generate CTS >= 0.0 output
> > > > > > - Generate CTC >= 5.5689 output
> > > > > > - Generate CTS >= 5.5689 output
> > > > > >
> > > > > > Unfortunately, you'll need to define separate jobs for each
> > > > > > threshold you'd like to use.  Although, you shouldn't use >=0.0
> > > > > > since that's always true.
> > > > > >
> > > > > > Also unfortunately, this is pretty slow.  On my machine, it took
> > > > > > like 18 minutes for these 5 jobs!
> > > > > >
> > > > > > Thanks,
> > > > > > John
> > > > > >
> > > > > >
> > > > > > On Tue, Apr 24, 2018 at 2:09 PM, Rosalyn MacCracken - NOAA
> > > > > > Affiliate via RT <met_help at ucar.edu> wrote:
> > > > > >
> > > > > > >
> > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > >
> > > > > > > Hi John,
> > > > > > >
> > > > > > > I put my file on the ftp site.  Let me know what you find.
> > > > > > > You'll see those really low OBS values (0.01, 0.02, and so on).
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > > > Roz
> > > > > > >
> > > > > > > On Tue, Apr 24, 2018 at 2:53 PM, Rosalyn MacCracken - NOAA
> > > > > > > Affiliate <rosalyn.maccracken at noaa.gov> wrote:
> > > > > > >
> > > > > > > > Ok, I'll get that over to the ftp site.  I have to make sure
> > > > > > > > that I find a day that has all the data in it.  Sometimes the
> > > > > > > > data isn't available when the script runs.  A little annoying,
> > > > > > > > but, that's operations...
> > > > > > > >
> > > > > > > > I'll let you know when I get the file to the ftp site.
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > > Roz
> > > > > > > >
> > > > > > > > On Tue, Apr 24, 2018 at 2:49 PM, John Halley Gotway via RT
> > > > > > > > <met_help at ucar.edu> wrote:
> > > > > > > >
> > > > > > > >> Roz,
> > > > > > > >>
> > > > > > > > >> Yes, we do.  Follow the instructions here:
> > > > > > > > >> https://dtcenter.org/met/users/support/met_help.php#ftp
> > > > > > > > >>
> > > > > > > > >> I'd suggest making a tar file for one day and posting it to
> > > > > > > > >> the ftp site:
> > > > > > > > >>    tar -cvzf sample.tar.gz /GFS/data/hourly/20180305*
> > > > > > > >>
> > > > > > > >> Thanks,
> > > > > > > >> John
> > > > > > > >>
> > > > > > > > >> On Tue, Apr 24, 2018 at 11:57 AM, Rosalyn MacCracken - NOAA
> > > > > > > > >> Affiliate via RT <met_help at ucar.edu> wrote:
> > > > > > > >>
> > > > > > > >> >
> > > > > > > > >> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > >> >
> > > > > > > >> > HI John,
> > > > > > > >> >
> > > > > > > > >> > Yes, it does seem that the -config option is the way to go
> > > > > > > > >> > to recreate those 3 files.  I'll be sure to have a unique
> > > > > > > > >> > file name, or, mv the output file to a different name
> > > > > > > > >> > before running the command again.  Thanks for pointing that
> > > > > > > > >> > out.
> > > > > > > >> >
> > > > > > > > >> > I'm teleworking for the next couple of weeks, so I can't
> > > > > > > > >> > download and send you *.stat files like I can when I'm at
> > > > > > > > >> > my computer at work.  I don't have access to theia or wcoss
> > > > > > > > >> > anymore.  You have an ftp server that I can upload data to,
> > > > > > > > >> > right?  If not, I can try and fiddle around with this
> > > > > > > > >> > tomorrow and see if I can't get this to work the way I want
> > > > > > > > >> > to.
> > > > > > > >> >
> > > > > > > >> > Roz
> > > > > > > >> >
> > > > > > > > >> > On Tue, Apr 24, 2018 at 11:42 AM, John Halley Gotway via RT
> > > > > > > > >> > <met_help at ucar.edu> wrote:
> > > > > > > >> >
> > > > > > > >> > > Roz,
> > > > > > > >> > >
> > > > > > > > >> > > Each "-job aggregate_stat" only generates a single output
> > > > > > > > >> > > line type.  So using "-out_line_type CTC,CTS,CNT" will not
> > > > > > > > >> > > work.
> > > > > > > > >> > >
> > > > > > > > >> > > You'll need to run separate jobs for each output line type
> > > > > > > > >> > > you want to generate.  That's why I'd recommend grouping
> > > > > > > > >> > > those multiple jobs together into a single STAT-Analysis
> > > > > > > > >> > > config file.  Then you'd call STAT-Analysis once using the
> > > > > > > > >> > > "-config" command line option.
> > > > > > > >> > >
> > > > > > > > >> > > Another issue is that if you set "-out_stat" to the same
> > > > > > > > >> > > filename, it'll get overwritten by each job.  STAT-Analysis
> > > > > > > > >> > > will overwrite that output file rather than appending to
> > > > > > > > >> > > it.
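
One way to respect both points (one output line type per job, and a
distinct "-out_stat" file per job) is to generate the job commands in a
loop.  This is a rough sketch, not a tested MET workflow: the paths, file
names, and thresholds are placeholders, the "-out_fcst_thresh" and
"-out_obs_thresh" option names should be checked against the user's guide
for your MET version, and the commands are only printed, not run.

```shell
#!/bin/sh
# Placeholder paths and names; adjust for your own setup.
lookin="/GFS/data/hourly/20180307"
outdir="/new_rerun_stat_files"
common="-job aggregate_stat -line_type MPR -fcst_var WIND -column_thresh OBS gt1"

jobs=""
# CNT needs no categorical threshold: one job, one output file.
jobs="$jobs
stat_analysis -lookin $lookin $common -out_line_type CNT -out_stat $outdir/MPR_to_CNT.stat"

# CTC and CTS need one job per line type AND per threshold.
for type in CTC CTS; do
  for thresh in ge6 ge17; do
    jobs="$jobs
stat_analysis -lookin $lookin $common -out_line_type $type -out_fcst_thresh $thresh -out_obs_thresh $thresh -out_stat $outdir/MPR_to_${type}_${thresh}.stat"
  done
done

# Drop the leading empty line and print the commands (dry run).
jobs=$(printf '%s\n' "$jobs" | sed '/^$/d')
printf '%s\n' "$jobs"
n_jobs=$(printf '%s\n' "$jobs" | wc -l | tr -d ' ')
```

The same job strings could instead be pasted into the jobs list of a
STAT-Analysis config file so the input data is only filtered once.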
> > > > > > > >> > >
> > > > > > > > >> > > You could send me a day's worth of .stat output files
> > > > > > > > >> > > (/GFS/data/hourly/20180305*) and I could send you some
> > > > > > > > >> > > suggestions.  Or if you have access to theia you could copy
> > > > > > > > >> > > them up there and point me to it.
> > > > > > > >> > >
> > > > > > > >> > > Thanks,
> > > > > > > >> > > John
> > > > > > > >> > >
> > > > > > > > >> > > On Tue, Apr 24, 2018 at 7:48 AM, Rosalyn MacCracken - NOAA
> > > > > > > > >> > > Affiliate via RT <met_help at ucar.edu> wrote:
> > > > > > > >> > >
> > > > > > > >> > > >
> > > > > > > > >> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > >> > > >
> > > > > > > >> > > > Hi John,
> > > > > > > >> > > >
> > > > > > > > >> > > > Yes, that makes sense.  Those very small values (<1.0
> > > > > > > > >> > > > m/s), are bad values.  That's why they shouldn't be
> > > > > > > > >> > > > included in the processing.
> > > > > > > > >> > > >
> > > > > > > > >> > > > So, I need to just regenerate hourly data, one hour at a
> > > > > > > > >> > > > time.  Would it make sense to use a shell script and loop
> > > > > > > > >> > > > stat-analysis?  Something like:
> > > > > > > >> > > >
> > > > > > > > >> > > > for day in 11 12
> > > > > > > > >> > > > do
> > > > > > > > >> > > >   for cycle in 00 06 12 18
> > > > > > > > >> > > >   do
> > > > > > > > >> > > > stat_analysis -lookin /GFS/data/hourly/201803${day}${cycle}/*.stat \
> > > > > > > > >> > > > -job aggregate_stat \
> > > > > > > > >> > > >    -line_type MPR \
> > > > > > > > >> > > >    -out_line_type CTC,CTS,CNT \
> > > > > > > > >> > > >   -fcst_var WIND \
> > > > > > > > >> > > > -column_thresh OBS gt1 \
> > > > > > > > >> > > >  -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > > > > > >> > > > -out_stat /new_rerun_stat_files/MPR_to_CTC_CTS_CNT.stat
> > > > > > > > >> > > >   done
> > > > > > > > >> > > > done
> > > > > > > >> > > >
> > > > > > > > >> > > > or, something like that?  And, will this regenerate hour
> > > > > > > > >> > > > forecasts, at each forecast and lead hour?  I guess it
> > > > > > > > >> > > > will see the forecast and lead hour from the *.stat file,
> > > > > > > > >> > > > and whatever *stat file is in the directory, it will
> > > > > > > > >> > > > regenerate those hours, right?
> > > > > > > > >> > > >
> > > > > > > > >> > > > So, I need to regenerate the CTC, CNT and CTS files.
> > > > > > > > >> > > > That's why I did:
> > > > > > > > >> > > >  -out_line_type CTC,CTS,CNT
> > > > > > > > >> > > > but, will that make 3 separate files, or just another
> > > > > > > > >> > > > *.stat file?
> > > > > > > >> > > >
> > > > > > > >> > > > Roz
> > > > > > > >> > > >
> > > > > > > >> > > >
> > > > > > > > >> > > > On Mon, Apr 23, 2018 at 4:01 PM, John Halley Gotway via RT
> > > > > > > > >> > > > <met_help at ucar.edu> wrote:
> > > > > > > >> > > >
> > > > > > > >> > > > > Roz,
> > > > > > > >> > > > >
> > > > > > > > >> > > > > It is ultimately up to you to decide which matched pairs
> > > > > > > > >> > > > > you want to include in your processing.  Do you consider
> > > > > > > > >> > > > > those small (<1.0 m/s) observation values to be corrupt
> > > > > > > > >> > > > > and incorrect in some way or just not very interesting?
> > > > > > > > >> > > > > If they really are BAD data values, I agree that you
> > > > > > > > >> > > > > should exclude them from your analysis.  But if they're
> > > > > > > > >> > > > > just uninteresting values of low wind speed, then there's
> > > > > > > > >> > > > > no reason why you should exclude them.  For example,
> > > > > > > > >> > > > > *most* of the time it isn't raining, but we often include
> > > > > > > > >> > > > > observations of 0 precip.
> > > > > > > >> > > > >
> > > > > > > > >> > > > > There are three configurable options in Point-Stat that
> > > > > > > > >> > > > > may be useful here:
> > > > > > > > >> > > > > (1) You already know and use the "cat_thresh" option.
> > > > > > > > >> > > > > This threshold defines the events and non-events for a
> > > > > > > > >> > > > > 2x2 contingency table.  This threshold affects the
> > > > > > > > >> > > > > contents of the FHO, CTC, CTS, MCTC, and MCTS line types
> > > > > > > > >> > > > > that Point-Stat writes.
> > > > > > > > >> > > > > (2) The "cnt_thresh" option is a more recent addition.
> > > > > > > > >> > > > > Perhaps this was a poor name choice, but instead of
> > > > > > > > >> > > > > defining categories, it's really a *filtering* threshold.
> > > > > > > > >> > > > > This threshold affects the contents of the SL1L2, SAL1L2,
> > > > > > > > >> > > > > and CNT line types that Point-Stat writes.  For example,
> > > > > > > > >> > > > > setting "cnt_thresh = [ ge6, ge17 ];" will produce 2 CNT
> > > > > > > > >> > > > > and 2 SL1L2 output lines containing only those points
> > > > > > > > >> > > > > where the wind speed was >=6 and >=17, respectively.
> > > > > > > > >> > > > > (3) The "wind_thresh" option is very similar to the
> > > > > > > > >> > > > > "cnt_thresh" option but affects the contents of the
> > > > > > > > >> > > > > VL1L2, VAL1L2, and VCNT (new in met-7.0) line types.
> > > > > > > > >> > > > > Only those U/V pairs that meet the specified wind speed
> > > > > > > > >> > > > > threshold are included in the output.
> > > > > > > >> > > > >
> > > > > > > > >> > > > > For both "cnt_thresh" and "wind_thresh", the default
> > > > > > > > >> > > > > value in the config file is "NA", meaning, do not apply
> > > > > > > > >> > > > > any filtering threshold criteria.
> > > > > > > >> > > > >
> > > > > > > > >> > > > > You have the flexibility to run STAT-Analysis on the MPR
> > > > > > > > >> > > > > output lines to recompute any of these output line types
> > > > > > > > >> > > > > applying whatever filtering criteria you'd like.
> > > > > > > > >> > > > > Here's the MET user's guide:
> > > > > > > > >> > > > > https://dtcenter.org/met/users/docs/users_guide/MET_Users_Guide_v7.0.pdf
> > > > > > > > >> > > > > Look on page 98 for the job command options for the
> > > > > > > > >> > > > > "aggregate_stat" job type when the input line type is
> > > > > > > > >> > > > > "MPR".
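
To see why that filtering matters, here is a toy illustration with
made-up forecast/observation pairs (not data from this ticket, and not
the MPR file format).  It computes the mean error (forecast minus
observation) over all pairs and then only over pairs with OBS > 1, which
is the effect a filter such as "-column_thresh OBS gt1" is meant to have.

```shell
#!/bin/sh
# Made-up matched pairs: "forecast observation" in m/s (illustrative only).
pairs="10.0 0.01
8.0 7.0
12.0 11.0
9.0 0.02
6.0 6.0"

# mean_error MIN_OBS: mean of (fcst - obs) over pairs with obs > MIN_OBS
mean_error() {
  printf '%s\n' "$pairs" | awk -v min="$1" \
    '$2 > min { sum += $1 - $2; n++ } END { printf "%.2f", sum / n }'
}

me_all=$(mean_error -1)      # keep every pair
me_filtered=$(mean_error 1)  # keep only pairs with obs > 1
echo "ME (all pairs): $me_all   ME (obs > 1): $me_filtered"
```

In this toy sample, two near-zero observations are enough to move the
mean error from well under 1 m/s to over 4 m/s.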
> > > > > > > >> > > > >
> > > > > > > > >> > > > > For your second question, the "-lookin PATH" option is
> > > > > > > > >> > > > > *VERY* flexible.  You can set PATH to either a single
> > > > > > > > >> > > > > value or multiple values.  If you use wildcards, then the
> > > > > > > > >> > > > > shell expands those wildcards to multiple values.  Each
> > > > > > > > >> > > > > value you pass in can either be a filename or a directory
> > > > > > > > >> > > > > name.  If you pass in a filename, STAT-Analysis will read
> > > > > > > > >> > > > > it *REGARDLESS* of the file extension.  If you pass in a
> > > > > > > > >> > > > > directory name, STAT-Analysis will search that directory
> > > > > > > > >> > > > > *RECURSIVELY* for files ending in ".stat".  For example,
> > > > > > > > >> > > > > either of the following settings would tell STAT-Analysis
> > > > > > > > >> > > > > to read the same list of files:
> > > > > > > >> > > > >    -lookin /GFS/data/hourly/*/*.stat
> > > > > > > >> > > > >    ... or ...
> > > > > > > >> > > > >    -lookin /GFS/data/hourly
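
As a rough way to preview which files a directory-style "-lookin" would
pick up, the recursive ".stat" search can be approximated with find.
This only mimics the behaviour described above; it is not how
STAT-Analysis is implemented, and the directory layout and file names
below are a throwaway example created in a temp directory.

```shell
#!/bin/sh
# Build a throwaway directory tree that loosely mirrors /GFS/data/hourly.
top=$(mktemp -d)
mkdir -p "$top/2018030700" "$top/2018030701"
touch "$top/2018030700/point_stat_a.stat" "$top/2018030700/point_stat_a_mpr.txt"
touch "$top/2018030701/point_stat_b.stat"

# Recursive search for files ending in ".stat", like a directory -lookin.
stat_files=$(find "$top" -type f -name '*.stat' | sort)
n_stat=$(printf '%s\n' "$stat_files" | wc -l | tr -d ' ')
echo "$n_stat .stat files found:"
printf '%s\n' "$stat_files"

rm -rf "$top"
```

The _mpr.txt file is skipped here because of its extension; passing it
explicitly by filename is the case where the extension is ignored.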
> > > > > > > >> > > > >
> > > > > > > > >> > > > > Be aware though that the more data you pass to
> > > > > > > > >> > > > > STAT-Analysis, the longer it'll take for it to process
> > > > > > > > >> > > > > it.  You can decide how much data you pass it for each
> > > > > > > > >> > > > > job.  I'd suggest starting with what is most convenient
> > > > > > > > >> > > > > for you.  If it's too slow, change the logic to pass it
> > > > > > > > >> > > > > less data (e.g. only 1 day of data rather than 1 month of
> > > > > > > > >> > > > > data).
> > > > > > > >> > > > >
> > > > > > > > >> > > > > Yes, you can give it a date range.  Use -fcst_init_beg
> > > > > > > > >> > > > > and -fcst_init_end to specify beginning/ending model
> > > > > > > > >> > > > > initialization times or -fcst_valid_beg and
> > > > > > > > >> > > > > -fcst_valid_end to specify beginning/ending valid times.
> > > > > > > >> > > > >
> > > > > > > > >> > > > > If you find that you're running multiple jobs on the same
> > > > > > > > >> > > > > subset of data (e.g. process MPR to CNT, MPR to SL1L2,
> > > > > > > > >> > > > > MPR to CTC, MPR to CTS), it'd be more efficient to group
> > > > > > > > >> > > > > those jobs into a config file.  That'll do the filtering
> > > > > > > > >> > > > > ONCE and write the filtered data to a temp file.  Then
> > > > > > > > >> > > > > all the jobs read data from the temp file instead of
> > > > > > > > >> > > > > starting over from scratch.
> > > > > > > >> > > > >
> > > > > > > >> > > > > Make sense?
> > > > > > > >> > > > >
> > > > > > > >> > > > > John
> > > > > > > >> > > > >
> > > > > > > >> > > > >
> > > > > > > >> > > > >
> > > > > > > >> > > > > On Mon, Apr 23, 2018 at 1:01 PM, Rosalyn
MacCracken
> -
> > > NOAA
> > > > > > > >> Affiliate
> > > > > > > >> > > via
> > > > > > > >> > > > RT
> > > > > > > >> > > > > <met_help at ucar.edu> wrote:
> > > > > > > >> > > > >
> > > > > > > >> > > > > >
> > > > > > > >> > > > > > <URL: https://rt.rap.ucar.edu/rt/
> > > > > > Ticket/Display.html?id=84822
> > > > > > > >
> > > > > > > >> > > > > >
> > > > > > > >> > > > > > Hi John,
> > > > > > > >> > > > > >
> > > > > > > > >> > > > > > That's actually only partially correct.  It's not
> > > > > > > > >> > > > > > that I want to use part of the MPR lines and
> > > > > > > > >> > > > > > discard the rest, and I do need to regenerate
> > > > > > > > >> > > > > > statistics.  Let me try to re-explain.
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > > Back in early March we switched from getting our
> > > > > > > > >> > > > > > ASCAT obs from the prepbufr data, to getting it
> > > > > > > > >> > > > > > from the MGDRLITE data.  So, processing didn't
> > > > > > > > >> > > > > > change.  I was producing statistics at certain
> > > > > > > > >> > > > > > threshold levels for both GFS and ASCAT.  I had
> > > > > > > > >> > > > > > this set with the cat_thresh list, at levels of
> > > > > > > > >> > > > > > 0,6,17, etc.  We found out after processing for a
> > > > > > > > >> > > > > > couple of weeks that the ASCAT data included these
> > > > > > > > >> > > > > > really small values, <1.0 m/s, and that these
> > > > > > > > >> > > > > > small wind speeds were being included in the
> > > > > > > > >> > > > > > statistics processing.
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > > So, a couple of questions.
> > > > > > > > >> > > > > > 1) Do I have to regenerate all of my statistics
> > > > > > > > >> > > > > > (*.cts, *.cnt and *ctc files) because of this
> > > > > > > > >> > > > > > error? Or, since I have threshold levels set, will
> > > > > > > > >> > > > > > those small values be among the statistics in the
> > > > > > > > >> > > > > > lowest thresholds?
> > > > > > > > >> > > > > > 2) I have the *.stat files, but, they are spread
> > > > > > > > >> > > > > > out into separate directories like:
> > > > > > > > >> > > > > > /GFS/data/hourly/${YYYYMMDDHH}/*.stat
> > > > > > > > >> > > > > > Can I tell stat-analysis to "lookin" directories
> > > > > > > > >> > > > > > with a wildcard (like 201803*)?  If so, how?  Or,
> > > > > > > > >> > > > > > if I tell it to look in /GFS/data/hourly, will it
> > > > > > > > >> > > > > > look in all the directories recursively under
> > > > > > > > >> > > > > > hourly?  And, if that's the case, can I give it a
> > > > > > > > >> > > > > > date range, so that it only processes data from
> > > > > > > > >> > > > > > March?
> > > > > > > >> > > > > >
> > > > > > > >> > > > > > Roz
> > > > > > > >> > > > > >
> > > > > > > >> > > > > > On Mon, Apr 23, 2018 at 2:18 PM, John
Halley
> Gotway
> > > via
> > > > > RT <
> > > > > > > >> > > > > > met_help at ucar.edu> wrote:
> > > > > > > >> > > > > >
> > > > > > > >> > > > > > > Hi Roz,
> > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > I read that you've run Point-Stat and saved off
> > > > > > > > >> > > > > > > the matched pairs (MPR) output line type.  And
> > > > > > > > >> > > > > > > you'd like to (1) filter those MPR lines to
> > > > > > > > >> > > > > > > discard some of them and then (2) use the
> > > > > > > > >> > > > > > > filtered data to regenerate summary statistics.
> > > > > > > > >> > > > > > > Yes, this is easily done using the STAT-Analysis
> > > > > > > > >> > > > > > > tool in MET.
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > You wrote that you're verifying wind speeds
> > > > > > > > >> > > > > > > against ASCAT and that you'd like to exclude
> > > > > > > > >> > > > > > > pairs where the observed wind speed is less than
> > > > > > > > >> > > > > > > 1 m/s.  I'm just guessing here, but I'll presume
> > > > > > > > >> > > > > > > that you want to produce both SL1L2 and CNT
> > > > > > > > >> > > > > > > output line types.  Here's what the
> > > > > > > > >> > > > > > > STAT-Analysis job would look like:
> > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > # Filter MPRs and write SL1L2 output lines.
> > > > > > > > >> > > > > > > #   -lookin         a .stat filename or a directory containing them
> > > > > > > > >> > > > > > > #   -job            job type is aggregate_stat
> > > > > > > > >> > > > > > > #   -line_type      input line type = MPR
> > > > > > > > >> > > > > > > #   -out_line_type  output line type = SL1L2 partial sums
> > > > > > > > >> > > > > > > #   -fcst_var       only process lines where FCST_VAR column = WIND
> > > > > > > > >> > > > > > > #   -column_thresh  only use MPR lines where OBS column > 1
> > > > > > > > >> > > > > > > #   -by             run this same job for each unique combination of these columns
> > > > > > > > >> > > > > > > stat_analysis \
> > > > > > > > >> > > > > > >    -lookin input.stat \
> > > > > > > > >> > > > > > >    -job aggregate_stat \
> > > > > > > > >> > > > > > >    -line_type MPR \
> > > > > > > > >> > > > > > >    -out_line_type SL1L2 \
> > > > > > > > >> > > > > > >    -fcst_var WIND \
> > > > > > > > >> > > > > > >    -column_thresh OBS gt1 \
> > > > > > > > >> > > > > > >    -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > > > > > >> > > > > > >    -out_stat MPR_to_SL1L2.stat
> > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > This will produce an output .stat file
> > > > > > > > >> > > > > > > containing an SL1L2 line for each unique
> > > > > > > > >> > > > > > > combination of the header columns listed after
> > > > > > > > >> > > > > > > the "-by" option.  To generate CNT output lines
> > > > > > > > >> > > > > > > instead, you'd run a second job where you
> > > > > > > > >> > > > > > > replace SL1L2 with CNT.  You could run these
> > > > > > > > >> > > > > > > jobs on the command line or group them together
> > > > > > > > >> > > > > > > into a STAT-Analysis config file, if you prefer.
> > > > > > > > >> > > > > > > Both would work.
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > You could run this once for each input .stat
> > > > > > > > >> > > > > > > file you're processing... or you could pass many
> > > > > > > > >> > > > > > > input .stat files to the job.  Since
> > > > > > > > >> > > > > > > FCST_INIT_BEG and FCST_LEAD are included in the
> > > > > > > > >> > > > > > > "-by" option, you'll get separate output lines
> > > > > > > > >> > > > > > > for each unique time.
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > > Hope that helps get you going.
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > > Thanks,
> > > > > > > >> > > > > > > John
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > >

------------------------------------------------
Subject: question on regenerating data
From: Rosalyn MacCracken - NOAA Affiliate
Time: Mon May 07 10:38:32 2018

Hi John,

So, it sounds like I'm ok either way with the timestamp.  If I don't use
-set_hdr, it sets the correct beginning and end time according to the mpr
file, or, I can use -set_hdr for consistency with the other files, but,
that's more "cosmetic".

Oh, but, that -set_hdr command option is within the config file, correct?
So, you really couldn't loop through that and pass a time variable into the
command.  So, it may just be easiest to leave it out.

So, since my processing would take 20 minutes for regenerating one day's
worth of data, I was thinking, I would do all my processing for the North
Atlantic first, so, I can look at how we did with those Nor'easters in
March.  So, that's processing the 00z, 01z, 11z and 12z time periods first,
since that is when ASCAT passes over the North Atlantic.  So, I would copy
those time periods into a temp directory and use the -lookin command to
process those 4 time periods for my entire period (maybe 1 week at a time).

So, this will produce 1 file with 00z, 01z, 11z and 12z data, for each week,
correct?  And, the only way to get individual files is to copy the data,
one hour at a time, process, delete the file, lather, rinse and repeat.
That might be hard to do.  I may have to think about that...

So, if it's one file, with all the data for the week, at selected hours,
what happens when I have time to run the rest of the data?  I just write
that to a different file, and then, maybe append that to the end of the
first file?  Or, just leave it separate?

I guess I just have to think about what I'm going to do next with these
files, and the easiest way to do that.

Roz
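
On the looping question, a minimal sketch rather than a tested recipe: job
command options such as -set_hdr can be given on the stat_analysis command
line as well as in a config file, so a shell loop can pass a different time
window for each hour.  The script below only prints the commands (a dry
run).  It assumes GNU date, a +/- 30 minute window centered on the hour, and
the /GFS/data/hourly/${YYYYMMDDHH} layout described earlier in this thread.
The helper names obs_window and build_job, the CNT output type, the
"OBS ge1" filter, and the output file names are illustrative assumptions,
not settings taken from the actual config file.

```shell
#!/bin/sh
# Dry run: print one stat_analysis command per hourly directory.
# obs_window and build_job are hypothetical helper names.

# Print "BEG END" (YYYYMMDD_HHMMSS) for a +/- 30 minute window centered
# on a YYYYMMDDHH valid time, in UTC.  Requires GNU date (-d, @epoch).
obs_window() {
    ymdh=$1
    y=$(echo "$ymdh" | cut -c1-4)
    m=$(echo "$ymdh" | cut -c5-6)
    d=$(echo "$ymdh" | cut -c7-8)
    h=$(echo "$ymdh" | cut -c9-10)
    epoch=$(date -u -d "${y}-${m}-${d} ${h}:00:00" +%s)
    beg=$(date -u -d "@$((epoch - 1800))" +%Y%m%d_%H%M%S)
    end=$(date -u -d "@$((epoch + 1800))" +%Y%m%d_%H%M%S)
    echo "$beg $end"
}

# Print the command for one hour.  Reads directly from that hour's
# directory, so nothing has to be copied to a temp directory first.
build_job() {
    ymdh=$1
    set -- $(obs_window "$ymdh")   # $1 = window start, $2 = window end
    echo "stat_analysis -lookin /GFS/data/hourly/${ymdh}" \
         "-job aggregate_stat -line_type MPR -out_line_type CNT" \
         "-fcst_var WIND -column_thresh OBS ge1" \
         "-set_hdr OBS_VALID_BEG $1 -set_hdr OBS_VALID_END $2" \
         "-out_stat agg_cnt_${ymdh}.stat"
}

# North Atlantic ASCAT hours for two example days.
for day in 20180307 20180308; do
    for hh in 00 01 11 12; do
        build_job "${day}${hh}"
    done
done
```

Piping the printed lines to sh would run them, giving one output file per
hour instead of one file per week.  MET config files can also reference
environment variables, which may be another way to pass a time into a
config-file job; the User's Guide for the installed version is the place to
confirm the syntax.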

On Mon, May 7, 2018 at 11:48 AM, John Halley Gotway via RT <
met_help at ucar.edu> wrote:

> Roz,
>
> I understand that you're suspicious about the beginning and ending time
> stamps in the OBS_VALID_BEG and _END columns.  You're comparing the
> original output from Point-Stat to the output that you're getting from
> STAT-Analysis.  However, those timestamps can be different without there
> actually being a problem.  Here's why...
>
> When you run Point-Stat, the obs_window setting in the config file defines
> the matching time window.  If your forecast is valid at time T, the
> matching time window is defined as T+obs_window.beg to T+obs_window.end.
> The point observations may actually fall anywhere in that time window...
> but it's that time window that's reported in the summary line types (like
> CTC, CTS, SL1L2, and CNT).  Since the MPR line type is specific to each
> observation value, the *actual* timestamp of that observation is reported
> in that line.
>
> When you run STAT-Analysis to process those MPR lines, it reads the
> OBS_VALID_BEG and OBS_VALID_END columns.  And it keeps track of the minimum
> OBS_VALID_BEG timestamp and the maximum OBS_VALID_END timestamp.  When it
> writes output CTC, CTS, SL1L2, or CNT lines it reports the minimum/maximum
> timestamp values it found in the data.
>
> So Point-Stat reports the *REQUESTED TIME WINDOW* in the OBS_VALID_BEG and
> OBS_VALID_END columns... while STAT-Analysis reports the *ACTUAL TIME
> WINDOW*.  And in general, those won't be the same.  So this isn't
> necessarily a problem.
>
> If for consistency, you'd like to explicitly set the OBS_VALID_BEG and
> OBS_VALID_END timestamps in the output, you can use the "-set_hdr" job
> command option to do so:
>    -set_hdr OBS_VALID_BEG 20180307_003000 -set_hdr OBS_VALID_END 20180307_013000
>
> Thanks,
> John
>
> On Sun, May 6, 2018 at 2:12 PM, Rosalyn MacCracken - NOAA Affiliate
via RT
> <
> met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> >
> > Hi John,
> >
> > Sorry it took me so long to get back to you.  My step-daughter came in to
> > town, and I thought that I could get some work done while she was here,
> > but, didn't.  Then, I totally forgot to email you back.  Sorry for leaving
> > you hanging!
> >
> > Anyway, I was able to play around with the STATAnalysis config file you
> > sent me.  I tried it out with only 1 hour timestep, instead of all the
> > files for one day.  I wanted to see what kind of time it would take to
> > process this on my machine.  So, it was quick, 45 seconds.  But, of course
> > your run took 18 minutes.  The script was probably reading 20 some files.
> > That makes sense.
> >
> > So, then, I looked at the output, and it wasn't quite what I expected, and
> > doesn't quite match the stats from the other processing.  This is what I
> > did:
> >
> > 1)  I copied the 00z only *20180307*.stat file to a temp directory.  Before
> > I did this, I looked at the matching *.mpr file, and saw that the
> > OBS_VALID_BEG was 20180307_000000 and the OBS_VALID_END was
> > 20180307_002700.
> > 2)  Ran the run_sa.sh script and generated the CTS, CTC and CNT files.
> > 3)  I looked at the new agg_cts file, and the OBS_VALID_BEG and _END
> > matched the *.mpr file in step 1.
> > 4)  I looked at the original CTS file, and the OBS_VALID_BEG was
> > 20180307_223000 and the OBS_VALID_END was 20180307_013000.  So, that was
> > our original way of processing.  I bet if I looked at a more recent file,
> > it would be more like OBS_VALID_BEG was 20180307_233000 and the
> > OBS_VALID_END was 20180307_003000.
> > 5)  I looked at the original *mpr for 01z, and the OBS_VALID_BEG was
> > 20180307_003000 and the OBS_VALID_END was 20180307_012100
> >
> > So, this tells me that I'm not matching observation times, and I'm not sure
> > how to fix it to match things up.  First, we use a +/- 30 min window for
> > ASCAT obs, centered on the hour.  For example, if we are processing the 00z
> > hour, we will match observations from 233000 from the day before to 003000
> > the current day.  Actually, we used to do an hour window on either side,
> > but, we have more observations now at each hour.  (See the explanation in
> > #4 above)
> >
> > Anyway, how do I create the CTS, CTC and CNT files for the +/- 30 min
> > window?  Is there a way to dynamically indicate this 30 min window, so that
> > I don't have to go into the config file every time I run STATAnalysis and
> > change it?
> >
> > Roz
> >
> > On Thu, Apr 26, 2018 at 4:14 PM, John Halley Gotway via RT <
> > met_help at ucar.edu> wrote:
> >
> > > Roz,
> > >
> > > The CSI statistic is computed from a 2x2 contingency table.  A 2x2
> > > contingency table is defined by a single threshold.  Looking in the .stat
> > > files you sent, I see that you've applied many thresholds to generate many
> > > 2x2 contingency tables and corresponding statistics.  Yes, it is true that
> > > for most of those thresholds, the "bad" observation values will fall into
> > > the "non-event" category.  But those non-event counts are included in the
> > > computation of some stats, including CSI.  So even though the bad
> > > observations aren't very interesting, they really are impacting the
> > > statistics.
> > >
> > > John
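
A small worked sketch of the point above, using the standard definition
CSI = hits / (hits + misses + false alarms).  The pair (forecast 10 m/s,
observation 0.01 m/s) and the >=5.5689 and 17 thresholds come from this
thread; the counts 50/10/20 are made up purely for illustration, and
classify and csi are hypothetical helper names.

```shell
#!/bin/sh
# Which cell of the 2x2 table does one matched pair fall into for an
# event defined as "value >= threshold"?  Args: forecast obs threshold
classify() {
    awk -v f="$1" -v o="$2" -v t="$3" 'BEGIN {
        fe = (f + 0 >= t + 0)   # forecast event?
        oe = (o + 0 >= t + 0)   # observed event?
        if (fe && oe)        print "hit"
        else if (fe && !oe)  print "false_alarm"
        else if (!fe && oe)  print "miss"
        else                 print "correct_negative"
    }'
}

# CSI from hits, misses and false alarms.
csi() {
    awk -v h="$1" -v m="$2" -v fa="$3" \
        'BEGIN { printf "%.4f\n", h / (h + m + fa) }'
}

classify 10 0.01 5.5689   # bad obs turns this pair into a false alarm
classify 10 0.01 17       # both below threshold: a correct negative
csi 50 10 20              # illustrative counts without the bad pair
csi 50 10 21              # same counts plus one extra false alarm
```

With these made-up counts, one spurious false alarm moves CSI from 0.6250
to 0.6173.  At the higher threshold the same pair lands in the
correct-negative cell, which does not enter CSI but does enter statistics
that use the full table total.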
> > >
> > > On Wed, Apr 25, 2018 at 10:08 AM, Rosalyn MacCracken - NOAA
Affiliate
> via
> > > RT <met_help at ucar.edu> wrote:
> > >
> > > >
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822
>
> > > >
> > > > Figures.  I just calculated how long it will take me to regenerate data
> > > > for 03072018 - 04122018.  It will take me 912 hours.  ;-(
> > > >
> > > > Ok, I know I asked this, but, if I had an OBS value of 0.01 and a matched
> > > > GFS point of 10 m/s, and I had low thresholds of 0-5 m/s, 6-10 m/s and
> > > > 10-15 m/s, and say, CSI was calculated.  Which threshold would be used
> > > > for the output, the 0-5 or 6-10?  And, would the 10-15 threshold even be
> > > > affected?
> > > >
> > > > Roz
> > > >
> > > > On Wed, Apr 25, 2018 at 11:40 AM, John Halley Gotway via RT <
> > > > met_help at ucar.edu> wrote:
> > > >
> > > > > Roz,
> > > > >
> > > > > I think it'd take just as long.  The slow part is reading the data...
> > > > > not applying a threshold.
> > > > >
> > > > > John
> > > > >
> > > > > On Wed, Apr 25, 2018 at 9:18 AM, Rosalyn MacCracken - NOAA
> Affiliate
> > > via
> > > > RT
> > > > > <met_help at ucar.edu> wrote:
> > > > >
> > > > > >
> > > > > > <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > >
> > > > > > Hi John,
> > > > > >
> > > > > > Thanks for doing that for me.  I'll take a look at the info you sent
> > > > > > me this afternoon.  I'm in the middle of doing something right
> > > > > > now...trying to make a different program work.  ;-/
> > > > > >
> > > > > > I wonder if it will be quicker than 18 minutes for some of the
> > > > > > thresholds that have higher wind speeds, and not as many instances
> > > > > > (or 0 instances).  Or, will it take just as long, since it still
> > > > > > needs to read through the entire *.stat file anyway?
> > > > > >
> > > > > > Roz
> > > > > >
> > > > > > On Tue, Apr 24, 2018 at 7:06 PM, John Halley Gotway via RT
<
> > > > > > met_help at ucar.edu> wrote:
> > > > > >
> > > > > > > Hi Roz,
> > > > > > >
> > > > > > > Thanks for sending the sample data.  I grabbed it and used it to run
> > > > > > > some sample jobs:
> > > > > > >
> > > > > > > time /d1/johnhg/MET/MET_releases/met-6.0/bin/stat_analysis \
> > > > > > > -lookin /d1/johnhg/MET/MET_Help/maccracken_data_20180424/opc_test/home/opc_test/data/met_verif/GFS/data/hourly \
> > > > > > > -config STATAnalysisConfig \
> > > > > > > -log run_sa.log -v 3
> > > > > > >
> > > > > > > I used the "-lookin" option to point to all the data you sent.
> > > > > > >
> > > > > > > I've attached the...
> > > > > > > (1) config file I used
> > > > > > > (2) log file that was generated
> > > > > > > (3) output .stat files
> > > > > > >
> > > > > > > Looking at the jobs, you'll see that I've included 5 of them...
> > > > > > > - Generate CNT output
> > > > > > > - Generate CTC >= 0.0 output
> > > > > > > - Generate CTS >= 0.0 output
> > > > > > > - Generate CTC >= 5.5689 output
> > > > > > > - Generate CTS >= 5.5689 output
> > > > > > >
> > > > > > > Unfortunately, you'll need to define separate jobs for each threshold
> > > > > > > you'd like to use.  Although, you shouldn't use >=0.0 since that's
> > > > > > > always true.
> > > > > > >
> > > > > > > Also unfortunately, this is pretty slow.  On my machine, it took like
> > > > > > > 18 minutes for these 5 jobs!
> > > > > > >
> > > > > > > Thanks,
> > > > > > > John
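
Since each threshold needs its own job, one way to avoid typing them by
hand is to generate the job strings.  This is only a sketch under stated
assumptions: the attached config file is not shown in this thread, so the
option set below (-out_fcst_thresh and -out_obs_thresh for the contingency
table threshold, "-column_thresh OBS ge1" to drop the small observations)
and the output file names are illustrative and should be checked against
the STAT-Analysis chapter of the User's Guide.  print_jobs is a
hypothetical helper name.

```shell
#!/bin/sh
# Print quoted job strings, one CTC job and one CTS job per threshold,
# each with its own -out_stat file so that no job overwrites another.
print_jobs() {
    for thr in "$@"; do
        for lt in CTC CTS; do
            printf '"%s",\n' "-job aggregate_stat -line_type MPR -out_line_type ${lt} -column_thresh OBS ge1 -out_fcst_thresh ge${thr} -out_obs_thresh ge${thr} -out_stat agg_${lt}_ge${thr}.stat"
        done
    done
}

# Example thresholds; substitute the values from your own cat_thresh list.
print_jobs 5.5689 17
```

The printed lines are meant to be pasted between the brackets of the
"jobs" array in the STAT-Analysis config file, with the comma after the
last entry removed.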
> > > > > > >
> > > > > > >
> > > > > > > On Tue, Apr 24, 2018 at 2:09 PM, Rosalyn MacCracken -
NOAA
> > > Affiliate
> > > > > via
> > > > > > RT
> > > > > > > <met_help at ucar.edu> wrote:
> > > > > > >
> > > > > > > >
> > > > > > > > <URL: https://rt.rap.ucar.edu/rt/
> Ticket/Display.html?id=84822
> > >
> > > > > > > >
> > > > > > > > Hi John,
> > > > > > > >
> > > > > > > > I put my file on the ftp site.  Let me know what you find.  You'll
> > > > > > > > see those really low OBS values (0.01, 0.02, and so on).
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > > Roz
> > > > > > > >
> > > > > > > > On Tue, Apr 24, 2018 at 2:53 PM, Rosalyn MacCracken -
NOAA
> > > > Affiliate
> > > > > <
> > > > > > > > rosalyn.maccracken at noaa.gov> wrote:
> > > > > > > >
> > > > > > > > > Ok, I'll get that over to the ftp site.  I have to make sure that
> > > > > > > > > I find a day that has all the data in it.  Sometimes the data
> > > > > > > > > isn't available when the script runs.  A little annoying, but,
> > > > > > > > > that's operations...
> > > > > > > > >
> > > > > > > > > I'll let you know when I get the file to the ftp site.
> > > > > > > > >
> > > > > > > > > Thanks!
> > > > > > > > >
> > > > > > > > > Roz
> > > > > > > > >
> > > > > > > > > On Tue, Apr 24, 2018 at 2:49 PM, John Halley Gotway
via RT
> <
> > > > > > > > > met_help at ucar.edu> wrote:
> > > > > > > > >
> > > > > > > > >> Roz,
> > > > > > > > >>
> > > > > > > > > >> Yes, we do.  Follow the instructions here:
> > > > > > > > > >>    https://dtcenter.org/met/users/support/met_help.php#ftp
> > > > > > > > > >>
> > > > > > > > > >> I'd suggest making a tar file for one day and posting it to the
> > > > > > > > > >> ftp site:
> > > > > > > > > >>    tar -cvzf sample.tar.gz /GFS/data/hourly/20180305*
> > > > > > > > >>
> > > > > > > > >> Thanks,
> > > > > > > > >> John
> > > > > > > > >>
> > > > > > > > >> On Tue, Apr 24, 2018 at 11:57 AM, Rosalyn
MacCracken -
> NOAA
> > > > > > Affiliate
> > > > > > > > via
> > > > > > > > >> RT <met_help at ucar.edu> wrote:
> > > > > > > > >>
> > > > > > > > >> >
> > > > > > > > >> > <URL: https://rt.rap.ucar.edu/rt/
> > > Ticket/Display.html?id=84822
> > > > >
> > > > > > > > >> >
> > > > > > > > >> > HI John,
> > > > > > > > >> >
> > > > > > > > > >> > Yes, it does seem that the -config option is the way to go to
> > > > > > > > > >> > recreate those 3 files. I'll be sure to have a unique file
> > > > > > > > > >> > name, or, mv the output file to a different name before
> > > > > > > > > >> > running the command again.  Thanks for pointing that out.
> > > > > > > > > >> >
> > > > > > > > > >> > I'm teleworking for the next couple of weeks, so, I can't
> > > > > > > > > >> > download and send you *.stat files like I can when I'm at my
> > > > > > > > > >> > computer at work.  I don't have access to theia or wcoss
> > > > > > > > > >> > anymore.  You have an ftp server that I can upload data to,
> > > > > > > > > >> > right?  If not, I can try and fiddle around with this
> > > > > > > > > >> > tomorrow and see if I can't get this to work the way I want
> > > > > > > > > >> > to.
> > > > > > > > >> >
> > > > > > > > >> > Roz
> > > > > > > > >> >
> > > > > > > > >> > On Tue, Apr 24, 2018 at 11:42 AM, John Halley
Gotway via
> > RT
> > > <
> > > > > > > > >> > met_help at ucar.edu> wrote:
> > > > > > > > >> >
> > > > > > > > >> > > Roz,
> > > > > > > > >> > >
> > > > > > > > >> > > Each "-job aggregate_stat" only generates a
single
> > output
> > > > line
> > > > > > > type.
> > > > > > > > >> So
> > > > > > > > >> > > using "-out_line_type CTC,CTS,CNT" will not
work.
> > > > > > > > >> > >
> > > > > > > > >> > > You'll need to run separate jobs for each
output line
> > type
> > > > you
> > > > > > > want
> > > > > > > > to
> > > > > > > > >> > > generate.  That's why I'd recommend grouping
those
> > > multiple
> > > > > jobs
> > > > > > > > >> together
> > > > > > > > >> > > into a single STAT-Analysis config file.  Then
you'd
> > call
> > > > > > > > >> STAT-Analysis
> > > > > > > > >> > > once using the "-config" command line option.
> > > > > > > > >> > >
> > > > > > > > >> > > Another issue is that if you set "-out_stat" to
the
> same
> > > > > > filename,
> > > > > > > > >> it'll
> > > > > > > > >> > > get overridden by each job.  STAT-Analysis will
> > overwrite
> > > > that
> > > > > > > > output
> > > > > > > > >> > file
> > > > > > > > >> > > rather than appending to it.
> > > > > > > > >> > >
> > > > > > > > >> > > You could send me a day's worth of .stat output
files
> > > > > > > > >> > > (/GFS/data/hourly/20180305*) and I could send
you some
> > > > > > > suggestions.
> > > > > > > > >> Or
> > > > > > > > >> > if
> > > > > > > > >> > > you have access to theia you could copy them up
there
> > and
> > > > > point
> > > > > > me
> > > > > > > > to
> > > > > > > > >> it.
> > > > > > > > >> > >
> > > > > > > > >> > > Thanks,
> > > > > > > > >> > > John
> > > > > > > > >> > >
> > > > > > > > >> > > On Tue, Apr 24, 2018 at 7:48 AM, Rosalyn
MacCracken -
> > NOAA
> > > > > > > Affiliate
> > > > > > > > >> via
> > > > > > > > >> > RT
> > > > > > > > >> > > <met_help at ucar.edu> wrote:
> > > > > > > > >> > >
> > > > > > > > >> > > >
> > > > > > > > >> > > > <URL: https://rt.rap.ucar.edu/rt/
> > > > > Ticket/Display.html?id=84822
> > > > > > >
> > > > > > > > >> > > >
> > > > > > > > >> > > > Hi John,
> > > > > > > > >> > > >
> > > > > > > > >> > > > Yes, that makes sense.  Those very small
values
> (<1.0
> > > > m/s),
> > > > > > are
> > > > > > > > bad
> > > > > > > > >> > > > values.  That's why they shouldn't be
included in
> the
> > > > > > > processing.
> > > > > > > > >> > > >
> > > > > > > > >> > > > So, I need to just regenerate hourly data,
one hour
> > at a
> > > > > time.
> > > > > > > > >> Would
> > > > > > > > >> > it
> > > > > > > > >> > > > make sense to use a shell script and loop
> > stat-analysis?
> > > > > > > > Something
> > > > > > > > >> > like:
> > > > > > > > >> > > >
> > > > > > > > >> > > > for day in 11 12
> > > > > > > > >> > > > do
> > > > > > > > >> > > >   for cycle in 00 06 12 18
> > > > > > > > >> > > >   do
> > > > > > > > >> > > > stat_analysis -lookin
/GFS/data/hourly/201803${day}$
> > > > > > > {hour}/*.stat
> > > > > > > > \
> > > > > > > > >> > > > -job aggregate_stat \
> > > > > > > > >> > > >    -line_type MPR \
> > > > > > > > >> > > >    -out_line_type CTC,CTS,CNT \
> > > > > > > > >> > > >   -fcst_var WIND \
> > > > > > > > >> > > > -column_thresh OBS gt1 \
> > > > > > > > >> > > >  -by
> > > > > > > > >> > > > MODEL,FCST_LEV,FCST_INIT_BEG,
> > > > FCST_LEAD,VX_MASK,INTERP_MTHD,
> > > > > I
> > > > > > > > >> NTERP_PNTS
> > > > > > > > >> > > > -out_stat /new_rerun_stat_files/MPR_to_
> > CTC_CTS_CNT.stat
> > > > > > > > >> > > >   done
> > > > > > > > >> > > > done
> > > > > > > > >> > > >
> > > > > > > > >> > > > or, something like that?  And, will this
regenerate
> > hour
> > > > > > > > forecasts,
> > > > > > > > >> at
> > > > > > > > >> > > each
> > > > > > > > >> > > > forecast and lead hour?  I guess it will see
the
> > > forecast
> > > > > and
> > > > > > > lead
> > > > > > > > >> hour
> > > > > > > > >> > > > from the *.stat file, and whatever *stat file
is in
> > the
> > > > > > > directory,
> > > > > > > > >> it
> > > > > > > > >> > > will
> > > > > > > > >> > > > regenerate those hours, right?
> > > > > > > > >> > > >
> > > > > > > > >> > > > So, I need to regenerate the CTC, CNT and CTS
files.
> > > > That's
> > > > > > > why I
> > > > > > > > >> did:
> > > > > > > > >> > > >  -out_line_type CTC,CTS,CNT
> > > > > > > > >> > > > but, will that make 3 separate files, or just
> another
> > > > *.stat
> > > > > > > file?
> > > > > > > > >> > > >
> > > > > > > > >> > > > Roz
> > > > > > > > >> > > >
> > > > > > > > >> > > >
> > > > > > > > >> > > > On Mon, Apr 23, 2018 at 4:01 PM, John Halley
Gotway
> > via
> > > > RT <
> > > > > > > > >> > > > met_help at ucar.edu> wrote:
> > > > > > > > >> > > >
> > > > > > > > >> > > > > Roz,
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > It is ultimately up to you to decide which
matched
> > > pairs
> > > > > you
> > > > > > > > want
> > > > > > > > >> to
> > > > > > > > >> > > > > include in your processing.  Do you
consider those
> > > small
> > > > > > (<1.0
> > > > > > > > >> m/s)
> > > > > > > > >> > > > > observation values to be corrupt and
incorrect in
> > some
> > > > way
> > > > > > or
> > > > > > > > just
> > > > > > > > >> > not
> > > > > > > > >> > > > very
> > > > > > > > >> > > > > interesting?  If they really are BAD data
values,
> I
> > > > agree
> > > > > > that
> > > > > > > > you
> > > > > > > > >> > > should
> > > > > > > > >> > > > > exclude them from your analysis.  But if
they're
> > just
> > > > > > > > >> uninteresting
> > > > > > > > >> > > > values
> > > > > > > > >> > > > > of low wind speed, then there's no reason
why you
> > > should
> > > > > > > exclude
> > > > > > > > >> > them.
> > > > > > > > >> > > > For
> > > > > > > > >> > > > > example, *most* of the time it ins't
raining, but
> we
> > > > often
> > > > > > > > >> included
> > > > > > > > >> > > > > observations of 0 precip.
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > There are three configurable options in
Point-Stat
> > > that
> > > > > may
> > > > > > be
> > > > > > > > >> useful
> > > > > > > > >> > > > here:
> > > > > > > > >> > > > > (1) You already know and use the
"cat_thresh"
> > option.
> > > > > This
> > > > > > > > >> threshold
> > > > > > > > >> > > > > defines the events and non-events for a 2x2
> > > contingency
> > > > > > table.
> > > > > > > > >> This
> > > > > > > > >> > > > > threshold affects the contents of FHO, CTC,
CTS,
> > MCTC,
> > > > and
> > > > > > > MCTS
> > > > > > > > >> line
> > > > > > > > >> > > > types
> > > > > > > > >> > > > > that Point-Stat writes.
> > > > > > > > >> > > > > (2) The "cnt_thresh" option is a more
recent
> > addition.
> > > > > > > Perhaps
> > > > > > > > >> this
> > > > > > > > >> > > was
> > > > > > > > >> > > > a
> > > > > > > > >> > > > > poor name choice, but instead of defining
> > categories,
> > > > it's
> > > > > > > > really
> > > > > > > > >> a
> > > > > > > > >> > > > > *filtering* threshold.  This threshold
affects the
> > > > > contents
> > > > > > of
> > > > > > > > the
> > > > > > > > >> > > SL1L2,
> > > > > > > > >> > > > > SAL1L2, and CNT line types that Point-Stat
writes.
> > > For
> > > > > > > example,
> > > > > > > > >> > > setting
> > > > > > > > >> > > > > "cnt_thresh = [ ge6, ge17 ];" will produce
2 CNT
> > and 2
> > > > > SL1L2
> > > > > > > > >> output
> > > > > > > > >> > > lines
> > > > > > > > >> > > > > containing only those points where the wind
speed
> > was
> > > > >=6
> > > > > > and
> > > > > > > > >> >=17,
> > > > > > > > >> > > > > respectively.
> > > > > > > > >> > > > > (3) The "wind_thresh" option is very
similar to
> the
> > > > > > > "cnt_thresh"
> > > > > > > > >> > option
> > > > > > > > >> > > > but
> > > > > > > > >> > > > > affects the contents of teh VL1L2, VAL1L2,
and
> VCNT
> > > (new
> > > > > in
> > > > > > > > >> met-7.0)
> > > > > > > > >> > > line
> > > > > > > > >> > > > > types.  Only those U/V pairs that meet the
> specified
> > > > wind
> > > > > > > speed
> > > > > > > > >> > > threshold
> > > > > > > > >> > > > > are included in the output.
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > For both "cnt_thresh" and "wind_thresh",
the
> default
> > > > value
> > > > > > in
> > > > > > > > the
> > > > > > > > >> > > config
> > > > > > > > >> > > > > file is "NA", meaning, do not apply any
filtering
> > > > > threshold
> > > > > > > > >> criteria.
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > You have the flexibility to run STAT-
Analysis on
> the
> > > MPR
> > > > > > > output
> > > > > > > > >> lines
> > > > > > > > >> > > to
> > > > > > > > >> > > > > recompute any of these output line types
applying
> > > > whatever
> > > > > > > > >> filtering
> > > > > > > > >> > > > > criteria you'd like.
> > > > > > > > >> > > > > Here's the MET user's guide:
> > > > > > > > >> > > > > https://dtcenter.org/met/
> > users/docs/users_guide/MET_
> > > > > > > > >> > > Users_Guide_v7.0.pdf
> > > > > > > > >> > > > > Look on page 98 for the job command options
for
> the
> > > > > > > > >> "aggregate_stat"
> > > > > > > > >> > > line
> > > > > > > > >> > > > > type when the input line type is "MPR".
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > For your second question, the "-lookin
PATH"
> option
> > is
> > > > > > *VERY*
> > > > > > > > >> > flexible.
> > > > > > > > >> > > > > You can set PATH to either a single value
or
> > multiple
> > > > > > values.
> > > > > > > > If
> > > > > > > > >> you
> > > > > > > > >> > > use
> > > > > > > > >> > > > > wildcards, then the shell expands those
wildcards
> to
> > > > > > multiple
> > > > > > > > >> values.
> > > > > > > > >> > > > Each
> > > > > > > > >> > > > > value you pass in can either be a filename
or a
> > > > directory
> > > > > > > name.
> > > > > > > > >> If
> > > > > > > > >> > you
> > > > > > > > >> > > > > pass in a filename, STAT-Analysis will read
it
> > > > > *REGARDLESS*
> > > > > > of
> > > > > > > > the
> > > > > > > > >> > file
> > > > > > > > >> > > > > extension.  If you pass in a directory
name,
> > > > STAT-Analysis
> > > > > > > will
> > > > > > > > >> > search
> > > > > > > > >> > > > that
> > > > > > > > >> > > > > directory *RECURSIVELY* for files ending in
> ".stat".
> > > > For
> > > > > > > > example,
> > > > > > > > >> > > either
> > > > > > > > >> > > > > of the following settings would tell STAT-
Analysis
> > to
> > > > read
> > > > > > the
> > > > > > > > >> same
> > > > > > > > >> > > list
> > > > > > > > >> > > > of
> > > > > > > > >> > > > > files:
> > > > > > > > >> > > > >    -lookin /GFS/data/hourly/*/*.stat
> > > > > > > > >> > > > >    ... or ...
> > > > > > > > >> > > > >    -lookin /GFS/data/hourly
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > Be aware though that the more data you pass
to
> > > > > > STAT-Analysis,
> > > > > > > > the
> > > > > > > > >> > > longer
> > > > > > > > >> > > > > it'll take for it to process it.  You can
decide
> how
> > > > much
> > > > > > data
> > > > > > > > you
> > > > > > > > >> > pass
> > > > > > > > >> > > > it
> > > > > > > > >> > > > > for each job.  I'd suggest starting with
what is
> > most
> > > > > > > convenient
> > > > > > > > >> for
> > > > > > > > >> > > you.
> > > > > > > > >> > > > > If it's too slow, change the logic to pass
it less
> > > data
> > > > > > (e.g.
> > > > > > > > >> only 1
> > > > > > > > >> > > day
> > > > > > > > >> > > > of
> > > > > > > > >> > > > > data rather than 1 month of data).
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > Yes, you can give it a date range.  Use
> > -fcst_init_beg
> > > > and
> > > > > > > > >> > > -fcst_init_end
> > > > > > > > >> > > > > to specify beginning/ending model
initialization
> > times
> > > > or
> > > > > > > > >> > > -fcst_valid_beg
> > > > > > > > >> > > > > and -fcst_valid_end to specify
beginning/ending
> > valid
> > > > > times.
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > If you find that you're running multiple
jobs on
> the
> > > > same
> > > > > > > subset
> > > > > > > > >> of
> > > > > > > > >> > > data
> > > > > > > > >> > > > > (e.g. process MPR to CNT, MPR to SL1L2, MPR
to
> CTC,
> > > MPR
> > > > to
> > > > > > > CTS),
> > > > > > > > >> it'd
> > > > > > > > >> > > be
> > > > > > > > >> > > > > more efficient to group those jobs into a
config
> > file.
> > > > > > > That'll
> > > > > > > > do
> > > > > > > > >> > the
> > > > > > > > >> > > > > filtering ONCE and write the filtered data
to a
> temp
> > > > file.
> > > > > > > Then
> > > > > > > > >> all
> > > > > > > > >> > > the
> > > > > > > > >> > > > > jobs read data from the temp instead of
starting
> > over
> > > > from
> > > > > > > > >> scratch.
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > Make sense?
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > John
> > > > > > > > >> > > > >
> > > > > > > > >> > > > >
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > On Mon, Apr 23, 2018 at 1:01 PM, Rosalyn
> MacCracken
> > -
> > > > NOAA
> > > > > > > > >> Affiliate
> > > > > > > > >> > > via
> > > > > > > > >> > > > RT
> > > > > > > > >> > > > > <met_help at ucar.edu> wrote:
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > > <URL: https://rt.rap.ucar.edu/rt/
> > > > > > > Ticket/Display.html?id=84822
> > > > > > > > >
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > > Hi John,
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > > That's actually only partially correct.
It's
> not
> > > > that I
> > > > > > > want
> > > > > > > > to
> > > > > > > > >> > use
> > > > > > > > >> > > > part
> > > > > > > > >> > > > > > of the MPR lines and discard the rest,
and I do
> > need
> > > > to
> > > > > > > > >> regenerate
> > > > > > > > >> > > > > > statistics.  Let me try to re-explain.
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > > Back in early March we switched from
getting our
> > > ASCAT
> > > > > obs
> > > > > > > > from
> > > > > > > > >> the
> > > > > > > > >> > > > > > prepbufr data, to getting it from the
MGDRLITE
> > data.
> > > > So,
> > > > > > > > >> processing
> > > > > > > > >> > > > > didn't
> > > > > > > > >> > > > > > change.  I was producing statistics at
certain
> > > > threshold
> > > > > > > > levels
> > > > > > > > >> for
> > > > > > > > >> > > > both
> > > > > > > > >> > > > > > GFS and ASCAT.  I had this set with the
> cat_thresh
> > > > list,
> > > > > > at
> > > > > > > > >> levels
> > > > > > > > >> > of
> > > > > > > > >> > > > > > 0,6,17, etc.  We found out after
processing for
> a
> > > > couple
> > > > > > of
> > > > > > > > >> weeks
> > > > > > > > >> > > that
> > > > > > > > >> > > > > the
> > > > > > > > >> > > > > > ASCAT data included these really small
values,
> > <1.0
> > > > m/s,
> > > > > > and
> > > > > > > > >> that
> > > > > > > > >> > > these
> > > > > > > > >> > > > > > small wind speeds were being included
into the
> > > > > statistics
> > > > > > > > >> > processing.
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > > So, a couple of questions.
> > > > > > > > >> > > > > > 1) Do I have to regenerate all of my
statistics
> > > > (*.cts,
> > > > > > > *.cnt
> > > > > > > > >> and
> > > > > > > > >> > > *ctc
> > > > > > > > >> > > > > > files) because of this error? Or, since I
have
> > > > threshold
> > > > > > > > levels
> > > > > > > > >> > set,
> > > > > > > > >> > > > will
> > > > > > > > >> > > > > > those small values be amoung the
statistics in
> the
> > > > > lowest
> > > > > > > > >> > thresholds?
> > > > > > > > >> > > > > > 2) I have the *.stat files, but, they are
spread
> > out
> > > > > into
> > > > > > > > >> separate
> > > > > > > > >> > > > > > directories like:
> > > > > > > > >> > > > > > /GFS/data/hourly/${YYYYMMDDHH}/*.stat
> > > > > > > > >> > > > > > Can I tell stat-analysis to "lookin"
directories
> > > with
> > > > a
> > > > > > > > wildcard
> > > > > > > > >> > > (like
> > > > > > > > >> > > > > > 201803*)?  If so, how?  Or, is I tell it
to look
> > in
> > > > > > > > >> > /GFS/data/hourly,
> > > > > > > > >> > > > > will
> > > > > > > > >> > > > > > it look in all the directories
recursively under
> > > > hourly?
> > > > > > > And,
> > > > > > > > >> it
> > > > > > > > >> > > > that's
> > > > > > > > >> > > > > > the case, can I give it a date range, so,
that
> it
> > > only
> > > > > > > > processes
> > > > > > > > >> > data
> > > > > > > > >> > > > > from
> > > > > > > > >> > > > > > March?
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > > Roz
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > > On Mon, Apr 23, 2018 at 2:18 PM, John
Halley
> > Gotway
> > > > via
> > > > > > RT <
> > > > > > > > >> > > > > > met_help at ucar.edu> wrote:
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > > > Hi Roz,
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > I read that you've run Point-Stat and
saved
> off
> > > the
> > > > > > > matched
> > > > > > > > >> pairs
> > > > > > > > >> > > > (MPR)
> > > > > > > > >> > > > > > > output line type.  And you'd like to
(1)
> filter
> > > > those
> > > > > > MPR
> > > > > > > > >> lines
> > > > > > > > >> > to
> > > > > > > > >> > > > > > discard
> > > > > > > > >> > > > > > > some of them and then (2) use the
filtered
> data
> > to
> > > > > > > > regenerate
> > > > > > > > >> > > summary
> > > > > > > > >> > > > > > > statistics.  Yes, this is easily done
using
> the
> > > > > > > > STAT-Analysis
> > > > > > > > >> > tool
> > > > > > > > >> > > in
> > > > > > > > >> > > > > > MET.
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > You wrote that you're verifying wind
speeds
> > > against
> > > > > > ASCAT
> > > > > > > > and
> > > > > > > > >> > that
> > > > > > > > >> > > > > you'd
> > > > > > > > >> > > > > > > like to exclude pairs where the
observed wind
> > > speed
> > > > is
> > > > > > > less
> > > > > > > > >> than
> > > > > > > > >> > 1
> > > > > > > > >> > > > m/s.
> > > > > > > > >> > > > > > > I'm just guessing here, but I'll
presume that
> > you
> > > > want
> > > > > > to
> > > > > > > > >> produce
> > > > > > > > >> > > > both
> > > > > > > > >> > > > > > > SL1L2 and CNT output line types.
Here's what
> > the
> > > > > > > > >> STAT-Analysis
> > > > > > > > >> > job
> > > > > > > > >> > > > > would
> > > > > > > > >> > > > > > > look like:
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > # Filter MPR's and write SL1L2 output
line
> > > > > > > > >> > > > > > > stat_analysis \
> > > > > > > > >> > > > > > >    -lookin input.stat \            #
List a
> > .stat
> > > > > > filename
> > > > > > > > or
> > > > > > > > >> > > > directory
> > > > > > > > >> > > > > > > containing them
> > > > > > > > >> > > > > > >    -job aggregate_stat \        # Job
type is
> > > > > > > aggregate_stat
> > > > > > > > >> > > > > > >    -line_type MPR \              #
Input line
> > > type =
> > > > > MPR
> > > > > > > > >> > > > > > >    -out_line_type SL1L2 \      # Output
line
> > type
> > > =
> > > > > > SL1L2
> > > > > > > > >> partial
> > > > > > > > >> > > > sums
> > > > > > > > >> > > > > > >    -fcst_var WIND \               #
Only
> process
> > > > lines
> > > > > > > where
> > > > > > > > >> > > FCST_VAR
> > > > > > > > >> > > > > > > column = WIND
> > > > > > > > >> > > > > > >    -column_thresh OBS gt1 \ # Only use
MPR
> lines
> > > > where
> > > > > > OBS
> > > > > > > > >> column
> > > > > > > > >> > > > 1
> > > > > > > > >> > > > > > >    -by
> > > > > > > > >> > > > > > > MODEL,FCST_LEV,FCST_INIT_BEG,
> > > > > > > FCST_LEAD,VX_MASK,INTERP_MTHD,
> > > > > > > > >> > > > INTERP_PNTS
> > > > > > > > >> > > > > #
> > > > > > > > >> > > > > > > Run this same job for each unique
combination
> of
> > > > these
> > > > > > > > columns
> > > > > > > > >> > > > > > >    -out_stat MPR_to_SL1L2.stat
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > This will read produce an output .stat
file
> > > > containing
> > > > > > an
> > > > > > > > >> SL1L2
> > > > > > > > >> > > line
> > > > > > > > >> > > > > for
> > > > > > > > >> > > > > > > each unique combination of the header
columns
> > > listed
> > > > > > after
> > > > > > > > the
> > > > > > > > >> > > "-by"
> > > > > > > > >> > > > > > > option.  To generate CNT output lines
instead,
> > > you'd
> > > > > > run a
> > > > > > > > >> second
> > > > > > > > >> > > job
> > > > > > > > >> > > > > > where
> > > > > > > > >> > > > > > > you replace SL1L2 with CNT.  You could
run
> these
> > > > jobs
> > > > > on
> > > > > > > the
> > > > > > > > >> > > command
> > > > > > > > >> > > > > line
> > > > > > > > >> > > > > > > or group them together into a STAT-
Analysis
> > config
> > > > > file,
> > > > > > > if
> > > > > > > > >> you
> > > > > > > > >> > > > prefer.
> > > > > > > > >> > > > > > > Both would work.
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > You could run this once for each input
.stat
> > file
> > > > > you're
> > > > > > > > >> > > > processing...
> > > > > > > > >> > > > > or
> > > > > > > > >> > > > > > > you could pass many input .stat files
to the
> > job.
> > > > > Since
> > > > > > > > >> > > > FCST_INIT_BEG
> > > > > > > > >> > > > > > and
> > > > > > > > >> > > > > > > FCST_LEAD are included in the "-by"
option,
> > you'll
> > > > get
> > > > > > > > >> separate
> > > > > > > > >> > > > output
> > > > > > > > >> > > > > > > lines for each unique time.
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > Hope that helps get you going.
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > Thanks,
> > > > > > > > >> > > > > > > John
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > On Thu, Apr 19, 2018 at 9:23 AM, Julie
> > Prestopnik
> > > > via
> > > > > > RT <
> > > > > > > > >> > > > > > > met_help at ucar.edu>
> > > > > > > > >> > > > > > > wrote:
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Tic
> > > > > > > > >> ket/Display.html?id=84822
> > > > > > > > >> > >
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > > > Hi Roz.  My apologies for the delay
in
> > > responding.
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > > > Unfortunately, John is out of the
office
> this
> > > > week,
> > > > > > and
> > > > > > > I
> > > > > > > > do
> > > > > > > > >> > not
> > > > > > > > >> > > > know
> > > > > > > > >> > > > > > the
> > > > > > > > >> > > > > > > > answers to your questions.  As you
said, I
> > would
> > > > > also
> > > > > > > > >> imagine
> > > > > > > > >> > > that
> > > > > > > > >> > > > > > > > point-stat is using those small
values as
> > > matched
> > > > > > pairs.
> > > > > > > > >> > Also, I
> > > > > > > > >> > > > do
> > > > > > > > >> > > > > > not
> > > > > > > > >> > > > > > > > believe there is a way to regenerate
the
> > > > point-stat
> > > > > > > > >> statistics
> > > > > > > > >> > > > > without
> > > > > > > > >> > > > > > > > using the original GFS data.  I
cannot say
> > with
> > > > > > > certainty,
> > > > > > > > >> > > however.
> > > > > > > > >> > > > > > > Thank
> > > > > > > > >> > > > > > > > you for your patience in advance.
We'll
> get a
> > > > > > definite
> > > > > > > > >> > response
> > > > > > > > >> > > to
> > > > > > > > >> > > > > you
> > > > > > > > >> > > > > > > as
> > > > > > > > >> > > > > > > > soon as we can.
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > > > Thanks,
> > > > > > > > >> > > > > > > > Julie
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > > > On Wed, Apr 18, 2018 at 6:31 AM,
Rosalyn
> > > > MacCracken
> > > > > -
> > > > > > > NOAA
> > > > > > > > >> > > > Affiliate
> > > > > > > > >> > > > > > via
> > > > > > > > >> > > > > > > RT
> > > > > > > > >> > > > > > > > <met_help at ucar.edu> wrote:
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > Wed Apr 18 06:31:39 2018: Request
84822
> was
> > > > acted
> > > > > > > upon.
> > > > > > > > >> > > > > > > > > Transaction: Ticket created by
> > > > > > > > >> rosalyn.maccracken at noaa.gov
> > > > > > > > >> > > > > > > > >        Queue: met_help
> > > > > > > > >> > > > > > > > >      Subject: question on
regenerating
> data
> > > > > > > > >> > > > > > > > >        Owner: Nobody
> > > > > > > > >> > > > > > > > >   Requestors:
rosalyn.maccracken at noaa.gov
> > > > > > > > >> > > > > > > > >       Status: new
> > > > > > > > >> > > > > > > > >  Ticket <URL:
https://rt.rap.ucar.edu/rt/
> > > > > > > > >> > > > > > Ticket/Display.html?id=84822
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > Hi,
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > I'm running point-stat using ASCAT
and GFS
> > > data
> > > > to
> > > > > > > > verify
> > > > > > > > >> > > surface
> > > > > > > > >> > > > > > wind
> > > > > > > > >> > > > > > > > > speeds.  I found an error in my
ASCAT
> input
> > > data
> > > > > > that
> > > > > > > > goes
> > > > > > > > >> > back
> > > > > > > > >> > > > to
> > > > > > > > >> > > > > > Mar
> > > > > > > > >> > > > > > > 7.
> > > > > > > > >> > > > > > > > > I had switched the input source of
the
> data,
> > > and
> > > > > > > within
> > > > > > > > >> the
> > > > > > > > >> > new
> > > > > > > > >> > > > > data
> > > > > > > > >> > > > > > > > files,
> > > > > > > > >> > > > > > > > > it was allowing very small values
(< 1
> m/s)
> > to
> > > > be
> > > > > > used
> > > > > > > > as
> > > > > > > > >> > data
> > > > > > > > >> > > > > points
> > > > > > > > >> > > > > > > in
> > > > > > > > >> > > > > > > > > the verification.  I imagine that
this is
> an
> > > > > issue,
> > > > > > > > since
> > > > > > > > >> > > > > point-stat
> > > > > > > > >> > > > > > is
> > > > > > > > >> > > > > > > > > using these very small values as
matched
> > pairs
> > > > > with
> > > > > > > the
> > > > > > > > >> GFS,
> > > > > > > > >> > > > > correct?
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > Is there a way to regenerate the
> point-stat
> > > > > > statistics
> > > > > > > > >> > without
> > > > > > > > >> > > > > using
> > > > > > > > >> > > > > > > the
> > > > > > > > >> > > > > > > > > original GFS data?  I do have the
*stat
> and
> > > the
> > > > > *mpr
> > > > > > > > >> files,
> > > > > > > > >> > and
> > > > > > > > >> > > > it
> > > > > > > > >> > > > > is
> > > > > > > > >> > > > > > > > > pretty easy to identify where the
bad
> values
> > > are
> > > > > > > > located.
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > Thanks,
> > > > > > > > >> > > > > > > > > Roz
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > --
> > > > > > > > >> > > > > > > > > Rosalyn MacCracken
> > > > > > > > >> > > > > > > > > Support Scientist
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > Ocean Applications Branch
> > > > > > > > >> > > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > >> > > > > > > > > NCWCP
> > > > > > > > >> > > > > > > > > 5830 University Research Ct
> > > > > > > > >> > > > > > > > > College Park, MD  20740-3818
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > (p) 301-683-1551
> > > > > > > > >> > > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > > --
> > > > > > > > >> > > > > > Rosalyn MacCracken
> > > > > > > > >> > > > > > Support Scientist
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > > Ocean Applications Branch
> > > > > > > > >> > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > >> > > > > > NCWCP
> > > > > > > > >> > > > > > 5830 University Research Ct
> > > > > > > > >> > > > > > College Park, MD  20740-3818
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > > (p) 301-683-1551
> > > > > > > > >> > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > >
> > > > > > > > >> > > > >
> > > > > > > > >> > > >
> > > > > > > > >> > > >
> > > > > > > > >> > > > --
> > > > > > > > >> > > > Rosalyn MacCracken
> > > > > > > > >> > > > Support Scientist
> > > > > > > > >> > > >
> > > > > > > > >> > > > Ocean Applications Branch
> > > > > > > > >> > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > >> > > > NCWCP
> > > > > > > > >> > > > 5830 University Research Ct
> > > > > > > > >> > > > College Park, MD  20740-3818
> > > > > > > > >> > > >
> > > > > > > > >> > > > (p) 301-683-1551
> > > > > > > > >> > > > rosalyn.maccracken at noaa.gov
> > > > > > > > >> > > >
> > > > > > > > >> > > >
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > >> > --
> > > > > > > > >> > Rosalyn MacCracken
> > > > > > > > >> > Support Scientist
> > > > > > > > >> >
> > > > > > > > >> > Ocean Applications Branch
> > > > > > > > >> > NOAA/NWS Ocean Prediction Center
> > > > > > > > >> > NCWCP
> > > > > > > > >> > 5830 University Research Ct
> > > > > > > > >> > College Park, MD  20740-3818
> > > > > > > > >> >
> > > > > > > > >> > (p) 301-683-1551
> > > > > > > > >> > rosalyn.maccracken at noaa.gov
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Rosalyn MacCracken
> > > > > > > > > Support Scientist
> > > > > > > > >
> > > > > > > > > Ocean Applications Branch
> > > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > > NCWCP
> > > > > > > > > 5830 University Research Ct
> > > > > > > > > College Park, MD  20740-3818
> > > > > > > > >
> > > > > > > > > (p) 301-683-1551
> > > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Rosalyn MacCracken
> > > > > > > > Support Scientist
> > > > > > > >
> > > > > > > > Ocean Applications Branch
> > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > NCWCP
> > > > > > > > 5830 University Research Ct
> > > > > > > > College Park, MD  20740-3818
> > > > > > > >
> > > > > > > > (p) 301-683-1551
> > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Rosalyn MacCracken
> > > > > > Support Scientist
> > > > > >
> > > > > > Ocean Applications Branch
> > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > NCWCP
> > > > > > 5830 University Research Ct
> > > > > > College Park, MD  20740-3818
> > > > > >
> > > > > > (p) 301-683-1551
> > > > > > rosalyn.maccracken at noaa.gov
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Rosalyn MacCracken
> > > > Support Scientist
> > > >
> > > > Ocean Applications Branch
> > > > NOAA/NWS Ocean Prediction Center
> > > > NCWCP
> > > > 5830 University Research Ct
> > > > College Park, MD  20740-3818
> > > >
> > > > (p) 301-683-1551
> > > > rosalyn.maccracken at noaa.gov
> > > >
> > > >
> > >
> > >
> >
> >
> > --
> > Rosalyn MacCracken
> > Support Scientist
> >
> > Ocean Applications Branch
> > NOAA/NWS Ocean Prediction Center
> > NCWCP
> > 5830 University Research Ct
> > College Park, MD  20740-3818
> >
> > (p) 301-683-1551
> > rosalyn.maccracken at noaa.gov
> >
> >
>
>

--
Rosalyn MacCracken
Support Scientist

Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD  20740-3818

(p) 301-683-1551
rosalyn.maccracken at noaa.gov

------------------------------------------------
Subject: question on regenerating data
From: John Halley Gotway
Time: Mon May 07 11:06:17 2018

Roz,

Yes, the "-set_hdr" option is specific to each job.  If your jobs are
defined in the config file, then yes, you'd need to specify it there.
Indeed, getting the timestamps consistent really is just cosmetic.  If
you're looping over many valid times, I'd suggest using environment
variables:
  -set_hdr OBS_VALID_BEG ${CUR_VALID_BEG} -set_hdr OBS_VALID_END ${CUR_VALID_END}

Using environment variables in MET configuration files makes scripting
much more convenient.
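
As a rough sketch of that scripting pattern (my own illustration, not taken
from the MET documentation): the loop below computes a +/- 30 minute window
around each valid hour and exports CUR_VALID_BEG and CUR_VALID_END so that a
config file containing the "-set_hdr" settings above could pick them up.  The
helper name window_edge, the temp directory path, and the log file names are
made up for illustration.  It assumes GNU date, and the stat_analysis call is
only echoed (a dry run) rather than executed.

```shell
#!/bin/sh
# Sketch only: assumes GNU date (-d and @epoch); paths are placeholders.

# Print the timestamp for a valid time (YYYYMMDDHH) shifted by N seconds,
# formatted as YYYYMMDD_HHMMSS.
window_edge() {
  valid=$1
  offset=$2
  y=$(printf '%s' "$valid" | cut -c1-4)
  m=$(printf '%s' "$valid" | cut -c5-6)
  d=$(printf '%s' "$valid" | cut -c7-8)
  hh=$(printf '%s' "$valid" | cut -c9-10)
  epoch=$(date -u -d "$y-$m-$d $hh:00:00" +%s)
  date -u -d "@$((epoch + offset))" +%Y%m%d_%H%M%S
}

# Loop over the valid hours to process (hypothetical list).
for valid in 2018030700 2018030701; do
  CUR_VALID_BEG=$(window_edge "$valid" -1800)   # 30 minutes before the hour
  CUR_VALID_END=$(window_edge "$valid" 1800)    # 30 minutes after the hour
  export CUR_VALID_BEG CUR_VALID_END

  # Dry run: print the command instead of running it.
  echo "stat_analysis -lookin /path/to/tmp_dir -config STATAnalysisConfig" \
       "-log run_sa_${valid}.log -v 3" \
       "# window ${CUR_VALID_BEG} to ${CUR_VALID_END}"
done
```

Removing the echo would run the real command once per valid hour, with the
config file seeing a different window each time.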

However, STAT-Analysis doesn't have the ability to append to an output
file.  If you write to the same output file name, it'll *clobber* that
file (i.e. replace it).
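
One possible workaround, sketched under assumptions: write each run to its own
uniquely named file and, if a single file is wanted later, combine them
outside of MET.  The helper below (merge_stat is a made-up name) keeps the
header from the first file and skips the first line of the others.  It assumes
every input file starts with exactly one header line, and the two input files
are toy placeholders, not real MET output.  Since "-lookin" accepts multiple
files, leaving the outputs separate should work just as well.

```shell
#!/bin/sh
# Sketch only: merge .stat-style text files, keeping one header line.

# Usage: merge_stat OUTPUT INPUT1 [INPUT2 ...]
merge_stat() {
  out=$1
  shift
  first=1
  : > "$out"                      # start with an empty output file
  for f in "$@"; do
    if [ "$first" -eq 1 ]; then
      cat "$f" >> "$out"          # first file: keep header and data
      first=0
    else
      tail -n +2 "$f" >> "$out"   # later files: skip the header line
    fi
  done
}

# Toy demonstration with placeholder content.
tmp=$(mktemp -d)
printf 'VERSION MODEL LINE_TYPE\nV6.0 GFS CTS\n' > "$tmp/agg_cts_2018030700.stat"
printf 'VERSION MODEL LINE_TYPE\nV6.0 GFS CTS\n' > "$tmp/agg_cts_2018030701.stat"
merge_stat "$tmp/agg_cts_all.stat" \
  "$tmp/agg_cts_2018030700.stat" "$tmp/agg_cts_2018030701.stat"
cat "$tmp/agg_cts_all.stat"
```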

Hope that helps.

Thanks,
John

On Mon, May 7, 2018 at 10:38 AM, Rosalyn MacCracken - NOAA Affiliate via RT
<met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
>
> Hi John,
>
> So, it sounds like I'm ok either way with the timestamp.  If I don't use
> -set_hdr, it sets the correct beginning and end time according to the mpr
> file, or, I can use -set_hdr for consistency with the other files, but,
> that's more "cosmetic".
>
> Oh, but, that -set_hdr command option is within the config file, correct?
> So, you really couldn't loop through that and pass a time variable into the
> command.  So, it may just be easiest to leave it out.
>
> So, since my processing would take 20 minutes for regenerating one day's
> worth of data, I was thinking, I would do all my processing for the North
> Atlantic first, so, I can look at how we did with those Nor'easters in
> March.  So, that's processing the 00z, 01z, 11z and 12z time periods first,
> since that is when ASCAT passes over the North Atlantic.  So, I would copy
> those time periods into a temp directory and use the -lookin option to
> process those 4 time periods for my entire period (maybe 1 week at a
> time).
>
> So, this will produce 1 file with 00z, 01z, 11z and 12z data, for each week,
> correct?  And, the only way to get individual files is to copy the data,
> one hour at a time, process, delete the file, then rinse and repeat.  That
> might be hard to do.  I may have to think about that...
>
> So, if it's one file, with all the data for the week, at selected hours,
> what happens when I have time to run the rest of the data?  I just write
> that to a different file, and then, maybe append that to the end of the
> first file?  Or, just leave it separate?
>
> I guess I just have to think about what I'm going to do next with these
> files, and the easiest way to do that.
>
> Roz
>
> On Mon, May 7, 2018 at 11:48 AM, John Halley Gotway via RT
> <met_help at ucar.edu> wrote:
>
> > Roz,
> >
> > I understand that you're suspicious about the beginning and ending time
> > stamps in the OBS_VALID_BEG and _END columns.  You're comparing the
> > original output from Point-Stat to the output that you're getting from
> > STAT-Analysis.  However, those timestamps can be different without there
> > actually being a problem.  Here's why...
> >
> > When you run Point-Stat, the obs_window setting in the config file defines
> > the matching time window.  If your forecast is valid at time T, the
> > matching time window is defined as T+obs_window.beg to T+obs_window.end.
> > The point observations may actually fall anywhere in that time window...
> > but it's that time window that's reported in the summary line types (like
> > CTC, CTS, SL1L2, and CNT).  Since the MPR line type is specific to each
> > observation value, the *actual* timestamp of that observation is reported
> > in that line.
> >
> > When you run STAT-Analysis to process those MPR lines, it reads the
> > OBS_VALID_BEG and OBS_VALID_END columns.  And it keeps track of the minimum
> > OBS_VALID_BEG timestamp and the maximum OBS_VALID_END timestamp.  When it
> > writes output CTC, CTS, SL1L2, or CNT lines it reports the minimum/maximum
> > timestamp values it found in the data.
> >
> > So Point-Stat reports the *REQUESTED TIME WINDOW* in the OBS_VALID_BEG and
> > OBS_VALID_END columns... while STAT-Analysis reports the *ACTUAL TIME
> > WINDOW*.  And in general, those won't be the same.  So this isn't
> > necessarily a problem.
> >
> > If for consistency, you'd like to explicitly set the OBS_VALID_BEG and
> > OBS_VALID_END timestamps in the output, you can use the "-set_hdr" job
> > command option to do so:
> >    -set_hdr OBS_VALID_BEG 20180307_003000 -set_hdr OBS_VALID_END 20180307_013000
> >
> > Thanks,
> > John
> >
> > On Sun, May 6, 2018 at 2:12 PM, Rosalyn MacCracken - NOAA Affiliate via RT
> > <met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > >
> > > Hi John,
> > >
> > > Sorry it took me so long to get back to you.  My step-daughter came in to
> > > town, and I thought that I could get some work done while she was here,
> > > but, didn't.  Then, I totally forgot to email you back.  Sorry for leaving
> > > you hanging!
> > >
> > > Anyway, I was able to play around with the STATAnalysis config file you
> > > sent me.  I tried it out with only 1 hour timestep, instead of all the
> > > files for one day.  I wanted to see what kind of time it would take to
> > > process this on my machine.  So, it was quick, 45 seconds.  But, of course
> > > your run took 18 minutes.  The script was probably reading 20 some files.
> > > That makes sense.
> > >
> > > So, then, I looked at the output, and it wasn't quite what I expected, and
> > > doesn't quite match the stats from the other processing.  This is what I
> > > did:
> > >
> > > 1)  I copied the 00z only *20180307*.stat file to a temp directory.  Before
> > > I did this, I looked at the matching *.mpr file, and saw that the
> > > OBS_VALID_BEG was 20180307_000000 and the OBS_VALID_END was
> > > 20180307_002700.
> > > 2)  Ran the run_sa.sh script and generated the CTS, CTC and CNT files.
> > > 3)  I looked at the new agg_cts file, and the OBS_VALID_BEG and _END
> > > matched the *.mpr file in step 1.
> > > 4)  I looked at the original CTS file, and the OBS_VALID_BEG was
> > > 20180307_223000 and the OBS_VALID_END was 20180307_013000.  So, that was
> > > our original way of processing.  I bet if I looked at a more recent file,
> > > it would be more like OBS_VALID_BEG was 20180307_233000 and the
> > > OBS_VALID_END was 20180307_003000.
> > > 5)  I looked at the original *mpr for 01z, and the OBS_VALID_BEG was
> > > 20180307_003000 and the OBS_VALID_END was 20180307_012100
> > >
> > > So, this tells me that I'm not matching observation times, and I'm not sure
> > > how to fix it to match things up.  First, we use a +/- 30 min window for
> > > ASCAT obs, centered on the hour.  For example, if we are processing the 00z
> > > hour, we will match observations from 233000 from the day before to 003000
> > > the current day.  Actually, we used to do an hour window on either side,
> > > but, we have more observations now at each hour.  (See the explanation in
> > > #4 above)
> > >
> > > Anyway, how do I create the CTS, CTC and CNT files for the +/- 30 min
> > > window?  Is there a way to dynamically indicate this 30min window, so that
> > > I don't have to go into the config file every time I run STATanalysis and
> > > change it?
> > >
> > > Roz
> > >
> > > On Thu, Apr 26, 2018 at 4:14 PM, John Halley Gotway via RT <
> > > met_help at ucar.edu> wrote:
> > >
> > > > Roz,
> > > >
> > > > The CSI statistic is computed from a 2x2 contingency table.  A 2x2
> > > > contingency table is defined by a single threshold.  Looking in the .stat
> > > > files you sent, I see that you've applied many thresholds to generate many
> > > > 2x2 contingency tables and corresponding statistics.  Yes, it is true that
> > > > for most of those thresholds, the "bad" observation values will fall into
> > > > the "non-event" category.  But those non-event counts are included in the
> > > > computation of some stats, including CSI.  So even though the bad
> > > > observations aren't very interesting, they really are impacting the
> > > > statistics.
> > > >
> > > > John
> > > >
> > > > On Wed, Apr 25, 2018 at 10:08 AM, Rosalyn MacCracken - NOAA Affiliate via
> > > > RT <met_help at ucar.edu> wrote:
> > > >
> > > > >
> > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > >
> > > > > Figures.  I just calculated how long it will take me to regenerate data
> > > > > for 03072018 - 04122018.  It will take me 912 hours.  ;-(
> > > > >
> > > > > Ok, I know I asked this, but, if I had an OBS value of 0.01 and a matched
> > > > > GFS point of 10 m/s, and I had a low threshold of 0-5 m/s, 6-10 m/s and
> > > > > 10-15 m/s, and say, CSI was calculated.  Which threshold would be used for
> > > > > the output, the 0-5 or 6-10?  And, would the 10-15 threshold even be
> > > > > affected?
> > > > >
> > > > > Roz
> > > > >
> > > > > On Wed, Apr 25, 2018 at 11:40 AM, John Halley Gotway via RT
> > > > > <met_help at ucar.edu> wrote:
> > > > >
> > > > > > Roz,
> > > > > >
> > > > > > I think it'd take just as long.  The slow part is reading the data...
> > > > > > not applying a threshold.
> > > > > >
> > > > > > John
> > > > > >
> > > > > > On Wed, Apr 25, 2018 at 9:18 AM, Rosalyn MacCracken - NOAA Affiliate via
> > > > > > RT <met_help at ucar.edu> wrote:
> > > > > >
> > > > > > >
> > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > >
> > > > > > > Hi John,
> > > > > > >
> > > > > > > Thanks for doing that for me.  I'll take a look at the info you sent me
> > > > > > > this afternoon.  I'm in the middle of doing something right now...trying
> > > > > > > to make a different program work.  ;-/
> > > > > > >
> > > > > > > I wonder if it will be quicker than 18 minutes for some of the thresholds
> > > > > > > that have higher wind speeds, and not as many instances (or 0 instances).
> > > > > > > Or, will it take just as long, since it still needs to read through the
> > > > > > > entire *.stat file anyway?
> > > > > > >
> > > > > > > Roz
> > > > > > >
> > > > > > > On Tue, Apr 24, 2018 at 7:06 PM, John Halley Gotway via RT
> > > > > > > <met_help at ucar.edu> wrote:
> > > > > > >
> > > > > > > > Hi Roz,
> > > > > > > >
> > > > > > > > Thanks for sending the sample data.  I grabbed it and used it to run some
> > > > > > > > sample jobs:
> > > > > > > >
> > > > > > > > time /d1/johnhg/MET/MET_releases/met-6.0/bin/stat_analysis \
> > > > > > > > -lookin /d1/johnhg/MET/MET_Help/maccracken_data_20180424/opc_test/home/opc_test/data/met_verif/GFS/data/hourly \
> > > > > > > > -config STATAnalysisConfig \
> > > > > > > > -log run_sa.log -v 3
> > > > > > > >
> > > > > > > > I used the "-lookin" option to point to all the data you sent.
> > > > > > > >
> > > > > > > > I've attached the...
> > > > > > > > (1) config file I used
> > > > > > > > (2) log file that was generated
> > > > > > > > (3) output .stat files
> > > > > > > >
> > > > > > > > Looking at the jobs, you'll see that I've included 5 of them...
> > > > > > > > - Generate CNT output
> > > > > > > > - Generate CTC >= 0.0 output
> > > > > > > > - Generate CTS >= 0.0 output
> > > > > > > > - Generate CTC >= 5.5689 output
> > > > > > > > - Generate CTS >= 5.5689 output
> > > > > > > >
> > > > > > > > Unfortunately, you'll need to define separate jobs for each threshold
> > > > > > > > you'd like to use.  Although, you shouldn't use >=0.0 since that's always
> > > > > > > > true.
> > > > > > > >
> > > > > > > > Also unfortunately, this is pretty slow.  On my machine, it took like 18
> > > > > > > > minutes for these 5 jobs!
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > John
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, Apr 24, 2018 at 2:09 PM, Rosalyn MacCracken - NOAA Affiliate via
> > > > > > > > RT <met_help at ucar.edu> wrote:
> > > > > > > >
> > > > > > > > >
> > > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > >
> > > > > > > > > Hi John,
> > > > > > > > >
> > > > > > > > > I put my file on the ftp site.  Let me know what you find.  You'll see
> > > > > > > > > those really low OBS values (0.01, 0.02, and so on).
> > > > > > > > >
> > > > > > > > > Thanks!
> > > > > > > > >
> > > > > > > > > Roz
> > > > > > > > >
> > > > > > > > > On Tue, Apr 24, 2018 at 2:53 PM, Rosalyn MacCracken - NOAA Affiliate
> > > > > > > > > <rosalyn.maccracken at noaa.gov> wrote:
> > > > > > > > >
> > > > > > > > > > Ok, I'll get that over to the ftp site.  I have to make sure that I find
> > > > > > > > > > a day that has all the data in it.  Sometimes the data isn't available
> > > > > > > > > > when the script runs.  A little annoying, but, that's operations...
> > > > > > > > > >
> > > > > > > > > > I'll let you know when I get the file to the ftp site.
> > > > > > > > > >
> > > > > > > > > > Thanks!
> > > > > > > > > >
> > > > > > > > > > Roz
> > > > > > > > > >
> > > > > > > > > > On Tue, Apr 24, 2018 at 2:49 PM, John Halley Gotway via RT
> > > > > > > > > > <met_help at ucar.edu> wrote:
> > > > > > > > > >
> > > > > > > > > >> Roz,
> > > > > > > > > >>
> > > > > > > > > >> Yes, we do.  Follow the instructions here:
> > > > > > > > > >>    https://dtcenter.org/met/users/support/met_help.php#ftp
> > > > > > > > > >>
> > > > > > > > > >> I'd suggest making a tar file for one day and posting it to the ftp
> > > > > > > > > >> site:
> > > > > > > > > >>    tar -cvzf sample.tar.gz /GFS/data/hourly/20180305*
> > > > > > > > > >>
> > > > > > > > > >> Thanks,
> > > > > > > > > >> John
> > > > > > > > > >>
> > > > > > > > > > >> On Tue, Apr 24, 2018 at 11:57 AM, Rosalyn MacCracken - NOAA Affiliate via
> > > > > > > > > > >> RT <met_help at ucar.edu> wrote:
> > > > > > > > > > >>
> > > > > > > > > > >> >
> > > > > > > > > > >> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > > >> >
> > > > > > > > > > >> > Hi John,
> > > > > > > > > > >> >
> > > > > > > > > > >> > Yes, it does seem that the -config option is the way to go to recreate
> > > > > > > > > > >> > those 3 files.  I'll be sure to have a unique file name, or, mv the
> > > > > > > > > > >> > output file to a different name before running the command again.
> > > > > > > > > > >> > Thanks for pointing that out.
> > > > > > > > > > >> >
> > > > > > > > > > >> > I'm teleworking for the next couple of weeks, so, I can't download and
> > > > > > > > > > >> > send you *.stat files like I can when I'm at my computer at work.  I
> > > > > > > > > > >> > don't have access to theia or wcoss anymore.  You have an ftp server
> > > > > > > > > > >> > that I can upload data to, right?  If not, I can try and fiddle around
> > > > > > > > > > >> > with this tomorrow and see if I can't get this to work the way I want to.
> > > > > > > > > >> >
> > > > > > > > > >> > Roz
> > > > > > > > > >> >
> > > > > > > > > > >> > On Tue, Apr 24, 2018 at 11:42 AM, John Halley Gotway via RT
> > > > > > > > > > >> > <met_help at ucar.edu> wrote:
> > > > > > > > > > >> >
> > > > > > > > > > >> > > Roz,
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Each "-job aggregate_stat" only generates a single output line type.
> > > > > > > > > > >> > > So using "-out_line_type CTC,CTS,CNT" will not work.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > You'll need to run separate jobs for each output line type you want to
> > > > > > > > > > >> > > generate.  That's why I'd recommend grouping those multiple jobs
> > > > > > > > > > >> > > together into a single STAT-Analysis config file.  Then you'd call
> > > > > > > > > > >> > > STAT-Analysis once using the "-config" command line option.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Another issue is that if you set "-out_stat" to the same filename,
> > > > > > > > > > >> > > it'll get overwritten by each job.  STAT-Analysis will overwrite that
> > > > > > > > > > >> > > output file rather than appending to it.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > You could send me a day's worth of .stat output files
> > > > > > > > > > >> > > (/GFS/data/hourly/20180305*) and I could send you some suggestions.
> > > > > > > > > > >> > > Or if you have access to theia you could copy them up there and point
> > > > > > > > > > >> > > me to it.
> > > > > > > > > >> > >
> > > > > > > > > >> > > Thanks,
> > > > > > > > > >> > > John
> > > > > > > > > >> > >
> > > > > > > > > > >> > > On Tue, Apr 24, 2018 at 7:48 AM, Rosalyn MacCracken - NOAA Affiliate via
> > > > > > > > > > >> > > RT <met_help at ucar.edu> wrote:
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > Hi John,
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > Yes, that makes sense.  Those very small values (<1.0 m/s), are bad
> > > > > > > > > > >> > > > values.  That's why they shouldn't be included in the processing.
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > So, I need to just regenerate hourly data, one hour at a time.  Would
> > > > > > > > > > >> > > > it make sense to use a shell script and loop stat-analysis?
> > > > > > > > > > >> > > > Something like:
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > for day in 11 12
> > > > > > > > > > >> > > > do
> > > > > > > > > > >> > > >   for cycle in 00 06 12 18
> > > > > > > > > > >> > > >   do
> > > > > > > > > > >> > > >     stat_analysis -lookin /GFS/data/hourly/201803${day}${cycle}/*.stat \
> > > > > > > > > > >> > > >       -job aggregate_stat \
> > > > > > > > > > >> > > >       -line_type MPR \
> > > > > > > > > > >> > > >       -out_line_type CTC,CTS,CNT \
> > > > > > > > > > >> > > >       -fcst_var WIND \
> > > > > > > > > > >> > > >       -column_thresh OBS gt1 \
> > > > > > > > > > >> > > >       -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > > > > > > > >> > > >       -out_stat /new_rerun_stat_files/MPR_to_CTC_CTS_CNT.stat
> > > > > > > > > > >> > > >   done
> > > > > > > > > > >> > > > done
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > or, something like that?  And, will this regenerate hour forecasts, at
> > > > > > > > > > >> > > > each forecast and lead hour?  I guess it will see the forecast and lead
> > > > > > > > > > >> > > > hour from the *.stat file, and whatever *stat file is in the directory,
> > > > > > > > > > >> > > > it will regenerate those hours, right?
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > So, I need to regenerate the CTC, CNT and CTS files.  That's why I did:
> > > > > > > > > > >> > > >  -out_line_type CTC,CTS,CNT
> > > > > > > > > > >> > > > but, will that make 3 separate files, or just another *.stat file?
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > Roz
> > > > > > > > > >> > > >
> > > > > > > > > >> > > >
> > > > > > > > > > >> > > > On Mon, Apr 23, 2018 at 4:01 PM, John Halley Gotway via RT
> > > > > > > > > > >> > > > <met_help at ucar.edu> wrote:
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > > Roz,
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > It is ultimately up to you to decide which matched pairs you want to
> > > > > > > > > > >> > > > > include in your processing.  Do you consider those small (<1.0 m/s)
> > > > > > > > > > >> > > > > observation values to be corrupt and incorrect in some way or just not
> > > > > > > > > > >> > > > > very interesting?  If they really are BAD data values, I agree that you
> > > > > > > > > > >> > > > > should exclude them from your analysis.  But if they're just
> > > > > > > > > > >> > > > > uninteresting values of low wind speed, then there's no reason why you
> > > > > > > > > > >> > > > > should exclude them.  For example, *most* of the time it isn't raining,
> > > > > > > > > > >> > > > > but we often include observations of 0 precip.
> > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > There are three configurable options in Point-Stat that may be useful
> > > > > > > > > > >> > > > > here:
> > > > > > > > > > >> > > > > (1) You already know and use the "cat_thresh" option.  This threshold
> > > > > > > > > > >> > > > > defines the events and non-events for a 2x2 contingency table.  This
> > > > > > > > > > >> > > > > threshold affects the contents of FHO, CTC, CTS, MCTC, and MCTS line
> > > > > > > > > > >> > > > > types that Point-Stat writes.
> > > > > > > > > > >> > > > > (2) The "cnt_thresh" option is a more recent addition.  Perhaps this
> > > > > > > > > > >> > > > > was a poor name choice, but instead of defining categories, it's really
> > > > > > > > > > >> > > > > a *filtering* threshold.  This threshold affects the contents of the
> > > > > > > > > > >> > > > > SL1L2, SAL1L2, and CNT line types that Point-Stat writes.  For example,
> > > > > > > > > > >> > > > > setting "cnt_thresh = [ ge6, ge17 ];" will produce 2 CNT and 2 SL1L2
> > > > > > > > > > >> > > > > output lines containing only those points where the wind speed was
> > > > > > > > > > >> > > > > at least 6 (>=6) and at least 17 (>=17), respectively.
> > > > > > > > > > >> > > > > (3) The "wind_thresh" option is very similar to the "cnt_thresh" option
> > > > > > > > > > >> > > > > but affects the contents of the VL1L2, VAL1L2, and VCNT (new in
> > > > > > > > > > >> > > > > met-7.0) line types.  Only those U/V pairs that meet the specified wind
> > > > > > > > > > >> > > > > speed threshold are included in the output.
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > For both "cnt_thresh" and "wind_thresh", the default value in the
> > > > > > > > > > >> > > > > config file is "NA", meaning, do not apply any filtering threshold
> > > > > > > > > > >> > > > > criteria.
> > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > You have the flexibility to run STAT-Analysis on the MPR output lines
> > > > > > > > > > >> > > > > to recompute any of these output line types applying whatever filtering
> > > > > > > > > > >> > > > > criteria you'd like.
> > > > > > > > > > >> > > > > Here's the MET user's guide:
> > > > > > > > > > >> > > > > https://dtcenter.org/met/users/docs/users_guide/MET_Users_Guide_v7.0.pdf
> > > > > > > > > > >> > > > > Look on page 98 for the job command options for the "aggregate_stat"
> > > > > > > > > > >> > > > > job type when the input line type is "MPR".
> > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > For your second question, the "-lookin PATH" option is *VERY* flexible.
> > > > > > > > > > >> > > > > You can set PATH to either a single value or multiple values.  If you
> > > > > > > > > > >> > > > > use wildcards, then the shell expands those wildcards to multiple
> > > > > > > > > > >> > > > > values.  Each value you pass in can either be a filename or a directory
> > > > > > > > > > >> > > > > name.  If you pass in a filename, STAT-Analysis will read it
> > > > > > > > > > >> > > > > *REGARDLESS* of the file extension.  If you pass in a directory name,
> > > > > > > > > > >> > > > > STAT-Analysis will search that directory *RECURSIVELY* for files ending
> > > > > > > > > > >> > > > > in ".stat".  For example, either of the following settings would tell
> > > > > > > > > > >> > > > > STAT-Analysis to read the same list of files:
> > > > > > > > > >> > > > >    -lookin /GFS/data/hourly/*/*.stat
> > > > > > > > > >> > > > >    ... or ...
> > > > > > > > > >> > > > >    -lookin /GFS/data/hourly
> > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > Be aware though that the more data you pass to STAT-Analysis, the
> > > > > > > > > > >> > > > > longer it'll take for it to process it.  You can decide how much data
> > > > > > > > > > >> > > > > you pass it for each job.  I'd suggest starting with what is most
> > > > > > > > > > >> > > > > convenient for you.  If it's too slow, change the logic to pass it less
> > > > > > > > > > >> > > > > data (e.g. only 1 day of data rather than 1 month of data).
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > Yes, you can give it a date range.  Use -fcst_init_beg and
> > > > > > > > > > >> > > > > -fcst_init_end to specify beginning/ending model initialization times
> > > > > > > > > > >> > > > > or -fcst_valid_beg and -fcst_valid_end to specify beginning/ending
> > > > > > > > > > >> > > > > valid times.
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > If you find that you're running multiple jobs on the same subset of
> > > > > > > > > > >> > > > > data (e.g. process MPR to CNT, MPR to SL1L2, MPR to CTC, MPR to CTS),
> > > > > > > > > > >> > > > > it'd be more efficient to group those jobs into a config file.  That'll
> > > > > > > > > > >> > > > > do the filtering ONCE and write the filtered data to a temp file.  Then
> > > > > > > > > > >> > > > > all the jobs read data from the temp file instead of starting over from
> > > > > > > > > > >> > > > > scratch.
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > > > Make sense?
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > > > John
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > On Mon, Apr 23, 2018 at 1:01 PM, Rosalyn MacCracken - NOAA Affiliate via
> > > > > > > > > > >> > > > > RT <met_help at ucar.edu> wrote:
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > Hi John,
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > That's actually only partially correct.  It's not that I want to use
> > > > > > > > > > >> > > > > > part of the MPR lines and discard the rest, and I do need to regenerate
> > > > > > > > > > >> > > > > > statistics.  Let me try to re-explain.
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > Back in early March we switched from getting our ASCAT obs from the
> > > > > > > > > > >> > > > > > prepbufr data, to getting it from the MGDRLITE data.  So, processing
> > > > > > > > > > >> > > > > > didn't change.  I was producing statistics at certain threshold levels
> > > > > > > > > > >> > > > > > for both GFS and ASCAT.  I had this set with the cat_thresh list, at
> > > > > > > > > > >> > > > > > levels of 0,6,17, etc.  We found out after processing for a couple of
> > > > > > > > > > >> > > > > > weeks that the ASCAT data included these really small values, <1.0 m/s,
> > > > > > > > > > >> > > > > > and that these small wind speeds were being included into the
> > > > > > > > > > >> > > > > > statistics processing.
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > So, a couple of questions.
> > > > > > > > > > >> > > > > > 1) Do I have to regenerate all of my statistics (*.cts, *.cnt and *ctc
> > > > > > > > > > >> > > > > > files) because of this error?  Or, since I have threshold levels set,
> > > > > > > > > > >> > > > > > will those small values be among the statistics in the lowest
> > > > > > > > > > >> > > > > > thresholds?
> > > > > > > > > > >> > > > > > 2) I have the *.stat files, but, they are spread out into separate
> > > > > > > > > > >> > > > > > directories like:
> > > > > > > > > > >> > > > > > /GFS/data/hourly/${YYYYMMDDHH}/*.stat
> > > > > > > > > > >> > > > > > Can I tell stat-analysis to "lookin" directories with a wildcard (like
> > > > > > > > > > >> > > > > > 201803*)?  If so, how?  Or, if I tell it to look in /GFS/data/hourly,
> > > > > > > > > > >> > > > > > will it look in all the directories recursively under hourly?  And, if
> > > > > > > > > > >> > > > > > that's the case, can I give it a date range, so, that it only processes
> > > > > > > > > > >> > > > > > data from March?
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > Roz
> > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > On Mon, Apr 23, 2018 at 2:18 PM, John Halley Gotway via RT <
> > > > > > > > > > >> > > > > > met_help at ucar.edu> wrote:
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > > Hi Roz,
> > > > > > > > > > >> > > > > > >
> > > > > > > > > > >> > > > > > > I read that you've run Point-Stat and saved off the matched pairs (MPR)
> > > > > > > > > > >> > > > > > > output line type.  And you'd like to (1) filter those MPR lines to discard
> > > > > > > > > > >> > > > > > > some of them and then (2) use the filtered data to regenerate summary
> > > > > > > > > > >> > > > > > > statistics.  Yes, this is easily done using the STAT-Analysis tool in MET.
> > > > > > > > > > >> > > > > > >
> > > > > > > > > > >> > > > > > > You wrote that you're verifying wind speeds against ASCAT and that you'd
> > > > > > > > > > >> > > > > > > like to exclude pairs where the observed wind speed is less than 1 m/s.
> > > > > > > > > > >> > > > > > > I'm just guessing here, but I'll presume that you want to produce both
> > > > > > > > > > >> > > > > > > SL1L2 and CNT output line types.  Here's what the STAT-Analysis job would
> > > > > > > > > > >> > > > > > > look like:
> > > > > > > > > > >> > > > > > >
> > > > > > > > > > >> > > > > > > # Filter MPR's and write SL1L2 output line
> > > > > > > > > > >> > > > > > > stat_analysis \
> > > > > > > > > > >> > > > > > >    -lookin input.stat \         # List a .stat filename or directory containing them
> > > > > > > > > > >> > > > > > >    -job aggregate_stat \        # Job type is aggregate_stat
> > > > > > > > > > >> > > > > > >    -line_type MPR \             # Input line type = MPR
> > > > > > > > > > >> > > > > > >    -out_line_type SL1L2 \       # Output line type = SL1L2 partial sums
> > > > > > > > > > >> > > > > > >    -fcst_var WIND \             # Only process lines where FCST_VAR column = WIND
> > > > > > > > > > >> > > > > > >    -column_thresh OBS gt1 \     # Only use MPR lines where OBS column > 1
> > > > > > > > > > >> > > > > > >    -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \   # Run this same job for each unique combination of these columns
> > > > > > > > > > >> > > > > > >    -out_stat MPR_to_SL1L2.stat
> > > > > > > > > > >> > > > > > >
> > > > > > > > > > >> > > > > > > This will produce an output .stat file containing an SL1L2 line for
> > > > > > > > > > >> > > > > > > each unique combination of the header columns listed after the "-by"
> > > > > > > > > > >> > > > > > > option.  To generate CNT output lines instead, you'd run a second job
> > > > > > > > > > >> > > > > > > where you replace SL1L2 with CNT.  You could run these jobs on the command
> > > > > > > > > > >> > > > > > > line or group them together into a STAT-Analysis config file, if you
> > > > > > > > > > >> > > > > > > prefer.  Both would work.
> > > > > > > > > > >> > > > > > >
> > > > > > > > > > >> > > > > > > You could run this once for each input .stat file you're processing... or
> > > > > > > > > > >> > > > > > > you could pass many input .stat files to the job.  Since FCST_INIT_BEG and
> > > > > > > > > > >> > > > > > > FCST_LEAD are included in the "-by" option, you'll get separate output
> > > > > > > > > > >> > > > > > > lines for each unique time.
> > > > > > > > > >> > > > > > >
> > > > > > > > > >> > > > > > > Hope that helps get you going.
> > > > > > > > > >> > > > > > >
> > > > > > > > > >> > > > > > > Thanks,
> > > > > > > > > >> > > > > > > John

------------------------------------------------
Subject: question on regenerating data
From: Rosalyn MacCracken - NOAA Affiliate
Time: Mon May 07 11:41:31 2018

Hi John,

Where would I set the environment variables?  Those are set within the
config file, correct?  Can you somehow pass them into the config file?  I
didn't think that the config file was that dynamic.

So, I was thinking about appending the files by another shell script using
"cat".  But, I think I'm actually leaning towards copying hourly files to a
temp directory, processing, then removing the file, and copying the next
hour, and so on.  I don't know if that's a silly idea or not...

Roz
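
A minimal sketch of how the environment variables from John's reply could be set outside the config file: a driver script defines them before each stat_analysis call, and the jobs in the config file reference them as ${CUR_VALID_BEG} and ${CUR_VALID_END}.  The OUT_STAT variable, the agg_*.stat naming, and the one-directory-per-hour layout are assumptions made for illustration and are not part of the original exchange.  Each hour gets its own output name because STAT-Analysis replaces an existing output file rather than appending to it.

```python
import os
import subprocess
from datetime import datetime, timedelta

FMT = "%Y%m%d_%H%M%S"  # timestamp layout used in the -set_hdr examples


def window_env(valid, half_width_min=30):
    """Environment variables for one valid hour (+/- half_width_min)."""
    half = timedelta(minutes=half_width_min)
    return {
        "CUR_VALID_BEG": (valid - half).strftime(FMT),
        "CUR_VALID_END": (valid + half).strftime(FMT),
        # Hypothetical variable: a unique output name per hour, to be
        # referenced in the config file as ${OUT_STAT}.
        "OUT_STAT": f"agg_{valid:%Y%m%d%H}.stat",
    }


def build_runs(start, hours, lookin_root="/GFS/data/hourly",
               config="STATAnalysisConfig"):
    """Return one (command, extra_env) pair per hourly directory."""
    runs = []
    for h in range(hours):
        valid = start + timedelta(hours=h)
        cmd = ["stat_analysis",
               "-lookin", f"{lookin_root}/{valid:%Y%m%d%H}",
               "-config", config]
        runs.append((cmd, window_env(valid)))
    return runs


def execute(runs, dry_run=True):
    """Print each run, or call stat_analysis when dry_run is False."""
    for cmd, extra in runs:
        if dry_run:
            settings = " ".join(f"{k}={v}" for k, v in extra.items())
            print(settings, " ".join(cmd))
        else:
            subprocess.run(cmd, env={**os.environ, **extra}, check=True)


if __name__ == "__main__":
    execute(build_runs(datetime(2018, 3, 7, 0), hours=2))
```

With dry_run=False the sketch would invoke stat_analysis once per hour, so no files need to be copied to and removed from a temp directory between runs.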

On Mon, May 7, 2018 at 1:06 PM, John Halley Gotway via RT
<met_help at ucar.edu> wrote:

> Roz,
>
> Yes, the "-set_hdr" option is specific to each job.  If your jobs are
> defined in the config file, then yes, you'd need to specify it there.
> Indeed, getting the timestamps consistent really is just cosmetic.  If
> you're looping over many times, I'd suggest using an environment variable:
>   -set_hdr OBS_VALID_BEG ${CUR_VALID_BEG} -set_hdr OBS_VALID_END ${CUR_VALID_END}
>
> Using environment variables in MET configuration files makes scripting
> much more convenient.
>
> However, STAT-Analysis doesn't have the ability to append to an output
> file.  If you write to the same output file name, it'll *clobber* that
> file (i.e. replace it).
>
> Hope that helps.
>
> Thanks,
> John
>
>
> On Mon, May 7, 2018 at 10:38 AM, Rosalyn MacCracken - NOAA Affiliate via RT
> <met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> >
> > Hi John,
> >
> > So, it sounds like I'm ok either way with the timestamp.  If I don't use
> > -set_hdr, it sets the correct beginning and end time according to the mpr
> > file, or, I can use -set_hdr for consistency with the other files, but,
> > that's more "cosmetic".
> >
> > Oh, but, that -set_hdr command option is within the config file, correct?
> > So, you really couldn't loop through that and pass a time variable into the
> > command.  So, it may just be easiest to leave it out.
> >
> > So, since my processing would take 20 minutes for regenerating one day's
> > worth of data, I was thinking, I would do all my processing for the North
> > Atlantic first, so, I can look at how we did with those Nor'easters in
> > March.  So, that's processing the 00z, 01z, 11z and 12z time periods first,
> > since that is when ASCAT passes over the North Atlantic.  So, I would copy
> > those time periods into a temp directory and use the -lookin command to
> > process those 4 time periods for my entire period (maybe 1 week at a
> > time).
> >
> > So, this will produce 1 file with 00z, 01z, 11z and 12z data, for each
> > week, correct?  And, the only way to get individual files is to copy the
> > data, one hour at a time, process, delete the file, lather, rinse and
> > repeat.  That might be hard to do.  I may have to think about that...
> >
> > So, if it's one file, with all the data for the week, at selected hours,
> > what happens when I have time to run the rest of the data?  I just write
> > that to a different file, and then, maybe append that to the end of the
> > first file?  Or, just leave it separate?
> >
> > I guess I just have to think about what I'm going to do next with these
> > files, and the easiest way to do that.
> >
> > Roz
> >
> > On Mon, May 7, 2018 at 11:48 AM, John Halley Gotway via RT <
> > met_help at ucar.edu> wrote:
> >
> > > Roz,
> > >
> > > I understand that you're suspicious about the beginning and ending time
> > > stamps in the OBS_VALID_BEG and _END columns.  You're comparing the
> > > original output from Point-Stat to the output that you're getting from
> > > STAT-Analysis.  However, those timestamps can be different without there
> > > actually being a problem.  Here's why...
> > >
> > > When you run Point-Stat, the obs_window setting in the config file defines
> > > the matching time window.  If your forecast is valid at time T, the
> > > matching time window is defined as T+obs_window.beg to T+obs_window.end.
> > > The point observations may actually fall anywhere in that time window...
> > > but it's that time window that's reported in the summary line types (like
> > > CTC, CTS, SL1L2, and CNT).  Since the MPR line type is specific to each
> > > observation value, the *actual* timestamp of that observation is reported
> > > in that line.
> > >
> > > When you run STAT-Analysis to process those MPR lines, it reads the
> > > OBS_VALID_BEG and OBS_VALID_END columns.  And it keeps track of the minimum
> > > OBS_VALID_BEG timestamp and the maximum OBS_VALID_END timestamp.  When it
> > > writes output CTC, CTS, SL1L2, or CNT lines it reports the minimum/maximum
> > > timestamp values it found in the data.
> > >
> > > So Point-Stat reports the *REQUESTED TIME WINDOW* in the OBS_VALID_BEG and
> > > OBS_VALID_END columns... while STAT-Analysis reports the *ACTUAL TIME
> > > WINDOW*.  And in general, those won't be the same.  So this isn't
> > > necessarily a problem.
> > >
> > > If for consistency, you'd like to explicitly set the OBS_VALID_BEG and
> > > OBS_VALID_END timestamps in the output, you can use the "-set_hdr" job
> > > command option to do so:
> > >    -set_hdr OBS_VALID_BEG 20180307_003000 -set_hdr OBS_VALID_END 20180307_013000
> > >
> > > Thanks,
> > > John
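
The difference between the requested and the actual time window can be sketched in a few lines of Python.  This only illustrates the bookkeeping described in the message above and is not MET code.  The obs_window values of -1800 and 1800 seconds correspond to the +/- 30 minute window mentioned in this thread, the first and last observation times are the 01z MPR values quoted in it, and the middle observation time is made up.

```python
from datetime import datetime, timedelta

FMT = "%Y%m%d_%H%M%S"


def requested_window(valid, beg_sec, end_sec):
    """Window reported by Point-Stat: T+obs_window.beg to T+obs_window.end."""
    return valid + timedelta(seconds=beg_sec), valid + timedelta(seconds=end_sec)


def actual_window(obs_times):
    """Window reported after aggregating MPR lines: min/max observation time."""
    return min(obs_times), max(obs_times)


valid = datetime(2018, 3, 7, 1)  # 01z forecast valid time
req_beg, req_end = requested_window(valid, -1800, 1800)

# Observation times that fall inside the requested window.
obs_times = [datetime.strptime(s, FMT)
             for s in ("20180307_003000", "20180307_005512", "20180307_012100")]
act_beg, act_end = actual_window(obs_times)

print("requested:", req_beg.strftime(FMT), "to", req_end.strftime(FMT))
print("actual:   ", act_beg.strftime(FMT), "to", act_end.strftime(FMT))
```

The requested window ends at 20180307_013000 while the last observation is at 20180307_012100, which is the kind of mismatch seen between the original CTS file and the regenerated one.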
> > >
> > > On Sun, May 6, 2018 at 2:12 PM, Rosalyn MacCracken - NOAA Affiliate via RT
> > > <met_help at ucar.edu> wrote:
> > >
> > > >
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > >
> > > > Hi John,
> > > >
> > > > Sorry it took me so long to get back to you.  My step-daughter came in to
> > > > town, and I thought that I could get some work done while she was here,
> > > > but, didn't.  Then, I totally forgot to email you back.  Sorry for leaving
> > > > you hanging!
> > > >
> > > > Anyway, I was able to play around with the STATAnalysis config file you
> > > > sent me.  I tried it out with only 1 hour timestep, instead of all the
> > > > files for one day.  I wanted to see what kind of time it would take to
> > > > process this on my machine.  So, it was quick, 45 seconds.  But, of course
> > > > your run took 18 minutes.  The script was probably reading 20 some files.
> > > > That makes sense.
> > > >
> > > > So, then, I looked at the output, and it wasn't quite what I expected, and
> > > > doesn't quite match the stats from the other processing.  This is what I
> > > > did:
> > > >
> > > > 1)  I copied the 00z only *20180307*.stat file to a temp directory.  Before
> > > > I did this, I looked at the matching *.mpr file, and saw that the
> > > > OBS_VALID_BEG was 20180307_000000 and the OBS_VALID_END was
> > > > 20180307_002700.
> > > > 2)  Ran the run_sa.sh script and generated the CTS, CTC and CNT files.
> > > > 3)  I looked at the new agg_cts file, and the OBS_VALID_BEG and _END
> > > > matched the *.mpr file in step 1.
> > > > 4)  I looked at the original CTS file, and the OBS_VALID_BEG was
> > > > 20180307_223000 and the OBS_VALID_END was 20180307_013000.  So, that was
> > > > our original way of processing.  I bet if I looked at a more recent file,
> > > > it would be more like OBS_VALID_BEG was 20180307_233000 and the
> > > > OBS_VALID_END was 20180307_003000.
> > > > 5)  I looked at the original *mpr for 01z, and the OBS_VALID_BEG was
> > > > 20180307_003000 and the OBS_VALID_END was 20180307_012100
> > > >
> > > > So, this tells me that I'm not matching observation times, and I'm not sure
> > > > how to fix it to match things up.  First, we use a +/- 30 min window for
> > > > ASCAT obs, centered on the hour.  For example, if we are processing the 00z
> > > > hour, we will match observations from 233000 from the day before to 003000
> > > > the current day.  Actually, we used to do an hour window on either side,
> > > > but, we have more observations now at each hour.  (See the explanation in
> > > > #4 above)
> > > >
> > > > Anyway, how do I create the CTS, CTC and CNT files for the +/- 30 min
> > > > window?  Is there a way to dynamically indicate this 30min window, so that
> > > > I don't have to go into the config file every time I run STATanalysis and
> > > > change it?
> > > >
> > > > Roz
> > > >
> > > > On Thu, Apr 26, 2018 at 4:14 PM, John Halley Gotway via RT <
> > > > met_help at ucar.edu> wrote:
> > > >
> > > > > Roz,
> > > > >
> > > > > The CSI statistic is computed from a 2x2 contingency table.  A 2x2
> > > > > contingency table is defined by a single threshold.  Looking in the .stat
> > > > > files you sent, I see that you've applied many thresholds to generate many
> > > > > 2x2 contingency tables and corresponding statistics.  Yes, it is true that
> > > > > for most of those thresholds, the "bad" observation values will fall into
> > > > > the "non-event" category.  But those non-event counts are included in the
> > > > > computation of some stats, including CSI.  So even though the bad
> > > > > observations aren't very interesting, they really are impacting the
> > > > > statistics.
> > > > >
> > > > > John
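
A small worked example, using made-up matched pairs, of why a near-zero observation still changes CSI at a higher threshold: when the forecast meets the threshold and the observation does not, the pair is counted as a false alarm, and false alarms sit in the CSI denominator, CSI = hits / (hits + misses + false alarms).  The pair values and the >=6 threshold are illustrative assumptions only.

```python
def contingency_counts(pairs, thresh):
    """Count hits, misses, false alarms and correct negatives for >= thresh."""
    hits = misses = false_alarms = correct_negatives = 0
    for fcst, obs in pairs:
        fcst_event = fcst >= thresh
        obs_event = obs >= thresh
        if fcst_event and obs_event:
            hits += 1
        elif obs_event:
            misses += 1
        elif fcst_event:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def csi(pairs, thresh):
    """Critical Success Index; correct negatives do not enter the formula."""
    hits, misses, false_alarms, _ = contingency_counts(pairs, thresh)
    denom = hits + misses + false_alarms
    return hits / denom if denom else float("nan")


# (forecast, observation) wind speeds in m/s; all values are made up.
good_pairs = [(10.0, 9.0), (7.0, 8.0), (4.0, 7.0), (3.0, 2.0)]
bad_pair = (10.0, 0.01)  # spurious near-zero observation

csi_without = csi(good_pairs, 6.0)
csi_with = csi(good_pairs + [bad_pair], 6.0)
print(f"CSI without bad pair: {csi_without:.3f}")
print(f"CSI with bad pair:    {csi_with:.3f}")
```

In this toy case the single bad pair lowers CSI from 0.667 to 0.500 at the >=6 threshold, even though the observation itself is a non-event there.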
> > > > >
> > > > > On Wed, Apr 25, 2018 at 10:08 AM, Rosalyn MacCracken - NOAA Affiliate via
> > > > > RT <met_help at ucar.edu> wrote:
> > > > >
> > > > > >
> > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > >
> > > > > > Figures.  I just calculated how long it will take me to regenerate data
> > > > > > for 03072018 - 04122018.  It will take me 912 hours.  ;-(
> > > > > >
> > > > > > Ok, I know I asked this, but, if I had an OBS value of 0.01 and a matched
> > > > > > GFS point of 10 m/s, and I had a low threshold of 0-5 m/s, 6-10 m/s and
> > > > > > 10-15 m/s, and say, CSI was calculated.  Which threshold would be used
> > > > > > for the output, the 0-5 or 6-10?  And, would the 10-15 threshold even be
> > > > > > affected?
> > > > > >
> > > > > > Roz
> > > > > >
> > > > > > On Wed, Apr 25, 2018 at 11:40 AM, John Halley Gotway via RT <
> > > > > > met_help at ucar.edu> wrote:
> > > > > >
> > > > > > > Roz,
> > > > > > >
> > > > > > > I think it'd take just as long.  The slow part is reading the data...
> > > > > > > not applying a threshold.
> > > > > > >
> > > > > > > John
> > > > > > >
> > > > > > > On Wed, Apr 25, 2018 at 9:18 AM, Rosalyn MacCracken - NOAA Affiliate via RT
> > > > > > > <met_help at ucar.edu> wrote:
> > > > > > >
> > > > > > > >
> > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > >
> > > > > > > > Hi John,
> > > > > > > >
> > > > > > > > Thanks for doing that for me.  I'll take a look at the info you sent me
> > > > > > > > this afternoon.  I'm in the middle of doing something right now...trying to
> > > > > > > > make a different program work.  ;-/
> > > > > > > >
> > > > > > > > I wonder if it will be quicker than 18 minutes for some of the thresholds
> > > > > > > > that have higher wind speeds, and not as many instances (or 0 instances).
> > > > > > > > Or, will it take just as long, since it still needs to read through the
> > > > > > > > entire *.stat file anyway?
> > > > > > > >
> > > > > > > > Roz
> > > > > > > >
> > > > > > > > On Tue, Apr 24, 2018 at 7:06 PM, John Halley Gotway via RT <
> > > > > > > > met_help at ucar.edu> wrote:
> > > > > > > >
> > > > > > > > > Hi Roz,
> > > > > > > > >
> > > > > > > > > Thanks for sending the sample data.  I grabbed it and used it to run some
> > > > > > > > > sample jobs:
> > > > > > > > >
> > > > > > > > > time /d1/johnhg/MET/MET_releases/met-6.0/bin/stat_analysis \
> > > > > > > > > -lookin /d1/johnhg/MET/MET_Help/maccracken_data_20180424/opc_test/home/opc_test/data/met_verif/GFS/data/hourly \
> > > > > > > > > -config STATAnalysisConfig \
> > > > > > > > > -log run_sa.log -v 3
> > > > > > > > >
> > > > > > > > > I used the "-lookin" option to point to all the data you sent.
> > > > > > > > >
> > > > > > > > > I've attached the...
> > > > > > > > > (1) config file I used
> > > > > > > > > (2) log file that was generated
> > > > > > > > > (3) output .stat files
> > > > > > > > >
> > > > > > > > > Looking at the jobs, you'll see that I've included 5 of them...
> > > > > > > > > - Generate CNT output
> > > > > > > > > - Generate CTC >= 0.0 output
> > > > > > > > > - Generate CTS >= 0.0 output
> > > > > > > > > - Generate CTC >= 5.5689 output
> > > > > > > > > - Generate CTS >= 5.5689 output
> > > > > > > > >
> > > > > > > > > Unfortunately, you'll need to define separate jobs for each threshold you'd
> > > > > > > > > like to use.  Although, you shouldn't use >=0.0 since that's always true.
> > > > > > > > >
> > > > > > > > > Also unfortunately, this is pretty slow.  On my machine, it took like 18
> > > > > > > > > minutes for these 5 jobs!
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > John
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Tue, Apr 24, 2018 at 2:09 PM, Rosalyn MacCracken - NOAA Affiliate via RT
> > > > > > > > > <met_help at ucar.edu> wrote:
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > >
> > > > > > > > > > Hi John,
> > > > > > > > > >
> > > > > > > > > > I put my file on the ftp site.  Let me know what you find.  You'll see
> > > > > > > > > > those really low OBS values (0.01, 0.02, and so on).
> > > > > > > > > >
> > > > > > > > > > Thanks!
> > > > > > > > > >
> > > > > > > > > > Roz
> > > > > > > > > >
> > > > > > > > > > On Tue, Apr 24, 2018 at 2:53 PM, Rosalyn MacCracken - NOAA Affiliate <
> > > > > > > > > > rosalyn.maccracken at noaa.gov> wrote:
> > > > > > > > > >
> > > > > > > > > > > Ok, I'll get that over to the ftp site.  I have to make sure that I find
> > > > > > > > > > > a day that has all the data in it.  Sometimes the data isn't available
> > > > > > > > > > > when the script runs.  A little annoying, but, that's operations...
> > > > > > > > > > >
> > > > > > > > > > > I'll let you know when I get the file to the ftp site.
> > > > > > > > > > >
> > > > > > > > > > > Thanks!
> > > > > > > > > > >
> > > > > > > > > > > Roz
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Apr 24, 2018 at 2:49 PM, John Halley
Gotway via
> > RT
> > > <
> > > > > > > > > > > met_help at ucar.edu> wrote:
> > > > > > > > > > >
> > > > > > > > > > >> Roz,
> > > > > > > > > > >>
> > > > > > > > > > >> Yes, we do.  Follow the instructions here:
> > > > > > > > > > >>    https://dtcenter.org/met/
> > > users/support/met_help.php#ftp
> > > > > > > > > > >>
> > > > > > > > > > >> I'd suggest making a tar file for one day and
posting
> > them
> > > > to
> > > > > > the
> > > > > > > > ftp
> > > > > > > > > > >> site:
> > > > > > > > > > >>    tar -cvzf sample.tar.gz
/GFS/data/hourly/20180305*
> > > > > > > > > > >>
> > > > > > > > > > >> Thanks,
> > > > > > > > > > >> John
> > > > > > > > > > >>
> > > > > > > > > > >> On Tue, Apr 24, 2018 at 11:57 AM, Rosalyn
MacCracken -
> > > NOAA
> > > > > > > > Affiliate
> > > > > > > > > > via
> > > > > > > > > > >> RT <met_help at ucar.edu> wrote:
> > > > > > > > > > >>
> > > > > > > > > > >> >
> > > > > > > > > > >> > <URL: https://rt.rap.ucar.edu/rt/
> > > > > Ticket/Display.html?id=84822
> > > > > > >
> > > > > > > > > > >> >
> > > > > > > > > > >> > HI John,
> > > > > > > > > > >> >
> > > > > > > > > > >> > Yes, it does seem that the -config option is
the way
> > to
> > > go
> > > > > to
> > > > > > > > > recreate
> > > > > > > > > > >> > those 3 files. I'll be sure to have a unique
file
> > name,
> > > > or,
> > > > > mv
> > > > > > > the
> > > > > > > > > > >> output
> > > > > > > > > > >> > file to a different name before running the
command
> > > again.
> > > > > > > Thanks
> > > > > > > > > for
> > > > > > > > > > >> > pointing that out.
> > > > > > > > > > >> >
> > > > > > > > > > >> > I'm teleworking for the next couple of weeks,
so, I can't
> > > download
> > > > > and
> > > > > > > > send
> > > > > > > > > > you
> > > > > > > > > > >> > *.stat files like I can when I'm at my
computer at
> > work.
> > > > I
> > > > > > > don't
> > > > > > > > > have
> > > > > > > > > > >> > access to theia or wcoss anymore.  You have
an ftp
> > > server
> > > > > > that I
> > > > > > > > can
> > > > > > > > > > >> upload
> > > > > > > > > > >> > data to, right?  If not, I can try and fiddle
around
> > > with
> > > > > this
> > > > > > > > > > tomorrow
> > > > > > > > > > >> and
> > > > > > > > > > >> > see if I can't get this to work the way I
want to.
> > > > > > > > > > >> >
> > > > > > > > > > >> > Roz
> > > > > > > > > > >> >
> > > > > > > > > > >> > On Tue, Apr 24, 2018 at 11:42 AM, John Halley
Gotway
> > via
> > > > RT
> > > > > <
> > > > > > > > > > >> > met_help at ucar.edu> wrote:
> > > > > > > > > > >> >
> > > > > > > > > > >> > > Roz,
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Each "-job aggregate_stat" only generates a
single
> > > > output
> > > > > > line
> > > > > > > > > type.
> > > > > > > > > > >> So
> > > > > > > > > > >> > > using "-out_line_type CTC,CTS,CNT" will not
work.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > You'll need to run separate jobs for each
output
> > line
> > > > type
> > > > > > you
> > > > > > > > > want
> > > > > > > > > > to
> > > > > > > > > > >> > > generate.  That's why I'd recommend
grouping those
> > > > > multiple
> > > > > > > jobs
> > > > > > > > > > >> together
> > > > > > > > > > >> > > into a single STAT-Analysis config file.
Then
> you'd
> > > > call
> > > > > > > > > > >> STAT-Analysis
> > > > > > > > > > >> > > once using the "-config" command line
option.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Another issue is that if you set
"-out_stat" to
> the
> > > same
> > > > > > > > filename,
> > > > > > > > > > >> it'll
> > > > > > > > > > >> > > get overwritten by each job.  STAT-Analysis
will
> > > > overwrite
> > > > > > that
> > > > > > > > > > output
> > > > > > > > > > >> > file
> > > > > > > > > > >> > > rather than appending to it.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > You could send me a day's worth of .stat
output
> > files
> > > > > > > > > > >> > > (/GFS/data/hourly/20180305*) and I could
send you
> > some
> > > > > > > > > suggestions.
> > > > > > > > > > >> Or
> > > > > > > > > > >> > if
> > > > > > > > > > >> > > you have access to theia you could copy
them up
> > there
> > > > and
> > > > > > > point
> > > > > > > > me
> > > > > > > > > > to
> > > > > > > > > > >> it.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Thanks,
> > > > > > > > > > >> > > John
> > > > > > > > > > >> > >
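
The two points above (one output line type per job, and a distinct "-out_stat" file per job) can be sketched as a small shell script. This is only an illustration, not the config-file approach recommended in the message: it prints the commands instead of running them, the function names and output file naming are hypothetical, and it assumes the CTC and CTS jobs get their categorical threshold from -out_fcst_thresh and -out_obs_thresh, with ge5.5689 used purely as an example value.

```shell
#!/usr/bin/env bash
# Sketch only: prints (does not run) one stat_analysis command per output
# line type, each with its own -out_stat file so that no job overwrites
# another job's output.
# Hypothetical: the function names and the output file naming scheme.
# Assumption: CTC and CTS jobs take their categorical threshold from
# -out_fcst_thresh and -out_obs_thresh; ge5.5689 is only an example value.

BY_COLUMNS="MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS"

build_job() {
  init="$1"        # YYYYMMDDHH directory name, e.g. 2018031100
  line_type="$2"   # exactly one output line type: CTC, CTS, or CNT
  extra=""
  case "${line_type}" in
    CTC|CTS) extra=" -out_fcst_thresh ge5.5689 -out_obs_thresh ge5.5689" ;;
  esac
  echo "stat_analysis -lookin /GFS/data/hourly/${init}" \
       "-job aggregate_stat -line_type MPR -out_line_type ${line_type}" \
       "-fcst_var WIND -column_thresh OBS gt1${extra}" \
       "-by ${BY_COLUMNS}" \
       "-out_stat /new_rerun_stat_files/MPR_to_${line_type}_${init}.stat"
}

print_all_jobs() {
  # Two days, four cycles per day, three line types per cycle.
  for day in 11 12; do
    for cycle in 00 06 12 18; do
      for line_type in CTC CTS CNT; do
        build_job "201803${day}${cycle}" "${line_type}"
      done
    done
  done
}

print_all_jobs
```

Reviewing the printed commands before running them (for example by piping the output to a file and executing it with bash) is a cheap way to check the directory names and output paths first.
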
> > > > > > > > > > >> > > On Tue, Apr 24, 2018 at 7:48 AM, Rosalyn
> MacCracken
> > -
> > > > NOAA
> > > > > > > > > Affiliate
> > > > > > > > > > >> via
> > > > > > > > > > >> > RT
> > > > > > > > > > >> > > <met_help at ucar.edu> wrote:
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > <URL: https://rt.rap.ucar.edu/rt/
> > > > > > > Ticket/Display.html?id=84822
> > > > > > > > >
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > Hi John,
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > Yes, that makes sense.  Those very small
values
> > > (<1.0
> > > > > > m/s),
> > > > > > > > are
> > > > > > > > > > bad
> > > > > > > > > > >> > > > values.  That's why they shouldn't be
included
> in
> > > the
> > > > > > > > > processing.
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > So, I need to just regenerate hourly
data, one
> > hour
> > > > at a
> > > > > > > time.
> > > > > > > > > > >> Would
> > > > > > > > > > >> > it
> > > > > > > > > > >> > > > make sense to use a shell script and loop
> > > > stat-analysis?
> > > > > > > > > > Something
> > > > > > > > > > >> > like:
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > for day in 11 12
> > > > > > > > > > >> > > > do
> > > > > > > > > > >> > > >   for cycle in 00 06 12 18
> > > > > > > > > > >> > > >   do
> > > > > > > > > > >> > > >     stat_analysis -lookin /GFS/data/hourly/201803${day}${cycle}/*.stat \
> > > > > > > > > > >> > > >       -job aggregate_stat \
> > > > > > > > > > >> > > >       -line_type MPR \
> > > > > > > > > > >> > > >       -out_line_type CTC,CTS,CNT \
> > > > > > > > > > >> > > >       -fcst_var WIND \
> > > > > > > > > > >> > > >       -column_thresh OBS gt1 \
> > > > > > > > > > >> > > >       -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > > > > > > > >> > > >       -out_stat /new_rerun_stat_files/MPR_to_CTC_CTS_CNT.stat
> > > > > > > > > > >> > > >   done
> > > > > > > > > > >> > > > done
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > or, something like that?  And, will this
> > regenerate
> > > > hour
> > > > > > > > > > forecasts,
> > > > > > > > > > >> at
> > > > > > > > > > >> > > each
> > > > > > > > > > >> > > > forecast and lead hour?  I guess it will
see the
> > > > > forecast
> > > > > > > and
> > > > > > > > > lead
> > > > > > > > > > >> hour
> > > > > > > > > > >> > > > from the *.stat file, and whatever *stat
file is
> > in
> > > > the
> > > > > > > > > directory,
> > > > > > > > > > >> it
> > > > > > > > > > >> > > will
> > > > > > > > > > >> > > > regenerate those hours, right?
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > So, I need to regenerate the CTC, CNT and
CTS
> > files.
> > > > > > That's
> > > > > > > > > why I
> > > > > > > > > > >> did:
> > > > > > > > > > >> > > >  -out_line_type CTC,CTS,CNT
> > > > > > > > > > >> > > > but, will that make 3 separate files, or
just
> > > another
> > > > > > *.stat
> > > > > > > > > file?
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > Roz
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > On Mon, Apr 23, 2018 at 4:01 PM, John
Halley
> > Gotway
> > > > via
> > > > > > RT <
> > > > > > > > > > >> > > > met_help at ucar.edu> wrote:
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > > Roz,
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > It is ultimately up to you to decide
which
> > matched
> > > > > pairs
> > > > > > > you
> > > > > > > > > > want
> > > > > > > > > > >> to
> > > > > > > > > > >> > > > > include in your processing.  Do you
consider
> > those
> > > > > small
> > > > > > > > (<1.0
> > > > > > > > > > >> m/s)
> > > > > > > > > > >> > > > > observation values to be corrupt and
incorrect
> > in
> > > > some
> > > > > > way
> > > > > > > > or
> > > > > > > > > > just
> > > > > > > > > > >> > not
> > > > > > > > > > >> > > > very
> > > > > > > > > > >> > > > > interesting?  If they really are BAD
data
> > values,
> > > I
> > > > > > agree
> > > > > > > > that
> > > > > > > > > > you
> > > > > > > > > > >> > > should
> > > > > > > > > > >> > > > > exclude them from your analysis.  But
if
> they're
> > > > just
> > > > > > > > > > >> uninteresting
> > > > > > > > > > >> > > > values
> > > > > > > > > > >> > > > > of low wind speed, then there's no
reason why
> > you
> > > > > should
> > > > > > > > > exclude
> > > > > > > > > > >> > them.
> > > > > > > > > > >> > > > For
> > > > > > > > > > >> > > > > example, *most* of the time it isn't
raining,
> > but
> > > we
> > > > > > often
> > > > > > > > > > >> include
> > > > > > > > > > >> > > > > observations of 0 precip.
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > There are three configurable options in
> > Point-Stat
> > > > > that
> > > > > > > may
> > > > > > > > be
> > > > > > > > > > >> useful
> > > > > > > > > > >> > > > here:
> > > > > > > > > > >> > > > > (1) You already know and use the
"cat_thresh"
> > > > option.
> > > > > > > This
> > > > > > > > > > >> threshold
> > > > > > > > > > >> > > > > defines the events and non-events for a
2x2
> > > > > contingency
> > > > > > > > table.
> > > > > > > > > > >> This
> > > > > > > > > > >> > > > > threshold affects the contents of FHO,
CTC,
> CTS,
> > > > MCTC,
> > > > > > and
> > > > > > > > > MCTS
> > > > > > > > > > >> line
> > > > > > > > > > >> > > > types
> > > > > > > > > > >> > > > > that Point-Stat writes.
> > > > > > > > > > >> > > > > (2) The "cnt_thresh" option is a more
recent
> > > > addition.
> > > > > > > > > Perhaps
> > > > > > > > > > >> this
> > > > > > > > > > >> > > was
> > > > > > > > > > >> > > > a
> > > > > > > > > > >> > > > > poor name choice, but instead of
defining
> > > > categories,
> > > > > > it's
> > > > > > > > > > really
> > > > > > > > > > >> a
> > > > > > > > > > >> > > > > *filtering* threshold.  This threshold
affects
> > the
> > > > > > > contents
> > > > > > > > of
> > > > > > > > > > the
> > > > > > > > > > >> > > SL1L2,
> > > > > > > > > > >> > > > > SAL1L2, and CNT line types that
Point-Stat
> > writes.
> > > > > For
> > > > > > > > > example,
> > > > > > > > > > >> > > setting
> > > > > > > > > > >> > > > > "cnt_thresh = [ ge6, ge17 ];" will
produce 2
> CNT
> > > > and 2
> > > > > > > SL1L2
> > > > > > > > > > >> output
> > > > > > > > > > >> > > lines
> > > > > > > > > > >> > > > > containing only those points where the
wind
> > speed
> > > > was
> > > > > > >=6
> > > > > > > > and
> > > > > > > > > > >> >=17,
> > > > > > > > > > >> > > > > respectively.
> > > > > > > > > > >> > > > > (3) The "wind_thresh" option is very
similar
> to
> > > the
> > > > > > > > > "cnt_thresh"
> > > > > > > > > > >> > option
> > > > > > > > > > >> > > > but
> > > > > > > > > > >> > > > > affects the contents of the VL1L2,
VAL1L2, and
> > > VCNT
> > > > > (new
> > > > > > > in
> > > > > > > > > > >> met-7.0)
> > > > > > > > > > >> > > line
> > > > > > > > > > >> > > > > types.  Only those U/V pairs that meet
the
> > > specified
> > > > > > wind
> > > > > > > > > speed
> > > > > > > > > > >> > > threshold
> > > > > > > > > > >> > > > > are included in the output.
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > For both "cnt_thresh" and
"wind_thresh", the
> > > default
> > > > > > value
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > >> > > config
> > > > > > > > > > >> > > > > file is "NA", meaning, do not apply any
> > filtering
> > > > > > > threshold
> > > > > > > > > > >> criteria.
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > You have the flexibility to run
STAT-Analysis
> on
> > > the
> > > > > MPR
> > > > > > > > > output
> > > > > > > > > > >> lines
> > > > > > > > > > >> > > to
> > > > > > > > > > >> > > > > recompute any of these output line
types
> > applying
> > > > > > whatever
> > > > > > > > > > >> filtering
> > > > > > > > > > >> > > > > criteria you'd like.
> > > > > > > > > > >> > > > > Here's the MET user's guide:
> > > > > > > > > > >> > > > > https://dtcenter.org/met/
> > > > users/docs/users_guide/MET_
> > > > > > > > > > >> > > Users_Guide_v7.0.pdf
> > > > > > > > > > >> > > > > Look on page 98 for the job command
options
> for
> > > the
> > > > > > > > > > >> "aggregate_stat"
> > > > > > > > > > >> > > job
> > > > > > > > > > >> > > > > type when the input line type is "MPR".
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > For your second question, the "-lookin
PATH"
> > > option
> > > > is
> > > > > > > > *VERY*
> > > > > > > > > > >> > flexible.
> > > > > > > > > > >> > > > > You can set PATH to either a single
value or
> > > > multiple
> > > > > > > > values.
> > > > > > > > > > If
> > > > > > > > > > >> you
> > > > > > > > > > >> > > use
> > > > > > > > > > >> > > > > wildcards, then the shell expands those
> > wildcards
> > > to
> > > > > > > > multiple
> > > > > > > > > > >> values.
> > > > > > > > > > >> > > > Each
> > > > > > > > > > >> > > > > value you pass in can either be a
filename or
> a
> > > > > > directory
> > > > > > > > > name.
> > > > > > > > > > >> If
> > > > > > > > > > >> > you
> > > > > > > > > > >> > > > > pass in a filename, STAT-Analysis will
read it
> > > > > > > *REGARDLESS*
> > > > > > > > of
> > > > > > > > > > the
> > > > > > > > > > >> > file
> > > > > > > > > > >> > > > > extension.  If you pass in a directory
name,
> > > > > > STAT-Analysis
> > > > > > > > > will
> > > > > > > > > > >> > search
> > > > > > > > > > >> > > > that
> > > > > > > > > > >> > > > > directory *RECURSIVELY* for files
ending in
> > > ".stat".
> > > > > > For
> > > > > > > > > > example,
> > > > > > > > > > >> > > either
> > > > > > > > > > >> > > > > of the following settings would tell
> > STAT-Analysis
> > > > to
> > > > > > read
> > > > > > > > the
> > > > > > > > > > >> same
> > > > > > > > > > >> > > list
> > > > > > > > > > >> > > > of
> > > > > > > > > > >> > > > > files:
> > > > > > > > > > >> > > > >    -lookin /GFS/data/hourly/*/*.stat
> > > > > > > > > > >> > > > >    ... or ...
> > > > > > > > > > >> > > > >    -lookin /GFS/data/hourly
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > Be aware though that the more data you
pass to
> > > > > > > > STAT-Analysis,
> > > > > > > > > > the
> > > > > > > > > > >> > > longer
> > > > > > > > > > >> > > > > it'll take for it to process it.  You
can
> decide
> > > how
> > > > > > much
> > > > > > > > data
> > > > > > > > > > you
> > > > > > > > > > >> > pass
> > > > > > > > > > >> > > > it
> > > > > > > > > > >> > > > > for each job.  I'd suggest starting
with what
> is
> > > > most
> > > > > > > > > convenient
> > > > > > > > > > >> for
> > > > > > > > > > >> > > you.
> > > > > > > > > > >> > > > > If it's too slow, change the logic to
pass it
> > less
> > > > > data
> > > > > > > > (e.g.
> > > > > > > > > > >> only 1
> > > > > > > > > > >> > > day
> > > > > > > > > > >> > > > of
> > > > > > > > > > >> > > > > data rather than 1 month of data).
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > Yes, you can give it a date range.  Use
> > > > -fcst_init_beg
> > > > > > and
> > > > > > > > > > >> > > -fcst_init_end
> > > > > > > > > > >> > > > > to specify beginning/ending model
> initialization
> > > > times
> > > > > > or
> > > > > > > > > > >> > > -fcst_valid_beg
> > > > > > > > > > >> > > > > and -fcst_valid_end to specify
> beginning/ending
> > > > valid
> > > > > > > times.
> > > > > > > > > > >> > > > >
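
As a rough illustration of the date-range options just described, the sketch below prints a single job limited to a range of model initialization times. The function name, the example dates, and the output file name are hypothetical; the times are written in the YYYYMMDD_HHMMSS form that MET uses for time strings, and the command is printed rather than executed.

```shell
#!/usr/bin/env bash
# Sketch only: prints (does not run) a stat_analysis command restricted to a
# range of model initialization times via -fcst_init_beg / -fcst_init_end.
# Hypothetical: the function name, example dates, and output file name.

BY_COLUMNS="MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS"

build_range_job() {
  init_beg="$1"   # first initialization time to include (YYYYMMDD_HHMMSS)
  init_end="$2"   # last initialization time to include (YYYYMMDD_HHMMSS)
  # ${var%%_*} strips the _HHMMSS part so the file name carries dates only.
  echo "stat_analysis -lookin /GFS/data/hourly" \
       "-job aggregate_stat -line_type MPR -out_line_type CNT" \
       "-fcst_var WIND -column_thresh OBS gt1" \
       "-fcst_init_beg ${init_beg} -fcst_init_end ${init_end}" \
       "-by ${BY_COLUMNS}" \
       "-out_stat MPR_to_CNT_${init_beg%%_*}_${init_end%%_*}.stat"
}

build_range_job 20180307_000000 20180331_180000
```

Because "-lookin" is pointed at the top-level directory here, STAT-Analysis still has to read every .stat file beneath it before the time filter applies, so narrowing the "-lookin" path as well should shorten the run.
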
> > > > > > > > > > >> > > > > If you find that you're running
multiple jobs
> on
> > > the
> > > > > > same
> > > > > > > > > subset
> > > > > > > > > > >> of
> > > > > > > > > > >> > > data
> > > > > > > > > > >> > > > > (e.g. process MPR to CNT, MPR to SL1L2,
MPR to
> > > CTC,
> > > > > MPR
> > > > > > to
> > > > > > > > > CTS),
> > > > > > > > > > >> it'd
> > > > > > > > > > >> > > be
> > > > > > > > > > >> > > > > more efficient to group those jobs into
a
> config
> > > > file.
> > > > > > > > > That'll
> > > > > > > > > > do
> > > > > > > > > > >> > the
> > > > > > > > > > >> > > > > filtering ONCE and write the filtered
data to
> a
> > > temp
> > > > > > file.
> > > > > > > > > Then
> > > > > > > > > > >> all
> > > > > > > > > > >> > > the
> > > > > > > > > > >> > > > > jobs read data from the temp file instead of
> starting
> > > > over
> > > > > > from
> > > > > > > > > > >> scratch.
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > Make sense?
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > John
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > On Mon, Apr 23, 2018 at 1:01 PM,
Rosalyn
> > > MacCracken
> > > > -
> > > > > > NOAA
> > > > > > > > > > >> Affiliate
> > > > > > > > > > >> > > via
> > > > > > > > > > >> > > > RT
> > > > > > > > > > >> > > > > <met_help at ucar.edu> wrote:
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > <URL: https://rt.rap.ucar.edu/rt/
> > > > > > > > > Ticket/Display.html?id=84822
> > > > > > > > > > >
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > Hi John,
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > That's actually only partially
correct.
> It's
> > > not
> > > > > > that I
> > > > > > > > > want
> > > > > > > > > > to
> > > > > > > > > > >> > use
> > > > > > > > > > >> > > > part
> > > > > > > > > > >> > > > > > of the MPR lines and discard the
rest, and I
> > do
> > > > need
> > > > > > to
> > > > > > > > > > >> regenerate
> > > > > > > > > > >> > > > > > statistics.  Let me try to
re-explain.
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > Back in early March we switched from
getting
> > our
> > > > > ASCAT
> > > > > > > obs
> > > > > > > > > > from
> > > > > > > > > > >> the
> > > > > > > > > > >> > > > > > prepbufr data, to getting it from the
> MGDRLITE
> > > > data.
> > > > > > So,
> > > > > > > > > > >> processing
> > > > > > > > > > >> > > > > didn't
> > > > > > > > > > >> > > > > > change.  I was producing statistics
at
> certain
> > > > > > threshold
> > > > > > > > > > levels
> > > > > > > > > > >> for
> > > > > > > > > > >> > > > both
> > > > > > > > > > >> > > > > > GFS and ASCAT.  I had this set with
the
> > > cat_thresh
> > > > > > list,
> > > > > > > > at
> > > > > > > > > > >> levels
> > > > > > > > > > >> > of
> > > > > > > > > > >> > > > > > 0,6,17, etc.  We found out after
processing
> > for
> > > a
> > > > > > couple
> > > > > > > > of
> > > > > > > > > > >> weeks
> > > > > > > > > > >> > > that
> > > > > > > > > > >> > > > > the
> > > > > > > > > > >> > > > > > ASCAT data included these really
small
> values,
> > > > <1.0
> > > > > > m/s,
> > > > > > > > and
> > > > > > > > > > >> that
> > > > > > > > > > >> > > these
> > > > > > > > > > >> > > > > > small wind speeds were being included
into
> the
> > > > > > > statistics
> > > > > > > > > > >> > processing.
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > So, a couple of questions.
> > > > > > > > > > >> > > > > > 1) Do I have to regenerate all of my
> > statistics
> > > > > > (*.cts,
> > > > > > > > > *.cnt
> > > > > > > > > > >> and
> > > > > > > > > > >> > > *ctc
> > > > > > > > > > >> > > > > > files) because of this error? Or,
since I
> have
> > > > > > threshold
> > > > > > > > > > levels
> > > > > > > > > > >> > set,
> > > > > > > > > > >> > > > will
> > > > > > > > > > >> > > > > > those small values be among the
statistics
> in
> > > the
> > > > > > > lowest
> > > > > > > > > > >> > thresholds?
> > > > > > > > > > >> > > > > > 2) I have the *.stat files, but, they
are
> > spread
> > > > out
> > > > > > > into
> > > > > > > > > > >> separate
> > > > > > > > > > >> > > > > > directories like:
> > > > > > > > > > >> > > > > > /GFS/data/hourly/${YYYYMMDDHH}/*.stat
> > > > > > > > > > >> > > > > > Can I tell stat-analysis to "lookin"
> > directories
> > > > > with
> > > > > > a
> > > > > > > > > > wildcard
> > > > > > > > > > >> > > (like
> > > > > > > > > > >> > > > > > 201803*)?  If so, how?  Or, if I tell
it to
> > look
> > > > in
> > > > > > > > > > >> > /GFS/data/hourly,
> > > > > > > > > > >> > > > > will
> > > > > > > > > > >> > > > > > it look in all the directories
recursively
> > under
> > > > > > hourly?
> > > > > > > > > And,
> > > > > > > > > > >> if
> > > > > > > > > > >> > > > that's
> > > > > > > > > > >> > > > > > the case, can I give it a date range,
so,
> that
> > > it
> > > > > only
> > > > > > > > > > processes
> > > > > > > > > > >> > data
> > > > > > > > > > >> > > > > from
> > > > > > > > > > >> > > > > > March?
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > Roz
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > On Mon, Apr 23, 2018 at 2:18 PM, John
Halley
> > > > Gotway
> > > > > > via
> > > > > > > > RT <
> > > > > > > > > > >> > > > > > met_help at ucar.edu> wrote:
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > > Hi Roz,
> > > > > > > > > > >> > > > > > >
> > > > > > > > > > >> > > > > > > I read that you've run Point-Stat
and
> saved
> > > off
> > > > > the
> > > > > > > > > matched
> > > > > > > > > > >> pairs
> > > > > > > > > > >> > > > (MPR)
> > > > > > > > > > >> > > > > > > output line type.  And you'd like
to (1)
> > > filter
> > > > > > those
> > > > > > > > MPR
> > > > > > > > > > >> lines
> > > > > > > > > > >> > to
> > > > > > > > > > >> > > > > > discard
> > > > > > > > > > >> > > > > > > some of them and then (2) use the
filtered
> > > data
> > > > to
> > > > > > > > > > regenerate
> > > > > > > > > > >> > > summary
> > > > > > > > > > >> > > > > > > statistics.  Yes, this is easily
done
> using
> > > the
> > > > > > > > > > STAT-Analysis
> > > > > > > > > > >> > tool
> > > > > > > > > > >> > > in
> > > > > > > > > > >> > > > > > MET.
> > > > > > > > > > >> > > > > > >
> > > > > > > > > > >> > > > > > > You wrote that you're verifying
wind
> speeds
> > > > > against
> > > > > > > > ASCAT
> > > > > > > > > > and
> > > > > > > > > > >> > that
> > > > > > > > > > >> > > > > you'd
> > > > > > > > > > >> > > > > > > like to exclude pairs where the
observed
> > wind
> > > > > speed
> > > > > > is
> > > > > > > > > less
> > > > > > > > > > >> than
> > > > > > > > > > >> > 1
> > > > > > > > > > >> > > > m/s.
> > > > > > > > > > >> > > > > > > I'm just guessing here, but I'll
presume
> > that
> > > > you
> > > > > > want
> > > > > > > > to
> > > > > > > > > > >> produce
> > > > > > > > > > >> > > > both
> > > > > > > > > > >> > > > > > > SL1L2 and CNT output line types.
Here's
> > what
> > > > the
> > > > > > > > > > >> STAT-Analysis
> > > > > > > > > > >> > job
> > > > > > > > > > >> > > > > would
> > > > > > > > > > >> > > > > > > look like:
> > > > > > > > > > >> > > > > > >
> > > > > > > > > > >> > > > > > > # Filter MPRs and write SL1L2 output lines.
> > > > > > > > > > >> > > > > > > #   -lookin:        a .stat filename or a directory containing them
> > > > > > > > > > >> > > > > > > #   -job:           job type is aggregate_stat
> > > > > > > > > > >> > > > > > > #   -line_type:     input line type = MPR
> > > > > > > > > > >> > > > > > > #   -out_line_type: output line type = SL1L2 partial sums
> > > > > > > > > > >> > > > > > > #   -fcst_var:      only process lines where the FCST_VAR column = WIND
> > > > > > > > > > >> > > > > > > #   -column_thresh: only use MPR lines where the OBS column > 1
> > > > > > > > > > >> > > > > > > #   -by:            run this same job for each unique combination
> > > > > > > > > > >> > > > > > > #                   of these columns
> > > > > > > > > > >> > > > > > > stat_analysis \
> > > > > > > > > > >> > > > > > >    -lookin input.stat \
> > > > > > > > > > >> > > > > > >    -job aggregate_stat \
> > > > > > > > > > >> > > > > > >    -line_type MPR \
> > > > > > > > > > >> > > > > > >    -out_line_type SL1L2 \
> > > > > > > > > > >> > > > > > >    -fcst_var WIND \
> > > > > > > > > > >> > > > > > >    -column_thresh OBS gt1 \
> > > > > > > > > > >> > > > > > >    -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > > > > > > > >> > > > > > >    -out_stat MPR_to_SL1L2.stat
> > > > > > > > > > >> > > > > > >
> > > > > > > > > > >> > > > > > > This will produce an output
.stat
> file
> > > > > > containing
> > > > > > > > an
> > > > > > > > > > >> SL1L2
> > > > > > > > > > >> > > line
> > > > > > > > > > >> > > > > for
> > > > > > > > > > >> > > > > > > each unique combination of the
header
> > columns
> > > > > listed
> > > > > > > > after
> > > > > > > > > > the
> > > > > > > > > > >> > > "-by"
> > > > > > > > > > >> > > > > > > option.  To generate CNT output
lines
> > instead,
> > > > > you'd
> > > > > > > > run a
> > > > > > > > > > >> second
> > > > > > > > > > >> > > job
> > > > > > > > > > >> > > > > > where
> > > > > > > > > > >> > > > > > > you replace SL1L2 with CNT.  You
could run
> > > these
> > > > > > jobs
> > > > > > > on
> > > > > > > > > the
> > > > > > > > > > >> > > command
> > > > > > > > > > >> > > > > line
> > > > > > > > > > >> > > > > > > or group them together into a
> STAT-Analysis
> > > > config
> > > > > > > file,
> > > > > > > > > if
> > > > > > > > > > >> you
> > > > > > > > > > >> > > > prefer.
> > > > > > > > > > >> > > > > > > Both would work.
> > > > > > > > > > >> > > > > > >
> > > > > > > > > > >> > > > > > > You could run this once for each
input
> .stat
> > > > file
> > > > > > > you're
> > > > > > > > > > >> > > > processing...
> > > > > > > > > > >> > > > > or
> > > > > > > > > > >> > > > > > > you could pass many input .stat
files to
> the
> > > > job.
> > > > > > > Since
> > > > > > > > > > >> > > > FCST_INIT_BEG
> > > > > > > > > > >> > > > > > and
> > > > > > > > > > >> > > > > > > FCST_LEAD are included in the "-by"
> option,
> > > > you'll
> > > > > > get
> > > > > > > > > > >> separate
> > > > > > > > > > >> > > > output
> > > > > > > > > > >> > > > > > > lines for each unique time.
> > > > > > > > > > >> > > > > > >
> > > > > > > > > > >> > > > > > > Hope that helps get you going.
> > > > > > > > > > >> > > > > > >
> > > > > > > > > > >> > > > > > > Thanks,
> > > > > > > > > > >> > > > > > > John
> > > > > > > > > > >> > > > > > >
> > > > > > > > > > >> > > > > > >
> > > > > > > > > > >> > > > > > > On Thu, Apr 19, 2018 at 9:23 AM,
Julie
> > > > Prestopnik
> > > > > > via
> > > > > > > > RT <
> > > > > > > > > > >> > > > > > > met_help at ucar.edu>
> > > > > > > > > > >> > > > > > > wrote:
> > > > > > > > > > >> > > > > > >
> > > > > > > > > > >> > > > > > > >
> > > > > > > > > > >> > > > > > > > <URL:
https://rt.rap.ucar.edu/rt/Tic
> > > > > > > > > > >> ket/Display.html?id=84822
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > > > > > >
> > > > > > > > > > >> > > > > > > > Hi Roz.  My apologies for the
delay in
> > > > > responding.
> > > > > > > > > > >> > > > > > > >
> > > > > > > > > > >> > > > > > > > Unfortunately, John is out of the
office
> > > this
> > > > > > week,
> > > > > > > > and
> > > > > > > > > I
> > > > > > > > > > do
> > > > > > > > > > >> > not
> > > > > > > > > > >> > > > know
> > > > > > > > > > >> > > > > > the
> > > > > > > > > > >> > > > > > > > answers to your questions.  As
you
> said, I
> > > > would
> > > > > > > also
> > > > > > > > > > >> imagine
> > > > > > > > > > >> > > that
> > > > > > > > > > >> > > > > > > > point-stat is using those small
values
> as
> > > > > matched
> > > > > > > > pairs.
> > > > > > > > > > >> > Also, I
> > > > > > > > > > >> > > > do
> > > > > > > > > > >> > > > > > not
> > > > > > > > > > >> > > > > > > > believe there is a way to
regenerate the
> > > > > > point-stat
> > > > > > > > > > >> statistics
> > > > > > > > > > >> > > > > without
> > > > > > > > > > >> > > > > > > > using the original GFS data.  I
cannot
> say
> > > > with
> > > > > > > > > certainty,
> > > > > > > > > > >> > > however.
> > > > > > > > > > >> > > > > > > Thank
> > > > > > > > > > >> > > > > > > > you for your patience in advance.
We'll
> > > get a
> > > > > > > > definite
> > > > > > > > > > >> > response
> > > > > > > > > > >> > > to
> > > > > > > > > > >> > > > > you
> > > > > > > > > > >> > > > > > > as
> > > > > > > > > > >> > > > > > > > soon as we can.
> > > > > > > > > > >> > > > > > > >
> > > > > > > > > > >> > > > > > > > Thanks,
> > > > > > > > > > >> > > > > > > > Julie
> > > > > > > > > > >> > > > > > > >
> > > > > > > > > > >> > > > > > > > On Wed, Apr 18, 2018 at 6:31 AM,
Rosalyn
> > > > > > MacCracken
> > > > > > > -
> > > > > > > > > NOAA
> > > > > > > > > > >> > > > Affiliate
> > > > > > > > > > >> > > > > > via
> > > > > > > > > > >> > > > > > > RT
> > > > > > > > > > >> > > > > > > > <met_help at ucar.edu> wrote:
> > > > > > > > > > >> > > > > > > >
> > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > Wed Apr 18 06:31:39 2018: Request 84822 was acted upon.
> > > > > > > > > > > >> > > > > > > > > Transaction: Ticket created by rosalyn.maccracken at noaa.gov
> > > > > > > > > > > >> > > > > > > > >        Queue: met_help
> > > > > > > > > > > >> > > > > > > > >      Subject: question on regenerating data
> > > > > > > > > > > >> > > > > > > > >        Owner: Nobody
> > > > > > > > > > > >> > > > > > > > >   Requestors: rosalyn.maccracken at noaa.gov
> > > > > > > > > > > >> > > > > > > > >       Status: new
> > > > > > > > > > > >> > > > > > > > >  Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > Hi,
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > I'm running point-stat using ASCAT and GFS data to verify surface wind
> > > > > > > > > > > >> > > > > > > > > speeds.  I found an error in my ASCAT input data that goes back to Mar 7.
> > > > > > > > > > > >> > > > > > > > > I had switched the input source of the data, and within the new data files,
> > > > > > > > > > > >> > > > > > > > > it was allowing very small values (< 1 m/s) to be used as data points in
> > > > > > > > > > > >> > > > > > > > > the verification.  I imagine that this is an issue, since point-stat is
> > > > > > > > > > > >> > > > > > > > > using these very small values as matched pairs with the GFS, correct?
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > Is there a way to regenerate the point-stat statistics without using the
> > > > > > > > > > > >> > > > > > > > > original GFS data?  I do have the *stat and the *mpr files, and it is
> > > > > > > > > > > >> > > > > > > > > pretty easy to identify where the bad values are located.
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > Thanks,
> > > > > > > > > > > >> > > > > > > > > Roz
> > > > > > > > > > > >> > > > > > > > >

--
Rosalyn MacCracken
Support Scientist

Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD  20740-3818

(p) 301-683-1551
rosalyn.maccracken at noaa.gov

------------------------------------------------
Subject: question on regenerating data
From: John Halley Gotway
Time: Mon May 07 12:37:15 2018

Roz,

Yes, you can use environment variables inside MET config files.
Presumably, you're calling stat_analysis from some shell script.  And
that script could include:

# for cshell
   setenv CUR_VALID_BEG 20180307_003000
   setenv CUR_VALID_END 20180307_013000

# or for bash
   export CUR_VALID_BEG="20180307_003000"
   export CUR_VALID_END="20180307_013000"

Then in your STAT-Analysis config file you could use this:
   -set_hdr OBS_VALID_BEG ${CUR_VALID_BEG} -set_hdr OBS_VALID_END ${CUR_VALID_END}
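
A minimal sketch of such a config, written out from a shell script.  This
is an illustration under assumptions, not a tested setup: OUT_DIR, the
output file names, and the name STATAnalysisConfig_sketch are placeholders
invented here, and the "OBS gt1" filter and the 5.5689 threshold are simply
borrowed from earlier in this thread.  Each output line type gets its own
job and its own -out_stat file, since one job writes one line type and a
repeated file name would be overwritten.

```shell
# Write a config fragment whose jobs reference environment variables.
# The quoted 'EOF' stops the shell from expanding ${...} here, so the
# references stay literal in the file for MET to resolve at run time.
# OUT_DIR and the output file names are hypothetical placeholders.
cat > STATAnalysisConfig_sketch <<'EOF'
// One job per output line type, each writing to its own file.
jobs = [
   "-job aggregate_stat -line_type MPR -out_line_type CNT -column_thresh OBS gt1 -set_hdr OBS_VALID_BEG ${CUR_VALID_BEG} -set_hdr OBS_VALID_END ${CUR_VALID_END} -out_stat ${OUT_DIR}/agg_cnt.stat",
   "-job aggregate_stat -line_type MPR -out_line_type CTC -out_fcst_thresh ge5.5689 -out_obs_thresh ge5.5689 -column_thresh OBS gt1 -set_hdr OBS_VALID_BEG ${CUR_VALID_BEG} -set_hdr OBS_VALID_END ${CUR_VALID_END} -out_stat ${OUT_DIR}/agg_ctc_ge5.5689.stat",
   "-job aggregate_stat -line_type MPR -out_line_type CTS -out_fcst_thresh ge5.5689 -out_obs_thresh ge5.5689 -column_thresh OBS gt1 -set_hdr OBS_VALID_BEG ${CUR_VALID_BEG} -set_hdr OBS_VALID_END ${CUR_VALID_END} -out_stat ${OUT_DIR}/agg_cts_ge5.5689.stat"
];
EOF
```

The calling script would then need to export CUR_VALID_BEG, CUR_VALID_END
and OUT_DIR before running stat_analysis with "-config".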

As for how you set up your logic and organize your data, that's totally
up to you.
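
One possible shape for that loop is sketched below.  It assumes GNU date
(for "-d" and "@epoch") and the +/- 30 minute window centered on the hour
described earlier in this thread.  The function name obs_window is made up
for this example, and the stat_analysis call is left as a comment with a
hypothetical TMP_DIR, so the script only prints the window for each hour.

```shell
# Print "BEG END" for a +/- 30 minute window centered on the given hour.
# Usage: obs_window YYYY-MM-DD HH   (assumes GNU date)
obs_window() {
  center=$(date -u -d "$1 $2:00:00" +%s)
  beg=$(date -u -d "@$((center - 1800))" +%Y%m%d_%H%M%S)
  end=$(date -u -d "@$((center + 1800))" +%Y%m%d_%H%M%S)
  echo "$beg $end"
}

# Hours when ASCAT passes over the North Atlantic, per this thread.
for hh in 00 01 11 12; do
  set -- $(obs_window 2018-03-07 "$hh")
  CUR_VALID_BEG=$1
  CUR_VALID_END=$2
  export CUR_VALID_BEG CUR_VALID_END
  echo "hour ${hh}z: ${CUR_VALID_BEG} to ${CUR_VALID_END}"
  # Hypothetical call; TMP_DIR would hold only this hour's .stat files:
  # stat_analysis -lookin "$TMP_DIR" -config STATAnalysisConfig -v 2
done
```

For the 00z hour this gives 20180306_233000 to 20180307_003000, and for 01z
it gives 20180307_003000 to 20180307_013000, which matches the example
values above.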

Thanks,
John

On Mon, May 7, 2018 at 11:41 AM, Rosalyn MacCracken - NOAA Affiliate
via RT
<met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
>
> Hi John,
>
> Where would I set the environment variables?  Those are set within the
> config file, correct?  Can you somehow pass them into the config file?  I
> didn't think that the config file was that dynamic.
>
> So, I was thinking about appending the files by another shell script using
> "cat".  But, I think I'm actually leaning towards copying hourly files to a
> temp directory, processing, then removing the file, and copying the next
> hour, and so on.  I don't know if that's a silly idea or not...
>
> Roz
>
>
>
>
>
> On Mon, May 7, 2018 at 1:06 PM, John Halley Gotway via RT <
> met_help at ucar.edu
> > wrote:
>
> > Roz,
> >
> > Yes, the "-set_hdr" option is specific to each job.  If your jobs are
> > defined in the config file, then yes, you'd need to specify it there.
> > Indeed, getting the timestamps consistent really is just cosmetic.  If
> > you're looping over many times, I'd suggest using an environment variable:
> >   -set_hdr OBS_VALID_BEG ${CUR_VALID_BEG} -set_hdr OBS_VALID_END ${CUR_VALID_END}
> >
> > Using environment variables in MET configuration files makes scripting
> > much more convenient.
> >
> > However, STAT-Analysis doesn't have the ability to append to an output
> > file.  If you write to the same output file name, it'll *clobber* that
> > file (i.e. replace it).
> >
> > Hope that helps.
> >
> > Thanks,
> > John
> >
> >
> > On Mon, May 7, 2018 at 10:38 AM, Rosalyn MacCracken - NOAA
Affiliate via
> RT
> > <met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > >
> > > Hi John,
> > >
> > > So, it sounds like I'm ok either way with the timestamp.  If I don't use
> > > -set_hdr, it sets the correct beginning and end time according to the mpr
> > > file, or, I can use -set_hdr for consistency with the other files, but,
> > > that's more "cosmetic".
> > >
> > > Oh, but, that -set_hdr command option is within the config file, correct?
> > > So, you really couldn't loop through that and pass a time variable into
> > > the command.  So, it may just be easiest to leave it out.
> > >
> > > So, since my processing would take 20 minutes for regenerating one day's
> > > worth of data, I was thinking, I would do all my processing for the North
> > > Atlantic first, so, I can look at how we did with those Nor'easters in
> > > March.  So, that's processing the 00z, 01z, 11z and 12z time periods
> > > first, since that is when ASCAT passes over the North Atlantic.  So, I
> > > would copy those time periods into a temp directory and use the -lookin
> > > command to process those 4 time periods for my entire period (maybe 1
> > > week at a time).
> > >
> > > So, this will produce 1 file with 00z, 01z, 11z and 12z data, for each
> > > week, correct?  And, the only way to get individual files is to copy the
> > > data, one hour at a time, process, delete the file, lather, rinse and
> > > repeat.  That might be hard to do.  I may have to think about that...
> > >
> > > So, if it's one file, with all the data for the week, at selected hours,
> > > what happens when I have time to run the rest of the data?  I just write
> > > that to a different file, and then, maybe append that to the end of the
> > > first file?  Or, just leave it separate?
> > >
> > > I guess I just have to think about what I'm going to do next with these
> > > files, and the easiest way to do that.
> > >
> > > Roz
> > >
> > > On Mon, May 7, 2018 at 11:48 AM, John Halley Gotway via RT <
> > > met_help at ucar.edu> wrote:
> > >
> > > > Roz,
> > > >
> > > > I understand that you're suspicious about the beginning and ending time
> > > > stamps in the OBS_VALID_BEG and _END columns.  You're comparing the
> > > > original output from Point-Stat to the output that you're getting from
> > > > STAT-Analysis.  However, those timestamps can be different without there
> > > > actually being a problem.  Here's why...
> > > >
> > > > When you run Point-Stat, the obs_window setting in the config file
> > > > defines the matching time window.  If your forecast is valid at time T,
> > > > the matching time window is defined as T+obs_window.beg to
> > > > T+obs_window.end.  The point observations may actually fall anywhere in
> > > > that time window... but it's that time window that's reported in the
> > > > summary line types (like CTC, CTS, SL1L2, and CNT).  Since the MPR line
> > > > type is specific to each observation value, the *actual* timestamp of
> > > > that observation is reported in that line.
> > > >
> > > > When you run STAT-Analysis to process those MPR lines, it reads the
> > > > OBS_VALID_BEG and OBS_VALID_END columns.  And it keeps track of the
> > > > minimum OBS_VALID_BEG timestamp and the maximum OBS_VALID_END timestamp.
> > > > When it writes output CTC, CTS, SL1L2, or CNT lines it reports the
> > > > minimum/maximum timestamp values it found in the data.
> > > >
> > > > So Point-Stat reports the *REQUESTED TIME WINDOW* in the OBS_VALID_BEG
> > > > and OBS_VALID_END columns... while STAT-Analysis reports the *ACTUAL TIME
> > > > WINDOW*.  And in general, those won't be the same.  So this isn't
> > > > necessarily a problem.
> > > >
> > > > If for consistency, you'd like to explicitly set the OBS_VALID_BEG and
> > > > OBS_VALID_END timestamps in the output, you can use the "-set_hdr" job
> > > > command option to do so:
> > > >    -set_hdr OBS_VALID_BEG 20180307_003000 -set_hdr OBS_VALID_END 20180307_013000
> > > >
> > > > Thanks,
> > > > John
> > > >
> > > > On Sun, May 6, 2018 at 2:12 PM, Rosalyn MacCracken - NOAA
Affiliate
> via
> > > RT
> > > > <
> > > > met_help at ucar.edu> wrote:
> > > >
> > > > >
> > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > >
> > > > > Hi John,
> > > > >
> > > > > Sorry it took me so long to get back to you.  My step-daughter came in
> > > > > to town, and I thought that I could get some work done while she was
> > > > > here, but, didn't.  Then, I totally forgot to email you back.  Sorry for
> > > > > leaving you hanging!
> > > > >
> > > > > Anyway, I was able to play around with the STATAnalysis config file you
> > > > > sent me.  I tried it out with only 1 hour timestep, instead of all the
> > > > > files for one day.  I wanted to see what kind of time it would take to
> > > > > process this on my machine.  So, it was quick, 45 seconds.  But, of
> > > > > course your run took 18 minutes.  The script was probably reading 20
> > > > > some files.  That makes sense.
> > > > >
> > > > > So, then, I looked at the output, and it wasn't quite what I expected,
> > > > > and doesn't quite match the stats from the other processing.  This is
> > > > > what I did:
> > > > >
> > > > > 1)  I copied the 00z only *20180307*.stat file to a temp directory.
> > > > > Before I did this, I looked at the matching *.mpr file, and saw that the
> > > > > OBS_VALID_BEG was 20180307_000000 and the OBS_VALID_END was
> > > > > 20180307_002700.
> > > > > 2)  Ran the run_sa.sh script and generated the CTS, CTC and CNT files.
> > > > > 3)  I looked at the new agg_cts file, and the OBS_VALID_BEG and _END
> > > > > matched the *.mpr file in step 1.
> > > > > 4)  I looked at the original CTS file, and the OBS_VALID_BEG was
> > > > > 20180307_223000 and the OBS_VALID_END was 20180307_013000.  So, that was
> > > > > our original way of processing.  I bet if I looked at a more recent
> > > > > file, it would be more like OBS_VALID_BEG was 20180307_233000 and the
> > > > > OBS_VALID_END was 20180307_003000.
> > > > > 5)  I looked at the original *mpr for 01z, and the OBS_VALID_BEG was
> > > > > 20180307_003000 and the OBS_VALID_END was 20180307_012100
> > > > >
> > > > > So, this tells me that I'm not matching observation times, and I'm not
> > > > > sure how to fix it to match things up.  First, we use a +/- 30 min
> > > > > window for ASCAT obs, centered on the hour.  For example, if we are
> > > > > processing the 00z hour, we will match observations from 233000 from the
> > > > > day before to 003000 the current day.  Actually, we used to do an hour
> > > > > window on either side, but, we have more observations now at each hour.
> > > > > (See the explanation in #4 above)
> > > > >
> > > > > Anyway, how do I create the CTS, CTC and CNT files for the +/- 30 min
> > > > > window?  Is there a way to dynamically indicate this 30 min window, so
> > > > > that I don't have to go into the config file every time I run
> > > > > STAT-Analysis and change it?
> > > > >
> > > > > Roz
> > > > >
> > > > > On Thu, Apr 26, 2018 at 4:14 PM, John Halley Gotway via RT <
> > > > > met_help at ucar.edu> wrote:
> > > > >
> > > > > > Roz,
> > > > > >
> > > > > > The CSI statistic is computed from a 2x2 contingency table.  A 2x2
> > > > > > contingency table is defined by a single threshold.  Looking in the
> > > > > > .stat files you sent, I see that you've applied many thresholds to
> > > > > > generate many 2x2 contingency tables and corresponding statistics.
> > > > > > Yes, it is true that for most of those thresholds, the "bad"
> > > > > > observation values will fall into the "non-event" category.  But those
> > > > > > non-event counts are included in the computation of some stats,
> > > > > > including CSI.  So even though the bad observations aren't very
> > > > > > interesting, they really are impacting the statistics.
> > > > > >
> > > > > > John
> > > > > >
> > > > > > On Wed, Apr 25, 2018 at 10:08 AM, Rosalyn MacCracken -
NOAA
> > Affiliate
> > > > via
> > > > > > RT <met_help at ucar.edu> wrote:
> > > > > >
> > > > > > >
> > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > >
> > > > > > > Figures.  I just calculated how long it will take me to regenerate
> > > > > > > data for 03072018 - 04122018.  It will take me 912 hours.  ;-(
> > > > > > >
> > > > > > > Ok, I know I asked this, but, if I had an OBS value of 0.01 and a
> > > > > > > matched GFS point of 10 m/s, and I had a low threshold of 0-5 m/s,
> > > > > > > 6-10 m/s and 10-15 m/s, and say, CSI was calculated.  Which threshold
> > > > > > > would be used for the output, the 0-5 or 6-10?  And, would the 10-15
> > > > > > > threshold even be affected?
> > > > > > >
> > > > > > > Roz
> > > > > > >
> > > > > > > On Wed, Apr 25, 2018 at 11:40 AM, John Halley Gotway via
RT <
> > > > > > > met_help at ucar.edu> wrote:
> > > > > > >
> > > > > > > > Roz,
> > > > > > > >
> > > > > > > > I think it'd take just as long.  The slow part is reading the
> > > > > > > > data... not applying a threshold.
> > > > > > > >
> > > > > > > > John
> > > > > > > >
> > > > > > > > On Wed, Apr 25, 2018 at 9:18 AM, Rosalyn MacCracken -
NOAA
> > > > Affiliate
> > > > > > via
> > > > > > > RT
> > > > > > > > <met_help at ucar.edu> wrote:
> > > > > > > >
> > > > > > > > >
> > > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > >
> > > > > > > > > Hi John,
> > > > > > > > >
> > > > > > > > > Thanks for doing that for me.  I'll take a look at the info you
> > > > > > > > > sent me this afternoon.  I'm in the middle of doing something right
> > > > > > > > > now...trying to make a different program work.  ;-/
> > > > > > > > >
> > > > > > > > > I wonder if it will be quicker than 18 minutes for some of the
> > > > > > > > > thresholds that have higher wind speeds, and not as many instances
> > > > > > > > > (or 0 instances).  Or, will it take just as long, since it still
> > > > > > > > > needs to read through the entire *.stat file anyway?
> > > > > > > > >
> > > > > > > > > Roz
> > > > > > > > >
> > > > > > > > > On Tue, Apr 24, 2018 at 7:06 PM, John Halley Gotway
via RT
> <
> > > > > > > > > met_help at ucar.edu> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Roz,
> > > > > > > > > >
> > > > > > > > > > Thanks for sending the sample data.  I grabbed it and used it to
> > > > > > > > > > run some sample jobs:
> > > > > > > > > >
> > > > > > > > > > time /d1/johnhg/MET/MET_releases/met-6.0/bin/stat_analysis \
> > > > > > > > > > -lookin /d1/johnhg/MET/MET_Help/maccracken_data_20180424/opc_test/home/opc_test/data/met_verif/GFS/data/hourly \
> > > > > > > > > > -config STATAnalysisConfig \
> > > > > > > > > > -log run_sa.log -v 3
> > > > > > > > > >
> > > > > > > > > > I used the "-lookin" option to point to all the data you sent.
> > > > > > > > > >
> > > > > > > > > > I've attached the...
> > > > > > > > > > (1) config file I used
> > > > > > > > > > (2) log file that was generated
> > > > > > > > > > (3) output .stat files
> > > > > > > > > >
> > > > > > > > > > Looking at the jobs, you'll see that I've included 5 of them...
> > > > > > > > > > - Generate CNT output
> > > > > > > > > > - Generate CTC >= 0.0 output
> > > > > > > > > > - Generate CTS >= 0.0 output
> > > > > > > > > > - Generate CTC >= 5.5689 output
> > > > > > > > > > - Generate CTS >= 5.5689 output
> > > > > > > > > >
> > > > > > > > > > Unfortunately, you'll need to define separate jobs for each
> > > > > > > > > > threshold you'd like to use.  Although, you shouldn't use >=0.0
> > > > > > > > > > since that's always true.
> > > > > > > > > >
> > > > > > > > > > Also unfortunately, this is pretty slow.  On my machine, it took
> > > > > > > > > > like 18 minutes for these 5 jobs!
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > John
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Tue, Apr 24, 2018 at 2:09 PM, Rosalyn
MacCracken -
> NOAA
> > > > > > Affiliate
> > > > > > > > via
> > > > > > > > > RT
> > > > > > > > > > <met_help at ucar.edu> wrote:
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > > >
> > > > > > > > > > > Hi John,
> > > > > > > > > > >
> > > > > > > > > > > I put my file on the ftp site.  Let me know what you find.
> > > > > > > > > > > You'll see those really low OBS values (0.01, 0.02, and so on).
> > > > > > > > > > >
> > > > > > > > > > > Thanks!
> > > > > > > > > > >
> > > > > > > > > > > Roz
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Apr 24, 2018 at 2:53 PM, Rosalyn
MacCracken -
> > NOAA
> > > > > > > Affiliate
> > > > > > > > <
> > > > > > > > > > > rosalyn.maccracken at noaa.gov> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Ok, I'll get that over to the ftp site.  I have to make sure
> > > > > > > > > > > > that I find a day that has all the data in it.  Sometimes the
> > > > > > > > > > > > data isn't available when the script runs.  A little annoying,
> > > > > > > > > > > > but, that's operations...
> > > > > > > > > > > >
> > > > > > > > > > > > I'll let you know when I get the file to the ftp site.
> site.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks!
> > > > > > > > > > > >
> > > > > > > > > > > > Roz
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Apr 24, 2018 at 2:49 PM, John Halley
Gotway
> via
> > > RT
> > > > <
> > > > > > > > > > > > met_help at ucar.edu> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > >> Roz,
> > > > > > > > > > > >>
> > > > > > > > > > > > >> Yes, we do.  Follow the instructions here:
> > > > > > > > > > > > >>    https://dtcenter.org/met/users/support/met_help.php#ftp
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> I'd suggest making a tar file for one day and posting it to
> > > > > > > > > > > > >> the ftp site:
> > > > > > > > > > > > >>    tar -cvzf sample.tar.gz /GFS/data/hourly/20180305*
> > > > > > > > > > > >>
> > > > > > > > > > > >> Thanks,
> > > > > > > > > > > >> John
> > > > > > > > > > > >>
> > > > > > > > > > > >> On Tue, Apr 24, 2018 at 11:57 AM, Rosalyn
> MacCracken -
> > > > NOAA
> > > > > > > > > Affiliate
> > > > > > > > > > > via
> > > > > > > > > > > >> RT <met_help at ucar.edu> wrote:
> > > > > > > > > > > >>
> > > > > > > > > > > >> >
> > > > > > > > > > > > >> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > > > >> >
> > > > > > > > > > > > >> > Hi John,
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > Yes, it does seem that the -config option is the way to go to
> > > > > > > > > > > > >> > recreate those 3 files.  I'll be sure to have a unique file
> > > > > > > > > > > > >> > name, or, mv the output file to a different name before
> > > > > > > > > > > > >> > running the command again.  Thanks for pointing that out.
> > > > > > > > > > > >> >
> > > > > > > > > > > > >> > I'm teleworking for the next couple of weeks, so, I can't
> > > > > > > > > > > > >> > download and send you *.stat files like I can when I'm at my
> > > > > > > > > > > > >> > computer at work.  I don't have access to theia or wcoss
> > > > > > > > > > > > >> > anymore.  You have an ftp server that I can upload data to,
> > > > > > > > > > > > >> > right?  If not, I can try and fiddle around with this
> > > > > > > > > > > > >> > tomorrow and see if I can't get this to work the way I want
> > > > > > > > > > > > >> > to.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > Roz
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > On Tue, Apr 24, 2018 at 11:42 AM, John
Halley
> Gotway
> > > via
> > > > > RT
> > > > > > <
> > > > > > > > > > > >> > met_help at ucar.edu> wrote:
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > > Roz,
> > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > Each "-job aggregate_stat" only generates a single output
> > > > > > > > > > > > >> > > line type.  So using "-out_line_type CTC,CTS,CNT" will not
> > > > > > > > > > > > >> > > work.
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > You'll need to run separate jobs for each output line type
> > > > > > > > > > > > >> > > you want to generate.  That's why I'd recommend grouping
> > > > > > > > > > > > >> > > those multiple jobs together into a single STAT-Analysis
> > > > > > > > > > > > >> > > config file.  Then you'd call STAT-Analysis once using the
> > > > > > > > > > > > >> > > "-config" command line option.
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > Another issue is that if you set "-out_stat" to the same
> > > > > > > > > > > > >> > > filename, it'll get overwritten by each job.  STAT-Analysis
> > > > > > > > > > > > >> > > will overwrite that output file rather than appending to it.
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > You could send me a day's worth of .stat output files
> > > > > > > > > > > > >> > > (/GFS/data/hourly/20180305*) and I could send you some
> > > > > > > > > > > > >> > > suggestions.  Or if you have access to theia you could copy
> > > > > > > > > > > > >> > > them up there and point me to it.
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > Thanks,
> > > > > > > > > > > >> > > John
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > On Tue, Apr 24, 2018 at 7:48 AM, Rosalyn
> > MacCracken
> > > -
> > > > > NOAA
> > > > > > > > > > Affiliate
> > > > > > > > > > > >> via
> > > > > > > > > > > >> > RT
> > > > > > > > > > > >> > > <met_help at ucar.edu> wrote:
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > Hi John,
> > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > Yes, that makes sense.  Those very small values (<1.0 m/s),
> > > > > > > > > > > > >> > > > are bad values.  That's why they shouldn't be included in
> > > > > > > > > > > > >> > > > the processing.
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > So, I need to just regenerate hourly data, one hour at a
> > > > > > > > > > > > >> > > > time.  Would it make sense to use a shell script and loop
> > > > > > > > > > > > >> > > > stat-analysis?  Something like:
> > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > for day in 11 12
> > > > > > > > > > > > >> > > > do
> > > > > > > > > > > > >> > > >   for cycle in 00 06 12 18
> > > > > > > > > > > > >> > > >   do
> > > > > > > > > > > > >> > > >     stat_analysis -lookin /GFS/data/hourly/201803${day}${cycle}/*.stat \
> > > > > > > > > > > > >> > > >       -job aggregate_stat \
> > > > > > > > > > > > >> > > >       -line_type MPR \
> > > > > > > > > > > > >> > > >       -out_line_type CTC,CTS,CNT \
> > > > > > > > > > > > >> > > >       -fcst_var WIND \
> > > > > > > > > > > > >> > > >       -column_thresh OBS gt1 \
> > > > > > > > > > > > >> > > >       -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > > > > > > > > > >> > > >       -out_stat /new_rerun_stat_files/MPR_to_CTC_CTS_CNT.stat
> > > > > > > > > > > > >> > > >   done
> > > > > > > > > > > > >> > > > done
> > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > or, something like that?  And, will this regenerate hour
> > > > > > > > > > > > >> > > > forecasts, at each forecast and lead hour?  I guess it will
> > > > > > > > > > > > >> > > > see the forecast and lead hour from the *.stat file, and
> > > > > > > > > > > > >> > > > whatever *stat file is in the directory, it will regenerate
> > > > > > > > > > > > >> > > > those hours, right?
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > So, I need to regenerate the CTC, CNT and CTS files.  That's
> > > > > > > > > > > > >> > > > why I did:
> > > > > > > > > > > > >> > > >  -out_line_type CTC,CTS,CNT
> > > > > > > > > > > > >> > > > but, will that make 3 separate files, or just another *.stat
> > > > > > > > > > > > >> > > > file?
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > Roz
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > On Mon, Apr 23, 2018 at 4:01 PM, John Halley Gotway via RT
> > > > > > > > > > > > >> > > > <met_help at ucar.edu> wrote:
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > > Roz,
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > It is ultimately up to you to decide which matched pairs you want to
> > > > > > > > > > > > >> > > > > include in your processing.  Do you consider those small (<1.0 m/s)
> > > > > > > > > > > > >> > > > > observation values to be corrupt and incorrect in some way or just not
> > > > > > > > > > > > >> > > > > very interesting?  If they really are BAD data values, I agree that you
> > > > > > > > > > > > >> > > > > should exclude them from your analysis.  But if they're just
> > > > > > > > > > > > >> > > > > uninteresting values of low wind speed, then there's no reason why you
> > > > > > > > > > > > >> > > > > should exclude them.  For example, *most* of the time it isn't raining,
> > > > > > > > > > > > >> > > > > but we often include observations of 0 precip.
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > There are three configurable options in Point-Stat that may be useful
> > > > > > > > > > > > >> > > > > here:
> > > > > > > > > > > > >> > > > > (1) You already know and use the "cat_thresh" option.  This threshold
> > > > > > > > > > > > >> > > > > defines the events and non-events for a 2x2 contingency table.  This
> > > > > > > > > > > > >> > > > > threshold affects the contents of FHO, CTC, CTS, MCTC, and MCTS line
> > > > > > > > > > > > >> > > > > types that Point-Stat writes.
> > > > > > > > > > > > >> > > > > (2) The "cnt_thresh" option is a more recent addition.  Perhaps this was
> > > > > > > > > > > > >> > > > > a poor name choice, but instead of defining categories, it's really a
> > > > > > > > > > > > >> > > > > *filtering* threshold.  This threshold affects the contents of the
> > > > > > > > > > > > >> > > > > SL1L2, SAL1L2, and CNT line types that Point-Stat writes.  For example,
> > > > > > > > > > > > >> > > > > setting "cnt_thresh = [ ge6, ge17 ];" will produce 2 CNT and 2 SL1L2
> > > > > > > > > > > > >> > > > > output lines containing only those points where the wind speed was >=6
> > > > > > > > > > > > >> > > > > and >=17, respectively.
> > > > > > > > > > > > >> > > > > (3) The "wind_thresh" option is very similar to the "cnt_thresh" option
> > > > > > > > > > > > >> > > > > but affects the contents of the VL1L2, VAL1L2, and VCNT (new in met-7.0)
> > > > > > > > > > > > >> > > > > line types.  Only those U/V pairs that meet the specified wind speed
> > > > > > > > > > > > >> > > > > threshold are included in the output.
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > For both "cnt_thresh" and "wind_thresh", the default value in the config
> > > > > > > > > > > > >> > > > > file is "NA", meaning, do not apply any filtering threshold criteria.
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > You have the flexibility to run STAT-Analysis on the MPR output lines to
> > > > > > > > > > > > >> > > > > recompute any of these output line types applying whatever filtering
> > > > > > > > > > > > >> > > > > criteria you'd like.
> > > > > > > > > > > > >> > > > > Here's the MET user's guide:
> > > > > > > > > > > > >> > > > > https://dtcenter.org/met/users/docs/users_guide/MET_Users_Guide_v7.0.pdf
> > > > > > > > > > > > >> > > > > Look on page 98 for the job command options for the "aggregate_stat"
> > > > > > > > > > > > >> > > > > job type when the input line type is "MPR".
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > For your second question, the "-lookin PATH" option is *VERY* flexible.
> > > > > > > > > > > > >> > > > > You can set PATH to either a single value or multiple values.  If you
> > > > > > > > > > > > >> > > > > use wildcards, then the shell expands those wildcards to multiple
> > > > > > > > > > > > >> > > > > values.  Each value you pass in can either be a filename or a directory
> > > > > > > > > > > > >> > > > > name.  If you pass in a filename, STAT-Analysis will read it
> > > > > > > > > > > > >> > > > > *REGARDLESS* of the file extension.  If you pass in a directory name,
> > > > > > > > > > > > >> > > > > STAT-Analysis will search that directory *RECURSIVELY* for files ending
> > > > > > > > > > > > >> > > > > in ".stat".  For example, either of the following settings would tell
> > > > > > > > > > > > >> > > > > STAT-Analysis to read the same list of files:
> > > > > > > > > > > >> > > > >    -lookin /GFS/data/hourly/*/*.stat
> > > > > > > > > > > >> > > > >    ... or ...
> > > > > > > > > > > >> > > > >    -lookin /GFS/data/hourly
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > Be aware though that the more data you pass to STAT-Analysis, the longer
> > > > > > > > > > > > >> > > > > it'll take for it to process it.  You can decide how much data you pass
> > > > > > > > > > > > >> > > > > it for each job.  I'd suggest starting with what is most convenient for
> > > > > > > > > > > > >> > > > > you.  If it's too slow, change the logic to pass it less data (e.g. only
> > > > > > > > > > > > >> > > > > 1 day of data rather than 1 month of data).
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > Yes, you can give it a date range.  Use -fcst_init_beg and
> > > > > > > > > > > > >> > > > > -fcst_init_end to specify beginning/ending model initialization times or
> > > > > > > > > > > > >> > > > > -fcst_valid_beg and -fcst_valid_end to specify beginning/ending valid
> > > > > > > > > > > > >> > > > > times.
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > If you find that you're running multiple jobs on the same subset of data
> > > > > > > > > > > > >> > > > > (e.g. process MPR to CNT, MPR to SL1L2, MPR to CTC, MPR to CTS), it'd be
> > > > > > > > > > > > >> > > > > more efficient to group those jobs into a config file.  That'll do the
> > > > > > > > > > > > >> > > > > filtering ONCE and write the filtered data to a temp file.  Then all the
> > > > > > > > > > > > >> > > > > jobs read data from the temp file instead of starting over from scratch.
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > > Make sense?
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > > John
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > On Mon, Apr 23, 2018 at 1:01 PM, Rosalyn MacCracken - NOAA Affiliate via RT
> > > > > > > > > > > > >> > > > > <met_help at ucar.edu> wrote:
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > Hi John,
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > That's actually only partially correct.  It's not that I want to use
> > > > > > > > > > > > >> > > > > > part of the MPR lines and discard the rest, and I do need to regenerate
> > > > > > > > > > > > >> > > > > > statistics.  Let me try to re-explain.
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > Back in early March we switched from getting our ASCAT obs from the
> > > > > > > > > > > > >> > > > > > prepbufr data, to getting it from the MGDRLITE data.  So, processing
> > > > > > > > > > > > >> > > > > > didn't change.  I was producing statistics at certain threshold levels
> > > > > > > > > > > > >> > > > > > for both GFS and ASCAT.  I had this set with the cat_thresh list, at
> > > > > > > > > > > > >> > > > > > levels of 0,6,17, etc.  We found out after processing for a couple of
> > > > > > > > > > > > >> > > > > > weeks that the ASCAT data included these really small values, <1.0 m/s,
> > > > > > > > > > > > >> > > > > > and that these small wind speeds were being included into the
> > > > > > > > > > > > >> > > > > > statistics processing.
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > So, a couple of questions.
> > > > > > > > > > > > >> > > > > > 1) Do I have to regenerate all of my statistics (*.cts, *.cnt and *ctc
> > > > > > > > > > > > >> > > > > > files) because of this error? Or, since I have threshold levels set, will
> > > > > > > > > > > > >> > > > > > those small values be among the statistics in the lowest thresholds?
> > > > > > > > > > > > >> > > > > > 2) I have the *.stat files, but, they are spread out into separate
> > > > > > > > > > > > >> > > > > > directories like:
> > > > > > > > > > > > >> > > > > > /GFS/data/hourly/${YYYYMMDDHH}/*.stat
> > > > > > > > > > > > >> > > > > > Can I tell stat-analysis to "lookin" directories with a wildcard (like
> > > > > > > > > > > > >> > > > > > 201803*)?  If so, how?  Or, if I tell it to look in /GFS/data/hourly,
> > > > > > > > > > > > >> > > > > > will it look in all the directories recursively under hourly?  And, if
> > > > > > > > > > > > >> > > > > > that's the case, can I give it a date range, so, that it only processes
> > > > > > > > > > > > >> > > > > > data from March?
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > Roz
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > On Mon, Apr 23, 2018 at 2:18 PM, John Halley Gotway via RT
> > > > > > > > > > > > >> > > > > > <met_help at ucar.edu> wrote:
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > > Hi Roz,
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > I read that you've run Point-Stat and saved off the matched pairs (MPR)
> > > > > > > > > > > > >> > > > > > > output line type.  And you'd like to (1) filter those MPR lines to
> > > > > > > > > > > > >> > > > > > > discard some of them and then (2) use the filtered data to regenerate
> > > > > > > > > > > > >> > > > > > > summary statistics.  Yes, this is easily done using the STAT-Analysis
> > > > > > > > > > > > >> > > > > > > tool in MET.
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > You wrote that you're verifying wind speeds against ASCAT and that you'd
> > > > > > > > > > > > >> > > > > > > like to exclude pairs where the observed wind speed is less than 1 m/s.
> > > > > > > > > > > > >> > > > > > > I'm just guessing here, but I'll presume that you want to produce both
> > > > > > > > > > > > >> > > > > > > SL1L2 and CNT output line types.  Here's what the STAT-Analysis job
> > > > > > > > > > > > >> > > > > > > would look like:
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > # Filter MPR's and write SL1L2 output line
> > > > > > > > > > > > >> > > > > > > #   -lookin         List a .stat filename or directory containing them
> > > > > > > > > > > > >> > > > > > > #   -job            Job type is aggregate_stat
> > > > > > > > > > > > >> > > > > > > #   -line_type      Input line type = MPR
> > > > > > > > > > > > >> > > > > > > #   -out_line_type  Output line type = SL1L2 partial sums
> > > > > > > > > > > > >> > > > > > > #   -fcst_var       Only process lines where FCST_VAR column = WIND
> > > > > > > > > > > > >> > > > > > > #   -column_thresh  Only use MPR lines where OBS column > 1
> > > > > > > > > > > > >> > > > > > > #   -by             Run this same job for each unique combination of these columns
> > > > > > > > > > > > >> > > > > > > stat_analysis \
> > > > > > > > > > > > >> > > > > > >    -lookin input.stat \
> > > > > > > > > > > > >> > > > > > >    -job aggregate_stat \
> > > > > > > > > > > > >> > > > > > >    -line_type MPR \
> > > > > > > > > > > > >> > > > > > >    -out_line_type SL1L2 \
> > > > > > > > > > > > >> > > > > > >    -fcst_var WIND \
> > > > > > > > > > > > >> > > > > > >    -column_thresh OBS gt1 \
> > > > > > > > > > > > >> > > > > > >    -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > > > > > > > > > >> > > > > > >    -out_stat MPR_to_SL1L2.stat
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > This will produce an output .stat file containing an SL1L2 line for each
> > > > > > > > > > > > >> > > > > > > unique combination of the header columns listed after the "-by" option.
> > > > > > > > > > > > >> > > > > > > To generate CNT output lines instead, you'd run a second job where you
> > > > > > > > > > > > >> > > > > > > replace SL1L2 with CNT.  You could run these jobs on the command line or
> > > > > > > > > > > > >> > > > > > > group them together into a STAT-Analysis config file, if you prefer.
> > > > > > > > > > > > >> > > > > > > Both would work.
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > You could run this once for each input .stat file you're processing...
> > > > > > > > > > > > >> > > > > > > or you could pass many input .stat files to the job.  Since
> > > > > > > > > > > > >> > > > > > > FCST_INIT_BEG and FCST_LEAD are included in the "-by" option, you'll get
> > > > > > > > > > > > >> > > > > > > separate output lines for each unique time.
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > >> > > > > > > Hope that helps get you going.
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > >> > > > > > > Thanks,
> > > > > > > > > > > >> > > > > > > John
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > >> > > > > > > > > --
> > > > > > > > > > > >> > > > > > > > > Rosalyn MacCracken
> > > > > > > > > > > >> > > > > > > > > Support Scientist
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > Ocean Applications Branch
> > > > > > > > > > > >> > > > > > > > > NOAA/NWS Ocean Prediction
Center
> > > > > > > > > > > >> > > > > > > > > NCWCP
> > > > > > > > > > > >> > > > > > > > > 5830 University Research Ct
> > > > > > > > > > > >> > > > > > > > > College Park, MD  20740-3818
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > (p) 301-683-1551
> > > > > > > > > > > >> > > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > --
> > > > > > > > > > > >> > > > > > Rosalyn MacCracken
> > > > > > > > > > > >> > > > > > Support Scientist
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > Ocean Applications Branch
> > > > > > > > > > > >> > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > > > > >> > > > > > NCWCP
> > > > > > > > > > > >> > > > > > 5830 University Research Ct
> > > > > > > > > > > >> > > > > > College Park, MD  20740-3818
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > (p) 301-683-1551
> > > > > > > > > > > >> > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > --
> > > > > > > > > > > >> > > > Rosalyn MacCracken
> > > > > > > > > > > >> > > > Support Scientist
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > Ocean Applications Branch
> > > > > > > > > > > >> > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > > > > >> > > > NCWCP
> > > > > > > > > > > >> > > > 5830 University Research Ct
> > > > > > > > > > > >> > > > College Park, MD  20740-3818
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > (p) 301-683-1551
> > > > > > > > > > > >> > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> >
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > --
> > > > > > > > > > > >> > Rosalyn MacCracken
> > > > > > > > > > > >> > Support Scientist
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > Ocean Applications Branch
> > > > > > > > > > > >> > NOAA/NWS Ocean Prediction Center
> > > > > > > > > > > >> > NCWCP
> > > > > > > > > > > >> > 5830 University Research Ct
> > > > > > > > > > > >> > College Park, MD  20740-3818
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > (p) 301-683-1551
> > > > > > > > > > > >> > rosalyn.maccracken at noaa.gov
> > > > > > > > > > > >> >
> > > > > > > > > > > >> >
> > > > > > > > > > > >>
> > > > > > > > > > > >>
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Rosalyn MacCracken
> > > > > > > > > > > > Support Scientist
> > > > > > > > > > > >
> > > > > > > > > > > > Ocean Applications Branch
> > > > > > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > > > > > NCWCP
> > > > > > > > > > > > 5830 University Research Ct
> > > > > > > > > > > > College Park, MD  20740-3818
> > > > > > > > > > > >
> > > > > > > > > > > > (p) 301-683-1551
> > > > > > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Rosalyn MacCracken
> > > > > > > > > > > Support Scientist
> > > > > > > > > > >
> > > > > > > > > > > Ocean Applications Branch
> > > > > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > > > > NCWCP
> > > > > > > > > > > 5830 University Research Ct
> > > > > > > > > > > College Park, MD  20740-3818
> > > > > > > > > > >
> > > > > > > > > > > (p) 301-683-1551
> > > > > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Rosalyn MacCracken
> > > > > > > > > Support Scientist
> > > > > > > > >
> > > > > > > > > Ocean Applications Branch
> > > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > > NCWCP
> > > > > > > > > 5830 University Research Ct
> > > > > > > > > College Park, MD  20740-3818
> > > > > > > > >
> > > > > > > > > (p) 301-683-1551
> > > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Rosalyn MacCracken
> > > > > > > Support Scientist
> > > > > > >
> > > > > > > Ocean Applications Branch
> > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > NCWCP
> > > > > > > 5830 University Research Ct
> > > > > > > College Park, MD  20740-3818
> > > > > > >
> > > > > > > (p) 301-683-1551
> > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Rosalyn MacCracken
> > > > > Support Scientist
> > > > >
> > > > > Ocean Applications Branch
> > > > > NOAA/NWS Ocean Prediction Center
> > > > > NCWCP
> > > > > 5830 University Research Ct
> > > > > College Park, MD  20740-3818
> > > > >
> > > > > (p) 301-683-1551
> > > > > rosalyn.maccracken at noaa.gov
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> > > --
> > > Rosalyn MacCracken
> > > Support Scientist
> > >
> > > Ocean Applications Branch
> > > NOAA/NWS Ocean Prediction Center
> > > NCWCP
> > > 5830 University Research Ct
> > > College Park, MD  20740-3818
> > >
> > > (p) 301-683-1551
> > > rosalyn.maccracken at noaa.gov
> > >
> > >
> >
> >
>
>
> --
> Rosalyn MacCracken
> Support Scientist
>
> Ocean Applications Branch
> NOAA/NWS Ocean Prediction Center
> NCWCP
> 5830 University Research Ct
> College Park, MD  20740-3818
>
> (p) 301-683-1551
> rosalyn.maccracken at noaa.gov
>
>

------------------------------------------------
Subject: question on regenerating data
From: Rosalyn MacCracken - NOAA Affiliate
Time: Mon May 07 12:55:04 2018

Ah, Ok, I got it.

I'm going to run another test or two after I have my script set up.  I
don't think I have any other questions.  So, I'm thinking that we might
be able to close this ticket... unless you think I should run my test
and make sure everything works the way I think it will before closing
the ticket.

Roz

On Mon, May 7, 2018 at 2:37 PM, John Halley Gotway via RT
<met_help at ucar.edu> wrote:

> Roz,
>
> Yes, you can use environment variables inside MET config files.
> Presumably, you're calling stat_analysis from some shell script.  And
> that script could include:
>
> # for cshell
>    setenv CUR_VALID_BEG 20180307_003000
>    setenv CUR_VALID_END 20180307_013000
>
> # or for bash
>    export CUR_VALID_BEG="20180307_003000"
>    export CUR_VALID_END="20180307_013000"
>
> Then in your STAT-Analysis config file you could use this:
>    -set_hdr OBS_VALID_BEG ${CUR_VALID_BEG} -set_hdr OBS_VALID_END ${CUR_VALID_END}
>
> As for how you set up your logic and organize your data, that's
> totally up to you.
>
> Thanks,
> John
>
>
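
A driver script along the lines John describes might look like the
sketch below.  It is only an illustration under some assumptions: the
+/- 30 minute window is the one Roz mentions for ASCAT, the function
names, LOOKIN_DIR and SA_CONFIG are made-up placeholders, stat_analysis
is assumed to be on the PATH, and GNU date is assumed for the time
arithmetic.  By default it only prints what it would run; set DRY_RUN=0
to actually call STAT-Analysis.

```shell
#!/bin/bash
# Sketch only: export the obs time window for one valid hour, then call
# STAT-Analysis with a config file whose jobs reference ${CUR_VALID_BEG}
# and ${CUR_VALID_END}.  Assumes GNU date.

LOOKIN_DIR=${LOOKIN_DIR:-./hourly}          # placeholder input directory
SA_CONFIG=${SA_CONFIG:-STATAnalysisConfig}  # placeholder config file name

# Print "BEG END" (YYYYMMDD_HHMMSS) for a window of +/- half_width
# seconds centered on the valid hour.  Hypothetical helper.
obs_window() {
  ymd=$1; hh=$2; half_width=${3:-1800}
  iso=$(echo "$ymd" | sed 's/^\(....\)\(..\)\(..\)$/\1-\2-\3/')
  center=$(date -u -d "$iso $hh:00:00" +%s) || return 1
  beg=$(date -u -d "@$((center - half_width))" +%Y%m%d_%H%M%S)
  end=$(date -u -d "@$((center + half_width))" +%Y%m%d_%H%M%S)
  echo "$beg $end"
}

# Set the environment variables for one valid hour and run (or print)
# the STAT-Analysis call.
run_hour() {
  window=$(obs_window "$1" "$2") || return 1
  CUR_VALID_BEG=${window% *}
  CUR_VALID_END=${window#* }
  export CUR_VALID_BEG CUR_VALID_END
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "CUR_VALID_BEG=$CUR_VALID_BEG CUR_VALID_END=$CUR_VALID_END" \
         "stat_analysis -lookin $LOOKIN_DIR -config $SA_CONFIG"
  else
    stat_analysis -lookin "$LOOKIN_DIR" -config "$SA_CONFIG"
  fi
}

# Example: the four North Atlantic hours for one day.
for hh in 00 01 11 12; do
  run_hour 20180307 "$hh"
done
```

For the 01z hour on 20180307 this gives a window of 20180307_003000 to
20180307_013000, the same values used in the -set_hdr example in this
thread.
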
> On Mon, May 7, 2018 at 11:41 AM, Rosalyn MacCracken - NOAA Affiliate
> via RT <met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> >
> > Hi John,
> >
> > Where would I set the environment variables?  Those are set within
> > the config file, correct?  Can you somehow pass them into the config
> > file?  I didn't think that the config file was that dynamic.
> >
> > So, I was thinking about appending the files with another shell
> > script using "cat".  But, I think I'm actually leaning towards
> > copying hourly files to a temp directory, processing, then removing
> > the file, and copying the next hour, and so on.  I don't know if
> > that's a silly idea or not...
> >
> > Roz
> >
> >
> >
> >
> >
> > On Mon, May 7, 2018 at 1:06 PM, John Halley Gotway via RT
> > <met_help at ucar.edu> wrote:
> >
> > > Roz,
> > >
> > > Yes, the "-set_hdr" option is specific to each job.  If your jobs
> > > are defined in the config file, then yes, you'd need to specify it
> > > there.  Indeed, getting the timestamps consistent really is just
> > > cosmetic.  If you're looping over many times, I'd suggest using an
> > > environment variable:
> > >   -set_hdr OBS_VALID_BEG ${CUR_VALID_BEG} -set_hdr OBS_VALID_END ${CUR_VALID_END}
> > >
> > > Using environment variables in MET configuration files makes
> > > scripting much more convenient.
> > >
> > > However, STAT-Analysis doesn't have the ability to append to an
> > > output file.  If you write to the same output file name, it'll
> > > *clobber* that file (i.e. replace it).
> > >
> > > Hope that helps.
> > >
> > > Thanks,
> > > John
> > >
> > >
> > > On Mon, May 7, 2018 at 10:38 AM, Rosalyn MacCracken - NOAA
> > > Affiliate via RT <met_help at ucar.edu> wrote:
> > >
> > > >
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > >
> > > > Hi John,
> > > >
> > > > So, it sounds like I'm ok either way with the timestamp.  If I
> > > > don't use -set_hdr, it sets the correct beginning and end time
> > > > according to the mpr file, or, I can use -set_hdr for
> > > > consistency with the other files, but, that's more "cosmetic".
> > > >
> > > > Oh, but, that -set_hdr command option is within the config file,
> > > > correct?  So, you really couldn't loop through that and pass a
> > > > time variable into the command.  So, it may just be easiest to
> > > > leave it out.
> > > >
> > > > So, since my processing would take 20 minutes for regenerating
> > > > one day's worth of data, I was thinking, I would do all my
> > > > processing for the North Atlantic first, so, I can look at how
> > > > we did with those Nor'easters in March.  So, that's processing
> > > > the 00z, 01z, 11z and 12z time periods first, since that is when
> > > > ASCAT passes over the North Atlantic.  So, I would copy those
> > > > time periods into a temp directory and use the -lookin command
> > > > to process those 4 time periods for my entire period (maybe 1
> > > > week at a time).
> > > >
> > > > So, this will produce 1 file with 00z, 01z, 11z and 12z data,
> > > > for each week, correct?  And, the only way to get individual
> > > > files is to copy the data, one hour at a time, process, delete
> > > > the file, lather, rinse and repeat.  That might be hard to do.
> > > > I may have to think about that...
> > > >
> > > > So, if it's one file, with all the data for the week, at
> > > > selected hours, what happens when I have time to run the rest of
> > > > the data?  I just write that to a different file, and then,
> > > > maybe append that to the end of the first file?  Or, just leave
> > > > it separate?
> > > >
> > > > I guess I just have to think about what I'm going to do next
> > > > with these files, and the easiest way to do that.
> > > >
> > > > Roz
> > > >
> > > > On Mon, May 7, 2018 at 11:48 AM, John Halley Gotway via RT <
> > > > met_help at ucar.edu> wrote:
> > > >
> > > > > Roz,
> > > > >
> > > > > I understand that you're suspicious about the beginning and
> > > > > ending time stamps in the OBS_VALID_BEG and _END columns.
> > > > > You're comparing the original output from Point-Stat to the
> > > > > output that you're getting from STAT-Analysis.  However, those
> > > > > timestamps can be different without there actually being a
> > > > > problem.  Here's why...
> > > > >
> > > > > When you run Point-Stat, the obs_window setting in the config
> > > > > file defines the matching time window.  If your forecast is
> > > > > valid at time T, the matching time window is defined as
> > > > > T+obs_window.beg to T+obs_window.end.  The point observations
> > > > > may actually fall anywhere in that time window... but it's
> > > > > that time window that's reported in the summary line types
> > > > > (like CTC, CTS, SL1L2, and CNT).  Since the MPR line type is
> > > > > specific to each observation value, the *actual* timestamp of
> > > > > that observation is reported in that line.
> > > > >
> > > > > When you run STAT-Analysis to process those MPR lines, it
> > > > > reads the OBS_VALID_BEG and OBS_VALID_END columns.  And it
> > > > > keeps track of the minimum OBS_VALID_BEG timestamp and the
> > > > > maximum OBS_VALID_END timestamp.  When it writes output CTC,
> > > > > CTS, SL1L2, or CNT lines it reports the minimum/maximum
> > > > > timestamp values it found in the data.
> > > > >
> > > > > So Point-Stat reports the *REQUESTED TIME WINDOW* in the
> > > > > OBS_VALID_BEG and OBS_VALID_END columns... while STAT-Analysis
> > > > > reports the *ACTUAL TIME WINDOW*.  And in general, those won't
> > > > > be the same.  So this isn't necessarily a problem.
> > > > >
> > > > > If for consistency, you'd like to explicitly set the
> > > > > OBS_VALID_BEG and OBS_VALID_END timestamps in the output, you
> > > > > can use the "-set_hdr" job command option to do so:
> > > > >    -set_hdr OBS_VALID_BEG 20180307_003000 -set_hdr OBS_VALID_END 20180307_013000
> > > > >
> > > > > Thanks,
> > > > > John
> > > > >
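
The "actual time window" John describes can be checked by hand.  The
sketch below is not how STAT-Analysis is implemented; it just scans the
MPR lines of one file and prints the earliest OBS_VALID_BEG and the
latest OBS_VALID_END.  The function name is made up, and it assumes a
whitespace-delimited file whose first line is a header row naming the
OBS_VALID_BEG, OBS_VALID_END and LINE_TYPE columns.

```shell
#!/bin/bash
# Sketch only: report the min OBS_VALID_BEG and max OBS_VALID_END found
# in the MPR lines of one whitespace-delimited .stat file.
# Timestamps in YYYYMMDD_HHMMSS form sort correctly as plain strings.
obs_window_from_mpr() {
  awk '
    NR == 1 {
      # Map header names to column numbers.
      for (i = 1; i <= NF; i++) col[$i] = i
      next
    }
    $(col["LINE_TYPE"]) == "MPR" {
      beg = $(col["OBS_VALID_BEG"]) ""   # force string comparison
      end = $(col["OBS_VALID_END"]) ""
      if (min == "" || beg < min) min = beg
      if (max == "" || end > max) max = end
    }
    END { if (min != "") print min, max }
  ' "$1"
}
```

Running it on the original Point-Stat output and comparing the result
with the OBS_VALID_BEG and OBS_VALID_END columns of the CTS line would
show the requested window next to the window actually covered by the
observations.
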
> > > > > On Sun, May 6, 2018 at 2:12 PM, Rosalyn MacCracken - NOAA
> > > > > Affiliate via RT <met_help at ucar.edu> wrote:
> > > > >
> > > > > >
> > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > >
> > > > > > Hi John,
> > > > > >
> > > > > > Sorry it took me so long to get back to you.  My
> > > > > > step-daughter came in to town, and I thought that I could
> > > > > > get some work done while she was here, but, didn't.  Then, I
> > > > > > totally forgot to email you back.  Sorry for leaving you
> > > > > > hanging!
> > > > > >
> > > > > > Anyway, I was able to play around with the STATAnalysis
> > > > > > config file you sent me.  I tried it out with only 1 hour
> > > > > > timestep, instead of all the files for one day.  I wanted to
> > > > > > see what kind of time it would take to process this on my
> > > > > > machine.  So, it was quick, 45 seconds.  But, of course your
> > > > > > run took 18 minutes.  The script was probably reading 20
> > > > > > some files.  That makes sense.
> > > > > >
> > > > > > So, then, I looked at the output, and it wasn't quite what I
> > > > > > expected, and doesn't quite match the stats from the other
> > > > > > processing.  This is what I did:
> > > > > >
> > > > > > 1)  I copied the 00z only *20180307*.stat file to a temp
> > > > > > directory.  Before I did this, I looked at the matching
> > > > > > *.mpr file, and saw that the OBS_VALID_BEG was
> > > > > > 20180307_000000 and the OBS_VALID_END was 20180307_002700.
> > > > > > 2)  Ran the run_sa.sh script and generated the CTS, CTC and
> > > > > > CNT files.
> > > > > > 3)  I looked at the new agg_cts file, and the OBS_VALID_BEG
> > > > > > and _END matched the *.mpr file in step 1.
> > > > > > 4)  I looked at the original CTS file, and the OBS_VALID_BEG
> > > > > > was 20180307_223000 and the OBS_VALID_END was
> > > > > > 20180307_013000.  So, that was our original way of
> > > > > > processing.  I bet if I looked at a more recent file, it
> > > > > > would be more like OBS_VALID_BEG was 20180307_233000 and the
> > > > > > OBS_VALID_END was 20180307_003000.
> > > > > > 5)  I looked at the original *mpr for 01z, and the
> > > > > > OBS_VALID_BEG was 20180307_003000 and the OBS_VALID_END was
> > > > > > 20180307_012100
> > > > > >
> > > > > > So, this tells me that I'm not matching observation times,
> > > > > > and I'm not sure how to fix it to match things up.  First,
> > > > > > we use a +/- 30 min window for ASCAT obs, centered on the
> > > > > > hour.  For example, if we are processing the 00z hour, we
> > > > > > will match observations from 233000 from the day before to
> > > > > > 003000 the current day.  Actually, we used to do an hour
> > > > > > window on either side, but, we have more observations now at
> > > > > > each hour.  (See the explanation in #4 above)
> > > > > >
> > > > > > Anyway, how do I create the CTS, CTC and CNT files for the
> > > > > > +/- 30 min window?  Is there a way to dynamically indicate
> > > > > > this 30min window, so that I don't have to go into the
> > > > > > config file every time I run STATAnalysis and change it?
> > > > > >
> > > > > > Roz
> > > > > >
> > > > > > On Thu, Apr 26, 2018 at 4:14 PM, John Halley Gotway via RT
> > > > > > <met_help at ucar.edu> wrote:
> > > > > >
> > > > > > > Roz,
> > > > > > >
> > > > > > > The CSI statistic is computed from a 2x2 contingency table.
> > > > > > > A 2x2 contingency table is defined by a single threshold.
> > > > > > > Looking in the .stat files you sent, I see that you've
> > > > > > > applied many thresholds to generate many 2x2 contingency
> > > > > > > tables and corresponding statistics.  Yes, it is true that
> > > > > > > for most of those thresholds, the "bad" observation values
> > > > > > > will fall into the "non-event" category.  But those
> > > > > > > non-event counts are included in the computation of some
> > > > > > > stats, including CSI.  So even though the bad observations
> > > > > > > aren't very interesting, they really are impacting the
> > > > > > > statistics.
> > > > > > >
> > > > > > > John
> > > > > > >
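
A small worked example may help here.  With a ">= threshold" event
definition, a bad observation of 0.01 m/s paired with a 10 m/s forecast
lands in the "forecast yes, observed no" cell (a false alarm) whenever
the threshold is at or below 10, and CSI = FY_OY / (FY_OY + FY_ON +
FN_OY) includes that cell.  The sketch below only illustrates the
arithmetic; the function names and the counts are invented, not taken
from Roz's data.

```shell
#!/bin/bash
# Sketch only: classify one matched pair into a 2x2 contingency table
# cell for a ">= thresh" event, and compute CSI from the table counts.

# cell FCST OBS THRESH -> FY_OY, FY_ON, FN_OY or FN_ON
cell() {
  awk -v f="$1" -v o="$2" -v t="$3" 'BEGIN {
    a = (f >= t) ? "FY" : "FN"
    b = (o >= t) ? "OY" : "ON"
    print a "_" b
  }'
}

# csi FY_OY FY_ON FN_OY -> hits / (hits + false alarms + misses)
csi() {
  awk -v fy_oy="$1" -v fy_on="$2" -v fn_oy="$3" \
    'BEGIN { printf "%.4f\n", fy_oy / (fy_oy + fy_on + fn_oy) }'
}

cell 10 0.01 5.5689   # FY_ON: counts against CSI as a false alarm
cell 10 0.01 15       # FN_ON: a correct negative, which CSI ignores
csi 50 10 10          # invented counts without the bad pairs
csi 50 15 10          # same table with 5 extra false alarms
```

With these made-up counts, CSI drops from 0.7143 to 0.6667 once five
bad pairs are added as false alarms.  At thresholds above the forecast
value, the same pair becomes a correct negative, which leaves CSI alone
but still enters statistics that use the full table.
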
> > > > > > > On Wed, Apr 25, 2018 at 10:08 AM, Rosalyn MacCracken - NOAA
> > > > > > > Affiliate via RT <met_help at ucar.edu> wrote:
> > > > > > >
> > > > > > > >
> > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > >
> > > > > > > > Figures.  I just calculated how long it will take me to
> > > > > > > > regenerate data for 03072018 - 04122018.  It will take me
> > > > > > > > 912 hours.  ;-(
> > > > > > > >
> > > > > > > > Ok, I know I asked this, but, if I had an OBS value of
> > > > > > > > 0.01 and a matched GFS point of 10 m/s, and I had low
> > > > > > > > thresholds of 0-5 m/s, 6-10 m/s and 10-15 m/s, and say,
> > > > > > > > CSI was calculated.  Which threshold would be used for
> > > > > > > > the output, the 0-5 or 6-10?  And, would the 10-15
> > > > > > > > threshold even be affected?
> > > > > > > >
> > > > > > > > Roz
> > > > > > > >
> > > > > > > > On Wed, Apr 25, 2018 at 11:40 AM, John Halley Gotway via RT
> > > > > > > > <met_help at ucar.edu> wrote:
> > > > > > > >
> > > > > > > > > Roz,
> > > > > > > > >
> > > > > > > > > I think it'd take just as long.  The slow part is reading
> > > > > > > > > the data... not applying a threshold.
> > > > > > > > >
> > > > > > > > > John
> > > > > > > > >
> > > > > > > > > On Wed, Apr 25, 2018 at 9:18 AM, Rosalyn MacCracken - NOAA
> > > > > > > > > Affiliate via RT <met_help at ucar.edu> wrote:
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > >
> > > > > > > > > > Hi John,
> > > > > > > > > >
> > > > > > > > > > Thanks for doing that for me.  I'll take a look at the
> > > > > > > > > > info you sent me this afternoon.  I'm in the middle of
> > > > > > > > > > doing something right now...trying to make a different
> > > > > > > > > > program work.  ;-/
> > > > > > > > > >
> > > > > > > > > > I wonder if it will be quicker than 18 minutes for some
> > > > > > > > > > of the thresholds that have higher wind speeds, and not
> > > > > > > > > > as many instances (or 0 instances).  Or, will it take
> > > > > > > > > > just as long, since it still needs to read through the
> > > > > > > > > > entire *.stat file anyway?
> > > > > > > > > >
> > > > > > > > > > Roz
> > > > > > > > > >
> > > > > > > > > > On Tue, Apr 24, 2018 at 7:06 PM, John Halley Gotway via RT
> > > > > > > > > > <met_help at ucar.edu> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Roz,
> > > > > > > > > > >
> > > > > > > > > > > Thanks for sending the sample data.  I grabbed it and
> > > > > > > > > > > used it to run some sample jobs:
> > > > > > > > > > >
> > > > > > > > > > > time /d1/johnhg/MET/MET_releases/met-6.0/bin/stat_analysis \
> > > > > > > > > > > -lookin /d1/johnhg/MET/MET_Help/maccracken_data_20180424/opc_test/home/opc_test/data/met_verif/GFS/data/hourly \
> > > > > > > > > > > -config STATAnalysisConfig \
> > > > > > > > > > > -log run_sa.log -v 3
> > > > > > > > > > >
> > > > > > > > > > > I used the "-lookin" option to point to all the data you
> > > > > > > > > > > sent.
> > > > > > > > > > >
> > > > > > > > > > > I've attached the...
> > > > > > > > > > > (1) config file I used
> > > > > > > > > > > (2) log file that was generated
> > > > > > > > > > > (3) output .stat files
> > > > > > > > > > >
> > > > > > > > > > > Looking at the jobs, you'll see that I've included 5 of
> > > > > > > > > > > them...
> > > > > > > > > > > - Generate CNT output
> > > > > > > > > > > - Generate CTC >= 0.0 output
> > > > > > > > > > > - Generate CTS >= 0.0 output
> > > > > > > > > > > - Generate CTC >= 5.5689 output
> > > > > > > > > > > - Generate CTS >= 5.5689 output
> > > > > > > > > > >
> > > > > > > > > > > Unfortunately, you'll need to define separate jobs for
> > > > > > > > > > > each threshold you'd like to use.  Although, you
> > > > > > > > > > > shouldn't use >=0.0 since that's always true.
> > > > > > > > > > >
> > > > > > > > > > > Also unfortunately, this is pretty slow.  On my machine,
> > > > > > > > > > > it took like 18 minutes for these 5 jobs!
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > John
> > > > > > > > > > >
> > > > > > > > > > >
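
Since each job produces one output line type for one threshold, the job
strings are repetitive enough to generate.  The sketch below prints one
aggregate_stat job string per line type and threshold, each with its
own -out_stat file so that no job clobbers another.  John's attached
config file is not shown in this thread, so the use of "-out_thresh" to
apply the threshold is an assumption to check against the MET user's
guide; the function name, output directory and file naming pattern are
also made up.

```shell
#!/bin/bash
# Sketch only: print one STAT-Analysis job string per output line type
# and threshold.  The strings could be pasted into the jobs list of a
# STAT-Analysis config file or passed on the stat_analysis command line.
# ASSUMPTION: -out_thresh is how the CTC/CTS threshold is applied here.
make_jobs() {
  out_dir=$1; shift
  # Options shared by every job; gt1 drops the bad (< 1 m/s) obs.
  common="-job aggregate_stat -line_type MPR -fcst_var WIND -column_thresh OBS gt1"
  echo "$common -out_line_type CNT -out_stat $out_dir/agg_cnt.stat"
  for thresh in "$@"; do
    for type in CTC CTS; do
      echo "$common -out_line_type $type -out_thresh $thresh" \
           "-out_stat $out_dir/agg_${type}_${thresh}.stat"
    done
  done
}

make_jobs ./new_rerun_stat_files ge5.5689 ge10
```

Two thresholds give five jobs here: one CNT job plus a CTC and a CTS
job per threshold.
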
> > > > > > > > > > > On Tue, Apr 24, 2018 at 2:09 PM, Rosalyn MacCracken - NOAA
> > > > > > > > > > > Affiliate via RT <met_help at ucar.edu> wrote:
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > > > >
> > > > > > > > > > > > Hi John,
> > > > > > > > > > > >
> > > > > > > > > > > > I put my file on the ftp site.  Let me know what you
> > > > > > > > > > > > find.  You'll see those really low OBS values (0.01,
> > > > > > > > > > > > 0.02, and so on).
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks!
> > > > > > > > > > > >
> > > > > > > > > > > > Roz
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Apr 24, 2018 at 2:53 PM, Rosalyn MacCracken - NOAA
> > > > > > > > > > > > Affiliate <rosalyn.maccracken at noaa.gov> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Ok, I'll get that over to the ftp site.  I have to make
> > > > > > > > > > > > > sure that I find a day that has all the data in it.
> > > > > > > > > > > > > Sometimes the data isn't available when the script runs.
> > > > > > > > > > > > > A little annoying, but, that's operations...
> > > > > > > > > > > > >
> > > > > > > > > > > > > I'll let you know when I get the file to the ftp site.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks!
> > > > > > > > > > > > >
> > > > > > > > > > > > > Roz
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, Apr 24, 2018 at 2:49 PM, John Halley Gotway via RT
> > > > > > > > > > > > > <met_help at ucar.edu> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > >> Roz,
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> Yes, we do.  Follow the instructions here:
> > > > > > > > > > > > >>    https://dtcenter.org/met/users/support/met_help.php#ftp
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> I'd suggest making a tar file for one day and posting it
> > > > > > > > > > > > >> to the ftp site:
> > > > > > > > > > > > >>    tar -cvzf sample.tar.gz /GFS/data/hourly/20180305*
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> Thanks,
> > > > > > > > > > > > >> John
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> On Tue, Apr 24, 2018 at 11:57 AM, Rosalyn MacCracken - NOAA
> > > > > > > > > > > > >> Affiliate via RT <met_help at ucar.edu> wrote:
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > HI John,
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > Yes, it does seem that the -config option is the way to
> > > > > > > > > > > > >> > go to recreate those 3 files.  I'll be sure to have a
> > > > > > > > > > > > >> > unique file name, or, mv the output file to a different
> > > > > > > > > > > > >> > name before running the command again.  Thanks for
> > > > > > > > > > > > >> > pointing that out.
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > I'm teleworking for the next couple of weeks, so, I
> > > > > > > > > > > > >> > can't download and send you *.stat files like I can
> > > > > > > > > > > > >> > when I'm at my computer at work.  I don't have access
> > > > > > > > > > > > >> > to theia or wcoss anymore.  You have an ftp server that
> > > > > > > > > > > > >> > I can upload data to, right?  If not, I can try and
> > > > > > > > > > > > >> > fiddle around with this tomorrow and see if I can't get
> > > > > > > > > > > > >> > this to work the way I want to.
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > Roz
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > On Tue, Apr 24, 2018 at 11:42 AM, John Halley Gotway via RT
> > > > > > > > > > > > >> > <met_help at ucar.edu> wrote:
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > > Roz,
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > Each "-job aggregate_stat" only generates a single
> > > > > > > > > > > > >> > > output line type.  So using "-out_line_type
> > > > > > > > > > > > >> > > CTC,CTS,CNT" will not work.
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > You'll need to run separate jobs for each output line
> > > > > > > > > > > > >> > > type you want to generate.  That's why I'd recommend
> > > > > > > > > > > > >> > > grouping those multiple jobs together into a single
> > > > > > > > > > > > >> > > STAT-Analysis config file.  Then you'd call
> > > > > > > > > > > > >> > > STAT-Analysis once using the "-config" command line
> > > > > > > > > > > > >> > > option.
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > Another issue is that if you set "-out_stat" to the
> > > > > > > > > > > > >> > > same filename, it'll get overwritten by each job.
> > > > > > > > > > > > >> > > STAT-Analysis will overwrite that output file rather
> > > > > > > > > > > > >> > > than appending to it.
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > You could send me a day's worth of .stat output files
> > > > > > > > > > > > >> > > (/GFS/data/hourly/20180305*) and I could send you some
> > > > > > > > > > > > >> > > suggestions.  Or if you have access to theia you could
> > > > > > > > > > > > >> > > copy them up there and point me to it.
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > Thanks,
> > > > > > > > > > > > >> > > John
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > On Tue, Apr 24, 2018 at 7:48 AM, Rosalyn MacCracken - NOAA
> > > > > > > > > > > > >> > > Affiliate via RT <met_help at ucar.edu> wrote:
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > Hi John,
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > Yes, that makes sense.  Those very small values (<1.0
> > > > > > > > > > > > >> > > > m/s), are bad values.  That's why they shouldn't be
> > > > > > > > > > > > >> > > > included in the processing.
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > So, I need to just regenerate hourly data, one hour at
> > > > > > > > > > > > >> > > > a time.  Would it make sense to use a shell script and
> > > > > > > > > > > > >> > > > loop stat-analysis?  Something like:
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > for day in 11 12
> > > > > > > > > > > > >> > > > do
> > > > > > > > > > > > >> > > >   for cycle in 00 06 12 18
> > > > > > > > > > > > >> > > >   do
> > > > > > > > > > > > >> > > >     stat_analysis -lookin /GFS/data/hourly/201803${day}${cycle}/*.stat \
> > > > > > > > > > > > >> > > >       -job aggregate_stat \
> > > > > > > > > > > > >> > > >       -line_type MPR \
> > > > > > > > > > > > >> > > >       -out_line_type CTC,CTS,CNT \
> > > > > > > > > > > > >> > > >       -fcst_var WIND \
> > > > > > > > > > > > >> > > >       -column_thresh OBS gt1 \
> > > > > > > > > > > > >> > > >       -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > > > > > > > > > >> > > >       -out_stat /new_rerun_stat_files/MPR_to_CTC_CTS_CNT.stat
> > > > > > > > > > > > >> > > >   done
> > > > > > > > > > > > >> > > > done
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > or, something like that?  And, will this regenerate
> > > > > > > > > > > > >> > > > hour forecasts, at each forecast and lead hour?  I
> > > > > > > > > > > > >> > > > guess it will see the forecast and lead hour from the
> > > > > > > > > > > > >> > > > *.stat file, and whatever *stat file is in the
> > > > > > > > > > > > >> > > > directory, it will regenerate those hours, right?
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > So, I need to regenerate the CTC, CNT and CTS files.
> > > > > > > > > > > > >> > > > That's why I did:
> > > > > > > > > > > > >> > > >  -out_line_type CTC,CTS,CNT
> > > > > > > > > > > > >> > > > but, will that make 3 separate files, or just another
> > > > > > > > > > > > >> > > > *.stat file?
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > Roz
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > On Mon, Apr 23, 2018 at 4:01 PM, John Halley Gotway via RT
> > > > > > > > > > > > >> > > > <met_help at ucar.edu> wrote:
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > > Roz,
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > It is ultimately up to you to decide which matched
> > > > > > > > > > > > >> > > > > pairs you want to include in your processing.  Do you
> > > > > > > > > > > > >> > > > > consider those small (<1.0 m/s) observation values to
> > > > > > > > > > > > >> > > > > be corrupt and incorrect in some way or just not very
> > > > > > > > > > > > >> > > > > interesting?  If they really are BAD data values, I
> > > > > > > > > > > > >> > > > > agree that you should exclude them from your analysis.
> > > > > > > > > > > > >> > > > > But if they're just uninteresting values of low wind
> > > > > > > > > > > > >> > > > > speed, then there's no reason why you should exclude
> > > > > > > > > > > > >> > > > > them.  For example, *most* of the time it isn't
> > > > > > > > > > > > >> > > > > raining, but we often include observations of 0
> > > > > > > > > > > > >> > > > > precip.
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > There are three configurable
options in
> > > > Point-Stat
> > > > > > > that
> > > > > > > > > may
> > > > > > > > > > be
> > > > > > > > > > > > >> useful
> > > > > > > > > > > > >> > > > here:
> > > > > > > > > > > > > >> > > > > (1) You already know and use the "cat_thresh" option.  This
> > > > > > > > > > > > > >> > > > > threshold defines the events and non-events for a 2x2
> > > > > > > > > > > > > >> > > > > contingency table.  This threshold affects the contents of the
> > > > > > > > > > > > > >> > > > > FHO, CTC, CTS, MCTC, and MCTS line types that Point-Stat writes.
> > > > > > > > > > > > > >> > > > > (2) The "cnt_thresh" option is a more recent addition.  Perhaps
> > > > > > > > > > > > > >> > > > > this was a poor name choice, but instead of defining categories,
> > > > > > > > > > > > > >> > > > > it's really a *filtering* threshold.  This threshold affects the
> > > > > > > > > > > > > >> > > > > contents of the SL1L2, SAL1L2, and CNT line types that Point-Stat
> > > > > > > > > > > > > >> > > > > writes.  For example, setting "cnt_thresh = [ ge6, ge17 ];" will
> > > > > > > > > > > > > >> > > > > produce 2 CNT and 2 SL1L2 output lines containing only those
> > > > > > > > > > > > > >> > > > > points where the wind speed was >=6 and >=17, respectively.
> > > > > > > > > > > > > >> > > > > (3) The "wind_thresh" option is very similar to the "cnt_thresh"
> > > > > > > > > > > > > >> > > > > option but affects the contents of the VL1L2, VAL1L2, and VCNT
> > > > > > > > > > > > > >> > > > > (new in met-7.0) line types.  Only those U/V pairs that meet the
> > > > > > > > > > > > > >> > > > > specified wind speed threshold are included in the output.
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > > For both "cnt_thresh" and "wind_thresh", the default value in the
> > > > > > > > > > > > > >> > > > > config file is "NA", meaning, do not apply any filtering threshold
> > > > > > > > > > > > > >> > > > > criteria.
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > > You have the flexibility to run STAT-Analysis on the MPR output
> > > > > > > > > > > > > >> > > > > lines to recompute any of these output line types applying
> > > > > > > > > > > > > >> > > > > whatever filtering criteria you'd like.
> > > > > > > > > > > > > >> > > > > Here's the MET user's guide:
> > > > > > > > > > > > > >> > > > > https://dtcenter.org/met/users/docs/users_guide/MET_Users_Guide_v7.0.pdf
> > > > > > > > > > > > > >> > > > > Look on page 98 for the job command options for the
> > > > > > > > > > > > > >> > > > > "aggregate_stat" job type when the input line type is "MPR".
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > > For your second question, the "-lookin PATH" option is *VERY*
> > > > > > > > > > > > > >> > > > > flexible.  You can set PATH to either a single value or multiple
> > > > > > > > > > > > > >> > > > > values.  If you use wildcards, then the shell expands those
> > > > > > > > > > > > > >> > > > > wildcards to multiple values.  Each value you pass in can either
> > > > > > > > > > > > > >> > > > > be a filename or a directory name.  If you pass in a filename,
> > > > > > > > > > > > > >> > > > > STAT-Analysis will read it *REGARDLESS* of the file extension.
> > > > > > > > > > > > > >> > > > > If you pass in a directory name, STAT-Analysis will search that
> > > > > > > > > > > > > >> > > > > directory *RECURSIVELY* for files ending in ".stat".  For example,
> > > > > > > > > > > > > >> > > > > either of the following settings would tell STAT-Analysis to read
> > > > > > > > > > > > > >> > > > > the same list of files:
> > > > > > > > > > > > > >> > > > >    -lookin /GFS/data/hourly/*/*.stat
> > > > > > > > > > > > > >> > > > >    ... or ...
> > > > > > > > > > > > > >> > > > >    -lookin /GFS/data/hourly
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > > Be aware though that the more data you pass to STAT-Analysis, the
> > > > > > > > > > > > > >> > > > > longer it'll take for it to process it.  You can decide how much
> > > > > > > > > > > > > >> > > > > data you pass it for each job.  I'd suggest starting with what is
> > > > > > > > > > > > > >> > > > > most convenient for you.  If it's too slow, change the logic to
> > > > > > > > > > > > > >> > > > > pass it less data (e.g. only 1 day of data rather than 1 month of
> > > > > > > > > > > > > >> > > > > data).
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > > Yes, you can give it a date range.  Use -fcst_init_beg and
> > > > > > > > > > > > > >> > > > > -fcst_init_end to specify beginning/ending model initialization
> > > > > > > > > > > > > >> > > > > times or -fcst_valid_beg and -fcst_valid_end to specify
> > > > > > > > > > > > > >> > > > > beginning/ending valid times.
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > > If you find that you're running multiple jobs on the same subset
> > > > > > > > > > > > > >> > > > > of data (e.g. process MPR to CNT, MPR to SL1L2, MPR to CTC, MPR to
> > > > > > > > > > > > > >> > > > > CTS), it'd be more efficient to group those jobs into a config
> > > > > > > > > > > > > >> > > > > file.  That'll do the filtering ONCE and write the filtered data
> > > > > > > > > > > > > >> > > > > to a temp file.  Then all the jobs read data from the temp file
> > > > > > > > > > > > > >> > > > > instead of starting over from scratch.
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > Make sense?
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > John
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > > On Mon, Apr 23, 2018 at 1:01 PM, Rosalyn MacCracken - NOAA
> > > > > > > > > > > > > >> > > > > Affiliate via RT <met_help at ucar.edu> wrote:
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > Hi John,
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > > That's actually only partially correct.  It's not that I want to
> > > > > > > > > > > > > >> > > > > > use part of the MPR lines and discard the rest, and I do need to
> > > > > > > > > > > > > >> > > > > > regenerate statistics.  Let me try to re-explain.
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > > Back in early March we switched from getting our ASCAT obs from
> > > > > > > > > > > > > >> > > > > > the prepbufr data, to getting it from the MGDRLITE data.  So,
> > > > > > > > > > > > > >> > > > > > processing didn't change.  I was producing statistics at certain
> > > > > > > > > > > > > >> > > > > > threshold levels for both GFS and ASCAT.  I had this set with the
> > > > > > > > > > > > > >> > > > > > cat_thresh list, at levels of 0,6,17, etc.  We found out after
> > > > > > > > > > > > > >> > > > > > processing for a couple of weeks that the ASCAT data included
> > > > > > > > > > > > > >> > > > > > these really small values, <1.0 m/s, and that these small wind
> > > > > > > > > > > > > >> > > > > > speeds were being included in the statistics processing.
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > > So, a couple of questions.
> > > > > > > > > > > > > >> > > > > > 1) Do I have to regenerate all of my statistics (*.cts, *.cnt and
> > > > > > > > > > > > > >> > > > > > *.ctc files) because of this error?  Or, since I have threshold
> > > > > > > > > > > > > >> > > > > > levels set, will those small values be among the statistics in
> > > > > > > > > > > > > >> > > > > > the lowest thresholds?
> > > > > > > > > > > > > >> > > > > > 2) I have the *.stat files, but they are spread out into separate
> > > > > > > > > > > > > >> > > > > > directories like:
> > > > > > > > > > > > > >> > > > > > /GFS/data/hourly/${YYYYMMDDHH}/*.stat
> > > > > > > > > > > > > >> > > > > > Can I tell stat-analysis to "lookin" directories with a wildcard
> > > > > > > > > > > > > >> > > > > > (like 201803*)?  If so, how?  Or, if I tell it to look in
> > > > > > > > > > > > > >> > > > > > /GFS/data/hourly, will it look in all the directories recursively
> > > > > > > > > > > > > >> > > > > > under hourly?  And, if that's the case, can I give it a date
> > > > > > > > > > > > > >> > > > > > range, so that it only processes data from March?
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > Roz
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > > On Mon, Apr 23, 2018 at 2:18 PM, John Halley Gotway via RT
> > > > > > > > > > > > > >> > > > > > <met_help at ucar.edu> wrote:
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > > Hi Roz,
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > >> > > > > > > I read that you've run Point-Stat and saved off the matched
> > > > > > > > > > > > > >> > > > > > > pairs (MPR) output line type.  And you'd like to (1) filter
> > > > > > > > > > > > > >> > > > > > > those MPR lines to discard some of them and then (2) use the
> > > > > > > > > > > > > >> > > > > > > filtered data to regenerate summary statistics.  Yes, this is
> > > > > > > > > > > > > >> > > > > > > easily done using the STAT-Analysis tool in MET.
> > > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > >> > > > > > > You wrote that you're verifying wind speeds against ASCAT and
> > > > > > > > > > > > > >> > > > > > > that you'd like to exclude pairs where the observed wind speed
> > > > > > > > > > > > > >> > > > > > > is less than 1 m/s.  I'm just guessing here, but I'll presume
> > > > > > > > > > > > > >> > > > > > > that you want to produce both SL1L2 and CNT output line types.
> > > > > > > > > > > > > >> > > > > > > Here's what the STAT-Analysis job would look like:
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > >> > > > > > > # Filter MPR's and write SL1L2 output line
> > > > > > > > > > > > > >> > > > > > > #   -lookin         List a .stat filename or directory containing them
> > > > > > > > > > > > > >> > > > > > > #   -job            Job type is aggregate_stat
> > > > > > > > > > > > > >> > > > > > > #   -line_type      Input line type = MPR
> > > > > > > > > > > > > >> > > > > > > #   -out_line_type  Output line type = SL1L2 partial sums
> > > > > > > > > > > > > >> > > > > > > #   -fcst_var       Only process lines where FCST_VAR column = WIND
> > > > > > > > > > > > > >> > > > > > > #   -column_thresh  Only use MPR lines where OBS column > 1
> > > > > > > > > > > > > >> > > > > > > #   -by             Run this same job for each unique combination of these columns
> > > > > > > > > > > > > >> > > > > > > stat_analysis \
> > > > > > > > > > > > > >> > > > > > >    -lookin input.stat \
> > > > > > > > > > > > > >> > > > > > >    -job aggregate_stat \
> > > > > > > > > > > > > >> > > > > > >    -line_type MPR \
> > > > > > > > > > > > > >> > > > > > >    -out_line_type SL1L2 \
> > > > > > > > > > > > > >> > > > > > >    -fcst_var WIND \
> > > > > > > > > > > > > >> > > > > > >    -column_thresh OBS gt1 \
> > > > > > > > > > > > > >> > > > > > >    -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > > > > > > > > > > >> > > > > > >    -out_stat MPR_to_SL1L2.stat
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > >> > > > > > > This will produce an output .stat file containing an SL1L2 line
> > > > > > > > > > > > > >> > > > > > > for each unique combination of the header columns listed after
> > > > > > > > > > > > > >> > > > > > > the "-by" option.  To generate CNT output lines instead, you'd
> > > > > > > > > > > > > >> > > > > > > run a second job where you replace SL1L2 with CNT.  You could
> > > > > > > > > > > > > >> > > > > > > run these jobs on the command line or group them together into
> > > > > > > > > > > > > >> > > > > > > a STAT-Analysis config file, if you prefer.  Both would work.
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > >> > > > > > > You could run this once for each input .stat file you're
> > > > > > > > > > > > > >> > > > > > > processing... or you could pass many input .stat files to the
> > > > > > > > > > > > > >> > > > > > > job.  Since FCST_INIT_BEG and FCST_LEAD are included in the
> > > > > > > > > > > > > >> > > > > > > "-by" option, you'll get separate output lines for each unique
> > > > > > > > > > > > > >> > > > > > > time.
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > Hope that helps get you going.
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > Thanks,
> > > > > > > > > > > > >> > > > > > > John
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > On Thu, Apr 19, 2018 at 9:23
AM, Julie
> > > > > > Prestopnik
> > > > > > > > via
> > > > > > > > > > RT <
> > > > > > > > > > > > >> > > > > > > met_help at ucar.edu>
> > > > > > > > > > > > >> > > > > > > wrote:
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > >> > > > > > > > <URL:
> https://rt.rap.ucar.edu/rt/Tic
> > > > > > > > > > > > >> ket/Display.html?id=84822
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > >> > > > > > > > Hi Roz.  My apologies for the
delay
> in
> > > > > > > responding.
> > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > >> > > > > > > > Unfortunately, John is out of
the
> > office
> > > > > this
> > > > > > > > week,
> > > > > > > > > > and
> > > > > > > > > > > I
> > > > > > > > > > > > do
> > > > > > > > > > > > >> > not
> > > > > > > > > > > > >> > > > know
> > > > > > > > > > > > >> > > > > > the
> > > > > > > > > > > > >> > > > > > > > answers to your questions.
As you
> > > said, I
> > > > > > would
> > > > > > > > > also
> > > > > > > > > > > > >> imagine
> > > > > > > > > > > > >> > > that
> > > > > > > > > > > > >> > > > > > > > point-stat is using those
small
> values
> > > as
> > > > > > > matched
> > > > > > > > > > pairs.
> > > > > > > > > > > > >> > Also, I
> > > > > > > > > > > > >> > > > do
> > > > > > > > > > > > >> > > > > > not
> > > > > > > > > > > > >> > > > > > > > believe there is a way to
regenerate
> > the
> > > > > > > > point-stat
> > > > > > > > > > > > >> statistics
> > > > > > > > > > > > >> > > > > without
> > > > > > > > > > > > >> > > > > > > > using the original GFS data.
I
> cannot
> > > say
> > > > > > with
> > > > > > > > > > > certainty,
> > > > > > > > > > > > >> > > however.
> > > > > > > > > > > > >> > > > > > > Thank
> > > > > > > > > > > > >> > > > > > > > you for your patience in
advance.
> > We'll
> > > > > get a
> > > > > > > > > > definite
> > > > > > > > > > > > >> > response
> > > > > > > > > > > > >> > > to
> > > > > > > > > > > > >> > > > > you
> > > > > > > > > > > > >> > > > > > > as
> > > > > > > > > > > > >> > > > > > > > soon as we can.
> > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > >> > > > > > > > Thanks,
> > > > > > > > > > > > >> > > > > > > > Julie
> > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > >> > > > > > > > On Wed, Apr 18, 2018 at 6:31
AM,
> > Rosalyn
> > > > > > > > MacCracken
> > > > > > > > > -
> > > > > > > > > > > NOAA
> > > > > > > > > > > > >> > > > Affiliate
> > > > > > > > > > > > >> > > > > > via
> > > > > > > > > > > > >> > > > > > > RT
> > > > > > > > > > > > >> > > > > > > > <met_help at ucar.edu> wrote:
> > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > Wed Apr 18 06:31:39 2018:
Request
> > > 84822
> > > > > was
> > > > > > > > acted
> > > > > > > > > > > upon.
> > > > > > > > > > > > >> > > > > > > > > Transaction: Ticket created
by
> > > > > > > > > > > > >> rosalyn.maccracken at noaa.gov
> > > > > > > > > > > > >> > > > > > > > >        Queue: met_help
> > > > > > > > > > > > >> > > > > > > > >      Subject: question on
> > regenerating
> > > > > data
> > > > > > > > > > > > >> > > > > > > > >        Owner: Nobody
> > > > > > > > > > > > >> > > > > > > > >   Requestors:
> > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > > > > >> > > > > > > > >       Status: new
> > > > > > > > > > > > >> > > > > > > > >  Ticket <URL:
> > > > https://rt.rap.ucar.edu/rt/
> > > > > > > > > > > > >> > > > > > Ticket/Display.html?id=84822
> > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > Hi,
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > I'm running point-stat
using ASCAT
> > and
> > > > GFS
> > > > > > > data
> > > > > > > > to
> > > > > > > > > > > > verify
> > > > > > > > > > > > >> > > surface
> > > > > > > > > > > > >> > > > > > wind
> > > > > > > > > > > > >> > > > > > > > > speeds.  I found an error
in my
> > ASCAT
> > > > > input
> > > > > > > data
> > > > > > > > > > that
> > > > > > > > > > > > goes
> > > > > > > > > > > > >> > back
> > > > > > > > > > > > >> > > > to
> > > > > > > > > > > > >> > > > > > Mar
> > > > > > > > > > > > >> > > > > > > 7.
> > > > > > > > > > > > >> > > > > > > > > I had switched the input
source of
> > the
> > > > > data,
> > > > > > > and
> > > > > > > > > > > within
> > > > > > > > > > > > >> the
> > > > > > > > > > > > >> > new
> > > > > > > > > > > > >> > > > > data
> > > > > > > > > > > > >> > > > > > > > files,
> > > > > > > > > > > > >> > > > > > > > > it was allowing very small
values
> > (< 1
> > > > > m/s)
> > > > > > to
> > > > > > > > be
> > > > > > > > > > used
> > > > > > > > > > > > as
> > > > > > > > > > > > >> > data
> > > > > > > > > > > > >> > > > > points
> > > > > > > > > > > > >> > > > > > > in
> > > > > > > > > > > > >> > > > > > > > > the verification.  I
imagine that
> > this
> > > > is
> > > > > an
> > > > > > > > > issue,
> > > > > > > > > > > > since
> > > > > > > > > > > > >> > > > > point-stat
> > > > > > > > > > > > >> > > > > > is
> > > > > > > > > > > > >> > > > > > > > > using these very small
values as
> > > matched
> > > > > > pairs
> > > > > > > > > with
> > > > > > > > > > > the
> > > > > > > > > > > > >> GFS,
> > > > > > > > > > > > >> > > > > correct?
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > Is there a way to
regenerate the
> > > > > point-stat
> > > > > > > > > > statistics
> > > > > > > > > > > > >> > without
> > > > > > > > > > > > >> > > > > using
> > > > > > > > > > > > >> > > > > > > the
> > > > > > > > > > > > >> > > > > > > > > original GFS data?  I do
have the
> > > *stat
> > > > > and
> > > > > > > the
> > > > > > > > > *mpr
> > > > > > > > > > > > >> files,
> > > > > > > > > > > > >> > and
> > > > > > > > > > > > >> > > > it
> > > > > > > > > > > > >> > > > > is
> > > > > > > > > > > > >> > > > > > > > > pretty easy to identify
where the
> > bad
> > > > > values
> > > > > > > are
> > > > > > > > > > > > located.
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > Thanks,
> > > > > > > > > > > > >> > > > > > > > > Roz
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > --
> > > > > > > > > > > > >> > > > > > > > > Rosalyn MacCracken
> > > > > > > > > > > > >> > > > > > > > > Support Scientist
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > Ocean Applications Branch
> > > > > > > > > > > > >> > > > > > > > > NOAA/NWS Ocean Prediction
Center
> > > > > > > > > > > > >> > > > > > > > > NCWCP
> > > > > > > > > > > > >> > > > > > > > > 5830 University Research Ct
> > > > > > > > > > > > >> > > > > > > > > College Park, MD  20740-
3818
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > (p) 301-683-1551
> > > > > > > > > > > > >> > > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > --
> > > > > > > > > > > > >> > > > > > Rosalyn MacCracken
> > > > > > > > > > > > >> > > > > > Support Scientist
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > Ocean Applications Branch
> > > > > > > > > > > > >> > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > > > > > >> > > > > > NCWCP
> > > > > > > > > > > > >> > > > > > 5830 University Research Ct
> > > > > > > > > > > > >> > > > > > College Park, MD  20740-3818
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > (p) 301-683-1551
> > > > > > > > > > > > >> > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > --
> > > > > > > > > > > > >> > > > Rosalyn MacCracken
> > > > > > > > > > > > >> > > > Support Scientist
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > Ocean Applications Branch
> > > > > > > > > > > > >> > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > > > > > >> > > > NCWCP
> > > > > > > > > > > > >> > > > 5830 University Research Ct
> > > > > > > > > > > > >> > > > College Park, MD  20740-3818
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > (p) 301-683-1551
> > > > > > > > > > > > >> > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > --
> > > > > > > > > > > > >> > Rosalyn MacCracken
> > > > > > > > > > > > >> > Support Scientist
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > Ocean Applications Branch
> > > > > > > > > > > > >> > NOAA/NWS Ocean Prediction Center
> > > > > > > > > > > > >> > NCWCP
> > > > > > > > > > > > >> > 5830 University Research Ct
> > > > > > > > > > > > >> > College Park, MD  20740-3818
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > (p) 301-683-1551
> > > > > > > > > > > > >> > rosalyn.maccracken at noaa.gov
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >>
> > > > > > > > > > > > >>
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > Rosalyn MacCracken
> > > > > > > > > > > > > Support Scientist
> > > > > > > > > > > > >
> > > > > > > > > > > > > Ocean Applications Branch
> > > > > > > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > > > > > > NCWCP
> > > > > > > > > > > > > 5830 University Research Ct
> > > > > > > > > > > > > College Park, MD  20740-3818
> > > > > > > > > > > > >
> > > > > > > > > > > > > (p) 301-683-1551
> > > > > > > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Rosalyn MacCracken
> > > > > > > > > > > > Support Scientist
> > > > > > > > > > > >
> > > > > > > > > > > > Ocean Applications Branch
> > > > > > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > > > > > NCWCP
> > > > > > > > > > > > 5830 University Research Ct
> > > > > > > > > > > > College Park, MD  20740-3818
> > > > > > > > > > > >
> > > > > > > > > > > > (p) 301-683-1551
> > > > > > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Rosalyn MacCracken
> > > > > > > > > > Support Scientist
> > > > > > > > > >
> > > > > > > > > > Ocean Applications Branch
> > > > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > > > NCWCP
> > > > > > > > > > 5830 University Research Ct
> > > > > > > > > > College Park, MD  20740-3818
> > > > > > > > > >
> > > > > > > > > > (p) 301-683-1551
> > > > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Rosalyn MacCracken
> > > > > > > > Support Scientist
> > > > > > > >
> > > > > > > > Ocean Applications Branch
> > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > NCWCP
> > > > > > > > 5830 University Research Ct
> > > > > > > > College Park, MD  20740-3818
> > > > > > > >
> > > > > > > > (p) 301-683-1551
> > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Rosalyn MacCracken
> > > > > > Support Scientist
> > > > > >
> > > > > > Ocean Applications Branch
> > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > NCWCP
> > > > > > 5830 University Research Ct
> > > > > > College Park, MD  20740-3818
> > > > > >
> > > > > > (p) 301-683-1551
> > > > > > rosalyn.maccracken at noaa.gov
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Rosalyn MacCracken
> > > > Support Scientist
> > > >
> > > > Ocean Applications Branch
> > > > NOAA/NWS Ocean Prediction Center
> > > > NCWCP
> > > > 5830 University Research Ct
> > > > College Park, MD  20740-3818
> > > >
> > > > (p) 301-683-1551
> > > > rosalyn.maccracken at noaa.gov
> > > >
> > > >
> > >
> > >
> >
> >
> > --
> > Rosalyn MacCracken
> > Support Scientist
> >
> > Ocean Applications Branch
> > NOAA/NWS Ocean Prediction Center
> > NCWCP
> > 5830 University Research Ct
> > College Park, MD  20740-3818
> >
> > (p) 301-683-1551
> > rosalyn.maccracken at noaa.gov
> >
> >
>
>

--
Rosalyn MacCracken
Support Scientist

Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD  20740-3818

(p) 301-683-1551
rosalyn.maccracken at noaa.gov

------------------------------------------------
Subject: question on regenerating data
From: John Halley Gotway
Time: Mon May 07 15:06:25 2018

Roz,

I'll go ahead and resolve it while I'm thinking about it.  But if more
issues or questions arise, feel free to write.

Thanks,
John

On Mon, May 7, 2018 at 12:55 PM, Rosalyn MacCracken - NOAA Affiliate
via RT
<met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
>
> Ah, Ok, I got it.
>
> I'm going to run another test or two after I have my script set up.  I
> don't think I have any other questions.  So, I'm thinking that we might be
> able to close this ticket....unless you think I should run my test and make
> sure everything works the way I think it will before closing the ticket.
>
> Roz
>
> On Mon, May 7, 2018 at 2:37 PM, John Halley Gotway via RT
> <met_help at ucar.edu> wrote:
>
> > Roz,
> >
> > Yes, you can use environment variables inside MET config files.
> > Presumably, you're calling stat_analysis from some shell script.  And that
> > script could include:
> >
> > # for cshell
> >    setenv CUR_VALID_BEG 20180307_003000
> >    setenv CUR_VALID_END 20180307_013000
> >
> > # or for bash
> >    export CUR_VALID_BEG="20180307_003000"
> >    export CUR_VALID_END="20180307_013000"
> >
> > Then in your STAT-Analysis config file you could use this:
> >    -set_hdr OBS_VALID_BEG ${CUR_VALID_BEG} -set_hdr OBS_VALID_END ${CUR_VALID_END}
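
A minimal sketch of that scripting pattern, assuming a POSIX shell and GNU date (for the "-d" option). The obs_window helper, the list of valid times, and the commented-out paths are hypothetical illustrations, not part of MET. It derives a +/- 30 minute window around each valid hour and exports it under the two variable names used above:

```shell
#!/bin/sh
# Sketch: compute a +/- 30 minute window around each valid hour and export it
# so a STAT-Analysis config can reference ${CUR_VALID_BEG} / ${CUR_VALID_END}.
# Assumes GNU date (for "-d"); the valid times listed are only examples.

# Print "BEG END" as YYYYMMDD_HHMMSS for a valid time given as YYYYMMDDHH.
obs_window() {
  ymdh=$1
  half_width=${2:-1800}   # seconds on either side of the valid time
  y=$(printf '%s' "$ymdh" | cut -c1-4)
  m=$(printf '%s' "$ymdh" | cut -c5-6)
  d=$(printf '%s' "$ymdh" | cut -c7-8)
  h=$(printf '%s' "$ymdh" | cut -c9-10)
  center=$(date -u -d "$y-$m-$d $h:00:00" +%s)
  beg=$(date -u -d "@$((center - half_width))" +%Y%m%d_%H%M%S)
  end=$(date -u -d "@$((center + half_width))" +%Y%m%d_%H%M%S)
  printf '%s %s\n' "$beg" "$end"
}

for valid in 2018030700 2018030701; do
  window=$(obs_window "$valid")
  CUR_VALID_BEG=${window% *}   # text before the space
  CUR_VALID_END=${window#* }   # text after the space
  export CUR_VALID_BEG CUR_VALID_END
  echo "valid=$valid beg=$CUR_VALID_BEG end=$CUR_VALID_END"
  # The real call would go here, for example (paths are hypothetical):
  # stat_analysis -lookin "tmp/$valid" -config STATAnalysisConfig -v 3
done
```

For the 01z valid time this yields the same 20180307_003000 and 20180307_013000 values used in the example above.
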
> >
> > As for how you set up your logic and organize your data, totally up to you.
> >
> > Thanks,
> > John
> >
> >
> > On Mon, May 7, 2018 at 11:41 AM, Rosalyn MacCracken - NOAA Affiliate via RT
> > <met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > >
> > > Hi John,
> > >
> > > Where would I set the environment variables?  Those are set within the
> > > config file, correct?  Can you somehow pass them into the config file?  I
> > > didn't think that the config file was that dynamic.
> > >
> > > So, I was thinking about appending the files by another shell script using
> > > "cat".  But, I think I'm actually leaning towards copying hourly files to a
> > > temp directory, processing, then removing the file, and copying the next
> > > hour, and so on.  I don't know if that's a silly idea or not...
> > >
> > > Roz
> > >
> > >
> > >
> > >
> > >
> > > On Mon, May 7, 2018 at 1:06 PM, John Halley Gotway via RT
> > > <met_help at ucar.edu> wrote:
> > >
> > > > Roz,
> > > >
> > > > Yes, the "-set_hdr" option is specific to each job.  If your jobs are
> > > > defined in the config file, then yes, you'd need to specify it there.
> > > > Indeed, getting the timestamps consistent really is just cosmetic.  If
> > > > you're looping over many times, I'd suggest using an environment variable:
> > > >   -set_hdr OBS_VALID_BEG ${CUR_VALID_BEG} -set_hdr OBS_VALID_END ${CUR_VALID_END}
> > > >
> > > > Using environment variables in MET configuration files makes scripting
> > > > much more convenient.
> > > >
> > > > However, STAT-Analysis doesn't have the ability to append to an output
> > > > file.  If you write to the same output file name, it'll *clobber* that
> > > > file (i.e. replace it).
> > > >
> > > > Hope that helps.
> > > >
> > > > Thanks,
> > > > John
> > > >
> > > >
> > > > On Mon, May 7, 2018 at 10:38 AM, Rosalyn MacCracken - NOAA Affiliate via RT
> > > > <met_help at ucar.edu> wrote:
> > > >
> > > > >
> > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > >
> > > > > Hi John,
> > > > >
> > > > > So, it sounds like I'm ok either way with the timestamp.  If I don't use
> > > > > -set_hdr, it sets the correct beginning and end time according to the mpr
> > > > > file, or, I can use -set_hdr for consistency with the other files, but,
> > > > > that's more "cosmetic".
> > > > >
> > > > > Oh, but, that -set_hdr command option is within the config file, correct?
> > > > > So, you really couldn't loop through that and pass a time variable into
> > > > > the command.  So, it may just be easiest to leave it out.
> > > > >
> > > > > So, since my processing would take 20 minutes for regenerating one day's
> > > > > worth of data, I was thinking, I would do all my processing for the North
> > > > > Atlantic first, so, I can look at how we did with those Nor'Easters in
> > > > > March.  So, that's processing the 00z, 01z, 11z and 12z time periods
> > > > > first, since that is when ASCAT passes over the North Atlantic.  So, I
> > > > > would copy those time periods into a temp directory and use the -lookin
> > > > > command to process those 4 time periods for my entire period (maybe 1
> > > > > week at a time).
> > > > >
> > > > > So, this will produce 1 file with 00z, 01z, 11z and 12z data, for each
> > > > > week, correct?  And, the only way to get individual files is to copy the
> > > > > data, one hour at a time, process, delete the file, later rinse and
> > > > > repeat.  That might be hard to do.  I may have to think about that...
> > > > >
> > > > > So, if it's one file, with all the data for the week, at selected hours,
> > > > > what happens when I have time to run the rest of the data?  I just write
> > > > > that to a different file, and then, maybe append that to the end of the
> > > > > first file?  Or, just leave it separate?
> > > > >
> > > > > I guess I just have to think about what I'm going to do next with these
> > > > > files, and the easiest way to do that.
> > > > >
> > > > > Roz
> > > > >
> > > > > On Mon, May 7, 2018 at 11:48 AM, John Halley Gotway via RT <
> > > > > met_help at ucar.edu> wrote:
> > > > >
> > > > > > Roz,
> > > > > >
> > > > > > I understand that you're suspicious about the beginning and ending time
> > > > > > stamps in the OBS_VALID_BEG and _END columns.  You're comparing the
> > > > > > original output from Point-Stat to the output that you're getting from
> > > > > > STAT-Analysis.  However, those timestamps can be different without there
> > > > > > actually being a problem.  Here's why...
> > > > > >
> > > > > > When you run Point-Stat, the obs_window setting in the config file
> > > > > > defines the matching time window.  If your forecast is valid at time T,
> > > > > > the matching time window is defined as T+obs_window.beg to
> > > > > > T+obs_window.end.  The point observations may actually fall anywhere in
> > > > > > that time window... but it's that time window that's reported in the
> > > > > > summary line types (like CTC, CTS, SL1L2, and CNT).  Since the MPR line
> > > > > > type is specific to each observation value, the *actual* timestamp of
> > > > > > that observation is reported in that line.
> > > > > >
> > > > > > When you run STAT-Analysis to process those MPR lines, it reads the
> > > > > > OBS_VALID_BEG and OBS_VALID_END columns.  And it keeps track of the
> > > > > > minimum OBS_VALID_BEG timestamp and the maximum OBS_VALID_END timestamp.
> > > > > > When it writes output CTC, CTS, SL1L2, or CNT lines it reports the
> > > > > > minimum/maximum timestamp values it found in the data.
> > > > > >
> > > > > > So Point-Stat reports the *REQUESTED TIME WINDOW* in the OBS_VALID_BEG
> > > > > > and OBS_VALID_END columns... while STAT-Analysis reports the *ACTUAL TIME
> > > > > > WINDOW*.  And in general, those won't be the same.  So this isn't
> > > > > > necessarily a problem.
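
As a rough illustration of the "actual time window" idea (not MET's own code), the sketch below reduces a handful of observation timestamps to their earliest and latest values with awk. The actual_window helper and the sample times are hypothetical, chosen to fall inside a requested window of 20180306_233000 to 20180307_003000:

```shell
#!/bin/sh
# Reduce a list of observation timestamps (YYYYMMDD_HHMMSS, one per line) to
# the earliest and latest values. Fixed-width timestamps sort correctly as
# plain strings, so the comparison is forced to be a string comparison.
actual_window() {
  awk 'NR == 1 { min = $1; max = $1 }
       ($1 "") < (min "") { min = $1 }
       ($1 "") > (max "") { max = $1 }
       END { if (NR > 0) print min, max }'
}

# Hypothetical observation times for a 00z run.
printf '%s\n' 20180307_001200 20180307_000000 20180307_002700 | actual_window
```

The reported range is narrower than the requested window whenever no observation falls exactly on the window edges, which is the difference described above.
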
> > > > > >
> > > > > > If for consistency, you'd like to explicitly set the OBS_VALID_BEG and
> > > > > > OBS_VALID_END timestamps in the output, you can use the "-set_hdr" job
> > > > > > command option to do so:
> > > > > >    -set_hdr OBS_VALID_BEG 20180307_003000 -set_hdr OBS_VALID_END 20180307_013000
> > > > > >
> > > > > > Thanks,
> > > > > > John
> > > > > >
> > > > > > On Sun, May 6, 2018 at 2:12 PM, Rosalyn MacCracken - NOAA Affiliate via RT
> > > > > > <met_help at ucar.edu> wrote:
> > > > > >
> > > > > > >
> > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > >
> > > > > > > Hi John,
> > > > > > >
> > > > > > > Sorry it took me so long to get back to you.  My step-daughter came in
> > > > > > > to town, and I thought that I could get some work done while she was
> > > > > > > here, but, didn't.  Then, I totally forgot to email you back.  Sorry for
> > > > > > > leaving you hanging!
> > > > > > >
> > > > > > > Anyway, I was able to play around with the STATAnalysis config file you
> > > > > > > sent me.  I tried it out with only 1 hour timestep, instead of all the
> > > > > > > files for one day.  I wanted to see what kind of time it would take to
> > > > > > > process this on my machine.  So, it was quick, 45 seconds.  But, of
> > > > > > > course your run took 18 minutes.  The script was probably reading 20
> > > > > > > some files.  That makes sense.
> > > > > > >
> > > > > > > So, then, I looked at the output, and it wasn't quite what I expected,
> > > > > > > and doesn't quite match the stats from the other processing.  This is
> > > > > > > what I did:
> > > > > > >
> > > > > > > 1)  I copied the 00z only *20180307*.stat file to a temp directory.
> > > > > > > Before I did this, I looked at the matching *.mpr file, and saw that the
> > > > > > > OBS_VALID_BEG was 20180307_000000 and the OBS_VALID_END was
> > > > > > > 20180307_002700.
> > > > > > > 2)  Ran the run_sa.sh script and generated the CTS, CTC and CNT files.
> > > > > > > 3)  I looked at the new agg_cts file, and the OBS_VALID_BEG and _END
> > > > > > > matched the *.mpr file in step 1.
> > > > > > > 4)  I looked at the original CTS file, and the OBS_VALID_BEG was
> > > > > > > 20180307_223000 and the OBS_VALID_END was 20180307_013000.  So, that was
> > > > > > > our original way of processing.  I bet if I looked at a more recent
> > > > > > > file, it would be more like OBS_VALID_BEG was 20180307_233000 and the
> > > > > > > OBS_VALID_END was 20180307_003000.
> > > > > > > 5)  I looked at the original *mpr for 01z, and the OBS_VALID_BEG was
> > > > > > > 20180307_003000 and the OBS_VALID_END was 20180307_012100
> > > > > > >
> > > > > > > So, this tells me that I'm not matching observation times, and I'm not
> > > > > > > sure how to fix it to match things up.  First, we use a +/- 30 min
> > > > > > > window for ASCAT obs, centered on the hour.  For example, if we are
> > > > > > > processing the 00z hour, we will match observations from 233000 from the
> > > > > > > day before to 003000 the current day.  Actually, we used to do an hour
> > > > > > > window on either side, but, we have more observations now at each hour.
> > > > > > > (See the explanation in #4 above)
> > > > > > >
> > > > > > > Anyway, how do I create the CTS, CTC and CNT files for the +/- 30 min
> > > > > > > window?  Is there a way to dynamically indicate this 30min window, so
> > > > > > > that I don't have to go into the config file every time I run
> > > > > > > STATanalysis and change it?
> > > > > > >
> > > > > > > Roz
> > > > > > >
> > > > > > > On Thu, Apr 26, 2018 at 4:14 PM, John Halley Gotway via RT
> > > > > > > <met_help at ucar.edu> wrote:
> > > > > > >
> > > > > > > > Roz,
> > > > > > > >
> > > > > > > > The CSI statistic is computed from a 2x2 contingency table.  A 2x2
> > > > > > > > contingency table is defined by a single threshold.  Looking in the
> > > > > > > > .stat files you sent, I see that you've applied many thresholds to
> > > > > > > > generate many 2x2 contingency tables and corresponding statistics.
> > > > > > > > Yes, it is true that for most of those thresholds, the "bad"
> > > > > > > > observation values will fall into the "non-event" category.  But those
> > > > > > > > non-event counts are included in the computation of some stats,
> > > > > > > > including CSI.  So even though the bad observations aren't very
> > > > > > > > interesting, they really are impacting the statistics.
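
To make that concrete, here is a small worked sketch using the standard definition CSI = hits / (hits + false alarms + misses). With a threshold such as >=5.5689, a forecast of 10 m/s matched to a bad observation of 0.01 m/s is a forecast event paired with an observed non-event, i.e. a false alarm, which enlarges the CSI denominator. The csi helper and the counts are made up for illustration and are not taken from the files in this ticket:

```shell
#!/bin/sh
# CSI from contingency table counts: hits / (hits + false alarms + misses).
# Prints NA when the denominator is zero.
csi() {
  awk -v hits="$1" -v fa="$2" -v miss="$3" 'BEGIN {
    denom = hits + fa + miss
    if (denom == 0) print "NA"
    else printf "%.3f\n", hits / denom
  }'
}

csi 40 10 10   # hypothetical table without the bad pair
csi 40 11 10   # same table with one extra false alarm from the bad pair
```

With these made-up counts the first call prints 0.667 and the second prints 0.656, so a single spurious false alarm is enough to lower the score.
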
> > > > > > > >
> > > > > > > > John
> > > > > > > >
> > > > > > > > On Wed, Apr 25, 2018 at 10:08 AM, Rosalyn MacCracken - NOAA Affiliate
> > > > > > > > via RT <met_help at ucar.edu> wrote:
> > > > > > > >
> > > > > > > > >
> > > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > >
> > > > > > > > > Figures.  I just calculated how long it will take me to regenerate
> > > > > > > > > data for 03072018 - 04122018.  It will take me 912 hours.  ;-(
> > > > > > > > >
> > > > > > > > > Ok, I know I asked this, but, if I had an OBS value of 0.01 and a
> > > > > > > > > matched GFS point of 10 m/s, and I had a low threshold of 0-5 m/s,
> > > > > > > > > 6-10 m/s and 10-15 m/s, and say, CSI was calculated.  Which threshold
> > > > > > > > > would be used for the output, the 0-5 or 6-10?  And, would the 10-15
> > > > > > > > > threshold even be affected?
> > > > > > > > >
> > > > > > > > > Roz
> > > > > > > > >
> > > > > > > > > On Wed, Apr 25, 2018 at 11:40 AM, John Halley Gotway via RT
> > > > > > > > > <met_help at ucar.edu> wrote:
> > > > > > > > >
> > > > > > > > > > Roz,
> > > > > > > > > >
> > > > > > > > > > I think it'd take just as long.  The slow part is reading the
> > > > > > > > > > data... not applying a threshold.
> > > > > > > > > >
> > > > > > > > > > John
> > > > > > > > > >
> > > > > > > > > > On Wed, Apr 25, 2018 at 9:18 AM, Rosalyn MacCracken - NOAA Affiliate
> > > > > > > > > > via RT <met_help at ucar.edu> wrote:
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > > >
> > > > > > > > > > > Hi John,
> > > > > > > > > > >
> > > > > > > > > > > Thanks for doing that for me.  I'll take a look at the info you
> > > > > > > > > > > sent me this afternoon.  I'm in the middle of doing something
> > > > > > > > > > > right now...trying to make a different program work.  ;-/
> > > > > > > > > > >
> > > > > > > > > > > I wonder if it will be quicker than 18 minutes for some of the
> > > > > > > > > > > thresholds that have higher wind speeds, and not as many instances
> > > > > > > > > > > (or 0 instances).  Or, will it take just as long, since it still
> > > > > > > > > > > needs to read through the entire *.stat file anyway?
> > > > > > > > > > >
> > > > > > > > > > > Roz
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Apr 24, 2018 at 7:06 PM, John Halley Gotway via RT
> > > > > > > > > > > <met_help at ucar.edu> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Roz,
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for sending the sample data.  I grabbed it and used it to
> > > > > > > > > > > > run some sample jobs:
> > > > > > > > > > > >
> > > > > > > > > > > > time /d1/johnhg/MET/MET_releases/met-6.0/bin/stat_analysis \
> > > > > > > > > > > > -lookin /d1/johnhg/MET/MET_Help/maccracken_data_20180424/opc_test/home/opc_test/data/met_verif/GFS/data/hourly \
> > > > > > > > > > > > -config STATAnalysisConfig \
> > > > > > > > > > > > -log run_sa.log -v 3
> > > > > > > > > > > >
> > > > > > > > > > > > I used the "-lookin" option to point to all the data you sent.
> > > > > > > > > > > >
> > > > > > > > > > > > I've attached the...
> > > > > > > > > > > > (1) config file I used
> > > > > > > > > > > > (2) log file that was generated
> > > > > > > > > > > > (3) output .stat files
> > > > > > > > > > > >
> > > > > > > > > > > > Looking at the jobs, you'll see that I've included 5 of them...
> > > > > > > > > > > > - Generate CNT output
> > > > > > > > > > > > - Generate CTC >= 0.0 output
> > > > > > > > > > > > - Generate CTS >= 0.0 output
> > > > > > > > > > > > - Generate CTC >= 5.5689 output
> > > > > > > > > > > > - Generate CTS >= 5.5689 output
> > > > > > > > > > > >
> > > > > > > > > > > > Unfortunately, you'll need to define separate jobs for each
> > > > > > > > > > > > threshold you'd like to use.  Although, you shouldn't use >=0.0
> > > > > > > > > > > > since that's always true.
> > > > > > > > > > > >
> > > > > > > > > > > > Also unfortunately, this is pretty slow.  On my machine, it took
> > > > > > > > > > > > like 18 minutes for these 5 jobs!
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > John
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Apr 24, 2018 at 2:09 PM, Rosalyn MacCracken - NOAA
> > > > > > > > > > > > Affiliate via RT <met_help at ucar.edu> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hi John,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I put my file on the ftp site.  Let me know what you find.
> > > > > > > > > > > > > You'll see those really low OBS values (0.01, 0.02, and so on).
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks!
> > > > > > > > > > > > >
> > > > > > > > > > > > > Roz
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, Apr 24, 2018 at 2:53 PM, Rosalyn MacCracken - NOAA
> > > > > > > > > > > > > Affiliate <rosalyn.maccracken at noaa.gov> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Ok, I'll get that over to the ftp site.  I have to make sure
> > > > > > > > > > > > > > that I find a day that has all the data in it.  Sometimes the
> > > > > > > > > > > > > > data isn't available when the script runs.  A little annoying,
> > > > > > > > > > > > > > but, that's operations...
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I'll let you know when I get the file to the ftp site.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Roz
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, Apr 24, 2018 at 2:49 PM, John Halley Gotway via RT
> > > > > > > > > > > > > > <met_help at ucar.edu> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >> Roz,
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >> Yes, we do.  Follow the instructions here:
> > > > > > > > > > > > > >>    https://dtcenter.org/met/users/support/met_help.php#ftp
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >> I'd suggest making a tar file for one day and posting it to
> > > > > > > > > > > > > >> the ftp site:
> > > > > > > > > > > > > >>    tar -cvzf sample.tar.gz /GFS/data/hourly/20180305*
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >> Thanks,
> > > > > > > > > > > > > >> John
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >> On Tue, Apr 24, 2018 at 11:57 AM, Rosalyn MacCracken - NOAA
> > > > > > > > > > > > > >> Affiliate via RT <met_help at ucar.edu> wrote:
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > Hi John,
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > Yes, it does seem that the -config option is the way to go to
> > > > > > > > > > > > > >> > recreate those 3 files.  I'll be sure to have a unique file
> > > > > > > > > > > > > >> > name, or, mv the output file to a different name before
> > > > > > > > > > > > > >> > running the command again.  Thanks for pointing that out.
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > I'm teleworking for the next couple of weeks, so, I can't
> > > > > > > > > > > > > >> > download and send you *.stat files like I can when I'm at my
> > > > > > > > > > > > > >> > computer at work.  I don't have access to theia or wcoss
> > > > > > > > > > > > > >> > anymore.  You have an ftp server that I can upload data to,
> > > > > > > > > > > > > >> > right?  If not, I can try and fiddle around with this
> > > > > > > > > > > > > >> > tomorrow and see if I can't get this to work the way I want
> > > > > > > > > > > > > >> > to.
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > Roz
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > On Tue, Apr 24, 2018 at 11:42 AM, John Halley Gotway via RT
> > > > > > > > > > > > > >> > <met_help at ucar.edu> wrote:
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > > Roz,
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > Each "-job aggregate_stat" only generates a single output
> > > > > > > > > > > > > >> > > line type.  So using "-out_line_type CTC,CTS,CNT" will not
> > > > > > > > > > > > > >> > > work.
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > You'll need to run separate jobs for each output line type
> > > > > > > > > > > > > >> > > you want to generate.  That's why I'd recommend grouping
> > > > > > > > > > > > > >> > > those multiple jobs together into a single STAT-Analysis
> > > > > > > > > > > > > >> > > config file.  Then you'd call STAT-Analysis once using the
> > > > > > > > > > > > > >> > > "-config" command line option.
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > Another issue is that if you set "-out_stat" to the same
> > > > > > > > > > > > > >> > > filename, it'll get overwritten by each job.  STAT-Analysis
> > > > > > > > > > > > > >> > > will overwrite that output file rather than appending to it.
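
One way to keep those jobs from overwriting each other is to give every output line type its own "-out_stat" file.  The sketch below only prints candidate job strings for pasting into a config file; it uses just the options quoted in this thread.  The build_jobs helper, the output directory, and the agg_* file names are hypothetical, and the CTC and CTS jobs would still need their threshold options added as described in the MET user's guide:

```shell
#!/bin/sh
# Print one aggregate_stat job per output line type, each writing to its own
# -out_stat file so that no job overwrites another job's output.
# The output directory and file naming scheme are hypothetical.
build_jobs() {
  out_dir=$1
  shift
  for line_type in "$@"; do
    lc=$(printf '%s' "$line_type" | tr '[:upper:]' '[:lower:]')
    printf '%s\n' "-job aggregate_stat -line_type MPR -out_line_type ${line_type} -out_stat ${out_dir}/agg_${lc}.stat"
  done
}

build_jobs /new_rerun_stat_files CTC CTS CNT
```
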
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > You could send me a day's worth of .stat output files
> > > > > > > > > > > > > >> > > (/GFS/data/hourly/20180305*) and I could send you some
> > > > > > > > > > > > > >> > > suggestions.  Or if you have access to theia you could copy
> > > > > > > > > > > > > >> > > them up there and point me to it.
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > Thanks,
> > > > > > > > > > > > > >> > > John
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > On Tue, Apr 24, 2018 at 7:48 AM, Rosalyn MacCracken - NOAA
> > > > > > > > > > > > > >> > > Affiliate via RT <met_help at ucar.edu> wrote:
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > > > Hi John,
> > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > > > Yes, that makes sense.  Those very small values (<1.0
> > > > > > > > > > > > > >> > > > m/s), are bad values.  That's why they shouldn't be
> > > > > > > > > > > > > >> > > > included in the processing.
> > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > > > So, I need to just regenerate hourly data, one hour at a
> > > > > > > > > > > > > >> > > > time.  Would it make sense to use a shell script and loop
> > > > > > > > > > > > > >> > > > stat-analysis?  Something like:
> > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > > > for day in 11 12
> > > > > > > > > > > > > >> > > > do
> > > > > > > > > > > > > >> > > >   for cycle in 00 06 12 18
> > > > > > > > > > > > > >> > > >   do
> > > > > > > > > > > > > >> > > >     stat_analysis -lookin /GFS/data/hourly/201803${day}${cycle}/*.stat \
> > > > > > > > > > > > > >> > > >       -job aggregate_stat \
> > > > > > > > > > > > > >> > > >       -line_type MPR \
> > > > > > > > > > > > > >> > > >       -out_line_type CTC,CTS,CNT \
> > > > > > > > > > > > > >> > > >       -fcst_var WIND \
> > > > > > > > > > > > > >> > > >       -column_thresh OBS gt1 \
> > > > > > > > > > > > > >> > > >       -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > > > > > > > > > > >> > > >       -out_stat /new_rerun_stat_files/MPR_to_CTC_CTS_CNT.stat
> > > > > > > > > > > > > >> > > >   done
> > > > > > > > > > > > > >> > > > done
> > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > > > or, something like that?  And, will this regenerate hour
> > > > > > > > > > > > > >> > > > forecasts, at each forecast and lead hour?  I guess it
> > > > > > > > > > > > > >> > > > will see the forecast and lead hour from the *.stat file,
> > > > > > > > > > > > > >> > > > and whatever *stat file is in the directory, it will
> > > > > > > > > > > > > >> > > > regenerate those hours, right?
> > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > > > So, I need to regenerate the CTC, CNT and CTS files.
> > > > > > > > > > > > > >> > > > That's why I did:
> > > > > > > > > > > > > >> > > >  -out_line_type CTC,CTS,CNT
> > > > > > > > > > > > > >> > > > but, will that make 3 separate files, or just another
> > > > > > > > > > > > > >> > > > *.stat file?
> > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > > > Roz
> > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > > > On Mon, Apr 23, 2018 at 4:01 PM, John Halley Gotway via RT
> > > > > > > > > > > > > >> > > > <met_help at ucar.edu> wrote:
> > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > > > > Roz,
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > > It is ultimately up to you to
decide
> which
> > > > > matched
> > > > > > > > pairs
> > > > > > > > > > you
> > > > > > > > > > > > > want
> > > > > > > > > > > > > >> to
> > > > > > > > > > > > > >> > > > > include in your processing.  Do
you
> > consider
> > > > > those
> > > > > > > > small
> > > > > > > > > > > (<1.0
> > > > > > > > > > > > > >> m/s)
> > > > > > > > > > > > > >> > > > > observation values to be corrupt
and
> > > incorrect
> > > > > in
> > > > > > > some
> > > > > > > > > way
> > > > > > > > > > > or
> > > > > > > > > > > > > just
> > > > > > > > > > > > > >> > not
> > > > > > > > > > > > > >> > > > very
> > > > > > > > > > > > > >> > > > > interesting?  If they really are
BAD
> data
> > > > > values,
> > > > > > I
> > > > > > > > > agree
> > > > > > > > > > > that
> > > > > > > > > > > > > you
> > > > > > > > > > > > > >> > > should
> > > > > > > > > > > > > >> > > > > exclude them from your analysis.
But if
> > > > they're
> > > > > > > just
> > > > > > > > > > > > > >> uninteresting
> > > > > > > > > > > > > >> > > > values
> > > > > > > > > > > > > >> > > > > of low wind speed, then there's
no
> reason
> > > why
> > > > > you
> > > > > > > > should
> > > > > > > > > > > > exclude
> > > > > > > > > > > > > >> > them.
> > > > > > > > > > > > > >> > > > For
> > > > > > > > > > > > > >> > > > > example, *most* of the time it
ins't
> > > raining,
> > > > > but
> > > > > > we
> > > > > > > > > often
> > > > > > > > > > > > > >> included
> > > > > > > > > > > > > >> > > > > observations of 0 precip.
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > >> > > > > There are three configurable options in Point-Stat that may be
> > > > > > > > > > > > > > >> > > > > useful here:
> > > > > > > > > > > > > > >> > > > > (1) You already know and use the "cat_thresh" option.  This
> > > > > > > > > > > > > > >> > > > > threshold defines the events and non-events for a 2x2
> > > > > > > > > > > > > > >> > > > > contingency table.  This threshold affects the contents of the
> > > > > > > > > > > > > > >> > > > > FHO, CTC, CTS, MCTC, and MCTS line types that Point-Stat writes.
> > > > > > > > > > > > > > >> > > > > (2) The "cnt_thresh" option is a more recent addition.  Perhaps
> > > > > > > > > > > > > > >> > > > > this was a poor name choice, but instead of defining categories,
> > > > > > > > > > > > > > >> > > > > it's really a *filtering* threshold.  This threshold affects the
> > > > > > > > > > > > > > >> > > > > contents of the SL1L2, SAL1L2, and CNT line types that Point-Stat
> > > > > > > > > > > > > > >> > > > > writes.  For example, setting "cnt_thresh = [ ge6, ge17 ];" will
> > > > > > > > > > > > > > >> > > > > produce 2 CNT and 2 SL1L2 output lines containing only those
> > > > > > > > > > > > > > >> > > > > points where the wind speed was >=6 and >=17, respectively.
> > > > > > > > > > > > > > >> > > > > (3) The "wind_thresh" option is very similar to the "cnt_thresh"
> > > > > > > > > > > > > > >> > > > > option but affects the contents of the VL1L2, VAL1L2, and VCNT
> > > > > > > > > > > > > > >> > > > > (new in met-7.0) line types.  Only those U/V pairs that meet the
> > > > > > > > > > > > > > >> > > > > specified wind speed threshold are included in the output.
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > >> > > > > For both "cnt_thresh" and "wind_thresh", the default value in the
> > > > > > > > > > > > > > >> > > > > config file is "NA", meaning, do not apply any filtering
> > > > > > > > > > > > > > >> > > > > threshold criteria.
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > >> > > > > You have the flexibility to run STAT-Analysis on the MPR output
> > > > > > > > > > > > > > >> > > > > lines to recompute any of these output line types applying
> > > > > > > > > > > > > > >> > > > > whatever filtering criteria you'd like.
> > > > > > > > > > > > > > >> > > > > Here's the MET user's guide:
> > > > > > > > > > > > > > >> > > > > https://dtcenter.org/met/users/docs/users_guide/MET_Users_Guide_v7.0.pdf
> > > > > > > > > > > > > > >> > > > > Look on page 98 for the job command options for the
> > > > > > > > > > > > > > >> > > > > "aggregate_stat" job type when the input line type is "MPR".
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > >> > > > > For your second question, the "-lookin PATH" option is *VERY*
> > > > > > > > > > > > > > >> > > > > flexible.  You can set PATH to either a single value or multiple
> > > > > > > > > > > > > > >> > > > > values.  If you use wildcards, then the shell expands those
> > > > > > > > > > > > > > >> > > > > wildcards to multiple values.  Each value you pass in can either
> > > > > > > > > > > > > > >> > > > > be a filename or a directory name.  If you pass in a filename,
> > > > > > > > > > > > > > >> > > > > STAT-Analysis will read it *REGARDLESS* of the file extension.
> > > > > > > > > > > > > > >> > > > > If you pass in a directory name, STAT-Analysis will search that
> > > > > > > > > > > > > > >> > > > > directory *RECURSIVELY* for files ending in ".stat".  For
> > > > > > > > > > > > > > >> > > > > example, either of the following settings would tell
> > > > > > > > > > > > > > >> > > > > STAT-Analysis to read the same list of files:
> > > > > > > > > > > > > > >> > > > >    -lookin /GFS/data/hourly/*/*.stat
> > > > > > > > > > > > > > >> > > > >    ... or ...
> > > > > > > > > > > > > > >> > > > >    -lookin /GFS/data/hourly
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > >> > > > > Be aware though that the more data you pass to STAT-Analysis,
> > > > > > > > > > > > > > >> > > > > the longer it'll take for it to process it.  You can decide how
> > > > > > > > > > > > > > >> > > > > much data you pass it for each job.  I'd suggest starting with
> > > > > > > > > > > > > > >> > > > > what is most convenient for you.  If it's too slow, change the
> > > > > > > > > > > > > > >> > > > > logic to pass it less data (e.g. only 1 day of data rather than
> > > > > > > > > > > > > > >> > > > > 1 month of data).
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > >> > > > > Yes, you can give it a date range.  Use -fcst_init_beg and
> > > > > > > > > > > > > > >> > > > > -fcst_init_end to specify beginning/ending model initialization
> > > > > > > > > > > > > > >> > > > > times or -fcst_valid_beg and -fcst_valid_end to specify
> > > > > > > > > > > > > > >> > > > > beginning/ending valid times.
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > >> > > > > If you find that you're running multiple jobs on the same subset
> > > > > > > > > > > > > > >> > > > > of data (e.g. process MPR to CNT, MPR to SL1L2, MPR to CTC, MPR
> > > > > > > > > > > > > > >> > > > > to CTS), it'd be more efficient to group those jobs into a
> > > > > > > > > > > > > > >> > > > > config file.  That'll do the filtering ONCE and write the
> > > > > > > > > > > > > > >> > > > > filtered data to a temp file.  Then all the jobs read data from
> > > > > > > > > > > > > > >> > > > > the temp file instead of starting over from scratch.
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > > Make sense?
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > > John
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > >> > > > > On Mon, Apr 23, 2018 at 1:01 PM, Rosalyn MacCracken - NOAA
> > > > > > > > > > > > > > >> > > > > Affiliate via RT <met_help at ucar.edu> wrote:
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > > >> > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=84822 >
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > > Hi John,
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > > >> > > > > > That's actually only partially correct.  It's not that I want
> > > > > > > > > > > > > > >> > > > > > to use part of the MPR lines and discard the rest, and I do
> > > > > > > > > > > > > > >> > > > > > need to regenerate statistics.  Let me try to re-explain.
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > > >> > > > > > Back in early March we switched from getting our ASCAT obs
> > > > > > > > > > > > > > >> > > > > > from the prepbufr data, to getting it from the MGDRLITE data.
> > > > > > > > > > > > > > >> > > > > > So, processing didn't change.  I was producing statistics at
> > > > > > > > > > > > > > >> > > > > > certain threshold levels for both GFS and ASCAT.  I had this
> > > > > > > > > > > > > > >> > > > > > set with the cat_thresh list, at levels of 0,6,17, etc.  We
> > > > > > > > > > > > > > >> > > > > > found out after processing for a couple of weeks that the
> > > > > > > > > > > > > > >> > > > > > ASCAT data included these really small values, <1.0 m/s, and
> > > > > > > > > > > > > > >> > > > > > that these small wind speeds were being included into the
> > > > > > > > > > > > > > >> > > > > > statistics processing.
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > > >> > > > > > So, a couple of questions.
> > > > > > > > > > > > > > >> > > > > > 1) Do I have to regenerate all of my statistics (*.cts, *.cnt
> > > > > > > > > > > > > > >> > > > > > and *.ctc files) because of this error?  Or, since I have
> > > > > > > > > > > > > > >> > > > > > threshold levels set, will those small values be among the
> > > > > > > > > > > > > > >> > > > > > statistics in the lowest thresholds?
> > > > > > > > > > > > > > >> > > > > > 2) I have the *.stat files, but they are spread out into
> > > > > > > > > > > > > > >> > > > > > separate directories like:
> > > > > > > > > > > > > > >> > > > > > /GFS/data/hourly/${YYYYMMDDHH}/*.stat
> > > > > > > > > > > > > > >> > > > > > Can I tell stat-analysis to "lookin" directories with a
> > > > > > > > > > > > > > >> > > > > > wildcard (like 201803*)?  If so, how?  Or, if I tell it to
> > > > > > > > > > > > > > >> > > > > > look in /GFS/data/hourly, will it look in all the directories
> > > > > > > > > > > > > > >> > > > > > recursively under hourly?  And, if that's the case, can I
> > > > > > > > > > > > > > >> > > > > > give it a date range, so that it only processes data from
> > > > > > > > > > > > > > >> > > > > > March?
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > > Roz
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > > >> > > > > > On Mon, Apr 23, 2018 at 2:18 PM, John Halley Gotway via RT
> > > > > > > > > > > > > > >> > > > > > <met_help at ucar.edu> wrote:
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > > > Hi Roz,
> > > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > > >> > > > > > > I read that you've run Point-Stat and saved off the matched
> > > > > > > > > > > > > > >> > > > > > > pairs (MPR) output line type.  And you'd like to (1) filter
> > > > > > > > > > > > > > >> > > > > > > those MPR lines to discard some of them and then (2) use the
> > > > > > > > > > > > > > >> > > > > > > filtered data to regenerate summary statistics.  Yes, this
> > > > > > > > > > > > > > >> > > > > > > is easily done using the STAT-Analysis tool in MET.
> > > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > > >> > > > > > > You wrote that you're verifying wind speeds against ASCAT
> > > > > > > > > > > > > > >> > > > > > > and that you'd like to exclude pairs where the observed wind
> > > > > > > > > > > > > > >> > > > > > > speed is less than 1 m/s.  I'm just guessing here, but I'll
> > > > > > > > > > > > > > >> > > > > > > presume that you want to produce both SL1L2 and CNT output
> > > > > > > > > > > > > > >> > > > > > > line types.  Here's what the STAT-Analysis job would look
> > > > > > > > > > > > > > >> > > > > > > like:
> > > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > > >> > > > > > > # Filter MPR lines and write SL1L2 output lines.
> > > > > > > > > > > > > > >> > > > > > > #   -lookin         a .stat filename or a directory containing them
> > > > > > > > > > > > > > >> > > > > > > #   -job            job type is aggregate_stat
> > > > > > > > > > > > > > >> > > > > > > #   -line_type      input line type = MPR
> > > > > > > > > > > > > > >> > > > > > > #   -out_line_type  output line type = SL1L2 partial sums
> > > > > > > > > > > > > > >> > > > > > > #   -fcst_var       only process lines where FCST_VAR column = WIND
> > > > > > > > > > > > > > >> > > > > > > #   -column_thresh  only use MPR lines where OBS column > 1
> > > > > > > > > > > > > > >> > > > > > > #   -by             run this same job for each unique combination
> > > > > > > > > > > > > > >> > > > > > > #                   of these columns
> > > > > > > > > > > > > > >> > > > > > > stat_analysis \
> > > > > > > > > > > > > > >> > > > > > >    -lookin input.stat \
> > > > > > > > > > > > > > >> > > > > > >    -job aggregate_stat \
> > > > > > > > > > > > > > >> > > > > > >    -line_type MPR \
> > > > > > > > > > > > > > >> > > > > > >    -out_line_type SL1L2 \
> > > > > > > > > > > > > > >> > > > > > >    -fcst_var WIND \
> > > > > > > > > > > > > > >> > > > > > >    -column_thresh OBS gt1 \
> > > > > > > > > > > > > > >> > > > > > >    -by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS \
> > > > > > > > > > > > > > >> > > > > > >    -out_stat MPR_to_SL1L2.stat
> > > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > > >> > > > > > > This will produce an output .stat file containing an SL1L2
> > > > > > > > > > > > > > >> > > > > > > line for each unique combination of the header columns
> > > > > > > > > > > > > > >> > > > > > > listed after the "-by" option.  To generate CNT output lines
> > > > > > > > > > > > > > >> > > > > > > instead, you'd run a second job where you replace SL1L2 with
> > > > > > > > > > > > > > >> > > > > > > CNT.  You could run these jobs on the command line or group
> > > > > > > > > > > > > > >> > > > > > > them together into a STAT-Analysis config file, if you
> > > > > > > > > > > > > > >> > > > > > > prefer.  Both would work.
> > > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > > >> > > > > > > You could run this once for each input .stat file you're
> > > > > > > > > > > > > > >> > > > > > > processing... or you could pass many input .stat files to
> > > > > > > > > > > > > > >> > > > > > > the job.  Since FCST_INIT_BEG and FCST_LEAD are included in
> > > > > > > > > > > > > > >> > > > > > > the "-by" option, you'll get separate output lines for each
> > > > > > > > > > > > > > >> > > > > > > unique time.
> > > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > > >> > > > > > > Hope that helps get you going.
> > > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > >> > > > > > > Thanks,
> > > > > > > > > > > > > >> > > > > > > John
> > > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > > --
> > > > > > > > > > > > > >> > > > > > Rosalyn MacCracken
> > > > > > > > > > > > > >> > > > > > Support Scientist
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > > Ocean Applications Branch
> > > > > > > > > > > > > >> > > > > > NOAA/NWS Ocean Prediction
Center
> > > > > > > > > > > > > >> > > > > > NCWCP
> > > > > > > > > > > > > >> > > > > > 5830 University Research Ct
> > > > > > > > > > > > > >> > > > > > College Park, MD  20740-3818
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > > (p) 301-683-1551
> > > > > > > > > > > > > >> > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > > > --
> > > > > > > > > > > > > >> > > > Rosalyn MacCracken
> > > > > > > > > > > > > >> > > > Support Scientist
> > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > > > Ocean Applications Branch
> > > > > > > > > > > > > >> > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > > > > > > >> > > > NCWCP
> > > > > > > > > > > > > >> > > > 5830 University Research Ct
> > > > > > > > > > > > > >> > > > College Park, MD  20740-3818
> > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > > > (p) 301-683-1551
> > > > > > > > > > > > > >> > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > --
> > > > > > > > > > > > > >> > Rosalyn MacCracken
> > > > > > > > > > > > > >> > Support Scientist
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > Ocean Applications Branch
> > > > > > > > > > > > > >> > NOAA/NWS Ocean Prediction Center
> > > > > > > > > > > > > >> > NCWCP
> > > > > > > > > > > > > >> > 5830 University Research Ct
> > > > > > > > > > > > > >> > College Park, MD  20740-3818
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > (p) 301-683-1551
> > > > > > > > > > > > > >> > rosalyn.maccracken at noaa.gov
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > Rosalyn MacCracken
> > > > > > > > > > > > > > Support Scientist
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Ocean Applications Branch
> > > > > > > > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > > > > > > > NCWCP
> > > > > > > > > > > > > > 5830 University Research Ct
> > > > > > > > > > > > > > College Park, MD  20740-3818
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > (p) 301-683-1551
> > > > > > > > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > Rosalyn MacCracken
> > > > > > > > > > > > > Support Scientist
> > > > > > > > > > > > >
> > > > > > > > > > > > > Ocean Applications Branch
> > > > > > > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > > > > > > NCWCP
> > > > > > > > > > > > > 5830 University Research Ct
> > > > > > > > > > > > > College Park, MD  20740-3818
> > > > > > > > > > > > >
> > > > > > > > > > > > > (p) 301-683-1551
> > > > > > > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Rosalyn MacCracken
> > > > > > > > > > > Support Scientist
> > > > > > > > > > >
> > > > > > > > > > > Ocean Applications Branch
> > > > > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > > > > NCWCP
> > > > > > > > > > > 5830 University Research Ct
> > > > > > > > > > > College Park, MD  20740-3818
> > > > > > > > > > >
> > > > > > > > > > > (p) 301-683-1551
> > > > > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Rosalyn MacCracken
> > > > > > > > > Support Scientist
> > > > > > > > >
> > > > > > > > > Ocean Applications Branch
> > > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > > NCWCP
> > > > > > > > > 5830 University Research Ct
> > > > > > > > > College Park, MD  20740-3818
> > > > > > > > >
> > > > > > > > > (p) 301-683-1551
> > > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Rosalyn MacCracken
> > > > > > > Support Scientist
> > > > > > >
> > > > > > > Ocean Applications Branch
> > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > NCWCP
> > > > > > > 5830 University Research Ct
> > > > > > > College Park, MD  20740-3818
> > > > > > >
> > > > > > > (p) 301-683-1551
> > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Rosalyn MacCracken
> > > > > Support Scientist
> > > > >
> > > > > Ocean Applications Branch
> > > > > NOAA/NWS Ocean Prediction Center
> > > > > NCWCP
> > > > > 5830 University Research Ct
> > > > > College Park, MD  20740-3818
> > > > >
> > > > > (p) 301-683-1551
> > > > > rosalyn.maccracken at noaa.gov
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> > > --
> > > Rosalyn MacCracken
> > > Support Scientist
> > >
> > > Ocean Applications Branch
> > > NOAA/NWS Ocean Prediction Center
> > > NCWCP
> > > 5830 University Research Ct
> > > College Park, MD  20740-3818
> > >
> > > (p) 301-683-1551
> > > rosalyn.maccracken at noaa.gov
> > >
> > >
> >
> >
>
>
> --
> Rosalyn MacCracken
> Support Scientist
>
> Ocean Applications Branch
> NOAA/NWS Ocean Prediction Center
> NCWCP
> 5830 University Research Ct
> College Park, MD  20740-3818
>
> (p) 301-683-1551
> rosalyn.maccracken at noaa.gov
>
>
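The jobs described in the quoted replies above (filter the MPR lines on the
observed value, restrict to a range of initialization times, and write SL1L2
and CNT lines) can be put together in a small script.  What follows is only a
minimal sketch under stated assumptions, not a tested MET workflow: it assumes
the .stat files sit under /GFS/data/hourly as in Roz's example, that the
affected initializations run from Mar 7 through the end of March, and that the
FCST_VAR column is WIND.  The variable names STAT_DIR, OUT_DIR, INIT_BEG,
INIT_END and RUN, and the build_job helper, are hypothetical and introduced
here for illustration.  By default the script only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch: print the STAT-Analysis commands rather than running them.
# All paths and dates below are assumed placeholders; adjust before use.
STAT_DIR="${STAT_DIR:-/GFS/data/hourly}"   # searched recursively for *.stat
OUT_DIR="${OUT_DIR:-.}"                    # where the new .stat files go
INIT_BEG="${INIT_BEG:-20180307_00}"        # first initialization to include
INIT_END="${INIT_END:-20180331_23}"        # last initialization to include

# Build one aggregate_stat job for the requested output line type.
build_job() {
    out_type="$1"
    echo "stat_analysis -lookin ${STAT_DIR} -job aggregate_stat" \
         "-line_type MPR -out_line_type ${out_type}" \
         "-fcst_var WIND -column_thresh OBS gt1" \
         "-fcst_init_beg ${INIT_BEG} -fcst_init_end ${INIT_END}" \
         "-by MODEL,FCST_LEV,FCST_INIT_BEG,FCST_LEAD,VX_MASK,INTERP_MTHD,INTERP_PNTS" \
         "-out_stat ${OUT_DIR}/MPR_to_${out_type}.stat"
}

# One job per output line type, as suggested in the thread.
for out_type in SL1L2 CNT; do
    cmd=$(build_job "$out_type")
    echo "$cmd"
    # Set RUN=1 in the environment to actually execute the command.
    if [ "${RUN:-0}" = "1" ]; then
        eval "$cmd"
    fi
done
```

Running the script without RUN=1 shows exactly what would be executed, which
makes it easy to check the date range and the OBS threshold before touching
any data.  If the jobs turn out to be slow, the same options can be moved into
a STAT-Analysis config file, as John notes above.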

------------------------------------------------