[Met_help] [rt.rap.ucar.edu #80429] History for stat_analysis aggregate question

John Halley Gotway via RT met_help at ucar.edu
Tue May 16 12:57:58 MDT 2017


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Hi,

I'm using the output from the point_stat tool as input to the stat_analysis
tool.  I would like to aggregate the *cnt.txt files.  I can get the tool to
aggregate, and aggregate_stat the *cts.txt or *ctc.txt files.  I would
really like to use the information in the *cnt.txt files for multiple
times/days.  How do I do that?

Thanks in advance!

Roz

-- 
Rosalyn MacCracken
Support Scientist

Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD  20740-3818

(p) 301-683-1551
rosalyn.maccracken at noaa.gov


----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: stat_analysis aggregate question
From: John Halley Gotway
Time: Fri May 12 12:31:12 2017

Hello Roz,

I see that you have a question about configuring/running STAT-Analysis
jobs.

The "-lookin" command line option is used to tell STAT-Analysis what
input files to read.  You must specify the "-lookin" option at least
once, but can use it as many times as you'd like.

The argument you pass with "-lookin" is either the name of a directory
or an explicit file name.

For an explicit file name, STAT-Analysis will read MET output data from
it **regardless of the file naming convention**.

For a directory name, STAT-Analysis will search **recursively** through
that directory looking for files ending in the ".stat" suffix.

Each time you run grid_stat, point_stat, wavelet_stat, or ensemble_stat,
the tool writes a ".stat" output file (and can also write the optional
text files sorted by line type, such as "_cnt.txt").  That's why
STAT-Analysis searches directories for ".stat" files.  But if you want
it to read the "_cnt.txt" file, you need to specify the file name on
the command line.
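
The lookup rule described above can be sketched in a few lines of
Python.  This is only an illustration of the behavior as stated in this
message, not MET's actual implementation; the helper name
resolve_lookin and the file names in the demo are hypothetical.

```python
from pathlib import Path
import tempfile


def resolve_lookin(paths):
    """Return the files a "-lookin" list would select, per the rule above.

    Illustration only (hypothetical helper, not part of MET).
    """
    selected = []
    for path in map(Path, paths):
        if path.is_dir():
            # Directory: search recursively, keep only files ending in ".stat".
            selected.extend(f for f in path.rglob("*.stat") if f.is_file())
        else:
            # Explicit file name: taken as-is, whatever it is called.
            selected.append(path)
    return selected


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "day1").mkdir()
        # Hypothetical file names, created empty just for the demo.
        stat_file = root / "day1" / "point_stat_run.stat"
        cnt_file = root / "day1" / "point_stat_run_cnt.txt"
        stat_file.write_text("")
        cnt_file.write_text("")

        # Directory argument: only the ".stat" file is picked up.
        print([f.name for f in resolve_lookin([root])])
        # Explicit file argument: the "_cnt.txt" file is selected directly.
        print([f.name for f in resolve_lookin([cnt_file])])
```

Passing the directory and the explicit file together corresponds to
using "-lookin" more than once.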

Make sense?

Just let us know if more issues/questions arise.

Thanks,
John Halley Gotway


------------------------------------------------
Subject: stat_analysis aggregate question
From: Rosalyn MacCracken - NOAA Affiliate
Time: Sat May 13 11:04:47 2017

Hi John,

I finally got it to work.  I had set:
line_type = ["CTC"];

So, I set line_type to nothing [], and everything started working.

So, question.  When using "summary" with -column RMSE set, what does
that mean?  That only the RMSE column is summed, or something else?

Thanks!

Roz

------------------------------------------------
Subject: stat_analysis aggregate question
From: John Halley Gotway
Time: Sat May 13 18:37:01 2017

Roz,

Stat-Analysis can perform a few different "job" types.  One of them is
the "summary" job type (-job summary).  For that job, you pick exactly
one line type and one or more columns of interest.  Stat-Analysis will
apply whatever other filtering criteria you specify and compute summary
information for the column(s) you've selected.  The summary info
includes mean, min, max, and so on.
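
In other words, the selected column is summarized rather than summed.
A minimal Python sketch of the idea, assuming the RMSE values have
already been pulled out of the filtered CNT lines (the numbers and the
helper name summarize are made up for illustration, and the real job
reports more than these few statistics):

```python
import statistics


def summarize(values):
    """Basic summary statistics for one column of numeric values."""
    return {
        "n": len(values),
        "mean": statistics.fmean(values),
        "min": min(values),
        "max": max(values),
    }


if __name__ == "__main__":
    # Hypothetical RMSE values, one per CNT line that passed the filters.
    rmse_values = [1.2, 0.9, 1.5, 1.1]
    print(summarize(rmse_values))
```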

Let me know if there's something specific you're trying to do with
stat-analysis and I may be able to point you in the right direction.

John


------------------------------------------------
Subject: stat_analysis aggregate question
From: Rosalyn MacCracken - NOAA Affiliate
Time: Mon May 15 07:29:22 2017

Hi John,

So, I'm interested in doing a couple of things, and I think I've
figured out how to do some of them.  So, maybe you can tell me how to
do the others.

First, I am mostly interested in the matched points and their
performance.  And, I use a config file, which I call from a script
using the command:

stat_analysis -lookin ${PROCDIR} \
    -out ${PROCDIR}/stat_analysis/stat_analysis.out \
    -config ${CONFIGDIR}/STATAnalysisConfig_working -v 2

I can easily plot spatially where the matched points are located by
their lat/lon, and I can find their differences (FCST - OBS).  Then, I
used the aggregate_stat command to combine my files, so I can plot
histograms or box plots of matched points, either at that forecast
hour, or over the span of my forecast period of interest.  For that, I
use this in my config file:

"-job aggregate_stat -line_type CTC -out_line_type CTS
    -dump_row
/opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job4_aggstat_ctc_cts.stat"

So, some other things that I might be interested in are things that
span the entire period.  Perhaps, more like time series plots, so we
can see how the forecast has done over time.  I don't have a problem
with plotting things from the forecast period, but they usually aren't
very revealing or interesting.  So, some other things are:

1) putting together a file which spans the forecast period and pulls
together information from the SL1L2 file, so I could plot a time
series of the MAE.  So, I was thinking I would use:
"-job aggregate_stat -line_type SL1L2 -out_line_type CNT
    -dump_row
/opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job2_aggstat_slil2_cnt_wind2.stat"


Is that right?

2)  From the CNT files, time series plots of the ANOM_CORR, PR_CORR,
GSS or CSI, and RMSE, and maybe some other things.  I was thinking that
I could do:
"-job summary -fcst_var WIND -line_type CNT -column RMSE
     -dump_row
/opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job7_summary_cnt_rmse.stat"

But, wasn't sure if that was correct.  So, if you could point me to
the right usage, that would be great.

So, I'm also not sure how you get the mean, min, max, etc., for
multiple columns.  I think that the CNT file has the most useful info,
so if you could tell me how to do that, that would be great.  I'm sure
I'll have another list of things I want to do after today's meeting
with Joe, so I'll be back in touch with that list.

Oh, also, that script you wrote, plot_cnt.r on the MET user page.
Does that plot from one of these aggregate_stat or summary commands, or
is that a single CNT file?  If it's an aggregate_stat or summary
command, what command did you use and what was in the "stat_list" that
you used?  I'm sure it was a variety of columns from the CNT file,
right?

Thanks for your help!

Roz


------------------------------------------------
Subject: stat_analysis aggregate question
From: John Halley Gotway
Time: Mon May 15 09:51:17 2017

Roz,

I'm glad you've been able to make progress using STAT-Analysis.

Let me mention a few things that you may find useful...

(1) As you've already seen, STAT-Analysis can be run by defining one
or more jobs in a config file.  Alternatively, you can run a single job
on the command line with no config file.  I find that much quicker and
easier when I'm playing around with things.  It's only once I've
defined a fixed set of jobs that I move them into a config file.

(2) By default, STAT-Analysis writes its output to the screen.  Use the
"-out_stat" job option to redirect the job output to a .stat output
file.  That will include the full set of header columns and should be
pretty easy for a plotting script to parse.

(3) It sounds like you're interested primarily in matched pairs, i.e.
the MPR line type.  I assume that's what you're plotting in your
histograms and boxplots.  If you really just want to "filter" the .stat
files, I'd suggest using the "filter" job to do so:
   -job filter -line_type MPR -dump_row filter_mpr.stat [[[ additional filtering criteria ]]]

You mentioned an aggregate_stat job to read CTC and write CTS... but
that doesn't have anything to do with the MPR line type.  So I'm
confused as to why that's getting you what you want?

(4) I see that you want a time series of MAE values.  I think you're
on the right track, but I'd suggest using the "-by" option:

"-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat
cnt_time_series.stat -fcst_lead 24 -by FCST_INIT_BEG"

The job listed above would produce a time series of continuous
statistics for the 24-hour lead time for each initialization time
present.  You should be able to use the job command options to define
the time series in any way you want.
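
To make the idea concrete, here is a rough Python sketch of what
grouping by initialization time and turning SL1L2 lines into continuous
statistics amounts to.  It assumes the SL1L2 partial sums (FBAR, OBAR,
FOBAR, FFBAR, OOBAR, MAE) are means over TOTAL matched pairs and can be
combined as TOTAL-weighted averages, with ME = FBAR - OBAR and
MSE = FFBAR + OOBAR - 2*FOBAR.  The function names and sample numbers
are hypothetical, and STAT-Analysis itself writes many more CNT columns
than the few shown here.

```python
import math
from collections import defaultdict

PARTIAL_SUMS = ("FBAR", "OBAR", "FOBAR", "FFBAR", "OOBAR", "MAE")


def aggregate_sl1l2(rows):
    """Combine SL1L2 rows, weighting each row by its TOTAL pair count."""
    total = sum(r["TOTAL"] for r in rows)
    agg = {k: sum(r[k] * r["TOTAL"] for r in rows) / total for k in PARTIAL_SUMS}
    mse = agg["FFBAR"] + agg["OOBAR"] - 2.0 * agg["FOBAR"]
    return {
        "TOTAL": total,
        "ME": agg["FBAR"] - agg["OBAR"],
        "MAE": agg["MAE"],
        "MSE": mse,
        # Guard against a tiny negative MSE caused by rounding.
        "RMSE": math.sqrt(max(mse, 0.0)),
    }


def time_series(rows, by="FCST_INIT_BEG"):
    """Group rows by one header column and aggregate each group separately."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[by]].append(r)
    return {key: aggregate_sl1l2(groups[key]) for key in sorted(groups)}


if __name__ == "__main__":
    # Made-up SL1L2 values for two initialization times.
    rows = [
        {"FCST_INIT_BEG": "20170511_000000", "TOTAL": 2, "FBAR": 2.0, "OBAR": 1.0,
         "FOBAR": 3.0, "FFBAR": 5.0, "OOBAR": 2.0, "MAE": 1.0},
        {"FCST_INIT_BEG": "20170511_000000", "TOTAL": 1, "FBAR": 2.0, "OBAR": 4.0,
         "FOBAR": 8.0, "FFBAR": 4.0, "OOBAR": 16.0, "MAE": 2.0},
        {"FCST_INIT_BEG": "20170512_000000", "TOTAL": 2, "FBAR": 2.0, "OBAR": 1.0,
         "FOBAR": 3.0, "FFBAR": 5.0, "OOBAR": 2.0, "MAE": 1.0},
    ]
    for init_time, stats in time_series(rows).items():
        print(init_time, stats)
```

Each key in the result corresponds to one point of the time series; a
different "-by" column would simply change the grouping key.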

(5) When running a summary job, if you want to summarize multiple
columns, just use the "-column" option multiple times to include
them... or specify "-column" as a comma-separated list:

"-job summary -fcst_var WIND -line_type CNT -column RMSE,MAE,ME,MSE"

(6) The "plot_cnt.R" script on the website is outdated since its
header columns haven't been updated since version 3.0.  But that same
script is included in the MET release and has been updated:
   met-6.0/scripts/Rscripts/plot_cnt.R

It reads the CNT line type from a .stat file, a _cnt.txt file, or the
output of a stat-analysis filter job.  I don't know specifically what
stat-analysis command I used, but it'd be something like:

   stat_analysis -job filter -line_type CNT -dump_row cnt_filter.txt [[[ additional filtering criteria ]]]
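
For anyone writing their own plotting script instead, reading such a
file might look roughly like the Python sketch below.  It assumes the
file is whitespace-delimited with a single header row that names every
column of interest, which may not hold for every kind of MET output.
The sample text is a trimmed-down, made-up stand-in (real files carry
many more columns), and read_rows is a hypothetical helper.

```python
import io

# Made-up, heavily trimmed sample; real MET output has many more columns.
SAMPLE = """\
FCST_LEAD FCST_VAR LINE_TYPE RMSE
240000    WIND     CNT       1.25
480000    WIND     CNT       1.75
"""


def read_rows(handle, line_type="CNT"):
    """Parse a header row plus data rows, keeping rows of one line type."""
    header = None
    rows = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if header is None:
            header = fields  # first non-blank line names the columns
            continue
        row = dict(zip(header, fields))
        if row.get("LINE_TYPE") == line_type:
            rows.append(row)
    return rows


if __name__ == "__main__":
    for row in read_rows(io.StringIO(SAMPLE)):
        print(row["FCST_LEAD"], float(row["RMSE"]))
```

With a real file, the StringIO object would be replaced by an opened
file handle.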

Hope that helps.

Thanks,
John


------------------------------------------------
Subject: stat_analysis aggregate question
From: John Halley Gotway
Time: Mon May 15 09:58:17 2017

Roz,

OK, I just updated the website version of those scripts to match the
6.0 version.

John

On Mon, May 15, 2017 at 9:50 AM, John Halley Gotway <johnhg at ucar.edu>
wrote:

> Roz,
>
> I'm glad you've been able to make progress using STAT-Analysis.
>
> Let me mention a few things that you may find useful...
>
> (1) As you've already seen, STAT-Analysis can be run by defining one
or
> more jobs in a config file.  Alternatively, you can run a single job
on the
> command line with no config file.  I find that much quicker and
easier when
> I'm playing around with things.  It's only once I've defined a fixed
set of
> jobs that I move them into a config file.
>
> (2) By default, STAT-Analysis writes it output to the screen.  Use
the
> "-out_stat" job option to redirect the job output to a .stat output
files.
> That will include the full set of header columns and should be
pretty easy
> for a plotting script to parse.
>
> (3) It sounds like you're interested primarily in matched pairs,
i.e. the
> MPR line type.  I assume that's what you're plotting in your
histograms and
> boxplots.  If you really just want to "filter" the .stat files, I'd
suggest
> using the "filter" job to do so:
>    -job filter -line_type MPR -dump_row filter_mpr.stat [[[
additional
> filtering criteria ]]]
>
> You mentioned an aggregate_stat job to read CTC and write CTS... but
that
> doesn't have anything to do with the MPR line type.  So I'm confused
as to
> why that's getting you what you want?
>
> (4) I see that you want a time series of MAE values.  I think you're
on
> the right track, but I'd suggest using the "-by" option:
>
> "-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat
> cnt_time_series.stat -fcst_lead 24 -by FCST_INIT_BEG"
>
> The job listed above would produce a time series of continuous
statistics
> for the 24-hour lead time for each initialization time present.  You
should
> be able to use the job command options to define the time series in
any way
> you want.
>
> (5) When running a summary job, if you want to summarize multiple
columns,
> just use the "-column" option multiple times to include them... or
specify
> "-column" as a comma-separated list:
>
> "-job summary -fcst_var WIND -line_type CNT -column RMSE,MAE,ME,MSE"
>
> (6) The "plot_cnt.R" script on the website is outdated since it's
header
> columns haven't been updated since version 3.0.  But that same
script is
> included in the MET release and has been updated:
>    met-6.0/scripts/Rscripts/plot_cnt.R
>
> It reads the CNT line type from a .stat file, an _cnt.txt file, or
the
> output of a stat-analysis filter job.  I don't know specifically
what
> stat-analysis command I used, but it'd be something like:
>
>    stat_analysis -job filter -line_type CNT -dump_row cnt_filter.txt
[[[
> additional filtering criteria ]]]
>
> Hope that helps.
>
> Thanks,
> John
>
>
> On Mon, May 15, 2017 at 7:29 AM, Rosalyn MacCracken - NOAA Affiliate
via
> RT <met_help at ucar.edu> wrote:
>
>>
>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=80429 >
>>
>> Hi John,
>>
>> So, I'm interested in doing a couple of things, and I think I've
figured
>> out how to do some of them.  So, maybe you can tell me how to do
the
>> others.
>>
>> First, I am mostly interested in the matched points and their
performance.
>> And, I use a config file, which I call from a script using the
command:
>>
>> stat_analysis -lookin ${PROCDIR} -out
>> ${PROCDIR}/stat_analysis/stat_analysis.out -config
>> ${CONFIGDIR}/STATAnalysisConfig_working -v 2
>>
>> I can easily plot spatially, where the matched points are located
by their
>> lat/lon, and I can find their differences (FCST - OBS).  Then, I
used the
>> aggregate_stat command to combine my files, so, I can plot
histograms or
>> box plots of matched point, either at that forecast hour, or over
the span
>> of my forecast period of interest.  For that, I use this in  my
config
>> file:
>>
>> "-job aggregate_stat -line_type CTC -out_line_type CTS
>>     -dump_row
>> /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job4_aggstat_ctc_cts.stat"
>>
>> So, some other things that I might be interested in are things that
>> span the entire period.  Perhaps more like time series plots, so we can
>> see how the forecast has done over time.  I don't have a problem with
>> plotting things from the forecast period, but they usually aren't very
>> revealing or interesting.  So, some other things are:
>>
>> 1) putting together a file which spans the forecast period which
puts
>> together information from the SL1L2 file, so, I could plot a time
series
>> of
>> the MAE.  So, I was thinking I would use:
>> "-job aggregate_stat -line_type SL1L2 -out_line_type CNT
>>     -dump_row
>> /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job2_aggstat_slil2_cnt_wind2.stat"
>>
>>
>> Is that right?
>>
>> 2)  From the CNT files, time series plots of the ANOM_CORR,
PR_CORR, GSS
>> or
>> CSI, and RMSE, and maybe some other things.  I was thinking that I
could
>> do:
>> "-job summary -fcst_var WIND -line_type CNT -column RMSE
>>      -dump_row
>> /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job7_summary_cnt_rmse.stat"
>>
>> But, wasn't sure if that was correct.  So, if you could point me to
the
>> right usage, that would be great.
>>
>> So, I'm also not sure how you get the mean, min, max, etc, for
multiple
>> columns.  I think that the CNT file has the most useful info, so,
if you
>> could tell me how to do that, that would be great.  I'm sure I'll
have
>> another list of things I want to do after todays' meeting with Joe,
so,
>> I'll be back in touch with that list.
>>
>> Oh, also, that script you wrote, plot_cnt.r on the MET user page.
Does
>> that plot from one of these aggregate_stat or summary commands, or
is that
>> a single CNT file?  If it's an aggregate_stat or summary commands,
what
>> command did you use and what was in the "stat_list" that you used?
I'm
>> sure it was a variety of columns from the CNT file, right?
>>
>> Thanks for your help!
>>
>> Roz
>>
>>
>> On Sun, May 14, 2017 at 12:37 AM, John Halley Gotway via RT <
>> met_help at ucar.edu> wrote:
>>
>> > Roz,
>> >
>> > Stat-Analysis can perform a few different "job" types.  One of
them is
>> the
>> > "summary" job type (-job summary).  For that job, you pick
exactly one
>> line
>> > type and one or more columns of interest.  Stat-Analysis will
apply
>> > whatever other filtering criteria you specify and compute summary
>> > information for the column(s) you've selected.  The summary info
>> includes
>> > mean, min, max, and so on.
>> >
>> > Let me know if there's something specific you're trying to do
with
>> > stat-analysis and I may be able to point you in the right
direction.
>> >
>> > John
>> >
>> >
>> > On Sat, May 13, 2017 at 11:04 AM Rosalyn MacCracken - NOAA
Affiliate
>> via RT
>> > <met_help at ucar.edu> wrote:
>> >
>> > >
>> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=80429 >
>> > >
>> > > Hi John,
>> > >
>> > > I finally got it to work.  I had set:
>> > > line_type = ["CTC"];
>> > >
>> > > So, I set line_type to nothing [], and everything started
working.
>> > >
>> > > So, question.  When using "summary" with -column RMSE set, what
does
>> that
>> > > mean?  That only the RMSE column is summed, or something else?
>> > >
>> > > Thanks!
>> > >
>> > > Roz
>> > >
>> > > On Fri, May 12, 2017 at 2:31 PM, John Halley Gotway via RT <
>> > > met_help at ucar.edu> wrote:
>> > >
>> > > > Hello Roz,
>> > > >
>> > > > I see that you have a question about configuring/running
>> STAT-Analysis
>> > > > jobs.
>> > > >
>> > > > The "-lookin" command line option is used to tell STAT-
Analysis what
>> > > input
>> > > > files to read.  You must specify the "-lookin" option at
least once,
>> > but
>> > > > can use it as many times as you'd like.
>> > > >
>> > > > The argument you pass with "-lookin" is either the name of a
>> directory
>> > or
>> > > > explicit file name.
>> > > >
>> > > > For an explicit file name, STAT-Analysis will read MET output
data
>> from
>> > > it
>> > > > **regardless of the file naming convention**.
>> > > >
>> > > > For a directory name, STAT-Analysis will search
**recursively**
>> through
>> > > > that directory looking for files ending in the ".stat"
suffix.
>> > > >
>> > > > Each time you run grid_stat, point_stat, wavelet_stat, or
>> > ensemble_stat,
>> > > > the tool writes a ".stat" output file (and can also write the
>> optional
>> > > text
>> > > > files sorted by line type... such as "_cnt.txt).  That's why
>> > > STAT-Analysis
>> > > > searches directories for ".stat" files.  But if you want it
to read
>> the
>> > > > "_cnt.txt" file, you need to specify the file name on the
command
>> line.
>> > > >
>> > > > Make sense?
>> > > >
>> > > > Just let us know if more issues/questions arise.
>> > > >
>> > > > Thanks,
>> > > > John Halley Gotway
>> > > >
>> > > >
>> > > > On Thu, May 11, 2017 at 2:37 PM, Rosalyn MacCracken - NOAA
Affiliate
>> > via
>> > > RT
>> > > > <met_help at ucar.edu> wrote:
>> > > >
>> > > > >
>> > > > > Thu May 11 14:37:29 2017: Request 80429 was acted upon.
>> > > > > Transaction: Ticket created by rosalyn.maccracken at noaa.gov
>> > > > >        Queue: met_help
>> > > > >      Subject: stat_analysis aggregate question
>> > > > >        Owner: Nobody
>> > > > >   Requestors: rosalyn.maccracken at noaa.gov
>> > > > >       Status: new
>> > > > >  Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=80429 >
>> > > > >
>> > > > >
>> > > > > Hi,
>> > > > >
>> > > > > I'm using the output from the point_stat tool as input to the
>> > > > > stat_analysis tool.  I would like to aggregate the *cnt.txt files.
>> > > > > I can get the tool to aggregate, and aggregate_stat the *cts.txt or
>> > > > > *ctc.txt files.  I would really like to use the information in the
>> > > > > *cnt.txt files for multiple times/days.  How do I do that?
>> > > > >
>> > > > > Thanks in advance!
>> > > > >
>> > > > > Roz
>> > > > >
>> > > > > --
>> > > > > Rosalyn MacCracken
>> > > > > Support Scientist
>> > > > >
>> > > > > Ocean Applications Branch
>> > > > > NOAA/NWS Ocean Prediction Center
>> > > > > NCWCP
>> > > > > 5830 University Research Ct
>> > > > > College Park, MD  20740-3818
>> > > > >
>> > > > > (p) 301-683-1551
>> > > > > rosalyn.maccracken at noaa.gov
>> > > > >
>> > > > >
>> > > >
>> > > >
>> > >
>> > >
>> > > --
>> > > Rosalyn MacCracken
>> > > Support Scientist
>> > >
>> > > Ocean Applications Branch
>> > > NOAA/NWS Ocean Prediction Center
>> > > NCWCP
>> > > 5830 University Research Ct
>> > > College Park, MD  20740-3818
>> > >
>> > > (p) 301-683-1551
>> > > rosalyn.maccracken at noaa.gov
>> > >
>> > >
>> >
>> >
>>
>>
>> --
>> Rosalyn MacCracken
>> Support Scientist
>>
>> Ocean Applications Branch
>> NOAA/NWS Ocean Prediction Center
>> NCWCP
>> 5830 University Research Ct
>> College Park, MD  20740-3818
>>
>> (p) 301-683-1551
>> rosalyn.maccracken at noaa.gov
>>
>>
>
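
The "-lookin" rules quoted above (a directory is searched recursively for
files ending in ".stat", while an explicitly named file is read whatever it
is called) can be illustrated with a short Python sketch.  This only mimics
the behavior described in the thread and is not MET's actual code; the
function name resolve_lookin and the file names are made up for the example.

```python
import os
import shutil
import tempfile


def resolve_lookin(paths):
    """Expand -lookin style arguments into a list of input files.

    Hypothetical helper mirroring the behavior described in the thread:
    a directory is searched recursively for files ending in ".stat",
    while an explicit file name is used regardless of its name.
    """
    files = []
    for path in paths:
        if os.path.isdir(path):
            for root, _dirs, names in os.walk(path):
                for name in sorted(names):
                    if name.endswith(".stat"):
                        files.append(os.path.join(root, name))
        else:
            # Explicit file name: taken as-is, whatever its suffix.
            files.append(path)
    return files


# Build a throwaway directory tree with made-up file names.
demo_dir = tempfile.mkdtemp()
sub_dir = os.path.join(demo_dir, "day1")
os.makedirs(sub_dir)
stat_path = os.path.join(sub_dir, "point_stat_demo.stat")
cnt_path = os.path.join(sub_dir, "point_stat_demo_cnt.txt")
for p in (stat_path, cnt_path):
    open(p, "w").close()

# Passing the directory finds only the .stat file.
from_dir = resolve_lookin([demo_dir])
# Passing the _cnt.txt file by name includes it explicitly.
from_file = resolve_lookin([cnt_path])

print(from_dir)
print(from_file)

shutil.rmtree(demo_dir)
```

This is why a "_cnt.txt" file has to be named explicitly on the command line
if it is to be read at all.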

------------------------------------------------
Subject: stat_analysis aggregate question
From: Rosalyn MacCracken - NOAA Affiliate
Time: Mon May 15 10:19:30 2017

Hi John,

Thanks for the info! I've actually been going back and forth between the
command line and the config file.  When I get something to work on the
command line, I stick it in the config file.  But thanks for that
suggestion anyway.

I'll try the suggestions and see if they give me the output I'm after.
Also, I think reading CTC and writing CTS followed an example you had in
the slides.  I thought it might get me the output I wanted, but really had
no idea.  Between "-column" and "-out_stat", I think I might get what I
want.

BTW, I did see your other email about the R programming.  I'll check out
the updated version.

I'll be back in touch after I try some of the things you suggested.
Thanks again!!

Roz

On Mon, May 15, 2017 at 3:51 PM, John Halley Gotway via RT <
met_help at ucar.edu> wrote:

> Roz,
>
> I'm glad you've been able to make progress using STAT-Analysis.
>
> Let me mention a few things that you may find useful...
>
> (1) As you've already seen, STAT-Analysis can be run by defining one or
> more jobs in a config file.  Alternatively, you can run a single job on
> the command line with no config file.  I find that much quicker and
> easier when I'm playing around with things.  It's only once I've defined
> a fixed set of jobs that I move them into a config file.
>
> (2) By default, STAT-Analysis writes its output to the screen.  Use the
> "-out_stat" job option to redirect the job output to a .stat output
> file.  That will include the full set of header columns and should be
> pretty easy for a plotting script to parse.
>
> (3) It sounds like you're interested primarily in matched pairs, i.e. the
> MPR line type.  I assume that's what you're plotting in your histograms
> and boxplots.  If you really just want to "filter" the .stat files, I'd
> suggest using the "filter" job to do so:
>    -job filter -line_type MPR -dump_row filter_mpr.stat
>    [[[ additional filtering criteria ]]]
>
> You mentioned an aggregate_stat job to read CTC and write CTS... but that
> doesn't have anything to do with the MPR line type.  So I'm confused as
> to why that's getting you what you want?
>
> (4) I see that you want a time series of MAE values.  I think you're on
> the right track, but I'd suggest using the "-by" option:
>
> "-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat
> cnt_time_series.stat -fcst_lead 24 -by FCST_INIT_BEG"
>
> The job listed above would produce a time series of continuous statistics
> for the 24-hour lead time for each initialization time present.  You
> should be able to use the job command options to define the time series
> in any way you want.
>
> (5) When running a summary job, if you want to summarize multiple
> columns, just use the "-column" option multiple times to include them...
> or specify "-column" as a comma-separated list:
>
> "-job summary -fcst_var WIND -line_type CNT -column RMSE,MAE,ME,MSE"
>
> (6) The "plot_cnt.R" script on the website is outdated since its header
> columns haven't been updated since version 3.0.  But that same script is
> included in the MET release and has been updated:
>    met-6.0/scripts/Rscripts/plot_cnt.R
>
> It reads the CNT line type from a .stat file, an _cnt.txt file, or the
> output of a stat-analysis filter job.  I don't know specifically what
> stat-analysis command I used, but it'd be something like:
>
>    stat_analysis -job filter -line_type CNT -dump_row cnt_filter.txt
>    [[[ additional filtering criteria ]]]
>
> Hope that helps.
>
> Thanks,
> John
>
>
> On Mon, May 15, 2017 at 7:29 AM, Rosalyn MacCracken - NOAA Affiliate
via RT
> <met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=80429 >
> >
> > Hi John,
> >
> > So, I'm interested in doing a couple of things, and I think I've
figured
> > out how to do some of them.  So, maybe you can tell me how to do
the
> > others.
> >
> > First, I am mostly interested in the matched points and their
> performance.
> > And, I use a config file, which I call from a script using the
command:
> >
> > stat_analysis -lookin ${PROCDIR} -out
> > ${PROCDIR}/stat_analysis/stat_analysis.out -config
> > ${CONFIGDIR}/STATAnalysisConfig_working -v 2
> >
> > I can easily plot spatially, where the matched points are located
by
> their
> > lat/lon, and I can find their differences (FCST - OBS).  Then, I
used the
> > aggregate_stat command to combine my files, so, I can plot
histograms or
> > box plots of matched point, either at that forecast hour, or over
the
> span
> > of my forecast period of interest.  For that, I use this in  my
config
> > file:
> >
> > "-job aggregate_stat -line_type CTC -out_line_type CTS
> >     -dump_row
> > /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job4_aggstat_ctc_cts.stat"
> >
> > So, some other things that I might be interested in are things that
> > span the entire period.  Perhaps more like time series plots, so we can
> > see how the forecast has done over time.  I don't have a problem with
> > plotting things from the forecast period, but they usually aren't very
> > revealing or interesting.  So, some other things are:
> >
> > 1) putting together a file which spans the forecast period which
puts
> > together information from the SL1L2 file, so, I could plot a time
series
> of
> > the MAE.  So, I was thinking I would use:
> > "-job aggregate_stat -line_type SL1L2 -out_line_type CNT
> >     -dump_row
> > /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job2_aggstat_slil2_cnt_wind2.stat"
> >
> >
> > Is that right?
> >
> > 2)  From the CNT files, time series plots of the ANOM_CORR,
PR_CORR, GSS
> or
> > CSI, and RMSE, and maybe some other things.  I was thinking that I
could
> > do:
> > "-job summary -fcst_var WIND -line_type CNT -column RMSE
> >      -dump_row
> > /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job7_summary_cnt_rmse.stat"
> >
> > But, wasn't sure if that was correct.  So, if you could point me
to the
> > right usage, that would be great.
> >
> > So, I'm also not sure how you get the mean, min, max, etc, for
multiple
> > columns.  I think that the CNT file has the most useful info, so,
if you
> > could tell me how to do that, that would be great.  I'm sure I'll
have
> > another list of things I want to do after todays' meeting with
Joe, so,
> > I'll be back in touch with that list.
> >
> > Oh, also, that script you wrote, plot_cnt.r on the MET user page.
Does
> > that plot from one of these aggregate_stat or summary commands, or
is
> that
> > a single CNT file?  If it's an aggregate_stat or summary commands,
what
> > command did you use and what was in the "stat_list" that you used?
I'm
> > sure it was a variety of columns from the CNT file, right?
> >
> > Thanks for your help!
> >
> > Roz
> >
> >
> > On Sun, May 14, 2017 at 12:37 AM, John Halley Gotway via RT <
> > met_help at ucar.edu> wrote:
> >
> > > Roz,
> > >
> > > Stat-Analysis can perform a few different "job" types.  One of
them is
> > the
> > > "summary" job type (-job summary).  For that job, you pick
exactly one
> > line
> > > type and one or more columns of interest.  Stat-Analysis will
apply
> > > whatever other filtering criteria you specify and compute
summary
> > > information for the column(s) you've selected.  The summary info
> includes
> > > mean, min, max, and so on.
> > >
> > > Let me know if there's something specific you're trying to do
with
> > > stat-analysis and I may be able to point you in the right
direction.
> > >
> > > John
> > >
> > >
> > > On Sat, May 13, 2017 at 11:04 AM Rosalyn MacCracken - NOAA
Affiliate
> via
> > RT
> > > <met_help at ucar.edu> wrote:
> > >
> > > >
> > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=80429 >
> > > >
> > > > Hi John,
> > > >
> > > > I finally got it to work.  I had set:
> > > > line_type = ["CTC"];
> > > >
> > > > So, I set line_type to nothing [], and everything started
working.
> > > >
> > > > So, question.  When using "summary" with -column RMSE set,
what does
> > that
> > > > mean?  That only the RMSE column is summed, or something else?
> > > >
> > > > Thanks!
> > > >
> > > > Roz
> > > >
> > > > On Fri, May 12, 2017 at 2:31 PM, John Halley Gotway via RT <
> > > > met_help at ucar.edu> wrote:
> > > >
> > > > > Hello Roz,
> > > > >
> > > > > I see that you have a question about configuring/running
> > STAT-Analysis
> > > > > jobs.
> > > > >
> > > > > The "-lookin" command line option is used to tell STAT-
Analysis
> what
> > > > input
> > > > > files to read.  You must specify the "-lookin" option at
least
> once,
> > > but
> > > > > can use it as many times as you'd like.
> > > > >
> > > > > The argument you pass with "-lookin" is either the name of a
> > directory
> > > or
> > > > > explicit file name.
> > > > >
> > > > > For an explicit file name, STAT-Analysis will read MET
output data
> > from
> > > > it
> > > > > **regardless of the file naming convention**.
> > > > >
> > > > > For a directory name, STAT-Analysis will search
**recursively**
> > through
> > > > > that directory looking for files ending in the ".stat"
suffix.
> > > > >
> > > > > Each time you run grid_stat, point_stat, wavelet_stat, or
> > > ensemble_stat,
> > > > > the tool writes a ".stat" output file (and can also write
the
> > optional
> > > > text
> > > > > files sorted by line type... such as "_cnt.txt).  That's why
> > > > STAT-Analysis
> > > > > searches directories for ".stat" files.  But if you want it
to read
> > the
> > > > > "_cnt.txt" file, you need to specify the file name on the
command
> > line.
> > > > >
> > > > > Make sense?
> > > > >
> > > > > Just let us know if more issues/questions arise.
> > > > >
> > > > > Thanks,
> > > > > John Halley Gotway
> > > > >
> > > > >
> > > > > On Thu, May 11, 2017 at 2:37 PM, Rosalyn MacCracken - NOAA
> Affiliate
> > > via
> > > > RT
> > > > > <met_help at ucar.edu> wrote:
> > > > >
> > > > > >
> > > > > > Thu May 11 14:37:29 2017: Request 80429 was acted upon.
> > > > > > Transaction: Ticket created by rosalyn.maccracken at noaa.gov
> > > > > >        Queue: met_help
> > > > > >      Subject: stat_analysis aggregate question
> > > > > >        Owner: Nobody
> > > > > >   Requestors: rosalyn.maccracken at noaa.gov
> > > > > >       Status: new
> > > > > >  Ticket <URL: https://rt.rap.ucar.edu/rt/
> > > Ticket/Display.html?id=80429
> > > > >
> > > > > >
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > > I'm using the output from the point_stat tool as input to the
> > > > > > > stat_analysis tool.  I would like to aggregate the *cnt.txt
> > > > > > > files.  I can get the tool to aggregate, and aggregate_stat the
> > > > > > > *cts.txt or *ctc.txt files.  I would really like to use the
> > > > > > > information in the *cnt.txt files for multiple times/days.  How
> > > > > > > do I do that?
> > > > > >
> > > > > > Thanks in advance!
> > > > > >
> > > > > > Roz
> > > > > >
> > > > > > --
> > > > > > Rosalyn MacCracken
> > > > > > Support Scientist
> > > > > >
> > > > > > > Ocean Applications Branch
> > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > NCWCP
> > > > > > 5830 University Research Ct
> > > > > > College Park, MD  20740-3818
> > > > > >
> > > > > > (p) 301-683-1551
> > > > > > rosalyn.maccracken at noaa.gov
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Rosalyn MacCracken
> > > > Support Scientist
> > > >
> > > > > Ocean Applications Branch
> > > > NOAA/NWS Ocean Prediction Center
> > > > NCWCP
> > > > 5830 University Research Ct
> > > > College Park, MD  20740-3818
> > > >
> > > > (p) 301-683-1551
> > > > rosalyn.maccracken at noaa.gov
> > > >
> > > >
> > >
> > >
> >
> >
> > --
> > Rosalyn MacCracken
> > Support Scientist
> >
> > > Ocean Applications Branch
> > NOAA/NWS Ocean Prediction Center
> > NCWCP
> > 5830 University Research Ct
> > College Park, MD  20740-3818
> >
> > (p) 301-683-1551
> > rosalyn.maccracken at noaa.gov
> >
> >
>
>


--
Rosalyn MacCracken
Support Scientist

Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD  20740-3818

(p) 301-683-1551
rosalyn.maccracken at noaa.gov
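
The grouping idea behind point (4) in the quoted reply above, aggregating
SL1L2 lines and writing one CNT line per "-by" value, can be sketched in
Python.  This is a simplified illustration and not the stat_analysis
implementation: it assumes the SL1L2 columns are TOTAL, FBAR, OBAR, FOBAR,
FFBAR, OOBAR and MAE, combines them as TOTAL-weighted means, and derives
only ME, MAE, MSE and RMSE.  The records and the function name aggregate_by
are made up.

```python
import math
from collections import defaultdict

# Made-up SL1L2-style records.  Each value is a mean over TOTAL matched pairs.
records = [
    {"FCST_INIT_BEG": "20170501_000000", "TOTAL": 10, "FBAR": 5.0, "OBAR": 4.0,
     "FOBAR": 21.0, "FFBAR": 26.0, "OOBAR": 17.0, "MAE": 1.0},
    {"FCST_INIT_BEG": "20170501_000000", "TOTAL": 30, "FBAR": 7.0, "OBAR": 4.0,
     "FOBAR": 29.0, "FFBAR": 50.0, "OOBAR": 17.0, "MAE": 3.0},
    {"FCST_INIT_BEG": "20170502_000000", "TOTAL": 20, "FBAR": 6.0, "OBAR": 4.0,
     "FOBAR": 25.0, "FFBAR": 37.0, "OOBAR": 17.0, "MAE": 2.0},
]

MEAN_COLUMNS = ("FBAR", "OBAR", "FOBAR", "FFBAR", "OOBAR", "MAE")


def aggregate_by(rows, key):
    """Group rows by `key`, pool the partial sums, derive a few statistics."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    result = {}
    for value, members in sorted(groups.items()):
        total = sum(m["TOTAL"] for m in members)
        # Pool each mean, weighting by the number of pairs behind it.
        pooled = {
            col: sum(m[col] * m["TOTAL"] for m in members) / total
            for col in MEAN_COLUMNS
        }
        mse = pooled["FFBAR"] - 2.0 * pooled["FOBAR"] + pooled["OOBAR"]
        result[value] = {
            "TOTAL": total,
            "ME": pooled["FBAR"] - pooled["OBAR"],
            "MAE": pooled["MAE"],
            "MSE": mse,
            "RMSE": math.sqrt(max(mse, 0.0)),
        }
    return result


# One entry per initialization time, i.e. one point of the time series.
series = aggregate_by(records, "FCST_INIT_BEG")
for init_time, stats in series.items():
    print(init_time, stats)
```

Swapping the grouping key changes how the time series is defined, which is
the role the "-by" option plays in the job command.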

------------------------------------------------
Subject: stat_analysis aggregate question
From: Rosalyn MacCracken - NOAA Affiliate
Time: Mon May 15 12:33:56 2017

Hi John,

I'm having a few problems, and I'm sure they are pretty simple to solve.
First, I was looking at the "-job summary" suggestion.  So, I did:

stat_analysis -lookin /opc/save/Rosalyn.MacCracken/met_out/master_gfs
"-job summary -fcst_var WIND -line_type CNT -column RMSE,MAE,ME,MSE
-out_stat /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/colum_multivars.stat"
-v 2

and it only wrote to the screen, not to the -out_stat file specified.  So,
how do I fix that?

Next, I can't get your suggestion of:

"-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat
cnt_time_series.stat -fcst_lead 24 -by FCST_INIT_BEG"

to work, because I have no forecast files.  So, I made a small dataset to
work with, which only includes match-ups of prepbufr-ascat and GFS at
forecast times 00z, 06z, 12z and 18z.  I don't have any forecast files
associated with the GFS, only what matches the time stamp on the prepbufr
ascat data.  So, how do you get data so that you can use the -fcst_lead
option, etc?  Is this like matching observation valid time with files such
as gfs.tHHz.grb2f01, gfs.tHHz.grb2f02, gfs.tHHz.grb2f03, etc?  In other
words, in my prepbufr file, ascat data is collected throughout the 6 hour
period when the file is valid.  So, if it's valid at 00z, data is
collected from 3 hours before 00z to 3 hours after 00z, and each
observation is stamped with the precise time it was collected.
Technically, I could separate that out to match a one hour forecast
(gfs.tHHz.grb2f01), or a 2 hour forecast (gfs.tHHz.grb2f02), etc.

So, do I need to also generate those matchups in order to use that
-fcst_lead option?  Or, is there a better way to generate the data that is
needed for that?

Roz


--
Rosalyn MacCracken
Support Scientist

Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD  20740-3818

(p) 301-683-1551
rosalyn.maccracken at noaa.gov

------------------------------------------------
Subject: stat_analysis aggregate question
From: John Halley Gotway
Time: Mon May 15 14:07:57 2017

Roz,

The output of the "summary" job is not a .stat line type.  There is no
"SUMMARY" line type produced by other MET tools.  That's why you don't
get any output using the "-out_stat" option.  However, you can use the
"-out" option to redirect the output to an ASCII file.

I realize this is confusing... the "-out" option has existed for a long
time.  We only recently added the "-out_stat" option for the output of
the "aggregate" and "aggregate_stat" job types, which write true STAT
lines to the output.
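
A minimal sketch of the summary job with its output redirected through
"-out".  LOOKIN_DIR and OUT_FILE are placeholder names, not the actual
paths from this ticket.  The script prints the command and only runs it
where stat_analysis is on the PATH:

```shell
#!/bin/sh
# Placeholder locations (assumptions for illustration only).
LOOKIN_DIR="${LOOKIN_DIR:-met_out/master_gfs}"
OUT_FILE="${OUT_FILE:-summary_cnt.txt}"

# The summary job from this thread, without -out_stat: the summary
# table is not a STAT line type, so -out_stat has nothing to write.
JOB="-job summary -fcst_var WIND -line_type CNT -column RMSE,MAE,ME,MSE"

# -out is a tool-level option given next to -lookin; it redirects the
# job output from the screen to an ASCII file.
CMD="stat_analysis -lookin $LOOKIN_DIR -out $OUT_FILE $JOB -v 2"
echo "$CMD"

# Run only where MET is installed (paths must not contain spaces,
# since $CMD is split on whitespace here).
if command -v stat_analysis >/dev/null 2>&1; then
    $CMD
fi
```

Setting LOOKIN_DIR and OUT_FILE in the environment before calling the
script overrides the placeholders.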

On to the next issue.  It's fine that you're not evaluating forecast
lead times... in fact that makes the logic of defining a time series
much easier.  Just use "-by FCST_VALID_BEG" instead:

"-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat cnt_time_series.stat -by FCST_VALID_BEG"
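
With "-by FCST_VALID_BEG" the aggregation is done separately for each
distinct valid time found in the input, which is what gives one CNT line
per time.  A rough sketch of running that job and pulling a plottable
series out of the result follows.  LOOKIN_DIR is a placeholder, and
extract_series is a hypothetical helper that assumes the -out_stat file
starts with a header row naming the FCST_VALID_BEG and MAE columns;
check that against your own output before relying on it.

```shell
#!/bin/sh
# Placeholder locations (assumptions for illustration only).
LOOKIN_DIR="${LOOKIN_DIR:-met_out/master_gfs}"
OUT_STAT="${OUT_STAT:-cnt_time_series.stat}"

# The time series job from this thread: one CNT line per valid time.
JOB="-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat $OUT_STAT -by FCST_VALID_BEG"
CMD="stat_analysis -lookin $LOOKIN_DIR $JOB -v 2"
echo "$CMD"

# Run only where MET is installed.
if command -v stat_analysis >/dev/null 2>&1; then
    $CMD
fi

# Hypothetical helper: print "valid_time value" pairs for one column.
# Usage: extract_series FILE COLUMN_NAME
# Assumes whitespace-delimited columns with names on the first row.
extract_series() {
    awk -v want="$2" '
        NR == 1 {
            # Locate the time column and the requested statistic by name.
            for (i = 1; i <= NF; i++) {
                if ($i == "FCST_VALID_BEG") t = i
                if ($i == want) c = i
            }
            next
        }
        # Print only when both columns were found in the header.
        t && c { print $t, $c }
    ' "$1"
}
```

For example, "extract_series cnt_time_series.stat MAE" would print one
line per valid time, ready for a plotting script to read.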

Thanks,
John


------------------------------------------------
Subject: stat_analysis aggregate question
From: Rosalyn MacCracken - NOAA Affiliate
Time: Tue May 16 09:34:10 2017

Hi John,

Ok, I was able to get those things working!  I couldn't get the summary
job to run in the config file and output that small table to the ASCII
file, but I could use it on the command line with no issues.  Eventually
I'll run this in an automated script, so I tested it with my script, and
it runs great and outputs what I want.  So, it looks like I'm off to a
good start now.

Thanks for all your help!

Roz


--
Rosalyn MacCracken
Support Scientist

Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD  20740-3818

(p) 301-683-1551
rosalyn.maccracken at noaa.gov

------------------------------------------------
Subject: stat_analysis aggregate question
From: John Halley Gotway
Time: Tue May 16 09:54:31 2017

Roz,

Great, glad to hear you've made progress.

Let me clarify one nuance about the config file which *may* be the
reason why your summary job didn't work via the config file.

You'll notice that the config file has two sections.  The "filtering"
section at the top contains at least one option for each of the 22
header columns of the .stat output files.  The "jobs" section at the
bottom defines the analysis job you want to perform.

The logic works like this...

- STAT-Analysis reads all the input files defined using the "-lookin"
  option.
- It applies *all* of the filtering options defined in the top section
  and writes the filtered .stat data to an output temp file.
- Each job defined in the "jobs" section reads data from that temp
  file, applies any additional filtering criteria you've defined, and
  then performs the job on the data that remains.

Therefore, the settings defined in the "filtering" section are
effectively applied to every job you define in the "jobs" section.

Perhaps your "filtering" options at the top of your config file have
already filtered out the line type you're processing in the summary
job?  If so, just move that option out of the filtering section and
down to the jobs section, where you'll specify it separately for each
job (e.g. -line_type CNT).
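
The logic and the pitfall described above can be sketched in a few lines
of Python.  This is only a toy model, not MET code: the dictionaries
stand in for .stat lines, and the names "matches" and "run_jobs" are
made up for illustration.  It assumes that an empty filter list means
"no filtering on that column".

```python
# Toy model of the two-stage filtering in STAT-Analysis (illustrative only).

def matches(line, criteria):
    """True if the line passes every criterion; an empty list means 'no filter'."""
    return all(not allowed or line[col] in allowed
               for col, allowed in criteria.items())

def run_jobs(stat_lines, config_filters, jobs):
    # Stage 1: the "filtering" section is applied once, up front.
    temp = [ln for ln in stat_lines if matches(ln, config_filters)]
    # Stage 2: each job filters the already-reduced set with its own options.
    return {name: [ln for ln in temp if matches(ln, job_filters)]
            for name, job_filters in jobs.items()}

stat_lines = [
    {"LINE_TYPE": "CTC", "FCST_VAR": "WIND"},
    {"LINE_TYPE": "CNT", "FCST_VAR": "WIND"},
    {"LINE_TYPE": "SL1L2", "FCST_VAR": "WIND"},
]
jobs = {"summary": {"LINE_TYPE": ["CNT"]}}  # stands in for "-line_type CNT"

# Filtering section keeps only CTC: the CNT line never reaches the job.
blocked = run_jobs(stat_lines, {"LINE_TYPE": ["CTC"]}, jobs)
# Filtering section left empty: the job's own line type option does the work.
fixed = run_jobs(stat_lines, {"LINE_TYPE": []}, jobs)

print(len(blocked["summary"]), len(fixed["summary"]))
```

In the first call the summary job has nothing left to summarize; in the
second call it sees the CNT line.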

The intent of this design is to enable STAT-Analysis to run more
efficiently.  Rather than re-parsing *ALL* the input lines for each
job, it does some first-order filtering so that the jobs run on a
smaller number of lines.
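
As a back-of-the-envelope illustration of that saving, here is a small
Python count of how many lines get read with and without the up-front
filter.  The function name and all of the numbers are hypothetical; it
is a rough model of the idea, not a measurement of MET.

```python
def lines_examined(n_input, n_kept, n_jobs, prefilter):
    """Rough count of line reads; every input here is a made-up number."""
    if prefilter:
        # One pass over all input, then each job reads only the kept lines.
        return n_input + n_jobs * n_kept
    # Without the up-front filter, every job reads every input line.
    return n_jobs * n_input

with_prefilter = lines_examined(100_000, 5_000, 4, prefilter=True)
without_prefilter = lines_examined(100_000, 5_000, 4, prefilter=False)
print(with_prefilter, without_prefilter)
```

The gap grows with the number of jobs and shrinks as the filtering
section lets more lines through.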

Hope this helps clarify.

Thanks,
John




------------------------------------------------
Subject: stat_analysis aggregate question
From: Rosalyn MacCracken - NOAA Affiliate
Time: Tue May 16 09:57:52 2017

Hi John,

Thanks for the clarification.  That certainly helps.  I'll take a look
at what I'm filtering to see if things are fighting each other.

But, at any rate, at least I got things working!

Roz

On Tue, May 16, 2017 at 3:54 PM, John Halley Gotway via RT <
met_help at ucar.edu> wrote:

> Roz,
>
> Great, glad to hear you've made progress.
>
> Let me clarify one nuance about the config file which *may* be the
reason
> why your summary job didn't work via the config file.
>
> You'll notice that the config file has two sections.  The
"filtering"
> section at the top contains at least one option for each of the 22
header
> columns of the .stat output files.  The "jobs" section at the bottom
> defines the analysis job you want to perform.
>
> The logic works like this...
>
> - STAT-Analysis reads all the input files defined using the "-
lookin"
> option.
> - It applies *all* of the filtering options defined in the top
section and
> writes the filtered .stat data to an output temp file.
> - Each job defined in the "jobs" section, reads data from that temp
file,
> applies any additional filtering criteria you've defined, and then
performs
> the job on the data that remains.
>
> Therefore, the settings defined in the "filtering" section are
effectively
> applied to every job you define in the "jobs" section.
>
> Perhaps, your "filtering" options at the top of your config file
have
> already filtered out the line type you're processing in the summary
job?
> If so, just move that option out of the filtering section and down
to the
> jobs section where you'll specify it separately for each job (e.g.
> -line_type CNT).
>
> The intent of this design is to enable STAT-Analysis to run more
> efficiently.  Rather than having it re-parse *ALL* the input lines
for each
> job, do some first order filtering to run jobs on a smaller number
of
> lines.
>
> Hope this helps clarify.
>
> Thanks,
> John
>
>
>
> On Tue, May 16, 2017 at 9:34 AM, Rosalyn MacCracken - NOAA Affiliate
via RT
> <met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=80429 >
> >
> > Hi John,
> >
> > Ok, I was able to get those things working!  I couldn't get the
summary
> job
> > to run in the config file and output that small table to the
ascii, but,
> I
> > could use it on the command line with no issues.  So, eventually,
I'll
> run
> > this in an automated script, so, I tested it with my script, and
it runs
> > great and outputs what I want.  So, it looks like I'm off to a
good start
> > now.
> >
> > Thanks for all your help!
> >
> > Roz
> >
> > On Mon, May 15, 2017 at 8:07 PM, John Halley Gotway via RT <
> > met_help at ucar.edu> wrote:
> >
> > > Roz,
> > >
> > > The output of the "summary" job is not a .stat line type.  There
is no
> > > "SUMMARY" line type produced by other MET tools.  That's why you
don't
> > get
> > > any output using the "-out_stat" option.  However, you can use
the
> "-out"
> > > option to redirect the output to an ASCII file.
> > >
> > > I realize this is confusing... the "-out" option has existed for
a long
> > > time.  We only recently added the "-out_stat" option for output
the
> > > "aggregate" and "aggregate_stat" job types, which write true
STAT lines
> > to
> > > the output.
> > >
> > > On to the next issue.  It's fine that you're not evaluating
forecast
> lead
> > > times... in fact that makes the logic of defining a time series
much
> > > easier.  Just use "-by FCST_VALID_BEG" instead:
> > >
> > > "-job aggregate_stat -line_type SL1L2 -out_line_type CNT
-out_stat
> > > cnt_time_series.stat -by FCST_VALID_BEG"
> > >
> > > Thanks,
> > > John
> > >
> > > On Mon, May 15, 2017 at 12:33 PM, Rosalyn MacCracken - NOAA
Affiliate
> via
> > > RT <met_help at ucar.edu> wrote:
> > >
> > > >
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=80429
>
> > > >
> > > > Hi John,
> > > >
> > > > I'm having a few problems, and I'm sure they are pretty simple
to
> > solve.
> > > > First, I was looking at the "-job summary" suggestion.  So, I
did:
> > > >
> > > > stat_analysis -lookin /opc/save/Rosalyn.MacCracken/
> met_out/master_gfs
> > > > "-job
> > > > summary -fcst_var WIND -line_type CNT -column RMSE,MAE,ME,MSE
> -out_stat
> > > > /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/colum_
> > multivars.stat"
> > > > -v
> > > > 2
> > > >
> > > > and,  only wrote to the screen, not the -out_stat file
specified.
> So,
> > > how
> > > > do I fix that?
> > > >
> > > > Next, I can't get your suggestion of:
> > > >
> > > > "-job aggregate_stat -line_type SL1L2 -out_line_type CNT
-out_stat
> > > > cnt_time_series.stat -fcst_lead 24 -by FCST_INIT_BEG"
> > > >
> > > > to work, because I have no forecast files.  So, I made a small
> dataset
> > to
> > > > work with, which only includes match-ups of prepbufr-ascat and
GFS at
> > > > forecast times 00z, 06z, 12z and 18z.  I don't have any
forecast
> files
> > > > associated with the GFS, only what matches the time stamp on
the
> > prepbufr
> > > > ascat data.  So, how do you get data so that you can use the
> -fcst_lead
> > > > option, etc?  Is this like matching observation valid time
with files
> > > such
> > > > as gfs.tHHz.grb2f01, gfs.tHHz.grb2f02, gfs.tHHz.grb2f03, etc?
In
> other
> > > > words, in my prepbufr file, ascat data is collected throughout
the 6
> > hour
> > > > period when the file is valid.  So, if it's valid at 00z,
there is -3
> > > hours
> > > > before 00z, and +3 hours after 00z that data is collected and
stamped
> > for
> > > > when the data was precisely collected. Technically, I could
separate
> > that
> > > > out to match a one hour forecast (gfs.tHHz.grb2f01), or a 2
hour
> > forecast
> > > > (gfs.tHHz.grb2f02), etc.
> > > >
> > > > So, do I need to also generate those matchups in order to use
that
> > > > -fcst_lead option?  Or, is there a better way to generate the
data
> that
> > > is
> > > > needed for that?
> > > >
> > > > Roz
> > > >
> > > > On Mon, May 15, 2017 at 3:51 PM, John Halley Gotway via RT <
> > > > met_help at ucar.edu> wrote:
> > > >
> > > > > Roz,
> > > > >
> > > > > I'm glad you've been able to make progress using STAT-
Analysis.
> > > > >
> > > > > Let me mention a few things that you may find useful...
> > > > >
> > > > > (1) As you've already seen, STAT-Analysis can be run by
defining
> one
> > or
> > > > > more jobs in a config file.  Alternatively, you can run a
single
> job
> > on
> > > > the
> > > > > command line with no config file.  I find that much quicker
and
> > easier
> > > > when
> > > > > I'm playing around with things.  It's only once I've defined
a
> fixed
> > > set
> > > > of
> > > > > jobs that I move them into a config file.
> > > > >
> > > > > (2) By default, STAT-Analysis writes it output to the
screen.  Use
> > the
> > > > > "-out_stat" job option to redirect the job output to a .stat
output
> > > > files.
> > > > > That will include the full set of header columns and should
be
> pretty
> > > > easy
> > > > > for a plotting script to parse.
> > > > >
> > > > > (3) It sounds like you're interested primarily in matched
pairs,
> i.e.
> > > the
> > > > > MPR line type.  I assume that's what you're plotting in your
> > histograms
> > > > and
> > > > > boxplots.  If you really just want to "filter" the .stat
files, I'd
> > > > suggest
> > > > > using the "filter" job to do so:
> > > > >    -job filter -line_type MPR -dump_row filter_mpr.stat [[[
> > additional
> > > > > filtering criteria ]]]
> > > > >
> > > > > You mentioned an aggregate_stat job to read CTC and write
CTS...
> but
> > > that
> > > > > doesn't have anything to do with the MPR line type.  So I'm
> confused
> > as
> > > > to
> > > > > why that's getting you what you want?
> > > > >
> > > > > (4) I see that you want a time series of MAE values.  I
think
> you're
> > on
> > > > the
> > > > > right track, but I'd suggest using the "-by" option:
> > > > >
> > > > > "-job aggregate_stat -line_type SL1L2 -out_line_type CNT
-out_stat
> > > > > cnt_time_series.stat -fcst_lead 24 -by FCST_INIT_BEG"
> > > > >
> > > > > The job listed above would produce a time series of
continuous
> > > statistics
> > > > > for the 24-hour lead time for each initialization time
present.
> You
> > > > should
> > > > > be able to use the job command options to define the time
series in
> > any
> > > > way
> > > > > you want.
> > > > >
> > > > > (5) When running a summary job, if you want to summarize
multiple
> > > > columns,
> > > > > just use the "-column" option multiple times to include
them... or
> > > > specify
> > > > > "-column" as a comma-separated list:
> > > > >
> > > > > "-job summary -fcst_var WIND -line_type CNT -column
> RMSE,MAE,ME,MSE"
> > > > >
> > > > > (6) The "plot_cnt.R" script on the website is outdated since
it's
> > > header
> > > > > columns haven't been updated since version 3.0.  But that
same
> script
> > > is
> > > > > included in the MET release and has been updated:
> > > > >    met-6.0/scripts/Rscripts/plot_cnt.R
> > > > >
> > > > > It reads the CNT line type from a .stat file, an _cnt.txt
file, or
> > the
> > > > > output of a stat-analysis filter job.  I don't know
specifically
> what
> > > > > stat-analysis command I used, but it'd be something like:
> > > > >
> > > > >    stat_analysis -job filter -line_type CNT -dump_row
> cnt_filter.txt
> > > [[[
> > > > > additional filtering criteria ]]]
> > > > >
> > > > > Hope that helps.
> > > > >
> > > > > Thanks,
> > > > > John
> > > > >
> > > > >
> > > > > On Mon, May 15, 2017 at 7:29 AM, Rosalyn MacCracken - NOAA
> Affiliate
> > > via
> > > > RT
> > > > > <met_help at ucar.edu> wrote:
> > > > >
> > > > > >
> > > > > > <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=80429 >
> > > > > >
> > > > > > Hi John,
> > > > > >
> > > > > > So, I'm interested in doing a couple of things, and I
think I've
> > > > figured
> > > > > > out how to do some of them.  So, maybe you can tell me how
to do
> > the
> > > > > > others.
> > > > > >
> > > > > > First, I am mostly interested in the matched points and
their
> > > > > performance.
> > > > > > And, I use a config file, which I call from a script using
the
> > > command:
> > > > > >
> > > > > > stat_analysis -lookin ${PROCDIR} -out
> > > > > > ${PROCDIR}/stat_analysis/stat_analysis.out -config
> > > > > > ${CONFIGDIR}/STATAnalysisConfig_working -v 2
> > > > > >
> > > > > > I can easily plot spatially, where the matched points are
located
> > by
> > > > > their
> > > > > > lat/lon, and I can find their differences (FCST - OBS).
Then, I
> > used
> > > > the
> > > > > > aggregate_stat command to combine my files, so, I can plot
> > histograms
> > > > or
> > > > > > box plots of matched point, either at that forecast hour,
or over
> > the
> > > > > span
> > > > > > of my forecast period of interest.  For that, I use this
in  my
> > > config
> > > > > > file:
> > > > > >
> > > > > > "-job aggregate_stat -line_type CTC -out_line_type CTS
> > > > > >     -dump_row
> > > > > > /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job4_
> > > > > > aggstat_ctc_cts.stat"
> > > > > >
> > > > > > So, some other things that I might be interested in things
that
> > span
> > > > the
> > > > > > entire period.  Perhaps, more like times series plots, so
we can
> > see
> > > > how
> > > > > > the forecast has done over time.  I don't have a problem
with
> > > plotting
> > > > > > things from the forecast period, but, they usually aren't
very
> > > > revealing
> > > > > or
> > > > > > interesting.  So, some other things are:
> > > > > >
> > > > > > 1) putting together a file which spans the forecast period
which
> > puts
> > > > > > together information from the SL1L2 file, so, I could plot
a time
> > > > series
> > > > > of
> > > > > > the MAE.  So, I was thinking I would use:
> > > > > > "-job aggregate_stat -line_type SL1L2 -out_line_type CNT
> > > > > >     -dump_row
> > > > > > /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job2_
> > > > > > aggstat_slil2_cnt_wind2.stat"
> > > > > >
> > > > > >
> > > > > > Is that right?
> > > > > >
> > > > > > 2)  From the CNT files, time series plots of the
ANOM_CORR,
> > PR_CORR,
> > > > GSS
> > > > > or
> > > > > > CSI, and RMSE, and maybe some other things.  I was
thinking that
> I
> > > > could
> > > > > > do:
> > > > > > "-job summary -fcst_var WIND -line_type CNT -column RMSE
> > > > > >      -dump_row
> > > > > > /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job7_
> > > > > > summary_cnt_rmse.stat"
> > > > > >
> > > > > > But, wasn't sure if that was correct.  So, if you could
point me
> to
> > > the
> > > > > > right usage, that would be great.
> > > > > >
> > > > > > So, I'm also not sure how you get the mean, min, max, etc,
for
> > > multiple
> > > > > > columns.  I think that the CNT file has the most useful
info, so,
> > if
> > > > you
> > > > > > could tell me how to do that, that would be great.  I'm
sure I'll
> > > have
> > > > > > > another list of things I want to do after today's meeting
with
> Joe,
> > > so,
> > > > > > I'll be back in touch with that list.
> > > > > >
> > > > > > Oh, also, that script you wrote, plot_cnt.r on the MET
user page.
> > > Does
> > > > > > that plot from one of these aggregate_stat or summary
commands,
> or
> > is
> > > > > that
> > > > > > a single CNT file?  If it's an aggregate_stat or summary
> commands,
> > > what
> > > > > > command did you use and what was in the "stat_list" that
you
> used?
> > > I'm
> > > > > > sure it was a variety of columns from the CNT file, right?
> > > > > >
> > > > > > > Thanks for your help!
> > > > > >
> > > > > > Roz
> > > > > >
> > > > > >
> > > > > > On Sun, May 14, 2017 at 12:37 AM, John Halley Gotway via
RT <
> > > > > > met_help at ucar.edu> wrote:
> > > > > >
> > > > > > > Roz,
> > > > > > >
> > > > > > > Stat-Analysis can perform a few different "job" types.
One of
> > them
> > > > is
> > > > > > the
> > > > > > > "summary" job type (-job summary).  For that job, you
pick
> > exactly
> > > > one
> > > > > > line
> > > > > > > type and one or more columns of interest.  Stat-Analysis
will
> > apply
> > > > > > > whatever other filtering criteria you specify and
compute
> summary
> > > > > > > information for the column(s) you've selected.  The
summary
> info
> > > > > includes
> > > > > > > mean, min, max, and so on.
> > > > > > >
> > > > > > > Let me know if there's something specific you're trying
to do
> > with
> > > > > > > stat-analysis and I may be able to point you in the
right
> > > direction.
> > > > > > >
> > > > > > > John
> > > > > > >
> > > > > > >
> > > > > > > On Sat, May 13, 2017 at 11:04 AM Rosalyn MacCracken -
NOAA
> > > Affiliate
> > > > > via
> > > > > > RT
> > > > > > > <met_help at ucar.edu> wrote:
> > > > > > >
> > > > > > > >
> > > > > > > > <URL: https://rt.rap.ucar.edu/rt/
> Ticket/Display.html?id=80429
> > >
> > > > > > > >
> > > > > > > > Hi John,
> > > > > > > >
> > > > > > > > I finally got it to work.  I had set:
> > > > > > > > line_type = ["CTC"];
> > > > > > > >
> > > > > > > > So, I set line_type to nothing [], and everything
started
> > > working.
> > > > > > > >
> > > > > > > > So, question.  When using "summary" with -column RMSE
set,
> what
> > > > does
> > > > > > that
> > > > > > > > mean?  That only the RMSE column is summed, or
something
> else?
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > > Roz
> > > > > > > >
> > > > > > > > On Fri, May 12, 2017 at 2:31 PM, John Halley Gotway
via RT <
> > > > > > > > met_help at ucar.edu> wrote:
> > > > > > > >
> > > > > > > > > Hello Roz,
> > > > > > > > >
> > > > > > > > > I see that you have a question about
configuring/running
> > > > > > STAT-Analysis
> > > > > > > > > jobs.
> > > > > > > > >
> > > > > > > > > The "-lookin" command line option is used to tell
> > STAT-Analysis
> > > > > what
> > > > > > > > input
> > > > > > > > > files to read.  You must specify the "-lookin"
option at
> > least
> > > > > once,
> > > > > > > but
> > > > > > > > > can use it as many times as you'd like.
> > > > > > > > >
> > > > > > > > > The argument you pass with "-lookin" is either the
name of
> a
> > > > > > directory
> > > > > > > or
> > > > > > > > > explicit file name.
> > > > > > > > >
> > > > > > > > > For an explicit file name, STAT-Analysis will read
MET
> output
> > > > data
> > > > > > from
> > > > > > > > it
> > > > > > > > > **regardless of the file naming convention**.
> > > > > > > > >
> > > > > > > > > For a directory name, STAT-Analysis will search
> > **recursively**
> > > > > > through
> > > > > > > > > that directory looking for files ending in the
".stat"
> > suffix.
> > > > > > > > >
> > > > > > > > > Each time you run grid_stat, point_stat,
wavelet_stat, or
> > > > > > > ensemble_stat,
> > > > > > > > > the tool writes a ".stat" output file (and can also
write
> the
> > > > > > optional
> > > > > > > > text
> > > > > > > > > files sorted by line type... such as "_cnt.txt).
That's
> why
> > > > > > > > STAT-Analysis
> > > > > > > > > searches directories for ".stat" files.  But if you
want it
> > to
> > > > read
> > > > > > the
> > > > > > > > > "_cnt.txt" file, you need to specify the file name
on the
> > > command
> > > > > > line.
> > > > > > > > >
> > > > > > > > > Make sense?
> > > > > > > > >
> > > > > > > > > Just let us know if more issues/questions arise.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > John Halley Gotway
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, May 11, 2017 at 2:37 PM, Rosalyn MacCracken
- NOAA
> > > > > Affiliate
> > > > > > > via
> > > > > > > > RT
> > > > > > > > > <met_help at ucar.edu> wrote:
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Thu May 11 14:37:29 2017: Request 80429 was acted
upon.
> > > > > > > > > > Transaction: Ticket created by
> rosalyn.maccracken at noaa.gov
> > > > > > > > > >        Queue: met_help
> > > > > > > > > >      Subject: stat_analysis aggregate question
> > > > > > > > > >        Owner: Nobody
> > > > > > > > > >   Requestors: rosalyn.maccracken at noaa.gov
> > > > > > > > > >       Status: new
> > > > > > > > > >  Ticket <URL: https://rt.rap.ucar.edu/rt/
> > > > > > > Ticket/Display.html?id=80429
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > > I'm using the output from the point-stat tool as
input to
> > the
> > > > > > > > > stat_analysis
> > > > > > > > > > tool.  I would like to aggregate the *cnt.txt
files.  I
> can
> > > get
> > > > > the
> > > > > > > > tool
> > > > > > > > > to
> > > > > > > > > > aggregate, and aggregate_stat the *cts.txt or
*ctc.txt
> > files.
> > > > I
> > > > > > > would
> > > > > > > > > > really like to use the information in the *cnt.txt
files
> > for
> > > > > > multiple
> > > > > > > > > > times/days.  How do I do that?
> > > > > > > > > >
> > > > > > > > > > Thanks in advance!
> > > > > > > > > >
> > > > > > > > > > Roz
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Rosalyn MacCracken
> > > > > > > > > > Support Scientist
> > > > > > > > > >
> > > > > > > > > > > Ocean Applications Branch
> > > > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > > > NCWCP
> > > > > > > > > > 5830 University Research Ct
> > > > > > > > > > College Park, MD  20740-3818
> > > > > > > > > >
> > > > > > > > > > (p) 301-683-1551
> > > > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Rosalyn MacCracken
> > > > > > > > Support Scientist
> > > > > > > >
> > > > > > > > > Ocean Applications Branch
> > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > NCWCP
> > > > > > > > 5830 University Research Ct
> > > > > > > > College Park, MD  20740-3818
> > > > > > > >
> > > > > > > > (p) 301-683-1551
> > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Rosalyn MacCracken
> > > > > > Support Scientist
> > > > > >
> > > > > > > Ocean Applications Branch
> > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > NCWCP
> > > > > > 5830 University Research Ct
> > > > > > College Park, MD  20740-3818
> > > > > >
> > > > > > (p) 301-683-1551
> > > > > > rosalyn.maccracken at noaa.gov
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Rosalyn MacCracken
> > > > Support Scientist
> > > >
> > > > > Ocean Applications Branch
> > > > NOAA/NWS Ocean Prediction Center
> > > > NCWCP
> > > > 5830 University Research Ct
> > > > College Park, MD  20740-3818
> > > >
> > > > (p) 301-683-1551
> > > > rosalyn.maccracken at noaa.gov
> > > >
> > > >
> > >
> > >
> >
> >
> > --
> > Rosalyn MacCracken
> > Support Scientist
> >
> > > Ocean Applications Branch
> > NOAA/NWS Ocean Prediction Center
> > NCWCP
> > 5830 University Research Ct
> > College Park, MD  20740-3818
> >
> > (p) 301-683-1551
> > rosalyn.maccracken at noaa.gov
> >
> >
>
>


--
Rosalyn MacCracken
Support Scientist

Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD  20740-3818

(p) 301-683-1551
rosalyn.maccracken at noaa.gov

------------------------------------------------
Subject: stat_analysis aggregate question
From: Rosalyn MacCracken - NOAA Affiliate
Time: Tue May 16 11:25:46 2017

Oh, wait, one more question:

The command you gave me before:
"-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat
cnt_time_series.stat -by FCST_VALID_BEG"


does summarize variables from CNT for the forecast time period, but it
lumps all the forecast variables together.  So, you'll have a FCST_VAR
column containing UGRD,VGRD and WIND, and then a single RMSE associated
with that.

I tried adding -by FCST_VAR:
"-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat
cnt_time_series.stat -by FCST_VAR -by FCST_VALID_BEG"

but that didn't work.  Any ideas how to associate a single value of,
say, RMSE for the forecast time interval of, say, 2017050100 - 2017050218
with each of the forecast variables (UGRD, VGRD and WIND)?

Thanks,

Roz

--
Rosalyn MacCracken
Support Scientist

Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD  20740-3818

(p) 301-683-1551
rosalyn.maccracken at noaa.gov

------------------------------------------------
Subject: stat_analysis aggregate question
From: John Halley Gotway
Time: Tue May 16 11:53:02 2017

Roz,

Adding "-by FCST_VAR" is exactly what I would recommend doing.  The
question is why that isn't working for you.

By way of demonstration, I ran the following job on the output from
Grid-Stat generated when you run the "make test" command:

cd met-6.0
bin/stat_analysis -lookin out/grid_stat \
   -job aggregate_stat -line_type SL1L2 -out_line_type CNT \
   -by FCST_VALID_BEG -by FCST_VAR,FCST_LEV \
   -out_stat cnt.txt

The resulting cnt.txt file is attached and includes separate output
for 6 different FCST_VAR values.
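
As a rough sketch of how the same grouping might be adapted to the
point_stat case discussed in this thread: dropping "-by FCST_VALID_BEG"
and keeping only "-by FCST_VAR,FCST_LEV" should give one CNT line per
variable over the whole period instead of one per valid time.  The
directory and output file names below are placeholders, not paths from
this ticket, and the command only runs if stat_analysis is on the PATH.

```shell
#!/bin/sh
# Hypothetical locations; replace with your own point_stat output
# directory and desired output file (no spaces in OUT_FILE).
LOOKIN_DIR="${LOOKIN_DIR:-met_out/point_stat}"
OUT_FILE="${OUT_FILE:-cnt_by_var.stat}"

# One aggregate_stat job: read SL1L2 lines, derive CNT statistics,
# and write a separate output line for each FCST_VAR/FCST_LEV pair.
JOB="-job aggregate_stat -line_type SL1L2 -out_line_type CNT"
JOB="$JOB -by FCST_VAR,FCST_LEV -out_stat $OUT_FILE"

# Show the full command so it can be checked before anything runs.
echo "stat_analysis -lookin $LOOKIN_DIR $JOB"

# Run only when the MET tools and the input directory are available.
if command -v stat_analysis >/dev/null 2>&1 && [ -d "$LOOKIN_DIR" ]; then
  # $JOB is left unquoted on purpose so it splits into separate options.
  stat_analysis -lookin "$LOOKIN_DIR" $JOB
fi
```

Adding "-by FCST_VALID_BEG" back to JOB would return to the
per-valid-time series shown in the command above.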

If the behavior you're seeing doesn't match what I've described,
please package up and send me a simple test case which demonstrates
the unexpected behavior.  Once I'm able to replicate the behavior
here, I can figure out why it's happening.

Thanks,
John



On Tue, May 16, 2017 at 11:25 AM, Rosalyn MacCracken - NOAA Affiliate
via RT <met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=80429 >
>
> Oh, wait, one more question:
>
> The command you gave me before:
> "-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat
> cnt_time_series.stat -by FCST_VALID_BEG"
>
>
> does summarize variables from CNT for the forecast time period, but it
> lumps all the forecast variables together.  So, you'll have a FCST_VAR
> column containing UGRD,VGRD and WIND, and then a single RMSE
> associated with that.
>
> I tried adding -by FCST_VAR:
> "-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat
> cnt_time_series.stat -by FCST_VAR -by FCST_VALID_BEG"
>
> but that didn't work.  Any ideas how to associate a single value, of
> say RMSE, for the forecast time interval of say 2017050100 -
> 2017050218, with each of the forecast variables (UGRD, VGRD and
> WIND)?
>
> Thanks,
>
> Roz
>

------------------------------------------------
Subject: stat_analysis aggregate question
From: John Halley Gotway
Time: Tue May 16 11:53:02 2017

VERSION MODEL DESC FCST_LEAD FCST_VALID_BEG FCST_VALID_END OBS_LEAD OBS_VALID_BEG OBS_VALID_END FCST_VAR FCST_LEV OBS_VAR OBS_LEV OBTYPE VX_MASK INTERP_MTHD INTERP_PNTS FCST_THRESH OBS_THRESH COV_THRESH ALPHA LINE_TYPE TOTAL FBAR FBAR_NCL FBAR_NCU FBAR_BCL FBAR_BCU FSTDEV FSTDEV_NCL FSTDEV_NCU FSTDEV_BCL FSTDEV_BCU OBAR OBAR_NCL OBAR_NCU OBAR_BCL OBAR_BCU OSTDEV OSTDEV_NCL OSTDEV_NCU OSTDEV_BCL OSTDEV_BCU PR_CORR PR_CORR_NCL PR_CORR_NCU PR_CORR_BCL PR_CORR_BCU SP_CORR KT_CORR RANKS FRANK_TIES ORANK_TIES ME ME_NCL ME_NCU ME_BCL ME_BCU ESTDEV ESTDEV_NCL ESTDEV_NCU ESTDEV_BCL ESTDEV_BCU MBIAS MBIAS_BCL MBIAS_BCU MAE MAE_BCL MAE_BCU MSE MSE_BCL MSE_BCU BCMSE BCMSE_BCL BCMSE_BCU RMSE RMSE_BCL RMSE_BCU E10 E10_BCL E10_BCU E25 E25_BCL E25_BCU E50 E50_BCL E50_BCU E75 E75_BCL E75_BCU E90 E90_BCL E90_BCU EIQR EIQR_BCL EIQR_BCU MAD MAD_BCL MAD_BCU ANOM_CORR ANOM_CORR_NCL ANOM_CORR_NCU ANOM_CORR_BCL ANOM_CORR_BCU ME2 ME2_BCL ME2_BCU MSESS MSESS_BCL MSESS_BCU
V6.0 WRF NA 120000 20050807_120000 20050807_120000 000000 20050807_120000 20050807_120000 APCP_12 A12 APCP_12 A12 MC_PCP DTC165,DTC166,CONUS,LMV NEAREST 1 NA NA NA 0.05 CNT 12393 1.03988 0.9866 1.09316 NA NA 3.02636 2.98915 3.06451 NA NA 0.71991 0.66983 0.76998 NA NA 2.84435 2.80938 2.88021 NA NA 0.11693 0.099525 0.13426 NA NA NA NA 0 0 0 0.31997 0.25125 0.3887 NA NA 3.90335 3.85536 3.95257 NA NA 1.44447 NA NA 1.37446 NA NA 15.33733 NA NA 15.23495 NA NA 3.91629 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.10238 NA NA -0.89576 NA NA
V6.0 WRF NA 120000 20050807_120000 20050807_120000 000000 20050807_120000 20050807_120000 RH Z2 RH Z2 ANALYS DTC165,DTC166,CONUS,LMV NEAREST 1 NA NA NA 0.05 CNT 18423 73.05976 72.79302 73.3265 NA NA 18.4722 18.2855 18.66277 NA NA 77.61537 77.34607 77.88467 NA NA 18.64944 18.46095 18.84185 NA NA 0.79014 0.78465 0.7955 NA NA NA NA 0 0 0 -4.55561 -4.72927 -4.38196 NA NA 12.026 11.90446 12.15007 NA NA 0.94131 NA NA 9.90544 NA NA 165.37046 NA NA 144.61687 NA NA 12.85964 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 20.75359 NA NA 0.52453 NA NA
V6.0 WRF NA 120000 20050807_120000 20050807_120000 000000 20050807_120000 20050807_120000 TMP Z2 TMP Z2 ANALYS DTC165,DTC166,CONUS,LMV NEAREST 1 NA NA NA 0.05 CNT 18423 293.3424 293.275 293.40979 NA NA 4.66739 4.62022 4.71555 NA NA 292.40927 292.34379 292.47475 NA NA 4.53444 4.48861 4.58122 NA NA 0.89847 0.89565 0.90121 NA NA NA NA 0 0 0 0.93312 0.90313 0.96312 NA NA 2.07735 2.05635 2.09878 NA NA 1.00319 NA NA 1.68946 NA NA 5.18586 NA NA 4.31514 NA NA 2.27725 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.87072 NA NA 0.74778 NA NA
V6.0 WRF NA 120000 20050807_120000 20050807_120000 000000 20050807_120000 20050807_120000 UGRD Z10 UGRD Z10 ANALYS DTC165,DTC166,CONUS,LMV NEAREST 1 NA NA NA 0.05 CNT 18423 0.4412 0.40469 0.47772 NA NA 2.52879 2.50323 2.55488 NA NA 0.45185 0.41149 0.49221 NA NA 2.79517 2.76692 2.82401 NA NA 0.78662 0.78106 0.79207 NA NA NA NA 0 0 0 -0.010646 -0.036018 0.014727 NA NA 1.7571 1.73934 1.77523 NA NA 0.97644 NA NA 1.29472 NA NA 3.08735 NA NA 3.08724 NA NA 1.75709 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.00011333 NA NA 0.60484 NA NA
V6.0 WRF NA 120000 20050807_120000 20050807_120000 000000 20050807_120000 20050807_120000 VGRD Z10 VGRD Z10 ANALYS DTC165,DTC166,CONUS,LMV NEAREST 1 NA NA NA 0.05 CNT 18423 0.16869 0.12807 0.20931 NA NA 2.81304 2.78461 2.84206 NA NA 0.24364 0.19716 0.29013 NA NA 3.21925 3.18672 3.25247 NA NA 0.81331 0.80836 0.81814 NA NA NA NA 0 0 0 -0.074948 -0.10214 -0.047755 NA NA 1.88317 1.86414 1.9026 NA NA 0.69238 NA NA 1.37626 NA NA 3.55175 NA NA 3.54613 NA NA 1.88461 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0056172 NA NA 0.65729 NA NA
V6.0 WRF NA 240000 20050808_000000 20050808_000000 000000 20050808_000000 20050808_000000 APCP_24 A24 APCP_24 A24 MC_PCP DTC165,DTC166,CONUS,LMV NEAREST 1 NA NA NA 0.05 CNT 12345 2.26417 2.16427 2.36407 NA NA 5.66331 5.59354 5.73485 NA NA 2.01666 1.91995 2.11337 NA NA 5.4824 5.41486 5.55166 NA NA 0.19753 0.18052 0.21442 NA NA NA NA 0 0 0 0.24751 0.12295 0.37208 NA NA 7.06144 6.97445 7.15064 NA NA 1.12274 NA NA 2.80419 NA NA 49.92114 NA NA 49.85987 NA NA 7.06549 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.061263 NA NA -0.6609 NA NA
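
With this many columns, it can help to pull out just the ones of
interest by header name.  The following is a minimal sketch, assuming
the file on disk has a single header line and one whitespace-separated
record per line.  The function name stat_cols and the abridged sample
data are made up for illustration; the sample values are copied from
the UGRD and VGRD rows of the attached file.

```shell
#!/bin/sh
# stat_cols FILE COL1,COL2,...
# Print the named columns of a whitespace-delimited .stat-style file,
# looking up each column position from the header line.
stat_cols() {
  awk -v want="$2" '
    NR == 1 {
      n = split(want, names, ",")
      for (i = 1; i <= NF; i++) col[$i] = i   # header name -> position
      for (j = 1; j <= n; j++) {
        if (!(names[j] in col)) {
          print "missing column: " names[j] > "/dev/stderr"
          exit 1
        }
      }
    }
    {
      # Runs for the header line too, so the output keeps a header.
      for (j = 1; j <= n; j++) {
        if (j > 1) printf " "
        printf "%s", $(col[names[j]])
      }
      print ""
    }
  ' "$1"
}

# Abridged sample with only a handful of the real columns.
SAMPLE="$(mktemp)"
cat > "$SAMPLE" <<'EOF'
FCST_VALID_BEG  FCST_VAR FCST_LEV LINE_TYPE TOTAL RMSE
20050807_120000 UGRD     Z10      CNT       18423 1.75709
20050807_120000 VGRD     Z10      CNT       18423 1.88461
EOF

stat_cols "$SAMPLE" FCST_VAR,RMSE
rm -f "$SAMPLE"
```

Looking columns up by name rather than by position means the same
function should keep working if the set of columns differs between
line types or MET versions.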

------------------------------------------------
Subject: stat_analysis aggregate question
From: Rosalyn MacCracken - NOAA Affiliate
Time: Tue May 16 12:08:02 2017

Hi John,

Ok, I get that too.  In my playing around, I overwrote the "wrong"
file, so I can't show you the output, and I don't remember how I
produced it.  Your command above works and produces the right output.
It was just kind of neat that it had a single value for the variables
in the CNT file, over the span of the forecast.  Like a summary, but
with no values of mean, etc.  Who knows what I did to make that, but I
won't worry about it as long as the other command is working.

Ok, thanks again!  I'll be back in touch when I have more questions.

Roz

On Tue, May 16, 2017 at 5:53 PM, John Halley Gotway via RT <
met_help at ucar.edu> wrote:

> Roz,
>
> Adding "-by FCST_VAR" is exactly what I would recommend doing.  The
> question is why isn't that "working".
>
> By way of demonstration, I ran the following job on the output from
> Grid-Stat generated when you run the "make test" command:
>
> cd met-6.0
> bin/stat_analysis -lookin out/grid_stat \
>    -job aggregate_stat -line_type SL1L2 -out_line_type CNT \
>    -by FCST_VALID_BEG -by FCST_VAR,FCST_LEV \
>    -out_stat cnt.txt
>
> The resulting cnt.txt file is attached and includes separate output for
> 6 different FCST_VAR values.
>
> If the behavior you're seeing doesn't match what I've described, please
> package up and send me a simple test case which demonstrates the
> unexpected behavior.  Once I'm able to replicate the behavior here, I can
> figure out why it's happening.
>
> Thanks,
> John
>
>
>
> On Tue, May 16, 2017 at 11:25 AM, Rosalyn MacCracken - NOAA Affiliate via
> RT <met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=80429 >
> >
> > Oh, wait, one more question:
> >
> > The command you gave me before:
> > "-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat
> > cnt_time_series.stat -by FCST_VALID_BEG"
> >
> >
> > does summarize variables from CNT for the forecast time period, but it
> > lumps all the forecast variables together.  So, you'll have a column of
> > FCST_VAR which is UGRD,VGRD and WIND and then a single RMSE associated
> > with that.
> >
> > I tried adding -by FCST_VAR:
> > "-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat
> > cnt_time_series.stat -by FCST_VAR -by FCST_VALID_BEG"
> >
> > but, that didn't work.  Any ideas how to associate a single value, of say
> > RMSE, for the forecast time interval of say 2017050100 - 2017050218, with
> > each of the forecast variables (UGRD, VGRD and WIND)?
> >
> > Thanks,
> >
> > Roz
> >
> > On Tue, May 16, 2017 at 3:54 PM, John Halley Gotway via RT <
> > met_help at ucar.edu> wrote:
> >
> > > Roz,
> > >
> > > Great, glad to hear you've made progress.
> > >
> > > Let me clarify one nuance about the config file which *may* be the
> > > reason why your summary job didn't work via the config file.
> > >
> > > You'll notice that the config file has two sections.  The "filtering"
> > > section at the top contains at least one option for each of the 22
> > > header columns of the .stat output files.  The "jobs" section at the
> > > bottom defines the analysis job you want to perform.
> > >
> > > The logic works like this...
> > >
> > > - STAT-Analysis reads all the input files defined using the "-lookin"
> > > option.
> > > - It applies *all* of the filtering options defined in the top section
> > > and writes the filtered .stat data to an output temp file.
> > > - Each job defined in the "jobs" section reads data from that temp
> > > file, applies any additional filtering criteria you've defined, and
> > > then performs the job on the data that remains.
> > >
> > > Therefore, the settings defined in the "filtering" section are
> > > effectively applied to every job you define in the "jobs" section.
> > >
> > > Perhaps your "filtering" options at the top of your config file have
> > > already filtered out the line type you're processing in the summary
> > > job?  If so, just move that option out of the filtering section and
> > > down to the jobs section where you'll specify it separately for each
> > > job (e.g. -line_type CNT).
> > >
> > > The intent of this design is to enable STAT-Analysis to run more
> > > efficiently.  Rather than having it re-parse *ALL* the input lines for
> > > each job, do some first order filtering to run jobs on a smaller
> > > number of lines.
> > >
> > > Hope this helps clarify.
> > >
> > > Thanks,
> > > John
> > >
> > >
> > >
> > > On Tue, May 16, 2017 at 9:34 AM, Rosalyn MacCracken - NOAA Affiliate
> > > via RT <met_help at ucar.edu> wrote:
> > >
> > > >
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=80429 >
> > > >
> > > > Hi John,
> > > >
> > > > Ok, I was able to get those things working!  I couldn't get the
> > > > summary job to run in the config file and output that small table to
> > > > the ascii, but I could use it on the command line with no issues.
> > > > So, eventually, I'll run this in an automated script, so I tested it
> > > > with my script, and it runs great and outputs what I want.  So, it
> > > > looks like I'm off to a good start now.
> > > >
> > > > Thanks for all your help!
> > > >
> > > > Roz
> > > >
> > > > On Mon, May 15, 2017 at 8:07 PM, John Halley Gotway via RT <
> > > > met_help at ucar.edu> wrote:
> > > >
> > > > > Roz,
> > > > >
> > > > > The output of the "summary" job is not a .stat line type.  There is
> > > > > no "SUMMARY" line type produced by other MET tools.  That's why you
> > > > > don't get any output using the "-out_stat" option.  However, you can
> > > > > use the "-out" option to redirect the output to an ASCII file.
> > > > >
> > > > > I realize this is confusing... the "-out" option has existed for a
> > > > > long time.  We only recently added the "-out_stat" option for the
> > > > > output of the "aggregate" and "aggregate_stat" job types, which write
> > > > > true STAT lines to the output.
> > > > >
> > > > > On to the next issue.  It's fine that you're not evaluating forecast
> > > > > lead times... in fact that makes the logic of defining a time series
> > > > > much easier.  Just use "-by FCST_VALID_BEG" instead:
> > > > >
> > > > > "-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat
> > > > > cnt_time_series.stat -by FCST_VALID_BEG"
> > > > >
> > > > > Thanks,
> > > > > John
> > > > >
> > > > > On Mon, May 15, 2017 at 12:33 PM, Rosalyn MacCracken - NOAA
> > > > > Affiliate via RT <met_help at ucar.edu> wrote:
> > > > >
> > > > > >
> > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=80429 >
> > > > > >
> > > > > > Hi John,
> > > > > >
> > > > > > I'm having a few problems, and I'm sure they are pretty simple to
> > > > > > solve.  First, I was looking at the "-job summary" suggestion.  So,
> > > > > > I did:
> > > > > >
> > > > > > stat_analysis -lookin
> > > > > > /opc/save/Rosalyn.MacCracken/met_out/master_gfs "-job summary
> > > > > > -fcst_var WIND -line_type CNT -column RMSE,MAE,ME,MSE -out_stat
> > > > > > /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/colum_multivars.stat"
> > > > > > -v 2
> > > > > >
> > > > > > and it only wrote to the screen, not to the -out_stat file
> > > > > > specified.  So, how do I fix that?
> > > > > >
> > > > > > Next, I can't get your suggestion of:
> > > > > >
> > > > > > "-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat
> > > > > > cnt_time_series.stat -fcst_lead 24 -by FCST_INIT_BEG"
> > > > > >
> > > > > > to work, because I have no forecast files.  So, I made a small
> > > > > > dataset to work with, which only includes match-ups of
> > > > > > prepbufr-ascat and GFS at forecast times 00z, 06z, 12z and 18z.  I
> > > > > > don't have any forecast files associated with the GFS, only what
> > > > > > matches the time stamp on the prepbufr ascat data.  So, how do you
> > > > > > get data so that you can use the -fcst_lead option, etc?  Is this
> > > > > > like matching observation valid time with files such as
> > > > > > gfs.tHHz.grb2f01, gfs.tHHz.grb2f02, gfs.tHHz.grb2f03, etc?  In
> > > > > > other words, in my prepbufr file, ascat data is collected
> > > > > > throughout the 6 hour period when the file is valid.  So, if it's
> > > > > > valid at 00z, data is collected from 3 hours before 00z to 3 hours
> > > > > > after 00z and stamped with the time it was precisely collected.
> > > > > > Technically, I could separate that out to match a one hour forecast
> > > > > > (gfs.tHHz.grb2f01), or a 2 hour forecast (gfs.tHHz.grb2f02), etc.
> > > > > >
> > > > > > So, do I need to also generate those matchups in order to use that
> > > > > > -fcst_lead option?  Or, is there a better way to generate the data
> > > > > > that is needed for that?
> > > > > >
> > > > > > Roz
> > > > > >
> > > > > > On Mon, May 15, 2017 at 3:51 PM, John Halley Gotway via RT <
> > > > > > met_help at ucar.edu> wrote:
> > > > > >
> > > > > > > Roz,
> > > > > > >
> > > > > > > I'm glad you've been able to make progress using STAT-Analysis.
> > > > > > >
> > > > > > > Let me mention a few things that you may find useful...
> > > > > > >
> > > > > > > (1) As you've already seen, STAT-Analysis can be run by defining
> > > > > > > one or more jobs in a config file.  Alternatively, you can run a
> > > > > > > single job on the command line with no config file.  I find that
> > > > > > > much quicker and easier when I'm playing around with things.  It's
> > > > > > > only once I've defined a fixed set of jobs that I move them into a
> > > > > > > config file.
> > > > > > >
> > > > > > > (2) By default, STAT-Analysis writes its output to the screen.  Use
> > > > > > > the "-out_stat" job option to redirect the job output to a .stat
> > > > > > > output file.  That will include the full set of header columns and
> > > > > > > should be pretty easy for a plotting script to parse.
> > > > > > >
> > > > > > > (3) It sounds like you're interested primarily in matched pairs,
> > > > > > > i.e. the MPR line type.  I assume that's what you're plotting in
> > > > > > > your histograms and boxplots.  If you really just want to "filter"
> > > > > > > the .stat files, I'd suggest using the "filter" job to do so:
> > > > > > >    -job filter -line_type MPR -dump_row filter_mpr.stat [[[
> > > > > > > additional filtering criteria ]]]
> > > > > > >
> > > > > > > You mentioned an aggregate_stat job to read CTC and write CTS...
> > > > > > > but that doesn't have anything to do with the MPR line type.  So
> > > > > > > I'm confused as to why that's getting you what you want?
> > > > > > >
> > > > > > > (4) I see that you want a time series of MAE values.  I think
> > > > > > > you're on the right track, but I'd suggest using the "-by" option:
> > > > > > >
> > > > > > > "-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat
> > > > > > > cnt_time_series.stat -fcst_lead 24 -by FCST_INIT_BEG"
> > > > > > >
> > > > > > > The job listed above would produce a time series of continuous
> > > > > > > statistics for the 24-hour lead time for each initialization time
> > > > > > > present.  You should be able to use the job command options to
> > > > > > > define the time series in any way you want.
> > > > > > >
> > > > > > > (5) When running a summary job, if you want to summarize multiple
> > > > > > > columns, just use the "-column" option multiple times to include
> > > > > > > them... or specify "-column" as a comma-separated list:
> > > > > > >
> > > > > > > "-job summary -fcst_var WIND -line_type CNT -column RMSE,MAE,ME,MSE"
> > > > > > >
> > > > > > > (6) The "plot_cnt.R" script on the website is outdated since its
> > > > > > > header columns haven't been updated since version 3.0.  But that
> > > > > > > same script is included in the MET release and has been updated:
> > > > > > >    met-6.0/scripts/Rscripts/plot_cnt.R
> > > > > > >
> > > > > > > It reads the CNT line type from a .stat file, a _cnt.txt file, or
> > > > > > > the output of a stat-analysis filter job.  I don't know
> > > > > > > specifically what stat-analysis command I used, but it'd be
> > > > > > > something like:
> > > > > > >
> > > > > > >    stat_analysis -job filter -line_type CNT -dump_row cnt_filter.txt
> > > > > > > [[[ additional filtering criteria ]]]
> > > > > > >
> > > > > > > Hope that helps.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > John
> > > > > > >
> > > > > > >
> > > > > > > On Mon, May 15, 2017 at 7:29 AM, Rosalyn MacCracken - NOAA
> > > > > > > Affiliate via RT <met_help at ucar.edu> wrote:
> > > > > > >
> > > > > > > >
> > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=80429 >
> > > > > > > >
> > > > > > > > Hi John,
> > > > > > > >
> > > > > > > > So, I'm interested in doing a couple of things, and I think I've
> > > > > > > > figured out how to do some of them.  So, maybe you can tell me
> > > > > > > > how to do the others.
> > > > > > > >
> > > > > > > > First, I am mostly interested in the matched points and their
> > > > > > > > performance.  And, I use a config file, which I call from a
> > > > > > > > script using the command:
> > > > > > > >
> > > > > > > > stat_analysis -lookin ${PROCDIR} -out
> > > > > > > > ${PROCDIR}/stat_analysis/stat_analysis.out -config
> > > > > > > > ${CONFIGDIR}/STATAnalysisConfig_working -v 2
> > > > > > > >
> > > > > > > > I can easily plot spatially, where the matched points are
> > > > > > > > located by their lat/lon, and I can find their differences
> > > > > > > > (FCST - OBS).  Then, I used the aggregate_stat command to
> > > > > > > > combine my files, so I can plot histograms or box plots of
> > > > > > > > matched points, either at that forecast hour, or over the span
> > > > > > > > of my forecast period of interest.  For that, I use this in my
> > > > > > > > config file:
> > > > > > > >
> > > > > > > > "-job aggregate_stat -line_type CTC -out_line_type CTS
> > > > > > > >     -dump_row
> > > > > > > > /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job4_aggstat_ctc_cts.stat"
> > > > > > > >
> > > > > > > > So, some other things that I might be interested in are things
> > > > > > > > that span the entire period.  Perhaps, more like time series
> > > > > > > > plots, so we can see how the forecast has done over time.  I
> > > > > > > > don't have a problem with plotting things from the forecast
> > > > > > > > period, but they usually aren't very revealing or interesting.
> > > > > > > > So, some other things are:
> > > > > > > >
> > > > > > > > 1) putting together a file which spans the forecast period and
> > > > > > > > puts together information from the SL1L2 file, so I could plot
> > > > > > > > a time series of the MAE.  So, I was thinking I would use:
> > > > > > > > "-job aggregate_stat -line_type SL1L2 -out_line_type CNT
> > > > > > > >     -dump_row
> > > > > > > > /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job2_aggstat_slil2_cnt_wind2.stat"
> > > > > > > >
> > > > > > > >
> > > > > > > > Is that right?
> > > > > > > >
> > > > > > > > 2) From the CNT files, time series plots of the ANOM_CORR,
> > > > > > > > PR_CORR, GSS or CSI, and RMSE, and maybe some other things.  I
> > > > > > > > was thinking that I could do:
> > > > > > > > "-job summary -fcst_var WIND -line_type CNT -column RMSE
> > > > > > > >      -dump_row
> > > > > > > > /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job7_summary_cnt_rmse.stat"
> > > > > > > >
> > > > > > > > But, I wasn't sure if that was correct.  So, if you could point
> > > > > > > > me to the right usage, that would be great.
> > > > > > > >
> > > > > > > > So, I'm also not sure how you get the mean, min, max, etc, for
> > > > > > > > multiple columns.  I think that the CNT file has the most
> > > > > > > > useful info, so if you could tell me how to do that, that would
> > > > > > > > be great.  I'm sure I'll have another list of things I want to
> > > > > > > > do after today's meeting with Joe, so I'll be back in touch
> > > > > > > > with that list.
> > > > > > > >
> > > > > > > > Oh, also, that script you wrote, plot_cnt.r on the MET user
> > > > > > > > page.  Does that plot from one of these aggregate_stat or
> > > > > > > > summary commands, or is that a single CNT file?  If it's an
> > > > > > > > aggregate_stat or summary command, what command did you use and
> > > > > > > > what was in the "stat_list" that you used?  I'm sure it was a
> > > > > > > > variety of columns from the CNT file, right?
> > > > > > > >
> > > > > > > > Thanks for your help!
> > > > > > > >
> > > > > > > > Roz
> > > > > > > >
> > > > > > > >
> > > > > > > > On Sun, May 14, 2017 at 12:37 AM, John Halley Gotway via RT <
> > > > > > > > met_help at ucar.edu> wrote:
> > > > > > > >
> > > > > > > > > Roz,
> > > > > > > > >
> > > > > > > > > Stat-Analysis can perform a few different "job" types.  One of
> > > > > > > > > them is the "summary" job type (-job summary).  For that job,
> > > > > > > > > you pick exactly one line type and one or more columns of
> > > > > > > > > interest.  Stat-Analysis will apply whatever other filtering
> > > > > > > > > criteria you specify and compute summary information for the
> > > > > > > > > column(s) you've selected.  The summary info includes mean,
> > > > > > > > > min, max, and so on.
> > > > > > > > >
> > > > > > > > > Let me know if there's something specific you're trying to do
> > > > > > > > > with stat-analysis and I may be able to point you in the right
> > > > > > > > > direction.
> > > > > > > > >
> > > > > > > > > John
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Sat, May 13, 2017 at 11:04 AM Rosalyn MacCracken - NOAA
> > > > > > > > > Affiliate via RT <met_help at ucar.edu> wrote:
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=80429 >
> > > > > > > > > >
> > > > > > > > > > Hi John,
> > > > > > > > > >
> > > > > > > > > > I finally got it to work.  I had set:
> > > > > > > > > > line_type = ["CTC"];
> > > > > > > > > >
> > > > > > > > > > So, I set line_type to nothing [], and everything started
> > > > > > > > > > working.
> > > > > > > > > >
> > > > > > > > > > So, question.  When using "summary" with -column RMSE set,
> > > > > > > > > > what does that mean?  That only the RMSE column is summed, or
> > > > > > > > > > something else?
> > > > > > > > > >
> > > > > > > > > > Thanks!
> > > > > > > > > >
> > > > > > > > > > Roz
> > > > > > > > > >
> > > > > > > > > > On Fri, May 12, 2017 at 2:31 PM, John Halley Gotway via RT <
> > > > > > > > > > met_help at ucar.edu> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hello Roz,
> > > > > > > > > > >
> > > > > > > > > > > I see that you have a question about configuring/running
> > > > > > > > > > > STAT-Analysis jobs.
> > > > > > > > > > >
> > > > > > > > > > > The "-lookin" command line option is used to tell
> > > > > > > > > > > STAT-Analysis what input files to read.  You must specify the
> > > > > > > > > > > "-lookin" option at least once, but can use it as many times
> > > > > > > > > > > as you'd like.
> > > > > > > > > > >
> > > > > > > > > > > The argument you pass with "-lookin" is either the name of a
> > > > > > > > > > > directory or an explicit file name.
> > > > > > > > > > >
> > > > > > > > > > > For an explicit file name, STAT-Analysis will read MET output
> > > > > > > > > > > data from it **regardless of the file naming convention**.
> > > > > > > > > > >
> > > > > > > > > > > For a directory name, STAT-Analysis will search
> > > > > > > > > > > **recursively** through that directory looking for files
> > > > > > > > > > > ending in the ".stat" suffix.
> > > > > > > > > > >
> > > > > > > > > > > Each time you run grid_stat, point_stat, wavelet_stat, or
> > > > > > > > > > > ensemble_stat, the tool writes a ".stat" output file (and can
> > > > > > > > > > > also write the optional text files sorted by line type...
> > > > > > > > > > > such as "_cnt.txt").  That's why STAT-Analysis searches
> > > > > > > > > > > directories for ".stat" files.  But if you want it to read
> > > > > > > > > > > the "_cnt.txt" file, you need to specify the file name on the
> > > > > > > > > > > command line.
> > > > > > > > > > >
> > > > > > > > > > > Make sense?
> > > > > > > > > > >
> > > > > > > > > > > Just let us know if more issues/questions arise.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > John Halley Gotway
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Thu, May 11, 2017 at 2:37 PM, Rosalyn MacCracken - NOAA
> > > > > > > > > > > Affiliate via RT <met_help at ucar.edu> wrote:
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Thu May 11 14:37:29 2017: Request 80429 was acted upon.
> > > > > > > > > > > > Transaction: Ticket created by rosalyn.maccracken at noaa.gov
> > > > > > > > > > > >        Queue: met_help
> > > > > > > > > > > >      Subject: stat_analysis aggregate question
> > > > > > > > > > > >        Owner: Nobody
> > > > > > > > > > > >   Requestors: rosalyn.maccracken at noaa.gov
> > > > > > > > > > > >       Status: new
> > > > > > > > > > > >  Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=80429 >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Hi,
> > > > > > > > > > > >
> > > > > > > > > > > > I'm using the output from the point_stat tool as input to
> > > > > > > > > > > > the stat_analysis tool.  I would like to aggregate the
> > > > > > > > > > > > *cnt.txt files.  I can get the tool to aggregate, and
> > > > > > > > > > > > aggregate_stat the *cts.txt or *ctc.txt files.  I would
> > > > > > > > > > > > really like to use the information in the *cnt.txt files
> > > > > > > > > > > > for multiple times/days.  How do I do that?
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks in advance!
> > > > > > > > > > > >
> > > > > > > > > > > > Roz
> > > > > > > > > > > >


--
Rosalyn MacCracken
Support Scientist

Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD  20740-3818

(p) 301-683-1551
rosalyn.maccracken at noaa.gov

------------------------------------------------


More information about the Met_help mailing list