[Met_help] [rt.rap.ucar.edu #80429] History for stat_analysis aggregate question
John Halley Gotway via RT
met_help at ucar.edu
Tue May 16 11:28:09 MDT 2017
----------------------------------------------------------------
Initial Request
----------------------------------------------------------------
Hi,
I'm using the output from the point_stat tool as input to the stat_analysis
tool. I would like to aggregate the *cnt.txt files. I can get the tool to
aggregate, and aggregate_stat the *cts.txt or *ctc.txt files. I would
really like to use the information in the *cnt.txt files for multiple
times/days. How do I do that?
Thanks in advance!
Roz
--
Rosalyn MacCracken
Support Scientist
Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD 20740-3818
(p) 301-683-1551
rosalyn.maccracken at noaa.gov
----------------------------------------------------------------
Complete Ticket History
----------------------------------------------------------------
Subject: stat_analysis aggregate question
From: John Halley Gotway
Time: Fri May 12 12:31:12 2017
Hello Roz,
I see that you have a question about configuring/running STAT-Analysis
jobs.

The "-lookin" command line option is used to tell STAT-Analysis what
input files to read. You must specify the "-lookin" option at least
once, but can use it as many times as you'd like.

The argument you pass with "-lookin" is either the name of a directory
or an explicit file name.

For an explicit file name, STAT-Analysis will read MET output data from
it **regardless of the file naming convention**.

For a directory name, STAT-Analysis will search **recursively** through
that directory looking for files ending in the ".stat" suffix.

Each time you run grid_stat, point_stat, wavelet_stat, or ensemble_stat,
the tool writes a ".stat" output file (and can also write the optional
text files sorted by line type, such as "_cnt.txt"). That's why
STAT-Analysis searches directories for ".stat" files. But if you want it
to read the "_cnt.txt" file, you need to specify the file name on the
command line.
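
As a rough sketch of the lookup rule described above (not MET's actual
implementation), the Python function below treats each "-lookin"
argument the same way: an explicit file is kept whatever its name, while
a directory is searched recursively for names ending in ".stat". The
function name resolve_lookin and the file names in the demo are
hypothetical.

```python
import tempfile
from pathlib import Path


def resolve_lookin(args):
    """Return the files a '-lookin'-style rule would select.

    Assumption: this only mirrors the rule stated in the text. Directories
    are searched recursively for '*.stat'; anything else is treated as an
    explicit file name and kept regardless of its naming convention.
    """
    files = []
    for arg in args:
        path = Path(arg)
        if path.is_dir():
            # Recursive search, restricted to the ".stat" suffix.
            files.extend(sorted(p for p in path.rglob("*.stat") if p.is_file()))
        else:
            # Explicit file name: accepted as-is.
            files.append(path)
    return files


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "day1").mkdir()
        (root / "day1" / "run.stat").write_text("")      # hypothetical name
        (root / "day1" / "run_cnt.txt").write_text("")   # hypothetical name
        # Directory argument: only the .stat file is picked up.
        print([p.name for p in resolve_lookin([root])])
        # Explicit argument: the _cnt.txt file is read despite its suffix.
        print([p.name for p in resolve_lookin([root / "day1" / "run_cnt.txt"])])
```

Under this rule a directory holding only "_cnt.txt" files yields
nothing, which is why those files have to be named explicitly.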
Make sense?
Just let us know if more issues/questions arise.
Thanks,
John Halley Gotway
On Thu, May 11, 2017 at 2:37 PM, Rosalyn MacCracken - NOAA Affiliate
via RT
<met_help at ucar.edu> wrote:
>
> Thu May 11 14:37:29 2017: Request 80429 was acted upon.
> Transaction: Ticket created by rosalyn.maccracken at noaa.gov
> Queue: met_help
> Subject: stat_analysis aggregate question
> Owner: Nobody
> Requestors: rosalyn.maccracken at noaa.gov
> Status: new
> Ticket <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=80429 >
------------------------------------------------
Subject: stat_analysis aggregate question
From: Rosalyn MacCracken - NOAA Affiliate
Time: Sat May 13 11:04:47 2017
Hi John,
I finally got it to work. I had set:
line_type = ["CTC"];
So, I set line_type to nothing [], and everything started working.
So, question. When using "summary" with -column RMSE set, what does
that mean? That only the RMSE column is summed, or something else?
Thanks!
Roz
------------------------------------------------
Subject: stat_analysis aggregate question
From: John Halley Gotway
Time: Sat May 13 18:37:01 2017
Roz,
Stat-Analysis can perform a few different "job" types. One of them is
the "summary" job type (-job summary). For that job, you pick exactly
one line type and one or more columns of interest. Stat-Analysis will
apply whatever other filtering criteria you specify and compute summary
information for the column(s) you've selected. The summary info
includes mean, min, max, and so on.
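
To make the idea concrete, here is a minimal Python sketch of what a
summary over selected columns amounts to: keep the rows of one line
type, then report statistics per column. It only covers the mean, min,
and max named above; the function name summarize, the dictionary layout,
and the sample values are all made up for illustration and do not
reproduce Stat-Analysis output.

```python
from statistics import mean


def summarize(rows, line_type, columns):
    """Summarize the chosen columns for rows of a single line type.

    Assumption: each row is a dict keyed by column name, with the line
    type stored under 'LINE_TYPE'. This is an illustrative layout only.
    """
    selected = [row for row in rows if row.get("LINE_TYPE") == line_type]
    summary = {}
    for column in columns:
        values = [float(row[column]) for row in selected if column in row]
        if not values:
            continue  # nothing to summarize for this column
        summary[column] = {
            "n": len(values),
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
        }
    return summary


# Hypothetical sample rows; the numbers are invented for the example.
rows = [
    {"LINE_TYPE": "CNT", "RMSE": "2.0", "MAE": "1.5"},
    {"LINE_TYPE": "CNT", "RMSE": "4.0", "MAE": "2.5"},
    {"LINE_TYPE": "CTC", "TOTAL": "10"},
]
result = summarize(rows, "CNT", ["RMSE", "MAE"])
print(result)
```

So with "-column RMSE" the RMSE values are not added together; each
selected column is described by its own set of summary statistics.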
Let me know if there's something specific you're trying to do with
stat-analysis and I may be able to point you in the right direction.
John
------------------------------------------------
Subject: stat_analysis aggregate question
From: Rosalyn MacCracken - NOAA Affiliate
Time: Mon May 15 07:29:22 2017
Hi John,
So, I'm interested in doing a couple of things, and I think I've figured
out how to do some of them. So, maybe you can tell me how to do the
others.

First, I am mostly interested in the matched points and their
performance. And, I use a config file, which I call from a script using
the command:

    stat_analysis -lookin ${PROCDIR} \
        -out ${PROCDIR}/stat_analysis/stat_analysis.out \
        -config ${CONFIGDIR}/STATAnalysisConfig_working -v 2

I can easily plot spatially where the matched points are located by
their lat/lon, and I can find their differences (FCST - OBS). Then, I
used the aggregate_stat command to combine my files, so I can plot
histograms or box plots of matched points, either at that forecast hour
or over the span of my forecast period of interest. For that, I use this
in my config file:

    "-job aggregate_stat -line_type CTC -out_line_type CTS
     -dump_row
     /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job4_aggstat_ctc_cts.stat"

There are some other things I might be interested in that span the
entire period. Perhaps more like time series plots, so we can see how
the forecast has done over time. I don't have a problem with plotting
things from the forecast period, but they usually aren't very revealing
or interesting. So, some other things are:

1) Putting together a file which spans the forecast period and combines
information from the SL1L2 file, so I could plot a time series of the
MAE. So, I was thinking I would use:

    "-job aggregate_stat -line_type SL1L2 -out_line_type CNT
     -dump_row
     /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job2_aggstat_slil2_cnt_wind2.stat"

Is that right?

2) From the CNT files, time series plots of the ANOM_CORR, PR_CORR, GSS
or CSI, and RMSE, and maybe some other things. I was thinking that I
could do:

    "-job summary -fcst_var WIND -line_type CNT -column RMSE
     -dump_row
     /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job7_summary_cnt_rmse.stat"

But I wasn't sure if that was correct. So, if you could point me to the
right usage, that would be great.

So, I'm also not sure how you get the mean, min, max, etc., for multiple
columns. I think that the CNT file has the most useful info, so if you
could tell me how to do that, that would be great. I'm sure I'll have
another list of things I want to do after today's meeting with Joe, so
I'll be back in touch with that list.

Oh, also, that script you wrote, plot_cnt.r on the MET user page. Does
that plot from one of these aggregate_stat or summary commands, or is
that a single CNT file? If it's an aggregate_stat or summary command,
what command did you use and what was in the "stat_list" that you used?
I'm sure it was a variety of columns from the CNT file, right?

Thanks for your help!
Roz
------------------------------------------------
Subject: stat_analysis aggregate question
From: John Halley Gotway
Time: Mon May 15 09:51:17 2017
Roz,
I'm glad you've been able to make progress using STAT-Analysis.

Let me mention a few things that you may find useful...

(1) As you've already seen, STAT-Analysis can be run by defining one or
more jobs in a config file. Alternatively, you can run a single job on
the command line with no config file. I find that much quicker and
easier when I'm playing around with things. It's only once I've defined
a fixed set of jobs that I move them into a config file.

(2) By default, STAT-Analysis writes its output to the screen. Use the
"-out_stat" job option to redirect the job output to a .stat output
file. That will include the full set of header columns and should be
pretty easy for a plotting script to parse.

(3) It sounds like you're interested primarily in matched pairs, i.e.
the MPR line type. I assume that's what you're plotting in your
histograms and boxplots. If you really just want to "filter" the .stat
files, I'd suggest using the "filter" job to do so:

    -job filter -line_type MPR -dump_row filter_mpr.stat [[[ additional filtering criteria ]]]

You mentioned an aggregate_stat job to read CTC and write CTS... but
that doesn't have anything to do with the MPR line type. So I'm confused
as to why that's getting you what you want?

(4) I see that you want a time series of MAE values. I think you're on
the right track, but I'd suggest using the "-by" option:

    "-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat
     cnt_time_series.stat -fcst_lead 24 -by FCST_INIT_BEG"

The job listed above would produce a time series of continuous
statistics for the 24-hour lead time for each initialization time
present. You should be able to use the job command options to define the
time series in any way you want.

(5) When running a summary job, if you want to summarize multiple
columns, just use the "-column" option multiple times to include them...
or specify "-column" as a comma-separated list:

    "-job summary -fcst_var WIND -line_type CNT -column RMSE,MAE,ME,MSE"

(6) The "plot_cnt.R" script on the website is outdated since its header
columns haven't been updated since version 3.0. But that same script is
included in the MET release and has been updated:

    met-6.0/scripts/Rscripts/plot_cnt.R

It reads the CNT line type from a .stat file, a _cnt.txt file, or the
output of a stat-analysis filter job. I don't know specifically what
stat-analysis command I used, but it'd be something like:

    stat_analysis -job filter -line_type CNT -dump_row cnt_filter.txt [[[ additional filtering criteria ]]]
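
For intuition about item (4), the Python sketch below shows one way
SL1L2 partial sums can be recombined into continuous statistics, one
group per initialization time. It assumes each record carries TOTAL,
FBAR, OBAR, FOBAR, FFBAR, OOBAR, and MAE, and that an initialization
time has already been attached under a hypothetical "init" key. The
count-weighted averaging is the standard way to merge such sums; it is
not taken from the STAT-Analysis source, and the sample numbers are
invented.

```python
from collections import defaultdict
from math import sqrt

# Partial-sum columns assumed to be present on every SL1L2 record.
SUMS = ("FBAR", "OBAR", "FOBAR", "FFBAR", "OOBAR", "MAE")


def aggregate_by_init(records):
    """Merge SL1L2-style records per 'init' key and derive ME, MAE, RMSE."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["init"]].append(rec)

    series = {}
    for init in sorted(groups):
        recs = groups[init]
        total = sum(r["TOTAL"] for r in recs)
        # Each mean is re-weighted by the number of pairs behind it.
        agg = {k: sum(r[k] * r["TOTAL"] for r in recs) / total for k in SUMS}
        mse = agg["FFBAR"] - 2.0 * agg["FOBAR"] + agg["OOBAR"]
        series[init] = {
            "TOTAL": total,
            "ME": agg["FBAR"] - agg["OBAR"],
            "MAE": agg["MAE"],
            "RMSE": sqrt(max(mse, 0.0)),  # guard against tiny negative round-off
        }
    return series


# Hypothetical records: two for the first initialization, one for the second.
records = [
    {"init": "20170501_000000", "TOTAL": 2, "FBAR": 2.0, "OBAR": 1.0,
     "FOBAR": 3.0, "FFBAR": 5.0, "OOBAR": 2.0, "MAE": 1.0},
    {"init": "20170501_000000", "TOTAL": 2, "FBAR": 2.0, "OBAR": 4.0,
     "FOBAR": 8.0, "FFBAR": 4.0, "OOBAR": 16.0, "MAE": 2.0},
    {"init": "20170502_000000", "TOTAL": 1, "FBAR": 5.0, "OBAR": 5.0,
     "FOBAR": 25.0, "FFBAR": 25.0, "OOBAR": 25.0, "MAE": 0.0},
]
series = aggregate_by_init(records)
for init, stats in series.items():
    print(init, stats)
```

Each key of the result corresponds to one point of the time series, so
plotting the MAE entries against the initialization times gives the
kind of plot discussed in item (4).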
Hope that helps.
Thanks,
John
------------------------------------------------
Subject: stat_analysis aggregate question
From: John Halley Gotway
Time: Mon May 15 09:58:17 2017
Roz,
OK, I just updated the website version of those scripts to match the
6.0 version.
John
On Mon, May 15, 2017 at 9:50 AM, John Halley Gotway <johnhg at ucar.edu>
wrote:
> Roz,
>
> I'm glad you've been able to make progress using STAT-Analysis.
>
> Let me mention a few things that you may find useful...
>
> (1) As you've already seen, STAT-Analysis can be run by defining one
or
> more jobs in a config file. Alternatively, you can run a single job
on the
> command line with no config file. I find that much quicker and
easier when
> I'm playing around with things. It's only once I've defined a fixed
set of
> jobs that I move them into a config file.
>
> (2) By default, STAT-Analysis writes it output to the screen. Use
the
> "-out_stat" job option to redirect the job output to a .stat output
files.
> That will include the full set of header columns and should be
pretty easy
> for a plotting script to parse.
>
> (3) It sounds like you're interested primarily in matched pairs,
i.e. the
> MPR line type. I assume that's what you're plotting in your
histograms and
> boxplots. If you really just want to "filter" the .stat files, I'd
suggest
> using the "filter" job to do so:
> -job filter -line_type MPR -dump_row filter_mpr.stat [[[
additional
> filtering criteria ]]]
>
> You mentioned an aggregate_stat job to read CTC and write CTS... but
that
> doesn't have anything to do with the MPR line type. So I'm confused
as to
> why that's getting you what you want?
>
> (4) I see that you want a time series of MAE values. I think you're
on
> the right track, but I'd suggest using the "-by" option:
>
> "-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat
> cnt_time_series.stat -fcst_lead 24 -by FCST_INIT_BEG"
>
> The job listed above would produce a time series of continuous
statistics
> for the 24-hour lead time for each initialization time present. You
should
> be able to use the job command options to define the time series in
any way
> you want.
>
> (5) When running a summary job, if you want to summarize multiple
columns,
> just use the "-column" option multiple times to include them... or
specify
> "-column" as a comma-separated list:
>
> "-job summary -fcst_var WIND -line_type CNT -column RMSE,MAE,ME,MSE"
>
> (6) The "plot_cnt.R" script on the website is outdated since it's
header
> columns haven't been updated since version 3.0. But that same
script is
> included in the MET release and has been updated:
> met-6.0/scripts/Rscripts/plot_cnt.R
>
> It reads the CNT line type from a .stat file, an _cnt.txt file, or
the
> output of a stat-analysis filter job. I don't know specifically
what
> stat-analysis command I used, but it'd be something like:
>
> stat_analysis -job filter -line_type CNT -dump_row cnt_filter.txt
[[[
> additional filtering criteria ]]]
>
> Hope that helps.
>
> Thanks,
> John
>
>
> On Mon, May 15, 2017 at 7:29 AM, Rosalyn MacCracken - NOAA Affiliate
via
> RT <met_help at ucar.edu> wrote:
>
>>
>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=80429 >
>>
>> Hi John,
>>
>> So, I'm interested in doing a couple of things, and I think I've figured
>> out how to do some of them. So, maybe you can tell me how to do the
>> others.
>>
>> First, I am mostly interested in the matched points and their performance.
>> And, I use a config file, which I call from a script using the command:
>>
>> stat_analysis -lookin ${PROCDIR} -out
>> ${PROCDIR}/stat_analysis/stat_analysis.out -config
>> ${CONFIGDIR}/STATAnalysisConfig_working -v 2
>>
>> I can easily plot spatially, where the matched points are located by their
>> lat/lon, and I can find their differences (FCST - OBS). Then, I used the
>> aggregate_stat command to combine my files, so I can plot histograms or
>> box plots of matched points, either at that forecast hour, or over the span
>> of my forecast period of interest. For that, I use this in my config
>> file:
>>
>> "-job aggregate_stat -line_type CTC -out_line_type CTS
>> -dump_row
>> /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job4_aggstat_ctc_cts.stat"
>>
>> So, there are some other things that I might be interested in that span the
>> entire period. Perhaps more like time series plots, so we can see how
>> the forecast has done over time. I don't have a problem with plotting
>> things from the forecast period, but they usually aren't very revealing
>> or interesting. So, some other things are:
>>
>> 1) putting together a file which spans the forecast period and pulls
>> together information from the SL1L2 file, so I could plot a time series
>> of the MAE. So, I was thinking I would use:
>> "-job aggregate_stat -line_type SL1L2 -out_line_type CNT
>> -dump_row
>> /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job2_aggstat_slil2_cnt_wind2.stat"
>>
>>
>> Is that right?
>>
>> 2) From the CNT files, time series plots of the ANOM_CORR, PR_CORR, GSS
>> or CSI, and RMSE, and maybe some other things. I was thinking that I could
>> do:
>> "-job summary -fcst_var WIND -line_type CNT -column RMSE
>> -dump_row
>> /opc/save/Rosalyn.MacCracken/met_out/stat_analysis/job7_summary_cnt_rmse.stat"
>>
>> But I wasn't sure if that was correct. So, if you could point me to the
>> right usage, that would be great.
>>
>> So, I'm also not sure how you get the mean, min, max, etc., for multiple
>> columns. I think that the CNT file has the most useful info, so if you
>> could tell me how to do that, that would be great. I'm sure I'll have
>> another list of things I want to do after today's meeting with Joe, so
>> I'll be back in touch with that list.
>>
>> Oh, also, that script you wrote, plot_cnt.r on the MET user page. Does
>> that plot from one of these aggregate_stat or summary commands, or is that
>> a single CNT file? If it's an aggregate_stat or summary command, what
>> command did you use and what was in the "stat_list" that you used? I'm
>> sure it was a variety of columns from the CNT file, right?
>>
>> Thanks for your help!
>>
>> Roz
>>
>>
>> On Sun, May 14, 2017 at 12:37 AM, John Halley Gotway via RT <
>> met_help at ucar.edu> wrote:
>>
>> > Roz,
>> >
>> > Stat-Analysis can perform a few different "job" types. One of
them is
>> the
>> > "summary" job type (-job summary). For that job, you pick
exactly one
>> line
>> > type and one or more columns of interest. Stat-Analysis will
apply
>> > whatever other filtering criteria you specify and compute summary
>> > information for the column(s) you've selected. The summary info
>> includes
>> > mean, min, max, and so on.
>> >
>> > Let me know if there's something specific you're trying to do
with
>> > stat-analysis and I may be able to point you in the right
direction.
>> >
>> > John
>> >
>> >
>> > On Sat, May 13, 2017 at 11:04 AM Rosalyn MacCracken - NOAA
Affiliate
>> via RT
>> > <met_help at ucar.edu> wrote:
>> >
>> > >
>> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=80429 >
>> > >
>> > > Hi John,
>> > >
>> > > I finally got it to work. I had set:
>> > > line_type = ["CTC"];
>> > >
>> > > So, I set line_type to nothing [], and everything started
working.
>> > >
>> > > So, question. When using "summary" with -column RMSE set, what
does
>> that
>> > > mean? That only the RMSE column is summed, or something else?
>> > >
>> > > Thanks!
>> > >
>> > > Roz
>> > >
>> > >
>> > >
>> > > --
>> > > Rosalyn MacCracken
>> > > Support Scientist
>> > >
>> > > Ocean Applilcations Branch
>> > > NOAA/NWS Ocean Prediction Center
>> > > NCWCP
>> > > 5830 University Research Ct
>> > > College Park, MD 20740-3818
>> > >
>> > > (p) 301-683-1551
>> > > rosalyn.maccracken at noaa.gov
>> > >
>> > >
>> >
>> >
>>
>>
>> --
>> Rosalyn MacCracken
>> Support Scientist
>>
>> Ocean Applilcations Branch
>> NOAA/NWS Ocean Prediction Center
>> NCWCP
>> 5830 University Research Ct
>> College Park, MD 20740-3818
>>
>> (p) 301-683-1551
>> rosalyn.maccracken at noaa.gov
>>
>>
>
------------------------------------------------
Subject: stat_analysis aggregate question
From: Rosalyn MacCracken - NOAA Affiliate
Time: Mon May 15 10:19:30 2017
Hi John,
Thanks for the info! I've actually been going back and forth between the
command line and the config file. When I get something to work on the
command line, I stick it in the config file. But thanks for that
suggestion anyway.
I'll try the suggestions and see if that's what I'm trying to get as
output. Also, I think reading CTC and writing CTS was following an
example you had in the slides. I was thinking that might get me the
output I wanted, but really had no idea. But I think between
"-column" and "-out_stat", I might get what I want.
BTW, I did see your other email about the R programming. I'll check out
the updated version.
I'll be back in touch after I try some of these things you suggested.
Thanks again!!
Roz
On Mon, May 15, 2017 at 3:51 PM, John Halley Gotway via RT <
met_help at ucar.edu> wrote:
> Roz,
>
> I'm glad you've been able to make progress using STAT-Analysis.
>
> Let me mention a few things that you may find useful...
>
> (1) As you've already seen, STAT-Analysis can be run by defining one or
> more jobs in a config file. Alternatively, you can run a single job on the
> command line with no config file. I find that much quicker and easier when
> I'm playing around with things. It's only once I've defined a fixed set of
> jobs that I move them into a config file.
>
> (2) By default, STAT-Analysis writes its output to the screen. Use the
> "-out_stat" job option to redirect the job output to a .stat output file.
> That will include the full set of header columns and should be pretty easy
> for a plotting script to parse.
>
> (3) It sounds like you're interested primarily in matched pairs, i.e. the
> MPR line type. I assume that's what you're plotting in your histograms and
> boxplots. If you really just want to "filter" the .stat files, I'd suggest
> using the "filter" job to do so:
> -job filter -line_type MPR -dump_row filter_mpr.stat [[[ additional
> filtering criteria ]]]
>
> You mentioned an aggregate_stat job to read CTC and write CTS... but that
> doesn't have anything to do with the MPR line type. So I'm confused as to
> why that's getting you what you want?
>
> (4) I see that you want a time series of MAE values. I think you're on the
> right track, but I'd suggest using the "-by" option:
>
> "-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat
> cnt_time_series.stat -fcst_lead 24 -by FCST_INIT_BEG"
>
> The job listed above would produce a time series of continuous statistics
> for the 24-hour lead time for each initialization time present. You should
> be able to use the job command options to define the time series in any way
> you want.
>
> (5) When running a summary job, if you want to summarize multiple columns,
> just use the "-column" option multiple times to include them... or specify
> "-column" as a comma-separated list:
>
> "-job summary -fcst_var WIND -line_type CNT -column RMSE,MAE,ME,MSE"
>
> (6) The "plot_cnt.R" script on the website is outdated since its header
> columns haven't been updated since version 3.0. But that same script is
> included in the MET release and has been updated:
> met-6.0/scripts/Rscripts/plot_cnt.R
>
> It reads the CNT line type from a .stat file, an _cnt.txt file, or the
> output of a stat-analysis filter job. I don't know specifically what
> stat-analysis command I used, but it'd be something like:
>
> stat_analysis -job filter -line_type CNT -dump_row cnt_filter.txt [[[
> additional filtering criteria ]]]
>
> Hope that helps.
>
> Thanks,
> John
>
>
--
Rosalyn MacCracken
Support Scientist
Ocean Applilcations Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD 20740-3818
(p) 301-683-1551
rosalyn.maccracken at noaa.gov
------------------------------------------------
Subject: stat_analysis aggregate question
From: Rosalyn MacCracken - NOAA Affiliate
Time: Mon May 15 12:33:56 2017
Hi John,
I'm having a few problems, and I'm sure they are pretty simple to solve.
First, I was looking at the "-job summary" suggestion. So, I did:
stat_analysis -lookin /opc/save/Rosalyn.MacCracken/met_out/master_gfs "-job
summary -fcst_var WIND -line_type CNT -column RMSE,MAE,ME,MSE -out_stat
/opc/save/Rosalyn.MacCracken/met_out/stat_analysis/colum_multivars.stat" -v 2
and it only wrote to the screen, not to the -out_stat file specified. So, how
do I fix that?
Next, I can't get your suggestion of:
"-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat
cnt_time_series.stat -fcst_lead 24 -by FCST_INIT_BEG"
to work, because I have no forecast files. So, I made a small dataset to
work with, which only includes match-ups of prepbufr-ascat and GFS at
forecast times 00z, 06z, 12z and 18z. I don't have any forecast files
associated with the GFS, only what matches the time stamp on the prepbufr
ascat data. So, how do you get data so that you can use the -fcst_lead
option, etc.? Is this like matching observation valid time with files such
as gfs.tHHz.grb2f01, gfs.tHHz.grb2f02, gfs.tHHz.grb2f03, etc.? In other
words, in my prepbufr file, ascat data is collected throughout the 6-hour
period when the file is valid. So, if it's valid at 00z, data is collected
from 3 hours before 00z to 3 hours after 00z, and stamped with the time it
was precisely collected. Technically, I could separate that out to match a
1-hour forecast (gfs.tHHz.grb2f01), or a 2-hour forecast
(gfs.tHHz.grb2f02), etc.
So, do I need to also generate those matchups in order to use that
-fcst_lead option? Or, is there a better way to generate the data that is
needed for that?
Roz
--
Rosalyn MacCracken
Support Scientist
Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD 20740-3818
(p) 301-683-1551
rosalyn.maccracken at noaa.gov
------------------------------------------------
Subject: stat_analysis aggregate question
From: John Halley Gotway
Time: Mon May 15 14:07:57 2017
Roz,
The output of the "summary" job is not a .stat line type. There is no
"SUMMARY" line type produced by other MET tools. That's why you don't get
any output using the "-out_stat" option. However, you can use the "-out"
option to redirect the output to an ASCII file.
I realize this is confusing... the "-out" option has existed for a long
time. We only recently added the "-out_stat" option for the "aggregate"
and "aggregate_stat" job types, which write true STAT lines to the output.
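As a rough sketch of the summary job with "-out" in place of "-out_stat":
the two paths below are placeholders, and passing "-out" as a command-line
option next to "-lookin" is assumed from the stat_analysis call shown
earlier in this thread. The command is only executed if stat_analysis is
found on the PATH.

```shell
#!/bin/sh
# Sketch only: both paths below are placeholders, not values from a real run.
LOOKIN_DIR="/path/to/point_stat/output"
SUMMARY_OUT="summary_cnt.txt"

# "-out" is given as a stat_analysis command-line option, alongside
# "-lookin" and "-v", rather than as part of the job itself.
set -- stat_analysis \
    -lookin "$LOOKIN_DIR" \
    -out "$SUMMARY_OUT" \
    -v 2 \
    -job summary -fcst_var WIND -line_type CNT -column RMSE,MAE,ME,MSE

# Keep a printable copy of the command, then run it only if MET is installed.
SUMMARY_CMD="$*"
echo "$SUMMARY_CMD"
if command -v stat_analysis >/dev/null 2>&1; then
    "$@"
fi
```

The summary table should then land in the file named by "-out" rather
than on the screen.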
On to the next issue. It's fine that you're not evaluating forecast lead
times... in fact that makes the logic of defining a time series much
easier. Just use "-by FCST_VALID_BEG" instead:
"-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat cnt_time_series.stat -by FCST_VALID_BEG"
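Run from the command line, that job might look like the sketch below. The
input directory is a placeholder, and the comment about one output line
per valid time reflects how "-by" is expected to group the input rather
than a verified run.

```shell
#!/bin/sh
# Sketch only: the input directory is a placeholder.
LOOKIN_DIR="/path/to/point_stat/output"

# Aggregate SL1L2 partial sums into CNT statistics, grouped by each
# distinct FCST_VALID_BEG value, and write STAT lines via -out_stat.
set -- stat_analysis \
    -lookin "$LOOKIN_DIR" \
    -v 2 \
    -job aggregate_stat -line_type SL1L2 -out_line_type CNT \
    -out_stat cnt_time_series.stat -by FCST_VALID_BEG

# Keep a printable copy of the command, then run it only if MET is installed.
SERIES_CMD="$*"
echo "$SERIES_CMD"
if command -v stat_analysis >/dev/null 2>&1; then
    "$@"
fi
```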
Thanks,
John
------------------------------------------------
Subject: stat_analysis aggregate question
From: Rosalyn MacCracken - NOAA Affiliate
Time: Tue May 16 09:34:10 2017
Hi John,
Ok, I was able to get those things working! I couldn't get the summary
job to run in the config file and write that small table to the ASCII
file, but I could run it on the command line with no issues. Eventually
I'll run this in an automated script, so I tested it with my script, and
it runs great and outputs what I want. So, it looks like I'm off to a good
start now.
Thanks for all your help!
Roz
--
Rosalyn MacCracken
Support Scientist
Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD 20740-3818
(p) 301-683-1551
rosalyn.maccracken at noaa.gov
------------------------------------------------
Subject: stat_analysis aggregate question
From: John Halley Gotway
Time: Tue May 16 09:54:31 2017
Roz,
Great, glad to hear you've made progress.
Let me clarify one nuance about the config file which *may* be the reason
why your summary job didn't work via the config file.
You'll notice that the config file has two sections. The "filtering"
section at the top contains at least one option for each of the 22 header
columns of the .stat output files. The "jobs" section at the bottom
defines the analysis job you want to perform.
The logic works like this:
- STAT-Analysis reads all the input files defined using the "-lookin"
  option.
- It applies *all* of the filtering options defined in the top section and
  writes the filtered .stat data to a temporary output file.
- Each job defined in the "jobs" section reads data from that temp file,
  applies any additional filtering criteria you've defined, and then
  performs the job on the data that remains.
Therefore, the settings defined in the "filtering" section are effectively
applied to every job you define in the "jobs" section.
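
The two-stage logic described above can be illustrated with a small toy model. This is only a sketch of the behaviour, not MET code: the `Job` class, the `keep` and `run` helpers, and the six-line input are all hypothetical, and each .stat line is reduced to just its LINE_TYPE value. It shows why a top-level line type filter of CTC would leave a CNT summary job with nothing to process.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """One entry of the "jobs" section, reduced to a name and its own line-type filter."""
    name: str
    line_types: set = field(default_factory=set)  # empty set means "no extra filter"


def keep(line_type, allowed):
    """An empty filter keeps everything, mirroring an empty list in the config file."""
    return not allowed or line_type in allowed


def run(stat_lines, global_line_types, jobs):
    """Return {job name: number of lines that job actually gets to process}."""
    # Stage 1: the top-level filter is applied once, to all input lines.
    temp = [lt for lt in stat_lines if keep(lt, global_line_types)]
    # Stage 2: each job filters the already-reduced temp data again.
    return {job.name: sum(keep(lt, job.line_types) for lt in temp) for job in jobs}


# Toy input: only the LINE_TYPE column of six .stat lines (made-up data).
lines = ["CTC", "CTC", "CNT", "CNT", "CNT", "SL1L2"]
jobs = [Job("summary_cnt", {"CNT"}), Job("aggregate_ctc", {"CTC"})]

print(run(lines, {"CTC"}, jobs))  # {'summary_cnt': 0, 'aggregate_ctc': 2}
print(run(lines, set(), jobs))    # {'summary_cnt': 3, 'aggregate_ctc': 2}
```

With the top-level filter set to CTC, the CNT lines never reach the temp data, so the summary job sees zero lines no matter what it asks for. With the top-level filter left empty, each job's own filter decides what it processes.
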
Perhaps your "filtering" options at the top of your config file have
already filtered out the line type you're processing in the summary job?
If so, just move that option out of the filtering section and down to the
jobs section, where you'll specify it separately for each job (e.g.
-line_type CNT).
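
As a rough sketch of what that change might look like, the snippet below prints the two relevant config entries with the top-level line type filter left empty and "-line_type" given on each job instead. The entry names `line_type` and `jobs` and the job options are the ones used in this thread; the `quote_list` and `render_config` helpers and the exact layout of the printed text are assumptions, so compare the result against the STATAnalysisConfig file shipped with your MET release before using it.

```python
def quote_list(items):
    """Render a Python list as an array of double-quoted strings."""
    return "[" + ", ".join(f'"{item}"' for item in items) + "]"


def render_config(global_line_types, jobs):
    """Build only the line_type and jobs entries as text; all other entries are omitted."""
    out = [f"line_type = {quote_list(global_line_types)};", "jobs = ["]
    for i, job in enumerate(jobs):
        # Every job string except the last is followed by a comma.
        comma = "," if i < len(jobs) - 1 else ""
        out.append(f'   "{job}"{comma}')
    out.append("];")
    return "\n".join(out)


# Each job names its own line type, so the top-level filter can stay empty.
jobs = [
    "-job summary -line_type CNT -fcst_var WIND -column RMSE,MAE,ME,MSE",
    "-job aggregate_stat -line_type CTC -out_line_type CTS",
]

print(render_config([], jobs))
```

Keeping the line type on each job means a CNT summary job and a CTC aggregate_stat job can sit in the same config file without one filtering out the other's input.
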
The intent of this design is to enable STAT-Analysis to run more
efficiently. Rather than re-parsing *ALL* the input lines for each job,
it does some first-order filtering once so that the jobs run on a smaller
number of lines.
Hope this helps clarify.
Thanks,
John
------------------------------------------------
Subject: stat_analysis aggregate question
From: Rosalyn MacCracken - NOAA Affiliate
Time: Tue May 16 09:57:52 2017
Hi John,
Thanks for the clarification. That certainly helps. I'll take a look at
what I'm filtering to see if things are fighting each other.
But, at any rate, at least I got things working!
Roz
> > > > > > On Sun, May 14, 2017 at 12:37 AM, John Halley Gotway via
RT <
> > > > > > met_help at ucar.edu> wrote:
> > > > > >
> > > > > > > Roz,
> > > > > > >
> > > > > > > Stat-Analysis can perform a few different "job" types.
One of
> > them
> > > > is
> > > > > > the
> > > > > > > "summary" job type (-job summary). For that job, you
pick
> > exactly
> > > > one
> > > > > > line
> > > > > > > type and one or more columns of interest. Stat-Analysis
will
> > apply
> > > > > > > whatever other filtering criteria you specify and
compute
> summary
> > > > > > > information for the column(s) you've selected. The
summary
> info
> > > > > includes
> > > > > > > mean, min, max, and so on.
> > > > > > >
> > > > > > > Let me know if there's something specific you're trying
to do
> > with
> > > > > > > stat-analysis and I may be able to point you in the
right
> > > direction.
> > > > > > >
> > > > > > > John
> > > > > > >
> > > > > > >
> > > > > > > On Sat, May 13, 2017 at 11:04 AM Rosalyn MacCracken -
NOAA
> > > Affiliate
> > > > > via
> > > > > > RT
> > > > > > > <met_help at ucar.edu> wrote:
> > > > > > >
> > > > > > > >
> > > > > > > > <URL: https://rt.rap.ucar.edu/rt/
> Ticket/Display.html?id=80429
> > >
> > > > > > > >
> > > > > > > > Hi John,
> > > > > > > >
> > > > > > > > I finally got it to work. I had set:
> > > > > > > > line_type = ["CTC"];
> > > > > > > >
> > > > > > > > So, I set line_type to nothing [], and everything
started
> > > working.
> > > > > > > >
> > > > > > > > So, question. When using "summary" with -column RMSE
set,
> what
> > > > does
> > > > > > that
> > > > > > > > mean? That only the RMSE column is summed, or
something
> else?
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > > Roz
> > > > > > > >
> > > > > > > > On Fri, May 12, 2017 at 2:31 PM, John Halley Gotway
via RT <
> > > > > > > > met_help at ucar.edu> wrote:
> > > > > > > >
> > > > > > > > > Hello Roz,
> > > > > > > > >
> > > > > > > > > I see that you have a question about
configuring/running
> > > > > > STAT-Analysis
> > > > > > > > > jobs.
> > > > > > > > >
> > > > > > > > > The "-lookin" command line option is used to tell
> > STAT-Analysis
> > > > > what
> > > > > > > > input
> > > > > > > > > files to read. You must specify the "-lookin"
option at
> > least
> > > > > once,
> > > > > > > but
> > > > > > > > > can use it as many times as you'd like.
> > > > > > > > >
> > > > > > > > > The argument you pass with "-lookin" is either the
name of
> a
> > > > > > directory
> > > > > > > or
> > > > > > > > > explicit file name.
> > > > > > > > >
> > > > > > > > > For an explicit file name, STAT-Analysis will read
MET
> output
> > > > data
> > > > > > from
> > > > > > > > it
> > > > > > > > > **regardless of the file naming convention**.
> > > > > > > > >
> > > > > > > > > For a directory name, STAT-Analysis will search
> > **recursively**
> > > > > > through
> > > > > > > > > that directory looking for files ending in the
".stat"
> > suffix.
> > > > > > > > >
> > > > > > > > > Each time you run grid_stat, point_stat,
wavelet_stat, or
> > > > > > > ensemble_stat,
> > > > > > > > > the tool writes a ".stat" output file (and can also
write
> the
> > > > > > optional
> > > > > > > > text
> > > > > > > > > files sorted by line type... such as "_cnt.txt).
That's
> why
> > > > > > > > STAT-Analysis
> > > > > > > > > searches directories for ".stat" files. But if you
want it
> > to
> > > > read
> > > > > > the
> > > > > > > > > "_cnt.txt" file, you need to specify the file name
on the
> > > command
> > > > > > line.
> > > > > > > > >
> > > > > > > > > Make sense?
> > > > > > > > >
> > > > > > > > > Just let us know if more issues/questions arise.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > John Halley Gotway
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, May 11, 2017 at 2:37 PM, Rosalyn MacCracken
- NOAA
> > > > > Affiliate
> > > > > > > via
> > > > > > > > RT
> > > > > > > > > <met_help at ucar.edu> wrote:
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Thu May 11 14:37:29 2017: Request 80429 was acted
upon.
> > > > > > > > > > Transaction: Ticket created by
> rosalyn.maccracken at noaa.gov
> > > > > > > > > > Queue: met_help
> > > > > > > > > > Subject: stat_analysis aggregate question
> > > > > > > > > > Owner: Nobody
> > > > > > > > > > Requestors: rosalyn.maccracken at noaa.gov
> > > > > > > > > > Status: new
> > > > > > > > > > Ticket <URL: https://rt.rap.ucar.edu/rt/
> > > > > > > Ticket/Display.html?id=80429
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > I'm using the output from the poin-stat tool as
input to
> > the
> > > > > > > > > stat_analysis
> > > > > > > > > > tool. I would like to aggregate the *cnt.txt
files. I
> can
> > > get
> > > > > the
> > > > > > > > tool
> > > > > > > > > to
> > > > > > > > > > aggregate, and aggregate_stat the *cts.txt or
*ctc.txt
> > files.
> > > > I
> > > > > > > would
> > > > > > > > > > really like to use the information in the *cnt.txt
files
> > for
> > > > > > multiple
> > > > > > > > > > times/days. How do I do that?
> > > > > > > > > >
> > > > > > > > > > Thanks in advance!
> > > > > > > > > >
> > > > > > > > > > Roz
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Rosalyn MacCracken
> > > > > > > > > > Support Scientist
> > > > > > > > > >
> > > > > > > > > > Ocean Applilcations Branch
> > > > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > > > NCWCP
> > > > > > > > > > 5830 University Research Ct
> > > > > > > > > > College Park, MD 20740-3818
> > > > > > > > > >
> > > > > > > > > > (p) 301-683-1551
> > > > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Rosalyn MacCracken
> > > > > > > > Support Scientist
> > > > > > > >
> > > > > > > > Ocean Applilcations Branch
> > > > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > > > NCWCP
> > > > > > > > 5830 University Research Ct
> > > > > > > > College Park, MD 20740-3818
> > > > > > > >
> > > > > > > > (p) 301-683-1551
> > > > > > > > rosalyn.maccracken at noaa.gov
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Rosalyn MacCracken
> > > > > > Support Scientist
> > > > > >
> > > > > > Ocean Applilcations Branch
> > > > > > NOAA/NWS Ocean Prediction Center
> > > > > > NCWCP
> > > > > > 5830 University Research Ct
> > > > > > College Park, MD 20740-3818
> > > > > >
> > > > > > (p) 301-683-1551
> > > > > > rosalyn.maccracken at noaa.gov
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Rosalyn MacCracken
> > > > Support Scientist
> > > >
> > > > Ocean Applilcations Branch
> > > > NOAA/NWS Ocean Prediction Center
> > > > NCWCP
> > > > 5830 University Research Ct
> > > > College Park, MD 20740-3818
> > > >
> > > > (p) 301-683-1551
> > > > rosalyn.maccracken at noaa.gov
> > > >
> > > >
> > >
> > >
> >
> >
> > --
> > Rosalyn MacCracken
> > Support Scientist
> >
> > Ocean Applilcations Branch
> > NOAA/NWS Ocean Prediction Center
> > NCWCP
> > 5830 University Research Ct
> > College Park, MD 20740-3818
> >
> > (p) 301-683-1551
> > rosalyn.maccracken at noaa.gov
> >
> >
>
>
--
Rosalyn MacCracken
Support Scientist
Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD 20740-3818
(p) 301-683-1551
rosalyn.maccracken at noaa.gov
------------------------------------------------
Subject: stat_analysis aggregate question
From: Rosalyn MacCracken - NOAA Affiliate
Time: Tue May 16 11:25:46 2017
Oh, wait, one more question:

The command you gave me before:

"-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat cnt_time_series.stat -by FCST_VALID_BEG"

does summarize variables from CNT for the forecast time period, but it
lumps all the forecast variables together. So, you'll have a FCST_VAR
column which is UGRD,VGRD and WIND, and then a single RMSE associated
with that.

I tried adding -by FCST_VAR:

"-job aggregate_stat -line_type SL1L2 -out_line_type CNT -out_stat cnt_time_series.stat -by FCST_VAR -by FCST_VALID_BEG"

but that didn't work. Any ideas how to associate a single value of, say,
RMSE, for the forecast time interval of, say, 2017050100 - 2017050218,
with each of the forecast variables (UGRD, VGRD and WIND)?

Thanks,
Roz
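
For reference, the arithmetic behind a per-variable aggregation can be
sketched outside of MET. SL1L2 lines carry TOTAL plus the partial sums
FBAR, OBAR, FOBAR, FFBAR and OOBAR, so lines can be combined as
TOTAL-weighted means and RMSE recovered as sqrt(FFBAR - 2*FOBAR + OOBAR).
The Python below is only an illustration under those assumptions: the
function name rmse_by_var, the dict layout and the sample numbers are
made up, and it is not how STAT-Analysis itself is implemented.

```python
import math
from collections import defaultdict


def rmse_by_var(sl1l2_rows):
    """Combine SL1L2 partial sums per FCST_VAR and return {var: RMSE}.

    Each row is a dict with FCST_VAR, TOTAL, FOBAR, FFBAR and OOBAR
    (hypothetical layout; parsing of the .stat file is not shown).
    """
    sums = defaultdict(
        lambda: {"TOTAL": 0, "FOBAR": 0.0, "FFBAR": 0.0, "OOBAR": 0.0}
    )
    for row in sl1l2_rows:
        acc = sums[row["FCST_VAR"]]
        n = row["TOTAL"]
        acc["TOTAL"] += n
        # Partial sums are means, so weight each one by its pair count.
        for col in ("FOBAR", "FFBAR", "OOBAR"):
            acc[col] += n * row[col]

    result = {}
    for var, acc in sums.items():
        # MSE = mean(f^2) - 2*mean(f*o) + mean(o^2) over all pairs.
        mse = (acc["FFBAR"] - 2.0 * acc["FOBAR"] + acc["OOBAR"]) / acc["TOTAL"]
        result[var] = math.sqrt(max(mse, 0.0))  # guard against tiny negatives
    return result


# Made-up sample lines: two valid times for UGRD, one for VGRD.
rows = [
    {"FCST_VAR": "UGRD", "FCST_VALID_BEG": "20170501_000000",
     "TOTAL": 2, "FFBAR": 5.0, "FOBAR": 4.0, "OOBAR": 4.0},
    {"FCST_VAR": "UGRD", "FCST_VALID_BEG": "20170501_060000",
     "TOTAL": 2, "FFBAR": 4.0, "FOBAR": 4.0, "OOBAR": 4.0},
    {"FCST_VAR": "VGRD", "FCST_VALID_BEG": "20170501_000000",
     "TOTAL": 1, "FFBAR": 9.0, "FOBAR": 3.0, "OOBAR": 1.0},
]

result = rmse_by_var(rows)
for var in sorted(result):
    print(var, round(result[var], 4))
```

In this sketch, grouping on the pair (FCST_VAR, FCST_VALID_BEG) instead
of FCST_VAR alone would give one value per variable per valid time.
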
On Tue, May 16, 2017 at 3:54 PM, John Halley Gotway via RT <met_help at ucar.edu> wrote:
> Roz,
>
> Great, glad to hear you've made progress.
>
> Let me clarify one nuance about the config file which *may* be the
> reason why your summary job didn't work via the config file.
>
> You'll notice that the config file has two sections. The "filtering"
> section at the top contains at least one option for each of the 22
> header columns of the .stat output files. The "jobs" section at the
> bottom defines the analysis job you want to perform.
>
> The logic works like this...
>
> - STAT-Analysis reads all the input files defined using the "-lookin"
>   option.
> - It applies *all* of the filtering options defined in the top section
>   and writes the filtered .stat data to an output temp file.
> - Each job defined in the "jobs" section reads data from that temp
>   file, applies any additional filtering criteria you've defined, and
>   then performs the job on the data that remains.
>
> Therefore, the settings defined in the "filtering" section are
> effectively applied to every job you define in the "jobs" section.
>
> Perhaps your "filtering" options at the top of your config file have
> already filtered out the line type you're processing in the summary
> job? If so, just move that option out of the filtering section and
> down to the jobs section, where you'll specify it separately for each
> job (e.g. -line_type CNT).
>
> The intent of this design is to enable STAT-Analysis to run more
> efficiently. Rather than having it re-parse *ALL* the input lines for
> each job, it does some first-order filtering so that jobs run on a
> smaller number of lines.
>
> Hope this helps clarify.
>
> Thanks,
> John
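
The two-stage logic described above can be sketched roughly as follows.
This is a toy Python model, not MET code: apply_filters, run_jobs and
the sample lines are hypothetical, and an empty list is assumed to mean
"no filter" for that column. It shows why a global line_type of CTC
leaves nothing for a job that asks for CNT.

```python
def apply_filters(lines, criteria):
    """Keep lines matching every criterion; an empty list matches anything."""
    return [
        ln for ln in lines
        if all(not allowed or ln[col] in allowed
               for col, allowed in criteria.items())
    ]


def run_jobs(lines, global_filters, jobs):
    """Filter once globally, then let each job filter the survivors."""
    # Stage 1: the "filtering" section is applied once, up front.
    kept = apply_filters(lines, global_filters)
    # Stage 2: each job only ever sees what stage 1 let through.
    return {name: apply_filters(kept, job_filters)
            for name, job_filters in jobs.items()}


# Made-up stand-ins for parsed .stat lines.
lines = [
    {"LINE_TYPE": "CTC", "FCST_VAR": "WIND"},
    {"LINE_TYPE": "CNT", "FCST_VAR": "WIND"},
    {"LINE_TYPE": "CNT", "FCST_VAR": "UGRD"},
]
jobs = {"summary_cnt": {"LINE_TYPE": ["CNT"], "FCST_VAR": ["WIND"]}}

# Global filter keeps only CTC, so the CNT job has nothing left to read.
blocked = run_jobs(lines, {"LINE_TYPE": ["CTC"]}, jobs)
# Empty global filter: the job-level -line_type CNT does the selecting.
unblocked = run_jobs(lines, {"LINE_TYPE": []}, jobs)

print(len(blocked["summary_cnt"]), len(unblocked["summary_cnt"]))
```
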
--
Rosalyn MacCracken
Support Scientist
Ocean Applications Branch
NOAA/NWS Ocean Prediction Center
NCWCP
5830 University Research Ct
College Park, MD 20740-3818
(p) 301-683-1551
rosalyn.maccracken at noaa.gov
------------------------------------------------