[Met_help] [rt.rap.ucar.edu #82576] History for Question on Stat-analysis config file

Tue Jul 9 12:04:12 MDT 2019

----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

I am trying to use STAT-Analysis to compute verification scores. One of my tests is
the mean anomaly correlation (AC) for HGT at P500 at different forecast lead times for
20170901~20170930, for 00Z runs only. My config file, STATAnalysisConfig, is shown below.

////////////////////////////////////////////////////////////////////////////////
//
// STAT-Analysis configuration file.
//
// For additional information, see the MET_BASE/config/README file.
//
////////////////////////////////////////////////////////////////////////////////

//
// Filtering input STAT lines by the contents of each column
//
model = ["GFS"];
desc  = [];

fcst_lead = ["12","24","36","48","60","72","84","96","108","120","132","144","156","168","180","192","204","216","228","240"];
obs_lead  = fcst_lead;

fcst_valid_beg  = "20170901";
fcst_valid_end  = "20170930";
fcst_valid_hour = ["00","12"];

obs_valid_beg   = "";
obs_valid_end   = "";
obs_valid_hour  = [];

fcst_init_beg   = "";
fcst_init_end   = "";
fcst_init_hour  = ["00"];

obs_init_beg    = "";
obs_init_end    = "";
obs_init_hour   = [];

fcst_var = ["HGT"];
obs_var  = [];

fcst_lev = ["P500"];
obs_lev  = [];

obtype = [];

vx_mask = ["NHM","SHM","NAM","TRP","EUR","ASA","AUS","NPR","SPR"];

interp_mthd = [];

interp_pnts = [];

fcst_thresh = [];
obs_thresh  = [];
cov_thresh  = [];

alpha = [];

line_type = ["SAL1L2"];

column = [];

weight = [];

////////////////////////////////////////////////////////////////////////////////

//
// Array of STAT-Analysis jobs to be performed on the filtered data
//
jobs = [
   "-job aggregate_stat -line_type SAL1L2 -out_line_type CNT -out_stat Z500.stat"
];

////////////////////////////////////////////////////////////////////////////////

//
// Confidence interval settings
//
out_alpha = 0.05;

boot = {
   interval = PCTILE;
   rep_prop = 1.0;
   n_rep    = 0;
   rng      = "mt19937";
   seed     = "";
}

////////////////////////////////////////////////////////////////////////////////

rank_corr_flag = FALSE;
vif_flag       = FALSE;
tmp_dir        = "./";
version        = "V6.1_beta3";

////////////////////////////////////////////////////////////////////////////////

I saved all the stat files in the directory ./stat, and the run command is

stat_analysis -lookin ./stat  -config  STATAnalysisConfig 

The run seems OK, but the output file Z500.stat contains no scores, only a long string of
score names. Could you please provide further advice?
Thanks,

BTW, can I set multiple line_types in one config file?

----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Question on Stat-analysis config file
From: John Halley Gotway
Time: Tue Oct 31 09:38:33 2017

Hi Binbin,

I moved this question over into met-help.

STAT-Analysis can be run in two ways...
(1) On the command line to run a single job.
(2) Using a config file to run one or more jobs.

Usually, I get started by running a single job on the command line
and, once it's working well, group multiple jobs into a config file.

There are any number of reasons why you might be getting no output
from your STAT-Analysis job.

I think it might be best to have you copy that "stat" directory over
to theia or tar it up and post it to our anonymous ftp site.  I'll
grab it and send you some example stat_analysis jobs you could use.

https://dtcenter.org/met/users/support/met_help.php#ftp

Does that work?

Thanks,
John

------------------------------------------------
Subject: Question on Stat-analysis config file
From: Binbin.Zhou at noaa.gov
Time: Tue Oct 31 12:20:43 2017

John,

  I have copied the entire testing directory to theia, located in:

/scratch4/NCEPDEV/meso/noscrub/Binbin.Zhou/met_test/Stat_Analysis

In this directory, all the stat files are stored in the subdirectory
./stat, which contains one full month of stat files (Sept. 2017).
Each day has 00Z and 12Z runs with lead times out to 240 hours
(12, 24, 36, ..., 240 fhr).

The stat config file is STATAnalysisConfig.

Please look at them.

BTW, I cannot run MET6.1_beta3 on theia.
The error message says:

stat_analysis: error while loading shared libraries: libnetcdf.so.6: cannot open shared object file: No such file or directory

Please also look at this for me,

Thanks

Binbin

------------------------------------------------
Subject: Question on Stat-analysis config file
From: Julie Prestopnik
Time: Tue Oct 31 12:23:28 2017

Hi Binbin.

Regarding running MET6.1_beta3 on theia, please try running "module
purge" before running the "module use" and "module load" commands for
MET, then try it again.  If that doesn't work, please let me know, and
I'll take a look as soon as I can.

Thanks,
Julie

------------------------------------------------
Subject: Question on Stat-analysis config file
From: Binbin.Zhou at noaa.gov
Time: Tue Oct 31 12:32:52 2017

Julie,

  Yes,  your suggestion works.

Binbin

------------------------------------------------
Subject: Question on Stat-analysis config file
From: Julie Prestopnik
Time: Tue Oct 31 12:35:05 2017

Great!  It's not an ideal solution for competing modules, but I'm glad
it worked.

Julie

------------------------------------------------
Subject: Question on Stat-analysis config file
From: John Halley Gotway
Time: Tue Oct 31 16:10:54 2017

Binbin,

Great, thanks for pointing me to your test data.  While it's fine to
have all your data in a single directory named "stat", it's also fine
to organize them into whatever directory structure you'd like.  It's
pretty easy to tell stat-analysis where to look for data to process.

I started by loading met-6.1_beta3 and I linked your data into my
working directory:

> module purge
> module use /contrib/modulefiles
> module load met/6.1_beta3
> cd /scratch4/BMC/dtc/John.H.Gotway/MET/MET_Help/zhou_data_20171031
> ln -sf /scratch4/NCEPDEV/meso/noscrub/Binbin.Zhou/met_test/Stat_Analysis/stat .

I see that you want to aggregate SAL1L2 lines together and derive CNT
lines.  The stat directory contains 1200 files.  I started by
processing only the 120-hour forecast files, like this:

> stat_analysis -lookin stat/grid_stat_GFS_F120* \
     -job aggregate_stat -line_type SAL1L2 -out_line_type CNT \
     -by FCST_VAR,FCST_LEV,FCST_LEAD,VX_MASK \
     -out_stat agg_stat_SAL1L2_to_CNT.stat

That processes 60 files and takes a couple of seconds to run.  So then
I switched to processing *ALL* lead times by just passing it the
top-level directory:

> stat_analysis -lookin stat \
     -job aggregate_stat -line_type SAL1L2 -out_line_type CNT \
     -by FCST_VAR,FCST_LEV,FCST_LEAD,VX_MASK \
     -out_stat agg_stat_SAL1L2_to_CNT.stat -v 3

That processes all 1200 input files.  And I used "-v 3" to print out
the names of the files as it read them.

OK, the important parts of this job are...

(1) -lookin stat ### Tells STAT-Analysis to look in that directory for
files ending in .stat.
(2) -job aggregate_stat -line_type SAL1L2 -out_line_type CNT ### Tells
STAT-Analysis to aggregate SAL1L2 lines and derive CNT lines.
(3) -by FCST_VAR,FCST_LEV,FCST_LEAD,VX_MASK ### Tells STAT-Analysis to
run this same job for each unique combination of those header columns.
(4) -out_stat agg_stat_SAL1L2_to_CNT.stat ### Tells STAT-Analysis to
write the aggregated CNT lines to a file by that name.
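
As a rough illustration of what item (3) means, here is a minimal Python sketch of grouping lines by a set of header columns. It is not MET code; the rows and their values are invented, and only the four column names come from the job above.

```python
from collections import defaultdict

# Toy stand-ins for STAT header columns; every value here is invented.
rows = [
    {"FCST_VAR": "HGT", "FCST_LEV": "P500", "FCST_LEAD": "120000", "VX_MASK": "NHM", "FCST_VALID": "20170906"},
    {"FCST_VAR": "HGT", "FCST_LEV": "P500", "FCST_LEAD": "120000", "VX_MASK": "NHM", "FCST_VALID": "20170907"},
    {"FCST_VAR": "HGT", "FCST_LEV": "P500", "FCST_LEAD": "120000", "VX_MASK": "NAM", "FCST_VALID": "20170906"},
    {"FCST_VAR": "HGT", "FCST_LEV": "P500", "FCST_LEAD": "240000", "VX_MASK": "NHM", "FCST_VALID": "20170911"},
]

# The same columns that were passed to -by in the job above.
BY_COLUMNS = ["FCST_VAR", "FCST_LEV", "FCST_LEAD", "VX_MASK"]


def group_by(rows, columns):
    """Collect rows into one case per unique combination of the given columns."""
    cases = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in columns)
        cases[key].append(row)
    return dict(cases)


cases = group_by(rows, BY_COLUMNS)
for key, members in sorted(cases.items()):
    # Each case would be aggregated separately and yield one output line.
    print(key, "->", len(members), "input line(s)")
```

Each distinct key becomes its own case, so the number of output lines equals the number of unique combinations present in the input.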

Make sense?

So the result of this job is in this file:
/scratch4/BMC/dtc/John.H.Gotway/MET/MET_Help/zhou_data_20171031/agg_stat_SAL1L2_to_CNT.stat

It contains 2600 CNT lines, one for each unique combination of those header columns.
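
To make "aggregate SAL1L2 lines and derive CNT" more concrete, here is a hedged Python sketch of the arithmetic for the anomaly correlation. It assumes the partial sums are combined as TOTAL-weighted means and that the centered form of the correlation is used; both are assumptions for illustration, not a statement of what this MET version does internally. The input numbers are invented; the column names are those of the SAL1L2 line type.

```python
import math

# Two hypothetical SAL1L2 lines (invented values).
# Columns: TOTAL, FABAR, OABAR, FOABAR, FFABAR, OOABAR.
lines = [
    {"TOTAL": 100, "FABAR": 1.0, "OABAR": 0.5, "FOABAR": 90.0, "FFABAR": 110.0, "OOABAR": 95.0},
    {"TOTAL": 300, "FABAR": -1.0, "OABAR": -0.5, "FOABAR": 70.0, "FFABAR": 100.0, "OOABAR": 85.0},
]

SUMS = ["FABAR", "OABAR", "FOABAR", "FFABAR", "OOABAR"]


def aggregate(lines):
    """Combine partial sums as TOTAL-weighted means (assumed aggregation rule)."""
    n = sum(line["TOTAL"] for line in lines)
    agg = {"TOTAL": n}
    for name in SUMS:
        agg[name] = sum(line["TOTAL"] * line[name] for line in lines) / n
    return agg


def anomaly_correlation(s):
    """Centered anomaly correlation from aggregated partial sums (assumed form)."""
    cov = s["FOABAR"] - s["FABAR"] * s["OABAR"]
    var_f = s["FFABAR"] - s["FABAR"] ** 2
    var_o = s["OOABAR"] - s["OABAR"] ** 2
    return cov / math.sqrt(var_f * var_o)


agg = aggregate(lines)
ac = anomaly_correlation(agg)
print(agg)
print("anomaly correlation:", round(ac, 4))
```

Working from partial sums means a month of daily lines can be combined without going back to the gridded data.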

Suppose you want to do the same thing but for only the "NAM" and "NHM"
masking regions... just add the job command option "-vx_mask NAM,NHM".
Then STAT-Analysis will keep only the input lines whose VX_MASK column
matches one of those two names.

Then the 2600 lines of output become 520:

> stat_analysis -lookin stat \
     -job aggregate_stat -line_type SAL1L2 -out_line_type CNT \
     -by FCST_VAR,FCST_LEV,FCST_LEAD,VX_MASK \
     -out_stat agg_stat_SAL1L2_to_CNT_NAM_NHM.stat -v 3 \
     -vx_mask NAM,NHM

Just let me know what other questions you have.

Thanks,
John

------------------------------------------------
Subject: Question on Stat-analysis config file
From: Binbin.Zhou at noaa.gov
Time: Wed Nov 01 08:49:28 2017

John,

  Thanks, your suggested command works. I added some other options like
 -fcst_valid_beg 20170901 -fcst_valid_end 20170930 -fcst_init_hour 00,
and it works fine as well.

So my further question is: how can those command line options be coded
into a STAT-Analysis config file?

I would also like to process multiple line types. Is it possible to use
multiple line_type values (such as SL1L2, SAL1L2, VL1L2, VAL1L2, GRAD,
etc.)? Or do I have to run them one by one in different commands?

Another question: for the GRAD line_type, what is its out_line_type?
Still CNT?

Binbin

------------------------------------------------
Subject: Question on Stat-analysis config file
From: John Halley Gotway
Time: Wed Nov 01 12:05:19 2017

Binbin,

OK, great.  Glad the command line job worked the way you expected.  So
that's an example of running 1 job on the command line.

You can run multiple jobs using a configuration file by listing them
in the "jobs" array near the bottom of the config file.  Any settings
that are common to all the jobs can be factored out and listed at the
top... instead of relisting them for each job.

When using a config file, STAT-Analysis really does 2 steps:

(1) Read all the input .stat files and subset the data using the
filtering criteria listed before the "jobs" section.  Write the
filtered subset of data to a temp file.

(2) For each job, read that temp file and do that job.

By grouping together common filtering criteria at the top,
STAT-Analysis should run more efficiently.  That means running 10 jobs
with one config file should be faster than running 10 separate jobs on
the command line.
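
The two steps above can be sketched in a few lines of Python. This is only a toy model of the idea (filter once, then let every job read the smaller subset); the data, the function names, and the "job" itself are invented and are not part of MET.

```python
# Toy STAT lines; only the columns needed for the illustration are included.
stat_lines = [
    {"LINE_TYPE": "SL1L2", "FCST_VAR": "HGT", "FCST_LEV": "P500"},
    {"LINE_TYPE": "SAL1L2", "FCST_VAR": "HGT", "FCST_LEV": "P500"},
    {"LINE_TYPE": "SL1L2", "FCST_VAR": "TMP", "FCST_LEV": "P850"},
    {"LINE_TYPE": "SAL1L2", "FCST_VAR": "HGT", "FCST_LEV": "P500"},
]


def filter_once(lines, **criteria):
    """Step 1: keep lines whose columns match the common filtering criteria."""
    return [
        line for line in lines
        if all(line[col] in allowed for col, allowed in criteria.items())
    ]


def run_job(subset, line_type):
    """Step 2: a stand-in 'job' that just counts the lines of one line type."""
    return sum(1 for line in subset if line["LINE_TYPE"] == line_type)


# Common filters (the part listed before the "jobs" section) are applied once.
subset = filter_once(stat_lines, FCST_VAR={"HGT"}, FCST_LEV={"P500"})

# Every job then works on the already-filtered subset.
results = {lt: run_job(subset, lt) for lt in ("SL1L2", "SAL1L2")}

print(len(stat_lines), "input lines ->", len(subset), "after the common filter")
print(results)
```

The saving comes from reading and filtering the full input only once, however many jobs follow.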

Looking in your stat files, I don't see any output for GRAD.  But to
answer your question, you'd just aggregate the GRAD lines together.
They don't get converted to anything else.

I set up a config file to run 6 jobs.  Aggregate the (1) SL1L2 lines,
(2) SAL1L2 lines, (3) VL1L2 lines, (4) VAL1L2 lines, (5) convert SL1L2
to CNT, and (6) convert SAL1L2 to CNT.

And here's how I ran it:

cd /scratch4/BMC/dtc/John.H.Gotway/MET/MET_Help/zhou_data_20171031
stat_analysis -lookin stat -config STATAnalysisConfig_monthly_summary \
     -out stat_analysis.out -v 3

The "-out" option tells it to write all its output to a file instead
of the screen.
The "-out_stat" option listed for each job tells it to write the
individual job output to the files listed.

Since jobs (5) and (6) both write a CNT line, I figured it'd be
difficult to know whether the CNT scores came from SL1L2 or SAL1L2
input lines.  So I used the "-set_hdr" option to set the DESC(ription)
output column to differentiate between SL1L2 and SAL1L2 input data.
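
The contents of STATAnalysisConfig_monthly_summary are not shown in this thread. As a guess at what its "jobs" array might look like, the Python sketch below assembles six job strings from options that appear earlier in the thread. The output file names, the use of "-by" in every job, and the DESC values are assumptions for illustration.

```python
# Grouping columns reused from the command line examples earlier in the thread.
BY = "FCST_VAR,FCST_LEV,FCST_LEAD,VX_MASK"


def aggregate_job(line_type):
    """Jobs (1)-(4): aggregate lines of one type; the output name is hypothetical."""
    return (f"-job aggregate -line_type {line_type} -by {BY} "
            f"-out_stat agg_{line_type}.stat")


def to_cnt_job(line_type):
    """Jobs (5)-(6): derive CNT and tag the DESC column with the input line type."""
    return (f"-job aggregate_stat -line_type {line_type} -out_line_type CNT "
            f"-by {BY} -set_hdr DESC {line_type} "
            f"-out_stat agg_stat_{line_type}_to_CNT.stat")


jobs = [aggregate_job(lt) for lt in ("SL1L2", "SAL1L2", "VL1L2", "VAL1L2")]
jobs += [to_cnt_job(lt) for lt in ("SL1L2", "SAL1L2")]

# Render the list in the config file's array syntax.
config_text = "jobs = [\n" + ",\n".join(f'   "{job}"' for job in jobs) + "\n];\n"
print(config_text)
```

The printed text could be pasted into the "jobs" section of a config file, with the common filters (model, dates, init hour, and so on) set once above it.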

Hope that helps.

John

------------------------------------------------
Subject: Question on Stat-analysis config file
From: Binbin.Zhou at noaa.gov
Time: Wed Nov 01 12:19:32 2017

John,

  Thanks. I'll follow your instructions and test further.
Another question: I tested both the SL1L2 and SAL1L2 line_type jobs.
I found that both jobs output many of the same scores. For example,
both have an RMSE score, but the values look a little bit different.
Which RMSE values should I take? I assume the RMSE from the SL1L2
line_type output is the correct one?

Binbin
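
For reference on the RMSE question, here is a small worked example in Python with invented numbers. It assumes RMSE is derived from the partial sums as sqrt(FFBAR + OOBAR - 2*FOBAR), applied to the raw values for SL1L2 and to the anomalies for SAL1L2; that derivation is an assumption, not something confirmed in this thread. Under that assumption, and when both line types cover exactly the same points, the climatology cancels and the two RMSE values agree.

```python
import math

# Invented forecast, analysis, and climatology values at five grid points.
fcst = [5520.0, 5580.0, 5610.0, 5490.0, 5550.0]
obs = [5510.0, 5590.0, 5600.0, 5500.0, 5560.0]
climo = [5500.0, 5570.0, 5620.0, 5480.0, 5540.0]


def mean(values):
    return sum(values) / len(values)


def partial_sums(f, o):
    """Mean products in the style of the SL1L2/SAL1L2 partial sums."""
    return {
        "FOBAR": mean([a * b for a, b in zip(f, o)]),
        "FFBAR": mean([a * a for a in f]),
        "OOBAR": mean([b * b for b in o]),
    }


def rmse_from_sums(s):
    """RMSE from partial sums (assumed derivation)."""
    return math.sqrt(s["FFBAR"] + s["OOBAR"] - 2.0 * s["FOBAR"])


# Raw values (SL1L2 style) versus anomalies about climatology (SAL1L2 style).
raw_sums = partial_sums(fcst, obs)
anom_sums = partial_sums(
    [f - c for f, c in zip(fcst, climo)],
    [o - c for o, c in zip(obs, climo)],
)

rmse_raw = rmse_from_sums(raw_sums)
rmse_anom = rmse_from_sums(anom_sums)
rmse_direct = math.sqrt(mean([(f - o) ** 2 for f, o in zip(fcst, obs)]))

print(rmse_raw, rmse_anom, rmse_direct)
```

If that holds, a small difference between the two outputs would have to come from somewhere else, for example a different set of matched points or rounding of the stored sums. That is a guess, not something established here.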

On Wed, Nov 1, 2017 at 2:05 PM, John Halley Gotway via RT
<met_help at ucar.edu
> wrote:

> Binbin,
>
> OK, great.  Glad the command line job worked the way you expected.
So
> that's an example of running 1 job on the command line.
>
> You can run multiple jobs using a configuration file but listing
them in
> the "jobs" array near the bottom of the config file.  Any settings
that are
> common to all the jobs can be factored out and listed at the top...
instead
> of relisting them for each job.
>
> When using a config file, STAT-Analysis really does 2 steps:
>
> (1) Read all the input .stat files and subset the data using the
filtering
> criteria listed before the "jobs" section.  Write the filtered
subset of
> data to a temp file.
>
> (2) For each job, read that temp file and do that job.
>
> By grouping together common filtering criteria at the top, STAT-
Analysis
> should run more efficiently.  That means running 10 jobs with one
config
> file should be faster than running 10 separate jobs on the command
line.
>
> Looking in your stat files, I don't see any output for GRAD.  But to
answer
> your question, you'd just aggregate the GRAD lines together.  They
don't
> get converted to anything else.
>
> I set up a config file to run 6 jobs.  Aggregate the (1) SL1L2
lines, (2)
> SAL1L2 lines, (3) VL1L2 lines, (4) VAL1L2 lines, (5) convert SL1L2
to CNT,
> and (6) convert SAL1L2 to CNT.
>
> And here's how I ran it:
>
> cd /scratch4/BMC/dtc/John.H.Gotway/MET/MET_Help/zhou_data_20171031
> stat_analysis -lookin stat -config
STATAnalysisConfig_monthly_summary -out
> stat_analysis.out -v 3
>
> The "-out" option tells it to write all its output to a file instead
of the
> screen.
> The "-out_stat" option listed for each job tells it to write the
individual
> job output to the files listed.
>
> Since jobs (5) and (6) both write a CNT line, I figured it'd be
difficult
> to know whether the CNT scores came from SL1L2 or SAL1L2 input
lines.  So I
> used the "-set_hdr" option to set the DESC(ription) output column to
> differentiate between SL1L2 and SAL1L2 input data.
>
> Hope that helps.
>
> John

------------------------------------------------
Subject: Question on Stat-analysis config file
From: John Halley Gotway
Time: Wed Nov 01 12:47:45 2017

Binbin,

Yes, you should report the RMSE scores derived from the SL1L2 partial sums.

We should think some more about exactly what statistics should be computed
from the SAL1L2 input line.

Perhaps it'd be good to talk to our statistician, Tressa Fowler, when she
visits NCEP later this year.

Thanks,
John
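
For reference, a sketch of how an RMSE value can be derived from SL1L2
partial sums. It assumes the usual SL1L2 columns (TOTAL, FBAR, OBAR, FOBAR,
FFBAR, OOBAR) and that aggregation across input lines is a TOTAL-weighted
average of each partial sum. The index i over input lines and the symbol X
are notation introduced here, not taken from the thread.

```latex
\begin{align*}
% Aggregate each partial sum over the input lines i, weighting by TOTAL
\overline{X}_{\mathrm{agg}}
  &= \frac{\sum_i \mathrm{TOTAL}_i \, \overline{X}_i}{\sum_i \mathrm{TOTAL}_i},
  \qquad X \in \{\mathrm{FBAR}, \mathrm{OBAR}, \mathrm{FOBAR}, \mathrm{FFBAR}, \mathrm{OOBAR}\} \\
% Mean squared error: mean of (f - o)^2 expanded into the partial sums
\mathrm{MSE}
  &= \mathrm{FFBAR}_{\mathrm{agg}} - 2\,\mathrm{FOBAR}_{\mathrm{agg}} + \mathrm{OOBAR}_{\mathrm{agg}} \\
\mathrm{RMSE}
  &= \sqrt{\mathrm{MSE}}
\end{align*}
```

Here FFBAR, FOBAR and OOBAR are the means of f*f, f*o and o*o over the
matched pairs, so the second line is simply the mean of (f - o)^2 written
out term by term.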

On Wed, Nov 1, 2017 at 12:19 PM, Binbin.Zhou at noaa.gov via RT <
met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=82576 >
>
> John,
>
>   Thanks. I'll follow your instructions to test further.
> Another question: I tested both SL1L2 and SAL1L2 line_type jobs.
> I found there are many similar scores in the output of both jobs. For
> example, both have an RMSE score, but their values look a little bit
> different. Which RMSE values should I take? I assume the RMSE in the
> SL1L2 line_type's output should be correct?
>
> Binbin

------------------------------------------------
Subject: Question on Stat-analysis config file
From: Binbin.Zhou at noaa.gov
Time: Wed Nov 01 13:59:03 2017

John,

  Thanks.
  I'll test the S1 score further and let you know if I have any issues.

Binbin

On Wed, Nov 1, 2017 at 2:47 PM, John Halley Gotway via RT <
met_help at ucar.edu> wrote:

> Binbin,
>
> Yes, you should report the RMSE scores derived from the SL1L2 partial
> sums.
>
> We should think some more about exactly what statistics should be
> computed from the SAL1L2 input line.
>
> Perhaps it'd be good to talk to our statistician, Tressa Fowler, when
> she visits NCEP later this year.
>
> Thanks,
> John

------------------------------------------------