[Met_help] [rt.rap.ucar.edu #58586] History for Stat-analysis clarification

John Halley Gotway via RT met_help at ucar.edu
Mon Nov 19 08:26:53 MST 2012


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------


Hi Paul, I am doing RF verification for 60 days using the Point-Stat tool. I have no problem up to this point. I am running the Stat-Analysis tool as described in the User's Guide. I need to arrive at the FINAL values of ME, MAE, RMSE, etc. for all these 60 days, so I have the option of using any of the 3 jobs of Stat-Analysis.

1. I am not clear about the differences between the jobs: summary, aggregate, and aggregate_stat.

1a. If I give the command,
/oprn/model/wrf3/utils/met/METv3.0/bin/stat_analysis -lookin statfiles/ -out sum-full24 -v 4 -job summary -line_type CNT -column ME
it gives me an output (attached) with just 3 lines, as mentioned in the Guide. I want to know whether the MEAN in column 3 is the final value of the MEAN ERROR for all the days. This value is 3.41941. Is column 8 the STDEV over all 54 days?

1b. If I give the command,
/oprn/model/wrf3/utils/met/METv3.0/bin/stat_analysis \
  -job summary -line_type CNT -column ME  \
  -dump_row summary_test.stat \
  -lookin statfiles \
  -fcst_var APCP_24 -fcst_lev A24 -v 2
the output file summary_test.stat is created, but it is the collection of the CNT line types from all the stat files. From there I have extracted the ME for all the days using awk:
bash $ awk '$3==270000 {print $53}' summary_test.stat > out
The average of ME turns out to be 3.419406481.

1c. If I give the command,
/oprn/model/wrf3/utils/met/METv3.0/bin/stat_analysis \
  -job summary -line_type CNT -column RMSE  \
  -dump_row summary_test-rmse.stat \
  -lookin statfiles \
  -fcst_var APCP_24 -fcst_lev A24 -v 2
what is the difference between the first two commands? The 2nd and 3rd commands generate the same output. What is the use of specifying the column (whether I give ME or RMSE, the output is just the same)?
bash-3.2$ diff summary_test.stat summary_test-rmse.stat

2. Problem regarding the aggregate_stat job. I am giving the following command:
/oprn/model/wrf3/utils/met/METv3.0/bin/stat_analysis -lookin statfiles/ -out agst-full24 -v 4 -job aggregate_stat -line_type MPR -out_line_type CNT -out_fcst_thresh gt0.0 -out_obs_thresh gt0.0
The ME in this file is 3.42134. (This value is different from the output produced by the commands specified in 1a and 1b.)

Finally, I want to ask: ME and the other quantities can be generated in MANY ways. What is each job doing, and what should I do?
geeta

----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Re: [rt.rap.ucar.edu #58586] Stat-analysis clarification
From: John Halley Gotway
Time: Thu Oct 04 14:04:15 2012

Hello Geeta,

This is John Halley Gotway.  It looks like you have some questions about
how to use the STAT-Analysis tool to summarize your output.  I'd be happy
to walk through a few suggestions for you.

Let me start with a brief overview of the 3 job types you mention:

(1) summary: A summary job selects out a single column of data from one
line type of your MET output and then gives you summary information about
the numbers it found in that column of data.

(2) aggregate: The aggregate job selects out all the lines of the type
you requested and then aggregates their contents together into a single
line.  The output line type of the aggregate job is always the same as the
input line type, but only certain line types can actually be aggregated -
for example contingency table counts (CTC), partial sums (SL1L2), and so
on.

(3) aggregate_stat: The aggregate_stat job does the same thing as the
aggregate job described above, but rather than writing out the same line
type as the input one, it writes out a different one.  This functionality
only works for certain combinations of line types.  For example, CTC lines
can be aggregated together and from them we can derive contingency table
statistics (CTS).  Likewise, we can use SL1L2 lines to derive continuous
statistics (CNT).
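
For example (reusing your path and line types purely for illustration,
with no filtering options added), the three job types look like this:

   /oprn/model/wrf3/utils/met/METv3.0/bin/stat_analysis -lookin statfiles/ \
     -v 2 -job summary -line_type CNT -column ME

   /oprn/model/wrf3/utils/met/METv3.0/bin/stat_analysis -lookin statfiles/ \
     -v 2 -job aggregate -line_type SL1L2

   /oprn/model/wrf3/utils/met/METv3.0/bin/stat_analysis -lookin statfiles/ \
     -v 2 -job aggregate_stat -line_type SL1L2 -out_line_type CNT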

It sounds like you're interested in "overall" statistics for your full
time period.  Whether you decide to run a "summary" job or an
"aggregate_stat" job depends on the question you're trying to answer.

Generally though, to determine overall performance, it's better to
aggregate the daily CTC lines or SL1L2 lines together and then derive
statistics from the aggregated values.  So it's generally advised to run
an "aggregate_stat" job.

It sounds like you also are confused about what constitutes the "output"
from STAT-Analysis.  When you run a STAT-Analysis job, its output is by
default written to the screen.  You can redirect this output to a file
using the "-out" command line option.  It sounds like you're confused by
the "-dump_row" option.  Each STAT-Analysis job does some filtering of
.stat data using the filtering criteria you define.  The "-dump_row"
command line option may be used to redirect the output of that filtering
to a file.  This is good practice when getting started with STAT-Analysis.
This enables you to decide on some filtering criteria, run a STAT-Analysis
job, and then inspect the -dump_row output file to make sure that the
filtering options worked exactly how you expected they would.

But the -dump_row output is NOT the real output of STAT-Analysis.  As I
said, its real output is either written to the screen or the file you
specified using "-out".
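
For instance, if you add an "-out" option to your first summary job (the
file names below are just examples), the two outputs become easy to tell
apart: the -dump_row file will contain the CNT lines that passed the
filters, while the -out file will contain the one summary line the job
actually computed.

   /oprn/model/wrf3/utils/met/METv3.0/bin/stat_analysis \
     -lookin statfiles \
     -job summary -line_type CNT -column ME \
     -fcst_var APCP_24 -fcst_lev A24 -v 2 \
     -dump_row summary_me_filtered.stat \
     -out summary_me.out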

By running both the summary and aggregate_stat job on your data, you've
demonstrated a very important fact.  Taking the mean of daily ME values
(summary job) does not yield the same result as the aggregated ME value
(aggregate_stat job).  As I mentioned, the latter is generally preferred,
but both pieces of information may be useful depending on the type of
question you're trying to answer.  The first is an average of daily
performance while the second is performance aggregated over the entire
time period.

Hopefully that clarifies things.  If not, just let me know what additional
questions you have.

Thanks,
John


------------------------------------------------
Subject: Stat-analysis clarification
From: Geeta Geeta
Time: Fri Oct 05 01:54:40 2012


Hi John, thanks for your prompt and informative reply. But I have a few
more questions.

1. Difference between the summary and aggregate jobs: I figured out that
the SUMMARY is the mean value of the daily ME. (I used the summary job
with a -dump_row filter and then used awk on the 53rd column.) So that is
clear now.

2. Coming to the aggregate job, your statement was that the "aggregate job
is the aggregate over the entire time period". Kindly explain a bit more.
The difference between summary and aggregate is NOT clear.

3. I understood the difference between the aggregate and aggregate_stat
jobs.

4. I have this doubt. I have the Point-Stat output for the RF threshold
gt0.0 in my stat directory (about 60 files). Now, as an example, I wish to
find the CTC and the resulting CTS output for this threshold (gt0.0). So I
gave the command:

/oprn/model/wrf3/utils/met/METv3.0/bin/stat_analysis -lookin statfiles/ \
  -out agst-full24 -v 4 -job aggregate_stat -line_type MPR \
  -out_line_type CTC -out_fcst_thresh gt0.0 -out_obs_thresh gt0.0

The output is attached. It says (A=7450 C=7250 B=411 D=1069). Another way
of getting the results is by using the aggregate job:

/oprn/model/wrf3/utils/met/METv3.0/bin/stat_analysis -lookin statfiles/ \
  -out agst-full24-1 -v 4 -job aggregate -line_type CTC

The output (agst-full24-1, attached) is different from the earlier one:
(A=7471 C=7341 B=390 D=978). I expected the results to be the same!
Please clarify.
geeta

------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #58586] Stat-analysis clarification
From: John Halley Gotway
Time: Fri Oct 05 08:53:03 2012

Geeta,

Question...
2. Coming to the aggregate job, your statement was that the "aggregate
job is the aggregate over the entire time period".  Kindly explain a bit
more.  The difference between summary and aggregate is NOT clear.

Answer...
Each time you run Point-Stat, your output includes CNT lines and SL1L2
lines.  The SL1L2 lines contain "partial sums" (see page 4-24 of the
user's guide for more info:
www.dtcenter.org/met/users/docs/users_guide/MET_Users_Guide_v4.0.1.pdf).
The aggregate_stat job combines multiple SL1L2 lines into a single
one.  That combination is done as a weighted average of
the input lines, where the number of points ("TOTAL" column) is used
as the weight.  The summary job computes an unweighted mean of the
statistics from the CNT lines.  So that's the source of the
difference - an unweighted mean in the summary job and a weighted mean
in the aggregate job.  If the contents of the "TOTAL" column were
identical across all the SL1L2 lines, then the output from both
jobs should match.  However, "TOTAL" generally does not stay the same
from day to day when verifying against point observations.
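
To make that difference concrete, here is a rough awk sketch that computes
both quantities from the CNT lines in a -dump_row file.  The field numbers
are only assumptions (ME in field 53, as in your own awk command, and
TOTAL in the first field after the 21 header columns) - please confirm
them against the header line of your dump file, and skip the header line
if one is present, before relying on the numbers:

   ME_COL=53    # assumed field holding ME in the dumped CNT lines
   TOT_COL=22   # assumed field holding TOTAL in the dumped CNT lines
   awk -v me=$ME_COL -v tot=$TOT_COL '
     { usum += $me; n++; wsum += $me * $tot; wtot += $tot }
     END { printf "unweighted mean ME = %g\nweighted mean ME   = %g\n",
           usum/n, wsum/wtot }' summary_test.stat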

So this is doing an aggregate_stat job where we convert SL1L2 lines to
CNT lines.  If you're also storing the matched pair (MPR) lines, you
could do MPR -> CNT instead.  The output of those two job types should
be identical - subject to rounding and precision errors.
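
For example, these two jobs (with whatever -fcst_var/-fcst_lev filtering
you are using applied identically to both) should produce matching CNT
lines:

   /oprn/model/wrf3/utils/met/METv3.0/bin/stat_analysis -lookin statfiles/ \
     -v 2 -job aggregate_stat -line_type SL1L2 -out_line_type CNT

   /oprn/model/wrf3/utils/met/METv3.0/bin/stat_analysis -lookin statfiles/ \
     -v 2 -job aggregate_stat -line_type MPR -out_line_type CNT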

Question...
I expected the results to be the same!  Please clarify.

Answer...
Yes I agree, the output of those jobs should be the same!  And I see
that the total columns are both 16180.  Is it possible that there's a
problem with the thresholds somewhere?  Are you sure that the
CTC lines you're aggregating in the second job are using >0 for both
the forecast and observation thresholds?

To check this, you could try running:
   /oprn/model/wrf3/utils/met/METv3.0/bin/stat_analysis \
     -lookin statfiles/ -out agst-full24-1 -v 4 \
     -job aggregate -line_type CTC \
     -fcst_thresh '>0.000' -obs_thresh '>0.000' \
     -dump_row agst-full24-1.dump

And then see if you get the same results.
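
One quick way to inspect the thresholds present in the dump file is shown
below; the field numbers are an assumption for the standard 21-column
.stat header, so check them against the header line of the dump first:

   head -1 agst-full24-1.dump
   awk '{ print $17, $18 }' agst-full24-1.dump | sort | uniq -c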

If the thresholds really are not the problem, the next step would be
for you to send me some sample data that illustrates the issue.  I
could run it here and try to figure out what's going on.  I'd
need to know what version of MET you're running plus the date of the
latest set of patches you're using.  And you could send me data
following these instructions:
    http://www.dtcenter.org/met/users/support/met_help.php#ftp

Thanks,
John


------------------------------------------------
Subject: Stat-analysis clarification
From: Geeta Geeta
Time: Wed Oct 10 05:24:40 2012


Hi, I am also sending you the dump file. kindly have a look
geeta
 From: geeta124 at hotmail.com
To: met_help at ucar.edu
Subject: RE: [rt.rap.ucar.edu #58586] Stat-analysis clarification
Date: Wed, 10 Oct 2012 16:52:10 +0530

Hi John,
I have tried the command that you suggested:

/oprn/model/wrf3/utils/met/METv3.0/bin/stat_analysis \
     -lookin statfiles/ -out agst-full24-1 -v 4 \
     -job aggregate -line_type CTC -fcst_thresh '>0.000' -obs_thresh '>0.000' \
     -dump_row agst-full24-1.dump

This .dump file has individual CTC results for 54 days.
So I have to calculate the final A, B, C and D.
I am sending you my stat files.
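
For a rough check of the final counts, I could sum the dump file with awk
like this (the field numbers are only my guess for the standard .stat
column layout - TOTAL, FY_OY, FY_ON, FN_OY, FN_ON after the 21 header
columns - so I will confirm them against the header line first):

   awk '{ fy_oy += $23; fy_on += $24; fn_oy += $25; fn_on += $26 }
        END { print "FY_OY=" fy_oy, "FY_ON=" fy_on, "FN_OY=" fn_oy, "FN_ON=" fn_on }' \
       agst-full24-1.dump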

geeta


------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #58586] Stat-analysis clarification
From: John Halley Gotway
Time: Wed Oct 10 11:27:38 2012

Geeta,

What is the latest set of patches you're using for METv3.0?  Here's
the list of known issues with that release:
    http://www.dtcenter.org/met/users/support/known_issues/METv3.0/index.php

There was one posted on 07/15/2011 that may have to do with the issue
you're seeing:
    07/15/2011: Precision error bugfix for thresholding data using
strict inequalities, like < and >, when the value and the threshold
are exactly equal.

If you're not using the latest set of patches, please update MET and
try rerunning.

Of course, you could also consider updating to METv4.0 with its latest
set of patches.

Thanks,
John


------------------------------------------------

