[Met_help] [rt.rap.ucar.edu #52817] History for Advice on stat_analysis

Wed Feb 1 08:40:14 MST 2012

----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Hi,

I'm looking for a bit of advice on generating aggregate error statistics. 

I have forecasts over a series of time periods for a series of models. I have compared each forecast to observations using point_stat, and have the mpr and the stat files. I am now trying to aggregate error statistics by lead time for each model, for example so that I can plot rmse vs lead time for different models.

I'm trying to do this via a stat_analysis job, but as far as I can tell, I need to run a separate job for every lead time and each model, and then manually combine the output. Is there a way to do this in one step? I was hoping I could feed all of the mpr files to stat_analysis in one go, and get a tabular output similar to the stat or mpr files, with the model and lead time listed in the output.

Is there a way to do this, or do I just need to get on and write a script?

Thanks,

Sam.

----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Re: [rt.rap.ucar.edu #52817] Advice on stat_analysis
From: Paul Oldenburg
Time: Thu Jan 19 08:34:54 2012

Sam,

I think the type of analysis that you are interested in will require
multiple calls to stat_analysis.  If you intend to
plot aggregated verification statistics over several models and
several lead times, each call will calculate statistics
for a single model and lead time combination (i.e. a single dot on
your plot).  The good news is that stat_analysis will
calculate many statistics in a single job.

I think that the type of stat_analysis job that you should consider
using is aggregate_stat with the -line_type MPR
argument, which tells stat_analysis to read matched pairs.  If you have
any questions about this process, please let me
know.  Good luck and happy scripting.
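
As a rough illustration of such a job, the Python sketch below assembles the
command line for one model and lead time combination. It assumes the job is
given entirely on the command line (no config file), that CNT is the desired
output line type (the continuous statistics line, which includes RMSE), and
that lead times are written as HHMMSS. The model name, lead time, and
directory are placeholders, not values from this ticket.

```python
import shlex


def build_job(model, lead, lookin_dir, dump_row=None):
    """Return the argument list for one stat_analysis call.

    One call covers a single model / lead time combination,
    i.e. a single point on an RMSE-vs-lead-time plot.
    """
    args = [
        "stat_analysis",
        "-lookin", lookin_dir,        # directory tree holding the .stat files
        "-job", "aggregate_stat",
        "-line_type", "MPR",          # read matched pair lines
        "-out_line_type", "CNT",      # assumption: continuous statistics wanted
        "-model", model,
        "-fcst_lead", lead,
    ]
    if dump_row:
        # Optionally write the rows used by the job to a file for checking.
        args += ["-dump_row", dump_row]
    return args


# Placeholder values; substitute your own model name, lead time, and path.
cmd = build_job("MODEL_A", "120000", "/path/to/point_stat_output")
print(shlex.join(cmd))
```

The list returned by build_job can be passed to subprocess.run on a machine
where stat_analysis is installed; looping over models and lead times then
gives one call per point on the plot.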

Paul

On 01/19/2012 03:33 AM, sam.hawkins at vattenfall.com via RT wrote:
>
> Thu Jan 19 03:33:35 2012: Request 52817 was acted upon.
> Transaction: Ticket created by sam.hawkins at vattenfall.com
>         Queue: met_help
>       Subject: Advice on stat_analysis
>         Owner: Nobody
>    Requestors: sam.hawkins at vattenfall.com
>        Status: new
>   Ticket<URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=52817>

------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #52817] Advice on stat_analysis
From: Paul Oldenburg
Time: Thu Jan 19 08:50:33 2012

Sam,

I have been reminded by my colleague that you could define multiple
jobs in the stat_analysis config file.  Thus,
depending on the type of analysis that you want to perform, you could
create all the statistics for your plot in a
single stat_analysis call.  Furthermore, you can pass command line
parameters to stat_analysis which override/append the
settings that you specify in the config file jobs.  This approach is
definitely more complicated, so I would suggest
starting by running a few simple jobs and then building them up in the
config file.
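
A minimal sketch of building up such a job list, assuming one aggregate_stat
job per model and lead time with the filters written into each job string.
The model names and lead times are hypothetical, and the exact syntax of the
jobs entry in the config file differs between MET versions, so the sketch
only prints one quoted job string per line for pasting into that entry.

```python
from itertools import product

# Hypothetical model names and lead times (hours); replace with your own.
MODELS = ["MODEL_A", "MODEL_B"]
LEADS = ["06", "12", "18", "24"]


def job_lines(models, leads):
    """Build one aggregate_stat job string per (model, lead) combination."""
    jobs = []
    for model, lead in product(models, leads):
        jobs.append(
            "-job aggregate_stat -line_type MPR -out_line_type CNT "
            f"-model {model} -fcst_lead {lead} "
            # A separate dump_row file per job makes the inputs easy to check.
            f"-dump_row dump_{model}_f{lead}.stat"
        )
    return jobs


jobs = job_lines(MODELS, LEADS)
for job in jobs:
    print(f'"{job}",')
```

Because each job string carries its own -model and -fcst_lead filters, the
order of the printed jobs also records which model and lead time each block
of output belongs to.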

Also, another important point that I failed to mention was the use of
the -dump_row option.  Using this will direct
stat_analysis to write all the input matched pairs into a file for you
to check and verify that it is actually
aggregating all the data that you intended.
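
One way to do that check is sketched below: count the dumped rows per model
and lead time. The sketch assumes the dump file is whitespace-delimited and
begins with a header row that names the MODEL and FCST_LEAD columns; the
sample lines are shortened, made-up stand-ins for real rows.

```python
from collections import Counter


def count_rows(lines):
    """Count data rows per (MODEL, FCST_LEAD) in whitespace-delimited text.

    The first non-empty line is treated as the header, and the two
    columns are located by name rather than by fixed position.
    """
    rows = [line.split() for line in lines if line.strip()]
    header, data = rows[0], rows[1:]
    i_model = header.index("MODEL")
    i_lead = header.index("FCST_LEAD")
    return Counter((row[i_model], row[i_lead]) for row in data)


# Shortened, hypothetical sample; a real file has many more columns.
sample = [
    "VERSION MODEL   FCST_LEAD LINE_TYPE",
    "V0      MODEL_A 120000    MPR",
    "V0      MODEL_A 120000    MPR",
    "V0      MODEL_A 240000    MPR",
]

counts = count_rows(sample)
for (model, lead), n in sorted(counts.items()):
    print(model, lead, n)
```

If a job was meant to cover a single model and lead time, its dump file
should yield exactly one key here; for a real file, pass the open file
object to count_rows instead of the sample list.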

If you have any questions, please let me know.

Thanks,

Paul

------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #52817] Advice on stat_analysis
From: sam.hawkins at vattenfall.com
Time: Thu Jan 19 10:20:12 2012

Paul,

Thanks for your advice. I've tried both approaches. When I specify
-fcst_lead on the command line, it doesn't seem to change anything,
and all the lead times get aggregated.

Putting multiple jobs into the config file works. It would make the
output easier to tabulate if the specified options appeared in the
same row as the statistics, but I guess that may break the output for
other people.

One minor point, I have noticed that stat_analysis only seems to
recognise .stat files, so if you want to use mpr files as input you
must rename them first.

Best regards,

Sam.

------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #52817] Advice on stat_analysis
From: Paul Oldenburg
Time: Thu Jan 19 14:50:08 2012

Sam,

I apologize for the misinformation.  After some testing and
documentation research, we determined that the behavior that
I described in which command line filtering parameters override config
file filtering parameters is actually not
supported.  We will consider adding this functionality to a future
release.

Also, stat_analysis was designed to only read .stat files in the
directory tree specified by the -lookin parameter, so
this is known behavior.  Please let me know if you have any other
questions.
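
For the renaming step mentioned above, a small sketch that copies matched
pair text files into a separate directory under a .stat extension, so that
-lookin can point at that directory. The "_mpr.txt" suffix and the
directory names are assumptions about the local file layout, and files with
the same name in different subdirectories would overwrite each other.

```python
import shutil
import tempfile
from pathlib import Path


def stage_mpr_files(src_dir, dst_dir, suffix="_mpr.txt"):
    """Copy every file ending in `suffix` to dst_dir with a .stat extension."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    staged = []
    for path in sorted(src.rglob(f"*{suffix}")):
        # e.g. example_mpr.txt -> example_mpr.stat
        target = dst / (path.name[: -len(suffix)] + "_mpr.stat")
        shutil.copy2(path, target)
        staged.append(target)
    return staged


# Self-contained demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "point_stat_out"
    src.mkdir()
    (src / "point_stat_example_mpr.txt").write_text("header\n")
    staged = stage_mpr_files(src, Path(tmp) / "stat_in")
    staged_names = [p.name for p in staged]

print(staged_names)
```

Copying rather than renaming leaves the original point_stat output
untouched.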

Paul

------------------------------------------------