[Met_help] [rt.rap.ucar.edu #57770] History for MET v4.0

Tressa Fowler via RT met_help at ucar.edu
Fri Aug 17 14:57:39 MDT 2012


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Three years ago, around the time of MET v2.0, I took a close look at 
using MET for my model evaluation.  After compiling the package, and 
trying some of the features, I learned that MET was not applicable to my 
evaluations.  I was able to confirm this with a message to met_help.  I 
am running month-long climate simulations and comparing the WRF output 
to a single observation point for the month.  From the comparisons of 
the WRF values to the observations, I calculate correlation, bias, and 
RMSE.  Temperature at 2 m, surface pressure, downward longwave radiation,
and downward shortwave radiation are examples of WRF variables that are
statistically compared against observations.  I have attached a plot
showing this type of comparison and the resulting statistical values (I
am not interested in the plots so much as the statistical results).  Has
anything like this been added to MET v4.0 since v2.0 to provide this
type of model evaluation?

Thanks

Mark Seefeldt


----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Re: [rt.rap.ucar.edu #57770] MET v4.0
From: John Halley Gotway
Time: Thu Aug 09 08:16:03 2012

Mark,

Just to make sure I understand your situation - the plot you sent
shows a month-long time series of forecast and observation values at a
single station.  Using those matched pairs, you've computed a
handful of statistics, like rmse, bias, and correlation.  Assuming
that's all correct, let me lay out how MET might be helpful in this
type of evaluation:

- First, format the point observations into the NetCDF format that MET
expects.  If you're using PREPBUFR observations, use the PB2NC tool.
If you're using NetCDF MADIS observations, use the MADIS2NC
tool, and if they're in ASCII format, use the ASCII2NC tool.
- Next, make sure you've post-processed your WRF output - we suggest
using the Unified Post Processor (UPP).  The output of UPP is in GRIB format.
- Now for each verification time, run Point-Stat to compare your
forecast fields to the point observations.  Configure Point-Stat to
dump out the matched pair (MPR) line type.  You may also choose to
configure Point-Stat to use multiple interpolation methods when
interpolating the forecast data to the observation location (or just
choose the single one you prefer).
- So now you have many output files from many runs of Point-Stat.  The
MPR output line from Point-Stat contains the observation value and
interpolated forecast value.  Next, you can run that data
through the STAT-Analysis tool to compute aggregated statistics over
the whole month.  The output will include confidence intervals.  The
STAT-Analysis job would look something like this:

    stat_analysis -lookin point_stat/out -job aggregate_stat \
        -line_type MPR -out_line_type CNT -fcst_var TMP -fcst_lev Z2 \
        -interp_mthd BILIN -column_str OBS_SID BSRN

This job tells STAT-Analysis to read the input files, filter the
matched pair lines down to 2-meter TMP values computed using the
bilinear interpolation method at the BSRN station only, and then
compute continuous statistics from those pairs.  In addition, you
could pass in the "-vif_flag true" option to account for lag-1
autocorrelation.

Now the last step is the plotting of the data.  We have been
developing a database/display tool for the output of MET called
METViewer.  We've been using it extensively in many of the DTC testing
and evaluation projects, but we have not done a public release of it,
lacking funding to support its use by the broader community.  There is
definitely some overhead and a learning curve to getting METViewer set
up and working well.  For a small analysis like this, it wouldn't be
worth it.  But it's *VERY* useful for bigger evaluations handling
large amounts of verification data and looking at many different plot
types.

For a time-series at a single location, I'd probably just write an R-
script (or IDL or Matlab) to do it.
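
For example, a bare-bones R sketch for that single-station case might
look something like the following.  The numbers here are just
placeholders; in practice the fcst and obs vectors would be pulled
from the FCST and OBS columns of the Point-Stat MPR lines.

    # Bare-bones R sketch: bias, RMSE, and correlation for one station's
    # month of matched pairs.  Placeholder values below; in practice,
    # fcst and obs come from the FCST and OBS columns of the MPR lines.
    set.seed(1)
    obs  <- 270 + 5 * sin(seq(0, 2 * pi, length.out = 124))    # 2-m T obs
    fcst <- obs + rnorm(length(obs), mean = 0.8, sd = 1.5)     # WRF values

    bias <- mean(fcst - obs)              # mean error
    rmse <- sqrt(mean((fcst - obs)^2))    # root mean squared error
    corr <- cor(fcst, obs)                # Pearson correlation

    cat(sprintf("BIAS = %.3f  RMSE = %.3f  CORR = %.3f\n", bias, rmse, corr))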

So there are a lot of little steps in the process.  But the intention
would be to script it up so that it can easily be run thousands of
times.
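
As a very rough sketch of what such a driver script might look like
(here in R, calling the MET tools through system()), with all file
names, the date range, and the config file as hypothetical
placeholders - the actual command-line arguments should follow the MET
User's Guide:

    # Hypothetical R driver (all paths, file names, and the date range
    # are placeholders): run Point-Stat for each verification time, then
    # one STAT-Analysis job to aggregate the month of MPR lines.
    valid_times <- format(seq(as.POSIXct("2012-07-01 00:00", tz = "UTC"),
                              as.POSIXct("2012-07-31 18:00", tz = "UTC"),
                              by = "6 hours"),
                          "%Y%m%d_%H%M%S")

    for (vt in valid_times) {
      system(paste("point_stat",
                   sprintf("wrfprs_%s.grb", vt),   # UPP GRIB output
                   sprintf("obs_%s.nc", vt),       # PB2NC/ASCII2NC output
                   "PointStatConfig",
                   "-outdir point_stat/out"))
    }

    system(paste("stat_analysis -lookin point_stat/out",
                 "-job aggregate_stat -line_type MPR -out_line_type CNT",
                 "-fcst_var TMP -fcst_lev Z2 -interp_mthd BILIN",
                 "-column_str OBS_SID BSRN -vif_flag true"))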

Hope that helps explain it.

Thanks,
John Halley Gotway
met_help at ucar.edu




------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #57770] MET v4.0
From: Mark Seefeldt
Time: Thu Aug 09 15:17:26 2012

John,

Thank you for your very helpful tips.  I will take a closer look at
this to see if I can make it work with my project.  In total, I will
be running 128 simulations for each of two sites, or a total of 256
simulations with varying WRF physics combinations.  So I will have
256 statistical measures for each type of measure/observation (e.g.,
2-m T).  Are there any tools available to then compare those 256
results amongst themselves with confidence intervals?  The goal is to
have a performance ranking of each WRF physics configuration.  Not a
problem if not; I anticipate this request is beyond the scope of MET.

I agree with you that creating the plots of that data is more
straightforward in IDL/NCL/MatLab.

Mark


------------------------------------------------
Subject: MET v4.0
From: John Halley Gotway
Time: Fri Aug 10 11:31:03 2012

Mark,

Good question.  It sounds like you basically want to do model
inter-comparisons to determine if one configuration outperforms
another in a statistically significant way.  The answer to your
question is yes and no.

In METViewer, we have the ability to compute bootstrap confidence
intervals on pair-wise differences between models.  Then we look to
see whether the confidence interval around the difference includes
the 0-line.  If it does, the difference is not statistically
significant.  If it does not include 0, the difference is significant.
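
As a rough illustration of that idea (this is not the METViewer code,
and the scores below are made up), the bootstrap-on-paired-differences
calculation can be written in a few lines of R:

    # Rough illustration: bootstrap a confidence interval on the paired
    # difference of a score between two configurations and check
    # whether it covers zero.  Placeholder data only.
    set.seed(1)
    score_a <- rnorm(100, mean = 0.42, sd = 0.05)  # e.g., GSS by case, config A
    score_b <- rnorm(100, mean = 0.40, sd = 0.05)  # e.g., GSS by case, config B

    diffs      <- score_a - score_b
    boot_means <- replicate(1000, mean(sample(diffs, replace = TRUE)))
    ci         <- quantile(boot_means, c(0.025, 0.975))

    cat(sprintf("95%% CI on the A-B difference: [%.4f, %.4f]\n", ci[1], ci[2]))
    # If this interval includes 0, the difference is not significant.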

We use METViewer on many of our internal DTC projects to do this type
of analysis.  I've attached a sample plot showing this.  The plot
compares the Gilbert Skill Score (aka ETS) computed over the full
CONUS for 2 different model configurations.  For a whole year, we're
looking at the GSS at the 36-hour lead time for all the 00Z
initializations, plotted for multiple threshold values (x-axis).  The
red and blue lines are the aggregated GSS scores with confidence
intervals by threshold.  The green line contains the pair-wise
differences.  For some thresholds, one configuration is better, and
for others the performance is reversed.

However, in MET itself there is currently no capability for computing
pair-wise differences between models.  Enhancing STAT-Analysis with
this functionality is on our list of development tasks, but it just
hasn't been addressed yet.

Hope that helps clarify.

John




------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #57770] MET v4.0
From: Mark Seefeldt
Time: Fri Aug 17 11:32:00 2012

John,

Thank you for another detailed response on ways that I can
potentially use MET for my WRF configuration evaluation.  Out of
curiosity, since you know the statistical analysis world better than I
do, are there any software applications out there that would meet the
needs of my project?

Reminder:
1 - Calculate the performance of WRF against a single point
observation, likely using bias, RMSE, and correlation.
2 - Rank the performances of a large number of WRF configurations
based on the results of (1) for several different measurements (e.g.,
T_2m, P_sfc, SW_d).

No big deal if you don't have any suggestions.  We have entered the
world beyond the scope of met_help.

Thanks

Mark

------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #57770] MET v4.0
From: John Halley Gotway
Time: Fri Aug 17 12:11:44 2012

Mark,

You might look at the verification package within R.  But that's the
only suggestion I have.

So I've reassigned your ticket to Tressa Fowler - our resident
statistician.  Perhaps she can offer more insight.

Thanks,
John



------------------------------------------------
Subject: MET v4.0
From: Tressa Fowler
Time: Fri Aug 17 13:33:38 2012

Hi Mark,

I will second John's recommendation to use the R software with the
verification package. Additionally, it sounds like you will have two
time series, one for each location, for each of your variables. In
this case, I would recommend incorporating time series methods into
your analysis. In particular, you can estimate the autocorrelation in
your data to get more accurate estimates of confidence intervals.
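
As a sketch of one way to do that (assuming an AR(1)-like error series
and using placeholder data; this is not a prescription), you can
estimate the lag-1 autocorrelation and use it to compute an effective
sample size before building the interval:

    # Sketch only: estimate lag-1 autocorrelation of the error series
    # and use it for an effective sample size in the confidence interval
    # on the mean error.  "err" is placeholder data; in practice it is
    # the forecast-minus-observation time series.
    set.seed(1)
    err <- as.numeric(arima.sim(list(ar = 0.6), n = 31)) + 0.5

    r1    <- acf(err, lag.max = 1, plot = FALSE)$acf[2]  # lag-1 autocorrelation
    n     <- length(err)
    n_eff <- n * (1 - r1) / (1 + r1)                     # effective sample size

    half_width <- qt(0.975, df = n_eff - 1) * sd(err) / sqrt(n_eff)
    cat(sprintf("mean error = %.3f, 95%% CI half-width = %.3f (n_eff = %.1f)\n",
                mean(err), half_width, n_eff))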

Please let us know if you have further questions.

Tressa

------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #57770] MET v4.0
From: Mark Seefeldt
Time: Fri Aug 17 14:36:43 2012

Thanks, Tressa and John, for your suggestions.  I will look into using
R for this project.  I'll ask around with the math/stats department on
my campus (I am at Providence College during the academic year) and
see if anybody can give me some pointers.

Mark


------------------------------------------------


More information about the Met_help mailing list