[Met_help] [rt.rap.ucar.edu #59656] History for Fwd: point_stat configuration : issues & questions

John Halley Gotway via RT met_help at ucar.edu
Mon Feb 4 09:06:05 MST 2013


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

with attached file...

-------- Original message --------
Subject: 	point_stat configuration : issues & questions
Date: 	Tue, 18 Dec 2012 11:28:11 +0100
From: 	Remi Montroty <remi.montroty at mfi.fr>
To: 	met_help at ucar.edu



Hello,

I've recently compiled METv4.0 on a CentOS 6.2 cluster. I'm trying to
verify a model 2-meter temperature output against SYNOP data. It all
went well up to the pairing of observation and model points.

I've tried various PointStatConfig settings but I'm still not getting
any pairs. (I've tried R011 and Z002 as levels.)

Here is my temperature definition (as per wgrib -V):

rec 3:357568:date 2012121700 TMP kpds5=11 kpds6=105 kpds7=2 levels=(0,2) 
grid=255 2 m above gnd 24hr fcst: bitmap: 962 undef
   TMP=Temp. [K]
   timerange 0 P1 24 P2 0 TimeU 1  nx 481 ny 497 GDS grid 0 num_in_ave 0 
missing 0
   center 14 subcenter 0 process 125 Table 2 scan: WE:SN winds(N/S)
   latlon: lat  -15.000000 to 16.000000 by 0.062000  nxny 239057
           long 23.000000 to 53.000000 by 0.062000, (481 x 497) scan 64 
mode 128 bdsgrid 1
   min/max data 274 303  num bits 5  BDS_Ref 274  DecScale 0 BinScale 0

I've attached a tar file containing all the files I'm using to do the
following:

ascii2nc SYNOP.20121218000000.0H.rdb.ascii2nc SYNOP.20121218000000.0H.rdb.nc
point_stat forecastfile.grb SYNOP.20121218000000.0H.rdb.nc 
PointStatConfig.rems2


Maybe it is a level issue on the observation side, but I'm confused:
page 3-13 of the MET Users Guide gives:
- column 6 = elevation in msl of observing location
- column 8 = pressure level in hPa of observation value
- column 9 = height in msl of observation value

So how do I pass a value for height = 2 meters?
Is column 6 used in computing something?


Thanks






----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Fwd: point_stat configuration : issues & questions
From: John Halley Gotway
Time: Tue Dec 18 08:47:40 2012

Remi,

Thanks for sending your data.  That made it much easier for me to
debug.  Newer versions of MET, like METv4.0, support multiple levels
of logging.  You can always try turning up the logging level with
the '-v' option to look for additional information.  Setting it to 3
reveals the problem in this case:

DEBUG 2: Processing TMP/Z2 versus TMP/Z2, for observation type ADPSFC,
over region FULL, for interpolation method UW_MEAN(1), using 0 pairs.
DEBUG 3: Number of matched pairs  = 0
DEBUG 3: Observations processed   = 50
DEBUG 3: Rejected: GRIB code      = 0
DEBUG 3: Rejected: valid time     = 0
DEBUG 3: Rejected: bad obs value  = 0
DEBUG 3: Rejected: off the grid   = 50
DEBUG 3: Rejected: level mismatch = 0
DEBUG 3: Rejected: message type   = 0
DEBUG 3: Rejected: masking region = 0
DEBUG 3: Rejected: bad fcst value = 0

For debug level 3, Point-Stat writes out a count of the number of
observations processed and the reasons why they were rejected.  In
your case, Point-Stat thinks that none of the observations fall
inside your grid.  The next step is to figure out where your forecast
domain and these observations reside...

To plot your forecast domain, try this:
    METv4.0/bin/plot_data_plane forecastfile.grb forecastfile.ps
'name="TMP"; level="Z2";'

The output image is attached to this message.  Looks like it's over
eastern Africa.

To plot your point observations, try this:
    METv4.0/bin/plot_point_obs SYNOP.20121218000000.0H.rdb.nc
SYNOP.20121218000000.0H.rdb.ps

The output image is attached to this message.  Should be 50 red dots
somewhere, but I don't see them.  So I reran with the -v 3 option to
dump out the lat/lon values that are being plotted.  Looks
like they're all right around 0 lat and 0 lon.  It's very difficult to
see, but if you look along the left edge of the plot, right at 0/0,
there's a small red dot.

So I would say that the problem is the lat/lon values used in the
point observation file:
    SYNOP.20121218000000.0H.rdb.ascii2nc

Columns 4 and 5 of that file should contain latitude in degrees north
and longitude in degrees east.  Looks like there's an error in those
values.
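For reference, a single line of the 10-column ascii2nc input format
described on that page of the user's guide might look like this (the
station ID, coordinates, and value below are made up for illustration):

```
ADPSFC  61291  20121218_000000  13.48  2.27  367  11  NA  2  298.4
```

Here columns 4 and 5 are the latitude and longitude in degrees north
and east, column 6 (367) is the station elevation, column 8 is NA since
this is not a pressure-level observation, and column 9 carries the
height (here 2, to match the Z2 forecast level).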

Hope that helps.

Thanks,
John Halley Gotway
met_help at ucar.edu


On 12/18/2012 03:29 AM, Remi Montroty via RT wrote:
>
> Tue Dec 18 03:29:19 2012: Request 59656 was acted upon.
> Transaction: Ticket created by remi.montroty at mfi.fr
>         Queue: met_help
>       Subject: Fwd: point_stat configuration : issues & questions
>         Owner: Nobody
>    Requestors: remi.montroty at mfi.fr
>        Status: new
>   Ticket <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=59656 >
>
>

------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #59656] Fwd: point_stat configuration : issues & questions
From: Remi Montroty
Time: Tue Dec 18 09:27:47 2012

Dear John,

Thanks for the quick response! I know it goes faster with data; glad it
helped.

Indeed, I was told the lat/lon values in the original files were scaled
by a factor of 100 000... which was actually 100. I'm glad I understood
that Z2 was the appropriate level.
It is now working (as in producing output, using 37 pairs). Thank you!
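(For anyone hitting the same problem: a rescaling fix of that sort can
be sketched in one line of awk. The sample line and the column
positions below, lat in column 4 and lon in column 5, assume the
10-column ascii2nc format; the values are made up.)

```shell
# A made-up ascii2nc input line with lat/lon mistakenly scaled by 100:
printf 'ADPSFC 61291 20121218_000000 1348 227 367 11 NA 2 298.4\n' > in.ascii2nc
# Divide columns 4 (lat) and 5 (lon) by 100 to recover degrees:
awk '{ $4 = $4 / 100; $5 = $5 / 100; print }' in.ascii2nc > fixed.ascii2nc
cat fixed.ascii2nc   # -> ADPSFC 61291 20121218_000000 13.48 2.27 367 11 NA 2 298.4
```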

My understanding of the output files is that we are comparing forecasts
and observations against a threshold (here >273). That is not quite
what I meant to do.

What I'd like is to build time series of model errors in terms of bias
and RMS error...

How would one go about computing the bias and RMSE of the model with
respect to observations?
How does one build a statistically significant sample (say,
verification of 24-h forecasts over one month, compared against
matching SYNOP data)?
Is there a set of graphical packages (NCL or others) dedicated to
turning point_stat output into graphs?

Thanks for showing me the plotting tools; they're a great add-on to the
command line! :-)

Best regards,

Remi


------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #59656] Fwd: point_stat configuration : issues & questions
From: John Halley Gotway
Time: Tue Dec 18 09:47:00 2012

Remi,

I'll put in my 2 cents on your questions and then reassign this ticket
to Tressa Fowler, our resident statistician.  She may be able to
answer your methods and statistics questions.

Regarding what's showing up in your output, that's all controlled by
the "output_flag" configuration setting. You selected categorical
output lines (fho, ctc, cts), continuous statistics (cnt), scalar
partial sums (sl1l2), and the raw matched pairs (mpr). You're correct
that the categorical counts and stats are computed using a threshold,
as specified in the "cat_thresh" configuration setting. In your case,
it looks like you're computing them twice, once using <=273 and a
second time using >273.
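For reference, the corresponding piece of a PointStatConfig file might
look roughly like this. This is only a sketch in METv4.0-style config
syntax, not your exact file, and the surrounding settings are omitted:

```
// Sketch of the forecast field entry; thresholds as in your run.
fcst = {
   field = [
      {
         name       = "TMP";
         level      = [ "Z2" ];
         cat_thresh = [ <=273.0, >273.0 ];
      }
   ];
};
```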

But the continuous stats are computed using the raw fcst and obs
values, not thresholded ones.

It sounds like you're interested in the bias and rmse statistics, both
of which show up in the continuous stats (CNT) line type. In the data
you sent me, you've run Point-Stat for a single point in time.
Presumably, you'll run it for many output times. As for how to plot
the resulting statistics, that's largely up to you. We do have an
example script which is built upon R:
    METv4.0/scripts/Rscripts/plot_cnt.R

You run it like this:
    Rscript plot_cnt.R

Here's the usage statement:
Usage: plot_cnt.R
          cnt_file_list
          [-column name]
          [-out name]
          [-save]
          where "cnt_file_list" is one or more files containing CNT lines.
                "-column name"  specifies a CNT statistic to be plotted (multiple).
                "-out name"     specifies an output PDF file name.
                "-save"         calls save.image() before exiting R.

So you pass it a bunch of files containing CNT lines of MET output.
Here's an example of running it on the single output file:

   Rscript METv4.0/scripts/Rscripts/plot_cnt.R out/*.stat -column RMSE
-column MBIAS -out test.pdf

That generates very uninteresting plots with just a single point on
them.

Really there are any number of tools you could use to make plots of
columnar ascii data.  If you happen to be familiar with R, this script
is meant as an example to get you going.

Thanks,
John







------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #59656] Fwd: point_stat configuration : issues & questions
From: Remi Montroty
Time: Thu Jan 10 07:26:38 2013

Dear all,

First of all, a very happy & prosperous 2013 to you. I hope to see yet
another MET release maybe? :-)

I'd like to thank you for the insight given here. I've been using the
continuous statistics now and have tried to make sense of the 92
columns of output.

1) Are the columns described somewhere in the documentation?

Right now I've scripted a little something that selects the columns I
want.

Select column numbers you want to print (separated by blank space):
$9" "$10" "$23" "$33" "$28" "$38" "$63" "$75
FCST_VAR FCST_LEV FBAR OBAR FSTDEV OSTDEV MBIAS RMSE
TMP Z2 25.42457 25.75434 1.96158 2.39411 0.98720 1.68526
TMP Z2 25.46131 25.75434 2.01557 2.39411 0.98862 1.67708
TMP Z2 25.41807 25.75434 1.90168 2.39411 0.98694 1.62678
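(A minimal awk sketch of that kind of column selection. The column
numbers are the ones listed above; the `out/*.stat` glob and the
assumption that CNT lines can be matched by the literal " CNT " token
are illustrative, not taken from the original script.)

```shell
# Pull selected CNT columns out of Point-Stat .stat output files.
# $9=FCST_VAR $10=FCST_LEV $23=FBAR $33=OBAR $28=FSTDEV $38=OSTDEV $63=MBIAS $75=RMSE
awk '$0 ~ / CNT / { print $9, $10, $23, $33, $28, $38, $63, $75 }' out/*.stat
```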

=> Am I correct in assuming that MBIAS is the mean bias, defined as the
sum over all pairs of (obs - model value)?

Is FBAR the average of all forecast values over all pairs? Is OBAR the
average of all observation values over all pairs? Here the difference
(OBAR - FBAR) is on the order of 0.3°C: is it normal that the mean
bias be 3 times that value?

I am really looking for the bias value = 1/N * sum (i=1..N) of
( obs(i) - forecast(i) ). Not sure if I'm looking at the right thing.

Then similarly, is RMSE = 1/N * sum (i=1..N) of
( obs(i) - forecast(i) )^2 ?


2) Yes, I shall be running it daily, comparing all forecast ranges from
the previous run, valid today, with the current obs values. Should I
accumulate statistics daily or run everything at once? What is MET best
designed to do?

3) Thank you for pointing me to the R-script. I'll try to have a look.

Thanks again!

Rémi








------------------------------------------------
Subject: Fwd: point_stat configuration : issues & questions
From: Remi Montroty
Time: Thu Jan 10 07:28:09 2013

On 10/01/2013 15:26, Remi Montroty wrote:
> Then similarly, is RMSE = 1/N * sum (i=1..N) of
> ( obs(i) - forecast(i) )^2 ?

I obviously meant to take the sqrt(...) of the whole thing! ...


------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #59656] Fwd: point_stat configuration : issues & questions
From: John Halley Gotway
Time: Thu Jan 10 11:16:08 2013

Remi,

Please take a look in the MET user's guide, which can be found in the
METv4.0/doc directory.

The column names are all spelled out in table format there.  For
example, table 4-6 on page 4-21 indicates that column 63 of the CNT
output line contains "MBIAS", which is short for multiplicative
bias.  Also, the mathematical definition of MBIAS (and all the other
stats) is contained in Appendix C.  Page C-8 indicates that MBIAS =
mean forecast value divided by the mean observed value - which
differs from your guess.

The value for the mean forecast value MINUS the mean observed value is
contained in the "Mean Error", or ME, column.  That is column number 53
of the CNT line, also according to table 4-6.
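For reference, the three statistics under discussion can be written out directly from those definitions (a Python sketch of the formulas as described in this thread, not MET code):

```python
import math

def cnt_stats(pairs):
    """Compute a few CNT statistics from (forecast, observation) pairs,
    following the definitions discussed above:
    ME = FBAR - OBAR, MBIAS = FBAR / OBAR, RMSE = sqrt of the mean squared error."""
    n = len(pairs)
    fbar = sum(f for f, o in pairs) / n   # mean forecast value (FBAR)
    obar = sum(o for f, o in pairs) / n   # mean observed value (OBAR)
    me = fbar - obar                      # Mean Error (additive bias)
    mbias = fbar / obar                   # multiplicative bias
    rmse = math.sqrt(sum((f - o) ** 2 for f, o in pairs) / n)
    return me, mbias, rmse
```

So ME, not MBIAS, is the additive bias being asked about here.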

Please take some time to read through those tables, and in particular,
appendix C.

And then we'll be happy to answer any additional questions you have.

Thanks,
John


On 01/10/2013 07:26 AM, Remi Montroty via RT wrote:
>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=59656 >
>
> Dear all,
>
> First of all, a very happy & prosperous 2013 to you. I hope to see yet
> another MET release maybe? :-)
>
> I'd like to thank you for the insight given here.  I've been using the
> continuous statistics now & tried to make sense out of the 92 columns
> of output.
>
> 1) are the columns described somewhere in the documentation?
>
> Right now I've scripted a little something that selects the columns I want.
>
> Select column numbers you want to print (separated by blank space):
> $9" "$10" "$23" "$33" "$28" "$38" "$63" "$75
> FCST_VAR FCST_LEV FBAR OBAR FSTDEV OSTDEV MBIAS RMSE
> TMP Z2 25.42457 25.75434 1.96158 2.39411 0.98720 1.68526
> TMP Z2 25.46131 25.75434 2.01557 2.39411 0.98862 1.67708
> TMP Z2 25.41807 25.75434 1.90168 2.39411 0.98694 1.62678
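The hard-coded column numbers in the snippet above could instead be looked up by name from the header line (a Python sketch, not part of MET; assumes whitespace-delimited output whose header line names every column, as the per-line-type .txt files do):

```python
def select_columns(header, data_line, wanted):
    """Map column names from the header line to values in a data line.

    header/data_line: whitespace-delimited strings; wanted: column names.
    """
    index = {name: i for i, name in enumerate(header.split())}
    values = data_line.split()
    return {name: values[index[name]] for name in wanted}

# Example, using the columns shown above:
# select_columns("FCST_VAR FCST_LEV FBAR OBAR RMSE",
#                "TMP Z2 25.42457 25.75434 1.68526",
#                ["FBAR", "RMSE"])
```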
>
> => Am I correct in assuming that MBIAS is the mean bias, defined as
> the sum, over all pairs, of (Obs - ModelValue)?
>
> Is FBAR the average of all forecast values over all pairs? Is OBAR the
> average of all observation values over all pairs? Here the difference
> (OBAR-FBAR) is on the order of 0.3°C : is it normal that the mean bias
> be 3 times that value?
>
> I am really looking for the bias value = 1/N * sum (from i=0,N) {
> obs(i) - forecast(i) }. Not sure if I'm looking at the right thing.
>
> Then similarly, is RMSE = 1/N * sum (from i=0,N) { ( obs(i) -
> forecast(i) )^2 } ?
>
>
> 2) Yes, I shall be running it daily, comparing all forecast ranges
> from the previous run, valid today, with the current obs values.
> Should I accumulate statistics daily or run all at once? What is MET
> best designed to do?
>
> 3) Thank you for pointing me to the R-script. I'll try to have a look.
>
> Thanks again!
>
> Rémi
>
>
>
>
>
>
>> Remy,
>>
>> I'll put in my 2 cents on your questions and then reassign this
>> ticket to Tressa Fowler, our resident statistician.  She may be able
>> to answer your methods and statistics questions.
>>
>> Regarding what's showing up in your output, that's all controlled by
>> the "output_flag" configuration setting.  You selected categorical
>> output lines (fho, ctc, cts), continuous statistics (cnt), scalar
>> partial sums (sl1l2), and the raw matched pairs (mpr).  You're
>> correct that the categorical counts and stats are computed using a
>> threshold, as specified in the "cat_thresh" configuration setting.
>> In your case, it looks like you're computing them twice, once using
>> <=273, and a second time using >273.
>>
>> But the continuous stats are computed using the raw fcst and obs
>> values, not thresholded ones.
>>
>> It sounds like you're interested in the bias and rmse statistics,
>> both of which show up in the continuous stats line type.  In the data
>> you sent me, you've run Point-Stat for a single point in time.
>> Presumably, you'll run it for many more output times.  As for how to
>> handle the resulting statistics, it's kind of up to you.  We do have
>> an example script which is built upon R:
>>       METv4.0/scripts/Rscripts/plot_cnt.R
>>
>> You run it like this:
>>       Rscript plot_cnt.R
>>
>> Here's the usage statement:
>> Usage: plot_cnt.R
>>             cnt_file_list
>>             [-column name]
>>             [-out name]
>>             [-save]
>>             where "file_list"    is one or more files containing CNT lines.
>>                   "-column name" specifies a CNT statistic to be plotted (multiple).
>>                   "-out name"    specifies an output PDF file name.
>>                   "-save"        calls save.image() before exiting R.
>>
>> So you pass it a bunch of files containing CNT lines of MET output.
>> Here's an example of running it on the single output file:
>>
>>      Rscript METv4.0/scripts/Rscripts/plot_cnt.R out/*.stat \
>>              -column RMSE -column MBIAS -out test.pdf
>>
>> That generates very uninteresting plots with just a single point on
>> them.
>>
>> Really there are any number of tools you could use to make plots of
>> columnar ascii data.  If you happen to be familiar with R, this
>> script is meant as an example to get you going.
>>
>> Thanks,
>> John
>>
>>
>>
>>
>>
>>
>> On 12/18/2012 09:27 AM, Remi Montroty via RT wrote:
>>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=59656 >
>>>
>>> Dear John,
>>>
>>> Thanks for the quick response!  I know it goes faster with data,
>>> glad it helped.
>>>
>>> Indeed, I was told there was a factor of 100 000 in lat/lon scaling
>>> in the original files... which was actually 100. I'm glad that I
>>> understood that Z2 was the appropriate level.
>>> So it is now working (as in producing output, using 37 pairs). Thank you!
>>>
>>> My understanding of the output files is that we are comparing
>>> forecast and observations against a threshold (here > 273).  Now
>>> that is not quite what I meant to do.
>>>
>>> What I'd like is to build time series of model errors in terms of
>>> bias and RMS error...
>>>
>>> How would one go about computing the bias and rmse of the model with
>>> respect to observations?
>>> How does one build a statistically significant array (say,
>>> verification of 24h forecasts during one month, compared to matching
>>> SYNOP data)?
>>> Is there a set of graphical packages (NCL or others) dedicated to
>>> processing point_stat outputs into graphs?
>>>
>>> Thanks for showing me the plotting tools, they're a great addon to
>>> the command line! :-)
>>>
>>> Best regards,
>>>
>>> Remi
>>>> Remi,
>>>>
>>>> Thanks for sending your data.  That made it much easier for me to
>>>> debug.  Newer versions of MET, like METv4.0, support multiple
>>>> levels of logging.  You can always try turning up the logging level
>>>> with the '-v' option to look for additional information.  Setting
>>>> it to 3 reveals the problem in this case:
>>>>
>>>> DEBUG 2: Processing TMP/Z2 versus TMP/Z2, for observation type
>>>> ADPSFC, over region FULL, for interpolation method UW_MEAN(1),
>>>> using 0 pairs.
>>>> DEBUG 3: Number of matched pairs  = 0
>>>> DEBUG 3: Observations processed   = 50
>>>> DEBUG 3: Rejected: GRIB code      = 0
>>>> DEBUG 3: Rejected: valid time     = 0
>>>> DEBUG 3: Rejected: bad obs value  = 0
>>>> DEBUG 3: Rejected: off the grid   = 50
>>>> DEBUG 3: Rejected: level mismatch = 0
>>>> DEBUG 3: Rejected: message type   = 0
>>>> DEBUG 3: Rejected: masking region = 0
>>>> DEBUG 3: Rejected: bad fcst value = 0
>>>>
>>>> For debug level 3, Point-Stat writes out a count of the number of
>>>> observations processed and the reasons why they were rejected.  In
>>>> your case, Point-Stat thinks that none of the observations fall
>>>> inside your grid.  The next step is to figure out where your
>>>> forecast domain and these observations reside...
>>>>
>>>> To plot your forecast domain, try this:
>>>>         METv4.0/bin/plot_data_plane forecastfile.grb \
>>>>                 forecastfile.ps 'name="TMP"; level="Z2";'
>>>>
>>>> The output image is attached to this message.  Looks like it's
>>>> over eastern Africa.
>>>>
>>>> To plot your point observations, try this:
>>>>         METv4.0/bin/plot_point_obs SYNOP.20121218000000.0H.rdb.nc \
>>>>                 SYNOP.20121218000000.0H.rdb.ps
>>>>
>>>> The output image is attached to this message.  There should be 50
>>>> red dots somewhere, but I don't see them.  So I reran with the -v 3
>>>> option to dump out the lat/lon values that are being plotted.
>>>> Looks like they're all right around 0 lat and 0 lon.  It's very
>>>> difficult to see, but if you look along the left edge of the plot,
>>>> right at 0/0, there's a small red dot.
>>>>
>>>> So I would say that the problem is the lat/lon values used in the
>>>> point observation file:
>>>>         SYNOP.20121218000000.0H.rdb.ascii2nc
>>>>
>>>> Columns 4 and 5 of that file should contain latitude in degrees
>>>> north and longitude in degrees east.  Looks like there's an error
>>>> in those values.
>>>>
>>>> Hope that helps.
>>>>
>>>> Thanks,
>>>> John Halley Gotway
>>>> met_help at ucar.edu
>>>>
>>>>

------------------------------------------------
Subject: Fwd: point_stat configuration : issues & questions
From: Tressa Fowler
Time: Fri Jan 18 12:44:06 2013

Hi Remi,

The statistical significance question can be a tricky one. I would say
that a month represents a bare minimum, but your particular problem
may require more data depending on the variability inherent in the
observations and how related each day is to the next. We like to look
at a whole year, but not everyone can do that.

You can use our confidence interval calculations to give you an idea
of the significance for your particular set of cases. If you need more
details about how to analyze this, let me know.

Tressa

------------------------------------------------
Subject: Fwd: point_stat configuration : issues & questions
From: Remi Montroty
Time: Fri Jan 25 02:14:18 2013

Dear Tressa,

Thank you for taking the time to clarify this.

Indeed, running for a whole year is a bit tricky unless I store
everything on disk (right now it is in Oracle databases).

I've been running daily point_stat & grid_stat now, but I am still not
100% sure how to validate the significance level: on a given day I have
about 188 synop stations to validate Temp @2m. Out of the 92 columns of
statistics, I wonder which to focus on to get that significance level
(and maybe I'd like to tag days which reach it).
1) Is there any bootstrapping method built in to test the significance
level?
2) Can I access a specific station's statistics through any of the
files? (I'd like a time series of model error (ME, RMSE) at a specific
station, to be precise)

On other subjects:

3) Is there a methodology for using a land-sea mask as a mask for
verification (to only validate over land)?
4) Is there a methodology described for deriving polygons from
shapefiles? (I'd like to use province shapefiles to define my polygons)
5) Can I verify FF,DD (wind force & direction) from U,V model fields?
6) What is, in the developers' minds, the best tool available for
accumulated precip verification? (I'm painfully aware of the double
penalty issue...)

Thank you

ps: I'm sorry if my questions are stupid or fully explained in the
documentation; I've browsed through it and not found answers to them.







------------------------------------------------
Subject: Fwd: point_stat configuration : issues & questions
From: Tressa Fowler
Time: Fri Jan 25 12:32:58 2013

Hi Remy,

Please see some individual answers below. Your questions about masking
have been forwarded back to one of our software engineers, as I am
less familiar with this piece.

Tressa

On Fri Jan 25 02:14:18 2013, remi.montroty at mfi.fr wrote:
> Dear Tressa,
>
> Thank you for taking the time to clarify this.
>
> Indeed, running for a whole year is a bit tricky unless I store
> everything on disk (right now it is in Oracle databases).
>
> I've been running daily point_stat & grid_stat now but I am still not
> 100% sure how to validate the significance level: on a given day I
> have about 188 synop stations to validate Temp @2m. Out of the 92
> columns of statistics I wonder which to focus on to get that
> significance level (and maybe I'd like to tag days which reach it).
> 1) Is there any bootstrapping method built in to test the significance
> level?

Yes, where appropriate, bootstrap confidence bounds are included in
the MET output files if you have turned on bootstrapping in your
configuration files. Please see Appendix D of our documentation for
information about how they are computed, and the individual chapters
on point stat and grid stat to see for which statistics confidence
intervals are included.

> 2) Can I access a specific station's statistics through any of the
> files? (I'd like a time series of model error (ME, RMSE) at a
> specific station, to be precise)

Yes, you can use station masking to produce statistics for a single
location. This is a bit labor intensive as you will have to run MET
once per forecast, but will be easier in the next release.  Please see
the users' guide or example configuration files for station masking
details. To produce the time series, you may want to use our Stat
Analysis tool to filter results by station.

>
> On other subjects:
>
> 3) Is there a methodology for using a land-sea mask as a mask for
> verification (to only validate over land)?
> 4) Is there a methodology described for deriving polygons from
> shapefiles? (I'd like to use province shapefiles to define my
> polygons)
> 5) Can I verify FF,DD (wind force & direction) from U,V model fields?

You can verify wind speed from U and V, see the configuration file
options for this setting. For direction, you will need to use the Stat
Analysis tool. See Chapter 8 of our documentation, where we have a
section on verifying wind direction.
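For background, the speed/direction conversion under discussion is the standard one (a Python sketch, not MET code; "direction" here is the meteorological convention, degrees the wind blows from):

```python
import math

def wind_speed_dir(u, v):
    """Convert U/V wind components to (speed, direction).

    Direction is degrees FROM which the wind blows:
    0/360 = from the north, 90 = from the east.
    """
    speed = math.hypot(u, v)
    direction = (270.0 - math.degrees(math.atan2(v, u))) % 360.0
    return speed, direction
```

Because direction is a circular quantity, its errors cannot be averaged naively, which is why wind direction gets its own treatment in the Stat-Analysis chapter.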

> 6) What is, in the developpers minds, the best tool available for
> accumulated precip verification? (I'm painfully aware of the double
> penality issue...)

If you have point observations, you are stuck with traditional
verification. The methodologies that mitigate the double penalty are
all spatial methods that require gridded observations. If you have
gridded observations, there are a plethora of methods available, and
the research community has not come to a consensus about what is best.
Please review the literature for a full discussion of this issue. In
MET, we recommend use of MODE or neighborhood methods (Chapters 5 and
6 of the documentation).


>
> Thank you
>
> ps: I'm sorry if my questions are stupid or fully explained in the
> documentation, I've browsed through it and not found answers to
them.
>
>
>
>
>
>
> On Fri, Jan 18, 2013 at 8:44 PM, Tressa Fowler via RT
> <met_help at ucar.edu>wrote:
>
> > Hi Remi,
> >
> > The statistical significance question can be a tricky one. I would
> say
> > that a month represents a bare minimum, but your particular
problem
> may
> > require more data depending on the variability inherent in the
> observations
> > and how related each day is to the next. We like to look at a
whole
> year,
> > but not everyone can do that.
> >
> > You can use our confidence interval calculations to give you an
idea
> of
> > the significance for your particular set of cases. If you need
more
> details
> > about how to analyze this, let me know.
> >
> > Tressa
> >



------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #59656] Fwd: point_stat configuration : issues & questions
From: John Halley Gotway
Time: Fri Jan 25 15:56:09 2013

Hello Remy,

This is John Halley Gotway, one of the MET software engineers.

Regarding the station masking that Tressa mentioned, I'd actually just
suggest doing normal masking using the "mask.grid" and/or "mask.poly"
settings.  Then just turn on the matched pair (MPR) output line.

Once you have output for several days, you can run Stat-Analysis to
filter out a time series for a single station and compute statistics
for that station.  You run a job that looks something like this:
   -job aggregate_stat -line_type MPR -out_line_type CNT (or whatever
   type of stats you want) -column_eq SID station_name (plus any other
   filtering criteria)

That'll look through the input .stat files and pick out the matched
pairs for the station name you give it.  And using those matched
pairs, it'll derive continuous statistics.
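Purely as an illustration of what that aggregate_stat job does, filtering MPR lines for one station and deriving a couple of continuous statistics could be sketched like this (the column indices are placeholders, not the actual MET column numbering; check the MPR column table in the user's guide for your version):

```python
import math

def station_me_rmse(stat_lines, station, sid_col, fcst_col, obs_col):
    """Collect (forecast, observation) pairs for one station from MPR
    lines and derive ME and RMSE.

    stat_lines: iterable of whitespace-delimited .stat file lines.
    sid_col/fcst_col/obs_col: 0-based column indices (placeholders) for
    the station ID, forecast value, and observation value.
    """
    pairs = [(float(t[fcst_col]), float(t[obs_col]))
             for t in (line.split() for line in stat_lines)
             if "MPR" in t and t[sid_col] == station]
    n = len(pairs)
    me = sum(f - o for f, o in pairs) / n
    rmse = math.sqrt(sum((f - o) ** 2 for f, o in pairs) / n)
    return me, rmse
```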

Regarding a land/sea mask, we call this "data masking".  Basically,
you need a gridded data file on the same grid as your model data that
defines your land/sea mask.  Then, in the "mask.poly" section of the
tool's config file, you specify the filename, field from that file,
and threshold you'd like to use to define your masking region.  For
example, the following would define a verification region as the
places where 2-meter temperature is greater than 273.15:
   poly = [ "my_file.grb {name = \"TMP\"; level = \"Z2\";} >273.15" ];

For a land/sea mask, your threshold would probably be equality, like
this: ==1
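Putting those two pieces together, a land-only masking entry might look like the following (a sketch: the file name "landsea.grb", field name "LAND", and level string are placeholders for whatever your mask file actually contains):

```
poly = [ "landsea.grb {name = \"LAND\"; level = \"L0\";} ==1" ];
```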

Regarding shapefiles, I would like us to support them directly in the
future, but we currently do not.  I have, however, done exactly what
you're trying to do with shapefiles.  I recently converted shapefiles
for the provinces in Saudi Arabia to polylines for use in MET.

Hope that helps.  Just let us know if you get stuck on anything.

Thanks,
John





------------------------------------------------


More information about the Met_help mailing list