[Met_help] [rt.rap.ucar.edu #99884] History for Mean Absolute Error from the Point Stat Tool (one-station case use example)

John Halley Gotway via RT met_help at ucar.edu
Mon Jul 12 11:31:50 MDT 2021


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Hello,

I am testing the point_stat tool (v10.0.0) on one surface station for one hour, using the 'nearest neighbor' method with wind speed, to get a feel for the output of the program. I set my obs_window to +/- 15 min. The station I am using reports every 10 min, so the continuous statistics output says it found 3 matched pairs (at the 50th, 00th, and 10th minutes), which is to be expected. However, the MAE output column shows that it is using (forecasted value minus the mean of all 3 observations). I would rather it take the one observation out of the 3 that is closest to the forecasted value, which would have given me a lower MAE than using the average of all the observations within that time window.

I attached the output results from point_stat of the continuous stats.

All observations' windspeed values:   [ 1.483,1.507,1.795 ] m/s
forecasted value for this nearest model grid point = 2.579 m/s

The MAE in the output is | FBAR (only one value) - OBAR |, which is 0.985. I would like the MAE to come from the pair with the closest match to the forecast, using 1.795 to give an MAE of 0.784.
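
As a quick check of the arithmetic above, here is a minimal Python sketch (not MET code; the variable names are illustrative only). With a single forecast value that is larger than all three observations, the MAE over the three pairs reduces to |FBAR - OBAR|. The rounded values listed in the message give about 0.984, so the reported 0.985 presumably reflects the unrounded values.

```python
# Values quoted in the message above (m/s).
obs = [1.483, 1.507, 1.795]
fcst = 2.579

# MAE over the three matched pairs, which all share the same forecast value.
abs_errors = [abs(fcst - o) for o in obs]
mae_all_pairs = sum(abs_errors) / len(abs_errors)

# |FBAR - OBAR|; equal to the MAE here because fcst exceeds every observation.
obar = sum(obs) / len(obs)
abs_mean_diff = abs(fcst - obar)

# Error of the single pair whose observation is closest in value to the forecast.
mae_closest_value = min(abs_errors)

print(round(mae_all_pairs, 3), round(abs_mean_diff, 3), round(mae_closest_value, 3))
```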

How can I alter my config file (attached) to give me the MAE or stats using the ob with the closest observation in time, rather than just having a smaller obs_window?

Thanks,

A.J. Eiserloh - Data Scientist
PG&E - Meteorology Systems and Analytics
Applied Technology Services
3400 Crow Canyon Rd., San Ramon, CA 94583-1393
925-307-4492






----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Mean Absolute Error from the Point Stat Tool (one-station case use example)
From: John Halley Gotway
Time: Thu May 13 15:34:14 2021

Hi AJ,

I see you have a question about multiple observations occurring within
the obs time window in Point-Stat.

Good news, there’s a config option to handle exactly this situation.
Please take a look here:

https://met.readthedocs.io/en/main_v10.0/Users_Guide/config_options.html#settings-common-to-multiple-tools

And search for “obs_summary”.

Setting obs_summary = NEAREST; should do what you want.

Thanks
John Halley Gotway



------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #99884] Mean Absolute Error from the Point Stat Tool (one-station case use example)
From: Eiserloh Jr., A.J.
Time: Thu May 13 15:48:22 2021

Hi John,

Thanks, but not exactly. That only sets it to use the pair observation
at the 00th minute (NEAREST in time yes, but not NEAREST to the
forecasted value).

-AJ



------------------------------------------------
Subject: Mean Absolute Error from the Point Stat Tool (one-station case use example)
From: John Halley Gotway
Time: Thu May 13 16:32:10 2021

AJ,

Ah, sorry, I should have read through more carefully... was answering
on my phone.

So this reminds me an awful lot of the "BEST" interpolation method,
described here:
https://met.readthedocs.io/en/latest/Users_Guide/config_options.html#settings-common-to-multiple-tools

   - BEST for the value closest to the observation


When the "interp" dictionary has the method set to "BEST", for each
observation value, point-stat searches the interpolation area for the
forecast value with the smallest absolute difference. And that
forecast value is used to create the matched pair for that
observation. Obviously, the bigger you make the interpolation area
(i.e. increasing width), the more likely you are to find a pair with a
small difference. Adding this option gave us some pause, because it
sounds a lot like cheating and cherry-picking. But the scientists in
our group came up with enough valid use cases for it that it was
worth adding. I do advise users to employ it with a grain of salt,
realizing that it's answering a slightly different question than you'd
get when using other interpolation methods.
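
To make the "BEST" search concrete, here is a rough Python sketch of the idea described above. It is not MET's implementation: the function name, the odd-width-only centering, and the edge clipping are assumptions made for illustration, and every grid value except the center 2.579 is made up.

```python
def best_forecast(grid, row, col, width, obs_value):
    """Return the forecast value inside a width x width box centered on
    (row, col) that is closest in value to obs_value.
    Odd widths only, to keep the centering unambiguous in this sketch."""
    if width % 2 == 0:
        raise ValueError("this sketch handles odd widths only")
    half = width // 2
    candidates = [
        grid[r][c]
        for r in range(max(0, row - half), min(len(grid), row + half + 1))
        for c in range(max(0, col - half), min(len(grid[0]), col + half + 1))
    ]
    return min(candidates, key=lambda f: abs(f - obs_value))


# Hypothetical 5x5 forecast field (m/s); only the center value is from the thread.
grid = [
    [3.1, 3.0, 2.9, 2.8, 2.7],
    [3.0, 2.8, 2.7, 2.6, 2.4],
    [2.9, 2.7, 2.579, 2.3, 2.1],
    [2.8, 2.5, 2.2, 2.0, 1.8],
    [2.6, 2.3, 2.0, 1.7, 1.5],
]
obs_value = 1.507

for width in (1, 3, 5):
    f = best_forecast(grid, 2, 2, width, obs_value)
    print(width, f, round(abs(f - obs_value), 3))
```

In this made-up field the absolute error shrinks as the width grows, which is the cherry-picking concern raised above.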

However, this is the opposite of what you want to do anyway. Instead
of using the observation value to find the best match from nearby
forecast values, you want to use the forecast value to find the best
match from observations within that time window. And that logic is not
currently supported.

I do see how it could be added... supporting obs_summary = BEST. But
we'd need to do so very carefully! We'd run into a chicken and egg
problem if obs_summary = BEST and interp.method = BEST. And these same
obs_summary options apply to ensemble point verification, where
picking the best observation value for an ensemble of forecast values
is much less obvious. Would you minimize the difference with the
ensemble mean? Or perhaps prefer an ensemble rank in the middle?

It sounds like you'd want to run interp.method = NEAREST and
obs_summary = BEST... so it's pretty clear what you're looking for in
this case. Another downside here is that the forecast value would
affect which observations are used in the verification. In general we
try to avoid that. We often compare the performance of competing
forecasts against the same set of observations. Using the same
observations (i.e. not choosing different ones based on the forecast
value) makes for a fairer comparison between models.

With MET version 10.0.0, you could try setting obs_summary = NEAREST
and interp.method = BEST to see if that comes close to answering the
verification question you're asking.

Hope that helps clarify.

Thanks,
John


------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #99884] Mean Absolute Error from the Point Stat Tool (one-station case use example)
From: Eiserloh Jr., A.J.
Time: Thu May 13 18:59:06 2021

Hi John,

Thanks! This helps. I removed the obs_summary, and put
interp.method=BEST, and it gives me what I expect, but I think it is
giving me the right answer for the wrong reason, and I am still a
little confused. Shouldn't the interp method still be NEAREST since it
needs to know to find the nearest grid point? Because the results are
what I expected though, it seems like it is still finding the correct
nearest gridpoint, then taking the BEST.  Also, from the
documentation:

" The "type" entry is an array of dictionaries, each specifying an
interpolation method. "

Here is what I tried (below). Both tests gave me the same results
for the 1 station, so I'm wondering whether it is actually reading the
extra dict I added, or whether by default it knows to take the nearest
gridpoint and then find the 'BEST'.

Test 1:

interp = {
   vld_thresh = 1.0;
   shape      = SQUARE;
   type = [
      {
         method = BEST;
         width  = 1;
      }
   ];
}


Test 2:

interp = {
   vld_thresh = 1.0;
   shape      = SQUARE;

   type = [
      {
         method = BEST;
         width  = 1;
      },
      {
         method = NEAREST;
         width  = 1;
      }
   ];
}



Regards,

A.J. Eiserloh - Data Scientist
PG&E - Meteorology Systems and Analytics
Applied Technology Services
3400 Crow Canyon Rd., San Ramon, CA 94583-1393
925-307-4492







------------------------------------------------
Subject: Mean Absolute Error from the Point Stat Tool (one-station case use example)
From: John Halley Gotway
Time: Fri May 14 12:04:31 2021

AJ,

A few points to make here.

(1) You said you "removed" the obs_summary option. Please be aware
that Point-Stat always reads the default configuration file first
(https://github.com/dtcenter/MET/blob/main_v10.0/met/data/config/PointStatConfig_default)
and then reads the user-specified configuration file from the command
line. For any items the user does not override, the default setting
remains in effect. And the default setting is "obs_summary = NONE;".
Just wanted to make that clear.
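
The layering described in (1) can be pictured with a small Python sketch. This is only an analogy, not how MET parses its config files: the flattened dictionaries and key names are hypothetical, and only the default "obs_summary = NONE" comes from this message.

```python
# Hypothetical, flattened stand-ins for the two config files.
# Only obs_summary = NONE is confirmed above; the other entries are placeholders.
default_config = {
    "obs_summary": "NONE",
    "interp_method": "NEAREST",
    "interp_width": 1,
}

# User file with the obs_summary entry removed and the interp method changed.
user_config = {"interp_method": "BEST"}

# Defaults are read first; user entries replace them only where present.
effective = {**default_config, **user_config}

print(effective["obs_summary"])    # falls back to the default
print(effective["interp_method"])  # taken from the user file
```

So removing an entry from the user file does not turn the option off; it only hands control back to the default value.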

(2) It sounds like you're writing the CNT statistics and checking the
FBAR and OBAR columns to see which values were used. I'd recommend
reconfiguring point-stat to write the MPR output line type
(mpr = BOTH). That'll write 1 MPR line to the output for each
individual FCST/OBS matched pair. I find the MPR line type very useful
(as long as you're working with a small number of pairs).

(3) Please look closely at the log messages being written by
Point-Stat. I expect you'll see one that says...
   "WARNING: Resetting interpolation BEST to NEAREST since the interpolation width is 1."
method = BEST and width = 1 is really just nearest neighbor... and
that's why I expect you'd see that warning message. So it makes sense
that you're getting the same output for both of the interp options you
specified.

Here's the logic...
- Let's say your station reports 3 times during the time window
(defined by the obs_window config option).
- If obs_summary = NONE (default), that'll result in 3 MPR lines being
created/written.
- If obs_summary = NEAREST, it'll pick 1 of those 3 obs (closest to
the valid time of the forecast), and will write 1 MPR output line.
- For each obs processed (1 or 3, based on obs_summary), it'll
"interpolate" the gridded forecast data to that location.
- If you set "method = BEST; width = 2;" it'll search the (2x2=) 4
closest grid points and choose the value that's closest to the obs
value.
- With "method = BEST; width = 5;" it'll search (5x5=) 25 grid points
and select the value that's closest to the obs value.
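
The bookkeeping in the list above can be sketched in Python as follows. These helper functions are hypothetical and cover only the options discussed in this thread; they are not part of MET.

```python
def matched_pair_count(n_obs_in_window, obs_summary):
    """Number of MPR lines expected for one station in one time window."""
    if obs_summary == "NONE":
        return n_obs_in_window          # every obs becomes its own pair
    if obs_summary == "NEAREST":
        return min(1, n_obs_in_window)  # only the obs closest in time is kept
    raise ValueError("option not covered by this sketch")


def effective_interp(method, width):
    """BEST with width 1 behaves as nearest neighbor, per the warning above."""
    if method == "BEST" and width == 1:
        return ("NEAREST", 1)
    return (method, width)


def grid_points_searched(width):
    """A width x width interpolation area covers width squared grid points."""
    return width * width


print(matched_pair_count(3, "NONE"), matched_pair_count(3, "NEAREST"))
print(effective_interp("BEST", 1), effective_interp("BEST", 5))
print(grid_points_searched(2), grid_points_searched(5))
```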

But again, there's no way to do what you described... pick the nearest
forecast grid point... and then use it to select the one of the 3 obs
values in that time window that minimizes the error.
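
For reference, the unsupported selection amounts to the following when done outside of MET, for example on forecast/observation values taken from the matched pairs. This is a hypothetical post-processing sketch, not a Point-Stat feature, and the earlier caveat still applies: letting the forecast choose the observation makes comparisons between models less fair.

```python
def closest_obs_pair(fcst, obs_values):
    """Pick the observation closest in value to the forecast and
    return (forecast, chosen observation, absolute error)."""
    best_obs = min(obs_values, key=lambda o: abs(fcst - o))
    return fcst, best_obs, abs(fcst - best_obs)


# Values quoted in the original request (m/s).
fcst, chosen_obs, abs_error = closest_obs_pair(2.579, [1.483, 1.507, 1.795])
print(chosen_obs, round(abs_error, 3))
```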

Hope that helps clarify.

John

On Thu, May 13, 2021 at 6:59 PM Eiserloh Jr., A.J. via RT
<met_help at ucar.edu>
wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=99884 >
>
> Hi John,
>
> Thanks! This helps. I removed the obs_summary, and put
interp.method=BEST,
> and it gives me what I expect, but I think it is giving me the right
answer
> for the wrong reason, and I am still a little confused. Shouldn't
the
> interp method still be NEAREST since it needs to know to find the
nearest
> grid point? Because the results are what I expected though, it seems
like
> it is still finding the correct nearest gridpoint, then taking the
BEST.
> Also, from the documentation:
>
> " The "type" entry is an array of dictionaries, each specifying an
> interpolation method. "
>
> Here is what I tried (below) and both tests gave me the same results
for
> the 1 station, so I'm wondering is it actually listening to the
extra dict
> I added or is it by default to know to take the nearest gridpoint,
then
> find the 'BEST'.
>
> Test 1:
>
> interp = {
>    vld_thresh = 1.0;
>    shape      = SQUARE;
>    type = [
>       {
>          method = BEST;
>          width  = 1;
>       }
>    ];
> }
>
>
> Test 2:
>
> interp = {
>    vld_thresh = 1.0;
>    shape      = SQUARE;
>
>    type = [
>       {
>          method = BEST;
>          width  = 1;
>       },
>       {
>          method = NEAREST;
>          width  = 1;
>       }
>    ];
> }
>
>
>
> Regards,
>
> A.J. Eiserloh - Data Scientist
> PG&E - Meteorology Systems and Analytics
> Applied Technology Services
> 3400 Crow Canyon Rd., San Ramon, CA 94583-1393
> 925-307-4492
>
>
>
>
> -----Original Message-----
> From: John Halley Gotway via RT <met_help at ucar.edu>
> Sent: Thursday, May 13, 2021 3:32 PM
> To: Eiserloh Jr., A.J. <AJEB at pge.com>
> Subject: Re: [rt.rap.ucar.edu #99884] Mean Absolute Error from the
Point
> Stat Tool (one-station case use example)
>
> *****CAUTION: This email was sent from an EXTERNAL source. Think
before
> clicking links or opening attachments.*****
>
> AJ,
>
> Ah, sorry, I should have read through more carefully... was
answering on
> my phone.
>
> So this reminds me an awful lot of the "BEST" interpolation method,
> described here:
>
>
https://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmet.readthedocs.io%2Fen%2Flatest%2FUsers_Guide%2Fconfig_options.html%23settings-
common-to-multiple-
tools&data=04%7C01%7CAJEB%40pge.com%7Cb295e845592b4501c1d708d9165ef3de%7C44ae661aece641aabc967c2c85a08941%7C0%7C0%7C637565419341227921%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=6Wu8XttHaOyAYRC9ZjzU3tTrqAkoMMW2GNf5OHrPqz0%3D&reserved=0
>
>
>    - BEST for the value closest to the observation
>
>
> When the "interp" dictionary has the method set to "BEST", for each
> observation value, point-stat searches the interpolation area for the
> forecast value with the smallest absolute difference. And that forecast
> value is used to create the matched pair for that observation. Obviously,
> the bigger you make the interpolation area (i.e. increasing width), the
> more likely you are to find a pair with a small difference. Adding this
> option gave us some pause, because it sounds a lot like cheating and
> cherry-picking. But the scientists in our group came up with enough valid
> use cases for it, that it was worth adding. I do advise users to employ it
> with a grain of salt realizing that it's answering a slightly different
> question than you'd get when using other interpolation methods.
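
The BEST selection described above can be sketched roughly in Python. This is only an illustration of the idea, not MET's implementation: the helper name best_forecast_value and the neighborhood values are hypothetical, and missing-data handling (vld_thresh) is ignored.

```python
def best_forecast_value(neighborhood, obs_value):
    """Return the forecast value in the neighborhood closest to the observation.

    Hypothetical helper for illustration only; it is not part of MET.
    """
    return min(neighborhood, key=lambda fcst: abs(fcst - obs_value))


# Made-up forecast values (m/s) for a 3x3 interpolation area (width = 3),
# flattened into a list. Only 2.579 and the observation come from the thread.
neighborhood = [2.579, 2.310, 2.842, 2.105, 2.467, 2.950, 1.998, 2.221, 2.703]
obs = 1.795

picked = best_forecast_value(neighborhood, obs)
print(picked, round(abs(picked - obs), 3))
```

Note that with width = 1 the interpolation area contains a single grid point, so under this sketch BEST and NEAREST would select the same forecast value.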
>
> However, this is the opposite of what you want to do anyway. Instead of
> using the observation value to find the best match from nearby forecast
> values, you want to use the forecast value to find the best match from
> observations within that time window. And that logic is not currently
> supported.
>
> I do see how it could be added... supporting obs_summary = BEST. But we'd
> need to do so very carefully! We'd run into a chicken and egg problem if
> obs_summary = BEST and interp.method = BEST. And these same obs_summary
> options apply to ensemble point verification, where picking the best
> observation value for an ensemble of forecast values is much less obvious.
> Would you minimize the difference with the ensemble mean? Or perhaps
> prefer an ensemble rank in the middle?
>
> It sounds like you'd want to run interp.method = NEAREST and obs_summary =
> BEST... so it's pretty clear what you're looking for in this case. Another
> downside here is that the forecast value would affect which observations
> are used in the verification. In general we try to avoid that. We often
> compare the performance of competing forecasts against the same set of
> observations. Using the same observations (i.e. not choosing different ones
> based on the forecast value) makes for a more fair comparison between
> models.
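
The fairness concern above can be illustrated with a short Python sketch. It assumes a hypothetical "closest observation to the forecast" rule, which the thread states is not currently supported in MET; the helper closest_obs and the second forecast value are made up for illustration.

```python
def closest_obs(obs_values, fcst_value):
    """Pick the observation nearest in value to the forecast.

    Hypothetical selection rule; not an existing MET obs_summary option.
    """
    return min(obs_values, key=lambda ob: abs(fcst_value - ob))


obs_in_window = [1.483, 1.507, 1.795]  # m/s, values quoted in the thread
fcst_a = 2.579                         # forecast value quoted in the thread
fcst_b = 1.200                         # made-up competing forecast

ob_a = closest_obs(obs_in_window, fcst_a)
ob_b = closest_obs(obs_in_window, fcst_b)

# Each forecast ends up verified against a different observation.
print(ob_a, ob_b)
```

Under this rule the two forecasts would be scored against different observations, which is the situation the reply recommends avoiding when comparing models.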
>
> With MET version 10.0.0, you could try setting obs_summary = NEAREST and
> interp.method = BEST to see if that comes close to answering the
> verification question you're asking.
>
> Hope that helps clarify.
>
> Thanks,
> John
>
> On Thu, May 13, 2021 at 3:48 PM Eiserloh Jr., A.J. via RT <
> met_help at ucar.edu>
> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=99884 >
> >
> > Hi John,
> >
> > Thanks, but not exactly. That only sets it to use the pair observation
> > at the 00th minute (NEAREST in time yes, but not NEAREST to the
> > forecasted value).
> >
> > -AJ
> >
> > -----Original Message-----
> > From: John Halley Gotway via RT <met_help at ucar.edu>
> > Sent: Thursday, May 13, 2021 2:34 PM
> > To: Eiserloh Jr., A.J. <AJEB at pge.com>
> > Subject: Re: [rt.rap.ucar.edu #99884] Mean Absolute Error from the
> > Point Stat Tool (one-station case use example)
> >
> > Hi AJ,
> >
> > I see you have a question about multiple observations occurring within
> > the obs time window in Point-Stat.
> >
> > Good news, there's a config option to handle exactly this situation.
> > Please take a look here:
> >
> >
> >
> > https://met.readthedocs.io/en/main_v10.0/Users_Guide/config_options.html#settings-common-to-multiple-tools
> >
> > And search for "obs_summary".
> >
> > Setting obs_summary = NEAREST; should do what you want.
> >
> > Thanks
> > John Halley Gotway
> >
> >
> > On Thu, May 13, 2021 at 3:07 PM Eiserloh Jr., A.J. via RT <
> > met_help at ucar.edu>
> > wrote:
> >
> > >
> > > Thu May 13 15:07:48 2021: Request 99884 was acted upon.
> > > Transaction: Ticket created by AJEB at pge.com
> > >        Queue: met_help
> > >      Subject: Mean Absolute Error from the Point Stat Tool
> > > (one-station case use example)
> > >        Owner: Nobody
> > >   Requestors: AJEB at pge.com
> > >       Status: new
> > >  Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=99884 >
> > >
> > >
> > > Hello,
> > >
> > > I am testing the point_stat tool (v10.0.0) for analyzing one surface
> > > station and for one hour with the 'nearest neighbor' method with
> > > windspeed, to get a feel for the output of the program. I set my
> > > obs_window to be (+/- 15 min). My station I am using reports every
> > > 10 min, so the continuous statistics output says it found 3 matched
> > > pairs (1 at the 50th minute, the 00th minute and the 10th minute),
> > > which is to be expected. However, the MAE output column shows that it
> > > is using the (forecasted value minus the mean of all the 3
> > > observations) rather than taking the closest observation to the
> > > forecasted value out of the 3 observations to find the MAE, which
> > > would have given me a lower MAE, rather than using the average value
> > > of all the observations within that time window.
> > >
> > > I attached the output results from point_stat of the continuous stats.
> > >
> > > All observations' windspeed values:   [ 1.483,1.507,1.795 ] m/s
> > > forecasted value for this nearest model grid point = 2.579 m/s
> > >
> > > The MAE inside is | FBAR (only one value) - OBAR |, which is 0.985.
> > > I would like the MAE to be from the pair with the closest match to
> > > the forecast and use 1.795 to give an MAE of 0.784.
> > >
> > > How can I alter my config file (attached) to give me the MAE or stats
> > > and use the ob with the closest observation in time, rather than just
> > > having a smaller obs_window?
> > >
> > > Thanks,
> > >
> > > A.J. Eiserloh - Data Scientist
> > > PG&E - Meteorology Systems and Analytics
> > > Applied Technology Services
> > > 3400 Crow Canyon Rd., San Ramon, CA 94583-1393
> > > 925-307-4492
