[Met_help] [rt.rap.ucar.edu #96983] History for cat_thresh v8.1 versus v9.1

John Halley Gotway via RT met_help at ucar.edu
Fri Oct 23 10:43:44 MDT 2020


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Folks - I updated my MET tools from v8.1 to v9.1. I had the following fcst entry in my GridStatConfig file(s), but I noticed that the v9.1 documentation doesn't have cat_thresh. What should I map cat_thresh to? Should I be mapping it to vld_thresh or cov_thresh? Thank you.

fcst = {

   valid_time = "20180711_12";
   field = [
      {
        file_type  = GRIB1;
        model      = "WW3NAVGEM";
        name       = "MFLX";
        level      = "Z0";
        prob       = TRUE;
        cat_thresh = [ ==0.1 ];
      }
   ];

}

Efren A. Serra (Contractor)
Physicist

DeVine Consulting, Inc.
Naval Research Laboratory
Marine Meteorology Division
7 Grace Hopper Ave., STOP 2
Monterey, CA 93943
Code 7542
Mobile: 408-425-5027



----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: cat_thresh v8.1 versus v9.1
From: Julie Prestopnik
Time: Wed Oct 07 08:41:00 2020

Hi Efren.

I see that you are under the impression that Grid-Stat for METv9.1 no
longer includes cat_thresh.  However, if you look in the
met-9.1/share/met/config/GridStatConfig_default file, you can see that
cat_thresh is still available for use in Grid-Stat:

//
// Forecast and observation fields to be verified
//
fcst = {
   field = [
      {
        name       = "APCP";
        level      = [ "A03" ];
        cat_thresh = [ >0.0, >=5.0 ];
      }
   ];
}


You mentioned that the v9.1 documentation doesn't have cat_thresh
described.  Please let us know if there is something in the METv8.1
documentation that is missing from the METv9.1 documentation (a specific
quote, section, etc.) so that we can ensure its inclusion in the METv9.1
documentation.

Thanks!

Julie


--
Julie Prestopnik (she/her/hers)
Software Engineer
National Center for Atmospheric Research
Research Applications Laboratory
Email: jpresto at ucar.edu

My working day may not be your working day.  Please do not feel obliged
to reply to this email outside of your normal working hours.

------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
From: efren.serra.ctr at nrlmry.navy.mil
Time: Wed Oct 07 09:13:37 2020

Julie - Yes; I was searching for 'cat_thresh' in the PDF but that
yielded no results. However, '_thresh' did yield cat_thresh. I was
wondering about the difference between specifying forecast bins of .2
versus .1 with cat_thresh = [==.1] or [==.2].


------------------------------------------------
Subject: cat_thresh v8.1 versus v9.1
From: Julie Prestopnik
Time: Wed Oct 07 09:31:15 2020

Hi Efren.

You mentioned searching in the PDF.  I just wanted to make sure you had
the link to the documentation for METv9.1:
https://dtcenter.github.io/MET/Users_Guide/index.html

Did you find the information that you were looking for in the
documentation?

Julie


------------------------------------------------
Subject: cat_thresh v8.1 versus v9.1
From: efren.serra.ctr at nrlmry.navy.mil
Time: Wed Oct 07 09:32:05 2020

Attached are plots I made from the PJC, PRC, and PSTD output. Here I'm
including the Brier score from the PSTD data files. We are comparing two
different probability models versus the same ground truth, and we expect
remarkable differences between the probabilistic models, but we don't see
that in these plots. I have worked with John before, so I was wondering
if you wouldn't mind sharing this with him.


------------------------------------------------
Subject: cat_thresh v8.1 versus v9.1
From: Julie Prestopnik
Time: Wed Oct 07 09:38:29 2020

Hi Efren.

I just wanted to follow up on your previous email.  You mentioned, "I was
wondering about the difference between specifying forecast bins of .2
versus .1 with cat_thresh = [==.1] or [==.2]".

That information can be found on this page of the User's Guide:
https://dtcenter.github.io/MET/Users_Guide/data_io.html#configuration-file-details

- Threshold:
      - A threshold type (<, <=, ==, !=, >=, or >) followed by a numeric value.
      - The threshold type may also be specified using two letter abbreviations
        (lt, le, eq, ne, ge, gt).
      - Multiple thresholds may be combined by specifying the logic type of AND
        (&&) or OR (||). For example, ">=5&&<=10" defines the numbers between 5
        and 10, and "==1||==2" defines numbers exactly equal to 1 or 2.

If that doesn't help and doesn't explain why you are not seeing the
differences you expect, please let me know and I will see if John can
better assist you.

Thanks!

Julie


------------------------------------------------
Subject: cat_thresh v8.1 versus v9.1
From: efren.serra.ctr at nrlmry.navy.mil
Time: Wed Oct 07 09:39:56 2020

Here's the actual case we are looking at, for which I shared the Brier
score images.


------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
From: efren.serra.ctr at nrlmry.navy.mil
Time: Wed Oct 07 09:43:44 2020

Hi Julie! We'd like to involve John if possible. I sent you the actual
case we are looking at and a couple of Brier score images I created
from the PSTD output. Thanks for your help.


------------------------------------------------
Subject: cat_thresh v8.1 versus v9.1
From: John Halley Gotway
Time: Wed Oct 07 10:00:39 2020

Efren,

I see that you're doing probabilistic verification of two models. All of
the probabilistic stats computed by MET are derived from an Nx2
probabilistic contingency table. Your choice of the forecast categorical
threshold defines the N probability bins. And your choice of the observed
categorical threshold defines the 2 yes/no bins for the observation.

It looks like you're processing the probability of significant wave
height > 12 ft.

So you'd set the observation cat_thresh = >12; since that's the event for
which probabilities are defined.

And if you set the forecast cat_thresh = ==0.2, you'd be using 5
probability bins:
   0.0 to 0.2, 0.2 to 0.4, 0.4 to 0.6, 0.6 to 0.8, and 0.8 to 1.0
If you set the forecast cat_thresh = ==0.1, you'd be using 10 probability
bins.
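
(For illustration only, here is a small Python sketch, not MET code, of
the equal-width bins implied by an ==<width> forecast cat_thresh; the
prob_bins helper below is just a stand-in for what the tool does
internally.)

    import numpy as np

    def prob_bins(width):
        """Equal-width probability bins implied by a forecast cat_thresh of ==width."""
        n_bins = int(round(1.0 / width))
        edges = np.linspace(0.0, 1.0, n_bins + 1)   # bin edges from 0 to 1
        mids = (edges[:-1] + edges[1:]) / 2.0       # bin mid-points
        return edges, mids

    for w in (0.2, 0.1, 0.01):
        edges, mids = prob_bins(w)
        # e.g. ==0.2 -> 5 bins with mid-points 0.1, 0.3, 0.5, 0.7, 0.9
        print("cat_thresh = ==%g -> %d probability bins" % (w, len(mids)))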

MET computes the Brier Score from the Nx2 table... not the raw
probability values.
In computing the Brier Score, all points falling inside a bin are
evaluated using the mid-point of that bin.
Points in 0.0 to 0.2 are processed as 0.1.
Points in 0.2 to 0.4 are processed as 0.3.
Points in 0.4 to 0.6 are processed as 0.5.
Points in 0.6 to 0.8 are processed as 0.7.
Points in 0.8 to 1.0 are processed as 0.9.
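
(To make that binning effect concrete, here is a rough Python sketch,
again not MET code, comparing a Brier score computed from raw
probabilities against one computed after replacing each probability with
its bin mid-point; the synthetic forecast/observation values and the
bin_to_midpoints helper are made up purely for illustration.)

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic forecast probabilities and matching yes/no observations.
    p_fcst = rng.uniform(0.0, 1.0, size=1000)
    o_obs = (rng.uniform(0.0, 1.0, size=1000) < p_fcst).astype(float)

    def brier(p, o):
        # Mean squared difference between forecast probability and 0/1 outcome.
        return np.mean((p - o) ** 2)

    def bin_to_midpoints(p, width):
        """Replace each probability with the mid-point of its equal-width bin."""
        n_bins = int(round(1.0 / width))
        idx = np.minimum((p / width).astype(int), n_bins - 1)  # bin index
        return (idx + 0.5) * width                              # bin mid-point

    print("raw Brier score:      %.4f" % brier(p_fcst, o_obs))
    for w in (0.2, 0.1, 0.01):
        print("binned (==%g) Brier:  %.4f"
              % (w, brier(bin_to_midpoints(p_fcst, w), o_obs)))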

In a recent MET-Help question from NOAA/WPC, we discussed this
situation. The Brier Score values were unexpected because WPC's
probability values were actually already binned. The process of
re-binning the already binned probabilities introduced some unexpected
diffs in the resulting Brier Score values. As a result, we wrote up this
GitHub issue:
   https://github.com/dtcenter/MET/issues/1495

Now I'm not sure if/how this applies to your data.  But to minimize the
effect of binning, you could try using 100 probability bins by setting
"cat_thresh = ==0.01;" I'd be curious to see how much impact the
cat_thresh setting has on the results. And remember that the statistics
reported by Grid-Stat are computed over some spatial area, as indicated
by VX_MASK. If that spatial area is large, it's likely that the relative
similarity between the fields AWAY from the large event is averaging
things out and making their performance look more similar on average.

So take a look at how you've defined the masking regions for Grid-Stat.

Thanks,
John


------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
From: efren.serra.ctr at nrlmry.navy.mil
Time: Wed Oct 07 10:04:54 2020

John - Thanks for the quick response; I'm going to try your suggestion
about minimizing the effect of binning with cat_thresh = ==0.01;


------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
From: efren.serra.ctr at nrlmry.navy.mil
Time: Wed Oct 07 10:49:24 2020

John - Here's my gen_vx_mask command: gen_vx_mask -type box -height 10
-intersection \

Hence, I'm creating a 10 x 10 1-degree box centered at the TC location.


------------------------------------------------
Subject: cat_thresh v8.1 versus v9.1
From: efren.serra.ctr at nrlmry.navy.mil
Time: Wed Oct 07 14:18:23 2020

John - These are plots of BASER_i, i.e. reliability diagrams, using
cat_thresh = ==0.01 as you suggested. What are you able to conclude from
these? The jigsaw pattern in WW3NAVGEM, for instance. I was wondering
about that. Thanks for your help. I'm going to try cat_thresh = ==0.05
next.


------------------------------------------------
Subject: cat_thresh v8.1 versus v9.1
From: John Halley Gotway
Time: Wed Oct 07 14:58:13 2020

Eric,

Efren Serra works at NRL and has been running the Grid-Stat tool in
MET to evaluate the performance of forecasts for the probability of
significant wave heights > 12 ft.

The attached graphic illustrates this case where there's a large
difference in the forecast probability values. The NAVGEM model on the
left has probability values up to 60% while the OFCL forecast includes
probabilities up to 90%. Efren is comparing these models using the
Brier score computed over a 10x10 degree box.

He was surprised not to see a larger difference in the resulting
reliability diagrams.

I explained how Grid-Stat evaluates probabilities using an Nx2
probability contingency table. Therefore the number of bins will affect
the results. In particular, when computing the Brier score, the
centerpoint of the bin is used instead of the raw probability values
that went into the bin.
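
To make that binning effect concrete, here is a minimal sketch in plain
Python (illustrative only, not MET code; the forecast/observation pairs
are made up). It compares the Brier score computed from the raw
probabilities with the score computed after each probability is
replaced by the midpoint of its bin:

   def bin_midpoint(p, width):
       # Midpoint of the probability bin of the given width containing p;
       # assumes 0 <= p < 1.
       return (int(p / width) + 0.5) * width

   def brier(pairs):
       # Mean squared difference between the forecast probability and
       # the 0/1 observed outcome.
       return sum((p - o) ** 2 for p, o in pairs) / len(pairs)

   # (forecast probability, observed event: 1 if wave height > 12 ft, else 0)
   pairs = [(0.074, 0), (0.183, 0), (0.352, 1), (0.561, 1), (0.848, 1)]

   print("raw probabilities :", brier(pairs))
   print("==0.2  (5 bins)   :", brier([(bin_midpoint(p, 0.2), o) for p, o in pairs]))
   print("==0.01 (100 bins) :", brier([(bin_midpoint(p, 0.01), o) for p, o in pairs]))

With ==0.01 bins the midpoints stay within 0.005 of the raw values, so
the score is nearly unchanged, while the wider ==0.2 bins can shift it
noticeably.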

The 2 attached images beginning with "wp_102018_basins" show the Brier
score across multiple forecast lead times.
And the 3 attached images beginning with "wp_102018_gt12ft" show
Reliability diagrams for a few different lead times.

I assume that the jagged pattern in the reliability diagram is the
result of choosing too many bins, resulting in bins with a small number
of points, which leads to sporadic results.

On the one hand, choosing many bins minimizes the effect of binning in
the computation of the Brier score. On the other hand, too many bins
make for a reliability diagram that is not very smooth.

Do you have any recommendations or advice for Efren in the
verification of
the probabilistic data or interpretation of results?

Thanks,
John

---------- Forwarded message ---------
From: efren.serra.ctr at nrlmry.navy.mil via RT <met_help at ucar.edu>
Date: Wed, Oct 7, 2020 at 2:18 PM
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
To: <johnhg at ucar.edu>



<URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >

John - These are plots of BASER_i or reliability diagrams using
cat_thresh == 0.01 as you suggested. What are you able to conclude from
these? The jigsaw pattern in WW3NAVGEM, for instance. I was wondering
about that. Thanks for your help. I'm going to try cat_thresh == 0.05
next.

-----Original Message-----
From: John Halley Gotway via RT <met_help at ucar.edu>
Sent: Wednesday, October 7, 2020 9:01 AM
To: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil
>
Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

Efren,

I see that you're doing probabilistic verification of two models. All
of
the probabilistic stats computed by MET are derived from an Nx2
probabilistic contingency table. Your choice of the forecast
categorical
threshold defines the N probability bins. And your choice of the
observed
categorical threshold defines the 2 yes/no bins for the observation.

It looks like you're processing the probability of significant wave
height
> 12 ft.

So you'd set the observation cat_thresh = >12; since that's the event
for
which probabilities are defined.

And if you set the forecast cat_thresh = ==0.2, you'd be using 5
probability bins:
   0.0 to 0.2, 0.2 to 0.4, 0.4 to 0.6, 0.6 to 0.8, and 0.8 to 1.0.
If you set the forecast cat_thresh = ==0.1, you'd be using 10
probability bins.
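
For reference, a minimal sketch of those settings in the Grid-Stat
config syntax (the forecast entries are copied from your original
config; the observation field name and level are placeholders, not
read from your files):

   fcst = {
      field = [
         {
           name       = "MFLX";
           level      = "Z0";
           prob       = TRUE;
           // ==0.2 gives the 5 probability bins above; ==0.1 gives 10
           cat_thresh = [ ==0.2 ];
         }
      ];
   }
   obs = {
      field = [
         {
           // placeholder field name/level for the analyzed wave height
           name       = "WVHGT";
           level      = "Z0";
           // the observed event for which the probabilities are defined
           cat_thresh = [ >12 ];
         }
      ];
   }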

MET computes the Brier Score from the Nx2 table... not the raw
probability
values.
In computing the Brier Score, all points falling inside a bin are
evaluated
using the mid-point of that bin.
Points in 0.0 to 0.2 are processed as 0.1.
Points in 0.2 to 0.4 are processed as 0.3.
Points in 0.4 to 0.6 are processed as 0.5.
Points in 0.6 to 0.8 are processed as 0.7.
Points in 0.8 to 1.0 are processed as 0.9.

In a recent MET-Help question from NOAA/WPC, we discussed this
situation.
The Brier Score values were unexpected because WPC's probability
values
were actually already binned. The process of re-binning the already
binned
probabilities introduced some unexpected diffs in the resulting Brier
Score
values. As a result, we wrote up this GitHub issue:
   https://github.com/dtcenter/MET/issues/1495

Now I'm not sure if/how this applies to your data.  But to minimize
the
effect of binning, you could try using 100 probability bins by setting
"cat_thresh = ==0.01;" I'd be curious to see how much impact the
cat_thresh
setting has on the results. And remember that the statistics reported
by
Grid-Stat are computed over some spatial area, as indicated by
VX_MASK. If
that spatial area is large, it's likely that the relative similarity
between the fields AWAY from the large event is averaging things out
and
making their performance look more similar on average.

So take a look at how you've defined the masking regions for Grid-
Stat.
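
As a sketch of where that is set (recalled from the default Grid-Stat
config, so treat the exact entries as an assumption), the masking
regions live in the "mask" dictionary, and each entry is reported under
its own VX_MASK value in the output:

   mask = {
      grid = [];
      // a smaller region focused on the event; the path is a placeholder
      poly = [ "/path/to/storm_region.poly" ];
   }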

Thanks,
John

On Wed, Oct 7, 2020 at 9:38 AM Julie Prestopnik via RT
<met_help at ucar.edu>
wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
>
> Hi Efren.
>
> I just wanted to follow up on your previous email.  You mentioned,
"I
> was wondering about the difference between specifying forecast bins
of
> .2 versus .1 with cat_thresh = [==.1] or [==.2]".
>
> That information can be found on this page of the User's Guide:
>
>
https://dtcenter.github.io/MET/Users_Guide/data_io.html#configuration-
> file-details
>
> - Threshold:
> >       - A threshold type (<, <=, ==, !-, >=, or >) followed by a
> > numeric
> value.
> >       - The threshold type may also be specified using two letter
> abbreviations
> >         (lt, le, eq, ne, ge, gt).
> >       - Multiple thresholds may be combined by specifying the
logic
> > type
> of AND
> >         (&&) or *OR (||)*. For example, ">=5&&<=10" defines the
> > numbers
> between 5
> >         and 10 and *"==1||==2" defines numbers exactly equal to 1
or
> > 2.*
> >
> >
> If that doesn't help and doesn't explain why you are not seeing the
> differences you expect, please let me know and I will see if John
can
> better assist you.
>
> Thanks!
>
> Julie
>
> On Wed, Oct 7, 2020 at 9:32 AM efren.serra.ctr at nrlmry.navy.mil via
RT
> < met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
> >
> > Attached are plots I made from pjc, prc and pstd output. Here I'm
> > including the Brier score out of pstd data files. We are comparing
> > two different probability models versus the same ground truth and
we
> > expect remarkable differences between the probabilistic models,
but
> > we don't see that in these plots. I have worked with John before,
so
> > I was wondering
> if
> > you don't mind sharing this with him.
> >
> > -----Original Message-----
> > From: Julie Prestopnik via RT <met_help at ucar.edu>
> > Sent: Wednesday, October 7, 2020 7:41 AM
> > To: Serra, Mr. Efren, Contractor, Code 7531 <
> > efren.serra.ctr at nrlmry.navy.mil>
> > Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
> >
> > Hi Efren.
> >
> > I see that you are under the impression that Grid-Stat for METv9.1
> > no longer includes cat_thresh.  However, if you look in the
> > met-9.1/share/met/config/GridStatConfig_default file, you can see
> > that cat_thresh is still available for use in Grid-Stat:
> >
> > //
> > > // Forecast and observation fields to be verified // fcst = {
> > >    field = [
> > >       {
> > >         name       = "APCP";
> > >         level      = [ "A03" ];
> > >         *cat_thresh *= [ >0.0, >=5.0 ];
> > >       }
> > >    ];
> > > }
> >
> >
> > You mentioned that the v9.1 documentation doesn't have cat_thresh
> > described.  Please let us know if there is something in the
METv8.1
> > documentation that is missing from the METv9.1 documentation
> > (specific quote, section, etc.) so that we can ensure it's
inclusion
> > in the METv9.1 documentation.
> >
> > Thanks!
> >
> > Julie
> >
> > On Tue, Oct 6, 2020 at 4:27 PM efren.serra.ctr at nrlmry.navy.mil via
> > RT < met_help at ucar.edu> wrote:
> >
> > >
> > > Tue Oct 06 16:27:16 2020: Request 96983 was acted upon.
> > > Transaction: Ticket created by efren.serra.ctr at nrlmry.navy.mil
> > >        Queue: met_help
> > >      Subject: cat_thresh v8.1 versus v9.1
> > >        Owner: Nobody
> > >   Requestors: efren.serra.ctr at nrlmry.navy.mil
> > >       Status: new
> > >  Ticket <URL:
> > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983
> > > >
> > >
> > >
> > > Folks - I updated my MET tools from v8.1 to v9.1. I had the
> > > following fcst entry for my GridStatConfig file(s) but I noticed
> > > that v9.1 documentation doesn't have cat_thresh. What should I
map
> > > cat_thresh to? There are vld_thresh and cov_thresh? Thank you.
> > >
> > > fcst = {
> > >
> > >    valid_time = "20180711_12";
> > >    field = [
> > >       {
> > >         file_type  = GRIB1;
> > >         model      = "WW3NAVGEM";
> > >         name       = "MFLX";
> > >         level      = "Z0";
> > >         prob       = TRUE;
> > >         cat_thresh = [ ==0.1 ];
> > >       }
> > >    ];
> > >
> > > }
> > >
> > > Efren A. Serra (Contractor)
> > > Physicist
> > >
> > > DeVine Consulting, Inc.
> > > Naval Research Laboratory
> > > Marine Meteorology Division
> > > 7 Grace Hopper Ave., STOP 2
> > > Monterey, CA 93943
> > > Code 7542
> > > Mobile: 408-425-5027
> > >
> > >
> > >
> >
> > --
> > Julie Prestopnik (she/her/hers)
> > Software Engineer
> > National Center for Atmospheric Research Research Applications
> > Laboratory
> > Email: jpresto at ucar.edu
> >
> > My working day may not be your working day.  Please do not feel
> > obliged
> to
> > reply to this email outside of your normal working hours.
> >
> >
> >
>
> --
> Julie Prestopnik (she/her/hers)
> Software Engineer
> National Center for Atmospheric Research Research Applications
> Laboratory
> Email: jpresto at ucar.edu
>
> My working day may not be your working day.  Please do not feel
> obliged to reply to this email outside of your normal working hours.
>
>

------------------------------------------------
Subject: cat_thresh v8.1 versus v9.1
From: Eric Gilleland
Time: Thu Oct 08 09:07:10 2020

Hi John and Efren (and whoever else gets this email via met_help),

John, you're absolutely correct about the choice of binning being
problematic: too smooth v. not smooth enough.  It's been a very long
time since I've read anything about this issue, as I have not really
worked with probabilistic forecasts in my career, so I'll have to look
for some of those papers and see if anything new is out there, but I
will look into possible solutions.  I'll get back to you.

Best,

Eric

On Wed, Oct 7, 2020 at 2:58 PM John Halley Gotway via RT
<met_help at ucar.edu>
wrote:

> Eric,
>
> Efren Serra works at NRL and has been running the Grid-Stat tool in
MET to
> evaluate the performance of forecast for the probability of
significant
> wave heights > 12 ft.
>
> The attached graphic illustrates this case where there's a large
difference
> in the forecast probability values. The NAVGEM model on the left has
> probability values up to 60% while the OFCL forecast includes
probabilities
> up to 90%. Efren is looking comparing these models using the Brier
score
> computed over a 10x10 degree box.
>
> He was surprised not to see a larger difference in the resulting
> reliability diagrams.
>
> I explained how Grid-Stat evaulates probabilities using an Nx2
probability
> contingency table. Therefore the number of bins will affect the
results. In
> particular, when computing the Brier score, the centerpoint of the
bin is
> used instead of the raw probability values that went into the bin.
>
> The 2 attached images beginning with "wp_102018_basins" show the
Brier
> score across multiple forecast lead times.
> And the 3 attached images beginning with "wp_102018_gt12ft" show
> Reliability diagrams for a few different lead times.
>
> I assume that jagged pattern in the reliability diagram is the
result of
> choosing too many bins, resulting in bins with a small number of
points
> which leads to sporadic results.
>
> On the one hand, choosing many bins minimizes the effect of binning
in the
> computation of the Brier score. On the other hand, too many bins
make for a
> reliability diagram that is not very smooth.
>
> Do you have any recommendations or advice for Efren in the
verification of
> the probabilistic data or interpretation of results?
>
> Thanks,
> John
>
> ---------- Forwarded message ---------
> From: efren.serra.ctr at nrlmry.navy.mil via RT <met_help at ucar.edu>
> Date: Wed, Oct 7, 2020 at 2:18 PM
> Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
> To: <johnhg at ucar.edu>
>
>
>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
>
> John - These are plots of BASER_i or reliability diagrams using
cat_thresh
> == 0.01 as you suggested. What can you able to conclude from these?
The
> jigsaw pattern in WW3NAVGEM, for instance. I was wondering about
that.
> Thanks for your help. I'm going to try cat_thresh == 0.05 next.
>
> -----Original Message-----
> From: John Halley Gotway via RT <met_help at ucar.edu>
> Sent: Wednesday, October 7, 2020 9:01 AM
> To: Serra, Mr. Efren, Contractor, Code 7531 <
> efren.serra.ctr at nrlmry.navy.mil
> >
> Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
>
> Efren,
>
> I see that you're doing probabilistic verification of two models.
All of
> the probabilistic stats computed by MET are derived from an Nx2
> probabilistic contingency table. Your choice of the forecast
categorical
> threshold defines the N probability bins. And your choice of the
observed
> categorical threshold defines the 2 yes/no bins for the observation.
>
> It looks like you're processing the probability of significant wave
height
> > 12 ft.
>
> So you'd set the observation cat_thresh = >12; since that's the
event for
> which probabilities are defined.
>
> And if you set the forecast cat_thresh = ==0.2, you'd be using 5
> probability bins:
>    0.0 to 0.2, 0.2 to 0.4, 0.4 to 0.6, 0.6 to 0.8, and 0.8 to 1.0 If
you
> set the forecast cat_thresh = ==0.1, you'd be using 10 probability
bins.
>
> MET computes the Brier Score from the Nx2 table... not the raw
probability
> values.
> In computing the Brier Score, all points falling inside a bin are
evaluated
> using the mid-point of that bin.
> Points in 0.0 to 0.2 are processed as 0.1.
> Points in 0.2 to 0.4 are processed as 0.3.
> Points in 0.4 to 0.6 are processed as 0.5.
> Points in 0.6 to 0.8 are processed as 0.7.
> Points in 0.8 to 1.0 are processed as 0.9.
>
> In a recent MET-Help question from NOAA/WPC, we discussed this
situation.
> The Brier Score values were unexpected because WPC's probability
values
> were actually already binned. The process of re-binning the already
binned
> probabilities introduced some unexpected diffs in the resulting
Brier Score
> values. As a result, we wrote up this GitHub issue:
>    https://github.com/dtcenter/MET/issues/1495
>
> Now I'm not sure if/how this applies to your data.  But to minimize
the
> effect of binning, you could try using 100 probability bins by
setting
> "cat_thresh = ==0.01;" I'd be curious to see how much impact the
cat_thresh
> setting has on the results. And remember that the statistics
reported by
> Grid-Stat are computed over some spatial area, as indicated by
VX_MASK. If
> that spatial area is large, it's likely that the relative similarity
> between the fields AWAY from the large event is averaging things out
and
> making their performance look more similar on average.
>
> So take a look at how you've defined the masking regions for Grid-
Stat.
>
> Thanks,
> John
>
> On Wed, Oct 7, 2020 at 9:38 AM Julie Prestopnik via RT
<met_help at ucar.edu>
> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
> >
> > Hi Efren.
> >
> > I just wanted to follow up on your previous email.  You mentioned,
"I
> > was wondering about the difference between specifying forecast
bins of
> > .2 versus .1 with cat_thresh = [==.1] or [==.2]".
> >
> > That information can be found on this page of the User's Guide:
> >
> >
https://dtcenter.github.io/MET/Users_Guide/data_io.html#configuration-
> > file-details
> >
> > - Threshold:
> > >       - A threshold type (<, <=, ==, !-, >=, or >) followed by a
> > > numeric
> > value.
> > >       - The threshold type may also be specified using two
letter
> > abbreviations
> > >         (lt, le, eq, ne, ge, gt).
> > >       - Multiple thresholds may be combined by specifying the
logic
> > > type
> > of AND
> > >         (&&) or *OR (||)*. For example, ">=5&&<=10" defines the
> > > numbers
> > between 5
> > >         and 10 and *"==1||==2" defines numbers exactly equal to
1 or
> > > 2.*
> > >
> > >
> > If that doesn't help and doesn't explain why you are not seeing
the
> > differences you expect, please let me know and I will see if John
can
> > better assist you.
> >
> > Thanks!
> >
> > Julie
> >
> > On Wed, Oct 7, 2020 at 9:32 AM efren.serra.ctr at nrlmry.navy.mil via
RT
> > < met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
> > >
> > > Attached are plots I made from pjc, prc and pstd output. Here
I'm
> > > including the Brier score out of pstd data files. We are
comparing
> > > two different probability models versus the same ground truth
and we
> > > expect remarkable differences between the probabilistic models,
but
> > > we don't see that in these plots. I have worked with John
before, so
> > > I was wondering
> > if
> > > you don't mind sharing this with him.
> > >
> > > -----Original Message-----
> > > From: Julie Prestopnik via RT <met_help at ucar.edu>
> > > Sent: Wednesday, October 7, 2020 7:41 AM
> > > To: Serra, Mr. Efren, Contractor, Code 7531 <
> > > efren.serra.ctr at nrlmry.navy.mil>
> > > Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus
v9.1
> > >
> > > Hi Efren.
> > >
> > > I see that you are under the impression that Grid-Stat for
METv9.1
> > > no longer includes cat_thresh.  However, if you look in the
> > > met-9.1/share/met/config/GridStatConfig_default file, you can
see
> > > that cat_thresh is still available for use in Grid-Stat:
> > >
> > > //
> > > > // Forecast and observation fields to be verified // fcst = {
> > > >    field = [
> > > >       {
> > > >         name       = "APCP";
> > > >         level      = [ "A03" ];
> > > >         *cat_thresh *= [ >0.0, >=5.0 ];
> > > >       }
> > > >    ];
> > > > }
> > >
> > >
> > > You mentioned that the v9.1 documentation doesn't have
cat_thresh
> > > described.  Please let us know if there is something in the
METv8.1
> > > documentation that is missing from the METv9.1 documentation
> > > (specific quote, section, etc.) so that we can ensure it's
inclusion
> > > in the METv9.1 documentation.
> > >
> > > Thanks!
> > >
> > > Julie
> > >
> > > On Tue, Oct 6, 2020 at 4:27 PM efren.serra.ctr at nrlmry.navy.mil
via
> > > RT < met_help at ucar.edu> wrote:
> > >
> > > >
> > > > Tue Oct 06 16:27:16 2020: Request 96983 was acted upon.
> > > > Transaction: Ticket created by efren.serra.ctr at nrlmry.navy.mil
> > > >        Queue: met_help
> > > >      Subject: cat_thresh v8.1 versus v9.1
> > > >        Owner: Nobody
> > > >   Requestors: efren.serra.ctr at nrlmry.navy.mil
> > > >       Status: new
> > > >  Ticket <URL:
> > > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983
> > > > >
> > > >
> > > >
> > > > Folks - I updated my MET tools from v8.1 to v9.1. I had the
> > > > following fcst entry for my GridStatConfig file(s) but I
noticed
> > > > that v9.1 documentation doesn't have cat_thresh. What should I
map
> > > > cat_thresh to? There are vld_thresh and cov_thresh? Thank you.
> > > >
> > > > fcst = {
> > > >
> > > >    valid_time = "20180711_12";
> > > >    field = [
> > > >       {
> > > >         file_type  = GRIB1;
> > > >         model      = "WW3NAVGEM";
> > > >         name       = "MFLX";
> > > >         level      = "Z0";
> > > >         prob       = TRUE;
> > > >         cat_thresh = [ ==0.1 ];
> > > >       }
> > > >    ];
> > > >
> > > > }
> > > >
> > > > Efren A. Serra (Contractor)
> > > > Physicist
> > > >
> > > > DeVine Consulting, Inc.
> > > > Naval Research Laboratory
> > > > Marine Meteorology Division
> > > > 7 Grace Hopper Ave., STOP 2
> > > > Monterey, CA 93943
> > > > Code 7542
> > > > Mobile: 408-425-5027
> > > >
> > > >
> > > >
> > >
> > > --
> > > Julie Prestopnik (she/her/hers)
> > > Software Engineer
> > > National Center for Atmospheric Research Research Applications
> > > Laboratory
> > > Email: jpresto at ucar.edu
> > >
> > > My working day may not be your working day.  Please do not feel
> > > obliged
> > to
> > > reply to this email outside of your normal working hours.
> > >
> > >
> > >
> >
> > --
> > Julie Prestopnik (she/her/hers)
> > Software Engineer
> > National Center for Atmospheric Research Research Applications
> > Laboratory
> > Email: jpresto at ucar.edu
> >
> > My working day may not be your working day.  Please do not feel
> > obliged to reply to this email outside of your normal working
hours.
> >
> >
>
>

------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
From: efren.serra.ctr at nrlmry.navy.mil
Time: Thu Oct 08 13:32:48 2020

Gents - thank you so much for taking the time to look into this. I
appreciate it greatly.

-----Original Message-----
From: John Halley Gotway via RT <met_help at ucar.edu>
Sent: Wednesday, October 7, 2020 1:58 PM
To: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil>
Cc: ericg at ucar.edu
Subject: Fwd: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

Eric,

Efren Serra works at NRL and has been running the Grid-Stat tool in
MET to evaluate the performance of forecast for the probability of
significant wave heights > 12 ft.

The attached graphic illustrates this case where there's a large
difference in the forecast probability values. The NAVGEM model on the
left has probability values up to 60% while the OFCL forecast includes
probabilities up to 90%. Efren is looking comparing these models using
the Brier score computed over a 10x10 degree box.

He was surprised not to see a larger difference in the resulting
reliability diagrams.

I explained how Grid-Stat evaulates probabilities using an Nx2
probability contingency table. Therefore the number of bins will
affect the results. In particular, when computing the Brier score, the
centerpoint of the bin is used instead of the raw probability values
that went into the bin.

The 2 attached images beginning with "wp_102018_basins" show the Brier
score across multiple forecast lead times.
And the 3 attached images beginning with "wp_102018_gt12ft" show
Reliability diagrams for a few different lead times.

I assume that jagged pattern in the reliability diagram is the result
of choosing too many bins, resulting in bins with a small number of
points which leads to sporadic results.

On the one hand, choosing many bins minimizes the effect of binning in
the computation of the Brier score. On the other hand, too many bins
make for a reliability diagram that is not very smooth.

Do you have any recommendations or advice for Efren in the
verification of the probabilistic data or interpretation of results?

Thanks,
John

---------- Forwarded message ---------
From: efren.serra.ctr at nrlmry.navy.mil via RT <met_help at ucar.edu>
Date: Wed, Oct 7, 2020 at 2:18 PM
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
To: <johnhg at ucar.edu>



<URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >

John - These are plots of BASER_i or reliability diagrams using
cat_thresh == 0.01 as you suggested. What can you able to conclude
from these? The jigsaw pattern in WW3NAVGEM, for instance. I was
wondering about that.
Thanks for your help. I'm going to try cat_thresh == 0.05 next.

-----Original Message-----
From: John Halley Gotway via RT <met_help at ucar.edu>
Sent: Wednesday, October 7, 2020 9:01 AM
To: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil
>
Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

Efren,

I see that you're doing probabilistic verification of two models. All
of the probabilistic stats computed by MET are derived from an Nx2
probabilistic contingency table. Your choice of the forecast
categorical threshold defines the N probability bins. And your choice
of the observed categorical threshold defines the 2 yes/no bins for
the observation.

It looks like you're processing the probability of significant wave
height
> 12 ft.

So you'd set the observation cat_thresh = >12; since that's the event
for which probabilities are defined.

And if you set the forecast cat_thresh = ==0.2, you'd be using 5
probability bins:
   0.0 to 0.2, 0.2 to 0.4, 0.4 to 0.6, 0.6 to 0.8, and 0.8 to 1.0 If
you set the forecast cat_thresh = ==0.1, you'd be using 10 probability
bins.

MET computes the Brier Score from the Nx2 table... not the raw
probability values.
In computing the Brier Score, all points falling inside a bin are
evaluated using the mid-point of that bin.
Points in 0.0 to 0.2 are processed as 0.1.
Points in 0.2 to 0.4 are processed as 0.3.
Points in 0.4 to 0.6 are processed as 0.5.
Points in 0.6 to 0.8 are processed as 0.7.
Points in 0.8 to 1.0 are processed as 0.9.

In a recent MET-Help question from NOAA/WPC, we discussed this
situation.
The Brier Score values were unexpected because WPC's probability
values were actually already binned. The process of re-binning the
already binned probabilities introduced some unexpected diffs in the
resulting Brier Score values. As a result, we wrote up this GitHub
issue:
   https://github.com/dtcenter/MET/issues/1495

Now I'm not sure if/how this applies to your data.  But to minimize
the effect of binning, you could try using 100 probability bins by
setting "cat_thresh = ==0.01;" I'd be curious to see how much impact
the cat_thresh setting has on the results. And remember that the
statistics reported by Grid-Stat are computed over some spatial area,
as indicated by VX_MASK. If that spatial area is large, it's likely
that the relative similarity between the fields AWAY from the large
event is averaging things out and making their performance look more
similar on average.

So take a look at how you've defined the masking regions for Grid-
Stat.

Thanks,
John

On Wed, Oct 7, 2020 at 9:38 AM Julie Prestopnik via RT
<met_help at ucar.edu>
wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
>
> Hi Efren.
>
> I just wanted to follow up on your previous email.  You mentioned,
"I
> was wondering about the difference between specifying forecast bins
of
> .2 versus .1 with cat_thresh = [==.1] or [==.2]".
>
> That information can be found on this page of the User's Guide:
>
>
https://dtcenter.github.io/MET/Users_Guide/data_io.html#configuration-
> file-details
>
> - Threshold:
> >       - A threshold type (<, <=, ==, !-, >=, or >) followed by a
> > numeric
> value.
> >       - The threshold type may also be specified using two letter
> abbreviations
> >         (lt, le, eq, ne, ge, gt).
> >       - Multiple thresholds may be combined by specifying the
logic
> > type
> of AND
> >         (&&) or *OR (||)*. For example, ">=5&&<=10" defines the
> > numbers
> between 5
> >         and 10 and *"==1||==2" defines numbers exactly equal to 1
or
> > 2.*
> >
> >
> If that doesn't help and doesn't explain why you are not seeing the
> differences you expect, please let me know and I will see if John
can
> better assist you.
>
> Thanks!
>
> Julie
>
> On Wed, Oct 7, 2020 at 9:32 AM efren.serra.ctr at nrlmry.navy.mil via
RT
> < met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
> >
> > Attached are plots I made from pjc, prc and pstd output. Here I'm
> > including the Brier score out of pstd data files. We are comparing
> > two different probability models versus the same ground truth and
we
> > expect remarkable differences between the probabilistic models,
but
> > we don't see that in these plots. I have worked with John before,
so
> > I was wondering
> if
> > you don't mind sharing this with him.
> >
> > -----Original Message-----
> > From: Julie Prestopnik via RT <met_help at ucar.edu>
> > Sent: Wednesday, October 7, 2020 7:41 AM
> > To: Serra, Mr. Efren, Contractor, Code 7531 <
> > efren.serra.ctr at nrlmry.navy.mil>
> > Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
> >
> > Hi Efren.
> >
> > I see that you are under the impression that Grid-Stat for METv9.1
> > no longer includes cat_thresh.  However, if you look in the
> > met-9.1/share/met/config/GridStatConfig_default file, you can see
> > that cat_thresh is still available for use in Grid-Stat:
> >
> > //
> > > // Forecast and observation fields to be verified // fcst = {
> > >    field = [
> > >       {
> > >         name       = "APCP";
> > >         level      = [ "A03" ];
> > >         *cat_thresh *= [ >0.0, >=5.0 ];
> > >       }
> > >    ];
> > > }
> >
> >
> > You mentioned that the v9.1 documentation doesn't have cat_thresh
> > described.  Please let us know if there is something in the
METv8.1
> > documentation that is missing from the METv9.1 documentation
> > (specific quote, section, etc.) so that we can ensure it's
inclusion
> > in the METv9.1 documentation.
> >
> > Thanks!
> >
> > Julie
> >
> > On Tue, Oct 6, 2020 at 4:27 PM efren.serra.ctr at nrlmry.navy.mil via
> > RT < met_help at ucar.edu> wrote:
> >
> > >
> > > Tue Oct 06 16:27:16 2020: Request 96983 was acted upon.
> > > Transaction: Ticket created by efren.serra.ctr at nrlmry.navy.mil
> > >        Queue: met_help
> > >      Subject: cat_thresh v8.1 versus v9.1
> > >        Owner: Nobody
> > >   Requestors: efren.serra.ctr at nrlmry.navy.mil
> > >       Status: new
> > >  Ticket <URL:
> > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983
> > > >
> > >
> > >
> > > Folks - I updated my MET tools from v8.1 to v9.1. I had the
> > > following fcst entry for my GridStatConfig file(s) but I noticed
> > > that v9.1 documentation doesn't have cat_thresh. What should I
map
> > > cat_thresh to? There are vld_thresh and cov_thresh? Thank you.
> > >
> > > fcst = {
> > >
> > >    valid_time = "20180711_12";
> > >    field = [
> > >       {
> > >         file_type  = GRIB1;
> > >         model      = "WW3NAVGEM";
> > >         name       = "MFLX";
> > >         level      = "Z0";
> > >         prob       = TRUE;
> > >         cat_thresh = [ ==0.1 ];
> > >       }
> > >    ];
> > >
> > > }
> > >
> > > Efren A. Serra (Contractor)
> > > Physicist
> > >
> > > DeVine Consulting, Inc.
> > > Naval Research Laboratory
> > > Marine Meteorology Division
> > > 7 Grace Hopper Ave., STOP 2
> > > Monterey, CA 93943
> > > Code 7542
> > > Mobile: 408-425-5027
> > >
> > >
> > >
> >
> > --
> > Julie Prestopnik (she/her/hers)
> > Software Engineer
> > National Center for Atmospheric Research Research Applications
> > Laboratory
> > Email: jpresto at ucar.edu
> >
> > My working day may not be your working day.  Please do not feel
> > obliged
> to
> > reply to this email outside of your normal working hours.
> >
> >
> >
>
> --
> Julie Prestopnik (she/her/hers)
> Software Engineer
> National Center for Atmospheric Research Research Applications
> Laboratory
> Email: jpresto at ucar.edu
>
> My working day may not be your working day.  Please do not feel
> obliged to reply to this email outside of your normal working hours.
>
>



------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
From: efren.serra.ctr at nrlmry.navy.mil
Time: Thu Oct 08 13:34:05 2020

Thank you, Eric.

-----Original Message-----
From: Eric Gilleland via RT <met_help at ucar.edu>
Sent: Thursday, October 8, 2020 8:07 AM
To: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil>
Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

Hi John and Efren (and whoever else gets this email via met_help),

John, you're absolutely correct about the choice of binning being
problematic: too smooth v. not smooth enough.  It's been a very long
time since I've read anything about this issue as I have not really
worked with probabilistic forecasts at all in my career, so I'll have
to look for some of those papers and also see if anything new is out
there but I will look into possible solutions.  I'll get back to you.

Best,

Eric

On Wed, Oct 7, 2020 at 2:58 PM John Halley Gotway via RT
<met_help at ucar.edu>
wrote:

> Eric,
>
> Efren Serra works at NRL and has been running the Grid-Stat tool in
> MET to evaluate the performance of forecast for the probability of
> significant wave heights > 12 ft.
>
> The attached graphic illustrates this case where there's a large
> difference in the forecast probability values. The NAVGEM model on
the
> left has probability values up to 60% while the OFCL forecast
includes
> probabilities up to 90%. Efren is looking comparing these models
using
> the Brier score computed over a 10x10 degree box.
>
> He was surprised not to see a larger difference in the resulting
> reliability diagrams.
>
> I explained how Grid-Stat evaulates probabilities using an Nx2
> probability contingency table. Therefore the number of bins will
> affect the results. In particular, when computing the Brier score,
the
> centerpoint of the bin is used instead of the raw probability values
that went into the bin.
>
> The 2 attached images beginning with "wp_102018_basins" show the
Brier
> score across multiple forecast lead times.
> And the 3 attached images beginning with "wp_102018_gt12ft" show
> Reliability diagrams for a few different lead times.
>
> I assume that jagged pattern in the reliability diagram is the
result
> of choosing too many bins, resulting in bins with a small number of
> points which leads to sporadic results.
>
> On the one hand, choosing many bins minimizes the effect of binning
in
> the computation of the Brier score. On the other hand, too many bins
> make for a reliability diagram that is not very smooth.
>
> Do you have any recommendations or advice for Efren in the
> verification of the probabilistic data or interpretation of results?
>
> Thanks,
> John
>
> ---------- Forwarded message ---------
> From: efren.serra.ctr at nrlmry.navy.mil via RT <met_help at ucar.edu>
> Date: Wed, Oct 7, 2020 at 2:18 PM
> Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
> To: <johnhg at ucar.edu>
>
>
>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
>
> John - These are plots of BASER_i or reliability diagrams using
> cat_thresh == 0.01 as you suggested. What can you able to conclude
> from these? The jigsaw pattern in WW3NAVGEM, for instance. I was
wondering about that.
> Thanks for your help. I'm going to try cat_thresh == 0.05 next.
>
> -----Original Message-----
> From: John Halley Gotway via RT <met_help at ucar.edu>
> Sent: Wednesday, October 7, 2020 9:01 AM
> To: Serra, Mr. Efren, Contractor, Code 7531 <
> efren.serra.ctr at nrlmry.navy.mil
> >
> Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
>
> Efren,
>
> I see that you're doing probabilistic verification of two models.
All
> of the probabilistic stats computed by MET are derived from an Nx2
> probabilistic contingency table. Your choice of the forecast
> categorical threshold defines the N probability bins. And your
choice
> of the observed categorical threshold defines the 2 yes/no bins for
the observation.
>
> It looks like you're processing the probability of significant wave
> height
> > 12 ft.
>
> So you'd set the observation cat_thresh = >12; since that's the
event
> for which probabilities are defined.
>
> And if you set the forecast cat_thresh = ==0.2, you'd be using 5
> probability bins:
>    0.0 to 0.2, 0.2 to 0.4, 0.4 to 0.6, 0.6 to 0.8, and 0.8 to 1.0 If
> you set the forecast cat_thresh = ==0.1, you'd be using 10
probability bins.
>
> MET computes the Brier Score from the Nx2 table... not the raw
> probability values.
> In computing the Brier Score, all points falling inside a bin are
> evaluated using the mid-point of that bin.
> Points in 0.0 to 0.2 are processed as 0.1.
> Points in 0.2 to 0.4 are processed as 0.3.
> Points in 0.4 to 0.6 are processed as 0.5.
> Points in 0.6 to 0.8 are processed as 0.7.
> Points in 0.8 to 1.0 are processed as 0.9.
>
> In a recent MET-Help question from NOAA/WPC, we discussed this
situation.
> The Brier Score values were unexpected because WPC's probability
> values were actually already binned. The process of re-binning the
> already binned probabilities introduced some unexpected diffs in the
> resulting Brier Score values. As a result, we wrote up this GitHub
issue:
>    https://github.com/dtcenter/MET/issues/1495
>
> Now I'm not sure if/how this applies to your data.  But to minimize
> the effect of binning, you could try using 100 probability bins by
> setting "cat_thresh = ==0.01;" I'd be curious to see how much impact
> the cat_thresh setting has on the results. And remember that the
> statistics reported by Grid-Stat are computed over some spatial
area,
> as indicated by VX_MASK. If that spatial area is large, it's likely
> that the relative similarity between the fields AWAY from the large
> event is averaging things out and making their performance look more
similar on average.
>
> So take a look at how you've defined the masking regions for Grid-
Stat.
>
> Thanks,
> John
>
> On Wed, Oct 7, 2020 at 9:38 AM Julie Prestopnik via RT
> <met_help at ucar.edu>
> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
> >
> > Hi Efren.
> >
> > I just wanted to follow up on your previous email.  You mentioned,
> > "I was wondering about the difference between specifying forecast
> > bins of
> > .2 versus .1 with cat_thresh = [==.1] or [==.2]".
> >
> > That information can be found on this page of the User's Guide:
> >
> >
https://dtcenter.github.io/MET/Users_Guide/data_io.html#configuratio
> > n-
> > file-details
> >
> > - Threshold:
> > >       - A threshold type (<, <=, ==, !-, >=, or >) followed by a
> > > numeric
> > value.
> > >       - The threshold type may also be specified using two
letter
> > abbreviations
> > >         (lt, le, eq, ne, ge, gt).
> > >       - Multiple thresholds may be combined by specifying the
> > > logic type
> > of AND
> > >         (&&) or *OR (||)*. For example, ">=5&&<=10" defines the
> > > numbers
> > between 5
> > >         and 10 and *"==1||==2" defines numbers exactly equal to
1
> > > or
> > > 2.*
> > >
> > >
> > If that doesn't help and doesn't explain why you are not seeing
the
> > differences you expect, please let me know and I will see if John
> > can better assist you.
> >
> > Thanks!
> >
> > Julie
> >
> > On Wed, Oct 7, 2020 at 9:32 AM efren.serra.ctr at nrlmry.navy.mil via
> > RT < met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
> > >
> > > Attached are plots I made from pjc, prc and pstd output. Here
I'm
> > > including the Brier score out of pstd data files. We are
comparing
> > > two different probability models versus the same ground truth
and
> > > we expect remarkable differences between the probabilistic
models,
> > > but we don't see that in these plots. I have worked with John
> > > before, so I was wondering
> > if
> > > you don't mind sharing this with him.
> > >
> > > -----Original Message-----
> > > From: Julie Prestopnik via RT <met_help at ucar.edu>
> > > Sent: Wednesday, October 7, 2020 7:41 AM
> > > To: Serra, Mr. Efren, Contractor, Code 7531 <
> > > efren.serra.ctr at nrlmry.navy.mil>
> > > Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus
v9.1
> > >
> > > Hi Efren.
> > >
> > > I see that you are under the impression that Grid-Stat for
METv9.1
> > > no longer includes cat_thresh.  However, if you look in the
> > > met-9.1/share/met/config/GridStatConfig_default file, you can
see
> > > that cat_thresh is still available for use in Grid-Stat:
> > >
> > > //
> > > > // Forecast and observation fields to be verified // fcst = {
> > > >    field = [
> > > >       {
> > > >         name       = "APCP";
> > > >         level      = [ "A03" ];
> > > >         *cat_thresh *= [ >0.0, >=5.0 ];
> > > >       }
> > > >    ];
> > > > }
> > >
> > >
> > > You mentioned that the v9.1 documentation doesn't have
cat_thresh
> > > described.  Please let us know if there is something in the
> > > METv8.1 documentation that is missing from the METv9.1
> > > documentation (specific quote, section, etc.) so that we can
> > > ensure it's inclusion in the METv9.1 documentation.
> > >
> > > Thanks!
> > >
> > > Julie
> > >
> > > On Tue, Oct 6, 2020 at 4:27 PM efren.serra.ctr at nrlmry.navy.mil
via
> > > RT < met_help at ucar.edu> wrote:
> > >
> > > >
> > > > Tue Oct 06 16:27:16 2020: Request 96983 was acted upon.
> > > > Transaction: Ticket created by efren.serra.ctr at nrlmry.navy.mil
> > > >        Queue: met_help
> > > >      Subject: cat_thresh v8.1 versus v9.1
> > > >        Owner: Nobody
> > > >   Requestors: efren.serra.ctr at nrlmry.navy.mil
> > > >       Status: new
> > > >  Ticket <URL:
> > > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983
> > > > >
> > > >
> > > >
> > > > Folks - I updated my MET tools from v8.1 to v9.1. I had the
> > > > following fcst entry for my GridStatConfig file(s) but I
noticed
> > > > that v9.1 documentation doesn't have cat_thresh. What should I
> > > > map cat_thresh to? There are vld_thresh and cov_thresh? Thank
you.
> > > >
> > > > fcst = {
> > > >
> > > >    valid_time = "20180711_12";
> > > >    field = [
> > > >       {
> > > >         file_type  = GRIB1;
> > > >         model      = "WW3NAVGEM";
> > > >         name       = "MFLX";
> > > >         level      = "Z0";
> > > >         prob       = TRUE;
> > > >         cat_thresh = [ ==0.1 ];
> > > >       }
> > > >    ];
> > > >
> > > > }
> > > >
> > > > Efren A. Serra (Contractor)
> > > > Physicist
> > > >
> > > > DeVine Consulting, Inc.
> > > > Naval Research Laboratory
> > > > Marine Meteorology Division
> > > > 7 Grace Hopper Ave., STOP 2
> > > > Monterey, CA 93943
> > > > Code 7542
> > > > Mobile: 408-425-5027
> > > >
> > > >
> > > >
> > >
> > > --
> > > Julie Prestopnik (she/her/hers)
> > > Software Engineer
> > > National Center for Atmospheric Research Research Applications
> > > Laboratory
> > > Email: jpresto at ucar.edu
> > >
> > > My working day may not be your working day.  Please do not feel
> > > obliged
> > to
> > > reply to this email outside of your normal working hours.
> > >
> > >
> > >
> >
> > --
> > Julie Prestopnik (she/her/hers)
> > Software Engineer
> > National Center for Atmospheric Research Research Applications
> > Laboratory
> > Email: jpresto at ucar.edu
> >
> > My working day may not be your working day.  Please do not feel
> > obliged to reply to this email outside of your normal working
hours.
> >
> >
>
>



------------------------------------------------
Subject: cat_thresh v8.1 versus v9.1
From: Sampson, Mr. Buck
Time: Thu Oct 08 13:49:54 2020

Thanks.  It is actually me that is complaining about the results.  I
just
don't understand them.
Can we run a very simple test?  Efren is going to send you that test.
I don't
understand the results, for example the OY and ON in the pjc file.  I
would
like to invite one of you to contribute to a manuscript if we get that
far.

R,

Buck Sampson, NRL Monterey

-----Original Message-----
From: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil>
Sent: Thursday, October 8, 2020 12:34 PM
To: met_help at ucar.edu
Cc: Sampson, Mr. Buck <Buck.Sampson at nrlmry.navy.mil>
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

Thank you, Eric.

-----Original Message-----
From: Eric Gilleland via RT <met_help at ucar.edu>
Sent: Thursday, October 8, 2020 8:07 AM
To: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil>
Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

Hi John and Efren (and whoever else gets this email via met_help),

John, you're absolutely correct about the choice of binning being
problematic: too smooth v. not smooth enough.  It's been a very long
time
since I've read anything about this issue as I have not really worked
with
probabilistic forecasts at all in my career, so I'll have to look for
some of
those papers and also see if anything new is out there but I will look
into
possible solutions.  I'll get back to you.

Best,

Eric

On Wed, Oct 7, 2020 at 2:58 PM John Halley Gotway via RT
<met_help at ucar.edu>
wrote:

> Eric,
>
> Efren Serra works at NRL and has been running the Grid-Stat tool in
> MET to evaluate the performance of forecast for the probability of
> significant wave heights > 12 ft.
>
> The attached graphic illustrates this case where there's a large
> difference in the forecast probability values. The NAVGEM model on
the
> left has probability values up to 60% while the OFCL forecast
includes
> probabilities up to 90%. Efren is looking comparing these models
using
> the Brier score computed over a 10x10 degree box.
>
> He was surprised not to see a larger difference in the resulting
> reliability diagrams.
>
> I explained how Grid-Stat evaulates probabilities using an Nx2
> probability contingency table. Therefore the number of bins will
> affect the results. In particular, when computing the Brier score,
the
> centerpoint of the bin is used instead of the raw probability values
that
> went into the bin.
>
> The 2 attached images beginning with "wp_102018_basins" show the
Brier
> score across multiple forecast lead times.
> And the 3 attached images beginning with "wp_102018_gt12ft" show
> Reliability diagrams for a few different lead times.
>
> I assume that jagged pattern in the reliability diagram is the
result
> of choosing too many bins, resulting in bins with a small number of
> points which leads to sporadic results.
>
> On the one hand, choosing many bins minimizes the effect of binning
in
> the computation of the Brier score. On the other hand, too many bins
> make for a reliability diagram that is not very smooth.
>
> Do you have any recommendations or advice for Efren in the
> verification of the probabilistic data or interpretation of results?
>
> Thanks,
> John
>
> ---------- Forwarded message ---------
> From: efren.serra.ctr at nrlmry.navy.mil via RT <met_help at ucar.edu>
> Date: Wed, Oct 7, 2020 at 2:18 PM
> Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
> To: <johnhg at ucar.edu>
>
>
>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
>
> John - These are plots of BASER_i or reliability diagrams using
> cat_thresh == 0.01 as you suggested. What can you able to conclude
> from these? The jigsaw pattern in WW3NAVGEM, for instance. I was
wondering
> about that.
> Thanks for your help. I'm going to try cat_thresh == 0.05 next.
>
> -----Original Message-----
> From: John Halley Gotway via RT <met_help at ucar.edu>
> Sent: Wednesday, October 7, 2020 9:01 AM
> To: Serra, Mr. Efren, Contractor, Code 7531 <
> efren.serra.ctr at nrlmry.navy.mil
> >
> Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
>
> Efren,
>
> I see that you're doing probabilistic verification of two models.
All
> of the probabilistic stats computed by MET are derived from an Nx2
> probabilistic contingency table. Your choice of the forecast
> categorical threshold defines the N probability bins. And your
choice
> of the observed categorical threshold defines the 2 yes/no bins for
the
> observation.
>
> It looks like you're processing the probability of significant wave
> height
> > 12 ft.
>
> So you'd set the observation cat_thresh = >12; since that's the
event
> for which probabilities are defined.
>
> And if you set the forecast cat_thresh = ==0.2, you'd be using 5
> probability bins:
>    0.0 to 0.2, 0.2 to 0.4, 0.4 to 0.6, 0.6 to 0.8, and 0.8 to 1.0 If
> you set the forecast cat_thresh = ==0.1, you'd be using 10
probability bins.
>
> MET computes the Brier Score from the Nx2 table... not the raw
> probability values.
> In computing the Brier Score, all points falling inside a bin are
> evaluated using the mid-point of that bin.
> Points in 0.0 to 0.2 are processed as 0.1.
> Points in 0.2 to 0.4 are processed as 0.3.
> Points in 0.4 to 0.6 are processed as 0.5.
> Points in 0.6 to 0.8 are processed as 0.7.
> Points in 0.8 to 1.0 are processed as 0.9.
>
> In a recent MET-Help question from NOAA/WPC, we discussed this
situation.
> The Brier Score values were unexpected because WPC's probability
> values were actually already binned. The process of re-binning the
> already binned probabilities introduced some unexpected diffs in the
> resulting Brier Score values. As a result, we wrote up this GitHub
issue:
>    https://github.com/dtcenter/MET/issues/1495
>
> Now I'm not sure if/how this applies to your data.  But to minimize
> the effect of binning, you could try using 100 probability bins by
> setting "cat_thresh = ==0.01;" I'd be curious to see how much impact
> the cat_thresh setting has on the results. And remember that the
> statistics reported by Grid-Stat are computed over some spatial
area,
> as indicated by VX_MASK. If that spatial area is large, it's likely
> that the relative similarity between the fields AWAY from the large
> event is averaging things out and making their performance look more
similar
> on average.
>
> So take a look at how you've defined the masking regions for Grid-
Stat.
>
> Thanks,
> John
>
> On Wed, Oct 7, 2020 at 9:38 AM Julie Prestopnik via RT
> <met_help at ucar.edu>
> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
> >
> > Hi Efren.
> >
> > I just wanted to follow up on your previous email.  You mentioned,
> > "I was wondering about the difference between specifying forecast
> > bins of
> > .2 versus .1 with cat_thresh = [==.1] or [==.2]".
> >
> > That information can be found on this page of the User's Guide:
> >
> >
https://dtcenter.github.io/MET/Users_Guide/data_io.html#configuratio
> > n-
> > file-details
> >
> > - Threshold:
> > >       - A threshold type (<, <=, ==, !-, >=, or >) followed by a
> > > numeric
> > value.
> > >       - The threshold type may also be specified using two
letter
> > abbreviations
> > >         (lt, le, eq, ne, ge, gt).
> > >       - Multiple thresholds may be combined by specifying the
> > > logic type
> > of AND
> > >         (&&) or *OR (||)*. For example, ">=5&&<=10" defines the
> > > numbers
> > between 5
> > >         and 10 and *"==1||==2" defines numbers exactly equal to
1
> > > or
> > > 2.*
> > >
> > >
> > If that doesn't help and doesn't explain why you are not seeing
the
> > differences you expect, please let me know and I will see if John
> > can better assist you.
> >
> > Thanks!
> >
> > Julie
> >
> > On Wed, Oct 7, 2020 at 9:32 AM efren.serra.ctr at nrlmry.navy.mil via
> > RT < met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
> > >
> > > Attached are plots I made from pjc, prc and pstd output. Here
I'm
> > > including the Brier score out of pstd data files. We are
comparing
> > > two different probability models versus the same ground truth
and
> > > we expect remarkable differences between the probabilistic
models,
> > > but we don't see that in these plots. I have worked with John
> > > before, so I was wondering
> > if
> > > you don't mind sharing this with him.
> > >
> > > -----Original Message-----
> > > From: Julie Prestopnik via RT <met_help at ucar.edu>
> > > Sent: Wednesday, October 7, 2020 7:41 AM
> > > To: Serra, Mr. Efren, Contractor, Code 7531 <
> > > efren.serra.ctr at nrlmry.navy.mil>
> > > Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus
v9.1
> > >
> > > Hi Efren.
> > >
> > > I see that you are under the impression that Grid-Stat for
METv9.1
> > > no longer includes cat_thresh.  However, if you look in the
> > > met-9.1/share/met/config/GridStatConfig_default file, you can
see
> > > that cat_thresh is still available for use in Grid-Stat:
> > >
> > > //
> > > > // Forecast and observation fields to be verified // fcst = {
> > > >    field = [
> > > >       {
> > > >         name       = "APCP";
> > > >         level      = [ "A03" ];
> > > >         *cat_thresh *= [ >0.0, >=5.0 ];
> > > >       }
> > > >    ];
> > > > }
> > >
> > >
> > > You mentioned that the v9.1 documentation doesn't have
cat_thresh
> > > described.  Please let us know if there is something in the
> > > METv8.1 documentation that is missing from the METv9.1
> > > documentation (specific quote, section, etc.) so that we can
> > > ensure it's inclusion in the METv9.1 documentation.
> > >
> > > Thanks!
> > >
> > > Julie
> > >
> > > On Tue, Oct 6, 2020 at 4:27 PM efren.serra.ctr at nrlmry.navy.mil
via
> > > RT < met_help at ucar.edu> wrote:
> > >
> > > >
> > > > Tue Oct 06 16:27:16 2020: Request 96983 was acted upon.
> > > > Transaction: Ticket created by efren.serra.ctr at nrlmry.navy.mil
> > > >        Queue: met_help
> > > >      Subject: cat_thresh v8.1 versus v9.1
> > > >        Owner: Nobody
> > > >   Requestors: efren.serra.ctr at nrlmry.navy.mil
> > > >       Status: new
> > > >  Ticket <URL:
> > > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983
> > > > >
> > > >
> > > >
> > > > Folks - I updated my MET tools from v8.1 to v9.1. I had the
> > > > following fcst entry for my GridStatConfig file(s) but I
noticed
> > > > that v9.1 documentation doesn't have cat_thresh. What should I
> > > > map cat_thresh to? There are vld_thresh and cov_thresh? Thank
you.
> > > >
> > > > fcst = {
> > > >
> > > >    valid_time = "20180711_12";
> > > >    field = [
> > > >       {
> > > >         file_type  = GRIB1;
> > > >         model      = "WW3NAVGEM";
> > > >         name       = "MFLX";
> > > >         level      = "Z0";
> > > >         prob       = TRUE;
> > > >         cat_thresh = [ ==0.1 ];
> > > >       }
> > > >    ];
> > > >
> > > > }
> > > >
> > > > Efren A. Serra (Contractor)
> > > > Physicist
> > > >
> > > > DeVine Consulting, Inc.
> > > > Naval Research Laboratory
> > > > Marine Meteorology Division
> > > > 7 Grace Hopper Ave., STOP 2
> > > > Monterey, CA 93943
> > > > Code 7542
> > > > Mobile: 408-425-5027
> > > >
> > > >
> > > >
> > >
> > > --
> > > Julie Prestopnik (she/her/hers)
> > > Software Engineer
> > > National Center for Atmospheric Research Research Applications
> > > Laboratory
> > > Email: jpresto at ucar.edu
> > >
> > > My working day may not be your working day.  Please do not feel
> > > obliged
> > to
> > > reply to this email outside of your normal working hours.
> > >
> > >
> > >
> >
> > --
> > Julie Prestopnik (she/her/hers)
> > Software Engineer
> > National Center for Atmospheric Research Research Applications
> > Laboratory
> > Email: jpresto at ucar.edu
> >
> > My working day may not be your working day.  Please do not feel
> > obliged to reply to this email outside of your normal working
hours.
> >
> >
>
>


------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
From: efren.serra.ctr at nrlmry.navy.mil
Time: Thu Oct 08 19:33:40 2020

John - I have placed all of the data concerning this case (valid time =
20180709_00, forecast time = 20180705_00, tau = 96) at the appropriate
place on your ftp server:

ftp> ls
227 Entering Passive Mode (128,117,14,132,198,113).
150 Opening ASCII mode data connection for file list
-rw-r--r--   1 ftp      ftp          3281 Oct  9 01:26
GridStatConfig_wp102018_gt12ft_WW3NAVGEM-
WW3TCOFCL.2018070900_2018070500-096
-rw-r--r--   1 ftp      ftp          3281 Oct  9 01:27
GridStatConfig_wp102018_gt12ft_WW3TCOFCL-
WW3TCOFCL.2018070900_2018070500-096
-rw-r--r--   1 ftp      ftp          3280 Oct  9 01:26
GridStatConfig_wp102018_gt18ft_WW3NAVGEM-
WW3TCOFCL.2018070900_2018070500-096
-rw-r--r--   1 ftp      ftp          3280 Oct  9 01:28
GridStatConfig_wp102018_gt18ft_WW3TCOFCL-
WW3TCOFCL.2018070900_2018070500-096
-rw-r--r--   1 ftp      ftp         57098 Oct  9 01:24 US058GOCN-
GR1mdl.0011_0255_09600U0RL2018070500_0001_000000-
000000prob_sig_wav_ht_gt12ft
-rw-r--r--   1 ftp      ftp         39172 Oct  9 01:25 US058GOCN-
GR1mdl.0050_0240_09600U0RL2018070500_0001_000000-
000000prob_sig_wav_ht_gt12ft
-rw-r--r--   1 ftp      ftp       1410480 Oct  9 01:24 US058GOCN-
GR1mdl.0095_0200_00000F0RL2018070900_0001_000000-000000sig_wav_ht
-rw-r--r--   1 ftp      ftp        270598 Oct  9 01:32 wp102018-
2018070900.nc
226 Transfer complete
ftp> pwd
257 "/incoming/irap/met_help/serra_data" is the current directory

-----Original Message-----
From: Sampson, Mr. Buck via RT <met_help at ucar.edu>
Sent: Thursday, October 8, 2020 12:50 PM
To: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil>
Cc: ericg at ucar.edu
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

Thanks.  It is actually me that is complaining about the results.  I
just don't understand them.
Can we run a very simple test?  Efren is going to send you that test.
I don't understand the results, for example the OY and ON in the pjc
file.  I would like to invite one of you to contribute to a manuscript
if we get that far.

R,

Buck Sampson, NRL Monterey

-----Original Message-----
From: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil>
Sent: Thursday, October 8, 2020 12:34 PM
To: met_help at ucar.edu
Cc: Sampson, Mr. Buck <Buck.Sampson at nrlmry.navy.mil>
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

Thank you, Eric.

-----Original Message-----
From: Eric Gilleland via RT <met_help at ucar.edu>
Sent: Thursday, October 8, 2020 8:07 AM
To: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil>
Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

Hi John and Efren (and whoever else gets this email via met_help),

John, you're absolutely correct about the choice of binning being
problematic: too smooth v. not smooth enough.  It's been a very long
time
since I've read anything about this issue as I have not really worked
with
probabilistic forecasts at all in my career, so I'll have to look for
some of
those papers and also see if anything new is out there but I will look
into
possible solutions.  I'll get back to you.

Best,

Eric

On Wed, Oct 7, 2020 at 2:58 PM John Halley Gotway via RT
<met_help at ucar.edu>
wrote:

> Eric,
>
> Efren Serra works at NRL and has been running the Grid-Stat tool in
> MET to evaluate the performance of forecasts of the probability of
> significant wave heights > 12 ft.
>
> The attached graphic illustrates this case where there's a large
> difference in the forecast probability values. The NAVGEM model on
the
> left has probability values up to 60% while the OFCL forecast
includes
> probabilities up to 90%. Efren is comparing these models
using
> the Brier score computed over a 10x10 degree box.
>
> He was surprised not to see a larger difference in the resulting
> reliability diagrams.
>
> I explained how Grid-Stat evaluates probabilities using an Nx2
> probability contingency table. Therefore the number of bins will
> affect the results. In particular, when computing the Brier score,
the
> centerpoint of the bin is used instead of the raw probability values
that
> went into the bin.
>
> The 2 attached images beginning with "wp_102018_basins" show the
Brier
> score across multiple forecast lead times.
> And the 3 attached images beginning with "wp_102018_gt12ft" show
> Reliability diagrams for a few different lead times.
>
> I assume that the jagged pattern in the reliability diagram is the
result
> of choosing too many bins, resulting in bins with a small number of
> points which leads to sporadic results.
>
> On the one hand, choosing many bins minimizes the effect of binning
in
> the computation of the Brier score. On the other hand, too many bins
> make for a reliability diagram that is not very smooth.
>
> Do you have any recommendations or advice for Efren in the
> verification of the probabilistic data or interpretation of results?
>
> Thanks,
> John
>
> ---------- Forwarded message ---------
> From: efren.serra.ctr at nrlmry.navy.mil via RT <met_help at ucar.edu>
> Date: Wed, Oct 7, 2020 at 2:18 PM
> Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
> To: <johnhg at ucar.edu>
>
>
>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
>
> John - These are plots of BASER_i or reliability diagrams using
> cat_thresh == 0.01 as you suggested. What are you able to conclude
> from these? The jigsaw pattern in WW3NAVGEM, for instance. I was
wondering
> about that.
> Thanks for your help. I'm going to try cat_thresh == 0.05 next.
>
> -----Original Message-----
> From: John Halley Gotway via RT <met_help at ucar.edu>
> Sent: Wednesday, October 7, 2020 9:01 AM
> To: Serra, Mr. Efren, Contractor, Code 7531 <
> efren.serra.ctr at nrlmry.navy.mil
> >
> Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
>
> Efren,
>
> I see that you're doing probabilistic verification of two models.
All
> of the probabilistic stats computed by MET are derived from an Nx2
> probabilistic contingency table. Your choice of the forecast
> categorical threshold defines the N probability bins. And your
choice
> of the observed categorical threshold defines the 2 yes/no bins for
the
> observation.
>
> It looks like you're processing the probability of significant wave
> height
> > 12 ft.
>
> So you'd set the observation cat_thresh = >12; since that's the
event
> for which probabilities are defined.
>
> And if you set the forecast cat_thresh = ==0.2, you'd be using 5
> probability bins:
>    0.0 to 0.2, 0.2 to 0.4, 0.4 to 0.6, 0.6 to 0.8, and 0.8 to 1.0 If
> you set the forecast cat_thresh = ==0.1, you'd be using 10
probability bins.
>
> MET computes the Brier Score from the Nx2 table... not the raw
> probability values.
> In computing the Brier Score, all points falling inside a bin are
> evaluated using the mid-point of that bin.
> Points in 0.0 to 0.2 are processed as 0.1.
> Points in 0.2 to 0.4 are processed as 0.3.
> Points in 0.4 to 0.6 are processed as 0.5.
> Points in 0.6 to 0.8 are processed as 0.7.
> Points in 0.8 to 1.0 are processed as 0.9.
>
> In a recent MET-Help question from NOAA/WPC, we discussed this
situation.
> The Brier Score values were unexpected because WPC's probability
> values were actually already binned. The process of re-binning the
> already binned probabilities introduced some unexpected diffs in the
> resulting Brier Score values. As a result, we wrote up this GitHub
issue:
>    https://github.com/dtcenter/MET/issues/1495
>
> Now I'm not sure if/how this applies to your data.  But to minimize
> the effect of binning, you could try using 100 probability bins by
> setting "cat_thresh = ==0.01;" I'd be curious to see how much impact
> the cat_thresh setting has on the results. And remember that the
> statistics reported by Grid-Stat are computed over some spatial
area,
> as indicated by VX_MASK. If that spatial area is large, it's likely
> that the relative similarity between the fields AWAY from the large
> event is averaging things out and making their performance look more
similar
> on average.
>
> So take a look at how you've defined the masking regions for Grid-
Stat.
>
> Thanks,
> John
>
> On Wed, Oct 7, 2020 at 9:38 AM Julie Prestopnik via RT
> <met_help at ucar.edu>
> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
> >
> > Hi Efren.
> >
> > I just wanted to follow up on your previous email.  You mentioned,
> > "I was wondering about the difference between specifying forecast
> > bins of
> > .2 versus .1 with cat_thresh = [==.1] or [==.2]".
> >
> > That information can be found on this page of the User's Guide:
> >
> >
https://dtcenter.github.io/MET/Users_Guide/data_io.html#configuratio
> > n-
> > file-details
> >
> > - Threshold:
> > >       - A threshold type (<, <=, ==, !=, >=, or >) followed by a
> > > numeric
> > value.
> > >       - The threshold type may also be specified using two
letter
> > abbreviations
> > >         (lt, le, eq, ne, ge, gt).
> > >       - Multiple thresholds may be combined by specifying the
> > > logic type
> > of AND
> > >         (&&) or *OR (||)*. For example, ">=5&&<=10" defines the
> > > numbers
> > between 5
> > >         and 10 and *"==1||==2" defines numbers exactly equal to
1
> > > or
> > > 2.*
> > >
> > >
> > If that doesn't help and doesn't explain why you are not seeing
the
> > differences you expect, please let me know and I will see if John
> > can better assist you.
> >
> > Thanks!
> >
> > Julie
> >
> > On Wed, Oct 7, 2020 at 9:32 AM efren.serra.ctr at nrlmry.navy.mil via
> > RT < met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
> > >
> > > Attached are plots I made from pjc, prc and pstd output. Here
I'm
> > > including the Brier score out of pstd data files. We are
comparing
> > > two different probability models versus the same ground truth
and
> > > we expect remarkable differences between the probabilistic
models,
> > > but we don't see that in these plots. I have worked with John
> > > before, so I was wondering
> > if
> > > you don't mind sharing this with him.
> > >
> > > -----Original Message-----
> > > From: Julie Prestopnik via RT <met_help at ucar.edu>
> > > Sent: Wednesday, October 7, 2020 7:41 AM
> > > To: Serra, Mr. Efren, Contractor, Code 7531 <
> > > efren.serra.ctr at nrlmry.navy.mil>
> > > Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus
v9.1
> > >
> > > Hi Efren.
> > >
> > > I see that you are under the impression that Grid-Stat for
METv9.1
> > > no longer includes cat_thresh.  However, if you look in the
> > > met-9.1/share/met/config/GridStatConfig_default file, you can
see
> > > that cat_thresh is still available for use in Grid-Stat:
> > >
> > > //
> > > > // Forecast and observation fields to be verified // fcst = {
> > > >    field = [
> > > >       {
> > > >         name       = "APCP";
> > > >         level      = [ "A03" ];
> > > >         *cat_thresh *= [ >0.0, >=5.0 ];
> > > >       }
> > > >    ];
> > > > }
> > >
> > >
> > > You mentioned that the v9.1 documentation doesn't have
cat_thresh
> > > described.  Please let us know if there is something in the
> > > METv8.1 documentation that is missing from the METv9.1
> > > documentation (specific quote, section, etc.) so that we can
> > > ensure its inclusion in the METv9.1 documentation.
> > >
> > > Thanks!
> > >
> > > Julie
> > >
> > > On Tue, Oct 6, 2020 at 4:27 PM efren.serra.ctr at nrlmry.navy.mil
via
> > > RT < met_help at ucar.edu> wrote:
> > >
> > > >
> > > > Tue Oct 06 16:27:16 2020: Request 96983 was acted upon.
> > > > Transaction: Ticket created by efren.serra.ctr at nrlmry.navy.mil
> > > >        Queue: met_help
> > > >      Subject: cat_thresh v8.1 versus v9.1
> > > >        Owner: Nobody
> > > >   Requestors: efren.serra.ctr at nrlmry.navy.mil
> > > >       Status: new
> > > >  Ticket <URL:
> > > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983
> > > > >
> > > >
> > > >
> > > > Folks - I updated my MET tools from v8.1 to v9.1. I had the
> > > > following fcst entry for my GridStatConfig file(s) but I
noticed
> > > > that v9.1 documentation doesn't have cat_thresh. What should I
> > > > map cat_thresh to? There are vld_thresh and cov_thresh? Thank
you.
> > > >
> > > > fcst = {
> > > >
> > > >    valid_time = "20180711_12";
> > > >    field = [
> > > >       {
> > > >         file_type  = GRIB1;
> > > >         model      = "WW3NAVGEM";
> > > >         name       = "MFLX";
> > > >         level      = "Z0";
> > > >         prob       = TRUE;
> > > >         cat_thresh = [ ==0.1 ];
> > > >       }
> > > >    ];
> > > >
> > > > }
> > > >
> > > > Efren A. Serra (Contractor)
> > > > Physicist
> > > >
> > > > DeVine Consulting, Inc.
> > > > Naval Research Laboratory
> > > > Marine Meteorology Division
> > > > 7 Grace Hopper Ave., STOP 2
> > > > Monterey, CA 93943
> > > > Code 7542
> > > > Mobile: 408-425-5027
> > > >
> > > >
> > > >
> > >
> > > --
> > > Julie Prestopnik (she/her/hers)
> > > Software Engineer
> > > National Center for Atmospheric Research Research Applications
> > > Laboratory
> > > Email: jpresto at ucar.edu
> > >
> > > My working day may not be your working day.  Please do not feel
> > > obliged
> > to
> > > reply to this email outside of your normal working hours.
> > >
> > >
> > >
> >
> > --
> > Julie Prestopnik (she/her/hers)
> > Software Engineer
> > National Center for Atmospheric Research Research Applications
> > Laboratory
> > Email: jpresto at ucar.edu
> >
> > My working day may not be your working day.  Please do not feel
> > obliged to reply to this email outside of your normal working
hours.
> >
> >
>
>




------------------------------------------------
Subject: cat_thresh v8.1 versus v9.1
From: John Halley Gotway
Time: Fri Oct 09 10:17:03 2020

Efren and Buck,

Thanks for sending the sample data files. I've attached a tarfile with
a
run_met.sh script and modified config file.

This runs grid_stat twice and then the plot_data_plane tool 4 times to
visualize the NetCDF matched pairs output file. I just wanted to make
sure
the forecast and observation values looked reasonable.

./run_met.sh >& run_met.log

Since we're applying the 10x10 degree box to 1 degree resolution data,
that
gives us 100 matched pairs. That's what this log message tells you:

DEBUG 2: Processing MFLX/Z0 versus HTSGW/Z0, for smoothing method
NEAREST(1), over region box_mask, using 100 matched pairs.

You had been verifying the forecast of the probability of significant
wave
height > 12 feet twice... once by thresholding the observed wave
height >
12 (which makes sense) and once using observed wave height > 0 (which
does
not make sense). So in the config file, I set:

obs = {
   field = [
      {
...
        cat_thresh = [ >=12.0 ];
...

So now we have 2 output PCT lines of interest... one for each of the
models
being verified. Here's a description of the PCT output line type:
https://dtcenter.github.io/MET/Users_Guide/point-stat.html#id14

Since we chose 10 probability bins, that gives a 10x2 probabilistic
contingency table. I just inserted some newlines to turn a PCT line
into a
table. Each line is the left edge of the probability bin, followed by
the count of points whose probability value falls in that bin and for
which the observed event (e.g. >12) did occur, followed by the count
for which the observed event did not occur.

For NAVGEM...

Prob_Bin  OY  ON
     0.0   0   0
     0.1   0   0
     0.2   1   4
     0.3   0   2
     0.4   3   7
     0.5  15   5
     0.6   5   0
     0.7  31   0
     0.8  14   0
     0.9  13   0

For TCOFCL...

Prob_Bin  OY  ON
     0.0   0   0
     0.1   0   0
     0.2   0   2
     0.3   0   3
     0.4   4   8
     0.5   7   5
     0.6   7   0
     0.7  21   0
     0.8  29   0
     0.9  14   0

These counts add up to 100 because there are 100 points in the region.
Out
of all these points the observed event DID NOT occur only 18 times.

Looking in the PSTD line type at the Brier Score column, I see scores
of
0.1083 and 0.0911 for NAVGEM and TCOFCL, respectively. Here's a
description
of the Brier Score:
https://dtcenter.github.io/MET/Users_Guide/appendixC.html#brier-score

Lower is better, so from the perspective of Brier Score, the TCOFCL
forecast matches the observations slightly better. Of course, I could
have
totally reversed the names on the models when I ran Grid-Stat!
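
As a sanity check, those two Brier Score values can be reproduced
directly from the PCT counts above with a few lines of Python, using
the mid-point of each probability bin. This is just an illustration of
the binned calculation described earlier, not MET source code:

# Recompute the Brier Score from the binned (OY, ON) counts above.
# Each bin is represented by its mid-point: 0.05, 0.15, ..., 0.95.
def brier_from_pct(counts, bin_width=0.1):
    total = sum(oy + on for oy, on in counts)
    bs = 0.0
    for i, (oy, on) in enumerate(counts):
        p_mid = (i + 0.5) * bin_width    # mid-point of bin i
        bs += oy * (p_mid - 1.0) ** 2    # observed event occurred
        bs += on * (p_mid - 0.0) ** 2    # observed event did not occur
    return bs / total

navgem = [(0, 0), (0, 0), (1, 4), (0, 2), (3, 7),
          (15, 5), (5, 0), (31, 0), (14, 0), (13, 0)]
tcofcl = [(0, 0), (0, 0), (0, 2), (0, 3), (4, 8),
          (7, 5), (7, 0), (21, 0), (29, 0), (14, 0)]

print(round(brier_from_pct(navgem), 4))   # 0.1083
print(round(brier_from_pct(tcofcl), 4))   # 0.0911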

Hope this helps.

Thanks,
John

------------------------------------------------
Subject: cat_thresh v8.1 versus v9.1
From: efren.serra.ctr at nrlmry.navy.mil
Time: Fri Oct 09 10:25:08 2020

John - Buck's question: since NAVGEM only has probabilities up to 60%,
why don't the OY_TP values drop off to zero above the 0.6 probability
bin? Attached are plots of OY_TP/ON_TP for cat_thresh == .1. Thanks for
the response.

-----Original Message-----
From: John Halley Gotway via RT <met_help at ucar.edu>
Sent: Friday, October 9, 2020 9:17 AM
To: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil>
Cc: ericg at ucar.edu
Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

Efren and Buck,

Thanks for sending the sample data files. I've attached a tarfile with
a run_met.sh script and modified config file.

This runs grid_stat twice and then the plot_data_plane tool 4 times to
visualize the NetCDF matched pairs output file. I just wanted to make
sure the forecast and observation values looked reasonable.

./run_met.sh >& run_met.log

Since we're applying the 10x10 degree box to 1 degree resolution data,
that gives us 100 matched pairs. That's what this log message tells
you:

DEBUG 2: Processing MFLX/Z0 versus HTSGW/Z0, for smoothing method
NEAREST(1), over region box_mask, using 100 matched pairs.

You had been verifying the forecast of the probability of significant
wave height > 12 feet twice... once by thresholding the observed wave
height >
12 (which makes sense) and once using observed wave height > 0 (which
does not make sense). So in the confg file, I set:

obs = {
   field = [
      {
...
        cat_thresh = [ >=12.0 ];
...

So now we have 2 output PCT lines of interest... one for each of the
models being verified. Here's a description of the PCT output line
type:
https://dtcenter.github.io/MET/Users_Guide/point-stat.html#id14

Since we chose 10 probability bins, that gives a 10x2 probabilistic
contingency table. I just inserted some newlines to turn a PCT line
into a table. Each line is the left point of the probability bin,
followed by the count of points whose probability value falls in that
bin and the observed event (e.g. >12) did occur, followed by the count
where the observed event did not occur.

For NAVGEM...

Prob_Bin OY ON
      0    0    0
      0.1    0    0
      0.2    1    4
      0.3    0    2
      0.4    3    7
      0.5   15    5
      0.6    5    0
      0.7   31    0
      0.8   14    0
      0.9    13     0

For TCOFCL...

Prob_Bin OY ON
       0    0    0
      0.1    0    0
      0.2    0    2
      0.3    0    3
      0.4    4    8
      0.5    7    5
      0.6    7    0
      0.7   21    0
      0.8   29    0
       0.9    14     0

These counts add up to 100 because there are 100 points in the region.
Out of all these points the observed event DID NOT occur only 18
times.

Looking in the PSTD line type at the Brier Score column, I see scores
of
0.1083 and 0.0911 for NAVGEM and TCOFCL, respectively. Here's a
description of the Brier Score:
https://dtcenter.github.io/MET/Users_Guide/appendixC.html#brier-score

Lower is better, so from the perspective of Brier Score, the TCOFCL
forecast matches the observations slightly better. Of course, I could
have totally reversed the names on the models when I ran Grid-Stat!

Hope this helps.

Thanks,
John


------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
From: efren.serra.ctr at nrlmry.navy.mil
Time: Fri Oct 09 10:29:19 2020

John - The double thresholding should help. I'm going to try that
right away for this case. You can ignore my last question.

-----Original Message-----
From: John Halley Gotway via RT <met_help at ucar.edu>
Sent: Friday, October 9, 2020 9:17 AM
To: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil>
Cc: ericg at ucar.edu
Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

Efren and Buck,

Thanks for sending the sample data files. I've attached a tarfile with
a run_met.sh script and modified config file.

This runs grid_stat twice and then the plot_data_plane tool 4 times to
visualize the NetCDF matched pairs output file. I just wanted to make
sure the forecast and observation values looked reasonable.

./run_met.sh >& run_met.log

Since we're applying the 10x10 degree box to 1 degree resolution data,
that gives us 100 matched pairs. That's what this log message tells
you:

DEBUG 2: Processing MFLX/Z0 versus HTSGW/Z0, for smoothing method
NEAREST(1), over region box_mask, using 100 matched pairs.

You had been verifying the forecast of the probability of significant
wave height > 12 feet twice... once by thresholding the observed wave
height >
12 (which makes sense) and once using observed wave height > 0 (which
does not make sense). So in the confg file, I set:

obs = {
   field = [
      {
...
        cat_thresh = [ >=12.0 ];
...

So now we have 2 output PCT lines of interest... one for each of the
models being verified. Here's a description of the PCT output line
type:
https://dtcenter.github.io/MET/Users_Guide/point-stat.html#id14

Since we chose 10 probability bins, that gives a 10x2 probabilistic
contingency table. I just inserted some newlines to turn a PCT line
into a table. Each line is the left point of the probability bin,
followed by the count of points whose probability value falls in that
bin and the observed event (e.g. >12) did occur, followed by the count
where the observed event did not occur.

For NAVGEM...

Prob_Bin OY ON
      0    0    0
      0.1    0    0
      0.2    1    4
      0.3    0    2
      0.4    3    7
      0.5   15    5
      0.6    5    0
      0.7   31    0
      0.8   14    0
      0.9    13     0

For TCOFCL...

Prob_Bin OY ON
       0    0    0
      0.1    0    0
      0.2    0    2
      0.3    0    3
      0.4    4    8
      0.5    7    5
      0.6    7    0
      0.7   21    0
      0.8   29    0
       0.9    14     0

These counts add up to 100 because there are 100 points in the region.
Out of all these points the observed event DID NOT occur only 18
times.

Looking in the PSTD line type at the Brier Score column, I see scores
of
0.1083 and 0.0911 for NAVGEM and TCOFCL, respectively. Here's a
description of the Brier Score:
https://dtcenter.github.io/MET/Users_Guide/appendixC.html#brier-score

Lower is better, so from the perspective of Brier Score, the TCOFCL
forecast matches the observations slightly better. Of course, I could
have totally reversed the names on the models when I ran Grid-Stat!

Hope this helps.

Thanks,
John



------------------------------------------------
Subject: cat_thresh v8.1 versus v9.1
From: Sampson, Mr. Buck
Time: Fri Oct 09 10:32:02 2020

John,

That is very clear.  I'm unsure why we have positive OY for some of
our
WW3NAVGEM at those higher thresholds, but that could easily be a
result of
something we did prior to giving you these files.  You've confirmed
what I
thought was in the MET files, and your results do appear more
reasonable than
what I saw before (ww3navgem having higher OY in the highest
thresholds).
Hmmmm ...

Buck


-----Original Message-----
From: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil>
Sent: Friday, October 9, 2020 9:25 AM
To: met_help at ucar.edu
Cc: ericg at ucar.edu; Sampson, Mr. Buck <Buck.Sampson at nrlmry.navy.mil>
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

John - Buck's question is noticing that NAVGEM only has probabilities
up to
60%, how come the OY_TP values don't drop off to zero after .6
probability
bins. Attached are plots of OY_TP/ON_TP for cat_thresh == .1. Thanks
for the
response.

-----Original Message-----
From: John Halley Gotway via RT <met_help at ucar.edu>
Sent: Friday, October 9, 2020 9:17 AM
To: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil>
Cc: ericg at ucar.edu
Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

Efren and Buck,

Thanks for sending the sample data files. I've attached a tarfile with
a
run_met.sh script and modified config file.

This runs grid_stat twice and then the plot_data_plane tool 4 times to
visualize the NetCDF matched pairs output file. I just wanted to make
sure the
forecast and observation values looked reasonable.

./run_met.sh >& run_met.log

Since we're applying the 10x10 degree box to 1 degree resolution data,
that
gives us 100 matched pairs. That's what this log message tells you:

DEBUG 2: Processing MFLX/Z0 versus HTSGW/Z0, for smoothing method
NEAREST(1),
over region box_mask, using 100 matched pairs.

You had been verifying the forecast of the probability of significant
wave
height > 12 feet twice... once by thresholding the observed wave
height >
12 (which makes sense) and once using observed wave height > 0 (which
does not
make sense). So in the confg file, I set:

obs = {
   field = [
      {
...
        cat_thresh = [ >=12.0 ];
...

So now we have 2 output PCT lines of interest... one for each of the
models
being verified. Here's a description of the PCT output line type:
https://dtcenter.github.io/MET/Users_Guide/point-stat.html#id14

Since we chose 10 probability bins, that gives a 10x2 probabilistic
contingency table. I just inserted some newlines to turn a PCT line
into a
table. Each line is the left point of the probability bin, followed by
the
count of points whose probability value falls in that bin and the
observed
event (e.g. >12) did occur, followed by the count where the observed
event did
not occur.

For NAVGEM...

Prob_Bin OY ON
      0    0    0
      0.1    0    0
      0.2    1    4
      0.3    0    2
      0.4    3    7
      0.5   15    5
      0.6    5    0
      0.7   31    0
      0.8   14    0
      0.9    13     0

For TCOFCL...

Prob_Bin OY ON
       0    0    0
      0.1    0    0
      0.2    0    2
      0.3    0    3
      0.4    4    8
      0.5    7    5
      0.6    7    0
      0.7   21    0
      0.8   29    0
       0.9    14     0

These counts add up to 100 because there are 100 points in the region.
Out of
all these points the observed event DID NOT occur only 18 times.

Looking in the PSTD line type at the Brier Score column, I see scores
of
0.1083 and 0.0911 for NAVGEM and TCOFCL, respectively. Here's a
description of
the Brier Score:
https://dtcenter.github.io/MET/Users_Guide/appendixC.html#brier-score

Lower is better, so from the perspective of Brier Score, the TCOFCL
forecast
matches the observations slightly better. Of course, I could have
totally
reversed the names on the models when I ran Grid-Stat!

Hope this helps.

Thanks,
John


------------------------------------------------
Subject: cat_thresh v8.1 versus v9.1
From: John Halley Gotway
Time: Fri Oct 09 10:33:14 2020

Efren,

That's why I ran the plot_data_plane tool on the NetCDF matched pair
output
from Grid-Stat. The range of values present in the data shows min/max
values of .203 to .906 in one forecast field and .25 to .95 in the
other.

So while your picture shows maximum probability values of 0.6, that
picture does not match the values in this file that we're passing to
Grid-Stat. But I have no idea where the disconnect is.

John

On Fri, Oct 9, 2020 at 10:25 AM efren.serra.ctr at nrlmry.navy.mil via RT
<
met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
>
> John - Buck's question is noticing that NAVGEM only has
probabilities up
> to 60%, how come the OY_TP values don't drop off to zero after .6
> probability bins. Attached are plots of OY_TP/ON_TP for cat_thresh
== .1.
> Thanks for the response.
>
> -----Original Message-----
> From: John Halley Gotway via RT <met_help at ucar.edu>
> Sent: Friday, October 9, 2020 9:17 AM
> To: Serra, Mr. Efren, Contractor, Code 7531 <
> efren.serra.ctr at nrlmry.navy.mil>
> Cc: ericg at ucar.edu
> Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
>
> Efren and Buck,
>
> Thanks for sending the sample data files. I've attached a tarfile
with a
> run_met.sh script and modified config file.
>
> This runs grid_stat twice and then the plot_data_plane tool 4 times
to
> visualize the NetCDF matched pairs output file. I just wanted to
make sure
> the forecast and observation values looked reasonable.
>
> ./run_met.sh >& run_met.log
>
> Since we're applying the 10x10 degree box to 1 degree resolution
data,
> that gives us 100 matched pairs. That's what this log message tells
you:
>
> DEBUG 2: Processing MFLX/Z0 versus HTSGW/Z0, for smoothing method
> NEAREST(1), over region box_mask, using 100 matched pairs.
>
> You had been verifying the forecast of the probability of
significant wave
> height > 12 feet twice... once by thresholding the observed wave
height >
> 12 (which makes sense) and once using observed wave height > 0
(which does
> not make sense). So in the confg file, I set:
>
> obs = {
>    field = [
>       {
> ...
>         cat_thresh = [ >=12.0 ];
> ...
>
> So now we have 2 output PCT lines of interest... one for each of the
> models being verified. Here's a description of the PCT output line
type:
> https://dtcenter.github.io/MET/Users_Guide/point-stat.html#id14
>
> Since we chose 10 probability bins, that gives a 10x2 probabilistic
> contingency table. I just inserted some newlines to turn a PCT line
into a
> table. Each line is the left point of the probability bin, followed
by the
> count of points whose probability value falls in that bin and the
observed
> event (e.g. >12) did occur, followed by the count where the observed
event
> did not occur.
>
> For NAVGEM...
>
> Prob_Bin OY ON
>       0    0    0
>       0.1    0    0
>       0.2    1    4
>       0.3    0    2
>       0.4    3    7
>       0.5   15    5
>       0.6    5    0
>       0.7   31    0
>       0.8   14    0
>       0.9    13     0
>
> For TCOFCL...
>
> Prob_Bin OY ON
>        0    0    0
>       0.1    0    0
>       0.2    0    2
>       0.3    0    3
>       0.4    4    8
>       0.5    7    5
>       0.6    7    0
>       0.7   21    0
>       0.8   29    0
>        0.9    14     0
>
> These counts add up to 100 because there are 100 points in the
region. Out
> of all these points the observed event DID NOT occur only 18 times.
>
> Looking in the PSTD line type at the Brier Score column, I see
scores of
> 0.1083 and 0.0911 for NAVGEM and TCOFCL, respectively. Here's a
> description of the Brier Score:
> https://dtcenter.github.io/MET/Users_Guide/appendixC.html#brier-
score
>
> Lower is better, so from the perspective of Brier Score, the TCOFCL
> forecast matches the observations slightly better. Of course, I
could have
> totally reversed the names on the models when I ran Grid-Stat!
>
> Hope this helps.
>
> Thanks,
> John
>
>
>

------------------------------------------------
Subject: cat_thresh v8.1 versus v9.1
From: efren.serra.ctr at nrlmry.navy.mil
Time: Fri Oct 09 11:12:58 2020

Buck - the attached images, generated using MET plot_data_plane, show
that the WW3NAVGEM gt12ft probability field has values above .6.

These plots were generated from the following commands:

1] plot_data_plane
US058GOCN-GR1mdl.0011_0255_09600U0RL2018070500_0001_000000-
000000prob_sig_wav_ht_gt12ft
ww3tcofcl_prob_sig_wav_ht_gt12ft.ps name='"MFLX";level="Z0";' -v 2
2] plot_data_plane
US058GOCN-GR1mdl.0050_0240_09600U0RL2018070500_0001_000000-
000000prob_sig_wav_ht_gt12ft
ww3navgem_prob_sig_wav_ht_gt12ft.ps name='"MFLX";level="Z0";' -v 2

The *ps files were converted to PDF.

-----Original Message-----
From: Sampson, Mr. Buck <Buck.Sampson at nrlmry.navy.mil>
Sent: Friday, October 9, 2020 9:32 AM
To: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil>;
met_help at ucar.edu
Cc: ericg at ucar.edu
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

John,

That is very clear.  I'm unsure why we have positive OY for some of
our
WW3NAVGEM at those higher thresholds, but that could easily be a
result of
something we did prior to giving you these files.  You've confirmed
what I
thought was in the MET files, and your results do appear more
reasonable than
what I saw before (ww3navgem having higher OY in the highest
thresholds).
Hmmmm ...

Buck


-----Original Message-----
From: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil>
Sent: Friday, October 9, 2020 9:25 AM
To: met_help at ucar.edu
Cc: ericg at ucar.edu; Sampson, Mr. Buck <Buck.Sampson at nrlmry.navy.mil>
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

John - Buck's question is noticing that NAVGEM only has probabilities
up to
60%, how come the OY_TP values don't drop off to zero after .6
probability
bins. Attached are plots of OY_TP/ON_TP for cat_thresh == .1. Thanks
for the
response.

-----Original Message-----
From: John Halley Gotway via RT <met_help at ucar.edu>
Sent: Friday, October 9, 2020 9:17 AM
To: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil>
Cc: ericg at ucar.edu
Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

Efren and Buck,

Thanks for sending the sample data files. I've attached a tarfile with
a
run_met.sh script and modified config file.

This runs grid_stat twice and then the plot_data_plane tool 4 times to
visualize the NetCDF matched pairs output file. I just wanted to make
sure the
forecast and observation values looked reasonable.

./run_met.sh >& run_met.log

Since we're applying the 10x10 degree box to 1 degree resolution data,
that
gives us 100 matched pairs. That's what this log message tells you:

DEBUG 2: Processing MFLX/Z0 versus HTSGW/Z0, for smoothing method
NEAREST(1),
over region box_mask, using 100 matched pairs.

You had been verifying the forecast of the probability of significant
wave
height > 12 feet twice... once by thresholding the observed wave
height >
12 (which makes sense) and once using observed wave height > 0 (which
does not
make sense). So in the confg file, I set:

obs = {
   field = [
      {
...
        cat_thresh = [ >=12.0 ];
...

So now we have 2 output PCT lines of interest... one for each of the
models
being verified. Here's a description of the PCT output line type:
https://dtcenter.github.io/MET/Users_Guide/point-stat.html#id14

Since we chose 10 probability bins, that gives a 10x2 probabilistic
contingency table. I just inserted some newlines to turn a PCT line
into a
table. Each line is the left point of the probability bin, followed by
the
count of points whose probability value falls in that bin and the
observed
event (e.g. >12) did occur, followed by the count where the observed
event did
not occur.

For NAVGEM...

Prob_Bin OY ON
      0    0    0
      0.1    0    0
      0.2    1    4
      0.3    0    2
      0.4    3    7
      0.5   15    5
      0.6    5    0
      0.7   31    0
      0.8   14    0
      0.9    13     0

For TCOFCL...

Prob_Bin OY ON
       0    0    0
      0.1    0    0
      0.2    0    2
      0.3    0    3
      0.4    4    8
      0.5    7    5
      0.6    7    0
      0.7   21    0
      0.8   29    0
       0.9    14     0

These counts add up to 100 because there are 100 points in the region.
Out of
all these points the observed event DID NOT occur only 18 times.

Looking in the PSTD line type at the Brier Score column, I see scores
of
0.1083 and 0.0911 for NAVGEM and TCOFCL, respectively. Here's a
description of
the Brier Score:
https://dtcenter.github.io/MET/Users_Guide/appendixC.html#brier-score

Lower is better, so from the perspective of Brier Score, the TCOFCL
forecast
matches the observations slightly better. Of course, I could have
totally
reversed the names on the models when I ran Grid-Stat!

Hope this helps.

Thanks,
John


------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
From: efren.serra.ctr at nrlmry.navy.mil
Time: Fri Oct 09 13:53:42 2020

John - is there a way to print tuples of <lon, lat, field value> for
values in the 10x10 box or box_mask? Do you have such a tool? I just
want a list of values. Thanks.

-----Original Message-----
From: John Halley Gotway via RT <met_help at ucar.edu>
Sent: Wednesday, October 7, 2020 1:58 PM
To: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil>
Cc: ericg at ucar.edu
Subject: Fwd: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

Eric,

Efren Serra works at NRL and has been running the Grid-Stat tool in
MET to evaluate the performance of forecasts of the probability of
significant wave heights > 12 ft.

The attached graphic illustrates this case where there's a large
difference in the forecast probability values. The NAVGEM model on the
left has probability values up to 60% while the OFCL forecast includes
probabilities up to 90%. Efren is comparing these models using
the Brier score computed over a 10x10 degree box.

He was surprised not to see a larger difference in the resulting
reliability diagrams.

I explained how Grid-Stat evaluates probabilities using an Nx2
probability contingency table. Therefore the number of bins will
affect the results. In particular, when computing the Brier score, the
centerpoint of the bin is used instead of the raw probability values
that went into the bin.

The 2 attached images beginning with "wp_102018_basins" show the Brier
score across multiple forecast lead times.
And the 3 attached images beginning with "wp_102018_gt12ft" show
Reliability diagrams for a few different lead times.

I assume that the jagged pattern in the reliability diagram is the result
of choosing too many bins, resulting in bins with a small number of
points which leads to sporadic results.

On the one hand, choosing many bins minimizes the effect of binning in
the computation of the Brier score. On the other hand, too many bins
make for a reliability diagram that is not very smooth.
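
For context, each point on those reliability diagrams is just the
observed relative frequency of the event within one forecast
probability bin, so bins holding only a handful of points produce the
noisy, jagged segments of the curve. A minimal Python sketch of that
per-bin calculation (for illustration only, not MET source code):

# Observed relative frequency per probability bin: OY_i / (OY_i + ON_i).
# Bins with a small sample size (OY_i + ON_i) give noisy, jagged points.
def reliability_points(counts, bin_width=0.1):
    """counts: list of (oy, on) pairs, one per bin, in increasing order."""
    points = []
    for i, (oy, on) in enumerate(counts):
        n = oy + on
        if n > 0:
            points.append(((i + 0.5) * bin_width, oy / n, n))
    return points   # (bin mid-point, observed frequency, sample size)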

Do you have any recommendations or advice for Efren in the
verification of the probabilistic data or interpretation of results?

Thanks,
John

---------- Forwarded message ---------
From: efren.serra.ctr at nrlmry.navy.mil via RT <met_help at ucar.edu>
Date: Wed, Oct 7, 2020 at 2:18 PM
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
To: <johnhg at ucar.edu>



<URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >

John - These are plots of BASER_i or reliability diagrams using
cat_thresh == 0.01 as you suggested. What are you able to conclude
from these? The jigsaw pattern in WW3NAVGEM, for instance. I was
wondering about that.
Thanks for your help. I'm going to try cat_thresh == 0.05 next.

-----Original Message-----
From: John Halley Gotway via RT <met_help at ucar.edu>
Sent: Wednesday, October 7, 2020 9:01 AM
To: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil
>
Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

Efren,

I see that you're doing probabilistic verification of two models. All
of the probabilistic stats computed by MET are derived from an Nx2
probabilistic contingency table. Your choice of the forecast
categorical threshold defines the N probability bins. And your choice
of the observed categorical threshold defines the 2 yes/no bins for
the observation.

It looks like you're processing the probability of significant wave
height
> 12 ft.

So you'd set the observation cat_thresh = >12; since that's the event
for which probabilities are defined.

And if you set the forecast cat_thresh = ==0.2, you'd be using 5
probability bins:
   0.0 to 0.2, 0.2 to 0.4, 0.4 to 0.6, 0.6 to 0.8, and 0.8 to 1.0.
If you set the forecast cat_thresh = ==0.1, you'd be using 10
probability bins.

MET computes the Brier Score from the Nx2 table... not the raw
probability values.
In computing the Brier Score, all points falling inside a bin are
evaluated using the mid-point of that bin.
Points in 0.0 to 0.2 are processed as 0.1.
Points in 0.2 to 0.4 are processed as 0.3.
Points in 0.4 to 0.6 are processed as 0.5.
Points in 0.6 to 0.8 are processed as 0.7.
Points in 0.8 to 1.0 are processed as 0.9.

In a recent MET-Help question from NOAA/WPC, we discussed this
situation.
The Brier Score values were unexpected because WPC's probability
values were actually already binned. The process of re-binning the
already binned probabilities introduced some unexpected diffs in the
resulting Brier Score values. As a result, we wrote up this GitHub
issue:
   https://github.com/dtcenter/MET/issues/1495

Now I'm not sure if/how this applies to your data.  But to minimize
the effect of binning, you could try using 100 probability bins by
setting "cat_thresh = ==0.01;" I'd be curious to see how much impact
the cat_thresh setting has on the results. And remember that the
statistics reported by Grid-Stat are computed over some spatial area,
as indicated by VX_MASK. If that spatial area is large, it's likely
that the relative similarity between the fields AWAY from the large
event is averaging things out and making their performance look more
similar on average.

So take a look at how you've defined the masking regions for Grid-Stat.

Thanks,
John

On Wed, Oct 7, 2020 at 9:38 AM Julie Prestopnik via RT
<met_help at ucar.edu>
wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
>
> Hi Efren.
>
> I just wanted to follow up on your previous email.  You mentioned,
"I
> was wondering about the difference between specifying forecast bins
of
> .2 versus .1 with cat_thresh = [==.1] or [==.2]".
>
> That information can be found on this page of the User's Guide:
>
>
https://dtcenter.github.io/MET/Users_Guide/data_io.html#configuration-
> file-details
>
> - Threshold:
> >       - A threshold type (<, <=, ==, !=, >=, or >) followed by a
> > numeric
> value.
> >       - The threshold type may also be specified using two letter
> abbreviations
> >         (lt, le, eq, ne, ge, gt).
> >       - Multiple thresholds may be combined by specifying the
logic
> > type
> of AND
> >         (&&) or *OR (||)*. For example, ">=5&&<=10" defines the
> > numbers
> between 5
> >         and 10 and *"==1||==2" defines numbers exactly equal to 1
or
> > 2.*
> >
> >
> If that doesn't help and doesn't explain why you are not seeing the
> differences you expect, please let me know and I will see if John
can
> better assist you.
>
> Thanks!
>
> Julie
>
> On Wed, Oct 7, 2020 at 9:32 AM efren.serra.ctr at nrlmry.navy.mil via
RT
> < met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
> >
> > Attached are plots I made from pjc, prc and pstd output. Here I'm
> > including the Brier score out of pstd data files. We are comparing
> > two different probability models versus the same ground truth and
we
> > expect remarkable differences between the probabilistic models,
but
> > we don't see that in these plots. I have worked with John before,
so
> > I was wondering
> if
> > you don't mind sharing this with him.
> >
> > -----Original Message-----
> > From: Julie Prestopnik via RT <met_help at ucar.edu>
> > Sent: Wednesday, October 7, 2020 7:41 AM
> > To: Serra, Mr. Efren, Contractor, Code 7531 <
> > efren.serra.ctr at nrlmry.navy.mil>
> > Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
> >
> > Hi Efren.
> >
> > I see that you are under the impression that Grid-Stat for METv9.1
> > no longer includes cat_thresh.  However, if you look in the
> > met-9.1/share/met/config/GridStatConfig_default file, you can see
> > that cat_thresh is still available for use in Grid-Stat:
> >
> > //
> > > // Forecast and observation fields to be verified // fcst = {
> > >    field = [
> > >       {
> > >         name       = "APCP";
> > >         level      = [ "A03" ];
> > >         *cat_thresh *= [ >0.0, >=5.0 ];
> > >       }
> > >    ];
> > > }
> >
> >
> > You mentioned that the v9.1 documentation doesn't have cat_thresh
> > described.  Please let us know if there is something in the
METv8.1
> > documentation that is missing from the METv9.1 documentation
> > (specific quote, section, etc.) so that we can ensure its
inclusion
> > in the METv9.1 documentation.
> >
> > Thanks!
> >
> > Julie
> >
> > On Tue, Oct 6, 2020 at 4:27 PM efren.serra.ctr at nrlmry.navy.mil via
> > RT < met_help at ucar.edu> wrote:
> >
> > >
> > > Tue Oct 06 16:27:16 2020: Request 96983 was acted upon.
> > > Transaction: Ticket created by efren.serra.ctr at nrlmry.navy.mil
> > >        Queue: met_help
> > >      Subject: cat_thresh v8.1 versus v9.1
> > >        Owner: Nobody
> > >   Requestors: efren.serra.ctr at nrlmry.navy.mil
> > >       Status: new
> > >  Ticket <URL:
> > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983
> > > >
> > >
> > >
> > > Folks - I updated my MET tools from v8.1 to v9.1. I had the
> > > following fcst entry for my GridStatConfig file(s) but I noticed
> > > that v9.1 documentation doesn't have cat_thresh. What should I
map
> > > cat_thresh to? There are vld_thresh and cov_thresh? Thank you.
> > >
> > > fcst = {
> > >
> > >    valid_time = "20180711_12";
> > >    field = [
> > >       {
> > >         file_type  = GRIB1;
> > >         model      = "WW3NAVGEM";
> > >         name       = "MFLX";
> > >         level      = "Z0";
> > >         prob       = TRUE;
> > >         cat_thresh = [ ==0.1 ];
> > >       }
> > >    ];
> > >
> > > }
> > >
> > > Efren A. Serra (Contractor)
> > > Physicist
> > >
> > > DeVine Consulting, Inc.
> > > Naval Research Laboratory
> > > Marine Meteorology Division
> > > 7 Grace Hopper Ave., STOP 2
> > > Monterey, CA 93943
> > > Code 7542
> > > Mobile: 408-425-5027
> > >
> > >
> > >
> >
> > --
> > Julie Prestopnik (she/her/hers)
> > Software Engineer
> > National Center for Atmospheric Research Research Applications
> > Laboratory
> > Email: jpresto at ucar.edu
> >
> > My working day may not be your working day.  Please do not feel
> > obliged
> to
> > reply to this email outside of your normal working hours.
> >
> >
> >
>
> --
> Julie Prestopnik (she/her/hers)
> Software Engineer
> National Center for Atmospheric Research Research Applications
> Laboratory
> Email: jpresto at ucar.edu
>
> My working day may not be your working day.  Please do not feel
> obliged to reply to this email outside of your normal working hours.
>
>



------------------------------------------------
Subject: cat_thresh v8.1 versus v9.1
From: John Halley Gotway
Time: Fri Oct 09 14:09:54 2020

Efren,

No, I don't really. The Point-Stat tool does include the matched pair
line
type (MPR) which includes lat/lon/fcst/obs values, among other things.

We did not add that for the Grid-Stat tool because, in general, it'd
be too
much data to write to ascii format. But in place of that, we added the
gridded NetCDF matched pairs output file. So it would be pretty easy to
write a python script (or ncl or whatever) to dump that data.

For example, here's ncdump:
   ncdump -v FCST_MFLX_Z0_box_mask
grid_stat_WW3NAVGEM_wp102018_gt12ft_960000L_20180709_000000V_pairs.nc

All those '_' characters indicate bad data values. Remember that the
data
is 360x181 = 65160 points, of which only 100 contain valid data values
inside the 10x10 box mask.
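
If it helps, here is a rough Python sketch along those lines. It is not
a tool that ships with MET, and it assumes the pairs file also contains
2D lat and lon variables (check with ncdump -h):

# Print <lon, lat, value> for the non-missing points in the Grid-Stat
# NetCDF matched pairs file. Variable and file names follow this thread;
# lat and lon are assumed to be 2D grids matching the data variable.
from netCDF4 import Dataset
import numpy as np

nc   = Dataset("grid_stat_WW3NAVGEM_wp102018_gt12ft_960000L_20180709_000000V_pairs.nc")
lat  = nc.variables["lat"][:]
lon  = nc.variables["lon"][:]
fcst = nc.variables["FCST_MFLX_Z0_box_mask"][:]   # masked where data is bad

iy, ix = np.where(~np.ma.getmaskarray(fcst))      # the ~100 valid points
for j, i in zip(iy, ix):
    print(f"{lon[j, i]:9.3f} {lat[j, i]:8.3f} {fcst[j, i]:8.4f}")

nc.close()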

John


On Fri, Oct 9, 2020 at 1:54 PM efren.serra.ctr at nrlmry.navy.mil via RT
<
met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
>
> John - is there a way to print tuple <lon, lat, field value>, for
values
> in 10x10 box or box_mask? Do you have such a tool? I just want a
list of
> values. Thanks.
>
> -----Original Message-----
> From: John Halley Gotway via RT <met_help at ucar.edu>
> Sent: Wednesday, October 7, 2020 1:58 PM
> To: Serra, Mr. Efren, Contractor, Code 7531 <
> efren.serra.ctr at nrlmry.navy.mil>
> Cc: ericg at ucar.edu
> Subject: Fwd: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
>
> Eric,
>
> Efren Serra works at NRL and has been running the Grid-Stat tool in
MET to
> evaluate the performance of forecast for the probability of
significant
> wave heights > 12 ft.
>
> The attached graphic illustrates this case where there's a large
> difference in the forecast probability values. The NAVGEM model on
the left
> has probability values up to 60% while the OFCL forecast includes
> probabilities up to 90%. Efren is looking comparing these models
using the
> Brier score computed over a 10x10 degree box.
>
> He was surprised not to see a larger difference in the resulting
> reliability diagrams.
>
> I explained how Grid-Stat evaulates probabilities using an Nx2
probability
> contingency table. Therefore the number of bins will affect the
results. In
> particular, when computing the Brier score, the centerpoint of the
bin is
> used instead of the raw probability values that went into the bin.
>
> The 2 attached images beginning with "wp_102018_basins" show the
Brier
> score across multiple forecast lead times.
> And the 3 attached images beginning with "wp_102018_gt12ft" show
> Reliability diagrams for a few different lead times.
>
> I assume that jagged pattern in the reliability diagram is the
result of
> choosing too many bins, resulting in bins with a small number of
points
> which leads to sporadic results.
>
> On the one hand, choosing many bins minimizes the effect of binning
in the
> computation of the Brier score. On the other hand, too many bins
make for a
> reliability diagram that is not very smooth.
>
> Do you have any recommendations or advice for Efren in the
verification of
> the probabilistic data or interpretation of results?
>
> Thanks,
> John
>
> ---------- Forwarded message ---------
> From: efren.serra.ctr at nrlmry.navy.mil via RT <met_help at ucar.edu>
> Date: Wed, Oct 7, 2020 at 2:18 PM
> Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
> To: <johnhg at ucar.edu>
>
>
>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
>
> John - These are plots of BASER_i or reliability diagrams using
cat_thresh
> == 0.01 as you suggested. What can you able to conclude from these?
The
> jigsaw pattern in WW3NAVGEM, for instance. I was wondering about
that.
> Thanks for your help. I'm going to try cat_thresh == 0.05 next.
>
> -----Original Message-----
> From: John Halley Gotway via RT <met_help at ucar.edu>
> Sent: Wednesday, October 7, 2020 9:01 AM
> To: Serra, Mr. Efren, Contractor, Code 7531 <
> efren.serra.ctr at nrlmry.navy.mil
> >
> Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
>
> Efren,
>
> I see that you're doing probabilistic verification of two models.
All of
> the probabilistic stats computed by MET are derived from an Nx2
> probabilistic contingency table. Your choice of the forecast
categorical
> threshold defines the N probability bins. And your choice of the
observed
> categorical threshold defines the 2 yes/no bins for the observation.
>
> It looks like you're processing the probability of significant wave
height
> > 12 ft.
>
> So you'd set the observation cat_thresh = >12; since that's the
event for
> which probabilities are defined.
>
> And if you set the forecast cat_thresh = ==0.2, you'd be using 5
> probability bins:
>    0.0 to 0.2, 0.2 to 0.4, 0.4 to 0.6, 0.6 to 0.8, and 0.8 to 1.0 If
you
> set the forecast cat_thresh = ==0.1, you'd be using 10 probability
bins.
>
> MET computes the Brier Score from the Nx2 table... not the raw
probability
> values.
> In computing the Brier Score, all points falling inside a bin are
> evaluated using the mid-point of that bin.
> Points in 0.0 to 0.2 are processed as 0.1.
> Points in 0.2 to 0.4 are processed as 0.3.
> Points in 0.4 to 0.6 are processed as 0.5.
> Points in 0.6 to 0.8 are processed as 0.7.
> Points in 0.8 to 1.0 are processed as 0.9.
>
> In a recent MET-Help question from NOAA/WPC, we discussed this
situation.
> The Brier Score values were unexpected because WPC's probability
values
> were actually already binned. The process of re-binning the already
binned
> probabilities introduced some unexpected diffs in the resulting
Brier Score
> values. As a result, we wrote up this GitHub issue:
>    https://github.com/dtcenter/MET/issues/1495
>
> Now I'm not sure if/how this applies to your data.  But to minimize
the
> effect of binning, you could try using 100 probability bins by
setting
> "cat_thresh = ==0.01;" I'd be curious to see how much impact the
cat_thresh
> setting has on the results. And remember that the statistics
reported by
> Grid-Stat are computed over some spatial area, as indicated by
VX_MASK. If
> that spatial area is large, it's likely that the relative similarity
> between the fields AWAY from the large event is averaging things out
and
> making their performance look more similar on average.
>
> So take a look at how you've defined the masking regions for Grid-
Stat.
>
> Thanks,
> John
>
> On Wed, Oct 7, 2020 at 9:38 AM Julie Prestopnik via RT
<met_help at ucar.edu>
> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
> >
> > Hi Efren.
> >
> > I just wanted to follow up on your previous email.  You mentioned,
"I
> > was wondering about the difference between specifying forecast
bins of
> > .2 versus .1 with cat_thresh = [==.1] or [==.2]".
> >
> > That information can be found on this page of the User's Guide:
> >
> >
https://dtcenter.github.io/MET/Users_Guide/data_io.html#configuration-
> > file-details
> >
> > - Threshold:
> > >       - A threshold type (<, <=, ==, !=, >=, or >) followed by a
> > > numeric
> > value.
> > >       - The threshold type may also be specified using two
letter
> > abbreviations
> > >         (lt, le, eq, ne, ge, gt).
> > >       - Multiple thresholds may be combined by specifying the
logic
> > > type
> > of AND
> > >         (&&) or *OR (||)*. For example, ">=5&&<=10" defines the
> > > numbers
> > between 5
> > >         and 10 and *"==1||==2" defines numbers exactly equal to
1 or
> > > 2.*
> > >
> > >
> > If that doesn't help and doesn't explain why you are not seeing
the
> > differences you expect, please let me know and I will see if John
can
> > better assist you.
> >
> > Thanks!
> >
> > Julie
> >
> > On Wed, Oct 7, 2020 at 9:32 AM efren.serra.ctr at nrlmry.navy.mil via
RT
> > < met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
> > >
> > > Attached are plots I made from pjc, prc and pstd output. Here
I'm
> > > including the Brier score out of pstd data files. We are
comparing
> > > two different probability models versus the same ground truth
and we
> > > expect remarkable differences between the probabilistic models,
but
> > > we don't see that in these plots. I have worked with John
before, so
> > > I was wondering
> > if
> > > you don't mind sharing this with him.
> > >
> > > -----Original Message-----
> > > From: Julie Prestopnik via RT <met_help at ucar.edu>
> > > Sent: Wednesday, October 7, 2020 7:41 AM
> > > To: Serra, Mr. Efren, Contractor, Code 7531 <
> > > efren.serra.ctr at nrlmry.navy.mil>
> > > Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus
v9.1
> > >
> > > Hi Efren.
> > >
> > > I see that you are under the impression that Grid-Stat for
METv9.1
> > > no longer includes cat_thresh.  However, if you look in the
> > > met-9.1/share/met/config/GridStatConfig_default file, you can
see
> > > that cat_thresh is still available for use in Grid-Stat:
> > >
> > > //
> > > > // Forecast and observation fields to be verified // fcst = {
> > > >    field = [
> > > >       {
> > > >         name       = "APCP";
> > > >         level      = [ "A03" ];
> > > >         *cat_thresh *= [ >0.0, >=5.0 ];
> > > >       }
> > > >    ];
> > > > }
> > >
> > >
> > > You mentioned that the v9.1 documentation doesn't have
cat_thresh
> > > described.  Please let us know if there is something in the
METv8.1
> > > documentation that is missing from the METv9.1 documentation
> > > (specific quote, section, etc.) so that we can ensure its
inclusion
> > > in the METv9.1 documentation.
> > >
> > > Thanks!
> > >
> > > Julie
> > >
> > > On Tue, Oct 6, 2020 at 4:27 PM efren.serra.ctr at nrlmry.navy.mil
via
> > > RT < met_help at ucar.edu> wrote:
> > >
> > > >
> > > > Tue Oct 06 16:27:16 2020: Request 96983 was acted upon.
> > > > Transaction: Ticket created by efren.serra.ctr at nrlmry.navy.mil
> > > >        Queue: met_help
> > > >      Subject: cat_thresh v8.1 versus v9.1
> > > >        Owner: Nobody
> > > >   Requestors: efren.serra.ctr at nrlmry.navy.mil
> > > >       Status: new
> > > >  Ticket <URL:
> > > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983
> > > > >
> > > >
> > > >
> > > > Folks - I updated my MET tools from v8.1 to v9.1. I had the
> > > > following fcst entry for my GridStatConfig file(s) but I
noticed
> > > > that v9.1 documentation doesn't have cat_thresh. What should I
map
> > > > cat_thresh to? There are vld_thresh and cov_thresh? Thank you.
> > > >
> > > > fcst = {
> > > >
> > > >    valid_time = "20180711_12";
> > > >    field = [
> > > >       {
> > > >         file_type  = GRIB1;
> > > >         model      = "WW3NAVGEM";
> > > >         name       = "MFLX";
> > > >         level      = "Z0";
> > > >         prob       = TRUE;
> > > >         cat_thresh = [ ==0.1 ];
> > > >       }
> > > >    ];
> > > >
> > > > }
> > > >
> > > > Efren A. Serra (Contractor)
> > > > Physicist
> > > >
> > > > DeVine Consulting, Inc.
> > > > Naval Research Laboratory
> > > > Marine Meteorology Division
> > > > 7 Grace Hopper Ave., STOP 2
> > > > Monterey, CA 93943
> > > > Code 7542
> > > > Mobile: 408-425-5027
> > > >
> > > >
> > > >
> > >
> > > --
> > > Julie Prestopnik (she/her/hers)
> > > Software Engineer
> > > National Center for Atmospheric Research Research Applications
> > > Laboratory
> > > Email: jpresto at ucar.edu
> > >
> > > My working day may not be your working day.  Please do not feel
> > > obliged
> > to
> > > reply to this email outside of your normal working hours.
> > >
> > >
> > >
> >
> > --
> > Julie Prestopnik (she/her/hers)
> > Software Engineer
> > National Center for Atmospheric Research Research Applications
> > Laboratory
> > Email: jpresto at ucar.edu
> >
> > My working day may not be your working day.  Please do not feel
> > obliged to reply to this email outside of your normal working
hours.
> >
> >
>
>
>
>

------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
From: efren.serra.ctr at nrlmry.navy.mil
Time: Fri Oct 09 14:17:13 2020

No worries mate, I figured it out. I'm using ncdump as follows:
ncdump -v lon,lat,OBS_HTSGW_Z0_box_mask -f fortran
grid_stat_wp102018_gt12ft_960000L_20180709_000000V_pairs.nc |grep -v "
_," >ww3tcofcl_sig_wav_ht.asc

Also, how do I know whether the comparison is done in [ft] versus [m]?
I have "cat_thresh = [ >= 12.0 ];" but I also have "convert(x) =
3.2808*x;" in the obs structure. Is the convert also being applied to
the cat_thresh, or should I set "cat_thresh = [ >= 3.6576 ];" and not
apply any convert?

obs = {

   field = [
      {
        file_type  = GRIB1;
        model      = "WW3TCOFCL";
        name       = "HTSGW";
        level      = "Z0";
        cat_thresh = [ >=12.0 ];
        convert(x) = 3.2808*x;
        mask = { grid = []; poly =
["/ftp/receive/serrae/ww3eval/wp102018/wp102018-2018070900.nc"]; }
      }
   ];

}


-----Original Message-----
From: John Halley Gotway via RT <met_help at ucar.edu>
Sent: Friday, October 9, 2020 1:10 PM
To: Serra, Mr. Efren, Contractor, Code 7531
<efren.serra.ctr at nrlmry.navy.mil>
Cc: ericg at ucar.edu
Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1

Efren,

No, I don't really. The Point-Stat tool does include the matched pair
line type (MPR) which includes lat/lon/fcst/obs values, among other
things.

We did not add that for the Grid-Stat tool because, in general, it'd
be too much data to write to ascii format. But in place of that, we
added the gridded NetCDF matched pairs output file. So writing a
python script (or ncl or whatever) would be pretty easy to dump that
data.

For example, here's ncdump:
   ncdump -v FCST_MFLX_Z0_box_mask
grid_stat_WW3NAVGEM_wp102018_gt12ft_960000L_20180709_000000V_pairs.nc

All those '_' characters indicate bad data values. Remember that the
data is 360x181 = 65160 points, of which only 100 contain valid data
values inside the 10x10 box mask.

John


On Fri, Oct 9, 2020 at 1:54 PM efren.serra.ctr at nrlmry.navy.mil via RT
< met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
>
> John - is there a way to print tuple <lon, lat, field value>, for
> values in 10x10 box or box_mask? Do you have such a tool? I just
want
> a list of values. Thanks.
>
> -----Original Message-----
> From: John Halley Gotway via RT <met_help at ucar.edu>
> Sent: Wednesday, October 7, 2020 1:58 PM
> To: Serra, Mr. Efren, Contractor, Code 7531 <
> efren.serra.ctr at nrlmry.navy.mil>
> Cc: ericg at ucar.edu
> Subject: Fwd: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
>
> Eric,
>
> Efren Serra works at NRL and has been running the Grid-Stat tool in
> MET to evaluate the performance of forecasts of the probability of
> significant wave heights > 12 ft.
>
> The attached graphic illustrates this case where there's a large
> difference in the forecast probability values. The NAVGEM model on
the
> left has probability values up to 60% while the OFCL forecast
includes
> probabilities up to 90%. Efren is comparing these models using
> the Brier score computed over a 10x10 degree box.
>
> He was surprised not to see a larger difference in the resulting
> reliability diagrams.
>
> I explained how Grid-Stat evaluates probabilities using an Nx2
> probability contingency table. Therefore the number of bins will
> affect the results. In particular, when computing the Brier score,
the
> centerpoint of the bin is used instead of the raw probability values
that went into the bin.
>
> The 2 attached images beginning with "wp_102018_basins" show the
Brier
> score across multiple forecast lead times.
> And the 3 attached images beginning with "wp_102018_gt12ft" show
> Reliability diagrams for a few different lead times.
>
> I assume that the jagged pattern in the reliability diagram is the
result
> of choosing too many bins, resulting in bins with a small number of
> points which leads to sporadic results.
>
> On the one hand, choosing many bins minimizes the effect of binning
in
> the computation of the Brier score. On the other hand, too many bins
> make for a reliability diagram that is not very smooth.
>
> Do you have any recommendations or advice for Efren in the
> verification of the probabilistic data or interpretation of results?
>
> Thanks,
> John
>
> ---------- Forwarded message ---------
> From: efren.serra.ctr at nrlmry.navy.mil via RT <met_help at ucar.edu>
> Date: Wed, Oct 7, 2020 at 2:18 PM
> Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
> To: <johnhg at ucar.edu>
>
>
>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
>
> John - These are plots of BASER_i or reliability diagrams using
> cat_thresh == 0.01 as you suggested. What are you able to conclude
> from these? The jigsaw pattern in WW3NAVGEM, for instance. I was
wondering about that.
> Thanks for your help. I'm going to try cat_thresh == 0.05 next.
>
> -----Original Message-----
> From: John Halley Gotway via RT <met_help at ucar.edu>
> Sent: Wednesday, October 7, 2020 9:01 AM
> To: Serra, Mr. Efren, Contractor, Code 7531 <
> efren.serra.ctr at nrlmry.navy.mil
> >
> Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
>
> Efren,
>
> I see that you're doing probabilistic verification of two models.
All
> of the probabilistic stats computed by MET are derived from an Nx2
> probabilistic contingency table. Your choice of the forecast
> categorical threshold defines the N probability bins. And your
choice
> of the observed categorical threshold defines the 2 yes/no bins for
the observation.
>
> It looks like you're processing the probability of significant wave
> height
> > 12 ft.
>
> So you'd set the observation cat_thresh = >12; since that's the
event
> for which probabilities are defined.
>
> And if you set the forecast cat_thresh = ==0.2, you'd be using 5
> probability bins:
>    0.0 to 0.2, 0.2 to 0.4, 0.4 to 0.6, 0.6 to 0.8, and 0.8 to 1.0.
> If you set the forecast cat_thresh = ==0.1, you'd be using 10
probability bins.
>
> MET computes the Brier Score from the Nx2 table... not the raw
> probability values.
> In computing the Brier Score, all points falling inside a bin are
> evaluated using the mid-point of that bin.
> Points in 0.0 to 0.2 are processed as 0.1.
> Points in 0.2 to 0.4 are processed as 0.3.
> Points in 0.4 to 0.6 are processed as 0.5.
> Points in 0.6 to 0.8 are processed as 0.7.
> Points in 0.8 to 1.0 are processed as 0.9.
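A minimal sketch, in plain Python rather than MET, of the midpoint
binning described above, assuming flat arrays of forecast probabilities
and 0/1 observed events (all names and values here are illustrative
only):

   import numpy as np

   def binned_brier(prob, obs_event, bin_width=0.2):
       # Brier score with each probability replaced by its bin midpoint,
       # e.g. values in 0.0 to 0.2 are scored as 0.1.
       prob      = np.asarray(prob, dtype=float)
       obs_event = np.asarray(obs_event, dtype=float)   # 1 if the event occurred, else 0
       edges     = np.arange(0.0, 1.0 + bin_width, bin_width)
       # Bin index for each probability; the last bin also includes 1.0 itself.
       idx       = np.clip(np.digitize(prob, edges) - 1, 0, len(edges) - 2)
       midpoints = (edges[:-1] + edges[1:]) / 2.0
       return np.mean((midpoints[idx] - obs_event) ** 2)

   p = [0.05, 0.15, 0.35, 0.72, 0.95]
   o = [0, 0, 1, 1, 1]
   # bin_width=0.2 mimics cat_thresh = ==0.2 (5 bins); 0.01 mimics ==0.01 (100 bins).
   print(binned_brier(p, o, 0.2), binned_brier(p, o, 0.01))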
>
> In a recent MET-Help question from NOAA/WPC, we discussed this
situation.
> The Brier Score values were unexpected because WPC's probability
> values were actually already binned. The process of re-binning the
> already binned probabilities introduced some unexpected diffs in the
> resulting Brier Score values. As a result, we wrote up this GitHub
issue:
>    https://github.com/dtcenter/MET/issues/1495
>
> Now I'm not sure if/how this applies to your data.  But to minimize
> the effect of binning, you could try using 100 probability bins by
> setting "cat_thresh = ==0.01;" I'd be curious to see how much impact
> the cat_thresh setting has on the results. And remember that the
> statistics reported by Grid-Stat are computed over some spatial
area,
> as indicated by VX_MASK. If that spatial area is large, it's likely
> that the relative similarity between the fields AWAY from the large
> event is averaging things out and making their performance look more
similar on average.
>
> So take a look at how you've defined the masking regions for Grid-
Stat.
>
> Thanks,
> John
>
> On Wed, Oct 7, 2020 at 9:38 AM Julie Prestopnik via RT
> <met_help at ucar.edu>
> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
> >
> > Hi Efren.
> >
> > I just wanted to follow up on your previous email.  You mentioned,
> > "I was wondering about the difference between specifying forecast
> > bins of
> > .2 versus .1 with cat_thresh = [==.1] or [==.2]".
> >
> > That information can be found on this page of the User's Guide:
> >
> >
https://dtcenter.github.io/MET/Users_Guide/data_io.html#configuratio
> > n-
> > file-details
> >
> > - Threshold:
> > >       - A threshold type (<, <=, ==, !=, >=, or >) followed by a
> > > numeric
> > value.
> > >       - The threshold type may also be specified using two
letter
> > abbreviations
> > >         (lt, le, eq, ne, ge, gt).
> > >       - Multiple thresholds may be combined by specifying the
> > > logic type
> > of AND
> > >         (&&) or *OR (||)*. For example, ">=5&&<=10" defines the
> > > numbers
> > between 5
> > >         and 10 and *"==1||==2" defines numbers exactly equal to
1
> > > or
> > > 2.*
> > >
> > >
> > If that doesn't help and doesn't explain why you are not seeing
the
> > differences you expect, please let me know and I will see if John
> > can better assist you.
> >
> > Thanks!
> >
> > Julie
> >
> > On Wed, Oct 7, 2020 at 9:32 AM efren.serra.ctr at nrlmry.navy.mil via
> > RT < met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
> > >
> > > Attached are plots I made from pjc, prc and pstd output. Here
I'm
> > > including the Brier score out of pstd data files. We are
comparing
> > > two different probability models versus the same ground truth
and
> > > we expect remarkable differences between the probabilistic
models,
> > > but we don't see that in these plots. I have worked with John
> > > before, so I was wondering
> > if
> > > you don't mind sharing this with him.
> > >
> > > -----Original Message-----
> > > From: Julie Prestopnik via RT <met_help at ucar.edu>
> > > Sent: Wednesday, October 7, 2020 7:41 AM
> > > To: Serra, Mr. Efren, Contractor, Code 7531 <
> > > efren.serra.ctr at nrlmry.navy.mil>
> > > Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus
v9.1
> > >
> > > Hi Efren.
> > >
> > > I see that you are under the impression that Grid-Stat for
METv9.1
> > > no longer includes cat_thresh.  However, if you look in the
> > > met-9.1/share/met/config/GridStatConfig_default file, you can
see
> > > that cat_thresh is still available for use in Grid-Stat:
> > >
> > > //
> > > > // Forecast and observation fields to be verified // fcst = {
> > > >    field = [
> > > >       {
> > > >         name       = "APCP";
> > > >         level      = [ "A03" ];
> > > >         *cat_thresh *= [ >0.0, >=5.0 ];
> > > >       }
> > > >    ];
> > > > }
> > >
> > >
> > > You mentioned that the v9.1 documentation doesn't have
cat_thresh
> > > described.  Please let us know if there is something in the
> > > METv8.1 documentation that is missing from the METv9.1
> > > documentation (specific quote, section, etc.) so that we can
> > > ensure it's inclusion in the METv9.1 documentation.
> > >
> > > Thanks!
> > >
> > > Julie
> > >
> > > On Tue, Oct 6, 2020 at 4:27 PM efren.serra.ctr at nrlmry.navy.mil
via
> > > RT < met_help at ucar.edu> wrote:
> > >
> > > >
> > > > Tue Oct 06 16:27:16 2020: Request 96983 was acted upon.
> > > > Transaction: Ticket created by efren.serra.ctr at nrlmry.navy.mil
> > > >        Queue: met_help
> > > >      Subject: cat_thresh v8.1 versus v9.1
> > > >        Owner: Nobody
> > > >   Requestors: efren.serra.ctr at nrlmry.navy.mil
> > > >       Status: new
> > > >  Ticket <URL:
> > > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983
> > > > >
> > > >
> > > >
> > > > Folks - I updated my MET tools from v8.1 to v9.1. I had the
> > > > following fcst entry for my GridStatConfig file(s) but I
noticed
> > > > that v9.1 documentation doesn't have cat_thresh. What should I
> > > > map cat_thresh to? There are vld_thresh and cov_thresh? Thank
you.
> > > >
> > > > fcst = {
> > > >
> > > >    valid_time = "20180711_12";
> > > >    field = [
> > > >       {
> > > >         file_type  = GRIB1;
> > > >         model      = "WW3NAVGEM";
> > > >         name       = "MFLX";
> > > >         level      = "Z0";
> > > >         prob       = TRUE;
> > > >         cat_thresh = [ ==0.1 ];
> > > >       }
> > > >    ];
> > > >
> > > > }
> > > >
> > > > Efren A. Serra (Contractor)
> > > > Physicist
> > > >
> > > > DeVine Consulting, Inc.
> > > > Naval Research Laboratory
> > > > Marine Meteorology Division
> > > > 7 Grace Hopper Ave., STOP 2
> > > > Monterey, CA 93943
> > > > Code 7542
> > > > Mobile: 408-425-5027
> > > >
> > > >
> > > >
> > >
> > > --
> > > Julie Prestopnik (she/her/hers)
> > > Software Engineer
> > > National Center for Atmospheric Research Research Applications
> > > Laboratory
> > > Email: jpresto at ucar.edu
> > >
> > > My working day may not be your working day.  Please do not feel
> > > obliged
> > to
> > > reply to this email outside of your normal working hours.
> > >
> > >
> > >
> >
> > --
> > Julie Prestopnik (she/her/hers)
> > Software Engineer
> > National Center for Atmospheric Research Research Applications
> > Laboratory
> > Email: jpresto at ucar.edu
> >
> > My working day may not be your working day.  Please do not feel
> > obliged to reply to this email outside of your normal working
hours.
> >
> >
>
>
>
>



------------------------------------------------
Subject: cat_thresh v8.1 versus v9.1
From: John Halley Gotway
Time: Fri Oct 09 14:26:59 2020

Efren,

The convert function is only applied to the input gridded data. The
forecast thresholds are applied exactly as specified. They are not
impacted
by the convert logic.
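
A small numeric sketch (plain Python, not MET) of what that means for
the config above, assuming the raw HTSGW field is in meters:
thresholding the converted field at >= 12 ft selects the same points as
skipping the convert and thresholding the raw field at 12/3.2808 m,
roughly the 3.6576 m value quoted above.

   import numpy as np

   # Hypothetical significant wave heights in meters (illustrative values only).
   htsgw_m = np.array([2.5, 3.7, 4.1, 3.6, 5.0])

   M_TO_FT = 3.2808   # same factor as convert(x) = 3.2808*x

   # What the config above does: convert the field to feet, then apply the
   # threshold exactly as written (>= 12 ft); the threshold itself is untouched.
   event_ft = (htsgw_m * M_TO_FT) >= 12.0

   # The equivalent alternative: no convert, threshold the raw meters instead.
   event_m = htsgw_m >= 12.0 / M_TO_FT   # ~3.6576 m

   print(event_ft)   # [False  True  True False  True]
   print(event_m)    # same True/False pattern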

John

On Fri, Oct 9, 2020 at 2:17 PM efren.serra.ctr at nrlmry.navy.mil via RT
<
met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
>
> No worries mate, I figured it out. I'm using ncdump as follows:
> ncdump -v lon,lat,OBS_HTSGW_Z0_box_mask -f fortran
> grid_stat_wp102018_gt12ft_960000L_20180709_000000V_pairs.nc |grep -v
"
> _," >ww3tcofcl_sig_wav_ht.asc
>
> Also, how do I know whether the comparison is done in [ft] versus
> [m]? I have "cat_thresh = [ >= 12.0 ];" but I also have "convert(x) =
> 3.2808*x;" in the obs structure. Is the convert also being applied to
> the cat_thresh, or should I set "cat_thresh = [ >= 3.6576 ];" and not
> apply any convert?
>
> obs = {
>
>    field = [
>       {
>         file_type  = GRIB1;
>         model      = "WW3TCOFCL";
>         name       = "HTSGW";
>         level      = "Z0";
>         cat_thresh = [ >=12.0 ];
>         convert(x) = 3.2808*x;
>         mask = { grid = []; poly =
["/ftp/receive/serrae/ww3eval/wp102018/
> wp102018-2018070900.nc"]; }
>       }
>    ];
>
> }
>
>
> -----Original Message-----
> From: John Halley Gotway via RT <met_help at ucar.edu>
> Sent: Friday, October 9, 2020 1:10 PM
> To: Serra, Mr. Efren, Contractor, Code 7531 <
> efren.serra.ctr at nrlmry.navy.mil>
> Cc: ericg at ucar.edu
> Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
>
> Efren,
>
> No, I don't really. The Point-Stat tool does include the matched
pair line
> type (MPR) which includes lat/lon/fcst/obs values, among other
things.
>
> We did not add that for the Grid-Stat tool because, in general, it'd
be
> too much data to write to ascii format. But in place of that, we
added the
> gridded NetCDF matched pairs output file. So writing a python script
(or
> ncl or whatever) would be pretty easy to dump that data.
>
> For example, here's ncdump:
>    ncdump -v FCST_MFLX_Z0_box_mask
>
grid_stat_WW3NAVGEM_wp102018_gt12ft_960000L_20180709_000000V_pairs.nc
>
> All those '_' characters indicate bad data values. Remember that the
data
> is 360x181 = 65160 points, of which only 100 contain valid data
values
> inside the 10x10 box mask.
>
> John
>
>
> On Fri, Oct 9, 2020 at 1:54 PM efren.serra.ctr at nrlmry.navy.mil via
RT <
> met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
> >
> > John - is there a way to print tuple <lon, lat, field value>, for
> > values in 10x10 box or box_mask? Do you have such a tool? I just
want
> > a list of values. Thanks.
> >
> > -----Original Message-----
> > From: John Halley Gotway via RT <met_help at ucar.edu>
> > Sent: Wednesday, October 7, 2020 1:58 PM
> > To: Serra, Mr. Efren, Contractor, Code 7531 <
> > efren.serra.ctr at nrlmry.navy.mil>
> > Cc: ericg at ucar.edu
> > Subject: Fwd: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
> >
> > Eric,
> >
> > Efren Serra works at NRL and has been running the Grid-Stat tool
in
> > MET to evaluate the performance of forecasts of the probability of
> > significant wave heights > 12 ft.
> >
> > The attached graphic illustrates this case where there's a large
> > difference in the forecast probability values. The NAVGEM model on
the
> > left has probability values up to 60% while the OFCL forecast
includes
> > probabilities up to 90%. Efren is comparing these models using
> > the Brier score computed over a 10x10 degree box.
> >
> > He was surprised not to see a larger difference in the resulting
> > reliability diagrams.
> >
> > I explained how Grid-Stat evaluates probabilities using an Nx2
> > probability contingency table. Therefore the number of bins will
> > affect the results. In particular, when computing the Brier score,
the
> > centerpoint of the bin is used instead of the raw probability
values
> that went into the bin.
> >
> > The 2 attached images beginning with "wp_102018_basins" show the
Brier
> > score across multiple forecast lead times.
> > And the 3 attached images beginning with "wp_102018_gt12ft" show
> > Reliability diagrams for a few different lead times.
> >
> > I assume that the jagged pattern in the reliability diagram is the
result
> > of choosing too many bins, resulting in bins with a small number
of
> > points which leads to sporadic results.
> >
> > On the one hand, choosing many bins minimizes the effect of
binning in
> > the computation of the Brier score. On the other hand, too many
bins
> > make for a reliability diagram that is not very smooth.
> >
> > Do you have any recommendations or advice for Efren in the
> > verification of the probabilistic data or interpretation of
results?
> >
> > Thanks,
> > John
> >
> > ---------- Forwarded message ---------
> > From: efren.serra.ctr at nrlmry.navy.mil via RT <met_help at ucar.edu>
> > Date: Wed, Oct 7, 2020 at 2:18 PM
> > Subject: RE: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
> > To: <johnhg at ucar.edu>
> >
> >
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
> >
> > John - These are plots of BASER_i or reliability diagrams using
> > cat_thresh == 0.01 as you suggested. What are you able to conclude
> > from these? The jigsaw pattern in WW3NAVGEM, for instance. I was
> wondering about that.
> > Thanks for your help. I'm going to try cat_thresh == 0.05 next.
> >
> > -----Original Message-----
> > From: John Halley Gotway via RT <met_help at ucar.edu>
> > Sent: Wednesday, October 7, 2020 9:01 AM
> > To: Serra, Mr. Efren, Contractor, Code 7531 <
> > efren.serra.ctr at nrlmry.navy.mil
> > >
> > Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus v9.1
> >
> > Efren,
> >
> > I see that you're doing probabilistic verification of two models.
All
> > of the probabilistic stats computed by MET are derived from an Nx2
> > probabilistic contingency table. Your choice of the forecast
> > categorical threshold defines the N probability bins. And your
choice
> > of the observed categorical threshold defines the 2 yes/no bins
for the
> observation.
> >
> > It looks like you're processing the probability of significant
wave
> > height
> > > 12 ft.
> >
> > So you'd set the observation cat_thresh = >12; since that's the
event
> > for which probabilities are defined.
> >
> > And if you set the forecast cat_thresh = ==0.2, you'd be using 5
> > probability bins:
> >    0.0 to 0.2, 0.2 to 0.4, 0.4 to 0.6, 0.6 to 0.8, and 0.8 to 1.0.
> > If you set the forecast cat_thresh = ==0.1, you'd be using 10
> > probability bins.
> >
> > MET computes the Brier Score from the Nx2 table... not the raw
> > probability values.
> > In computing the Brier Score, all points falling inside a bin are
> > evaluated using the mid-point of that bin.
> > Points in 0.0 to 0.2 are processed as 0.1.
> > Points in 0.2 to 0.4 are processed as 0.3.
> > Points in 0.4 to 0.6 are processed as 0.5.
> > Points in 0.6 to 0.8 are processed as 0.7.
> > Points in 0.8 to 1.0 are processed as 0.9.
> >
> > In a recent MET-Help question from NOAA/WPC, we discussed this
situation.
> > The Brier Score values were unexpected because WPC's probability
> > values were actually already binned. The process of re-binning the
> > already binned probabilities introduced some unexpected diffs in
the
> > resulting Brier Score values. As a result, we wrote up this GitHub
issue:
> >    https://github.com/dtcenter/MET/issues/1495
> >
> > Now I'm not sure if/how this applies to your data.  But to
minimize
> > the effect of binning, you could try using 100 probability bins by
> > setting "cat_thresh = ==0.01;" I'd be curious to see how much
impact
> > the cat_thresh setting has on the results. And remember that the
> > statistics reported by Grid-Stat are computed over some spatial
area,
> > as indicated by VX_MASK. If that spatial area is large, it's
likely
> > that the relative similarity between the fields AWAY from the
large
> > event is averaging things out and making their performance look
more
> similar on average.
> >
> > So take a look at how you've defined the masking regions for Grid-
Stat.
> >
> > Thanks,
> > John
> >
> > On Wed, Oct 7, 2020 at 9:38 AM Julie Prestopnik via RT
> > <met_help at ucar.edu>
> > wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983 >
> > >
> > > Hi Efren.
> > >
> > > I just wanted to follow up on your previous email.  You
mentioned,
> > > "I was wondering about the difference between specifying
forecast
> > > bins of
> > > .2 versus .1 with cat_thresh = [==.1] or [==.2]".
> > >
> > > That information can be found on this page of the User's Guide:
> > >
> > >
https://dtcenter.github.io/MET/Users_Guide/data_io.html#configuratio
> > > n-
> > > file-details
> > >
> > > - Threshold:
> > > >       - A threshold type (<, <=, ==, !=, >=, or >) followed by
a
> > > > numeric
> > > value.
> > > >       - The threshold type may also be specified using two
letter
> > > abbreviations
> > > >         (lt, le, eq, ne, ge, gt).
> > > >       - Multiple thresholds may be combined by specifying the
> > > > logic type
> > > of AND
> > > >         (&&) or *OR (||)*. For example, ">=5&&<=10" defines
the
> > > > numbers
> > > between 5
> > > >         and 10 and *"==1||==2" defines numbers exactly equal
to 1
> > > > or
> > > > 2.*
> > > >
> > > >
> > > If that doesn't help and doesn't explain why you are not seeing
the
> > > differences you expect, please let me know and I will see if
John
> > > can better assist you.
> > >
> > > Thanks!
> > >
> > > Julie
> > >
> > > On Wed, Oct 7, 2020 at 9:32 AM efren.serra.ctr at nrlmry.navy.mil
via
> > > RT < met_help at ucar.edu> wrote:
> > >
> > > >
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983
>
> > > >
> > > > Attached are plots I made from pjc, prc and pstd output. Here
I'm
> > > > including the Brier score out of pstd data files. We are
comparing
> > > > two different probability models versus the same ground truth
and
> > > > we expect remarkable differences between the probabilistic
models,
> > > > but we don't see that in these plots. I have worked with John
> > > > before, so I was wondering
> > > if
> > > > you don't mind sharing this with him.
> > > >
> > > > -----Original Message-----
> > > > From: Julie Prestopnik via RT <met_help at ucar.edu>
> > > > Sent: Wednesday, October 7, 2020 7:41 AM
> > > > To: Serra, Mr. Efren, Contractor, Code 7531 <
> > > > efren.serra.ctr at nrlmry.navy.mil>
> > > > Subject: Re: [rt.rap.ucar.edu #96983] cat_thresh v8.1 versus
v9.1
> > > >
> > > > Hi Efren.
> > > >
> > > > I see that you are under the impression that Grid-Stat for
METv9.1
> > > > no longer includes cat_thresh.  However, if you look in the
> > > > met-9.1/share/met/config/GridStatConfig_default file, you can
see
> > > > that cat_thresh is still available for use in Grid-Stat:
> > > >
> > > > //
> > > > > // Forecast and observation fields to be verified // fcst =
{
> > > > >    field = [
> > > > >       {
> > > > >         name       = "APCP";
> > > > >         level      = [ "A03" ];
> > > > >         *cat_thresh *= [ >0.0, >=5.0 ];
> > > > >       }
> > > > >    ];
> > > > > }
> > > >
> > > >
> > > > You mentioned that the v9.1 documentation doesn't have
cat_thresh
> > > > described.  Please let us know if there is something in the
> > > > METv8.1 documentation that is missing from the METv9.1
> > > > documentation (specific quote, section, etc.) so that we can
> > > > ensure it's inclusion in the METv9.1 documentation.
> > > >
> > > > Thanks!
> > > >
> > > > Julie
> > > >
> > > > On Tue, Oct 6, 2020 at 4:27 PM efren.serra.ctr at nrlmry.navy.mil
via
> > > > RT < met_help at ucar.edu> wrote:
> > > >
> > > > >
> > > > > Tue Oct 06 16:27:16 2020: Request 96983 was acted upon.
> > > > > Transaction: Ticket created by
efren.serra.ctr at nrlmry.navy.mil
> > > > >        Queue: met_help
> > > > >      Subject: cat_thresh v8.1 versus v9.1
> > > > >        Owner: Nobody
> > > > >   Requestors: efren.serra.ctr at nrlmry.navy.mil
> > > > >       Status: new
> > > > >  Ticket <URL:
> > > > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96983
> > > > > >
> > > > >
> > > > >
> > > > > Folks - I updated my MET tools from v8.1 to v9.1. I had the
> > > > > following fcst entry for my GridStatConfig file(s) but I
noticed
> > > > > that v9.1 documentation doesn't have cat_thresh. What should
I
> > > > > map cat_thresh to? There are vld_thresh and cov_thresh?
Thank you.
> > > > >
> > > > > fcst = {
> > > > >
> > > > >    valid_time = "20180711_12";
> > > > >    field = [
> > > > >       {
> > > > >         file_type  = GRIB1;
> > > > >         model      = "WW3NAVGEM";
> > > > >         name       = "MFLX";
> > > > >         level      = "Z0";
> > > > >         prob       = TRUE;
> > > > >         cat_thresh = [ ==0.1 ];
> > > > >       }
> > > > >    ];
> > > > >
> > > > > }
> > > > >
> > > > > Efren A. Serra (Contractor)
> > > > > Physicist
> > > > >
> > > > > DeVine Consulting, Inc.
> > > > > Naval Research Laboratory
> > > > > Marine Meteorology Division
> > > > > 7 Grace Hopper Ave., STOP 2
> > > > > Monterey, CA 93943
> > > > > Code 7542
> > > > > Mobile: 408-425-5027
> > > > >
> > > > >
> > > > >
> > > >
> > > > --
> > > > Julie Prestopnik (she/her/hers)
> > > > Software Engineer
> > > > National Center for Atmospheric Research Research Applications
> > > > Laboratory
> > > > Email: jpresto at ucar.edu
> > > >
> > > > My working day may not be your working day.  Please do not
feel
> > > > obliged
> > > to
> > > > reply to this email outside of your normal working hours.
> > > >
> > > >
> > > >
> > >
> > > --
> > > Julie Prestopnik (she/her/hers)
> > > Software Engineer
> > > National Center for Atmospheric Research Research Applications
> > > Laboratory
> > > Email: jpresto at ucar.edu
> > >
> > > My working day may not be your working day.  Please do not feel
> > > obliged to reply to this email outside of your normal working
hours.
> > >
> > >
> >
> >
> >
> >
>
>
>
>

------------------------------------------------


More information about the Met_help mailing list