[Met_help] [rt.rap.ucar.edu #92879] History for Question about "prob_as_scalar" for ensemble probabilistic forecast verification

John Halley Gotway via RT met_help at ucar.edu
Wed Nov 6 16:45:25 MST 2019


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Hi MET Help Desk,

So the work I’ve done in recent weeks involved taking the ensemble relative frequency output from ensemble_stat and creating probabilistic verification output for precipitation. That has worked well for probability calculations from the ERF. However, I’d like to step back through grid_stat and compute the traditional 2x2 contingency table data in order to generate output for the BSS, FSS, and NBRCNT line types. While reading the user guide, I came across the flag “prob_as_scalar”, which from my current understanding would allow me to do the calculations mentioned above, but I was hoping for some clarification on how it works. From the probability standpoint, it’s still a little unclear to me how I can go about verifying the probability data as scalars and how I would set that up. Would I be reading in the ensemble relative frequency variable from the ensemble_stat output, or should I use the actual precipitation variables and set the threshold to the precipitation total that I choose?

Thanks!

-Brian
—————————————————————
Brian Matilla
Research Fellow— Warn-on-Forecast Team
Cooperative Institute for Mesoscale Meteorological Studies — The University of Oklahoma
NOAA National Severe Storms Laboratory

Phone: (405) 325-1688




----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Question about "prob_as_scalar" for ensemble probabilistic forecast verification
From: John Halley Gotway
Time: Thu Oct 31 11:43:57 2019

Brian,

I do not think the "prob_as_scalar" configuration flag option will be
useful to you.  Here's why...

When verifying the NetCDF output from Ensemble-Stat in Grid-Stat and
Point-Stat, I suspect that you're using "prob = TRUE;" in those config
files.  That tells Grid-Stat and Point-Stat to process the Ensemble-Stat
output as probability data for which probabilistic stats should be
computed.  But that "prob" option can be set in two ways in MET (see the
sketch after this list):
  - As a boolean to indicate that the data should be processed as
    probabilities.
  - Or as a dictionary defining the probability definition to be
    extracted from input GRIB files.
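
For illustration, here's a minimal sketch of the two forms.  The field
names are just examples (the ENS_FREQ name mirrors typical Ensemble-Stat
output, and the GRIB case reuses the SREF record shown below):

// Boolean form: the input field already contains probabilities.
fcst = {
   field = [ { name  = "APCP_24_A24_ENS_FREQ_gt0.0";
               level = "(*,*)";
               prob  = TRUE; } ];
}

// Dictionary form: select a probability record from a GRIB file.
fcst = {
   field = [ { name = "PROB"; level = "A12";
               prob = { name = "APCP"; thresh_lo = 1.27; }; } ];
}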

For example, here's some sample output from wgrib for NOAA Short-Range
Ensemble Forecast (SREF) data:

54:1188450:d=12040821:PROB:kpds5=191:kpds6=1:kpds7=0:TR=4:P1=0:P2=12:TimeU=1:sfc:0-12hr acc:ensemble:prob(APCP>0.250000):NAve=0
55:1221124:d=12040821:PROB:kpds5=191:kpds6=1:kpds7=0:TR=4:P1=0:P2=12:TimeU=1:sfc:0-12hr acc:ensemble:prob(APCP>1.270000):NAve=0
56:1253798:d=12040821:PROB:kpds5=191:kpds6=1:kpds7=0:TR=4:P1=0:P2=12:TimeU=1:sfc:0-12hr acc:ensemble:prob(APCP>2.540000):NAve=0

All of these records are encoded using the same GRIB code (191), but the
probability info is stored in an extended portion of the Product
Definition Section, so MET needs to be told which record should be
used.  For example, to process record number 55 in MET (the probability
of 12-hour APCP > 1.27), you'd use:

   name = "PROB"; level = "A12"; prob = { name = "APCP"; thresh_lo = 1.27; }

We added the "prob_as_scalar" option for the case when you need to
specify "prob" as a dictionary defining the GRIB data to be used, but
you want to process that data as a simple scalar field.  Since that is
not the situation you're in, I don't think it's useful to you.
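
For reference, if you ever do need it, a minimal sketch (reusing the
SREF record above; illustrative rather than a tested config) would be:

fcst = {
   field = [ { name = "PROB"; level = "A12";
               // Select the GRIB probability record for APCP > 1.27.
               prob = { name = "APCP"; thresh_lo = 1.27; };
               // But verify those probability values as a plain scalar field.
               prob_as_scalar = TRUE;
               cat_thresh = [ >0.5 ]; } ];
}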

Your situation sounds much more straightforward.  Just omit "prob =
TRUE;"... or set "prob = FALSE;"... and MET will process the data like
any other field of scalars and perform whatever comparisons you define.
But it's ultimately up to you to decide what comparisons and thresholds
are meaningful.

For example, let's say you have ensemble_stat output with a variable
named "APCP_24_A24_ENS_FREQ_gt0.0".  You could compare that to a precip
analysis, like StageIV or CCPA.  But you might want to define separate
thresholds for the forecast and observations:

fcst = {
   field = [ { name = "APCP_24_A24_ENS_FREQ_gt0.0"; level = "(*,*)";
               cat_thresh = [ >0, >0.1, >0.5 ]; } ];
}
obs = {
   field = [ { name = "APCP"; level = "A24";
               cat_thresh = [ >0, >0, >0 ]; } ];
}

This would produce 3 output 2x2 contingency tables (i.e. 3 CTC lines and
3 CTS lines)... one for fcst and obs both >0, one for fcst >0.1 and obs
>0, and one for fcst >0.5 and obs >0.  But these thresholds are entirely
up to you and should be driven by the scientific questions you're trying
to answer.
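
And since you asked about the FSS and NBRCNT line types: those come from
Grid-Stat's neighborhood verification methods, which you'd enable in the
same config file.  A minimal sketch (illustrative, not a tested config):

nbrhd = {
   // Neighborhood widths, in grid points, over which fractions are computed.
   width      = [ 1, 5, 9 ];
   // Fractional coverage threshold for the neighborhood contingency tables.
   cov_thresh = [ >=0.5 ];
}

output_flag = {
   ...
   nbrcnt = BOTH;   // writes the NBRCNT lines, which contain FSS
   ...
}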

Hope that helps clarify.

Thanks,
John


------------------------------------------------
Subject: Question about "prob_as_scalar" for ensemble probabilistic forecast verification
From: Brian Matilla - NOAA Affiliate
Time: Mon Nov 04 16:57:01 2019

Hi John,

Sorry for the delayed reply. Thanks very much for the detailed
explanation and for clearing up that question. It makes sense, and since
my precipitation data doesn’t contain PROB, I’ll just omit "prob" or use
“prob = FALSE” and go from there.

I guess my next question concerns the de-biasing of the data within MET.
Since I’m using v8.1.1, I noticed there’s the ability to de-bias the
data, which would be useful for getting a more refined verification
result for precip. The objective would be to de-bias the forecast and
observed precipitation values such that I could then discriminate events
based on the percentile under which the events fall (e.g. the 95th
percentile for HREF versus Stage IV), since that’s much closer to the
overall scientific question I am trying to answer. But before I go about
using the de-biasing technique, would it be best to use ensemble_stat
first to calculate the de-biased ensemble values and then use grid_stat
to calculate the percentile thresholds, or could grid_stat accomplish
both of those tasks in one go? Sorry if these seem like rather silly
questions; I’d just like to "measure twice and cut once” with the right
technique in MET.

Thanks!

-Brian


------------------------------------------------
Subject: Question about "prob_as_scalar" for ensemble probabilistic forecast verification
From: John Halley Gotway
Time: Tue Nov 05 09:51:20 2019

Brian,

I'm not sure if this exactly answers your question or not, but I'd
recommend considering two percentile options that are available in
met-8.1.

First, you could use percentile thresholds exactly as you've described,
to select the 95th percentile of forecast and observation values.  You'd
do that using thresholds of ">SFP95" and ">SOP95".  A second option is
to choose a real threshold on one side (e.g. >25.4) and set the other
side's threshold to "==FBIAS1".  That tells MET to pick the threshold
which results in a frequency bias of 1.  Both of these methods de-bias
the forecast, but the second option is tied to a tangible threshold.  In
fact, you can use both of these options in a single call to Grid-Stat.

For example...

fcst = {
   cat_thresh = [ >25.4, >25.4, >SFP90, >SFP95 ];
   ...
}
obs = {
   cat_thresh = [ >25.4, ==FBIAS1, >SOP90, >SOP95 ];
   ...
}

This will produce 4 output contingency tables with the specified
fcst/obs thresholds applied.

I don't understand exactly what you mean by "use ensemble_stat first to
calculate the debiased ensemble values".  The real question with
percentile thresholds is: what is the set of numbers from which the
percentiles are calculated?  For SFP, SOP, and FBIAS1, they are computed
from the current SAMPLE... meaning the values falling inside the current
verification region for the current day.  In other words, ">SFP95" is
recomputed from the forecast values in each verification region on each
day, rather than from a long-term climatology.

Hope that helps.

Thanks,
John


------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #92879] Question about "prob_as_scalar" for ensemble probabilistic forecast verification
From: Brian Matilla - NOAA Affiliate
Time: Wed Nov 06 16:35:54 2019

John,

Thanks! That actually does help clear up a few things. I’ll go back and
work out some of the details, but your advice definitely helps smooth
out some uncertainty, especially knowing that I can run both options
through grid_stat.

-Brian


------------------------------------------------
Subject: Question about "prob_as_scalar" for ensemble probabilistic forecast verification
From: John Halley Gotway
Time: Wed Nov 06 16:45:22 2019

Brian,

Great.  I'll go ahead and resolve this ticket.  But if additional issues
arise, feel free to write a new email to met_help at ucar.edu.

Thanks,
John


------------------------------------------------


More information about the Met_help mailing list