[Met_help] [rt.rap.ucar.edu #87043] History for Neighborhood contigency tables

John Halley Gotway via RT met_help at ucar.edu
Tue Sep 25 14:05:29 MDT 2018


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Good afternoon,

I am running grid_stat with a neighborhood, and am trying to find out how contingency tables (nbrctc) are populated. I can see where "forecast_yes" is defined (i.e., when fractional forecast coverage exceeds the user-defined cov_thresh) and vice versa for "forecast_no". Is the same logic applied to observations (i.e., when fractional observation coverage exceeds cov_thresh), or is "observed_yes" defined at each raw observation file grid point with no neighborhood consideration?

Thanks for your help!
Marcus


----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Neighborhood contigency tables
From: John Halley Gotway
Time: Fri Sep 21 16:30:08 2018

Hi Marcus,

I see you have a question about the NBRCTC output from the Grid-Stat
tool. When applying neighborhood methods, Grid-Stat can write output in
the NBRCNT, NBRCTC, and NBRCTS line types.

The NBRCNT output includes the Fractions Brier Score (FBS) and Fractions
Skill Score (FSS), which are by far the most commonly used neighborhood
statistics.
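
For readers unfamiliar with these two scores, here is a minimal Python
sketch of how they are commonly defined from a pair of fractional
coverage fields. It is an illustration only, not the Grid-Stat source:
the function name fbs_fss is hypothetical, the fields are passed as flat
lists, and missing data is not handled.

```python
def fbs_fss(fcst_frac, obs_frac):
    """Return (FBS, FSS) for two equally sized lists of fractional coverages.

    Hypothetical helper for illustration; not part of MET.
    """
    n = len(fcst_frac)
    if n == 0 or n != len(obs_frac):
        raise ValueError("inputs must be non-empty and the same length")

    # FBS: mean squared difference between forecast and observed fractions.
    fbs = sum((f - o) ** 2 for f, o in zip(fcst_frac, obs_frac)) / n

    # Reference FBS: the largest value FBS can take for these fractions,
    # reached when forecast and observed coverage never overlap.
    fbs_ref = (sum(f * f for f in fcst_frac) + sum(o * o for o in obs_frac)) / n

    # FSS is undefined when both fields are zero everywhere.
    fss = 1.0 - fbs / fbs_ref if fbs_ref > 0 else float("nan")
    return fbs, fss
```

A perfect match gives FBS = 0 and FSS = 1; fields whose coverage never
overlaps give FSS = 0.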

In fact, to my knowledge, the NBRCTC contingency table counts and the
corresponding NBRCTS contingency table statistics do not appear in any
published verification literature. For that reason, we've second-guessed
whether the tool should be computing them at all. With those caveats in
mind, here's how the logic works.

(1) In Grid-Stat, loop through the requested list of fields and compute
    stats for each one of them.
(2) For each forecast field/observation field combo, loop over the
    categorical thresholds (cat_thresh) requested.
(3) For each categorical threshold requested, loop over the neighborhood
    areas (nbrhd.width) requested.
(4) Apply the current forecast categorical threshold and neighborhood
    size to transform the raw forecast values into a fractional coverage
    field.
(5) Apply the current observation categorical threshold and neighborhood
    size to transform the raw observation values into a fractional
    coverage field. In both fields, the value at each grid point
    represents the frequency of that event in the neighborhood of that
    point.
(6) Compare the forecast and observation fractional coverage values
    directly to compute the stats for the NBRCNT line type.
(7) If the user has requested NBRCTC or NBRCTS output, loop over the
    coverage thresholds (nbrhd.cov_thresh) requested.
(8) For each coverage threshold, threshold both the forecast and
    observation fractional coverage fields to define events/non-events.
(9) Compare the forecast/observation events/non-events to compute the
    counts and statistics for the NBRCTC and NBRCTS line types.

The single "cov_thresh" threshold is applied to both the forecast and
observation fields. There's no way to specify them differently for each.
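
Steps (4) through (9) can be sketched in a few lines of Python. This is
only an illustration under stated assumptions, not the Grid-Stat source:
the function names are hypothetical, the neighborhood is a square that is
simply clipped at the grid edge, missing data is ignored, and the ">" and
">=" comparisons stand in for whatever operators the configured
cat_thresh and cov_thresh carry.

```python
def fractional_coverage(field, cat_thresh, width):
    """Fraction of points in each width x width box with value > cat_thresh.

    ASSUMPTION: boxes are clipped at the grid edge; MET's own edge and
    missing-data handling is not reproduced here.
    """
    ny, nx = len(field), len(field[0])
    half = width // 2
    out = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            hits = total = 0
            for jj in range(max(0, j - half), min(ny, j + half + 1)):
                for ii in range(max(0, i - half), min(nx, i + half + 1)):
                    total += 1
                    hits += field[jj][ii] > cat_thresh
            out[j][i] = hits / total
    return out


def nbrctc_counts(fcst, obs, cat_thresh, width, cov_thresh):
    """Return (fy_oy, fy_on, fn_oy, fn_on) for one threshold/width combo."""
    # Steps 4 and 5: raw values -> fractional coverage, for BOTH fields.
    f_cov = fractional_coverage(fcst, cat_thresh, width)
    o_cov = fractional_coverage(obs, cat_thresh, width)

    fy_oy = fy_on = fn_oy = fn_on = 0
    for f_row, o_row in zip(f_cov, o_cov):
        for f, o in zip(f_row, o_row):
            # Step 8: the same cov_thresh defines events in both fields.
            f_event = f >= cov_thresh
            o_event = o >= cov_thresh
            # Step 9: tally the 2 x 2 contingency table.
            if f_event and o_event:
                fy_oy += 1
            elif f_event:
                fy_on += 1
            elif o_event:
                fn_oy += 1
            else:
                fn_on += 1
    return fy_oy, fy_on, fn_oy, fn_on
```

The point relevant to the original question is that the observation
field goes through fractional_coverage() too; "observed yes" is not
taken from the raw observation grid points.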

Hope that helps clarify.  Just let me know what questions remain.

Thanks,
John Halley Gotway


On Wed, Sep 19, 2018 at 1:55 PM Johnson, Marcus R. via RT
<met_help at ucar.edu>
wrote:

>
> Wed Sep 19 13:55:11 2018: Request 87043 was acted upon.
> Transaction: Ticket created by marcus.johnson at ou.edu
>        Queue: met_help
>      Subject: Neighborhood contigency tables
>        Owner: Nobody
>   Requestors: marcus.johnson at ou.edu
>       Status: new
>  Ticket <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=87043 >
>
>

------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #87043] Neighborhood contigency tables
From: Johnson, Marcus R.
Time: Mon Sep 24 12:44:42 2018

Hi John,

Thank you for the thorough explanation. If I am reading correctly, a
"hit" is defined where fractional coverage exceeds the cov_thresh for
both forecasts and observations. When this threshold is set to >0, this
is the same as the "neighborhood maximum" (NM) approach, which Schwartz
(2017) (which also contains NM references) suggested for
neighborhood-based contingency tables. Therefore, there is precedent in
the literature, such that the user can be confident in the results.

Marcus




------------------------------------------------
Subject: Neighborhood contigency tables
From: John Halley Gotway
Time: Mon Sep 24 14:10:19 2018

Marcus,

Yes, I think that logic makes sense. Using cov_thresh > 0, a hit would
mean that the event (as defined by the cat_thresh) occurred somewhere in
that neighborhood.

However, there may be a more straightforward way of applying this logic.
The config file of the Grid-Stat tool includes a section called "interp".
For Point-Stat, the "interp" section defines how the gridded forecast
data should be interpolated to the point observation location. In
Grid-Stat, the "interp" option serves as a way of smoothing the data.
Let's say you set:

interp = {
   field      = FCST;
   vld_thresh = 1.0;
   shape      = CIRCLE;
   type = [
      {
         method = MAX;
         width  = 5;
      }
   ];
}

This tells Grid-Stat to smooth the forecast data by replacing the value
at each grid point with the maximum value found in the circle of
diameter 5 (the "width") surrounding the current grid point.

Once you've smoothed the forecast in this way, you can apply categorical
thresholds directly and compute contingency table statistics.

I *think* this would produce the same result as what you've described,
but would be more direct.  Make sense?
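
The equivalence suggested here can be checked with a small Python
sketch. It assumes both approaches use the same neighborhood (a square
clipped at the grid edge is used below for simplicity, where the config
above uses a circle) and a ">" comparison; the function names are
hypothetical and none of this is MET code.

```python
import random


def neighborhood(field, j, i, width):
    """Values in the width x width box centered on (j, i), clipped at edges."""
    half = width // 2
    ny, nx = len(field), len(field[0])
    return [field[jj][ii]
            for jj in range(max(0, j - half), min(ny, j + half + 1))
            for ii in range(max(0, i - half), min(nx, i + half + 1))]


def event_from_coverage(field, cat_thresh, width):
    """Event where the fractional coverage of (value > cat_thresh) is > 0."""
    events = []
    for j in range(len(field)):
        row = []
        for i in range(len(field[0])):
            values = neighborhood(field, j, i, width)
            coverage = sum(v > cat_thresh for v in values) / len(values)
            row.append(coverage > 0)
        events.append(row)
    return events


def event_from_max(field, cat_thresh, width):
    """Event where the neighborhood maximum itself exceeds cat_thresh."""
    return [[max(neighborhood(field, j, i, width)) > cat_thresh
             for i in range(len(field[0]))]
            for j in range(len(field))]


random.seed(0)
field = [[random.random() for _ in range(8)] for _ in range(6)]
same = event_from_coverage(field, 0.9, 3) == event_from_max(field, 0.9, 3)
```

The two event masks agree because the coverage is non-zero exactly when
at least one neighbor exceeds the threshold, which is the same condition
as the neighborhood maximum exceeding it. Note that the config above
smooths only the forecast (field = FCST), whereas cov_thresh is applied
to both fields, so the observation field would presumably need the same
treatment to mirror the NBRCTC result.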

Thanks,
John


------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #87043] Neighborhood contigency tables
From: Johnson, Marcus R.
Time: Mon Sep 24 16:28:40 2018

Hi John,

That makes perfect sense. I believe that it should produce the same
results as setting cov_thresh > 0. Just so I am clear, width refers to
the interpolation circle's diameter, correct? (i.e., width = 5 would
be a circle of diameter 5 centered on the current grid point)

Marcus




------------------------------------------------
Subject: Neighborhood contigency tables
From: John Halley Gotway
Time: Mon Sep 24 16:32:38 2018

Yes, that is correct.  Just like the width of a square is the length of
its side, the width of a circle is its diameter.

I realize that's confusing, since we're so used to talking about the
radius.

FYI, there's more detail about this listed in the file
"data/config/README", which I've cut-and-pasted below:

//      - The "width" entry is an integer which specifies the size of the
//        interpolation area. The area is either a square or circle
//        containing the observation point. The width value specifies the
//        width of the square or diameter of the circle. A width value of 1
//        is interpreted as the nearest neighbor model grid point to the
//        observation point. For squares, a width of 2 defines a 2 x 2 box
//        of grid points around the observation point (the 4 closest model
//        grid points), while a width of 3 defines a 3 x 3 box of grid
//        points around the observation point, and so on. For odd widths in
//        grid-to-point comparisons (i.e. Point-Stat), the interpolation
//        area is centered on the model grid point closest to the
//        observation point. For grid-to-grid comparisons (i.e. Grid-Stat),
//        the width must be odd.
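
To make the "width" convention concrete for the grid-to-grid case, the
Python sketch below lists which grid-point offsets fall inside a square
or circle of a given odd width. The square follows directly from the
README text. The circle membership rule is an assumption made for
illustration (radius of (width - 1) / 2 grid lengths, boundary
included), and the function names are hypothetical; MET's own template
code may decide edge points differently.

```python
def square_offsets(width):
    """(dj, di) offsets of the width x width box centered on a grid point."""
    if width < 1 or width % 2 == 0:
        raise ValueError("grid-to-grid comparisons need a positive odd width")
    half = width // 2
    return [(dj, di)
            for dj in range(-half, half + 1)
            for di in range(-half, half + 1)]


def circle_offsets(width):
    """Offsets inside a circle whose diameter spans `width` grid points.

    ASSUMPTION: a point is inside when its distance from the center is at
    most (width - 1) / 2 grid lengths. This rule is illustrative only.
    """
    half = width // 2
    return [(dj, di) for dj, di in square_offsets(width)
            if dj * dj + di * di <= half * half]
```

Under this rule a width of 5 reaches two grid points either side of the
center, so the circle spans five points along each axis, and a width of
1 reduces to the center point alone.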





------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #87043] Neighborhood contigency tables
From: Johnson, Marcus R.
Time: Tue Sep 25 13:35:45 2018

Great, thank you so much for your incredibly helpful and thorough
explanations. I really appreciate it.

Marcus




------------------------------------------------

