[Met_help] [rt.rap.ucar.edu #96562] History for Including Climatology in grid_stat Config File

Mon Sep 28 12:11:24 MDT 2020

----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Greetings,

For the first time I am attempting to calculate Brier Skill Score using
grid_stat from an input climatology file. I have created a probabilistic
flooding climatology file (spans from zero to one; image is here:
https://ftp.wpc.ncep.noaa.gov/erickson/test10/ALL_noMRGL_UFVS_ALL.png).
This climatology is static, so it doesn't change with time when inputting
the "model" and "observation" data. I believe I have successfully gotten
this to work using the command:

/opt/MET/90/bin/grid_stat ERO_s2020083112_e2020090112_vhr09.nc
ST4gFFG_s2020083112_e2020090112_vhr09.nc usethis -outdir ~

where ERO_s2020083112_e2020090112_vhr09.nc contains discrete forecast
probabilities of 0, 0.05, 0.1, 0.2, and 0.5,
ST4gFFG_s2020083112_e2020090112_vhr09.nc contains observation values of 0
or 1, and usethis is the configuration file.

Finally the climatology file that consists of "almost" continuous values
between 0 and 1 is named: UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc

I have put all of these files at
https://ftp.wpc.ncep.noaa.gov/erickson/DTC/ for
your reference.

As for my questions:

1) I was wondering if the climatology file was properly ingested and
calculated for my example? I believe it is correct given the output below,
but I wanted to make sure, since this is my first time doing this:

DEBUG 1: Forecast File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ERO_s2020083112_e2020090112_vhr09.nc
DEBUG 1: Observation File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ST4gFFG_s2020083112_e2020090112_vhr09.nc
DEBUG 3: Reading forecast data for EROSurface.
DEBUG 3: Reading observation data for ST4gFFGSurface.
DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_NcMet".
DEBUG 4:
DEBUG 4: Latitude/Longitude Grid Data:
DEBUG 4:      lat_ll: 25
DEBUG 4:      lon_ll: 129.8
DEBUG 4:   delta_lat: 0.09
DEBUG 4:   delta_lon: 0.09
DEBUG 4:        Nlat: 276
DEBUG 4:        Nlon: 721
DEBUG 4:
DEBUG 4: VarInfoFactory::new_var_info() -> created new VarInfo object of type "FileType_NcMet".
DEBUG 3: For forecast valid at 20200901_120000, found 1 climatology field(s) with valid time(s): 20201231_230000
DEBUG 3: Found 1 climatology fields.
DEBUG 3: Found 1 climatology mean and 0 climatology standard deviation field(s) for forecast EROSurface.
DEBUG 2: Processing masking regions.
DEBUG 3: Processing grid mask: FULL
DEBUG 4: parse_grid_mask() -> parsing grid mask "FULL"
DEBUG 2:
DEBUG 2: --------------------------------------------------------------------------------
DEBUG 2:
DEBUG 3: Smoothing field using the MAX(49) CircleTemplate interpolation method.
DEBUG 2: Processing EROSurface versus ST4gFFGSurface, for smoothing method MAX_CIRCLE(49), over region FULL, using 190638 matched pairs.
DEBUG 2: Computing Probabilistic Statistics.
DEBUG 2:
DEBUG 2: --------------------------------------------------------------------------------
DEBUG 2:
DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V.stat
DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V_pairs.nc

2) This question is a bit more basic. I am unable to manually calculate a
Brier Score value for the forecast and observation that properly matches
that in the stat file. My manually calculated Brier Score is systematically
lower. For this event, the stat file BS is 0.0119 and my value is 0.0116.
I've looked at C3 in the MET Tutorial guide
<https://dtcenter.org/sites/default/files/community-code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf>,
but I'm still at a bit of a loss. Is there a simple way I can replicate the
calculation seen in the stat file?

Thank you again for your help and please let me know if you have any
questions.

Mike

-- 
Michael J. Erickson

Research Scientist
Cooperative Institute for Research in Environmental Sciences (CIRES)
NOAA/NWS/Weather Prediction Center
Phone:  301-683-1546

----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Including Climatology in grid_stat Config File
From: Minna Win
Time: Thu Sep 03 14:11:00 2020

Hi Mike,

It looks like you have a few questions associated with calculating Brier
Skill Scores.  I'm assigning this ticket to John Halley Gotway.

Regards,
Minna
---------------
Minna Win
National Center for Atmospheric Research
Developmental Testbed Center
Phone: 303-497-8423
Fax:   303-497-8401

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: Michael Erickson - NOAA Affiliate
Time: Thu Sep 03 14:26:49 2020

Thank you Minna!

Mike

--
Michael J. Erickson

Research Scientist
Cooperative Institute for Research in Environmental Sciences (CIRES)
NOAA/NWS/Weather Prediction Center
Phone:  301-683-1546

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: John Halley Gotway
Time: Thu Sep 03 16:47:39 2020

Hi Mike,

Looks like you were able to make a lot of progress. I certainly don't see
anything wrong based on the log messages you sent.

I do notice that you're smoothing the observations with the maximum value
in a circle of diameter 9... presumably for a good reason. And I see that
smoothing step indicated in the log messages as well as the output .stat
file.

Two questions.

(1) I wanted to try running locally, but didn't find the "climo" file on
the WPC ftp site:
https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
Could you add that?

(2) When you say that you tried to replicate the Brier score computation,
what was your starting point? The raw input files, or the NetCDF matched
pairs output from Grid-Stat, which already includes the computation of the
observation maximums?

Thanks,
John Halley Gotway

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: John Halley Gotway
Time: Thu Sep 03 17:11:37 2020

Actually, I have a reasonable guess as to why you may be seeing a
difference.

All probabilistic verification in MET is based on an Nx2 probabilistic
contingency table. Those are the counts in the PCT line type. We do this
to make it easier to aggregate statistics across multiple cases, by
summing up contingency tables before recomputing statistics. The
pros/cons of this approach would probably be better addressed by a
statistician. So the stats are computed using probability bins and not
raw probability values.

If you computed the Brier score by hand, you probably did so using raw
probability values without binning them first.

And this difference could explain the type of discrepancy you're seeing.
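
As a rough illustration of the binning idea described above, here is a
sketch in plain Python. It is not MET's implementation: the function names
build_nx2_table and sum_tables, the bin edges, and the sample values are
all made up for illustration. Matched pairs are counted into an Nx2 table
per case, and tables from several cases are added cell by cell before any
statistic is recomputed.

```python
from bisect import bisect_right

def build_nx2_table(fcst_probs, obs_events, edges):
    """Count (forecast bin, event/non-event) pairs into an Nx2 table.

    edges are ascending bin boundaries (hypothetical values below);
    a forecast equal to the top edge is placed in the last bin.
    """
    n_bins = len(edges) - 1
    table = [[0, 0] for _ in range(n_bins)]  # [event count, non-event count]
    for p, o in zip(fcst_probs, obs_events):
        i = min(bisect_right(edges, p) - 1, n_bins - 1)
        table[i][0 if o == 1 else 1] += 1
    return table

def sum_tables(tables):
    """Aggregate several cases by adding their tables cell by cell."""
    return [[sum(t[i][j] for t in tables) for j in range(2)]
            for i in range(len(tables[0]))]

# Assumed bin edges and two tiny made-up cases.
edges = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]
case_a = build_nx2_table([0.0, 0.05, 0.2, 0.5], [0, 0, 1, 1], edges)
case_b = build_nx2_table([0.0, 0.0, 0.1, 0.5], [0, 0, 0, 1], edges)
total = sum_tables([case_a, case_b])
```

Once the pairs are reduced to counts like this, the original probability
values are no longer available; only the bin each one fell into is kept.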

To test this out, I reran your case...
(1) Using your original settings to confirm your Brier score of 0.011934.
(2) Using 10 equally-spaced probability bins (cat_thresh = [ ==0.1 ];)
which produced a Brier score of 0.013747.
(3) Using 50 equally-spaced probability bins (cat_thresh = [ ==0.02 ];)
which produced a Brier score of 0.01197.
(4) Using 100 equally-spaced probability bins (cat_thresh = [ ==0.01 ];)
which produced a Brier score of 0.01193.

I suppose that doesn't explain the exact discrepancy, but it could
definitely be involved.

Notice on this line of the Brier score computation in MET:
https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/contable_nx2.cc#L647

that the "probability" value returned by "row_proby()" is the mid-point
of the bin.
So all of your forecast probability values of 0%, which fall into the
first bin, are actually evaluated as having a probability value of 0.025,
which is the mid-point between 0 and 0.05 for the first bin.
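
To make the mid-point effect concrete, here is a small sketch in plain
Python. It is an illustration under stated assumptions, not MET's code:
brier_raw and brier_binned are hypothetical helper names, the bin edges
are assumed, and the data are made up. It scores 100 grid points that all
have a 0 forecast and no event, once with the raw values and once with
each forecast replaced by its bin mid-point.

```python
def brier_raw(fcst_probs, obs_events):
    """Brier score from raw probabilities: mean of (p - o)^2."""
    n = len(fcst_probs)
    return sum((p - o) ** 2 for p, o in zip(fcst_probs, obs_events)) / n

def brier_binned(fcst_probs, obs_events, edges):
    """Brier score with each forecast replaced by its bin mid-point."""
    mids = [(lo + hi) / 2 for lo, hi in zip(edges[:-1], edges[1:])]
    total = 0.0
    for p, o in zip(fcst_probs, obs_events):
        # Index of the last bin whose lower edge is <= p
        # (a forecast equal to the top edge lands in the last bin).
        i = max(k for k, lo in enumerate(edges[:-1]) if p >= lo)
        total += (mids[i] - o) ** 2
    return total / len(fcst_probs)

edges = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]  # assumed bin edges
fcst = [0.0] * 100                       # made-up data: all forecasts are 0
obs = [0] * 100                          # and no events occurred

raw = brier_raw(fcst, obs)               # perfect forecasts score 0
binned = brier_binned(fcst, obs, edges)  # each point contributes 0.025**2
```

With these inputs the raw score is 0 while the binned score is about
0.000625, which comes entirely from treating the 0 forecasts as 0.025.
Forecasts in the other bins are shifted to their mid-points as well, so
the net effect on a real case depends on how the events are distributed.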

Rerunning using the following to minimize that effect on the 0's:
cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
produces a Brier score of 0.011489.

So I'd say that the binning of the probability values is impacting the
Brier score out in the 4th decimal place.
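
The effect of the extra >=0.001 threshold can be sketched in plain Python
as follows. This is only an illustration: bin_midpoints is a hypothetical
helper, and the "original" edges are an assumption inferred from the
0-to-0.05 first bin mentioned above, not read from the actual config file.

```python
def bin_midpoints(thresholds):
    """Mid-point of each bin between consecutive thresholds."""
    return [(lo + hi) / 2 for lo, hi in zip(thresholds[:-1], thresholds[1:])]

original = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]        # assumed original bin edges
refined = [0.0, 0.001, 0.05, 0.1, 0.2, 0.5, 1.0]  # edges from the cat_thresh above

first_original = bin_midpoints(original)[0]  # mid-point of the 0 to 0.05 bin
first_refined = bin_midpoints(refined)[0]    # mid-point of the 0 to 0.001 bin

# Squared error charged to a 0 forecast when no event occurs.
penalty_original = first_original ** 2
penalty_refined = first_refined ** 2
```

Under these assumptions the first mid-point drops from 0.025 to 0.0005, so
each correct 0 forecast is charged far less, which is consistent with the
lower score from the rerun.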

John

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: Michael Erickson - NOAA Affiliate
Time: Fri Sep 04 05:07:35 2020

Hi John,

Thank you for your quick and helpful response! To answer your questions
from the first email:

1) I have included the climo file in case you wanted to see it:
https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc

2) I start from the NetCDF output from grid_stat, load that data into the
Python workspace, and compute the Brier score from that.

Also, the circle diameter of 9 in the observation file is to draw a 40 km
radius around the "observation."

From your latter email, it sounds like I may not be able to exactly
replicate the Brier Score calculation. In the spirit of best practices,
would you recommend I change cat_thresh to
"= [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];"
or keep my cat_thresh as it currently is, as long as I am consistent? I was
also wondering whether grid_stat bins the probabilities for the climo field
as it does for the probabilities in the forecast field.
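
The effect of the extra >=0.001 threshold can be illustrated with a small
Python sketch. It assumes, following John's description quoted below, that
each forecast probability is scored at the midpoint of the bin it falls
into. The function names and the toy data are hypothetical, and this is not
MET's actual code.

```python
import numpy as np

def raw_brier_score(prob, obs):
    """Brier score computed directly from the raw probabilities."""
    return float(np.mean((prob - obs) ** 2))

def binned_brier_score(prob, obs, thresholds):
    """Brier score with each forecast replaced by the midpoint of its bin.

    Assumption: bin i covers [thresholds[i], thresholds[i+1]), and a value
    equal to the last threshold is placed in the last bin.
    """
    edges = np.asarray(thresholds, dtype=float)
    mids = 0.5 * (edges[:-1] + edges[1:])
    idx = np.searchsorted(edges, prob, side="right") - 1
    idx = np.clip(idx, 0, len(mids) - 1)
    return float(np.mean((mids[idx] - obs) ** 2))

# Toy data (hypothetical): mostly zero-probability forecasts, rare events.
prob = np.array([0.0] * 90 + [0.05] * 5 + [0.2] * 3 + [0.5] * 2)
obs = np.array([0.0] * 95 + [0.0, 0.0, 1.0] + [1.0, 0.0])

orig_thresh = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]
extra_thresh = [0.0, 0.001, 0.05, 0.1, 0.2, 0.5, 1.0]

bs_raw = raw_brier_score(prob, obs)
bs_orig = binned_brier_score(prob, obs, orig_thresh)
bs_extra = binned_brier_score(prob, obs, extra_thresh)
print(bs_raw, bs_orig, bs_extra)
```

In this toy case the zero forecasts are scored at 0.025 with the original
thresholds and at 0.0005 with the extra threshold, so the second binned
score is lower than the first. Neither binned score equals the raw one,
because the non-zero forecasts are still scored at their bin midpoints.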

Thanks again!

Mike

On Thu, Sep 3, 2020 at 7:12 PM John Halley Gotway via RT
<met_help at ucar.edu>
wrote:

> Actually, I have a reasonable guess as to why you may be seeing a
> difference.
>
> All probabilistic verification in MET is based on an Nx2 probabilistic
> contingency table. Those are the counts in the PCT line type. We do this to
> make it easier to aggregate statistics across multiple cases, by summing
> up contingency tables before recomputing statistics. But the pros/cons of
> this approach would probably be better addressed by a statistician. So the
> stats are computed using probability bins and not raw probability values.
>
> If you computed the Brier score by hand, you probably did so using
> raw probability values and not binning them first.
>
> And this difference could explain the type of discrepancy you're seeing.
>
> To test this out, I reran your case...
> (1) Using your original settings to confirm your Brier score of 0.011934.
> (2) Using 10 equally-spaced probability bins (cat_thresh = [ ==0.1 ];)
> which produced a Brier score of 0.013747.
> (3) Using 50 equally-spaced probability bins (cat_thresh = [ ==0.02 ];)
> which produced a Brier score of 0.01197.
> (4) Using 100 equally-spaced probability bins (cat_thresh = [ ==0.01 ];)
> which produced a Brier score of 0.01193.
>
> I suppose that doesn't explain the exact discrepancy, but could definitely
> be involved.
>
> Notice that on this line of the Brier score computation in MET:
>
> https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/contable_nx2.cc#L647
>
> the "probability" value returned by "row_proby()" is the mid-point of
> the bin.
> So all of your forecast probability values of 0%, which fall into the first
> bin, are actually evaluated as having a probability value of 0.025, which is
> the mid-point between 0 and 0.05 for the first bin.
>
> Rerunning using the following to minimize that effect on the 0's:
> cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
> produces a Brier score of 0.011489.
>
> So I'd say that the binning of the probability values is impacting the
> Brier score out in the 4th decimal place.
>
> John
>
> On Thu, Sep 3, 2020 at 4:47 PM John Halley Gotway <johnhg at ucar.edu>
wrote:
>
> > Hi Mike,
> >
> > Looks like you were able to make a lot of progress. I certainly don't see
> > anything wrong based on the log messages you sent.
> >
> > I do notice that you're smoothing the observations with the maximum value
> > in a circle of diameter 9... presumably for a good reason. And I see that
> > smoothing step indicated in the log messages as well as the output .stat
> > file.
> >
> > Two questions.
> >
> > (1) I wanted to try running locally, but didn't find the "climo" file on
> > the WPC ftp site:
> >
> > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> >
> > Could you add that?
> >
> > (2) When you say that you tried to replicate the Brier score computation,
> > what was your starting point? The raw input files or using the NetCDF
> > matched pairs output from Grid-Stat, which already includes the computation
> > of the observation maximums?
> >
> > Thanks,
> > John Halley Gotway
> >
> > On Thu, Sep 3, 2020 at 2:26 PM Michael Erickson - NOAA Affiliate
via RT <
> > met_help at ucar.edu> wrote:
> >
> >>
> >> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> >>
> >> Thank you Minna!
> >>
> >> Mike
> >>
> >> On Thu, Sep 3, 2020 at 4:12 PM Minna Win via RT
<met_help at ucar.edu>
> >> wrote:
> >>
> >> > Hi Mike,
> >> >
> >> > It looks like you have a few questions associated with
calculating
> Brier
> >> > Skill Scores.  I'm assigning this ticket to John Halley Gotway.
> >> >
> >> > Regards,
> >> > Minna
> >> > ---------------
> >> > Minna Win
> >> > National Center for Atmospheric Research
> >> > Developmental Testbed Center
> >> > Phone: 303-497-8423
> >> > Fax:   303-497-8401
> >> >
> >> >
> >> >

--
Michael J. Erickson

Research Scientist
Cooperative Institute for Research in Environmental Sciences (CIRES)
NOAA/NWS/Weather Prediction Center
Phone:  301-683-1546

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: John Halley Gotway
Time: Fri Sep 04 10:25:17 2020

Mike,

I don't really have a recommendation on best practices with regard to the
binning of probability values.

I can say that I more commonly see people choose fixed bin widths, like
"==0.10" (for 10 bins) or "==0.05" (for 20 bins), instead of variable-width
bins, such as:
[ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ]

But I suspect that's more out of convenience than anything else. With
regard to your chosen bins, I suspect you set them up this way since you
have lots of low probability values closer to 0.0 and relatively few
probability values closer to 1.0. While this may be a good choice for
relatively rare events, it wouldn't be as good a choice for very common
events resulting in high probability values.

Choosing 20 bins (==0.05) would include all of your current bin boundaries
and enable you to sample evenly across the probability space, regardless of
whether the values are bunched near 0 or 1. And mathematically, your
current bins would be derivable from these.
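
As a rough illustration of that last point, the following Python sketch
builds the 21 bin edges implied by "==0.05", checks that the variable-width
boundaries are a subset of them, and sums the fine-bin counts into the
coarser bins. The helper names and the counts are made up for illustration;
nothing here is taken from MET itself.

```python
import numpy as np

def equal_width_edges(width):
    """Bin edges on [0, 1] for equal-width bins, e.g. width 0.05 -> 20 bins."""
    n_bins = int(round(1.0 / width))
    return np.linspace(0.0, 1.0, n_bins + 1)

def merge_counts(fine_counts, fine_edges, coarse_edges):
    """Sum fine-bin counts into the coarse bins that contain them."""
    fine_mids = 0.5 * (fine_edges[:-1] + fine_edges[1:])
    idx = np.searchsorted(coarse_edges, fine_mids, side="right") - 1
    merged = np.zeros(len(coarse_edges) - 1, dtype=fine_counts.dtype)
    np.add.at(merged, idx, fine_counts)
    return merged

fine_edges = equal_width_edges(0.05)
coarse_edges = np.array([0.0, 0.05, 0.1, 0.2, 0.5, 1.0])

# Every variable-width boundary is also one of the equal-width boundaries.
is_subset = all(np.isclose(fine_edges, e).any() for e in coarse_edges)

# Hypothetical counts for the 20 fine bins, merged into the 5 coarse bins.
fine_counts = np.arange(1, 21)
merged = merge_counts(fine_counts, fine_edges, coarse_edges)
print(is_subset, merged)
```

The merge only goes one way: counts in the 20 equal-width bins determine the
counts in the 5 variable-width bins, but not the reverse.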

But if your chosen bins follow some existing WPC convention, I don't see an
obvious reason to change them.

Please let me know if you'd like me to forward this question to one of the
statisticians in our group for their advice.

Thanks,
John


------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: John Halley Gotway
Time: Fri Sep 04 10:40:42 2020

Mike,

Two more things I forgot to address.

First, I pulled that climo field, but when I ran grid_stat with your usethis
config file, it did not actually read the climo data.

DEBUG 3: Found 0 climatology fields.

I'm wondering what additional configuration settings you used to make this
work?

Second, the answer to your question is yes. The exact same binning logic
used for the forecast probabilities is applied to the climo data. In fact,
the forecast probability bins are applied to both the forecast and climo
data. So you do not need to define separate "cat_thresh" settings for the
climo. They won't be used anyway.

Here's the spot in the library code where the climo probabilistic
contingency table is created using the forecast probability bins:

https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/compute_stats.cc#L767
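
A minimal Python sketch of the idea, under stated assumptions: the same bin
edges are used to build an Nx2 table for both the forecast and the climo
probabilities, each bin is scored at its midpoint, and the skill score uses
the common definition BSS = 1 - BS_fcst / BS_climo. That definition, the
function names, and the toy data are assumptions made for illustration, not
a transcription of the linked MET code.

```python
import numpy as np

def pct_table(prob, obs, edges):
    """Nx2 table: per-bin counts of events (obs == 1) and non-events."""
    edges = np.asarray(edges, dtype=float)
    n = len(edges) - 1
    idx = np.clip(np.searchsorted(edges, prob, side="right") - 1, 0, n - 1)
    events = np.bincount(idx, weights=(obs == 1).astype(float), minlength=n)
    nonevents = np.bincount(idx, weights=(obs == 0).astype(float), minlength=n)
    return np.column_stack([events, nonevents])

def brier_from_table(table, edges):
    """Brier score from an Nx2 table, scoring each bin at its midpoint."""
    edges = np.asarray(edges, dtype=float)
    mids = 0.5 * (edges[:-1] + edges[1:])
    events, nonevents = table[:, 0], table[:, 1]
    total = table.sum()
    return float((events * (mids - 1.0) ** 2 + nonevents * mids ** 2).sum() / total)

# One set of bin edges, shared by the forecast and the climatology.
edges = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]

# Toy data (hypothetical): ten grid points.
obs = np.array([0, 0, 0, 0, 0, 0, 0, 1, 0, 1])
fcst = np.array([0.0, 0.0, 0.0, 0.0, 0.05, 0.05, 0.1, 0.2, 0.2, 0.5])
climo = np.array([0.02, 0.02, 0.02, 0.03, 0.03, 0.06, 0.06, 0.12, 0.12, 0.3])

fcst_table = pct_table(fcst, obs, edges)
climo_table = pct_table(climo, obs, edges)

bs_fcst = brier_from_table(fcst_table, edges)
bs_climo = brier_from_table(climo_table, edges)
bss = 1.0 - bs_fcst / bs_climo  # assumed definition of the skill score
print(bs_fcst, bs_climo, bss)
```

Because the climo values are binned with the forecast thresholds, an "almost"
continuous climatology is reduced to the same handful of midpoints as the
forecast before any score is computed.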

Thanks,
John

On Fri, Sep 4, 2020 at 10:25 AM John Halley Gotway <johnhg at ucar.edu>
wrote:

> Mike,
>
> I don't really have a recommendation on best practices with regards
to the
> binning of probability values.
>
> I can say that I more commonly see people choose fixed bin widths,
like
> "==0.10" (for 10 bins) or "==0.05" (for 20 bins) instead of variable
width
> bins, such as:
> [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ]
>
> But I suspect that's more out of convenience than anything else.
With
> regards to your chosen bins, I suspect you set them up this way
since you
> have lots of low probability values closer to 0.0 and relatively few
> probability values closer to 1.0. While this may be a good choice
for
> relatively rare events, it wouldn't be as good of a choice for very
common
> events resulting in high probability values.
>
> Choosing 20 bins (==0.05) would include all of your current bin
boundaries
> and enable you to sample evenly across the probability space,
regardless of
> whether the values are bunched near 0 or 1. And mathematically, your
> current bins would be derivable from these.
>
> But if your chosen bins follow some existing WPC convention, I don't
see
> an obvious reason to change them.
>
> Please let me know if you'd like me to forward this question to one
of the
> statisticians in our group for their advice.
>
> Thanks,
> John
>
> On Fri, Sep 4, 2020 at 5:08 AM Michael Erickson - NOAA Affiliate via
RT <
> met_help at ucar.edu> wrote:
>
>>
>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
>>
>> Hi John,
>>
>> Thank you for your quick and helpful response! To answer your
questions
>> from the first email:
>>
>> 1) I have included the climo file in case you wanted to see it:
>>
>>
https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
>>
>> 2) I start from the netcdf output from grid_stat, load that data
into the
>> python workspace, and compute the brier score from that.
>>
>> Also the circle diameter of 9 in the observation file is to draw a
40 km
>> radius around the "observation."
>>
>> From your latter email, it sounds like I may not be able to exactly
>> replicate the Brier Score calculation. In the spirit of best
practices,
>> would you recommend I change cat_thresh  to "= [ >=0.0, >=0.001,
>=0.05,
>> >=0.1, >=0.2, >=0.5, >=1.0 ];" or keep my cat_thresh as it
currently is as
>> long as I am consistent? I was also wondering if grid_stat bins the
>> probabilities for the climo field as it does for the probabilities
in the
>> forecast field?
>>
>> Thanks again!
>>
>> Mike
>>
>> On Thu, Sep 3, 2020 at 7:12 PM John Halley Gotway via RT <
>> met_help at ucar.edu>
>> wrote:
>>
>> > Actually, I have a reasonable guess as to why you may be seeing a
>> > difference.
>> >
>> > All probabilistic verification in MET is based on an Nx2 probabilistic
>> > contingency table. Those are the counts in the PCT line type. We do this
>> > to make it easier to aggregate statistics across multiple cases, by
>> > summing up contingency tables before recomputing statistics. But the
>> > pros/cons of this approach would probably be better addressed by a
>> > statistician. So the stats are computed using probability bins and not
>> > raw probability values.
>> >
>> > If you went and computed the Brier score by hand, you probably did so
>> > using raw probability values and not binning them first.
>> >
>> > And this difference could explain the type of discrepancy you're seeing.
>> >
>> > To test this out, I reran your case...
>> > (1) Using your original settings to confirm your Brier score of 0.011934.
>> > (2) Using 10 equally-spaced probability bins (cat_thresh = [ ==0.1 ];)
>> > which produced a Brier score of 0.013747.
>> > (3) Using 50 equally-spaced probability bins (cat_thresh = [ ==0.02 ];)
>> > which produced a Brier score of 0.01197.
>> > (4) Using 100 equally-spaced probability bins (cat_thresh = [ ==0.01 ];)
>> > which produced a Brier score of 0.01193.
>> >
>> > I suppose that doesn't explain the exact discrepancy, but it could
>> > definitely be involved.
>> >
>> > Notice that on this line of the Brier score computation in MET:
>> >
>> > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/contable_nx2.cc#L647
>> >
>> > the "probability" value returned by "row_proby()" is the mid-point of
>> > the bin. So all of your forecast probability values of 0% which fall into
>> > the first bin are actually evaluated as having a probability value of
>> > 0.025, which is the mid-point between 0 and 0.05 for the first bin.
>> >
>> > Rerunning using the following to minimize that effect on the 0's:
>> > cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
>> > produces a Brier score of 0.011489.
>> >
>> > So I'd say that the binning of the probability values is impacting the
>> > Brier score out in the 4th decimal place.
>> >
>> > John
>> >
>> > On Thu, Sep 3, 2020 at 4:47 PM John Halley Gotway <johnhg at ucar.edu>
>> > wrote:
>> >
>> > > Hi Mike,
>> > >
>> > > Looks like you were able to make a lot of progress. I certainly don't
>> > > see anything wrong based on the log messages you sent.
>> > >
>> > > I do notice that you're smoothing the observations with the maximum
>> > > value in a circle of diameter 9... presumably for a good reason. And I
>> > > see that smoothing step indicated in the log messages as well as the
>> > > output .stat file.
>> > >
>> > > Two questions.
>> > >
>> > > (1) I wanted to try running locally, but didn't find the "climo" file
>> > > on the WPC ftp site:
>> > >
>> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
>> > >
>> > > Could you add that?
>> > >
>> > > (2) When you say that you tried to replicate the Brier score
>> > > computation, what was your starting point? The raw input files or using
>> > > the NetCDF matched pairs output from Grid-Stat which already include
>> > > the computation of the observation maximums?
>> > >
>> > > Thanks,
>> > > John Halley Gotway
>> > >
>> > > On Thu, Sep 3, 2020 at 2:26 PM Michael Erickson - NOAA
Affiliate via
>> RT <
>> > > met_help at ucar.edu> wrote:
>> > >
>> > >>
>> > >> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
>
>> > >>
>> > >> Thank you Minna!
>> > >>
>> > >> Mike
>> > >>
>> > >> On Thu, Sep 3, 2020 at 4:12 PM Minna Win via RT
<met_help at ucar.edu>
>> > >> wrote:
>> > >>
>> > >> > Hi Mike,
>> > >> >
>> > >> > It looks like you have a few questions associated with
calculating
>> > Brier
>> > >> > Skill Scores.  I'm assigning this ticket to John Halley
Gotway.
>> > >> >
>> > >> > Regards,
>> > >> > Minna
>> > >> > ---------------
>> > >> > Minna Win
>> > >> > National Center for Atmospheric Research
>> > >> > Developmental Testbed Center
>> > >> > Phone: 303-497-8423
>> > >> > Fax:   303-497-8401
>> > >> >
>> > >> >
>> > >> >
>> > >> > On Thu, Sep 3, 2020 at 1:13 PM Michael Erickson - NOAA
Affiliate
>> via
>> > RT
>> > >> <
>> > >> > met_help at ucar.edu> wrote:
>> > >> >
>> > >> > >
>> > >> > > Thu Sep 03 13:13:26 2020: Request 96562 was acted upon.
>> > >> > > Transaction: Ticket created by michael.j.erickson at noaa.gov
>> > >> > >        Queue: met_help
>> > >> > >      Subject: Including Climatology in grid_stat Config
File
>> > >> > >        Owner: Nobody
>> > >> > >   Requestors: michael.j.erickson at noaa.gov
>> > >> > >       Status: new
>> > >> > >  Ticket <URL:
>> > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
>> > >> >
>> > >> > >
>> > >> > >
>> > >> > > Greetings,
>> > >> > >
>> > >> > > For the first time I am attempting to calculate Brier Skill Score
>> > >> > > using grid_stat from an input climatology file. I have created a
>> > >> > > probabilistic flooding climatology file (spans from zero to one;
>> > >> > > image is here:
>> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/test10/ALL_noMRGL_UFVS_ALL.png).
>> > >> > > This climatology is static, so it doesn't change with time when
>> > >> > > inputting the "model" and "observation" data. I believe I have
>> > >> > > successfully gotten this to work using the command:
>> > >> > >
>> > >> > > /opt/MET/90/bin/grid_stat ERO_s2020083112_e2020090112_vhr09.nc
>> > >> > > ST4gFFG_s2020083112_e2020090112_vhr09.nc usethis -outdir ~
>> > >> > >
>> > >> > > where grid_stat ERO_s2020083112_e2020090112_vhr09.nc are discrete
>> > >> > > forecast probabilities of 0, 0.05, 0.1, 0.2, and 0.5
>> > >> > > where ST4gFFG_s2020083112_e2020090112_vhr09.nc are observation
>> > >> > > values of 0 or 1
>> > >> > > and usethis is the configuration file
>> > >> > >
>> > >> > > Finally the climatology file that consists of "almost" continuous
>> > >> > > values between 0 and 1 is named:
>> > >> > > UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
>> > >> > >
>> > >> > > I have put all of these files at
>> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/ for your reference.
>> > >> > >
>> > >> > > As for my questions:
>> > >> > >
>> > >> > > 1) I was wondering if the climatology file was properly ingested
>> > >> > > and calculated for my example? I believe it is correct given the
>> > >> > > output below, but I wanted to make sure, since this is my first
>> > >> > > time doing this:
>> > >> > >
>> > >> > > DEBUG 1: Forecast File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ERO_s2020083112_e2020090112_vhr09.nc
>> > >> > > DEBUG 1: Observation File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ST4gFFG_s2020083112_e2020090112_vhr09.nc
>> > >> > > DEBUG 3: Reading forecast data for EROSurface.
>> > >> > > DEBUG 3: Reading observation data for ST4gFFGSurface.
>> > >> > > DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_NcMet".
>> > >> > > DEBUG 4:
>> > >> > > DEBUG 4: Latitude/Longitude Grid Data:
>> > >> > > DEBUG 4:      lat_ll: 25
>> > >> > > DEBUG 4:      lon_ll: 129.8
>> > >> > > DEBUG 4:   delta_lat: 0.09
>> > >> > > DEBUG 4:   delta_lon: 0.09
>> > >> > > DEBUG 4:        Nlat: 276
>> > >> > > DEBUG 4:        Nlon: 721
>> > >> > > DEBUG 4:
>> > >> > > DEBUG 4: VarInfoFactory::new_var_info() -> created new VarInfo object of type "FileType_NcMet".
>> > >> > > DEBUG 3: For forecast valid at 20200901_120000, found 1 climatology field(s) with valid time(s): 20201231_230000
>> > >> > > DEBUG 3: Found 1 climatology fields.
>> > >> > > DEBUG 3: Found 1 climatology mean and 0 climatology standard deviation field(s) for forecast EROSurface.
>> > >> > > DEBUG 2: Processing masking regions.
>> > >> > > DEBUG 3: Processing grid mask: FULL
>> > >> > > DEBUG 4: parse_grid_mask() -> parsing grid mask "FULL"
>> > >> > > DEBUG 2:
>> > >> > > DEBUG 2: --------------------------------------------------------------------------------
>> > >> > > DEBUG 2:
>> > >> > > DEBUG 3: Smoothing field using the MAX(49) CircleTemplate interpolation method.
>> > >> > > DEBUG 2: Processing EROSurface versus ST4gFFGSurface, for smoothing method MAX_CIRCLE(49), over region FULL, using 190638 matched pairs.
>> > >> > > DEBUG 2: Computing Probabilistic Statistics.
>> > >> > > DEBUG 2:
>> > >> > > DEBUG 2: --------------------------------------------------------------------------------
>> > >> > > DEBUG 2:
>> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V.stat
>> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V_pairs.nc
>> > >> > >
>> > >> > >
>> > >> > > 2) This question is a bit more basic. I am unable to manually
>> > >> > > calculate a Brier Score value for the forecast and observation
>> > >> > > that properly matches that in the stat file. My manually
>> > >> > > calculated Brier Score is systematically lower. For this event,
>> > >> > > the stat file BS is 0.0119 and my value is 0.0116. I've looked at
>> > >> > > C3 in the MET Tutorial guide
>> > >> > > <https://dtcenter.org/sites/default/files/community-code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf>,
>> > >> > > but I'm still at a bit of a loss. Is there a simple way I can
>> > >> > > replicate the calculation seen in the stat file?
>> > >> > >
>> > >> > > Thank you again for your help and please let me know if you have
>> > >> > > any questions.
>> > >> > >
>> > >> > > Mike
>> > >> > >
>> > >> > > --
>> > >> > > Michael J. Erickson
>> > >> > >
>> > >> > > Research Scientist
>> > >> > > Cooperative Institute for Research in Environmental
Sciences
>> (CIRES)
>> > >> > > NOAA/NWS/Weather Prediction Center
>> > >> > > Phone:  301-683-1546
>> > >> > >
>> > >> > >
>> > >> >
>> > >> >
>> > >>
>> > >> --
>> > >> Michael J. Erickson
>> > >>
>> > >> Research Scientist
>> > >> Cooperative Institute for Research in Environmental Sciences
(CIRES)
>> > >> NOAA/NWS/Weather Prediction Center
>> > >> Phone:  301-683-1546
>> > >>
>> > >>
>> >
>> >
>>
>> --
>> Michael J. Erickson
>>
>> Research Scientist
>> Cooperative Institute for Research in Environmental Sciences
(CIRES)
>> NOAA/NWS/Weather Prediction Center
>> Phone:  301-683-1546
>>
>>
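
Note on the mid-point binning described in the quoted message above: the
effect can be reproduced outside MET with a small Python sketch. This is an
illustration under stated assumptions, not MET's implementation. The function
names and the toy data are made up, and the only behaviour borrowed from the
thread is that each forecast probability is replaced by the mid-point of its
bin before the Brier score is computed. A value that sits exactly on a bin
boundary is assumed to fall into the bin that starts there, matching the
">=" thresholds.

```python
from bisect import bisect_right


def brier_raw(probs, obs):
    """Brier score from raw probabilities: mean of (p - o)^2."""
    return sum((p - o) ** 2 for p, o in zip(probs, obs)) / len(probs)


def brier_binned(probs, obs, edges):
    """Brier score after replacing each probability with its bin mid-point.

    `edges` are ascending bin boundaries, e.g. [0.0, 0.05, 0.1, 0.2, 0.5, 1.0].
    A probability equal to the last edge is placed in the last bin.
    """
    last_bin = len(edges) - 2
    total = 0.0
    for p, o in zip(probs, obs):
        i = min(bisect_right(edges, p) - 1, last_bin)
        mid = 0.5 * (edges[i] + edges[i + 1])
        total += (mid - o) ** 2
    return total / len(probs)


# Toy data (hypothetical): mostly 0% forecasts and a handful of events.
probs = [0.0] * 90 + [0.05] * 6 + [0.2] * 3 + [0.5]
obs = [0] * 90 + [0] * 5 + [1] + [0, 0, 1] + [1]

ero_edges = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]
zero_split_edges = [0.0, 0.001, 0.05, 0.1, 0.2, 0.5, 1.0]

bs_raw = brier_raw(probs, obs)
bs_ero = brier_binned(probs, obs, ero_edges)
bs_split = brier_binned(probs, obs, zero_split_edges)
print(f"raw={bs_raw:.6f} binned={bs_ero:.6f} zero_split={bs_split:.6f}")
```

With the first bin spanning 0 to 0.05, every 0% forecast is scored as 0.025.
Adding a boundary at 0.001 moves that mid-point to 0.0005, which is why the
zero_split value comes out lower than the binned one in this toy case.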

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: Michael Erickson - NOAA Affiliate
Time: Fri Sep 04 10:42:58 2020

Hi John,

Thank you for your response. I chose those specific probability thresholds
since I am comparing WPC's Excessive Rainfall Outlooks (which only have
values of 0, 0.05, 0.1, 0.2 or 0.5) to observations (which have values of 0
or 1).

If you don't mind forwarding this discussion to one of the statisticians,
I would appreciate it. I want to make sure that I am computing BS/BSS
properly. There are other collaborators who are doing similar work to mine,
and I want to make sure I can explain any potential differences between my
numbers and theirs.

Thanks!

Mike

On Fri, Sep 4, 2020 at 12:25 PM John Halley Gotway via RT <met_help at ucar.edu>
wrote:

> Mike,
>
> I don't really have a recommendation on best practices with regards to the
> binning of probability values.
>
> I can say that I more commonly see people choose fixed bin widths, like
> "==0.10" (for 10 bins) or "==0.05" (for 20 bins) instead of variable width
> bins, such as:
> [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ]
>
> But I suspect that's more out of convenience than anything else. With
> regards to your chosen bins, I suspect you set them up this way since you
> have lots of low probability values closer to 0.0 and relatively few
> probability values closer to 1.0. While this may be a good choice for
> relatively rare events, it wouldn't be as good of a choice for very common
> events resulting in high probability values.
>
> Choosing 20 bins (==0.05) would include all of your current bin boundaries
> and enable you to sample evenly across the probability space, regardless of
> whether the values are bunched near 0 or 1. And mathematically, your
> current bins would be derivable from these.
>
> But if your chosen bins follow some existing WPC convention, I don't see an
> obvious reason to change them.
>
> Please let me know if you'd like me to forward this question to one of the
> statisticians in our group for their advice.
>
> Thanks,
> John
>
> On Fri, Sep 4, 2020 at 5:08 AM Michael Erickson - NOAA Affiliate via RT <
> met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> >
> > Hi John,
> >
> > Thank you for your quick and helpful response! To answer your questions
> > from the first email:
> >
> > 1) I have included the climo file in case you wanted to see it:
> >
> > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> >
> > 2) I start from the netcdf output from grid_stat, load that data into the
> > python workspace, and compute the brier score from that.
> >
> > Also the circle diameter of 9 in the observation file is to draw a 40 km
> > radius around the "observation."
> >
> > From your latter email, it sounds like I may not be able to exactly
> > replicate the Brier Score calculation. In the spirit of best practices,
> > would you recommend I change cat_thresh to
> > "= [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];"
> > or keep my cat_thresh as it currently is as long as I am consistent? I
> > was also wondering if grid_stat bins the probabilities for the climo
> > field as it does for the probabilities in the forecast field?
> >
> > Thanks again!
> >
> > Mike
> >
> > On Thu, Sep 3, 2020 at 7:12 PM John Halley Gotway via RT <met_help at ucar.edu>
> > wrote:
> >
> > > Actually, I have a reasonable guess as to why you may be seeing a
> > > difference.
> > >
> > > All probabilistic verification in MET is based on an Nx2 probabilistic
> > > contingency table. Those are the counts in the PCT line type. We do
> > > this to make it easier to aggregate statistics across multiple cases,
> > > by summing up contingency tables before recomputing statistics. But the
> > > pros/cons of this approach would probably be better addressed by a
> > > statistician. So the stats are computed using probability bins and not
> > > raw probability values.
> > >
> > > If you went and computed the Brier score by hand, you probably did so
> > > using raw probability values and not binning them first.
> > >
> > > And this difference could explain the type of discrepancy you're
> > > seeing.
> > >
> > > To test this out, I reran your case...
> > > (1) Using your original settings to confirm your Brier score of
> > > 0.011934.
> > > (2) Using 10 equally-spaced probability bins (cat_thresh = [ ==0.1 ];)
> > > which produced a Brier score of 0.013747.
> > > (3) Using 50 equally-spaced probability bins (cat_thresh = [ ==0.02 ];)
> > > which produced a Brier score of 0.01197.
> > > (4) Using 100 equally-spaced probability bins (cat_thresh = [ ==0.01 ];)
> > > which produced a Brier score of 0.01193.
> > >
> > > I suppose that doesn't explain the exact discrepancy, but it could
> > > definitely be involved.
> > >
> > > Notice that on this line of the Brier score computation in MET:
> > >
> > > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/contable_nx2.cc#L647
> > >
> > > the "probability" value returned by "row_proby()" is the mid-point of
> > > the bin. So all of your forecast probability values of 0% which fall
> > > into the first bin are actually evaluated as having a probability value
> > > of 0.025, which is the mid-point between 0 and 0.05 for the first bin.
> > >
> > > Rerunning using the following to minimize that effect on the 0's:
> > > cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
> > > produces a Brier score of 0.011489.
> > >
> > > So I'd say that the binning of the probability values is impacting the
> > > Brier score out in the 4th decimal place.
> > >
> > > John
> > >
> > > On Thu, Sep 3, 2020 at 4:47 PM John Halley Gotway <johnhg at ucar.edu>
> > > wrote:
> > >
> > > > Hi Mike,
> > > >
> > > > Looks like you were able to make a lot of progress. I certainly don't
> > > > see anything wrong based on the log messages you sent.
> > > >
> > > > I do notice that you're smoothing the observations with the maximum
> > > > value in a circle of diameter 9... presumably for a good reason. And
> > > > I see that smoothing step indicated in the log messages as well as
> > > > the output .stat file.
> > > >
> > > > Two questions.
> > > >
> > > > (1) I wanted to try running locally, but didn't find the "climo" file
> > > > on the WPC ftp site:
> > > >
> > > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > > >
> > > > Could you add that?
> > > >
> > > > (2) When you say that you tried to replicate the Brier score
> > > > computation, what was your starting point? The raw input files or
> > > > using the NetCDF matched pairs output from Grid-Stat which already
> > > > include the computation of the observation maximums?
> > > >
> > > > Thanks,
> > > > John Halley Gotway
> > > >
> > > > On Thu, Sep 3, 2020 at 2:26 PM Michael Erickson - NOAA
Affiliate via
> > RT <
> > > > met_help at ucar.edu> wrote:
> > > >
> > > >>
> > > >> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
>
> > > >>
> > > >> Thank you Minna!
> > > >>
> > > >> Mike
> > > >>
> > > >> On Thu, Sep 3, 2020 at 4:12 PM Minna Win via RT
<met_help at ucar.edu>
> > > >> wrote:
> > > >>
> > > >> > Hi Mike,
> > > >> >
> > > >> > It looks like you have a few questions associated with
calculating
> > > Brier
> > > >> > Skill Scores.  I'm assigning this ticket to John Halley
Gotway.
> > > >> >
> > > >> > Regards,
> > > >> > Minna
> > > >> > ---------------
> > > >> > Minna Win
> > > >> > National Center for Atmospheric Research
> > > >> > Developmental Testbed Center
> > > >> > Phone: 303-497-8423
> > > >> > Fax:   303-497-8401
> > > >> >
> > > >> >
> > > >> >
> > > >> > On Thu, Sep 3, 2020 at 1:13 PM Michael Erickson - NOAA
Affiliate
> via
> > > RT
> > > >> <
> > > >> > met_help at ucar.edu> wrote:
> > > >> >
> > > >> > >
> > > >> > > Thu Sep 03 13:13:26 2020: Request 96562 was acted upon.
> > > >> > > Transaction: Ticket created by
michael.j.erickson at noaa.gov
> > > >> > >        Queue: met_help
> > > >> > >      Subject: Including Climatology in grid_stat Config
File
> > > >> > >        Owner: Nobody
> > > >> > >   Requestors: michael.j.erickson at noaa.gov
> > > >> > >       Status: new
> > > >> > >  Ticket <URL:
> > > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> > > >> >
> > > >> > >
> > > >> > >
> > > >> > > Greetings,
> > > >> > >
> > > >> > > For the first time I am attempting to calculate Brier Skill
> > > >> > > Score using grid_stat from an input climatology file. I have
> > > >> > > created a probabilistic flooding climatology file (spans from
> > > >> > > zero to one; image is here:
> > > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/test10/ALL_noMRGL_UFVS_ALL.png).
> > > >> > > This climatology is static, so it doesn't change with time when
> > > >> > > inputting the "model" and "observation" data. I believe I have
> > > >> > > successfully gotten this to work using the command:
> > > >> > >
> > > >> > > /opt/MET/90/bin/grid_stat ERO_s2020083112_e2020090112_vhr09.nc
> > > >> > > ST4gFFG_s2020083112_e2020090112_vhr09.nc usethis -outdir ~
> > > >> > >
> > > >> > > where grid_stat ERO_s2020083112_e2020090112_vhr09.nc are discrete
> > > >> > > forecast probabilities of 0, 0.05, 0.1, 0.2, and 0.5
> > > >> > > where ST4gFFG_s2020083112_e2020090112_vhr09.nc are observation
> > > >> > > values of 0 or 1
> > > >> > > and usethis is the configuration file
> > > >> > >
> > > >> > > Finally the climatology file that consists of "almost" continuous
> > > >> > > values between 0 and 1 is named:
> > > >> > > UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > > >> > >
> > > >> > > I have put all of these files at
> > > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/ for your reference.
> > > >> > >
> > > >> > > As for my questions:
> > > >> > >
> > > >> > > 1) I was wondering if the climatology file was properly ingested
> > > >> > > and calculated for my example? I believe it is correct given the
> > > >> > > output below, but I wanted to make sure, since this is my first
> > > >> > > time doing this:
> > > >> > >
> > > >> > > DEBUG 1: Forecast File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ERO_s2020083112_e2020090112_vhr09.nc
> > > >> > > DEBUG 1: Observation File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ST4gFFG_s2020083112_e2020090112_vhr09.nc
> > > >> > > DEBUG 3: Reading forecast data for EROSurface.
> > > >> > > DEBUG 3: Reading observation data for ST4gFFGSurface.
> > > >> > > DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_NcMet".
> > > >> > > DEBUG 4:
> > > >> > > DEBUG 4: Latitude/Longitude Grid Data:
> > > >> > > DEBUG 4:      lat_ll: 25
> > > >> > > DEBUG 4:      lon_ll: 129.8
> > > >> > > DEBUG 4:   delta_lat: 0.09
> > > >> > > DEBUG 4:   delta_lon: 0.09
> > > >> > > DEBUG 4:        Nlat: 276
> > > >> > > DEBUG 4:        Nlon: 721
> > > >> > > DEBUG 4:
> > > >> > > DEBUG 4: VarInfoFactory::new_var_info() -> created new VarInfo object of type "FileType_NcMet".
> > > >> > > DEBUG 3: For forecast valid at 20200901_120000, found 1 climatology field(s) with valid time(s): 20201231_230000
> > > >> > > DEBUG 3: Found 1 climatology fields.
> > > >> > > DEBUG 3: Found 1 climatology mean and 0 climatology standard deviation field(s) for forecast EROSurface.
> > > >> > > DEBUG 2: Processing masking regions.
> > > >> > > DEBUG 3: Processing grid mask: FULL
> > > >> > > DEBUG 4: parse_grid_mask() -> parsing grid mask "FULL"
> > > >> > > DEBUG 2:
> > > >> > > DEBUG 2: --------------------------------------------------------------------------------
> > > >> > > DEBUG 2:
> > > >> > > DEBUG 3: Smoothing field using the MAX(49) CircleTemplate interpolation method.
> > > >> > > DEBUG 2: Processing EROSurface versus ST4gFFGSurface, for smoothing method MAX_CIRCLE(49), over region FULL, using 190638 matched pairs.
> > > >> > > DEBUG 2: Computing Probabilistic Statistics.
> > > >> > > DEBUG 2:
> > > >> > > DEBUG 2: --------------------------------------------------------------------------------
> > > >> > > DEBUG 2:
> > > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V.stat
> > > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V_pairs.nc
> > > >> > >
> > > >> > >
> > > >> > > 2) This question is a bit more basic. I am unable to manually
> > > >> > > calculate a Brier Score value for the forecast and observation
> > > >> > > that properly matches that in the stat file. My manually
> > > >> > > calculated Brier Score is systematically lower. For this event,
> > > >> > > the stat file BS is 0.0119 and my value is 0.0116. I've looked at
> > > >> > > C3 in the MET Tutorial guide
> > > >> > > <https://dtcenter.org/sites/default/files/community-code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf>,
> > > >> > > but I'm still at a bit of a loss. Is there a simple way I can
> > > >> > > replicate the calculation seen in the stat file?
> > > >> > >
> > > >> > > Thank you again for your help and please let me know if you have
> > > >> > > any questions.
> > > >> > >
> > > >> > > Mike
> > > >> > >
> > > >> > > --
> > > >> > > Michael J. Erickson
> > > >> > >
> > > >> > > Research Scientist
> > > >> > > Cooperative Institute for Research in Environmental
Sciences
> > (CIRES)
> > > >> > > NOAA/NWS/Weather Prediction Center
> > > >> > > Phone:  301-683-1546
> > > >> > >
> > > >> > >
> > > >> >
> > > >> >
> > > >>
> > > >> --
> > > >> Michael J. Erickson
> > > >>
> > > >> Research Scientist
> > > >> Cooperative Institute for Research in Environmental Sciences
(CIRES)
> > > >> NOAA/NWS/Weather Prediction Center
> > > >> Phone:  301-683-1546
> > > >>
> > > >>
> > >
> > >
> >
> > --
> > Michael J. Erickson
> >
> > Research Scientist
> > Cooperative Institute for Research in Environmental Sciences
(CIRES)
> > NOAA/NWS/Weather Prediction Center
> > Phone:  301-683-1546
> >
> >
>
>

--
Michael J. Erickson

Research Scientist
Cooperative Institute for Research in Environmental Sciences (CIRES)
NOAA/NWS/Weather Prediction Center
Phone:  301-683-1546
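
Note on the bin-width remark in the quoted reply above ("Choosing 20 bins
(==0.05) would include all of your current bin boundaries ... your current
bins would be derivable from these"): the idea can be sketched in Python as
follows. This is a minimal illustration with made-up counts, not MET code.
The helper names `equal_width_edges` and `merge_counts` are hypothetical, and
each table row is assumed to hold an (event, non-event) count pair for one
probability bin, in the spirit of the Nx2 table described in the thread.

```python
def equal_width_edges(width):
    """Bin edges from 0 to 1 for a fixed bin width (0.05 gives 20 bins)."""
    n_bins = round(1.0 / width)
    return [round(i * width, 10) for i in range(n_bins + 1)]


def merge_counts(fine_edges, fine_counts, coarse_edges):
    """Sum (event, non-event) counts of fine bins into coarse bins.

    Every coarse edge must also be a fine edge; otherwise a fine bin would
    straddle two coarse bins and could not be assigned to either.
    """
    if not set(coarse_edges) <= set(fine_edges):
        raise ValueError("coarse edges must be a subset of the fine edges")
    merged = [[0, 0] for _ in range(len(coarse_edges) - 1)]
    j = 0
    for i, (events, non_events) in enumerate(fine_counts):
        # Advance to the coarse bin containing the lower edge of fine bin i.
        while fine_edges[i] >= coarse_edges[j + 1]:
            j += 1
        merged[j][0] += events
        merged[j][1] += non_events
    return [tuple(row) for row in merged]


fine_edges = equal_width_edges(0.05)
coarse_edges = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]

# Hypothetical counts, one (event, non-event) pair per 0.05-wide bin.
fine_counts = (
    [(1, 90), (1, 5), (0, 0), (0, 0), (1, 2)]
    + [(0, 0)] * 5
    + [(1, 0)]
    + [(0, 0)] * 9
)
coarse_counts = merge_counts(fine_edges, fine_counts, coarse_edges)
print(coarse_counts)
```

The counts merge exactly, but a Brier score computed from the merged table
would generally differ from one computed from the fine table, because the bin
mid-points used as probability values change with the bin edges.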

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: Michael Erickson - NOAA Affiliate
Time: Fri Sep 04 10:47:05 2020

Hi John,

Thanks for your answers, and that sounds good! It is strange that the climo
file was not found with your setup. The only detail I can think of is that,
within the climo field, the file_name specification is a static, absolute
path:

file_name = [
"/usr1/wpc_cpgffh/gribs/ERO_verif/static/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc"
];

I believe you concluded that my climo read-in looked correct?

Thanks,

Mike
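
Note on the climo binning described in the reply quoted below (the forecast
probability bins are applied to both the forecast and the climo data): a
rough Python sketch of what that implies for a Brier Skill Score. It assumes
the common definition BSS = 1 - BS_forecast / BS_climo and the bin mid-point
substitution discussed earlier in this ticket. The function names and the
four data points are made up, and this is not MET's implementation.

```python
from bisect import bisect_right


def binned_brier(probs, obs, edges):
    """Brier score with each probability replaced by its bin mid-point."""
    last_bin = len(edges) - 2
    total = 0.0
    for p, o in zip(probs, obs):
        i = min(bisect_right(edges, p) - 1, last_bin)
        total += (0.5 * (edges[i] + edges[i + 1]) - o) ** 2
    return total / len(probs)


def brier_skill_score(fcst, climo, obs, edges):
    """BSS = 1 - BS_forecast / BS_climo, both binned with the same edges."""
    return 1.0 - binned_brier(fcst, obs, edges) / binned_brier(climo, obs, edges)


edges = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]

# Hypothetical values at four grid points with one observed event.
fcst = [0.0, 0.05, 0.2, 0.5]
climo = [0.01, 0.02, 0.03, 0.04]  # near-continuous climatological probabilities
obs = [0, 0, 0, 1]

bss = brier_skill_score(fcst, climo, obs, edges)
print(f"BSS = {bss:.4f}")
```

In this toy case all four climo values land in the first bin and are scored
as 0.025, so the fine structure of a near-continuous climatology is lost when
the bins are coarse near zero.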

On Fri, Sep 4, 2020 at 12:42 PM John Halley Gotway via RT <met_help at ucar.edu>
wrote:

> Mike,
>
> 2 more things I forgot to address.
>
> First, I pulled that climo field but when I ran grid_stat with your usethis
> config file, it did not actually read the climo data.
>
> DEBUG 3: Found 0 climatology fields.
>
> I'm wondering what additional configuration settings you used to make this
> work?
>
> Second, the answer to your question is yes. The exact same binning logic
> used for the forecast probabilities is applied to the climo data. In fact,
> the forecast probability bins are applied to both the forecast and climo
> data. So you do not need to define separate "cat_thresh" settings for the
> climo. They won't be used anyway.
>
> Here's the spot in the library code where the climo probabilistic
> contingency table is created using the forecast probability bins:
>
> https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/compute_stats.cc#L767
>
>
> Thanks,
> John
>
> On Fri, Sep 4, 2020 at 10:25 AM John Halley Gotway <johnhg at ucar.edu>
> wrote:
>
> > Mike,
> >
> > I don't really have a recommendation on best practices with regards to
> > the binning of probability values.
> >
> > I can say that I more commonly see people choose fixed bin widths, like
> > "==0.10" (for 10 bins) or "==0.05" (for 20 bins) instead of variable
> > width bins, such as:
> > [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ]
> >
> > But I suspect that's more out of convenience than anything else. With
> > regards to your chosen bins, I suspect you set them up this way since you
> > have lots of low probability values closer to 0.0 and relatively few
> > probability values closer to 1.0. While this may be a good choice for
> > relatively rare events, it wouldn't be as good of a choice for very
> > common events resulting in high probability values.
> >
> > Choosing 20 bins (==0.05) would include all of your current bin
> > boundaries and enable you to sample evenly across the probability space,
> > regardless of whether the values are bunched near 0 or 1. And
> > mathematically, your current bins would be derivable from these.
> >
> > But if your chosen bins follow some existing WPC convention, I don't see
> > an obvious reason to change them.
> >
> > Please let me know if you'd like me to forward this question to one of
> > the statisticians in our group for their advice.
> >
> > Thanks,
> > John
> >
> > On Fri, Sep 4, 2020 at 5:08 AM Michael Erickson - NOAA Affiliate via RT <
> > met_help at ucar.edu> wrote:
> >
> >>
> >> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> >>
> >> Hi John,
> >>
> >> Thank you for your quick and helpful response! To answer your questions
> >> from the first email:
> >>
> >> 1) I have included the climo file in case you wanted to see it:
> >>
> >> https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> >>
> >> 2) I start from the netcdf output from grid_stat, load that data into
> >> the python workspace, and compute the brier score from that.
> >>
> >> Also the circle diameter of 9 in the observation file is to draw a 40 km
> >> radius around the "observation."
> >>
> >> From your latter email, it sounds like I may not be able to exactly
> >> replicate the Brier Score calculation. In the spirit of best practices,
> >> would you recommend I change cat_thresh to
> >> "= [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];"
> >> or keep my cat_thresh as it currently is as long as I am consistent? I
> >> was also wondering if grid_stat bins the probabilities for the climo
> >> field as it does for the probabilities in the forecast field?
> >>
> >> Thanks again!
> >>
> >> Mike
> >>
> >> On Thu, Sep 3, 2020 at 7:12 PM John Halley Gotway via RT <met_help at ucar.edu>
> >> wrote:
> >>
> >> > Actually, I have a reasonable guess as to why you may be seeing a
> >> > difference.
> >> >
> >> > All probabilistic verification in MET is based on an Nx2 probabilistic
> >> > contingency table. Those are the counts in the PCT line type. We do
> >> > this to make it easier to aggregate statistics across multiple cases,
> >> > by summing up contingency tables before recomputing statistics. But
> >> > the pros/cons of this approach would probably be better addressed by a
> >> > statistician. So the stats are computed using probability bins and not
> >> > raw probability values.
> >> >
> >> > If you went and computed the Brier score by hand, you probably did so
> >> > using raw probability values and not binning them first.
> >> >
> >> > And this difference could explain the type of discrepancy you're
> >> > seeing.
> >> >
> >> > To test this out, I reran your case...
> >> > (1) Using your original settings to confirm your Brier score of
> >> > 0.011934.
> >> > (2) Using 10 equally-spaced probability bins (cat_thresh = [ ==0.1 ];)
> >> > which produced a Brier score of 0.013747.
> >> > (3) Using 50 equally-spaced probability bins (cat_thresh = [ ==0.02 ];)
> >> > which produced a Brier score of 0.01197.
> >> > (4) Using 100 equally-spaced probability bins (cat_thresh = [ ==0.01 ];)
> >> > which produced a Brier score of 0.01193.
> >> >
> >> > I suppose that doesn't explain the exact discrepancy, but it could
> >> > definitely be involved.
> >> >
> >> > Notice that on this line of the Brier score computation in MET:
> >> >
> >> > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/contable_nx2.cc#L647
> >> >
> >> > the "probability" value returned by "row_proby()" is the mid-point of
> >> > the bin. So all of your forecast probability values of 0% which fall
> >> > into the first bin are actually evaluated as having a probability
> >> > value of 0.025, which is the mid-point between 0 and 0.05 for the
> >> > first bin.
> >> >
> >> > Rerunning using the following to minimize that effect on the 0's:
> >> > cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
> >> > produces a Brier score of 0.011489.
> >> >
> >> > So I'd say that the binning of the probability values is impacting the
> >> > Brier score out in the 4th decimal place.
> >> >
> >> > John
> >> >
> >> > On Thu, Sep 3, 2020 at 4:47 PM John Halley Gotway <johnhg at ucar.edu>
> >> > wrote:
> >> >
> >> > > Hi Mike,
> >> > >
> >> > > Looks like you were able to make a lot of progress. I certainly
> >> > > don't see anything wrong based on the log messages you sent.
> >> > >
> >> > > I do notice that you're smoothing the observations with the maximum
> >> > > value in a circle of diameter 9... presumably for a good reason. And
> >> > > I see that smoothing step indicated in the log messages as well as
> >> > > the output .stat file.
> >> > >
> >> > > Two questions.
> >> > >
> >> > > (1) I wanted to try running locally, but didn't find the "climo"
> >> > > file on the WPC ftp site:
> >> > >
> >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> >> > >
> >> > > Could you add that?
> >> > >
> >> > > (2) When you say that you tried to replicate the Brier score
> >> > > computation, what was your starting point? The raw input files or
> >> > > using the NetCDF matched pairs output from Grid-Stat which already
> >> > > include the computation of the observation maximums?
> >> > >
> >> > > Thanks,
> >> > > John Halley Gotway
> >> > >
> >> > > On Thu, Sep 3, 2020 at 2:26 PM Michael Erickson - NOAA
Affiliate via
> >> RT <
> >> > > met_help at ucar.edu> wrote:
> >> > >
> >> > >>
> >> > >> <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> >> > >>
> >> > >> Thank you Minna!
> >> > >>
> >> > >> Mike
> >> > >>
> >> > >> On Thu, Sep 3, 2020 at 4:12 PM Minna Win via RT
<met_help at ucar.edu
> >
> >> > >> wrote:
> >> > >>
> >> > >> > Hi Mike,
> >> > >> >
> >> > >> > It looks like you have a few questions associated with
> calculating
> >> > Brier
> >> > >> > Skill Scores.  I'm assigning this ticket to John Halley
Gotway.
> >> > >> >
> >> > >> > Regards,
> >> > >> > Minna
> >> > >> > ---------------
> >> > >> > Minna Win
> >> > >> > National Center for Atmospheric Research
> >> > >> > Developmental Testbed Center
> >> > >> > Phone: 303-497-8423
> >> > >> > Fax:   303-497-8401
> >> > >> >
> >> > >> >
> >> > >> >
> >> > >> > On Thu, Sep 3, 2020 at 1:13 PM Michael Erickson - NOAA
Affiliate
> >> via
> >> > RT
> >> > >> <
> >> > >> > met_help at ucar.edu> wrote:
> >> > >> >
> >> > >> > >
> >> > >> > > Thu Sep 03 13:13:26 2020: Request 96562 was acted upon.
> >> > >> > > Transaction: Ticket created by
michael.j.erickson at noaa.gov
> >> > >> > >        Queue: met_help
> >> > >> > >      Subject: Including Climatology in grid_stat Config
File
> >> > >> > >        Owner: Nobody
> >> > >> > >   Requestors: michael.j.erickson at noaa.gov
> >> > >> > >       Status: new
> >> > >> > >  Ticket <URL:
> >> > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> >> > >> >
> >> > >> > >
> >> > >> > >
> >> > >> > > Greetings,
> >> > >> > >
> >> > >> > > For the first time I am attempting to calculate Brier
Skill
> Score
> >> > >> using
> >> > >> > > grid_stat from an input climatology file. I have created
a
> >> > >> probabilistic
> >> > >> > > flooding climatology file (spans from zero to one; image
is
> here:
> >> > >> > >
> >> >
https://ftp.wpc.ncep.noaa.gov/erickson/test10/ALL_noMRGL_UFVS_ALL.png
> >> > >> ).
> >> > >> > > This climatology is static, so it doesn't change with
time when
> >> > >> inputting
> >> > >> > > the "model" and "observation" data. I believe I have
> successfully
> >> > >> gotten
> >> > >> > > this to work using the command:
> >> > >> > >
> >> > >> > > /opt/MET/90/bin/grid_stat
ERO_s2020083112_e2020090112_vhr09.nc
> >> > >> > > ST4gFFG_s2020083112_e2020090112_vhr09.nc usethis -outdir
~
> >> > >> > >
> >> > >> > > where grid_stat ERO_s2020083112_e2020090112_vhr09.nc are
> >> discrete
> >> > >> > forecast
> >> > >> > > probabilities of 0, 0.05, 0.1, 0.2, and 0.5
> >> > >> > > where ST4gFFG_s2020083112_e2020090112_vhr09.nc are
observation
> >> > values
> >> > >> > of 0
> >> > >> > > or 1
> >> > >> > > and usethis is the configuration file
> >> > >> > >
> >> > >> > > Finally the climatology file that consists of "almost"
> continuous
> >> > >> values
> >> > >> > > between 0 and 1 is named: UFVS_ST4gFFG_s2015010100_
> >> > >> e2019123123_vhr12.nc
> >> > >> > >
> >> > >> > > I have put all of these files at
> >> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/ for
> >> > >> > > your reference.
> >> > >> > >
> >> > >> > > As for my questions:
> >> > >> > >
> >> > >> > > 1) I was wondering if the climatology file was properly
> ingested
> >> and
> >> > >> > > calculated for my example? I believe it is correct given
the
> >> output
> >> > >> > below,
> >> > >> > > but I wanted to make sure, since this is my first time
doing
> >> this:
> >> > >> > >
> >> > >> > >
> >> > >> > > DEBUG 1: Forecast File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ERO_s2020083112_e2020090112_vhr09.nc
> >> > >> > > DEBUG 1: Observation File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ST4gFFG_s2020083112_e2020090112_vhr09.nc
> >> > >> > > DEBUG 3: Reading forecast data for EROSurface.
> >> > >> > > DEBUG 3: Reading observation data for ST4gFFGSurface.
> >> > >> > > DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_NcMet".
> >> > >> > > DEBUG 4:
> >> > >> > > DEBUG 4: Latitude/Longitude Grid Data:
> >> > >> > > DEBUG 4:      lat_ll: 25
> >> > >> > > DEBUG 4:      lon_ll: 129.8
> >> > >> > > DEBUG 4:   delta_lat: 0.09
> >> > >> > > DEBUG 4:   delta_lon: 0.09
> >> > >> > > DEBUG 4:        Nlat: 276
> >> > >> > > DEBUG 4:        Nlon: 721
> >> > >> > > DEBUG 4:
> >> > >> > > DEBUG 4: VarInfoFactory::new_var_info() -> created new VarInfo object of type "FileType_NcMet".
> >> > >> > > DEBUG 3: For forecast valid at 20200901_120000, found 1 climatology field(s) with valid time(s): 20201231_230000
> >> > >> > > DEBUG 3: Found 1 climatology fields.
> >> > >> > > DEBUG 3: Found 1 climatology mean and 0 climatology standard deviation field(s) for forecast EROSurface.
> >> > >> > > DEBUG 2: Processing masking regions.
> >> > >> > > DEBUG 3: Processing grid mask: FULL
> >> > >> > > DEBUG 4: parse_grid_mask() -> parsing grid mask "FULL"
> >> > >> > > DEBUG 2:
> >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
> >> > >> > > DEBUG 2:
> >> > >> > > DEBUG 3: Smoothing field using the MAX(49) CircleTemplate interpolation method.
> >> > >> > > DEBUG 2: Processing EROSurface versus ST4gFFGSurface, for smoothing method MAX_CIRCLE(49), over region FULL, using 190638 matched pairs.
> >> > >> > > DEBUG 2: Computing Probabilistic Statistics.
> >> > >> > > DEBUG 2:
> >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
> >> > >> > > DEBUG 2:
> >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V.stat
> >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V_pairs.nc
> >> > >> > >
> >> > >> > >
> >> > >> > > 2) This question is a bit more basic. I am unable to
manually
> >> > >> calculate a
> >> > >> > > Brier Score value for the forecast and observation that
> properly
> >> > >> matches
> >> > >> > > that in the stat file. My manually calculated Brier
Score is
> >> > >> > systematically
> >> > >> > > lower. For this event, the stat file BS is 0.0119 and my
value
> is
> >> > >> 0.0116.
> >> > >> > > I've looked at C3 in the MET Tutorial guide
> >> > >> > > <
> >> > >> > >
> >> > >> >
> >> > >>
> >> >
> >>
> https://dtcenter.org/sites/default/files/community-
code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf
> >> > >> > > >,
> >> > >> > > but I'm still at a bit of a loss. Is there a simple way
I can
> >> > >> replicate
> >> > >> > the
> >> > >> > > calculation seen in the stat file?
> >> > >> > >
> >> > >> > > Thank you again for your help and please let me know if
you
> have
> >> any
> >> > >> > > questions.
> >> > >> > >
> >> > >> > > Mike
> >> > >> > >
> >> > >> > > --
> >> > >> > > Michael J. Erickson
> >> > >> > >
> >> > >> > > Research Scientist
> >> > >> > > Cooperative Institute for Research in Environmental
Sciences
> >> (CIRES)
> >> > >> > > NOAA/NWS/Weather Prediction Center
> >> > >> > > Phone:  301-683-1546
> >> > >> > >
> >> > >> > >
> >> > >> >
> >> > >> >
> >> > >>
> >> > >> --
> >> > >> Michael J. Erickson
> >> > >>
> >> > >> Research Scientist
> >> > >> Cooperative Institute for Research in Environmental Sciences
> (CIRES)
> >> > >> NOAA/NWS/Weather Prediction Center
> >> > >> Phone:  301-683-1546
> >> > >>
> >> > >>
> >> >
> >> >
> >>
> >> --
> >> Michael J. Erickson
> >>
> >> Research Scientist
> >> Cooperative Institute for Research in Environmental Sciences
(CIRES)
> >> NOAA/NWS/Weather Prediction Center
> >> Phone:  301-683-1546
> >>
> >>
>
>

--
Michael J. Erickson

Research Scientist
Cooperative Institute for Research in Environmental Sciences (CIRES)
NOAA/NWS/Weather Prediction Center
Phone:  301-683-1546

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: John Halley Gotway
Time: Fri Sep 04 11:14:11 2020

Barb and Eric,

I've added you to this met-help ticket from Mike Erickson of NOAA/WPC.
We're hoping to get some advice from one or both of you about
probabilistic verification.

Mike is running Grid-Stat to verify WPC's Excessive Rainfall Outlooks
against StageIV precip. The forecast probability values are always 0,
0.05, 0.1, 0.2, 0.5, or 1.0.
When Mike computes the Brier score by hand, it differs from the results
reported by Grid-Stat out in the 4th decimal place (0.0116 versus 0.0119).

My theory is that the difference is caused by the fact that MET does not
compute the Brier score directly on the probability values. Instead, it
bins them into an Nx2 probabilistic contingency table and computes the
Brier score from that table. And the mid-point of each bin is used in the
Brier score computations. So different probability bins will result in a
slightly different Brier score.
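
To make the difference concrete, here is a minimal Python sketch of the two
approaches. It is only an illustration of the idea described above, not
MET's actual code: the function names and the sample pairs are made up, and
it assumes each bin is closed on the left with a value of 1.0 folded into
the last bin.

```python
from bisect import bisect_right

def brier_raw(pairs):
    """Brier score computed directly on (forecast probability, observed 0/1) pairs."""
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)

def brier_binned(pairs, edges):
    """Brier score from an Nx2 table, with each forecast replaced by its bin mid-point."""
    n_bins = len(edges) - 1
    # table[i] = [count of events, count of non-events] for bin i
    table = [[0, 0] for _ in range(n_bins)]
    for p, o in pairs:
        # Assumed convention: bins are [edges[i], edges[i+1]), top edge goes in the last bin.
        i = min(bisect_right(edges, p) - 1, n_bins - 1)
        table[i][0 if o == 1 else 1] += 1
    total = sum(yes + no for yes, no in table)
    score = 0.0
    for i, (yes, no) in enumerate(table):
        mid = 0.5 * (edges[i] + edges[i + 1])
        score += yes * (mid - 1.0) ** 2 + no * (mid - 0.0) ** 2
    return score / total

# Made-up sample: mostly 0% forecasts with no event, plus a few higher probabilities.
pairs = [(0.0, 0)] * 90 + [(0.05, 0)] * 5 + [(0.1, 1)] * 3 + [(0.5, 1)] * 2
edges = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]

bs_raw = brier_raw(pairs)
bs_bin = brier_binned(pairs, edges)
print(f"raw: {bs_raw:.6f}  binned: {bs_bin:.6f}")
```

The two numbers differ because every forecast is moved to the centre of its
bin before being scored. The direction and size of the shift depend on the
data and on the bin edges.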

Mike is currently using probability thresholds as follows:
   cat_thresh = [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];

And that's consistent with the probability values. But when you think about
it...
- Forecasts of 0% fall into the first bin and are evaluated as being a
  value of 0.025 (mid-point of the 0.0 to 0.05 bin)
- Forecasts of 5% fall into the second bin and are evaluated as being a
  value of 0.075 (mid-point of the 0.05 to 0.1 bin)
- Forecasts of 10% fall into the third bin and are evaluated as being a
  value of 0.150 (mid-point of the 0.1 to 0.2 bin)
- and so on for the other probability values

Seems like the binning of probability values works better for continuous
probability values and not so well for probabilities that have already been
binned!
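
The mapping listed above can be tabulated with a few lines of Python. This
is a hypothetical helper, not part of MET; it assumes left-closed bins and
compares Mike's current thresholds with the variant that adds a narrow
>=0.001 bin for the zeros, as tried earlier in this ticket.

```python
from bisect import bisect_right

def bin_midpoint(p, edges):
    """Return the mid-point of the bin that probability p falls into."""
    n_bins = len(edges) - 1
    # Assumed convention: bins are [edges[i], edges[i+1]), top edge goes in the last bin.
    i = min(bisect_right(edges, p) - 1, n_bins - 1)
    return 0.5 * (edges[i] + edges[i + 1])

current = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]
narrow_zero_bin = [0.0, 0.001, 0.05, 0.1, 0.2, 0.5, 1.0]

for p in (0.0, 0.05, 0.1, 0.2, 0.5):
    print(f"forecast {p:4.2f} -> current {bin_midpoint(p, current):.4f}, "
          f"narrow zero bin {bin_midpoint(p, narrow_zero_bin):.4f}")
```

Only the 0% forecasts change between the two threshold sets (0.025 versus
0.0005). The other values keep the same mid-points, so none of them is
scored at its nominal probability in either case.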

I'm wondering if you have any thoughts or advice about this situation?

Thanks,
John

On Fri, Sep 4, 2020 at 10:47 AM Michael Erickson - NOAA Affiliate via RT
<met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
>
> Hi John,
>
> Thanks for your answers and sounds good! That is strange that the climo
> file was not found for your setting. The only detail I can think of is that
> within the climo field, the file_name specification is static:
>
> file_name = [
>   "/usr1/wpc_cpgffh/gribs/ERO_verif/static/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc"
> ];
>
>
> I believe you concluded that my climo read-in looked correct?
>
> Thanks,
>
> Mike
>
>
> On Fri, Sep 4, 2020 at 12:42 PM John Halley Gotway via RT <
> met_help at ucar.edu>
> wrote:
>
> > Mike,
> >
> > 2 more things I forgot to address.
> >
> > First, I pulled that climo field but when I ran grid_stat with your usethis
> > config file, it did not actually read the climo data.
> >
> > DEBUG 3: Found 0 climatology fields.
> >
> >
> > I'm wondering what additional configuration settings you used to make this
> > work?
> >
> >
> > Second, the answer to your question is yes. The exact same binning logic
> > used for the forecast probabilities is applied to the climo data. In fact,
> > the forecast probability bins are applied to both the forecast and climo
> > data. So you do not need to define separate "cat_thresh" settings for the
> > climo. They won't be used anyway.
> >
> >
> > Here's the spot in the library code where the climo probabilistic
> > contingency table is created using the forecast probability bins:
> >
> >
> >
> >
>
https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/compute_stats.cc#L767
> >
> >
> > Thanks,
> > John
> >
> > On Fri, Sep 4, 2020 at 10:25 AM John Halley Gotway
<johnhg at ucar.edu>
> > wrote:
> >
> > > Mike,
> > >
> > > I don't really have a recommendation on best practices with regards to the
> > > binning of probability values.
> > >
> > > I can say that I more commonly see people choose fixed bin widths, like
> > > "==0.10" (for 10 bins) or "==0.05" (for 20 bins) instead of variable width
> > > bins, such as:
> > > [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ]
> > >
> > > But I suspect that's more out of convenience than anything else. With
> > > regards to your chosen bins, I suspect you set them up this way since you
> > > have lots of low probability values closer to 0.0 and relatively few
> > > probability values closer to 1.0. While this may be a good choice for
> > > relatively rare events, it wouldn't be as good of a choice for very common
> > > events resulting in high probability values.
> > >
> > > Choosing 20 bins (==0.05) would include all of your current bin boundaries
> > > and enable you to sample evenly across the probability space, regardless of
> > > whether the values are bunched near 0 or 1. And mathematically, your
> > > current bins would be derivable from these.
> > >
> > > But if your chosen bins follow some existing WPC convention, I don't see
> > > an obvious reason to change them.
> > >
> > > Please let me know if you'd like me to forward this question to one of the
> > > statisticians in our group for their advice.
> > >
> > > Thanks,
> > > John
> > >
> > > On Fri, Sep 4, 2020 at 5:08 AM Michael Erickson - NOAA Affiliate
via
> RT <
> > > met_help at ucar.edu> wrote:
> > >
> > >>
> > >> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> > >>
> > >> Hi John,
> > >>
> > >> Thank you for your quick and helpful response! To answer your questions
> > >> from the first email:
> > >>
> > >> 1) I have included the climo file in case you wanted to see it:
> > >>
> > >> https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > >>
> > >> 2) I start from the netcdf output from grid_stat, load that data into the
> > >> python workspace, and compute the brier score from that.
> > >>
> > >> Also the circle diameter of 9 in the observation file is to draw a 40 km
> > >> radius around the "observation."
> > >>
> > >> From your latter email, it sounds like I may not be able to exactly
> > >> replicate the Brier Score calculation. In the spirit of best practices,
> > >> would you recommend I change cat_thresh to "= [ >=0.0, >=0.001, >=0.05,
> > >> >=0.1, >=0.2, >=0.5, >=1.0 ];" or keep my cat_thresh as it currently is as
> > >> long as I am consistent? I was also wondering if grid_stat bins the
> > >> probabilities for the climo field as it does for the probabilities in the
> > >> forecast field?
> > >>
> > >> Thanks again!
> > >>
> > >> Mike
> > >>
> > >> On Thu, Sep 3, 2020 at 7:12 PM John Halley Gotway via RT <
> > >> met_help at ucar.edu>
> > >> wrote:
> > >>
> > >> > Actually, I have a reasonable guess as to why you may be seeing a
> > >> > difference.
> > >> >
> > >> > All probabilistic verification in MET is based on an Nx2 probabilistic
> > >> > contingency table. Those are the counts in the PCT line type. We do this to
> > >> > make it easier to aggregate statistics across multiple cases, by summing
> > >> > up contingency tables before recomputing statistics. But the pros/cons of
> > >> > this approach would probably be better addressed by a statistician. So the
> > >> > stats are computed using probability bins and not raw probability values.
> > >> >
> > >> > If you went and computed the Brier score by hand, you probably did so using
> > >> > raw probability values and not binning them first.
> > >> >
> > >> > And this difference could explain the type of discrepancy you're seeing.
> > >> >
> > >> > To test this out, I reran your case...
> > >> > (1) Using your original settings to confirm your Brier score of 0.011934.
> > >> > (2) Using 10 equally-spaced probability bins (cat_thresh = [ ==0.1 ];)
> > >> > which produced a Brier score of 0.013747.
> > >> > (3) Using 50 equally-spaced probability bins (cat_thresh = [ ==0.02 ];)
> > >> > which produced a Brier score of 0.01197.
> > >> > (4) Using 100 equally-spaced probability bins (cat_thresh = [ ==0.01 ];)
> > >> > which produced a Brier score of 0.01193.
> > >> >
> > >> > I suppose that doesn't explain the exact discrepancy, but it could
> > >> > definitely be involved.
> > >> >
> > >> > Notice on this line of the brier score computation in MET:
> > >> >
> > >> >
> > >>
> >
>
https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/contable_nx2.cc#L647
> > >> >
> > >> > That the "probability" value returned by "row_proby()" is the mid-point of
> > >> > the bin.
> > >> > So all of your forecast probability values of 0% which fall into the first
> > >> > bin are actually evaluated as having a probability value of 0.025 which is
> > >> > the mid-point between 0 and 0.05 for the first bin.
> > >> >
> > >> > Rerunning using the following to minimize that effect on the 0's:
> > >> > cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
> > >> > produces a brier score of 0.011489.
> > >> >
> > >> > So I'd say that the binning of the probability values is impacting the
> > >> > Brier score out in the 4th decimal place.
> > >> >
> > >> > John
> > >> >
> > >> > On Thu, Sep 3, 2020 at 4:47 PM John Halley Gotway
<johnhg at ucar.edu>
> > >> wrote:
> > >> >
> > >> > > Hi Mike,
> > >> > >
> > >> > > Looks like you were able to make a lot of progress. I
certainly
> > don't
> > >> see
> > >> > > anything wrong based on the log messages you sent.
> > >> > >
> > >> > > I do notice that you're smoothing the observations with the
> maximum
> > >> value
> > >> > > in a circle of diameter 9... presumably for a good reason.
And I
> see
> > >> that
> > >> > > smoothing step indicated in the log messages as well as the
output
> > >> .stat
> > >> > > file.
> > >> > >
> > >> > > Two questions.
> > >> > >
> > >> > > (1) I wanted to try running locally, but didn't find the
"climo"
> > file
> > >> on
> > >> > > the WPC ftp site:
> > >> > >
> > >> > >
> > >> >
> > >>
> >
>
https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > >> > > <
> > >> >
> > >>
> >
>
https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > >> > >
> > >> > > Could you add that?
> > >> > >
> > >> > > (2) When you say that you tried to replicate the Brier
score
> > >> computation,
> > >> > > what was your starting point? The raw input files or using
the
> > NetCDF
> > >> > > matched pairs output from Grid-Stat which already include
the
> > >> computation
> > >> > > of the observation maximums?
> > >> > >
> > >> > > Thanks,
> > >> > > John Halley Gotway
> > >> > >
> > >> > > On Thu, Sep 3, 2020 at 2:26 PM Michael Erickson - NOAA
Affiliate
> via
> > >> RT <
> > >> > > met_help at ucar.edu> wrote:
> > >> > >
> > >> > >>
> > >> > >> <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> > >> > >>
> > >> > >> Thank you Minna!
> > >> > >>
> > >> > >> Mike
> > >> > >>
> > >> > >> On Thu, Sep 3, 2020 at 4:12 PM Minna Win via RT <
> met_help at ucar.edu
> > >
> > >> > >> wrote:
> > >> > >>
> > >> > >> > Hi Mike,
> > >> > >> >
> > >> > >> > It looks like you have a few questions associated with
> > calculating
> > >> > Brier
> > >> > >> > Skill Scores.  I'm assigning this ticket to John Halley
Gotway.
> > >> > >> >
> > >> > >> > Regards,
> > >> > >> > Minna
> > >> > >> > ---------------
> > >> > >> > Minna Win
> > >> > >> > National Center for Atmospheric Research
> > >> > >> > Developmental Testbed Center
> > >> > >> > Phone: 303-497-8423
> > >> > >> > Fax:   303-497-8401
> > >> > >> >
> > >> > >> >
> > >> > >> >
> > >> > >> > On Thu, Sep 3, 2020 at 1:13 PM Michael Erickson - NOAA
> Affiliate
> > >> via
> > >> > RT
> > >> > >> <
> > >> > >> > met_help at ucar.edu> wrote:
> > >> > >> >
> > >> > >> > >
> > >> > >> > > Thu Sep 03 13:13:26 2020: Request 96562 was acted
upon.
> > >> > >> > > Transaction: Ticket created by
michael.j.erickson at noaa.gov
> > >> > >> > >        Queue: met_help
> > >> > >> > >      Subject: Including Climatology in grid_stat
Config File
> > >> > >> > >        Owner: Nobody
> > >> > >> > >   Requestors: michael.j.erickson at noaa.gov
> > >> > >> > >       Status: new
> > >> > >> > >  Ticket <URL:
> > >> > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> > >> > >> >
> > >> > >> > >
> > >> > >> > >
> > >> > >> > > Greetings,
> > >> > >> > >
> > >> > >> > > For the first time I am attempting to calculate Brier
Skill
> > Score
> > >> > >> using
> > >> > >> > > grid_stat from an input climatology file. I have
created a
> > >> > >> probabilistic
> > >> > >> > > flooding climatology file (spans from zero to one;
image is
> > here:
> > >> > >> > >
> > >> >
>
https://ftp.wpc.ncep.noaa.gov/erickson/test10/ALL_noMRGL_UFVS_ALL.png
> > >> > >> ).
> > >> > >> > > This climatology is static, so it doesn't change with
time
> when
> > >> > >> inputting
> > >> > >> > > the "model" and "observation" data. I believe I have
> > successfully
> > >> > >> gotten
> > >> > >> > > this to work using the command:
> > >> > >> > >
> > >> > >> > > /opt/MET/90/bin/grid_stat ERO_s2020083112_e2020090112_
> vhr09.nc
> > >> > >> > > ST4gFFG_s2020083112_e2020090112_vhr09.nc usethis
-outdir ~
> > >> > >> > >
> > >> > >> > > where grid_stat ERO_s2020083112_e2020090112_vhr09.nc
are
> > >> discrete
> > >> > >> > forecast
> > >> > >> > > probabilities of 0, 0.05, 0.1, 0.2, and 0.5
> > >> > >> > > where ST4gFFG_s2020083112_e2020090112_vhr09.nc are
> observation
> > >> > values
> > >> > >> > of 0
> > >> > >> > > or 1
> > >> > >> > > and usethis is the configuration file
> > >> > >> > >
> > >> > >> > > Finally the climatology file that consists of "almost"
> > continuous
> > >> > >> values
> > >> > >> > > between 0 and 1 is named: UFVS_ST4gFFG_s2015010100_
> > >> > >> e2019123123_vhr12.nc
> > >> > >> > >
> > >> > >> > > I have put all of these files at
> > >> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/ for
> > >> > >> > > your reference.
> > >> > >> > >
> > >> > >> > > As for my questions:
> > >> > >> > >
> > >> > >> > > 1) I was wondering if the climatology file was
properly
> > ingested
> > >> and
> > >> > >> > > calculated for my example? I believe it is correct
given the
> > >> output
> > >> > >> > below,
> > >> > >> > > but I wanted to make sure, since this is my first time
doing
> > >> this:
> > >> > >> > >
> > >> > >> > >
> > >> > >> > > DEBUG 1: Forecast File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ERO_s2020083112_e2020090112_vhr09.nc
> > >> > >> > > DEBUG 1: Observation File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ST4gFFG_s2020083112_e2020090112_vhr09.nc
> > >> > >> > > DEBUG 3: Reading forecast data for EROSurface.
> > >> > >> > > DEBUG 3: Reading observation data for ST4gFFGSurface.
> > >> > >> > > DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_NcMet".
> > >> > >> > > DEBUG 4:
> > >> > >> > > DEBUG 4: Latitude/Longitude Grid Data:
> > >> > >> > > DEBUG 4:      lat_ll: 25
> > >> > >> > > DEBUG 4:      lon_ll: 129.8
> > >> > >> > > DEBUG 4:   delta_lat: 0.09
> > >> > >> > > DEBUG 4:   delta_lon: 0.09
> > >> > >> > > DEBUG 4:        Nlat: 276
> > >> > >> > > DEBUG 4:        Nlon: 721
> > >> > >> > > DEBUG 4:
> > >> > >> > > DEBUG 4: VarInfoFactory::new_var_info() -> created new VarInfo object of type "FileType_NcMet".
> > >> > >> > > DEBUG 3: For forecast valid at 20200901_120000, found 1 climatology field(s) with valid time(s): 20201231_230000
> > >> > >> > > DEBUG 3: Found 1 climatology fields.
> > >> > >> > > DEBUG 3: Found 1 climatology mean and 0 climatology standard deviation field(s) for forecast EROSurface.
> > >> > >> > > DEBUG 2: Processing masking regions.
> > >> > >> > > DEBUG 3: Processing grid mask: FULL
> > >> > >> > > DEBUG 4: parse_grid_mask() -> parsing grid mask "FULL"
> > >> > >> > > DEBUG 2:
> > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
> > >> > >> > > DEBUG 2:
> > >> > >> > > DEBUG 3: Smoothing field using the MAX(49) CircleTemplate interpolation method.
> > >> > >> > > DEBUG 2: Processing EROSurface versus ST4gFFGSurface, for smoothing method MAX_CIRCLE(49), over region FULL, using 190638 matched pairs.
> > >> > >> > > DEBUG 2: Computing Probabilistic Statistics.
> > >> > >> > > DEBUG 2:
> > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
> > >> > >> > > DEBUG 2:
> > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V.stat
> > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V_pairs.nc
> > >> > >> > >
> > >> > >> > >
> > >> > >> > > 2) This question is a bit more basic. I am unable to
manually
> > >> > >> calculate a
> > >> > >> > > Brier Score value for the forecast and observation
that
> > properly
> > >> > >> matches
> > >> > >> > > that in the stat file. My manually calculated Brier
Score is
> > >> > >> > systematically
> > >> > >> > > lower. For this event, the stat file BS is 0.0119 and
my
> value
> > is
> > >> > >> 0.0116.
> > >> > >> > > I've looked at C3 in the MET Tutorial guide
> > >> > >> > > <
> > >> > >> > >
> > >> > >> >
> > >> > >>
> > >> >
> > >>
> >
> https://dtcenter.org/sites/default/files/community-
code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf
> > >> > >> > > >,
> > >> > >> > > but I'm still at a bit of a loss. Is there a simple
way I can
> > >> > >> replicate
> > >> > >> > the
> > >> > >> > > calculation seen in the stat file?
> > >> > >> > >
> > >> > >> > > Thank you again for your help and please let me know
if you
> > have
> > >> any
> > >> > >> > > questions.
> > >> > >> > >
> > >> > >> > > Mike
> > >> > >> > >
> > >> > >> > > --
> > >> > >> > > Michael J. Erickson
> > >> > >> > >
> > >> > >> > > Research Scientist
> > >> > >> > > Cooperative Institute for Research in Environmental
Sciences
> > >> (CIRES)
> > >> > >> > > NOAA/NWS/Weather Prediction Center
> > >> > >> > > Phone:  301-683-1546
> > >> > >> > >
> > >> > >> > >
> > >> > >> >
> > >> > >> >
> > >> > >>
> > >> > >> --
> > >> > >> Michael J. Erickson
> > >> > >>
> > >> > >> Research Scientist
> > >> > >> Cooperative Institute for Research in Environmental
Sciences
> > (CIRES)
> > >> > >> NOAA/NWS/Weather Prediction Center
> > >> > >> Phone:  301-683-1546
> > >> > >>
> > >> > >>
> > >> >
> > >> >
> > >>
> > >> --
> > >> Michael J. Erickson
> > >>
> > >> Research Scientist
> > >> Cooperative Institute for Research in Environmental Sciences
(CIRES)
> > >> NOAA/NWS/Weather Prediction Center
> > >> Phone:  301-683-1546
> > >>
> > >>
> >
> >
>
> --
> Michael J. Erickson
>
> Research Scientist
> Cooperative Institute for Research in Environmental Sciences (CIRES)
> NOAA/NWS/Weather Prediction Center
> Phone:  301-683-1546
>
>

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: Eric Gilleland
Time: Tue Sep 08 13:39:47 2020

Hi John,

I agree that if the probabilities have already been binned, then it is
strange to then take the midpoint (re-binning).

Eric

On Fri, Sep 4, 2020 at 11:14 AM John Halley Gotway via RT
<met_help at ucar.edu> wrote:

> Barb and Eric,
>
> I've added you to this met-help ticket from Mike Erickson of NOAA/WPC.
> We're hoping to get some advice from one or both of you about
> probabilistic verification.
>
> Mike is running Grid-Stat to verify WPC's Excessive Rainfall Outlooks
> against StageIV precip. The forecast probability values are always 0,
> 0.05, 0.1, 0.2, 0.5, or 1.0.
> When Mike computes the Brier score by hand, it differs from the results
> reported by Grid-Stat out in the 4th decimal place (0.0116 versus 0.0119).
>
> My theory is that the difference is caused by the fact that MET does not
> compute the Brier score directly on the probability values. Instead, it
> bins them into an Nx2 probabilistic contingency table and computes the
> Brier score from that table. And the mid-point of each bin is used in the
> Brier score computations. So different probability bins will result in a
> slightly different Brier score.
>
> Mike is currently using probability thresholds as follows:
>    cat_thresh = [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
>
> And that's consistent with the probability values. But when you think about
> it...
> - Forecasts of 0% fall into the first bin and are evaluated as being a
>   value of 0.025 (mid-point of the 0.0 to 0.05 bin)
> - Forecasts of 5% fall into the second bin and are evaluated as being a
>   value of 0.075 (mid-point of the 0.05 to 0.1 bin)
> - Forecasts of 10% fall into the third bin and are evaluated as being a
>   value of 0.150 (mid-point of the 0.1 to 0.2 bin)
> - and so on for the other probability values
>
> Seems like the binning of probability values works better for continuous
> probability values and not so well for probabilities that have already been
> binned!
>
> I'm wondering if you have any thoughts or advice about this situation?
>
> Thanks,
> John
>
> On Fri, Sep 4, 2020 at 10:47 AM Michael Erickson - NOAA Affiliate
via RT <
> met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> >
> > Hi John,
> >
> > Thanks for your answers and sounds good! That is strange that the
climo
> > file was not found for your setting. The only detail I can think
of is
> that
> > within the climo field, the file_name specification is static:
> >
> > file_name = [
> >
>
"/usr1/wpc_cpgffh/gribs/ERO_verif/static/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc"
> > ];
> >
> >
> > I believe you concluded that my climo read-in looked correct?
> >
> > Thanks,
> >
> > Mike
> >
> >
> > On Fri, Sep 4, 2020 at 12:42 PM John Halley Gotway via RT <
> > met_help at ucar.edu>
> > wrote:
> >
> > > Mike,
> > >
> > > 2 more things I forgot to address.
> > >
> > > First, I pulled that climo field but when I ran grid_stat with
your
> > usethis
> > > config file, it did not actually read the climo data.
> > >
> > > DEBUG 3: Found 0 climatology fields.
> > >
> > >
> > > I'm wondering what additional configuration settings you used to
make
> > this
> > > work?
> > >
> > >
> > > Second, the answer to your question is yes. The exact same
binning
> logic
> > > used for the forecast probabilities is applied to the climo
data. In
> > fact,
> > > the forecast probability bins are applied to both the forecast
and
> climo
> > > data. So you do not need to define separate "cat_thresh"
settings for
> the
> > > climo. They won't be used anyway.
> > >
> > >
> > > Here's the spot in the library code where the climo
probabilistic
> > > contingency table is created using the forecast probability
bins:
> > >
> > >
> > >
> > >
> >
>
https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/compute_stats.cc#L767
> > >
> > >
> > > Thanks,
> > > John
> > >
> > > On Fri, Sep 4, 2020 at 10:25 AM John Halley Gotway
<johnhg at ucar.edu>
> > > wrote:
> > >
> > > > Mike,
> > > >
> > > > I don't really have a recommendation on best practices with
regards
> to
> > > the
> > > > binning of probability values.
> > > >
> > > > I can say that I more commonly see people choose fixed bin
widths,
> like
> > > > "==0.10" (for 10 bins) or "==0.05" (for 20 bins) instead of
variable
> > > width
> > > > bins, such as:
> > > > [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ]
> > > >
> > > > But I suspect that's more out of convenience than anything
else. With
> > > > regards to your chosen bins, I suspect you set them up this
way since
> > you
> > > > have lots of low probability values closer to 0.0 and
relatively few
> > > > probability values closer to 1.0. While this may be a good
choice for
> > > > relatively rare events, it wouldn't be as good of a choice for
very
> > > common
> > > > events resulting in high probability values.
> > > >
> > > > Choosing 20 bins (==0.05) would include all of your current
bin
> > > boundaries
> > > > and enable you to sample evenly across the probability space,
> > regardless
> > > of
> > > > whether the values are bunched near 0 or 1. And
mathematically, your
> > > > current bins would be derivable from these.
> > > >
> > > > But if your chosen bins follow some existing WPC convention, I
don't
> > see
> > > > an obvious reason to change them.
> > > >
> > > > Please let me know if you'd like me to forward this question
to one
> of
> > > the
> > > > statisticians in our group for their advice.
> > > >
> > > > Thanks,
> > > > John
> > > >
> > > > On Fri, Sep 4, 2020 at 5:08 AM Michael Erickson - NOAA
Affiliate via
> > RT <
> > > > met_help at ucar.edu> wrote:
> > > >
> > > >>
> > > >> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
>
> > > >>
> > > >> Hi John,
> > > >>
> > > >> Thank you for your quick and helpful response! To answer your
> > questions
> > > >> from the first email:
> > > >>
> > > >> 1) I have included the climo file in case you wanted to see
it:
> > > >>
> > > >>
> > >
> >
>
https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > > >>
> > > >> 2) I start from the netcdf output from grid_stat, load that
data
> into
> > > the
> > > >> python workspace, and compute the brier score from that.
> > > >>
> > > >> Also the circle diameter of 9 in the observation file is to
draw a
> 40
> > km
> > > >> radius around the "observation."
> > > >>
> > > >> From your latter email, it sounds like I may not be able to
exactly
> > > >> replicate the Brier Score calculation. In the spirit of best
> > practices,
> > > >> would you recommend I change cat_thresh  to "= [ >=0.0,
>=0.001,
> > >=0.05,
> > > >> >=0.1, >=0.2, >=0.5, >=1.0 ];" or keep my cat_thresh as it
currently
> > is
> > > as
> > > >> long as I am consistent? I was also wondering if grid_stat
bins the
> > > >> probabilities for the climo field as it does for the
probabilities
> in
> > > the
> > > >> forecast field?
> > > >>
> > > >> Thanks again!
> > > >>
> > > >> Mike
> > > >>
> > > >> On Thu, Sep 3, 2020 at 7:12 PM John Halley Gotway via RT <
> > > >> met_help at ucar.edu>
> > > >> wrote:
> > > >>
> > > >> > Actually, I have a reasonable guess as to why you may be
seeing a
> > > >> > difference.
> > > >> >
> > > >> > All probabilistic verification in MET is based on an Nx2
> > > >> > probabilistic contingency table. Those are the counts in the PCT
> > > >> > line type. We do this to make it easier to aggregate statistics
> > > >> > across multiple cases, by summing up contingency tables before
> > > >> > recomputing statistics. But the pros/cons of this approach would
> > > >> > probably be better addressed by a statistician. So the stats are
> > > >> > computed using probability bins and not raw probability values.
> > > >> >
> > > >> > If you went and computed the Brier score by hand, you
probably did
> > so
> > > >> using
> > > >> > raw probability values and not binning them first.
> > > >> >
> > > >> > And this difference could explain the type of discrepancy
you're
> > > seeing.
> > > >> >
> > > >> > To test this out, I reran your case...
> > > >> > (1) Using your original settings to confirm your Brier score of
> > > >> > 0.011934.
> > > >> > (2) Using 10 equally-spaced probability bins (cat_thresh = [ ==0.1 ];)
> > > >> > which produced a Brier score of 0.013747.
> > > >> > (3) Using 50 equally-spaced probability bins (cat_thresh = [ ==0.02 ];)
> > > >> > which produced a Brier score of 0.01197.
> > > >> > (4) Using 100 equally-spaced probability bins (cat_thresh = [ ==0.01 ];)
> > > >> > which produced a Brier score of 0.01193.
> > > >> >
> > > >> > I suppose that doesn't explain the exact discrepancy, but could
> > > >> > definitely be involved.
> > > >> >
> > > >> > Notice on this line of the Brier score computation in MET:
> > > >> >
> > > >> > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/contable_nx2.cc#L647
> > > >> >
> > > >> > that the "probability" value returned by "row_proby()" is the
> > > >> > mid-point of the bin.
> > > >> > So all of your forecast probability values of 0% which fall into
> > > >> > the first bin are actually evaluated as having a probability value
> > > >> > of 0.025, which is the mid-point between 0 and 0.05 for the first
> > > >> > bin.
> > > >> >
> > > >> > Rerunning using the following to minimize that effect on the 0's:
> > > >> > cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
> > > >> > produces a Brier score of 0.011489.
> > > >> >
> > > >> > So I'd say that the binning of the probability values is impacting
> > > >> > the Brier score out in the 4th decimal place.
> > > >> >
> > > >> > John
> > > >> >
> > > >> > On Thu, Sep 3, 2020 at 4:47 PM John Halley Gotway <
> johnhg at ucar.edu>
> > > >> wrote:
> > > >> >
> > > >> > > Hi Mike,
> > > >> > >
> > > >> > > Looks like you were able to make a lot of progress. I
certainly
> > > don't
> > > >> see
> > > >> > > anything wrong based on the log messages you sent.
> > > >> > >
> > > >> > > I do notice that you're smoothing the observations with
the
> > maximum
> > > >> value
> > > >> > > in a circle of diameter 9... presumably for a good
reason. And I
> > see
> > > >> that
> > > >> > > smoothing step indicated in the log messages as well as
the
> output
> > > >> .stat
> > > >> > > file.
> > > >> > >
> > > >> > > Two questions.
> > > >> > >
> > > >> > > (1) I wanted to try running locally, but didn't find the
"climo"
> > > file
> > > >> on
> > > >> > > the WPC ftp site:
> > > >> > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
>
https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > > >> > > <
> > > >> >
> > > >>
> > >
> >
>
https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > > >> > >
> > > >> > > Could you add that?
> > > >> > >
> > > >> > > (2) When you say that you tried to replicate the Brier
score
> > > >> computation,
> > > >> > > what was your starting point? The raw input files or
using the
> > > NetCDF
> > > >> > > matched pairs output from Grid-Stat which already include
the
> > > >> computation
> > > >> > > of the observation maximums?
> > > >> > >
> > > >> > > Thanks,
> > > >> > > John Halley Gotway
> > > >> > >
> > > >> > > On Thu, Sep 3, 2020 at 2:26 PM Michael Erickson - NOAA
Affiliate
> > via
> > > >> RT <
> > > >> > > met_help at ucar.edu> wrote:
> > > >> > >
> > > >> > >>
> > > >> > >> <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> >
> > > >> > >>
> > > >> > >> Thank you Minna!
> > > >> > >>
> > > >> > >> Mike
> > > >> > >>
> > > >> > >> On Thu, Sep 3, 2020 at 4:12 PM Minna Win via RT <
> > met_help at ucar.edu
> > > >
> > > >> > >> wrote:
> > > >> > >>
> > > >> > >> > Hi Mike,
> > > >> > >> >
> > > >> > >> > It looks like you have a few questions associated with
> > > calculating
> > > >> > Brier
> > > >> > >> > Skill Scores.  I'm assigning this ticket to John
Halley
> Gotway.
> > > >> > >> >
> > > >> > >> > Regards,
> > > >> > >> > Minna
> > > >> > >> > ---------------
> > > >> > >> > Minna Win
> > > >> > >> > National Center for Atmospheric Research
> > > >> > >> > Developmental Testbed Center
> > > >> > >> > Phone: 303-497-8423
> > > >> > >> > Fax:   303-497-8401
> > > >> > >> >
> > > >> > >> >
> > > >> > >> >
> > > >> > >> > On Thu, Sep 3, 2020 at 1:13 PM Michael Erickson - NOAA
> > Affiliate
> > > >> via
> > > >> > RT
> > > >> > >> <
> > > >> > >> > met_help at ucar.edu> wrote:
> > > >> > >> >
> > > >> > >> > >
> > > >> > >> > > Thu Sep 03 13:13:26 2020: Request 96562 was acted
upon.
> > > >> > >> > > Transaction: Ticket created by
michael.j.erickson at noaa.gov
> > > >> > >> > >        Queue: met_help
> > > >> > >> > >      Subject: Including Climatology in grid_stat
Config
> File
> > > >> > >> > >        Owner: Nobody
> > > >> > >> > >   Requestors: michael.j.erickson at noaa.gov
> > > >> > >> > >       Status: new
> > > >> > >> > >  Ticket <URL:
> > > >> > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> > > >> > >> >
> > > >> > >> > >
> > > >> > >> > >
> > > >> > >> > > Greetings,
> > > >> > >> > >
> > > >> > >> > > For the first time I am attempting to calculate
Brier Skill
> > > Score
> > > >> > >> using
> > > >> > >> > > grid_stat from an input climatology file. I have
created a
> > > >> > >> probabilistic
> > > >> > >> > > flooding climatology file (spans from zero to one;
image is
> > > here:
> > > >> > >> > >
> > > >> >
> >
https://ftp.wpc.ncep.noaa.gov/erickson/test10/ALL_noMRGL_UFVS_ALL.png
> > > >> > >> ).
> > > >> > >> > > This climatology is static, so it doesn't change
with time
> > when
> > > >> > >> inputting
> > > >> > >> > > the "model" and "observation" data. I believe I have
> > > successfully
> > > >> > >> gotten
> > > >> > >> > > this to work using the command:
> > > >> > >> > >
> > > >> > >> > > /opt/MET/90/bin/grid_stat
ERO_s2020083112_e2020090112_
> > vhr09.nc
> > > >> > >> > > ST4gFFG_s2020083112_e2020090112_vhr09.nc usethis
-outdir ~
> > > >> > >> > >
> > > >> > >> > > where grid_stat ERO_s2020083112_e2020090112_vhr09.nc
are
> > > >> discrete
> > > >> > >> > forecast
> > > >> > >> > > probabilities of 0, 0.05, 0.1, 0.2, and 0.5
> > > >> > >> > > where ST4gFFG_s2020083112_e2020090112_vhr09.nc are
> > observation
> > > >> > values
> > > >> > >> > of 0
> > > >> > >> > > or 1
> > > >> > >> > > and usethis is the configuration file
> > > >> > >> > >
> > > >> > >> > > Finally the climatology file that consists of
"almost"
> > > continuous
> > > >> > >> values
> > > >> > >> > > between 0 and 1 is named: UFVS_ST4gFFG_s2015010100_
> > > >> > >> e2019123123_vhr12.nc
> > > >> > >> > >
> > > >> > >> > > I have put all of these files at
> > > >> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/ for
> > > >> > >> > > your reference.
> > > >> > >> > >
> > > >> > >> > > As for my questions:
> > > >> > >> > >
> > > >> > >> > > 1) I was wondering if the climatology file was
properly
> > > ingested
> > > >> and
> > > >> > >> > > calculated for my example? I believe it is correct
given
> the
> > > >> output
> > > >> > >> > below,
> > > >> > >> > > but I wanted to make sure, since this is my first
time
> doing
> > > >> this:
> > > >> > >> > >
> > > >> > >> > > DEBUG 1: Forecast File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ERO_s2020083112_e2020090112_vhr09.nc
> > > >> > >> > > DEBUG 1: Observation File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ST4gFFG_s2020083112_e2020090112_vhr09.nc
> > > >> > >> > > DEBUG 3: Reading forecast data for EROSurface.
> > > >> > >> > > DEBUG 3: Reading observation data for ST4gFFGSurface.
> > > >> > >> > > DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_NcMet".
> > > >> > >> > > DEBUG 4:
> > > >> > >> > > DEBUG 4: Latitude/Longitude Grid Data:
> > > >> > >> > > DEBUG 4:      lat_ll: 25
> > > >> > >> > > DEBUG 4:      lon_ll: 129.8
> > > >> > >> > > DEBUG 4:   delta_lat: 0.09
> > > >> > >> > > DEBUG 4:   delta_lon: 0.09
> > > >> > >> > > DEBUG 4:        Nlat: 276
> > > >> > >> > > DEBUG 4:        Nlon: 721
> > > >> > >> > > DEBUG 4:
> > > >> > >> > > DEBUG 4: VarInfoFactory::new_var_info() -> created new VarInfo object of type "FileType_NcMet".
> > > >> > >> > > DEBUG 3: For forecast valid at 20200901_120000, found 1 climatology field(s) with valid time(s): 20201231_230000
> > > >> > >> > > DEBUG 3: Found 1 climatology fields.
> > > >> > >> > > DEBUG 3: Found 1 climatology mean and 0 climatology standard deviation field(s) for forecast EROSurface.
> > > >> > >> > > DEBUG 2: Processing masking regions.
> > > >> > >> > > DEBUG 3: Processing grid mask: FULL
> > > >> > >> > > DEBUG 4: parse_grid_mask() -> parsing grid mask "FULL"
> > > >> > >> > > DEBUG 2:
> > > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
> > > >> > >> > > DEBUG 2:
> > > >> > >> > > DEBUG 3: Smoothing field using the MAX(49) CircleTemplate interpolation method.
> > > >> > >> > > DEBUG 2: Processing EROSurface versus ST4gFFGSurface, for smoothing method MAX_CIRCLE(49), over region FULL, using 190638 matched pairs.
> > > >> > >> > > DEBUG 2: Computing Probabilistic Statistics.
> > > >> > >> > > DEBUG 2:
> > > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
> > > >> > >> > > DEBUG 2:
> > > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V.stat
> > > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V_pairs.nc
> > > >> > >> > >
> > > >> > >> > >
> > > >> > >> > > 2) This question is a bit more basic. I am unable to
> manually
> > > >> > >> calculate a
> > > >> > >> > > Brier Score value for the forecast and observation
that
> > > properly
> > > >> > >> matches
> > > >> > >> > > that in the stat file. My manually calculated Brier
Score
> is
> > > >> > >> > systematically
> > > >> > >> > > lower. For this event, the stat file BS is 0.0119
and my
> > value
> > > is
> > > >> > >> 0.0116.
> > > >> > >> > > I've looked at C3 in the MET Tutorial guide
> > > >> > >> > > <
> > > >> > >> > >
> > > >> > >> >
> > > >> > >>
> > > >> >
> > > >>
> > >
> >
> https://dtcenter.org/sites/default/files/community-
code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf
> > > >> > >> > > >,
> > > >> > >> > > but I'm still at a bit of a loss. Is there a simple
way I
> can
> > > >> > >> replicate
> > > >> > >> > the
> > > >> > >> > > calculation seen in the stat file?
> > > >> > >> > >
> > > >> > >> > > Thank you again for your help and please let me know
if you
> > > have
> > > >> any
> > > >> > >> > > questions.
> > > >> > >> > >
> > > >> > >> > > Mike
> > > >> > >> > >
> > > >> > >> > > --
> > > >> > >> > > Michael J. Erickson
> > > >> > >> > >
> > > >> > >> > > Research Scientist
> > > >> > >> > > Cooperative Institute for Research in Environmental
> Sciences
> > > >> (CIRES)
> > > >> > >> > > NOAA/NWS/Weather Prediction Center
> > > >> > >> > > Phone:  301-683-1546
> > > >> > >> > >
> > > >> > >> > >
> > > >> > >> >
> > > >> > >> >
> > > >> > >>
> > > >> > >> --
> > > >> > >> Michael J. Erickson
> > > >> > >>
> > > >> > >> Research Scientist
> > > >> > >> Cooperative Institute for Research in Environmental
Sciences
> > > (CIRES)
> > > >> > >> NOAA/NWS/Weather Prediction Center
> > > >> > >> Phone:  301-683-1546
> > > >> > >>
> > > >> > >>
> > > >> >
> > > >> >
> > > >>
> > > >> --
> > > >> Michael J. Erickson
> > > >>
> > > >> Research Scientist
> > > >> Cooperative Institute for Research in Environmental Sciences
(CIRES)
> > > >> NOAA/NWS/Weather Prediction Center
> > > >> Phone:  301-683-1546
> > > >>
> > > >>
> > >
> > >
> >
> > --
> > Michael J. Erickson
> >
> > Research Scientist
> > Cooperative Institute for Research in Environmental Sciences
(CIRES)
> > NOAA/NWS/Weather Prediction Center
> > Phone:  301-683-1546
> >
> >
>
>
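
The mid-point effect discussed in the quoted messages above can be tried out on a small scale in Python. What follows is only a rough sketch, not MET code: the function names (brier_raw, nx2_table, brier_from_table) are made up for illustration, the sample forecasts and observations are invented, and the rule for which bin a boundary value falls into is an assumption that has not been checked against the MET source. It compares a Brier score computed on the raw probabilities with one computed from an Nx2 table in which every forecast is scored at the mid-point of its bin.

```python
import numpy as np


def brier_raw(prob, obs):
    """Brier score taken directly from the raw forecast probabilities."""
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.mean((prob - obs) ** 2))


def nx2_table(prob, obs, edges):
    """Count events and non-events per probability bin.

    Assumed binning rule (not checked against the MET source): a forecast p
    lands in bin i when edges[i] <= p < edges[i + 1], and p == 1.0 joins the
    last bin. Column 0 holds event counts, column 1 non-event counts.
    """
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs, dtype=int)
    edges = np.asarray(edges, dtype=float)
    idx = np.searchsorted(edges, prob, side="right") - 1
    idx = np.clip(idx, 0, len(edges) - 2)
    table = np.zeros((len(edges) - 1, 2), dtype=int)
    np.add.at(table, (idx, 1 - obs), 1)  # obs == 1 goes to column 0
    return table


def brier_from_table(table, edges):
    """Brier score from the table, scoring each forecast at its bin mid-point."""
    edges = np.asarray(edges, dtype=float)
    mid = 0.5 * (edges[:-1] + edges[1:])
    events, non_events = table[:, 0], table[:, 1]
    total = table.sum()
    return float(np.sum(events * (mid - 1.0) ** 2 + non_events * mid ** 2) / total)


# Invented sample: mostly 0% forecasts, a few higher values, rare events.
prob = np.array([0.0] * 90 + [0.05] * 5 + [0.1] * 3 + [0.2, 0.5])
obs = np.array([0] * 90 + [0] * 5 + [0, 0, 1] + [0, 1])

bin_sets = {
    "original bins": [0.0, 0.05, 0.1, 0.2, 0.5, 1.0],
    "extra 0.001 edge": [0.0, 0.001, 0.05, 0.1, 0.2, 0.5, 1.0],
}

print("raw probabilities:", brier_raw(prob, obs))
for name, edges in bin_sets.items():
    print(name + ":", brier_from_table(nx2_table(prob, obs, edges), edges))
```

Because the table only keeps counts per bin, tables from several cases can be added together before the score is recomputed, which is the aggregation convenience mentioned in the thread. The cost is that a forecast of exactly 0 is scored as the mid-point of the first bin, so narrowing that bin moves the binned score toward the raw one.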

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: Michael Erickson - NOAA Affiliate
Time: Tue Sep 08 14:15:00 2020

Hi Eric and John,

Thank you for your response to this matter. What would be the best
practice to take in this situation?

Thanks,

Mike

On Tue, Sep 8, 2020 at 3:41 PM Eric Gilleland via RT
<met_help at ucar.edu>
wrote:

> Hi John,
>
> I agree that if the probabilities have already been binned, then it
is
> strange to then take the midpoint (re-binning).
>
> Eric
>
> On Fri, Sep 4, 2020 at 11:14 AM John Halley Gotway via RT <
> met_help at ucar.edu>
> wrote:
>
> > Barb and Eric,
> >
> > I've added you to this met-help ticket from Mike Erickson from
> > NOAA/WPC. We're hoping to get some advice from one or both of you
> > about probabilistic verification.
> >
> > Mike is running Grid-Stat to verify WPC's Excessive Rainfall Outlooks
> > against StageIV precip. The forecast probability values are always 0,
> > 0.05, 0.1, 0.2, 0.5, or 1.0.
> > When Mike computes the Brier score by hand, it differs from the
> > results reported by Grid-Stat out in the 3rd decimal place.
> >
> > My theory is that the difference is caused by the fact that MET does
> > not compute the Brier score directly on the probability values.
> > Instead, it bins them into an Nx2 probabilistic contingency table and
> > computes the Brier score from that table. And the mid-point of each
> > bin is used in the Brier score computations. So different probability
> > bins will result in a slightly different Brier score.
> >
> > Mike is currently using probability thresholds as follows:
> >    cat_thresh = [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
> >
> > And that's consistent with the probability values. But when you think
> > about it...
> > - Forecasts of 0% fall into the first bin and are evaluated as being a
> >   value of 0.025 (mid-point of the 0.0 to 0.05 bin)
> > - Forecasts of 5% fall into the second bin and are evaluated as being a
> >   value of 0.075 (mid-point of the 0.05 to 0.1 bin)
> > - Forecasts of 10% fall into the third bin and are evaluated as being a
> >   value of 0.150 (mid-point of the 0.1 to 0.2 bin).
> > - and so on for the other probability values
> >
> > Seems like the binning of probability values works better for
> > continuous probability values and not so well for probabilities that
> > have already been binned!
> >
> > I'm wondering if you have any thoughts or advice about this situation?
> >
> > Thanks,
> > John
> >
> > On Fri, Sep 4, 2020 at 10:47 AM Michael Erickson - NOAA Affiliate
via RT
> <
> > met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> > >
> > > Hi John,
> > >
> > > Thanks for your answers and sounds good! That is strange that
the climo
> > > file was not found for your setting. The only detail I can think
of is
> > that
> > > within the climo field, the file_name specification is static:
> > >
> > > file_name = [
> > >
> >
>
"/usr1/wpc_cpgffh/gribs/ERO_verif/static/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc"
> > > ];
> > >
> > >
> > > I believe you concluded that my climo read-in looked correct?
> > >
> > > Thanks,
> > >
> > > Mike
> > >
> > >
> > > On Fri, Sep 4, 2020 at 12:42 PM John Halley Gotway via RT <
> > > met_help at ucar.edu>
> > > wrote:
> > >
> > > > Mike,
> > > >
> > > > 2 more things I forgot to address.
> > > >
> > > > First, I pulled that climo field but when I ran grid_stat with
your
> > > usethis
> > > > config file, it did not actually read the climo data.
> > > >
> > > > DEBUG 3: Found 0 climatology fields.
> > > >
> > > >
> > > > I'm wondering what additional configuration settings you used
to make
> > > this
> > > > work?
> > > >
> > > >
> > > > Second, the answer to your question is yes. The exact same
binning
> > logic
> > > > used for the forecast probabilities is applied to the climo
data. In
> > > fact,
> > > > the forecast probability bins are applied to both the forecast
and
> > climo
> > > > data. So you do not need to define separate "cat_thresh"
settings for
> > the
> > > > climo. They won't be used anyway.
> > > >
> > > >
> > > > Here's the spot in the library code where the climo
probabilistic
> > > > contingency table is created using the forecast probability
bins:
> > > >
> > > >
> > > >
> > > >
> > >
> >
>
https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/compute_stats.cc#L767
> > > >
> > > >
> > > > Thanks,
> > > > John
> > > >
> > > > On Fri, Sep 4, 2020 at 10:25 AM John Halley Gotway
<johnhg at ucar.edu>
> > > > wrote:
> > > >
> > > > > Mike,
> > > > >
> > > > > I don't really have a recommendation on best practices with
regards
> > to
> > > > the
> > > > > binning of probability values.
> > > > >
> > > > > I can say that I more commonly see people choose fixed bin
widths,
> > like
> > > > > "==0.10" (for 10 bins) or "==0.05" (for 20 bins) instead of
> variable
> > > > width
> > > > > bins, such as:
> > > > > [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ]
> > > > >
> > > > > But I suspect that's more out of convenience than anything
else.
> With
> > > > > regards to your chosen bins, I suspect you set them up this
way
> since
> > > you
> > > > > have lots of low probability values closer to 0.0 and
relatively
> few
> > > > > probability values closer to 1.0. While this may be a good
choice
> for
> > > > > relatively rare events, it wouldn't be as good of a choice
for very
> > > > common
> > > > > events resulting in high probability values.
> > > > >
> > > > > Choosing 20 bins (==0.05) would include all of your current
bin
> > > > boundaries
> > > > > and enable you to sample evenly across the probability
space,
> > > regardless
> > > > of
> > > > > whether the values are bunched near 0 or 1. And
mathematically,
> your
> > > > > current bins would be derivable from these.
> > > > >
> > > > > But if your chosen bins follow some existing WPC convention,
I
> don't
> > > see
> > > > > an obvious reason to change them.
> > > > >
> > > > > Please let me know if you'd like me to forward this question
to one
> > of
> > > > the
> > > > > statisticians in our group for their advice.
> > > > >
> > > > > Thanks,
> > > > > John
> > > > >
> > > > > On Fri, Sep 4, 2020 at 5:08 AM Michael Erickson - NOAA
Affiliate
> via
> > > RT <
> > > > > met_help at ucar.edu> wrote:
> > > > >
> > > > >>
> > > > >> <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> > > > >>
> > > > >> Hi John,
> > > > >>
> > > > >> Thank you for your quick and helpful response! To answer
your
> > > questions
> > > > >> from the first email:
> > > > >>
> > > > >> 1) I have included the climo file in case you wanted to see
it:
> > > > >>
> > > > >>
> > > >
> > >
> >
>
https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > > > >>
> > > > >> 2) I start from the netcdf output from grid_stat, load that
data
> > into
> > > > the
> > > > >> python workspace, and compute the brier score from that.
> > > > >>
> > > > >> Also the circle diameter of 9 in the observation file is to
draw a
> > 40
> > > km
> > > > >> radius around the "observation."
> > > > >>
> > > > >> From your latter email, it sounds like I may not be able to
> exactly
> > > > >> replicate the Brier Score calculation. In the spirit of
best
> > > practices,
> > > > >> would you recommend I change cat_thresh  to "= [ >=0.0,
>=0.001,
> > > >=0.05,
> > > > >> >=0.1, >=0.2, >=0.5, >=1.0 ];" or keep my cat_thresh as it
> currently
> > > is
> > > > as
> > > > >> long as I am consistent? I was also wondering if grid_stat
bins
> the
> > > > >> probabilities for the climo field as it does for the
probabilities
> > in
> > > > the
> > > > >> forecast field?
> > > > >>
> > > > >> Thanks again!
> > > > >>
> > > > >> Mike
> > > > >>
> > > > >> On Thu, Sep 3, 2020 at 7:12 PM John Halley Gotway via RT <
> > > > >> met_help at ucar.edu>
> > > > >> wrote:
> > > > >>
> > > > >> > Actually, I have a reasonable guess as to why you may be
seeing
> a
> > > > >> > difference.
> > > > >> >
> > > > >> > All probabilistic verification in MET is based on an Nx2
> > > > >> > probabilistic contingency table. Those are the counts in the PCT
> > > > >> > line type. We do this to make it easier to aggregate statistics
> > > > >> > across multiple cases, by summing up contingency tables before
> > > > >> > recomputing statistics. But the pros/cons of this approach would
> > > > >> > probably be better addressed by a statistician. So the stats are
> > > > >> > computed using probability bins and not raw probability values.
> > > > >> >
> > > > >> > If you went and computed the Brier score by hand, you
probably
> did
> > > so
> > > > >> using
> > > > >> > raw probability values and not binning them first.
> > > > >> >
> > > > >> > And this difference could explain the type of discrepancy
you're
> > > > seeing.
> > > > >> >
> > > > >> > To test this out, I reran your case...
> > > > >> > (1) Using your original settings to confirm your Brier score of
> > > > >> > 0.011934.
> > > > >> > (2) Using 10 equally-spaced probability bins (cat_thresh = [ ==0.1 ];)
> > > > >> > which produced a Brier score of 0.013747.
> > > > >> > (3) Using 50 equally-spaced probability bins (cat_thresh = [ ==0.02 ];)
> > > > >> > which produced a Brier score of 0.01197.
> > > > >> > (4) Using 100 equally-spaced probability bins (cat_thresh = [ ==0.01 ];)
> > > > >> > which produced a Brier score of 0.01193.
> > > > >> >
> > > > >> > I suppose that doesn't explain the exact discrepancy, but could
> > > > >> > definitely be involved.
> > > > >> >
> > > > >> > Notice on this line of the Brier score computation in MET:
> > > > >> >
> > > > >> > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/contable_nx2.cc#L647
> > > > >> >
> > > > >> > that the "probability" value returned by "row_proby()" is the
> > > > >> > mid-point of the bin.
> > > > >> > So all of your forecast probability values of 0% which fall into
> > > > >> > the first bin are actually evaluated as having a probability value
> > > > >> > of 0.025, which is the mid-point between 0 and 0.05 for the first
> > > > >> > bin.
> > > > >> >
> > > > >> > Rerunning using the following to minimize that effect on the 0's:
> > > > >> > cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
> > > > >> > produces a Brier score of 0.011489.
> > > > >> >
> > > > >> > So I'd say that the binning of the probability values is impacting
> > > > >> > the Brier score out in the 4th decimal place.
> > > > >> >
> > > > >> > John
> > > > >> >
> > > > >> > On Thu, Sep 3, 2020 at 4:47 PM John Halley Gotway <
> > johnhg at ucar.edu>
> > > > >> wrote:
> > > > >> >
> > > > >> > > Hi Mike,
> > > > >> > >
> > > > >> > > Looks like you were able to make a lot of progress. I
> certainly
> > > > don't
> > > > >> see
> > > > >> > > anything wrong based on the log messages you sent.
> > > > >> > >
> > > > >> > > I do notice that you're smoothing the observations with
the
> > > maximum
> > > > >> value
> > > > >> > > in a circle of diameter 9... presumably for a good
reason.
> And I
> > > see
> > > > >> that
> > > > >> > > smoothing step indicated in the log messages as well as
the
> > output
> > > > >> .stat
> > > > >> > > file.
> > > > >> > >
> > > > >> > > Two questions.
> > > > >> > >
> > > > >> > > (1) I wanted to try running locally, but didn't find
the
> "climo"
> > > > file
> > > > >> on
> > > > >> > > the WPC ftp site:
> > > > >> > >
> > > > >> > >
> > > > >> >
> > > > >>
> > > >
> > >
> >
>
https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > > > >> > > <
> > > > >> >
> > > > >>
> > > >
> > >
> >
>
https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > > > >> > >
> > > > >> > > Could you add that?
> > > > >> > >
> > > > >> > > (2) When you say that you tried to replicate the Brier
score
> > > > >> computation,
> > > > >> > > what was your starting point? The raw input files or
using the
> > > > NetCDF
> > > > >> > > matched pairs output from Grid-Stat which already
include the
> > > > >> computation
> > > > >> > > of the observation maximums?
> > > > >> > >
> > > > >> > > Thanks,
> > > > >> > > John Halley Gotway
> > > > >> > >
> > > > >> > > On Thu, Sep 3, 2020 at 2:26 PM Michael Erickson - NOAA
> Affiliate
> > > via
> > > > >> RT <
> > > > >> > > met_help at ucar.edu> wrote:
> > > > >> > >
> > > > >> > >>
> > > > >> > >> <URL:
> https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> > >
> > > > >> > >>
> > > > >> > >> Thank you Minna!
> > > > >> > >>
> > > > >> > >> Mike
> > > > >> > >>
> > > > >> > >> On Thu, Sep 3, 2020 at 4:12 PM Minna Win via RT <
> > > met_help at ucar.edu
> > > > >
> > > > >> > >> wrote:
> > > > >> > >>
> > > > >> > >> > Hi Mike,
> > > > >> > >> >
> > > > >> > >> > It looks like you have a few questions associated
with
> > > > calculating
> > > > >> > Brier
> > > > >> > >> > Skill Scores.  I'm assigning this ticket to John
Halley
> > Gotway.
> > > > >> > >> >
> > > > >> > >> > Regards,
> > > > >> > >> > Minna
> > > > >> > >> > ---------------
> > > > >> > >> > Minna Win
> > > > >> > >> > National Center for Atmospheric Research
> > > > >> > >> > Developmental Testbed Center
> > > > >> > >> > Phone: 303-497-8423
> > > > >> > >> > Fax:   303-497-8401
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> > On Thu, Sep 3, 2020 at 1:13 PM Michael Erickson -
NOAA
> > > Affiliate
> > > > >> via
> > > > >> > RT
> > > > >> > >> <
> > > > >> > >> > met_help at ucar.edu> wrote:
> > > > >> > >> >
> > > > >> > >> > >
> > > > >> > >> > > Thu Sep 03 13:13:26 2020: Request 96562 was acted
upon.
> > > > >> > >> > > Transaction: Ticket created by
> michael.j.erickson at noaa.gov
> > > > >> > >> > >        Queue: met_help
> > > > >> > >> > >      Subject: Including Climatology in grid_stat
Config
> > File
> > > > >> > >> > >        Owner: Nobody
> > > > >> > >> > >   Requestors: michael.j.erickson at noaa.gov
> > > > >> > >> > >       Status: new
> > > > >> > >> > >  Ticket <URL:
> > > > >> > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> > > > >> > >> >
> > > > >> > >> > >
> > > > >> > >> > >
> > > > >> > >> > > Greetings,
> > > > >> > >> > >
> > > > >> > >> > > For the first time I am attempting to calculate
Brier
> Skill
> > > > Score
> > > > >> > >> using
> > > > >> > >> > > grid_stat from an input climatology file. I have
created
> a
> > > > >> > >> probabilistic
> > > > >> > >> > > flooding climatology file (spans from zero to one;
image
> is
> > > > here:
> > > > >> > >> > >
> > > > >> >
> > >
https://ftp.wpc.ncep.noaa.gov/erickson/test10/ALL_noMRGL_UFVS_ALL.png
> > > > >> > >> ).
> > > > >> > >> > > This climatology is static, so it doesn't change
with
> time
> > > when
> > > > >> > >> inputting
> > > > >> > >> > > the "model" and "observation" data. I believe I
have
> > > > successfully
> > > > >> > >> gotten
> > > > >> > >> > > this to work using the command:
> > > > >> > >> > >
> > > > >> > >> > > /opt/MET/90/bin/grid_stat
ERO_s2020083112_e2020090112_
> > > vhr09.nc
> > > > >> > >> > > ST4gFFG_s2020083112_e2020090112_vhr09.nc usethis
> -outdir ~
> > > > >> > >> > >
> > > > >> > >> > > where grid_stat
ERO_s2020083112_e2020090112_vhr09.nc are
> > > > >> discrete
> > > > >> > >> > forecast
> > > > >> > >> > > probabilities of 0, 0.05, 0.1, 0.2, and 0.5
> > > > >> > >> > > where ST4gFFG_s2020083112_e2020090112_vhr09.nc are
> > > observation
> > > > >> > values
> > > > >> > >> > of 0
> > > > >> > >> > > or 1
> > > > >> > >> > > and usethis is the configuration file
> > > > >> > >> > >
> > > > >> > >> > > Finally the climatology file that consists of
"almost"
> > > > continuous
> > > > >> > >> values
> > > > >> > >> > > between 0 and 1 is named:
UFVS_ST4gFFG_s2015010100_
> > > > >> > >> e2019123123_vhr12.nc
> > > > >> > >> > >
> > > > >> > >> > > I have put all of these files at
> > > > >> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/ for
> > > > >> > >> > > your reference.
> > > > >> > >> > >
> > > > >> > >> > > As for my questions:
> > > > >> > >> > >
> > > > >> > >> > > 1) I was wondering if the climatology file was
properly
> > > > ingested
> > > > >> and
> > > > >> > >> > > calculated for my example? I believe it is correct
given
> > the
> > > > >> output
> > > > >> > >> > below,
> > > > >> > >> > > but I wanted to make sure, since this is my first
time
> > doing
> > > > >> this:
> > > > >> > >> > >
> > > > >> > >> > >
> > > > >> > >> > > DEBUG 1: Forecast File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ERO_s2020083112_e2020090112_vhr09.nc
> > > > >> > >> > > DEBUG 1: Observation File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ST4gFFG_s2020083112_e2020090112_vhr09.nc
> > > > >> > >> > > DEBUG 3: Reading forecast data for EROSurface.
> > > > >> > >> > > DEBUG 3: Reading observation data for ST4gFFGSurface.
> > > > >> > >> > > DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_NcMet".
> > > > >> > >> > > DEBUG 4:
> > > > >> > >> > > DEBUG 4: Latitude/Longitude Grid Data:
> > > > >> > >> > > DEBUG 4:      lat_ll: 25
> > > > >> > >> > > DEBUG 4:      lon_ll: 129.8
> > > > >> > >> > > DEBUG 4:   delta_lat: 0.09
> > > > >> > >> > > DEBUG 4:   delta_lon: 0.09
> > > > >> > >> > > DEBUG 4:        Nlat: 276
> > > > >> > >> > > DEBUG 4:        Nlon: 721
> > > > >> > >> > > DEBUG 4:
> > > > >> > >> > > DEBUG 4: VarInfoFactory::new_var_info() -> created new VarInfo object of type "FileType_NcMet".
> > > > >> > >> > > DEBUG 3: For forecast valid at 20200901_120000, found 1 climatology field(s) with valid time(s): 20201231_230000
> > > > >> > >> > > DEBUG 3: Found 1 climatology fields.
> > > > >> > >> > > DEBUG 3: Found 1 climatology mean and 0 climatology standard deviation field(s) for forecast EROSurface.
> > > > >> > >> > > DEBUG 2: Processing masking regions.
> > > > >> > >> > > DEBUG 3: Processing grid mask: FULL
> > > > >> > >> > > DEBUG 4: parse_grid_mask() -> parsing grid mask "FULL"
> > > > >> > >> > > DEBUG 2:
> > > > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
> > > > >> > >> > > DEBUG 2:
> > > > >> > >> > > DEBUG 3: Smoothing field using the MAX(49) CircleTemplate interpolation method.
> > > > >> > >> > > DEBUG 2: Processing EROSurface versus ST4gFFGSurface, for smoothing method MAX_CIRCLE(49), over region FULL, using 190638 matched pairs.
> > > > >> > >> > > DEBUG 2: Computing Probabilistic Statistics.
> > > > >> > >> > > DEBUG 2:
> > > > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
> > > > >> > >> > > DEBUG 2:
> > > > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V.stat
> > > > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V_pairs.nc
> > > > >> > >> > >
> > > > >> > >> > >
> > > > >> > >> > > 2) This question is a bit more basic. I am unable
to
> > manually
> > > > >> > >> calculate a
> > > > >> > >> > > Brier Score value for the forecast and observation
that
> > > > properly
> > > > >> > >> matches
> > > > >> > >> > > that in the stat file. My manually calculated
Brier Score
> > is
> > > > >> > >> > systematically
> > > > >> > >> > > lower. For this event, the stat file BS is 0.0119
and my
> > > value
> > > > is
> > > > >> > >> 0.0116.
> > > > >> > >> > > I've looked at C3 in the MET Tutorial guide
> > > > >> > >> > > <https://dtcenter.org/sites/default/files/community-code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf>,
> > > > >> > >> > > but I'm still at a bit of a loss. Is there a
simple way I
> > can
> > > > >> > >> replicate
> > > > >> > >> > the
> > > > >> > >> > > calculation seen in the stat file?
> > > > >> > >> > >
> > > > >> > >> > > Thank you again for your help and please let me
know if
> you
> > > > have
> > > > >> any
> > > > >> > >> > > questions.
> > > > >> > >> > >
> > > > >> > >> > > Mike
> > > > >> > >> > >
> > > > >> > >> > > --
> > > > >> > >> > > Michael J. Erickson
> > > > >> > >> > >
> > > > >> > >> > > Research Scientist
> > > > >> > >> > > Cooperative Institute for Research in
Environmental
> > Sciences
> > > > >> (CIRES)
> > > > >> > >> > > NOAA/NWS/Weather Prediction Center
> > > > >> > >> > > Phone:  301-683-1546
> > > > >> > >> > >
> > > > >> > >> > >
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >>
> > > > >> > >> --
> > > > >> > >> Michael J. Erickson
> > > > >> > >>
> > > > >> > >> Research Scientist
> > > > >> > >> Cooperative Institute for Research in Environmental
Sciences
> > > > (CIRES)
> > > > >> > >> NOAA/NWS/Weather Prediction Center
> > > > >> > >> Phone:  301-683-1546
> > > > >> > >>
> > > > >> > >>
> > > > >> >
> > > > >> >
> > > > >>
> > > > >> --
> > > > >> Michael J. Erickson
> > > > >>
> > > > >> Research Scientist
> > > > >> Cooperative Institute for Research in Environmental
Sciences
> (CIRES)
> > > > >> NOAA/NWS/Weather Prediction Center
> > > > >> Phone:  301-683-1546
> > > > >>
> > > > >>
> > > >
> > > >
> > >
> > > --
> > > Michael J. Erickson
> > >
> > > Research Scientist
> > > Cooperative Institute for Research in Environmental Sciences
(CIRES)
> > > NOAA/NWS/Weather Prediction Center
> > > Phone:  301-683-1546
> > >
> > >
> >
> >
>
>

--
Michael J. Erickson

Research Scientist
Cooperative Institute for Research in Environmental Sciences (CIRES)
NOAA/NWS/Weather Prediction Center
Phone:  301-683-1546

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: Barbara Brown
Time: Tue Sep 08 14:55:11 2020

I agree with Eric and John. The way MET does this generally makes sense for
ensemble forecasts (or other cases when you want MET to select the
thresholds) but not for the case when the probabilities for specific
categories are provided by the user. I'm not sure what the workaround
might be (John may have ideas!) but in the long run it would be good to
allow for this option.

Barb
---
Barbara Brown, Senior Research Associate
Research Applications Laboratory
NCAR PO Box 3000
Boulder CO 80307-3000 USA
Ph: +1 303 497 8468  FAX: +1 303 497 8401

On Tue, Sep 8, 2020 at 2:14 PM Michael Erickson - NOAA Affiliate <
michael.j.erickson at noaa.gov> wrote:

> Hi Eric and John,
>
> Thank you for your response to this matter. What would be the best
> practice to take in this situation?
>
> Thanks,
>
> Mike
>
> On Tue, Sep 8, 2020 at 3:41 PM Eric Gilleland via RT
<met_help at ucar.edu>
> wrote:
>
>> Hi John,
>>
>> I agree that if the probabilities have already been binned, then it
is
>> strange to then take the midpoint (re-binning).
>>
>> Eric
>>
>> On Fri, Sep 4, 2020 at 11:14 AM John Halley Gotway via RT <
>> met_help at ucar.edu>
>> wrote:
>>
>> > Barb and Eric,
>> >
>> > I've added you to this met-help ticket from Mike Erickson from NOAA/WPC.
>> > We're hoping to get some advice from one or both of you about
>> > probabilistic verification.
>> >
>> > Mike is running Grid-Stat to verify WPC's Excessive Rainfall Outlooks
>> > against StageIV precip. The forecast probability values are always 0,
>> > 0.05, 0.1, 0.2, 0.5, or 1.0.
>> > When Mike computes the Brier score by hand, it differs from the results
>> > reported by Grid-Stat out in the 3rd decimal place.
>> >
>> > My theory is that the difference is caused by the fact that MET does not
>> > compute the Brier score directly on the probability values. Instead, it
>> > bins them into an Nx2 probabilistic contingency table and computes the
>> > Brier score from that table. And the mid-point of each bin is used in the
>> > Brier score computations. So different probability bins will result in a
>> > slightly different Brier score.
>> >
>> > Mike is currently using probability thresholds as follows:
>> >    cat_thresh = [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
>> >
>> > And that's consistent with the probability values. But when you think
>> > about it...
>> > - Forecasts of 0% fall into the first bin and are evaluated as being a
>> > value of 0.025 (mid-point of the 0.0 to 0.05 bin)
>> > - Forecasts of 5% fall into the second bin and are evaluated as being a
>> > value of 0.075 (mid-point of the 0.05 to 0.1 bin)
>> > - Forecasts of 10% fall into the third bin and are evaluated as being a
>> > value of 0.150 (mid-point of the 0.1 to 0.2 bin).
>> > - and so on for the other probability values
>> >
>> > Seems like the binning of probability values works better for continuous
>> > probability values and not so well for probabilities that have already
>> > been binned!
>> >
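The mid-point effect described above can be reproduced with a short Python
sketch. This is only an illustration of the idea, not MET's code: the function
names brier_raw and brier_binned and the sample forecast/observation values are
made up here, and a forecast equal to the last threshold is assumed to fall in
the last bin.

```python
from bisect import bisect_right

def brier_raw(probs, obs):
    """Brier score computed directly on the forecast probabilities."""
    return sum((p - o) ** 2 for p, o in zip(probs, obs)) / len(probs)

def brier_binned(probs, obs, edges):
    """Brier score with each forecast replaced by the mid-point of its bin.

    edges are ascending bin boundaries, e.g. [0.0, 0.05, 0.1, 0.2, 0.5, 1.0].
    Assumption: a value equal to the last edge goes into the last bin.
    """
    nbins = len(edges) - 1
    total = 0.0
    for p, o in zip(probs, obs):
        # Index of the bin whose lower edge is <= p (clamped to the last bin).
        i = min(bisect_right(edges, p) - 1, nbins - 1)
        mid = 0.5 * (edges[i] + edges[i + 1])
        total += (mid - o) ** 2
    return total / len(probs)

# Hypothetical sample data, not taken from the ERO/StageIV files.
probs = [0.0, 0.0, 0.0, 0.05, 0.1, 0.2, 0.5]
obs = [0, 0, 0, 0, 0, 1, 1]

coarse = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]
with_thin_first_bin = [0.0, 0.001, 0.05, 0.1, 0.2, 0.5, 1.0]

print("raw:             ", brier_raw(probs, obs))
print("binned (coarse): ", brier_binned(probs, obs, coarse))
print("binned (thin 0): ", brier_binned(probs, obs, with_thin_first_bin))
```

With the coarse thresholds a 0% forecast is scored as 0.025, while adding the
>=0.001 threshold moves it to 0.0005, which is why the choice of bins changes
the resulting score.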
>> > I'm wondering if you have any thoughts or advice about this
situation?
>> >
>> > Thanks,
>> > John
>> >
>> > On Fri, Sep 4, 2020 at 10:47 AM Michael Erickson - NOAA Affiliate
via
>> RT <
>> > met_help at ucar.edu> wrote:
>> >
>> > >
>> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
>> > >
>> > > Hi John,
>> > >
>> > > Thanks for your answers and sounds good! That is strange that
the
>> climo
>> > > file was not found for your setting. The only detail I can
think of is
>> > that
>> > > within the climo field, the file_name specification is static:
>> > >
>> > > file_name = [
>> > >    "/usr1/wpc_cpgffh/gribs/ERO_verif/static/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc"
>> > > ];
>> > >
>> > >
>> > > I believe you concluded that my climo read-in looked correct?
>> > >
>> > > Thanks,
>> > >
>> > > Mike
>> > >
>> > >
>> > > On Fri, Sep 4, 2020 at 12:42 PM John Halley Gotway via RT <
>> > > met_help at ucar.edu>
>> > > wrote:
>> > >
>> > > > Mike,
>> > > >
>> > > > 2 more things I forgot to address.
>> > > >
>> > > > First, I pulled that climo field but when I ran grid_stat
with your
>> > > usethis
>> > > > config file, it did not actually read the climo data.
>> > > >
>> > > > DEBUG 3: Found 0 climatology fields.
>> > > >
>> > > >
>> > > > I'm wondering what additional configuration settings you used
to
>> make
>> > > this
>> > > > work?
>> > > >
>> > > >
>> > > > Second, the answer to your question is yes. The exact same
binning
>> > logic
>> > > > used for the forecast probabilities is applied to the climo
data. In
>> > > fact,
>> > > > the forecast probability bins are applied to both the
forecast and
>> > climo
>> > > > data. So you do not need to define separate "cat_thresh"
settings
>> for
>> > the
>> > > > climo. They won't be used anyway.
>> > > >
>> > > >
>> > > > Here's the spot in the library code where the climo probabilistic
>> > > > contingency table is created using the forecast probability bins:
>> > > >
>> > > > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/compute_stats.cc#L767
>> > > >
>> > > >
>> > > > Thanks,
>> > > > John
>> > > >
>> > > > On Fri, Sep 4, 2020 at 10:25 AM John Halley Gotway
<johnhg at ucar.edu
>> >
>> > > > wrote:
>> > > >
>> > > > > Mike,
>> > > > >
>> > > > > I don't really have a recommendation on best practices with regards
>> > > > > to the binning of probability values.
>> > > > >
>> > > > > I can say that I more commonly see people choose fixed bin widths,
>> > > > > like "==0.10" (for 10 bins) or "==0.05" (for 20 bins) instead of
>> > > > > variable width bins, such as:
>> > > > > [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ]
>> > > > >
>> > > > > But I suspect that's more out of convenience than anything else. With
>> > > > > regards to your chosen bins, I suspect you set them up this way since
>> > > > > you have lots of low probability values closer to 0.0 and relatively
>> > > > > few probability values closer to 1.0. While this may be a good choice
>> > > > > for relatively rare events, it wouldn't be as good of a choice for
>> > > > > very common events resulting in high probability values.
>> > > > >
>> > > > > Choosing 20 bins (==0.05) would include all of your current bin
>> > > > > boundaries and enable you to sample evenly across the probability
>> > > > > space, regardless of whether the values are bunched near 0 or 1. And
>> > > > > mathematically, your current bins would be derivable from these.
>> > > > >
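The claim that 20 equal-width bins contain every one of the current bin
boundaries can be checked with a small Python sketch. The helper
equal_width_edges is hypothetical and only mimics what the "==0.05" shorthand
is described as meaning in this thread; it is not MET's config parser.

```python
def equal_width_edges(width):
    """Return bin edges 0.0, width, 2*width, ..., 1.0.

    Hypothetical helper: assumes the width divides 1.0 evenly, as in the
    "==0.10" (10 bins) and "==0.05" (20 bins) examples.
    """
    n = round(1.0 / width)
    if abs(n * width - 1.0) > 1e-9:
        raise ValueError("width must divide 1.0 evenly")
    # Round to suppress floating-point noise such as 0.15000000000000002.
    return [round(i * width, 10) for i in range(n + 1)]

current_edges = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]
fine_edges = equal_width_edges(0.05)

print("number of bins:", len(fine_edges) - 1)
print("all current boundaries present:",
      all(edge in fine_edges for edge in current_edges))
```

Because every current boundary is also a boundary of the finer set, counts in
the variable-width bins can be recovered by summing adjacent fine bins.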
>> > > > > But if your chosen bins follow some existing WPC
convention, I
>> don't
>> > > see
>> > > > > an obvious reason to change them.
>> > > > >
>> > > > > Please let me know if you'd like me to forward this
question to
>> one
>> > of
>> > > > the
>> > > > > statisticians in our group for their advice.
>> > > > >
>> > > > > Thanks,
>> > > > > John
>> > > > >
>> > > > > On Fri, Sep 4, 2020 at 5:08 AM Michael Erickson - NOAA
Affiliate
>> via
>> > > RT <
>> > > > > met_help at ucar.edu> wrote:
>> > > > >
>> > > > >>
>> > > > >> <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
>> > > > >>
>> > > > >> Hi John,
>> > > > >>
>> > > > >> Thank you for your quick and helpful response! To answer
your
>> > > questions
>> > > > >> from the first email:
>> > > > >>
>> > > > >> 1) I have included the climo file in case you wanted to
see it:
>> > > > >>
>> > > > >>
>> > > >
>> > >
>> >
>>
https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
>> > > > >>
>> > > > >> 2) I start from the netcdf output from grid_stat, load
that data
>> > into
>> > > > the
>> > > > >> python workspace, and compute the brier score from that.
>> > > > >>
>> > > > >> Also the circle diameter of 9 in the observation file is
to draw
>> a
>> > 40
>> > > km
>> > > > >> radius around the "observation."
>> > > > >>
>> > > > >> From your latter email, it sounds like I may not be able to exactly
>> > > > >> replicate the Brier Score calculation. In the spirit of best
>> > > > >> practices, would you recommend I change cat_thresh to
>> > > > >> "= [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];" or keep
>> > > > >> my cat_thresh as it currently is as long as I am consistent? I was
>> > > > >> also wondering if grid_stat bins the probabilities for the climo
>> > > > >> field as it does for the probabilities in the forecast field?
>> > > > >>
>> > > > >> Thanks again!
>> > > > >>
>> > > > >> Mike
>> > > > >>
>> > > > >> On Thu, Sep 3, 2020 at 7:12 PM John Halley Gotway via RT <
>> > > > >> met_help at ucar.edu>
>> > > > >> wrote:
>> > > > >>
>> > > > >> > Actually, I have a reasonable guess as to why you may be
>> seeing a
>> > > > >> > difference.
>> > > > >> >
>> > > > >> > All probabilistic verification in MET is based on an Nx2
>> > > > >> > probabilistic contingency table. Those are the counts in the PCT
>> > > > >> > line type. We do this to make it easier to aggregate statistics
>> > > > >> > across multiple cases, by summing up contingency tables before
>> > > > >> > recomputing statistics. But the pros/cons of this approach would
>> > > > >> > probably be better addressed by a statistician. So the stats are
>> > > > >> > computed using probability bins and not raw probability values.
>> > > > >> > If you went and computed the Brier score by hand, you
probably
>> did
>> > > so
>> > > > >> using
>> > > > >> > raw probability values and not binning them first.
>> > > > >> >
>> > > > >> > And this difference could explain the type of
discrepancy
>> you're
>> > > > seeing.
>> > > > >> >
>> > > > >> > To test this out, I reran your case...
>> > > > >> > (1) Using your original settings to confirm your Brier score of
>> > > > >> > 0.011934.
>> > > > >> > (2) Using 10 equally-spaced probability bins (cat_thresh = [ ==0.1 ];)
>> > > > >> > which produced a Brier score of 0.013747.
>> > > > >> > (3) Using 50 equally-spaced probability bins (cat_thresh = [ ==0.02 ];)
>> > > > >> > which produced a Brier score of 0.01197.
>> > > > >> > (4) Using 100 equally-spaced probability bins (cat_thresh = [ ==0.01 ];)
>> > > > >> > which produced a Brier score of 0.01193.
>> > > > >> >
>> > > > >> > I suppose that doesn't explain the exact discrepancy, but could
>> > > > >> > definitely be involved.
>> > > > >> >
>> > > > >> > Notice on this line of the brier score computation in MET:
>> > > > >> >
>> > > > >> > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/contable_nx2.cc#L647
>> > > > >> >
>> > > > >> > That the "probability" value returned by "row_proby()" is the
>> > > > >> > mid-point of the bin.
>> > > > >> > So all of your forecast probability values of 0% which fall into the
>> > > > >> > first bin are actually evaluated as having a probability value of
>> > > > >> > 0.025 which is the mid-point between 0 and 0.05 for the first bin.
>> > > > >> >
>> > > > >> > Rerunning using the following to minimize that effect on the 0's:
>> > > > >> > cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
>> > > > >> > produces a brier score of 0.011489.
>> > > > >> >
>> > > > >> > So I'd say that the binning of the probability values is impacting
>> > > > >> > the Brier score out in the 4th decimal place.
>> > > > >> >
>> > > > >> > John
>> > > > >> >
>> > > > >> > On Thu, Sep 3, 2020 at 4:47 PM John Halley Gotway <
>> > johnhg at ucar.edu>
>> > > > >> wrote:
>> > > > >> >
>> > > > >> > > Hi Mike,
>> > > > >> > >
>> > > > >> > > Looks like you were able to make a lot of progress. I
>> certainly
>> > > > don't
>> > > > >> see
>> > > > >> > > anything wrong based on the log messages you sent.
>> > > > >> > >
>> > > > >> > > I do notice that you're smoothing the observations
with the
>> > > maximum
>> > > > >> value
>> > > > >> > > in a circle of diameter 9... presumably for a good
reason.
>> And I
>> > > see
>> > > > >> that
>> > > > >> > > smoothing step indicated in the log messages as well
as the
>> > output
>> > > > >> .stat
>> > > > >> > > file.
>> > > > >> > >
>> > > > >> > > Two questions.
>> > > > >> > >
>> > > > >> > > (1) I wanted to try running locally, but didn't find the "climo"
>> > > > >> > > file on the WPC ftp site:
>> > > > >> > >
>> > > > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
>> > > > >> > >
>> > > > >> > > Could you add that?
>> > > > >> > >
>> > > > >> > > (2) When you say that you tried to replicate the Brier
score
>> > > > >> computation,
>> > > > >> > > what was your starting point? The raw input files or
using
>> the
>> > > > NetCDF
>> > > > >> > > matched pairs output from Grid-Stat which already
include the
>> > > > >> computation
>> > > > >> > > of the observation maximums?
>> > > > >> > >
>> > > > >> > > Thanks,
>> > > > >> > > John Halley Gotway
>> > > > >> > >
>> > > > >> > > On Thu, Sep 3, 2020 at 2:26 PM Michael Erickson - NOAA
>> Affiliate
>> > > via
>> > > > >> RT <
>> > > > >> > > met_help at ucar.edu> wrote:
>> > > > >> > >
>> > > > >> > >>
>> > > > >> > >> <URL:
>> https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
>> > >
>> > > > >> > >>
>> > > > >> > >> Thank you Minna!
>> > > > >> > >>
>> > > > >> > >> Mike
>> > > > >> > >>
>> > > > >> > >> On Thu, Sep 3, 2020 at 4:12 PM Minna Win via RT <
>> > > met_help at ucar.edu
>> > > > >
>> > > > >> > >> wrote:
>> > > > >> > >>
>> > > > >> > >> > Hi Mike,
>> > > > >> > >> >
>> > > > >> > >> > It looks like you have a few questions associated
with
>> > > > calculating
>> > > > >> > Brier
>> > > > >> > >> > Skill Scores.  I'm assigning this ticket to John
Halley
>> > Gotway.
>> > > > >> > >> >
>> > > > >> > >> > Regards,
>> > > > >> > >> > Minna
>> > > > >> > >> > ---------------
>> > > > >> > >> > Minna Win
>> > > > >> > >> > National Center for Atmospheric Research
>> > > > >> > >> > Developmental Testbed Center
>> > > > >> > >> > Phone: 303-497-8423
>> > > > >> > >> > Fax:   303-497-8401
>> > > > >> > >> >
>> > > > >> > >> >
>> > > > >> > >> >
>> > > > >> > >> > On Thu, Sep 3, 2020 at 1:13 PM Michael Erickson -
NOAA
>> > > Affiliate
>> > > > >> via
>> > > > >> > RT
>> > > > >> > >> <
>> > > > >> > >> > met_help at ucar.edu> wrote:
>> > > > >> > >> >
>> > > > >> > >> > >
>> > > > >> > >> > > Thu Sep 03 13:13:26 2020: Request 96562 was acted
upon.
>> > > > >> > >> > > Transaction: Ticket created by
>> michael.j.erickson at noaa.gov
>> > > > >> > >> > >        Queue: met_help
>> > > > >> > >> > >      Subject: Including Climatology in grid_stat
Config
>> > File
>> > > > >> > >> > >        Owner: Nobody
>> > > > >> > >> > >   Requestors: michael.j.erickson at noaa.gov
>> > > > >> > >> > >       Status: new
>> > > > >> > >> > >  Ticket <URL:
>> > > > >> > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
>> > > > >> > >> >
>> > > > >> > >> > >
>> > > > >> > >> > >
>> > > > >> > >> > > Greetings,
>> > > > >> > >> > >
>> > > > >> > >> > > For the first time I am attempting to calculate
Brier
>> Skill
>> > > > Score
>> > > > >> > >> using
>> > > > >> > >> > > grid_stat from an input climatology file. I have
>> created a
>> > > > >> > >> probabilistic
>> > > > >> > >> > > flooding climatology file (spans from zero to
one;
>> image is
>> > > > here:
>> > > > >> > >> > >
>> > > > >> >
>> > >
https://ftp.wpc.ncep.noaa.gov/erickson/test10/ALL_noMRGL_UFVS_ALL.png
>> > > > >> > >> ).
>> > > > >> > >> > > This climatology is static, so it doesn't change
with
>> time
>> > > when
>> > > > >> > >> inputting
>> > > > >> > >> > > the "model" and "observation" data. I believe I
have
>> > > > successfully
>> > > > >> > >> gotten
>> > > > >> > >> > > this to work using the command:
>> > > > >> > >> > >
>> > > > >> > >> > > /opt/MET/90/bin/grid_stat
ERO_s2020083112_e2020090112_
>> > > vhr09.nc
>> > > > >> > >> > > ST4gFFG_s2020083112_e2020090112_vhr09.nc usethis
>> -outdir ~
>> > > > >> > >> > >
>> > > > >> > >> > > where grid_stat
ERO_s2020083112_e2020090112_vhr09.nc
>> are
>> > > > >> discrete
>> > > > >> > >> > forecast
>> > > > >> > >> > > probabilities of 0, 0.05, 0.1, 0.2, and 0.5
>> > > > >> > >> > > where ST4gFFG_s2020083112_e2020090112_vhr09.nc
are
>> > > observation
>> > > > >> > values
>> > > > >> > >> > of 0
>> > > > >> > >> > > or 1
>> > > > >> > >> > > and usethis is the configuration file
>> > > > >> > >> > >
>> > > > >> > >> > > Finally the climatology file that consists of
"almost"
>> > > > continuous
>> > > > >> > >> values
>> > > > >> > >> > > between 0 and 1 is named:
UFVS_ST4gFFG_s2015010100_
>> > > > >> > >> e2019123123_vhr12.nc
>> > > > >> > >> > >
>> > > > >> > >> > > I have put all of these files at
>> > > > >> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/ for
>> > > > >> > >> > > your reference.
>> > > > >> > >> > >
>> > > > >> > >> > > As for my questions:
>> > > > >> > >> > >
>> > > > >> > >> > > 1) I was wondering if the climatology file was
properly
>> > > > ingested
>> > > > >> and
>> > > > >> > >> > > calculated for my example? I believe it is
correct given
>> > the
>> > > > >> output
>> > > > >> > >> > below,
>> > > > >> > >> > > but I wanted to make sure, since this is my first
time
>> > doing
>> > > > >> this:
>> > > > >> > >> > >
>> > > > >> > >> > > DEBUG 1: Forecast File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ERO_s2020083112_e2020090112_vhr09.nc
>> > > > >> > >> > > DEBUG 1: Observation File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ST4gFFG_s2020083112_e2020090112_vhr09.nc
>> > > > >> > >> > > DEBUG 3: Reading forecast data for EROSurface.
>> > > > >> > >> > > DEBUG 3: Reading observation data for ST4gFFGSurface.
>> > > > >> > >> > > DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_NcMet".
>> > > > >> > >> > > DEBUG 4:
>> > > > >> > >> > > DEBUG 4: Latitude/Longitude Grid Data:
>> > > > >> > >> > > DEBUG 4:      lat_ll: 25
>> > > > >> > >> > > DEBUG 4:      lon_ll: 129.8
>> > > > >> > >> > > DEBUG 4:   delta_lat: 0.09
>> > > > >> > >> > > DEBUG 4:   delta_lon: 0.09
>> > > > >> > >> > > DEBUG 4:        Nlat: 276
>> > > > >> > >> > > DEBUG 4:        Nlon: 721
>> > > > >> > >> > > DEBUG 4:
>> > > > >> > >> > > DEBUG 4: VarInfoFactory::new_var_info() -> created new VarInfo object of type "FileType_NcMet".
>> > > > >> > >> > > DEBUG 3: For forecast valid at 20200901_120000, found 1 climatology field(s) with valid time(s): 20201231_230000
>> > > > >> > >> > > DEBUG 3: Found 1 climatology fields.
>> > > > >> > >> > > DEBUG 3: Found 1 climatology mean and 0 climatology standard deviation field(s) for forecast EROSurface.
>> > > > >> > >> > > DEBUG 2: Processing masking regions.
>> > > > >> > >> > > DEBUG 3: Processing grid mask: FULL
>> > > > >> > >> > > DEBUG 4: parse_grid_mask() -> parsing grid mask "FULL"
>> > > > >> > >> > > DEBUG 2:
>> > > > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
>> > > > >> > >> > > DEBUG 2:
>> > > > >> > >> > > DEBUG 3: Smoothing field using the MAX(49) CircleTemplate interpolation method.
>> > > > >> > >> > > DEBUG 2: Processing EROSurface versus ST4gFFGSurface, for smoothing method MAX_CIRCLE(49), over region FULL, using 190638 matched pairs.
>> > > > >> > >> > > DEBUG 2: Computing Probabilistic Statistics.
>> > > > >> > >> > > DEBUG 2:
>> > > > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
>> > > > >> > >> > > DEBUG 2:
>> > > > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V.stat
>> > > > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V_pairs.nc
>> > > > >> > >> > >
>> > > > >> > >> > >
>> > > > >> > >> > > 2) This question is a bit more basic. I am unable to manually
>> > > > >> > >> > > calculate a Brier Score value for the forecast and observation
>> > > > >> > >> > > that properly matches that in the stat file. My manually
>> > > > >> > >> > > calculated Brier Score is systematically lower. For this event,
>> > > > >> > >> > > the stat file BS is 0.0119 and my value is 0.0116. I've looked
>> > > > >> > >> > > at C3 in the MET Tutorial guide
>> > > > >> > >> > > <https://dtcenter.org/sites/default/files/community-code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf>,
>> > > > >> > >> > > but I'm still at a bit of a loss. Is there a simple way I can
>> > > > >> > >> > > replicate the calculation seen in the stat file?
>> > > > >> > >> > >
>> > > > >> > >> > > Thank you again for your help and please let me
know if
>> you
>> > > > have
>> > > > >> any
>> > > > >> > >> > > questions.
>> > > > >> > >> > >
>> > > > >> > >> > > Mike
>> > > > >> > >> > >
>> > > > >> > >> > > --
>> > > > >> > >> > > Michael J. Erickson
>> > > > >> > >> > >
>> > > > >> > >> > > Research Scientist
>> > > > >> > >> > > Cooperative Institute for Research in
Environmental
>> > Sciences
>> > > > >> (CIRES)
>> > > > >> > >> > > NOAA/NWS/Weather Prediction Center
>> > > > >> > >> > > Phone:  301-683-1546
>> > > > >> > >> > >
>> > > > >> > >> > >

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: Michael Erickson - NOAA Affiliate
Time: Wed Sep 09 15:50:02 2020

Thanks Everyone for your helpful responses.

I have been using grid_stat for WPC's Excessive Rainfall Outlook
(consisting of probabilities of 0, 0.05, 0.1, 0.2, and 0.5) for years.
These results are dependent upon MET, so I wanted to make sure I am
following best practices.

What is DTC's guidance on how to proceed with this? Should I change my
cat_thresh to
"= [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];"
or is my current setting fine given that both the "forecast" and
"observation" are broken down by the same discrete increments? I can
also just calculate Brier Score manually outside of grid_stat.

Thanks,

Mike

On Tue, Sep 8, 2020 at 4:56 PM Barbara Brown via RT
<met_help at ucar.edu>
wrote:

> I agree with Eric and John. The way MET does this generally makes sense
> for ensemble forecasts (or other cases when you want MET to select the
> thresholds) but not for the case when the probabilities for specific
> categories are provided by the user.  I'm not sure what the workaround
> might be (John may have ideas!) but in the long run it would be good to
> allow for this option.
>
> Barb
> ---
> Barbara Brown, Senior Research Associate
> Research Applications Laboratory
> NCAR PO Box 3000
> Boulder CO 80307-3000 USA
> Ph: +1 303 497 8468  FAX: +1 303 497 8401
>
>
> On Tue, Sep 8, 2020 at 2:14 PM Michael Erickson - NOAA Affiliate <
> michael.j.erickson at noaa.gov> wrote:
>
> > Hi Eric and John,
> >
> > Thank you for your response to this matter. What would be the best
> > practice to take in this situation?
> >
> > Thanks,
> >
> > Mike
> >
> > On Tue, Sep 8, 2020 at 3:41 PM Eric Gilleland via RT
<met_help at ucar.edu>
> > wrote:
> >
> >> Hi John,
> >>
> >> I agree that if the probabilities have already been binned, then
it is
> >> strange to then take the midpoint (re-binning).
> >>
> >> Eric
> >>
> >> On Fri, Sep 4, 2020 at 11:14 AM John Halley Gotway via RT <
> >> met_help at ucar.edu>
> >> wrote:
> >>
> >> > Barb and Eric,
> >> >
> >> > I've added you to this met-help ticket from Mike Erickson from
> NOAA/WPC.
> >> > We're hoping to get some advice from one or both of you about
> >> probabilistic
> >> > verification.
> >> >
> >> > Mike is running Grid-Stat to verify WPC's Excessive Rainfall
Outlooks
> >> > against StageIV precip. The forecast probability values are
always 0,
> >> 0.05,
> >> > 0.1, 0.2, 0.5, or 1.0.
> >> > When Mike computes the Brier score by hand, it differs from the
> results
> >> > reported by Grid-Stat out in the 3rd decimal place.
> >> >
> >> > My theory is that the difference is caused by the fact that MET
does
> not
> >> > compute the Brier score directly on the probability values.
Instead,
> it
> >> > bins them into an Nx2 probabilistic contingency table and
computes the
> >> > Brier score from that table. And the mid-point of each bin is
used in
> >> the
> >> > Brier score computations. So different probability bins will
result
> in a
> >> > slightly different Brier score.
> >> >
> >> > Mike is currently using probability thresholds as follows:
> >> >    cat_thresh = [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
> >> >
> >> > And that's consistent with the probability values. But when you
think
> >> about
> >> > it...
> >> > - Forecasts of 0% fall into the first bin and are evaluated as
being a
> >> > value of 0.025 (mid-point of the 0.0 to 0.05 bin)
> >> > - Forecasts of 5% fall into the second bin and are evaluated as
being
> a
> >> > value of 0.075 (mid-point of the 0.05 to 0.1 bin)
> >> > - Forecasts of 10% fall into the third bin and are evaluated as
being
> a
> >> > value of 0.150 (mid-point of the 0.1 to 0.2 bin).
> >> > - and so on for the other probability values
> >> >
> >> > Seems like the binning of probability values works better for
> continuous
> >> > probability values and not so well for probabilities that have
already
> >> been
> >> > binned!
> >> >
> >> > I'm wondering if you have any thoughts or advice about this
situation?
> >> >
> >> > Thanks,
> >> > John
> >> >
> >> > On Fri, Sep 4, 2020 at 10:47 AM Michael Erickson - NOAA
Affiliate via
> >> RT <
> >> > met_help at ucar.edu> wrote:
> >> >
> >> > >
> >> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
>
> >> > >
> >> > > Hi John,
> >> > >
> >> > > Thanks for your answers and sounds good! That is strange that
the
> >> climo
> >> > > file was not found for your setting. The only detail I can
think of
> is
> >> > that
> >> > > within the climo field, the file_name specification is
static:
> >> > >
> >> > > file_name = [
> >> > >
> >> >
> >>
>
"/usr1/wpc_cpgffh/gribs/ERO_verif/static/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc"
> >> > > ];
> >> > >
> >> > >
> >> > > I believe you concluded that my climo read-in looked correct?
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Mike
> >> > >
> >> > >
> >> > > On Fri, Sep 4, 2020 at 12:42 PM John Halley Gotway via RT <
> >> > > met_help at ucar.edu>
> >> > > wrote:
> >> > >
> >> > > > Mike,
> >> > > >
> >> > > > 2 more things I forgot to address.
> >> > > >
> >> > > > First, I pulled that climo field but when I ran grid_stat
with
> your
> >> > > usethis
> >> > > > config file, it did not actually read the climo data.
> >> > > >
> >> > > > DEBUG 3: Found 0 climatology fields.
> >> > > >
> >> > > >
> >> > > > I'm wondering what additional configuration settings you
used to
> >> make
> >> > > this
> >> > > > work?
> >> > > >
> >> > > >
> >> > > > Second, the answer to your question is yes. The exact same
binning
> >> > logic
> >> > > > used for the forecast probabilities is applied to the climo
data.
> In
> >> > > fact,
> >> > > > the forecast probability bins are applied to both the
forecast and
> >> > climo
> >> > > > data. So you do not need to define separate "cat_thresh"
settings
> >> for
> >> > the
> >> > > > climo. They won't be used anyway.
> >> > > >
> >> > > >
> >> > > > Here's the spot in the library code where the climo
probabilistic
> >> > > > contingency table is created using the forecast probability
bins:
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
>
https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/compute_stats.cc#L767
> >> > > >
> >> > > >
> >> > > > Thanks,
> >> > > > John
> >> > > >
> >> > > > On Fri, Sep 4, 2020 at 10:25 AM John Halley Gotway <
> johnhg at ucar.edu
> >> >
> >> > > > wrote:
> >> > > >
> >> > > > > Mike,
> >> > > > >
> >> > > > > I don't really have a recommendation on best practices
with
> >> regards
> >> > to
> >> > > > the
> >> > > > > binning of probability values.
> >> > > > >
> >> > > > > I can say that I more commonly see people choose fixed
bin
> widths,
> >> > like
> >> > > > > "==0.10" (for 10 bins) or "==0.05" (for 20 bins) instead
of
> >> variable
> >> > > > width
> >> > > > > bins, such as:
> >> > > > > [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ]
> >> > > > >
> >> > > > > But I suspect that's more out of convenience than
anything else.
> >> With
> >> > > > > regards to your chosen bins, I suspect you set them up
this way
> >> since
> >> > > you
> >> > > > > have lots of low probability values closer to 0.0 and
relatively
> >> few
> >> > > > > probability values closer to 1.0. While this may be a
good
> choice
> >> for
> >> > > > > relatively rare events, it wouldn't be as good of a
choice for
> >> very
> >> > > > common
> >> > > > > events resulting in high probability values.
> >> > > > >
> >> > > > > Choosing 20 bins (==0.05) would include all of your
current bin
> >> > > > boundaries
> >> > > > > and enable you to sample evenly across the probability
space,
> >> > > regardless
> >> > > > of
> >> > > > > whether the values are bunched near 0 or 1. And
mathematically,
> >> your
> >> > > > > current bins would be derivable from these.
> >> > > > >
> >> > > > > But if your chosen bins follow some existing WPC
convention, I
> >> don't
> >> > > see
> >> > > > > an obvious reason to change them.
> >> > > > >
> >> > > > > Please let me know if you'd like me to forward this
question to
> >> one
> >> > of
> >> > > > the
> >> > > > > statisticians in our group for their advice.
> >> > > > >
> >> > > > > Thanks,
> >> > > > > John
> >> > > > >
> >> > > > > On Fri, Sep 4, 2020 at 5:08 AM Michael Erickson - NOAA
Affiliate
> >> via
> >> > > RT <
> >> > > > > met_help at ucar.edu> wrote:
> >> > > > >
> >> > > > >>
> >> > > > >> <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> >
> >> > > > >>
> >> > > > >> Hi John,
> >> > > > >>
> >> > > > >> Thank you for your quick and helpful response! To answer
your
> >> > > questions
> >> > > > >> from the first email:
> >> > > > >>
> >> > > > >> 1) I have included the climo file in case you wanted to
see it:
> >> > > > >>
> >> > > > >>
> >> > > >
> >> > >
> >> >
> >>
>
https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> >> > > > >>
> >> > > > >> 2) I start from the netcdf output from grid_stat, load
that
> data
> >> > into
> >> > > > the
> >> > > > >> python workspace, and compute the brier score from that.
> >> > > > >>
> >> > > > >> Also the circle diameter of 9 in the observation file is
to
> draw
> >> a
> >> > 40
> >> > > km
> >> > > > >> radius around the "observation."
> >> > > > >>
> >> > > > >> From your latter email, it sounds like I may not be able
to
> >> exactly
> >> > > > >> replicate the Brier Score calculation. In the spirit of
best
> >> > > practices,
> >> > > > >> would you recommend I change cat_thresh  to "= [ >=0.0,
> >=0.001,
> >> > > >=0.05,
> >> > > > >> >=0.1, >=0.2, >=0.5, >=1.0 ];" or keep my cat_thresh as
it
> >> currently
> >> > > is
> >> > > > as
> >> > > > >> long as I am consistent? I was also wondering if
grid_stat bins
> >> the
> >> > > > >> probabilities for the climo field as it does for the
> >> probabilities
> >> > in
> >> > > > the
> >> > > > >> forecast field?
> >> > > > >>
> >> > > > >> Thanks again!
> >> > > > >>
> >> > > > >> Mike
> >> > > > >>
> >> > > > >> On Thu, Sep 3, 2020 at 7:12 PM John Halley Gotway via RT
<
> >> > > > >> met_help at ucar.edu>
> >> > > > >> wrote:
> >> > > > >>
> >> > > > >> > Actually, I have a reasonable guess as to why you may
be
> >> seeing a
> >> > > > >> > difference.
> >> > > > >> >
> >> > > > >> > All probabilistic verification in MET is based on an Nx2
> >> > > > >> > probabilistic contingency table. Those are the counts in the
> >> > > > >> > PCT line type. We do this to make it easier to aggregate
> >> > > > >> > statistics across multiple cases, by summing up contingency
> >> > > > >> > tables before recomputing statistics. But the pros/cons of
> >> > > > >> > this approach would probably be better addressed by a
> >> > > > >> > statistician. So the stats are computed using probability
> >> > > > >> > bins and not raw probability values.
> >> > > > >> >
> >> > > > >> > If you went and computed the Brier score by hand, you
> probably
> >> did
> >> > > so
> >> > > > >> using
> >> > > > >> > raw probability values and not binning them first.
> >> > > > >> >
> >> > > > >> > And this difference could explain the type of
discrepancy
> >> you're
> >> > > > seeing.
> >> > > > >> >
> >> > > > >> > To test this out, I reran your case...
> >> > > > >> > (1) Using your original settings to confirm your Brier score
> >> > > > >> > of 0.011934.
> >> > > > >> > (2) Using 10 equally-spaced probability bins
> >> > > > >> > (cat_thresh = [ ==0.1 ];) which produced a Brier score of
> >> > > > >> > 0.013747.
> >> > > > >> > (3) Using 50 equally-spaced probability bins
> >> > > > >> > (cat_thresh = [ ==0.02 ];) which produced a Brier score of
> >> > > > >> > 0.01197.
> >> > > > >> > (4) Using 100 equally-spaced probability bins
> >> > > > >> > (cat_thresh = [ ==0.01 ];) which produced a Brier score of
> >> > > > >> > 0.01193.
> >> > > > >> >
> >> > > > >> > I suppose that doesn't explain the exact discrepancy, but it
> >> > > > >> > could definitely be involved.
> >> > > > >> >
> >> > > > >> > Notice on this line of the Brier score computation in MET:
> >> > > > >> >
> >> > > > >> > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/contable_nx2.cc#L647
> >> > > > >> >
> >> > > > >> > that the "probability" value returned by "row_proby()" is the
> >> > > > >> > mid-point of the bin.
> >> > > > >> > So all of your forecast probability values of 0% which fall
> >> > > > >> > into the first bin are actually evaluated as having a
> >> > > > >> > probability value of 0.025, which is the mid-point between 0
> >> > > > >> > and 0.05 for the first bin.
> >> > > > >> >
> >> > > > >> > Rerunning using the following to minimize that effect on the
> >> > > > >> > 0's:
> >> > > > >> > cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
> >> > > > >> > produces a Brier score of 0.011489.
> >> > > > >> >
> >> > > > >> > So I'd say that the binning of the probability values is
> >> > > > >> > impacting the Brier score out in the 4th decimal place.
> >> > > > >> >
> >> > > > >> > John
> >> > > > >> >
> >> > > > >> > On Thu, Sep 3, 2020 at 4:47 PM John Halley Gotway <
> >> > johnhg at ucar.edu>
> >> > > > >> wrote:
> >> > > > >> >
> >> > > > >> > > Hi Mike,
> >> > > > >> > >
> >> > > > >> > > Looks like you were able to make a lot of progress.
I
> >> certainly
> >> > > > don't
> >> > > > >> see
> >> > > > >> > > anything wrong based on the log messages you sent.
> >> > > > >> > >
> >> > > > >> > > I do notice that you're smoothing the observations
with the
> >> > > maximum
> >> > > > >> value
> >> > > > >> > > in a circle of diameter 9... presumably for a good
reason.
> >> And I
> >> > > see
> >> > > > >> that
> >> > > > >> > > smoothing step indicated in the log messages as well
as the
> >> > output
> >> > > > >> .stat
> >> > > > >> > > file.
> >> > > > >> > >
> >> > > > >> > > Two questions.
> >> > > > >> > >
> >> > > > >> > > (1) I wanted to try running locally, but didn't find the
> >> > > > >> > > "climo" file on the WPC ftp site:
> >> > > > >> > >
> >> > > > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> >> > > > >> > >
> >> > > > >> > > Could you add that?
> >> > > > >> > >
> >> > > > >> > > (2) When you say that you tried to replicate the
Brier
> score
> >> > > > >> computation,
> >> > > > >> > > what was your starting point? The raw input files or
using
> >> the
> >> > > > NetCDF
> >> > > > >> > > matched pairs output from Grid-Stat which already
include
> the
> >> > > > >> computation
> >> > > > >> > > of the observation maximums?
> >> > > > >> > >
> >> > > > >> > > Thanks,
> >> > > > >> > > John Halley Gotway
> >> > > > >> > >
> >> > > > >> > > On Thu, Sep 3, 2020 at 2:26 PM Michael Erickson -
NOAA
> >> Affiliate
> >> > > via
> >> > > > >> RT <
> >> > > > >> > > met_help at ucar.edu> wrote:
> >> > > > >> > >
> >> > > > >> > >>
> >> > > > >> > >> <URL:
> >> https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> >> > >
> >> > > > >> > >>
> >> > > > >> > >> Thank you Minna!
> >> > > > >> > >>
> >> > > > >> > >> Mike
> >> > > > >> > >>
> >> > > > >> > >> On Thu, Sep 3, 2020 at 4:12 PM Minna Win via RT <
> >> > > met_help at ucar.edu
> >> > > > >
> >> > > > >> > >> wrote:
> >> > > > >> > >>
> >> > > > >> > >> > Hi Mike,
> >> > > > >> > >> >
> >> > > > >> > >> > It looks like you have a few questions associated
with
> >> > > > calculating
> >> > > > >> > Brier
> >> > > > >> > >> > Skill Scores.  I'm assigning this ticket to John
Halley
> >> > Gotway.
> >> > > > >> > >> >
> >> > > > >> > >> > Regards,
> >> > > > >> > >> > Minna
> >> > > > >> > >> > ---------------
> >> > > > >> > >> > Minna Win
> >> > > > >> > >> > National Center for Atmospheric Research
> >> > > > >> > >> > Developmental Testbed Center
> >> > > > >> > >> > Phone: 303-497-8423
> >> > > > >> > >> > Fax:   303-497-8401
> >> > > > >> > >> >
> >> > > > >> > >> >
> >> > > > >> > >> >
> >> > > > >> > >> > On Thu, Sep 3, 2020 at 1:13 PM Michael Erickson -
NOAA
> >> > > Affiliate
> >> > > > >> via
> >> > > > >> > RT
> >> > > > >> > >> <
> >> > > > >> > >> > met_help at ucar.edu> wrote:
> >> > > > >> > >> >
> >> > > > >> > >> > >
> >> > > > >> > >> > > Thu Sep 03 13:13:26 2020: Request 96562 was
acted
> upon.
> >> > > > >> > >> > > Transaction: Ticket created by
> >> michael.j.erickson at noaa.gov
> >> > > > >> > >> > >        Queue: met_help
> >> > > > >> > >> > >      Subject: Including Climatology in
grid_stat
> Config
> >> > File
> >> > > > >> > >> > >        Owner: Nobody
> >> > > > >> > >> > >   Requestors: michael.j.erickson at noaa.gov
> >> > > > >> > >> > >       Status: new
> >> > > > >> > >> > >  Ticket <URL:
> >> > > > >> >
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> >> > > > >> > >> >
> >> > > > >> > >> > >
> >> > > > >> > >> > >
> >> > > > >> > >> > > Greetings,
> >> > > > >> > >> > >
> >> > > > >> > >> > > For the first time I am attempting to calculate
Brier
> >> Skill
> >> > > > Score
> >> > > > >> > >> using
> >> > > > >> > >> > > grid_stat from an input climatology file. I
have
> >> created a
> >> > > > >> > >> probabilistic
> >> > > > >> > >> > > flooding climatology file (spans from zero to
one;
> >> image is
> >> > > > here:
> >> > > > >> > >> > >
> >> > > > >> >
> >> > >
>
https://ftp.wpc.ncep.noaa.gov/erickson/test10/ALL_noMRGL_UFVS_ALL.png
> >> > > > >> > >> ).
> >> > > > >> > >> > > This climatology is static, so it doesn't
change with
> >> time
> >> > > when
> >> > > > >> > >> inputting
> >> > > > >> > >> > > the "model" and "observation" data. I believe I
have
> >> > > > successfully
> >> > > > >> > >> gotten
> >> > > > >> > >> > > this to work using the command:
> >> > > > >> > >> > >
> >> > > > >> > >> > > /opt/MET/90/bin/grid_stat
ERO_s2020083112_e2020090112_
> >> > > vhr09.nc
> >> > > > >> > >> > > ST4gFFG_s2020083112_e2020090112_vhr09.nc
usethis
> >> -outdir ~
> >> > > > >> > >> > >
> >> > > > >> > >> > > where grid_stat
ERO_s2020083112_e2020090112_vhr09.nc
> >> are
> >> > > > >> discrete
> >> > > > >> > >> > forecast
> >> > > > >> > >> > > probabilities of 0, 0.05, 0.1, 0.2, and 0.5
> >> > > > >> > >> > > where ST4gFFG_s2020083112_e2020090112_vhr09.nc
are
> >> > > observation
> >> > > > >> > values
> >> > > > >> > >> > of 0
> >> > > > >> > >> > > or 1
> >> > > > >> > >> > > and usethis is the configuration file
> >> > > > >> > >> > >
> >> > > > >> > >> > > Finally the climatology file that consists of
"almost"
> >> > > > continuous
> >> > > > >> > >> values
> >> > > > >> > >> > > between 0 and 1 is named:
UFVS_ST4gFFG_s2015010100_
> >> > > > >> > >> e2019123123_vhr12.nc
> >> > > > >> > >> > >
> >> > > > >> > >> > > I have put all of these files at
> >> > > > >> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/ for
> >> > > > >> > >> > > your reference.
> >> > > > >> > >> > >
> >> > > > >> > >> > > As for my questions:
> >> > > > >> > >> > >
> >> > > > >> > >> > > 1) I was wondering if the climatology file was
> properly
> >> > > > ingested
> >> > > > >> and
> >> > > > >> > >> > > calculated for my example? I believe it is
correct
> given
> >> > the
> >> > > > >> output
> >> > > > >> > >> > below,
> >> > > > >> > >> > > but I wanted to make sure, since this is my
first time
> >> > doing
> >> > > > >> this:
> >> > > > >> > >> > >
> >> > > > >> > >> > >
> >> > > > >> > >> > > *DEBUG 1: Forecast File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ERO_s2020083112_e2020090112_vhr09.nc
> >> > > > >> > >> > > DEBUG 1: Observation File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ST4gFFG_s2020083112_e2020090112_vhr09.nc
> >> > > > >> > >> > > DEBUG 3: Reading forecast data for EROSurface.
> >> > > > >> > >> > > DEBUG 3: Reading observation data for ST4gFFGSurface.
> >> > > > >> > >> > > DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_NcMet".
> >> > > > >> > >> > > DEBUG 4:
> >> > > > >> > >> > > DEBUG 4: Latitude/Longitude Grid Data:
> >> > > > >> > >> > > DEBUG 4:      lat_ll: 25
> >> > > > >> > >> > > DEBUG 4:      lon_ll: 129.8
> >> > > > >> > >> > > DEBUG 4:   delta_lat: 0.09
> >> > > > >> > >> > > DEBUG 4:   delta_lon: 0.09
> >> > > > >> > >> > > DEBUG 4:        Nlat: 276
> >> > > > >> > >> > > DEBUG 4:        Nlon: 721
> >> > > > >> > >> > > DEBUG 4:
> >> > > > >> > >> > > DEBUG 4: VarInfoFactory::new_var_info() -> created new VarInfo object of type "FileType_NcMet".
> >> > > > >> > >> > > DEBUG 3: For forecast valid at 20200901_120000, found 1 climatology field(s) with valid time(s): 20201231_230000
> >> > > > >> > >> > > DEBUG 3: Found 1 climatology fields.
> >> > > > >> > >> > > DEBUG 3: Found 1 climatology mean and 0 climatology standard deviation field(s) for forecast EROSurface.
> >> > > > >> > >> > > DEBUG 2: Processing masking regions.
> >> > > > >> > >> > > DEBUG 3: Processing grid mask: FULL
> >> > > > >> > >> > > DEBUG 4: parse_grid_mask() -> parsing grid mask "FULL"
> >> > > > >> > >> > > DEBUG 2:
> >> > > > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
> >> > > > >> > >> > > DEBUG 2:
> >> > > > >> > >> > > DEBUG 3: Smoothing field using the MAX(49) CircleTemplate interpolation method.
> >> > > > >> > >> > > DEBUG 2: Processing EROSurface versus ST4gFFGSurface, for smoothing method MAX_CIRCLE(49), over region FULL, using 190638 matched pairs.
> >> > > > >> > >> > > DEBUG 2: Computing Probabilistic Statistics.
> >> > > > >> > >> > > DEBUG 2:
> >> > > > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
> >> > > > >> > >> > > DEBUG 2:
> >> > > > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V.stat
> >> > > > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V_pairs.nc*
> >> > > > >> > >> > >
> >> > > > >> > >> > >
> >> > > > >> > >> > > 2) This question is a bit more basic. I am unable to manually
> >> > > > >> > >> > > calculate a Brier Score value for the forecast and observation
> >> > > > >> > >> > > that properly matches that in the stat file. My manually
> >> > > > >> > >> > > calculated Brier Score is systematically lower. For this event,
> >> > > > >> > >> > > the stat file BS is 0.0119 and my value is 0.0116. I've looked
> >> > > > >> > >> > > at C3 in the MET Tutorial guide
> >> > > > >> > >> > > <https://dtcenter.org/sites/default/files/community-code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf>,
> >> > > > >> > >> > > but I'm still at a bit of a loss. Is there a simple way I can
> >> > > > >> > >> > > replicate the calculation seen in the stat file?
> >> > > > >> > >> > >
> >> > > > >> > >> > > Thank you again for your help and please let me
know
> if
> >> you
> >> > > > have
> >> > > > >> any
> >> > > > >> > >> > > questions.
> >> > > > >> > >> > >
> >> > > > >> > >> > > Mike
> >> > > > >> > >> > >
> >> > > > >> > >> > > --
> >> > > > >> > >> > > Michael J. Erickson
> >> > > > >> > >> > >
> >> > > > >> > >> > > Research Scientist
> >> > > > >> > >> > > Cooperative Institute for Research in
Environmental
> >> > Sciences
> >> > > > >> (CIRES)
> >> > > > >> > >> > > NOAA/NWS/Weather Prediction Center
> >> > > > >> > >> > > Phone:  301-683-1546
> >> > > > >> > >> > >
> >> > > > >> > >> > >

--
Michael J. Erickson

Research Scientist
Cooperative Institute for Research in Environmental Sciences (CIRES)
NOAA/NWS/Weather Prediction Center
Phone:  301-683-1546

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: John Halley Gotway
Time: Thu Sep 10 13:27:07 2020

Sorry it took me so long to answer. So we know that MET uses the
centerpoint of the bin as the probability value. And we know that your
data is already binned with the only valid probability values being:
0.0, 0.05, 0.1, 0.2, 0.5, 1.
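
To make the mid-point effect concrete, here is a minimal Python sketch
that compares a Brier score computed on the raw probabilities with one
computed after each probability is replaced by the mid-point of its bin.
It is not MET's code: the function names, the left-closed bin convention,
and the sample data are all assumptions made for illustration.

```python
import numpy as np

def brier_raw(prob, obs):
    """Brier score computed directly on the forecast probabilities."""
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.mean((prob - obs) ** 2))

def brier_binned(prob, obs, edges):
    """Brier score after replacing each probability with its bin mid-point.

    Assumption: bin i covers [edges[i], edges[i+1]) and a value equal to
    the last edge is placed in the last bin.
    """
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs, dtype=float)
    edges = np.asarray(edges, dtype=float)
    idx = np.searchsorted(edges, prob, side="right") - 1
    idx = np.clip(idx, 0, len(edges) - 2)
    mids = 0.5 * (edges[:-1] + edges[1:])
    return float(np.mean((mids[idx] - obs) ** 2))

# Made-up sample of 100 matched pairs (NOT the ERO data).
prob = np.array([0.0] * 90 + [0.05] * 6 + [0.1] * 2 + [0.2, 0.5])
obs = np.array([0.0] * 89 + [1.0] + [0.0] * 5 + [1.0] + [0.0, 1.0] + [0.0, 1.0])

# Bin edges matching cat_thresh = [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ]
edges = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]

print("raw   :", brier_raw(prob, obs))
print("binned:", brier_binned(prob, obs, edges))
```

The two numbers differ only because of the mid-point substitution, which
is the kind of discrepancy discussed in this ticket.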

So we want to choose bins whose centerpoints correspond to these
probability values. However, we're a little constrained because MET
requires the first and last ones to be 0 and 1, respectively, and that
everything in between be monotonically increasing.

The most concise way I can think of uses 7 bins defined by:
cat_thresh = [ >=0.0, >=0.001, >=0.1, >=0.1001, >=0.3, >=0.7, >=0.999, >=1.0 ];

Bin 1 for prob = 0: 0 to 0.001
Bin 2 for prob = 0.05: 0.001 to 0.1
Bin 3 for prob = 0.1: 0.1 to 0.1001
Bin 4 for prob = 0.2: 0.1001 to 0.3
Bin 5 for prob = 0.5: 0.3 to 0.7
Bin 6 as a placeholder: 0.7 to 0.999
Bin 7 for prob = 1.0: 0.999 to 1.0

But perhaps it'd be clearer with:
cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.0501, >=0.10, >=0.101, >=0.2, >=0.201,
               >=0.5, >=0.501, >=0.999, >=1.0 ];
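
A quick way to check where each forecast value lands under any of these
threshold lists is a small Python sketch like the following. It is not
MET code: it assumes bins are half-open [lower, upper) with the last bin
closed at 1.0, and the names bin_midpoints, bin_index, and
evaluated_value are made up for illustration.

```python
import bisect


def bin_midpoints(edges):
    """Midpoint of each bin defined by consecutive threshold values."""
    return [(lo + hi) / 2.0 for lo, hi in zip(edges[:-1], edges[1:])]


def bin_index(edges, p):
    """Index of the bin holding p (assumed: [lo, hi) bins, last bin closed at 1.0)."""
    i = bisect.bisect_right(edges, p) - 1
    return min(i, len(edges) - 2)


def evaluated_value(edges, p):
    """Probability value that stands in for p once it has been binned."""
    return bin_midpoints(edges)[bin_index(edges, p)]


# The only probability values that occur in the forecast data.
ero_probs = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]

# Threshold lists from this thread, written as plain bin edges.
original = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]
concise = [0.0, 0.001, 0.1, 0.1001, 0.3, 0.7, 0.999, 1.0]
explicit = [0.0, 0.001, 0.05, 0.0501, 0.10, 0.101,
            0.2, 0.201, 0.5, 0.501, 0.999, 1.0]

if __name__ == "__main__":
    for name, edges in [("original", original),
                        ("concise", concise),
                        ("explicit", explicit)]:
        print(name)
        for p in ero_probs:
            print(f"  forecast {p:5.3f} -> evaluated as {evaluated_value(edges, p):.5f}")
```

With the original thresholds the stand-in values are far from the
forecast values (0 becomes 0.025, 0.05 becomes 0.075), while both lists
above keep every value within 0.001 of what was forecast.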

But all these mental gymnastics seem way too confusing!
So what changes can we make to Point-Stat and Grid-Stat to better
handle this situation in the future?

No very obvious solution occurs to me, but some options include:

(1) Add a config option to switch from using the mid-point of the
probability bin to using the left or right side.
But for the first bin, you'd want the left side. And for the last bin,
you'd want the right side! We could consider 0 to be a special case?
And this requires the user to be very savvy to understand all these
details.

(2) Consider changing the logic to ALWAYS include bins for 0 to 0 and
1 to 1, since the endpoints are kind of special cases?
But that'd change existing results, which is not good.

(3) Pre-process the input probability values before any smoothing or
interpolation to point observations occurs.
Keep track of the unique values to determine if the data is binned.
But what qualifies as being binned? 5 unique probabilities? 10? 20?
50? 100?
Potentially print a warning message if they've chosen probability bins
poorly?
What does poorly mean?
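
To make option (3) a bit more concrete, here is a rough Python sketch of
what such a check might look like. Everything in it is hypothetical:
check_prob_bins, max_unique, and tol are illustrative names and cutoffs,
not existing or planned MET settings, and "poorly" is taken to mean that
a discrete value sits more than tol away from the midpoint of its bin.

```python
import bisect


def check_prob_bins(probs, edges, max_unique=20, tol=0.01):
    """Return warning strings for discrete values far from their bin midpoint.

    Assumes probs lie in [0, 1] and edges run from 0.0 to 1.0.
    max_unique and tol are illustrative cutoffs, not MET options.
    """
    unique = sorted(set(probs))
    if len(unique) > max_unique:
        # Too many distinct values: treat the data as continuous.
        return []
    warnings = []
    for p in unique:
        # Half-open [lo, hi) bins, with the last bin closed at 1.0.
        i = min(bisect.bisect_right(edges, p) - 1, len(edges) - 2)
        mid = (edges[i] + edges[i + 1]) / 2.0
        if abs(p - mid) > tol:
            warnings.append(
                f"value {p} falls in bin [{edges[i]}, {edges[i + 1]}) "
                f"and is evaluated as {mid:.4f}"
            )
    return warnings


if __name__ == "__main__":
    forecast_values = [0.0, 0.05, 0.1, 0.2, 0.5]
    current_bins = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]
    for message in check_prob_bins(forecast_values, current_bins):
        print("WARNING:", message)
```

The open questions above stay open in this sketch: the cutoff for
"binned" and the tolerance are exactly the parts that would need to be
defined before anything like this went into the code.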

If we can define some very specific solutions, we can make the code do
whatever we want.

But ideally the changes would not change existing results, be
intuitive for a user to understand, and be easy to document.

Please let me know.

Thanks,
John

On Wed, Sep 9, 2020 at 3:50 PM Michael Erickson - NOAA Affiliate via RT
<met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
>
> Thanks Everyone for your helpful responses.
>
> I have been using grid_stat for WPC's Excessive Rainfall Outlook
> (consisting of probabilities of 0, 0.05, 0.1, 0.2, and 0.5) for years.
> These results are dependent upon MET, so I wanted to make sure I am
> following best practices.
>
> What is DTC's guidance on how to proceed forward with this? Should I change
> my cat_thresh to "= [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];"
> or is my current setting fine given that both the "forecast" and
> "observation" are broken down by the same discrete increments? I can also
> just calculate Brier Score manually outside of grid_stat.
>
> Thanks,
>
> Mike
>
> On Tue, Sep 8, 2020 at 4:56 PM Barbara Brown via RT
<met_help at ucar.edu>
> wrote:
>
> > I agree with Eric and John. The way MET does this generally makes sense for
> > ensemble forecasts (or other cases when you want MET to select the
> > thresholds) but not for the case when the probabilities for specific
> > categories are provided by the user.  I'm not sure what the workaround
> > might be (John may have ideas!) but in the long run it would be good to
> > allow for this option.
> >
> > Barb
> > ---
> > Barbara Brown, Senior Research Associate
> > Research Applications Laboratory
> > NCAR PO Box 3000
> > Boulder CO 80307-3000 USA
> > Ph: +1 303 497 8468  FAX: +1 303 497 8401
> >
> >
> > On Tue, Sep 8, 2020 at 2:14 PM Michael Erickson - NOAA Affiliate <
> > michael.j.erickson at noaa.gov> wrote:
> >
> > > Hi Eric and John,
> > >
> > > Thank you for your response to this matter. What would be the
best
> > > practice to take in this situation?
> > >
> > > Thanks,
> > >
> > > Mike
> > >
> > > On Tue, Sep 8, 2020 at 3:41 PM Eric Gilleland via RT <
> met_help at ucar.edu>
> > > wrote:
> > >
> > >> Hi John,
> > >>
> > >> I agree that if the probabilities have already been binned,
then it is
> > >> strange to then take the midpoint (re-binning).
> > >>
> > >> Eric
> > >>
> > >> On Fri, Sep 4, 2020 at 11:14 AM John Halley Gotway via RT <
> > >> met_help at ucar.edu>
> > >> wrote:
> > >>
> > >> > Barb and Eric,
> > >> >
> > >> > I've added you to this met-help ticket from Mike Erickson from NOAA/WPC.
> > >> > We're hoping to get some advice from one or both of you about
> > >> > probabilistic verification.
> > >> >
> > >> > Mike is running Grid-Stat to verify WPC's Excessive Rainfall Outlooks
> > >> > against StageIV precip. The forecast probability values are always 0,
> > >> > 0.05, 0.1, 0.2, 0.5, or 1.0.
> > >> > When Mike computes the Brier score by hand, it differs from the results
> > >> > reported by Grid-Stat out in the 3rd decimal place.
> > >> >
> > >> > My theory is that the difference is caused by the fact that MET does not
> > >> > compute the Brier score directly on the probability values. Instead, it
> > >> > bins them into an Nx2 probabilistic contingency table and computes the
> > >> > Brier score from that table. And the mid-point of each bin is used in the
> > >> > Brier score computations. So different probability bins will result in a
> > >> > slightly different Brier score.
> > >> >
> > >> > Mike is currently using probability thresholds as follows:
> > >> >    cat_thresh = [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
> > >> >
> > >> > And that's consistent with the probability values. But when you think
> > >> > about it...
> > >> > - Forecasts of 0% fall into the first bin and are evaluated as being a
> > >> > value of 0.025 (mid-point of the 0.0 to 0.05 bin)
> > >> > - Forecasts of 5% fall into the second bin and are evaluated as being a
> > >> > value of 0.075 (mid-point of the 0.05 to 0.1 bin)
> > >> > - Forecasts of 10% fall into the third bin and are evaluated as being a
> > >> > value of 0.150 (mid-point of the 0.1 to 0.2 bin).
> > >> > - and so on for the other probability values
> > >> >
> > >> > Seems like the binning of probability values works better for continuous
> > >> > probability values and not so well for probabilities that have already
> > >> > been binned!
> > >> >
> > >> > I'm wondering if you have any thoughts or advice about this
> situation?
> > >> >
> > >> > Thanks,
> > >> > John
> > >> >
> > >> > On Fri, Sep 4, 2020 at 10:47 AM Michael Erickson - NOAA
Affiliate
> via
> > >> RT <
> > >> > met_help at ucar.edu> wrote:
> > >> >
> > >> > >
> > >> > > <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> > >> > >
> > >> > > Hi John,
> > >> > >
> > >> > > Thanks for your answers and sounds good! That is strange
that the
> > >> climo
> > >> > > file was not found for your setting. The only detail I can
think
> of
> > is
> > >> > that
> > >> > > within the climo field, the file_name specification is
static:
> > >> > >
> > >> > > file_name = [
> > >> > > "/usr1/wpc_cpgffh/gribs/ERO_verif/static/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc"
> > >> > > ];
> > >> > >
> > >> > >
> > >> > > I believe you concluded that my climo read-in looked
correct?
> > >> > >
> > >> > > Thanks,
> > >> > >
> > >> > > Mike
> > >> > >
> > >> > >
> > >> > > On Fri, Sep 4, 2020 at 12:42 PM John Halley Gotway via RT <
> > >> > > met_help at ucar.edu>
> > >> > > wrote:
> > >> > >
> > >> > > > Mike,
> > >> > > >
> > >> > > > 2 more things I forgot to address.
> > >> > > >
> > >> > > > First, I pulled that climo field but when I ran grid_stat
with
> > your
> > >> > > usethis
> > >> > > > config file, it did not actually read the climo data.
> > >> > > >
> > >> > > > DEBUG 3: Found 0 climatology fields.
> > >> > > >
> > >> > > >
> > >> > > > I'm wondering what additional configuration settings you
used to
> > >> make
> > >> > > this
> > >> > > > work?
> > >> > > >
> > >> > > >
> > >> > > > Second, the answer to your question is yes. The exact
same
> binning
> > >> > logic
> > >> > > > used for the forecast probabilities is applied to the
climo
> data.
> > In
> > >> > > fact,
> > >> > > > the forecast probability bins are applied to both the
forecast
> and
> > >> > climo
> > >> > > > data. So you do not need to define separate "cat_thresh"
> settings
> > >> for
> > >> > the
> > >> > > > climo. They won't be used anyway.
> > >> > > >
> > >> > > >
> > >> > > > Here's the spot in the library code where the climo
> probabilistic
> > >> > > > contingency table is created using the forecast
probability
> bins:
> > >> > > >
> > >> > > > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/compute_stats.cc#L767
> > >> > > >
> > >> > > >
> > >> > > > Thanks,
> > >> > > > John
> > >> > > >
> > >> > > > On Fri, Sep 4, 2020 at 10:25 AM John Halley Gotway <
> > johnhg at ucar.edu
> > >> >
> > >> > > > wrote:
> > >> > > >
> > >> > > > > Mike,
> > >> > > > >
> > >> > > > > I don't really have a recommendation on best practices
with
> > >> regards
> > >> > to
> > >> > > > the
> > >> > > > > binning of probability values.
> > >> > > > >
> > >> > > > > I can say that I more commonly see people choose fixed
bin
> > widths,
> > >> > like
> > >> > > > > "==0.10" (for 10 bins) or "==0.05" (for 20 bins)
instead of
> > >> variable
> > >> > > > width
> > >> > > > > bins, such as:
> > >> > > > > [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ]
> > >> > > > >
> > >> > > > > But I suspect that's more out of convenience than
anything
> else.
> > >> With
> > >> > > > > regards to your chosen bins, I suspect you set them up
this
> way
> > >> since
> > >> > > you
> > >> > > > > have lots of low probability values closer to 0.0 and
> relatively
> > >> few
> > >> > > > > probability values closer to 1.0. While this may be a
good
> > choice
> > >> for
> > >> > > > > relatively rare events, it wouldn't be as good of a
choice for
> > >> very
> > >> > > > common
> > >> > > > > events resulting in high probability values.
> > >> > > > >
> > >> > > > > Choosing 20 bins (==0.05) would include all of your
current
> bin
> > >> > > > boundaries
> > >> > > > > and enable you to sample evenly across the probability
space,
> > >> > > regardless
> > >> > > > of
> > >> > > > > whether the values are bunched near 0 or 1. And
> mathematically,
> > >> your
> > >> > > > > current bins would be derivable from these.
> > >> > > > >
> > >> > > > > But if your chosen bins follow some existing WPC
convention, I
> > >> don't
> > >> > > see
> > >> > > > > an obvious reason to change them.
> > >> > > > >
> > >> > > > > Please let me know if you'd like me to forward this
question
> to
> > >> one
> > >> > of
> > >> > > > the
> > >> > > > > statisticians in our group for their advice.
> > >> > > > >
> > >> > > > > Thanks,
> > >> > > > > John
> > >> > > > >
> > >> > > > > On Fri, Sep 4, 2020 at 5:08 AM Michael Erickson - NOAA
> Affiliate
> > >> via
> > >> > > RT <
> > >> > > > > met_help at ucar.edu> wrote:
> > >> > > > >
> > >> > > > >>
> > >> > > > >> <URL:
> https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> > >
> > >> > > > >>
> > >> > > > >> Hi John,
> > >> > > > >>
> > >> > > > >> Thank you for your quick and helpful response! To
answer your
> > >> > > questions
> > >> > > > >> from the first email:
> > >> > > > >>
> > >> > > > >> 1) I have included the climo file in case you wanted
to see
> it:
> > >> > > > >>
> > >> > > > >> https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > >> > > > >>
> > >> > > > >> 2) I start from the netcdf output from grid_stat, load
that
> > data
> > >> > into
> > >> > > > the
> > >> > > > >> python workspace, and compute the brier score from
that.
> > >> > > > >>
> > >> > > > >> Also the circle diameter of 9 in the observation file
is to
> > draw
> > >> a
> > >> > 40
> > >> > > km
> > >> > > > >> radius around the "observation."
> > >> > > > >>
> > >> > > > >> From your latter email, it sounds like I may not be
able to
> > >> exactly
> > >> > > > >> replicate the Brier Score calculation. In the spirit
of best
> > >> > > practices,
> > >> > > > >> would you recommend I change cat_thresh  to "= [
>=0.0,
> > >=0.001,
> > >> > > >=0.05,
> > >> > > > >> >=0.1, >=0.2, >=0.5, >=1.0 ];" or keep my cat_thresh
as it
> > >> currently
> > >> > > is
> > >> > > > as
> > >> > > > >> long as I am consistent? I was also wondering if
grid_stat
> bins
> > >> the
> > >> > > > >> probabilities for the climo field as it does for the
> > >> probabilities
> > >> > in
> > >> > > > the
> > >> > > > >> forecast field?
> > >> > > > >>
> > >> > > > >> Thanks again!
> > >> > > > >>
> > >> > > > >> Mike
> > >> > > > >>
> > >> > > > >> On Thu, Sep 3, 2020 at 7:12 PM John Halley Gotway via
RT <
> > >> > > > >> met_help at ucar.edu>
> > >> > > > >> wrote:
> > >> > > > >>
> > >> > > > >> > Actually, I have a reasonable guess as to why you
may be
> > >> seeing a
> > >> > > > >> > difference.
> > >> > > > >> >
> > >> > > > >> > All probabilistic verification in MET is based on an Nx2 probabilistic
> > >> > > > >> > contingency table. Those are the counts in the PCT line type. We do
> > >> > > > >> > this to make it easier to aggregate statistics across multiple cases,
> > >> > > > >> > by summing up contingency tables before recomputing statistics. But the
> > >> > > > >> > pros/cons of this approach would probably be better addressed by a
> > >> > > > >> > statistician. So the stats are computed using probability bins and not
> > >> > > > >> > raw probability values.
> > >> > > > >> >
> > >> > > > >> > If you went and computed the Brier score by hand,
you
> > probably
> > >> did
> > >> > > so
> > >> > > > >> using
> > >> > > > >> > raw probability values and not binning them first.
> > >> > > > >> >
> > >> > > > >> > And this difference could explain the type of
discrepancy
> > >> you're
> > >> > > > seeing.
> > >> > > > >> >
> > >> > > > >> > To test this out, I reran your case...
> > >> > > > >> > (1) Using your original settings to confirm your Brier score of 0.011934.
> > >> > > > >> > (2) Using 10 equally-spaced probability bins (cat_thresh = [ ==0.1 ];)
> > >> > > > >> > which produced a Brier score of 0.013747.
> > >> > > > >> > (3) Using 50 equally-spaced probability bins (cat_thresh = [ ==0.02 ];)
> > >> > > > >> > which produced a Brier score of 0.01197.
> > >> > > > >> > (4) Using 100 equally-spaced probability bins (cat_thresh = [ ==0.01 ];)
> > >> > > > >> > which produced a Brier score of 0.01193.
> > >> > > > >> >
> > >> > > > >> > I suppose that doesn't explain the exact discrepancy, but could definitely
> > >> > > > >> > be involved.
> > >> > > > >> >
> > >> > > > >> > Notice on this line of the brier score computation
in MET:
> > >> > > > >> >
> > >> > > > >> > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/contable_nx2.cc#L647
> > >> > > > >> >
> > >> > > > >> > That the "probability" value returned by
"row_proby()" is
> the
> > >> > > > mid-point
> > >> > > > >> of
> > >> > > > >> > the bin.
> > >> > > > >> > So all of your forecast probability values of 0%
which fall
> > >> into
> > >> > the
> > >> > > > >> first
> > >> > > > >> > bin are actually evaluated as having a probability
value of
> > >> 0.025
> > >> > > > which
> > >> > > > >> is
> > >> > > > >> > the mid-point between 0 and 0.05 for the first bin.
> > >> > > > >> >
> > >> > > > >> > Rerunning using the following to minimize that
effect on
> the
> > >> 0's:
> > >> > > > >> > cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2,
>=0.5,
> > >> >=1.0
> > >> > ];
> > >> > > > >> > produces a brier score of 0.011489.
> > >> > > > >> >
> > >> > > > >> > So I'd say that the binning of the probability
values is
> > >> impacting
> > >> > > the
> > >> > > > >> > Brier score out in the 4th decimal place.
> > >> > > > >> >
> > >> > > > >> > John
> > >> > > > >> >
> > >> > > > >> > On Thu, Sep 3, 2020 at 4:47 PM John Halley Gotway <
> > >> > johnhg at ucar.edu>
> > >> > > > >> wrote:
> > >> > > > >> >
> > >> > > > >> > > Hi Mike,
> > >> > > > >> > >
> > >> > > > >> > > Looks like you were able to make a lot of
progress. I
> > >> certainly
> > >> > > > don't
> > >> > > > >> see
> > >> > > > >> > > anything wrong based on the log messages you sent.
> > >> > > > >> > >
> > >> > > > >> > > I do notice that you're smoothing the observations
with
> the
> > >> > > maximum
> > >> > > > >> value
> > >> > > > >> > > in a circle of diameter 9... presumably for a good
> reason.
> > >> And I
> > >> > > see
> > >> > > > >> that
> > >> > > > >> > > smoothing step indicated in the log messages as
well as
> the
> > >> > output
> > >> > > > >> .stat
> > >> > > > >> > > file.
> > >> > > > >> > >
> > >> > > > >> > > Two questions.
> > >> > > > >> > >
> > >> > > > >> > > (1) I wanted to try running locally, but didn't
find the
> > >> "climo"
> > >> > > > file
> > >> > > > >> on
> > >> > > > >> > > the WPC ftp site:
> > >> > > > >> > >
> > >> > > > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > >> > > > >> > >
> > >> > > > >> > > Could you add that?
> > >> > > > >> > >
> > >> > > > >> > > (2) When you say that you tried to replicate the
Brier
> > score
> > >> > > > >> computation,
> > >> > > > >> > > what was your starting point? The raw input files
or
> using
> > >> the
> > >> > > > NetCDF
> > >> > > > >> > > matched pairs output from Grid-Stat which already
include
> > the
> > >> > > > >> computation
> > >> > > > >> > > of the observation maximums?
> > >> > > > >> > >
> > >> > > > >> > > Thanks,
> > >> > > > >> > > John Halley Gotway
> > >> > > > >> > >
> > >> > > > >> > > On Thu, Sep 3, 2020 at 2:26 PM Michael Erickson -
NOAA
> > >> Affiliate
> > >> > > via
> > >> > > > >> RT <
> > >> > > > >> > > met_help at ucar.edu> wrote:
> > >> > > > >> > >
> > >> > > > >> > >>
> > >> > > > >> > >> <URL:
> > >> https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> > >> > >
> > >> > > > >> > >>
> > >> > > > >> > >> Thank you Minna!
> > >> > > > >> > >>
> > >> > > > >> > >> Mike
> > >> > > > >> > >>
> > >> > > > >> > >> On Thu, Sep 3, 2020 at 4:12 PM Minna Win via RT <
> > >> > > met_help at ucar.edu
> > >> > > > >
> > >> > > > >> > >> wrote:
> > >> > > > >> > >>
> > >> > > > >> > >> > Hi Mike,
> > >> > > > >> > >> >
> > >> > > > >> > >> > It looks like you have a few questions
associated with
> > >> > > > calculating
> > >> > > > >> > Brier
> > >> > > > >> > >> > Skill Scores.  I'm assigning this ticket to
John
> Halley
> > >> > Gotway.
> > >> > > > >> > >> >
> > >> > > > >> > >> > Regards,
> > >> > > > >> > >> > Minna
> > >> > > > >> > >> > ---------------
> > >> > > > >> > >> > Minna Win
> > >> > > > >> > >> > National Center for Atmospheric Research
> > >> > > > >> > >> > Developmental Testbed Center
> > >> > > > >> > >> > Phone: 303-497-8423
> > >> > > > >> > >> > Fax:   303-497-8401
> > >> > > > >> > >> >
> > >> > > > >> > >> >
> > >> > > > >> > >> >
> > >> > > > >> > >> > On Thu, Sep 3, 2020 at 1:13 PM Michael Erickson
- NOAA
> > >> > > Affiliate
> > >> > > > >> via
> > >> > > > >> > RT
> > >> > > > >> > >> <
> > >> > > > >> > >> > met_help at ucar.edu> wrote:
> > >> > > > >> > >> >
> > >> > > > >> > >> > >
> > >> > > > >> > >> > > Thu Sep 03 13:13:26 2020: Request 96562 was
acted
> > upon.
> > >> > > > >> > >> > > Transaction: Ticket created by
> > >> michael.j.erickson at noaa.gov
> > >> > > > >> > >> > >        Queue: met_help
> > >> > > > >> > >> > >      Subject: Including Climatology in
grid_stat
> > Config
> > >> > File
> > >> > > > >> > >> > >        Owner: Nobody
> > >> > > > >> > >> > >   Requestors: michael.j.erickson at noaa.gov
> > >> > > > >> > >> > >       Status: new
> > >> > > > >> > >> > >  Ticket <URL:
> > >> > > > >> >
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> > >> > > > >> > >> >
> > >> > > > >> > >> > >
> > >> > > > >> > >> > >
> > >> > > > >> > >> > > Greetings,
> > >> > > > >> > >> > >
> > >> > > > >> > >> > > For the first time I am attempting to
calculate
> Brier
> > >> Skill
> > >> > > > Score
> > >> > > > >> > >> using
> > >> > > > >> > >> > > grid_stat from an input climatology file. I
have
> > >> created a
> > >> > > > >> > >> probabilistic
> > >> > > > >> > >> > > flooding climatology file (spans from zero to
one;
> > >> image is
> > >> > > > here:
> > >> > > > >> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/test10/ALL_noMRGL_UFVS_ALL.png).
> > >> > > > >> > >> > > This climatology is static, so it doesn't
change
> with
> > >> time
> > >> > > when
> > >> > > > >> > >> inputting
> > >> > > > >> > >> > > the "model" and "observation" data. I believe
I have
> > >> > > > successfully
> > >> > > > >> > >> gotten
> > >> > > > >> > >> > > this to work using the command:
> > >> > > > >> > >> > >
> > >> > > > >> > >> > > /opt/MET/90/bin/grid_stat
> ERO_s2020083112_e2020090112_
> > >> > > vhr09.nc
> > >> > > > >> > >> > > ST4gFFG_s2020083112_e2020090112_vhr09.nc
usethis
> > >> -outdir ~
> > >> > > > >> > >> > >
> > >> > > > >> > >> > > where grid_stat ERO_s2020083112_e2020090112_
> vhr09.nc
> > >> are
> > >> > > > >> discrete
> > >> > > > >> > >> > forecast
> > >> > > > >> > >> > > probabilities of 0, 0.05, 0.1, 0.2, and 0.5
> > >> > > > >> > >> > > where
ST4gFFG_s2020083112_e2020090112_vhr09.nc are
> > >> > > observation
> > >> > > > >> > values
> > >> > > > >> > >> > of 0
> > >> > > > >> > >> > > or 1
> > >> > > > >> > >> > > and usethis is the configuration file
> > >> > > > >> > >> > >
> > >> > > > >> > >> > > Finally the climatology file that consists of
> "almost"
> > >> > > > continuous
> > >> > > > >> > >> values
> > >> > > > >> > >> > > between 0 and 1 is named:
UFVS_ST4gFFG_s2015010100_
> > >> > > > >> > >> e2019123123_vhr12.nc
> > >> > > > >> > >> > >
> > >> > > > >> > >> > > I have put all of these files at
> > >> > > > >> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/
for
> > >> > > > >> > >> > > your reference.
> > >> > > > >> > >> > >
> > >> > > > >> > >> > > As for my questions:
> > >> > > > >> > >> > >
> > >> > > > >> > >> > > 1) I was wondering if the climatology file
was
> > properly
> > >> > > > ingested
> > >> > > > >> and
> > >> > > > >> > >> > > calculated for my example? I believe it is
correct
> > given
> > >> > the
> > >> > > > >> output
> > >> > > > >> > >> > below,
> > >> > > > >> > >> > > but I wanted to make sure, since this is my
first
> time
> > >> > doing
> > >> > > > >> this:
> > >> > > > >> > >> > >
> > >> > > > >> > >> > >
> > >> > > > >> > >> > > DEBUG 1: Forecast File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ERO_s2020083112_e2020090112_vhr09.nc
> > >> > > > >> > >> > > DEBUG 1: Observation File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ST4gFFG_s2020083112_e2020090112_vhr09.nc
> > >> > > > >> > >> > > DEBUG 3: Reading forecast data for EROSurface.
> > >> > > > >> > >> > > DEBUG 3: Reading observation data for ST4gFFGSurface.
> > >> > > > >> > >> > > DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_NcMet".
> > >> > > > >> > >> > > DEBUG 4:
> > >> > > > >> > >> > > DEBUG 4: Latitude/Longitude Grid Data:
> > >> > > > >> > >> > > DEBUG 4:      lat_ll: 25
> > >> > > > >> > >> > > DEBUG 4:      lon_ll: 129.8
> > >> > > > >> > >> > > DEBUG 4:   delta_lat: 0.09
> > >> > > > >> > >> > > DEBUG 4:   delta_lon: 0.09
> > >> > > > >> > >> > > DEBUG 4:        Nlat: 276
> > >> > > > >> > >> > > DEBUG 4:        Nlon: 721
> > >> > > > >> > >> > > DEBUG 4:
> > >> > > > >> > >> > > DEBUG 4: VarInfoFactory::new_var_info() -> created new VarInfo object of type "FileType_NcMet".
> > >> > > > >> > >> > > DEBUG 3: For forecast valid at 20200901_120000, found 1 climatology field(s) with valid time(s): 20201231_230000
> > >> > > > >> > >> > > DEBUG 3: Found 1 climatology fields.
> > >> > > > >> > >> > > DEBUG 3: Found 1 climatology mean and 0 climatology standard deviation field(s) for forecast EROSurface.
> > >> > > > >> > >> > > DEBUG 2: Processing masking regions.
> > >> > > > >> > >> > > DEBUG 3: Processing grid mask: FULL
> > >> > > > >> > >> > > DEBUG 4: parse_grid_mask() -> parsing grid mask "FULL"
> > >> > > > >> > >> > > DEBUG 2:
> > >> > > > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
> > >> > > > >> > >> > > DEBUG 2:
> > >> > > > >> > >> > > DEBUG 3: Smoothing field using the MAX(49) CircleTemplate interpolation method.
> > >> > > > >> > >> > > DEBUG 2: Processing EROSurface versus ST4gFFGSurface, for smoothing method MAX_CIRCLE(49), over region FULL, using 190638 matched pairs.
> > >> > > > >> > >> > > DEBUG 2: Computing Probabilistic Statistics.
> > >> > > > >> > >> > > DEBUG 2:
> > >> > > > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
> > >> > > > >> > >> > > DEBUG 2:
> > >> > > > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V.stat
> > >> > > > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V_pairs.nc
> > >> > > > >> > >> > >
> > >> > > > >> > >> > >
> > >> > > > >> > >> > > 2) This question is a bit more basic. I am
unable to
> > >> > manually
> > >> > > > >> > >> calculate a
> > >> > > > >> > >> > > Brier Score value for the forecast and
observation
> > that
> > >> > > > properly
> > >> > > > >> > >> matches
> > >> > > > >> > >> > > that in the stat file. My manually calculated
Brier
> > >> Score
> > >> > is
> > >> > > > >> > >> > systematically
> > >> > > > >> > >> > > lower. For this event, the stat file BS is
0.0119
> and
> > my
> > >> > > value
> > >> > > > is
> > >> > > > >> > >> 0.0116.
> > >> > > > >> > >> > > I've looked at C3 in the MET Tutorial guide
> > >> > > > >> > >> > > <https://dtcenter.org/sites/default/files/community-code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf>,
> > >> > > > >> > >> > > but I'm still at a bit of a loss. Is there a
simple
> > way
> > >> I
> > >> > can
> > >> > > > >> > >> replicate
> > >> > > > >> > >> > the
> > >> > > > >> > >> > > calculation seen in the stat file?
> > >> > > > >> > >> > >
> > >> > > > >> > >> > > Thank you again for your help and please let
me know
> > if
> > >> you
> > >> > > > have
> > >> > > > >> any
> > >> > > > >> > >> > > questions.
> > >> > > > >> > >> > >
> > >> > > > >> > >> > > Mike
> > >> > > > >> > >> > >

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: Michael Erickson - NOAA Affiliate
Time: Fri Sep 11 05:45:05 2020

Hi John,

Thank you for your help here! I do appreciate it. I will gradually
work your recommended changes into my python scripts.

Regarding your options, these are good suggestions and I can
understand how complicated this is. I would advise against 2) since
this would change the results from previous versions. Option 1) is
appealing to me, but I'm not sure if there are many other users with
discrete thresholds in their gridded data. I could see the utility of
a -left, -middle, -right option which will default to midpoint binning
when unspecified. It's unfortunate that the user will lose either the
leftmost or rightmost category with this option, but if the user is
savvy enough to get to this level of detail, they can probably modify
either their data or the thresholds to fit within the constraints of
left/right binning. Another option is to calculate BS without summing
through the thresholds, but this loses a layer of complexity that I
like.

I hope this helps and thank you!

Mike

On Thu, Sep 10, 2020 at 3:28 PM John Halley Gotway via RT
<met_help at ucar.edu>
wrote:

> Sorry it took me so long to answer. So we know that MET uses the
> centerpoint of the bin as the probability value. And we know that your data
> is already binned with the only valid probability values being:
> 0.0, 0.05, 0.1, 0.2, 0.5, 1.
>
> So we want to choose bins whose centerpoints correspond to these
> probability values. However, we're a little constrained because MET
> requires the first and last ones to be 0 and 1, respectively, and that
> everything in between be monotonically increasing.
>
> The most concise way I can think of uses 7 bins defined by:
> cat_thresh = [ >=0.0, >=0.001, >=0.1, >=0.1001, >=0.3, >=0.7, >=0.999, >=1.0 ];
>
> Bin 1 for prob = 0: 0 to 0.001
> Bin 2 for prob = 0.05: 0.001 to 0.1
> Bin 3 for prob = 0.1: 0.1 to 0.1001
> Bin 4 for prob = 0.2: 0.1001 to 0.3
> Bin 5 for prob = 0.5: 0.3 to 0.7
> Bin 6 as a placeholder: 0.7 to 0.999
> Bin 7 for prob = 1.0: 0.999 to 1.0
>
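A minimal sketch of the midpoint arithmetic behind these bins, assuming (as
described in this thread) that the value assigned to each bin is the average
of its two thresholds. The helper name bin_midpoints is hypothetical and is
not part of MET; the threshold lists are copied from the messages in this
ticket.

```python
def bin_midpoints(thresholds):
    """Return the midpoint of each bin defined by consecutive thresholds."""
    return [(lo + hi) / 2 for lo, hi in zip(thresholds, thresholds[1:])]


# Thresholds from the 7-bin cat_thresh suggested in this message.
seven_bins = [0.0, 0.001, 0.1, 0.1001, 0.3, 0.7, 0.999, 1.0]
# Thresholds from the original cat_thresh used earlier in the thread.
original = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]

for name, thresholds in [("7-bin", seven_bins), ("original", original)]:
    # Round only for display; the midpoints themselves are plain averages.
    mids = [round(m, 5) for m in bin_midpoints(thresholds)]
    print(name, mids)
```

Printing both lists side by side makes it easy to check which setting puts
its midpoints closest to the discrete values 0, 0.05, 0.1, 0.2, 0.5 and 1.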
> But perhaps it'd be more clear with:
> cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.0501, >=0.10, >=0.101, >=0.2,
>                >=0.201, >=0.5, >=0.501, >=0.999, >=1.0 ];
>
> But all these mental gymnastics seem way too confusing!
> So what changes can we make to Point-Stat and Grid-Stat to better handle
> this situation in the future?
>
> No very obvious solution occurs to me, but some options include:
>
> (1) Add a config option to switch from using the mid-point of the
> probability bin to using the left or right side.
> But for the first bin, you'd want the left side. And for the last bin,
> you'd want the right side! We could consider 0 to be a special case?
> And this requires the user to be very savvy to understand all these
> details.
>
> (2) Consider changing the logic to ALWAYS include bins for 0 to 0 and 1 to
> 1 since the endpoints are kind of special cases?
> But that'd change existing results which is not good.
>
> (3) Pre-process the input probability values before any smoothing or
> interpolation to point observations occurs.
> Keep track of the unique values to determine if the data is binned.
> But what qualifies as being binned? 5 unique probabilities? 10? 20? 50?
> 100?
> Potentially print a warning message if they've chosen probability bins
> poorly?
> What does poorly mean?
>
> If we can define some very specific solutions, we can make the code do
> whatever we want.
>
> But ideally the changes would not change existing results, be intuitive for
> a user to understand, and be easy to document.
>
> Please let me know.
>
> Thanks,
> John
>
> On Wed, Sep 9, 2020 at 3:50 PM Michael Erickson - NOAA Affiliate via
RT <
> met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> >
> > Thanks Everyone for your helpful responses.
> >
> > I have been using grid_stat for WPC's Excessive Rainfall Outlook
> > (consisting of probabilities of 0, 0.05, 0.1, 0.2, and 0.5) for years.
> > These results are dependent upon MET, so I wanted to make sure I am
> > following best practices.
> >
> > What is DTC's guidance on how to proceed forward with this? Should I
> > change my cat_thresh to
> > "= [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];"
> > or is my current setting fine given that both the "forecast" and
> > "observation" are broken down by the same discrete increments? I can also
> > just calculate Brier Score manually outside of grid_stat.
> >
> > Thanks,
> >
> > Mike
> >
> > On Tue, Sep 8, 2020 at 4:56 PM Barbara Brown via RT
<met_help at ucar.edu>
> > wrote:
> >
> > > I agree with Eric and John. The way MET does this generally makes sense
> > > for ensemble forecasts (or other cases when you want MET to select the
> > > thresholds) but not for the case when the probabilities for specific
> > > categories are provided by the user.  I'm not sure what the work-around
> > > might be (John may have ideas!) but in the long-run it would be good to
> > > allow for this option.
> > >
> > > Barb
> > > ---
> > > Barbara Brown, Senior Research Associate
> > > Research Applications Laboratory
> > > NCAR PO Box 3000
> > > Boulder CO 80307-3000 USA
> > > Ph: +1 303 497 8468  FAX: +1 303 497 8401
> > >
> > >
> > > On Tue, Sep 8, 2020 at 2:14 PM Michael Erickson - NOAA Affiliate
<
> > > michael.j.erickson at noaa.gov> wrote:
> > >
> > > > Hi Eric and John,
> > > >
> > > > Thank you for your response to this matter. What would be the
best
> > > > practice to take in this situation?
> > > >
> > > > Thanks,
> > > >
> > > > Mike
> > > >
> > > > On Tue, Sep 8, 2020 at 3:41 PM Eric Gilleland via RT <
> > met_help at ucar.edu>
> > > > wrote:
> > > >
> > > >> Hi John,
> > > >>
> > > > >> I agree that if the probabilities have already been binned, then it
> > > > >> is strange to then take the midpoint (re-binning).
> > > >>
> > > >> Eric
> > > >>
> > > >> On Fri, Sep 4, 2020 at 11:14 AM John Halley Gotway via RT <
> > > >> met_help at ucar.edu>
> > > >> wrote:
> > > >>
> > > >> > Barb and Eric,
> > > >> >
> > > > >> > I've added you to this met-help ticket from Mike Erickson from
> > > > >> > NOAA/WPC. We're hoping to get some advice from one or both of you
> > > > >> > about probabilistic verification.
> > > > >> >
> > > > >> > Mike is running Grid-Stat to verify WPC's Excessive Rainfall Outlooks
> > > > >> > against StageIV precip. The forecast probability values are always 0,
> > > > >> > 0.05, 0.1, 0.2, 0.5, or 1.0.
> > > > >> > When Mike computes the Brier score by hand, it differs from the
> > > > >> > results reported by Grid-Stat out in the 4th decimal place.
> > > > >> >
> > > > >> > My theory is that the difference is caused by the fact that MET does
> > > > >> > not compute the Brier score directly on the probability values.
> > > > >> > Instead, it bins them into an Nx2 probabilistic contingency table and
> > > > >> > computes the Brier score from that table. And the mid-point of each
> > > > >> > bin is used in the Brier score computations. So different probability
> > > > >> > bins will result in a slightly different Brier score.
> > > > >> >
> > > > >> > Mike is currently using probability thresholds as follows:
> > > > >> >    cat_thresh = [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
> > > > >> >
> > > > >> > And that's consistent with the probability values. But when you think
> > > > >> > about it...
> > > > >> > - Forecasts of 0% fall into the first bin and are evaluated as being a
> > > > >> > value of 0.025 (mid-point of the 0.0 to 0.05 bin)
> > > > >> > - Forecasts of 5% fall into the second bin and are evaluated as being
> > > > >> > a value of 0.075 (mid-point of the 0.05 to 0.1 bin)
> > > > >> > - Forecasts of 10% fall into the third bin and are evaluated as being
> > > > >> > a value of 0.150 (mid-point of the 0.1 to 0.2 bin).
> > > > >> > - and so on for the other probability values
> > > > >> >
> > > > >> > Seems like the binning of probability values works better for
> > > > >> > continuous probability values and not so well for probabilities that
> > > > >> > have already been binned!
> > > >> >
> > > > >> > I'm wondering if you have any thoughts or advice about this
> > > > >> > situation?
> > > >> >
> > > >> > Thanks,
> > > >> > John
> > > >> >
> > > >> > On Fri, Sep 4, 2020 at 10:47 AM Michael Erickson - NOAA
Affiliate
> > via
> > > >> RT <
> > > >> > met_help at ucar.edu> wrote:
> > > >> >
> > > >> > >
> > > >> > > <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> > > >> > >
> > > >> > > Hi John,
> > > >> > >
> > > > >> > > Thanks for your answers and sounds good! That is strange that the
> > > > >> > > climo file was not found for your setting. The only detail I can
> > > > >> > > think of is that within the climo field, the file_name
> > > > >> > > specification is static:
> > > > >> > >
> > > > >> > > file_name = [
> > > > >> > > "/usr1/wpc_cpgffh/gribs/ERO_verif/static/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc"
> > > > >> > > ];
> > > > >> > >
> > > > >> > >
> > > > >> > > I believe you concluded that my climo read-in looked correct?
> > > >> > >
> > > >> > > Thanks,
> > > >> > >
> > > >> > > Mike
> > > >> > >
> > > >> > >
> > > >> > > On Fri, Sep 4, 2020 at 12:42 PM John Halley Gotway via RT
<
> > > >> > > met_help at ucar.edu>
> > > >> > > wrote:
> > > >> > >
> > > >> > > > Mike,
> > > >> > > >
> > > >> > > > 2 more things I forgot to address.
> > > >> > > >
> > > > >> > > > First, I pulled that climo field but when I ran grid_stat with
> > > > >> > > > your usethis config file, it did not actually read the climo data.
> > > > >> > > >
> > > > >> > > > DEBUG 3: Found 0 climatology fields.
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > I'm wondering what additional configuration settings you used to
> > > > >> > > > make this work?
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > Second, the answer to your question is yes. The exact same binning
> > > > >> > > > logic used for the forecast probabilities is applied to the climo
> > > > >> > > > data. In fact, the forecast probability bins are applied to both
> > > > >> > > > the forecast and climo data. So you do not need to define separate
> > > > >> > > > "cat_thresh" settings for the climo. They won't be used anyway.
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > Here's the spot in the library code where the climo probabilistic
> > > > >> > > > contingency table is created using the forecast probability bins:
> > > > >> > > >
> > > > >> > > > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/compute_stats.cc#L767
> > > >> > > >
> > > >> > > >
> > > >> > > > Thanks,
> > > >> > > > John
> > > >> > > >
> > > >> > > > On Fri, Sep 4, 2020 at 10:25 AM John Halley Gotway <
> > > johnhg at ucar.edu
> > > >> >
> > > >> > > > wrote:
> > > >> > > >
> > > >> > > > > Mike,
> > > >> > > > >
> > > > >> > > > > I don't really have a recommendation on best practices with
> > > > >> > > > > regards to the binning of probability values.
> > > > >> > > > >
> > > > >> > > > > I can say that I more commonly see people choose fixed bin
> > > > >> > > > > widths, like "==0.10" (for 10 bins) or "==0.05" (for 20 bins)
> > > > >> > > > > instead of variable width bins, such as:
> > > > >> > > > > [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ]
> > > > >> > > > >
> > > > >> > > > > But I suspect that's more out of convenience than anything else.
> > > > >> > > > > With regards to your chosen bins, I suspect you set them up this
> > > > >> > > > > way since you have lots of low probability values closer to 0.0
> > > > >> > > > > and relatively few probability values closer to 1.0. While this
> > > > >> > > > > may be a good choice for relatively rare events, it wouldn't be
> > > > >> > > > > as good of a choice for very common events resulting in high
> > > > >> > > > > probability values.
> > > > >> > > > >
> > > > >> > > > > Choosing 20 bins (==0.05) would include all of your current bin
> > > > >> > > > > boundaries and enable you to sample evenly across the
> > > > >> > > > > probability space, regardless of whether the values are bunched
> > > > >> > > > > near 0 or 1. And mathematically, your current bins would be
> > > > >> > > > > derivable from these.
> > > >> > > > >
> > > > >> > > > > But if your chosen bins follow some existing WPC convention, I
> > > > >> > > > > don't see an obvious reason to change them.
> > > > >> > > > >
> > > > >> > > > > Please let me know if you'd like me to forward this question to
> > > > >> > > > > one of the statisticians in our group for their advice.
> > > >> > > > >
> > > >> > > > > Thanks,
> > > >> > > > > John
> > > >> > > > >
> > > >> > > > > On Fri, Sep 4, 2020 at 5:08 AM Michael Erickson -
NOAA
> > Affiliate
> > > >> via
> > > >> > > RT <
> > > >> > > > > met_help at ucar.edu> wrote:
> > > >> > > > >
> > > >> > > > >>
> > > >> > > > >> <URL:
> > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> > > >
> > > >> > > > >>
> > > >> > > > >> Hi John,
> > > >> > > > >>
> > > > >> > > > >> Thank you for your quick and helpful response! To answer your
> > > > >> > > > >> questions from the first email:
> > > > >> > > > >>
> > > > >> > > > >> 1) I have included the climo file in case you wanted to see it:
> > > > >> > > > >>
> > > > >> > > > >> https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > > > >> > > > >>
> > > > >> > > > >> 2) I start from the netcdf output from grid_stat, load that data
> > > > >> > > > >> into the python workspace, and compute the brier score from that.
> > > > >> > > > >>
> > > > >> > > > >> Also the circle diameter of 9 in the observation file is to draw
> > > > >> > > > >> a 40 km radius around the "observation."
> > > > >> > > > >>
> > > > >> > > > >> From your latter email, it sounds like I may not be able to
> > > > >> > > > >> exactly replicate the Brier Score calculation. In the spirit of
> > > > >> > > > >> best practices, would you recommend I change cat_thresh to
> > > > >> > > > >> "= [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];"
> > > > >> > > > >> or keep my cat_thresh as it currently is as long as I am
> > > > >> > > > >> consistent? I was also wondering if grid_stat bins the
> > > > >> > > > >> probabilities for the climo field as it does for the
> > > > >> > > > >> probabilities in the forecast field?
> > > >> > > > >>
> > > >> > > > >> Thanks again!
> > > >> > > > >>
> > > >> > > > >> Mike
> > > >> > > > >>
> > > >> > > > >> On Thu, Sep 3, 2020 at 7:12 PM John Halley Gotway
via RT <
> > > >> > > > >> met_help at ucar.edu>
> > > >> > > > >> wrote:
> > > >> > > > >>
> > > > >> > > > >> > Actually, I have a reasonable guess as to why you may be seeing
> > > > >> > > > >> > a difference.
> > > > >> > > > >> >
> > > > >> > > > >> > All probabilistic verification in MET is based on an Nx2
> > > > >> > > > >> > probabilistic contingency table. Those are the counts in the
> > > > >> > > > >> > PCT line type. We do this to make it easier to aggregate
> > > > >> > > > >> > statistics across multiple cases, by summing up contingency
> > > > >> > > > >> > tables before recomputing statistics. But the pros/cons of
> > > > >> > > > >> > this approach would probably be better addressed by a
> > > > >> > > > >> > statistician. So the stats are computed using probability bins
> > > > >> > > > >> > and not raw probability values.
> > > >> > > > >> >
> > > > >> > > > >> > If you went and computed the Brier score by hand, you probably
> > > > >> > > > >> > did so using raw probability values and not binning them first.
> > > > >> > > > >> >
> > > > >> > > > >> > And this difference could explain the type of discrepancy
> > > > >> > > > >> > you're seeing.
> > > >> > > > >> >
> > > > >> > > > >> > To test this out, I reran your case...
> > > > >> > > > >> > (1) Using your original settings to confirm your Brier score of
> > > > >> > > > >> > 0.011934.
> > > > >> > > > >> > (2) Using 10 equally-spaced probability bins
> > > > >> > > > >> > (cat_thresh = [ ==0.1 ];) which produced a Brier score of
> > > > >> > > > >> > 0.013747.
> > > > >> > > > >> > (3) Using 50 equally-spaced probability bins
> > > > >> > > > >> > (cat_thresh = [ ==0.02 ];) which produced a Brier score of
> > > > >> > > > >> > 0.01197.
> > > > >> > > > >> > (4) Using 100 equally-spaced probability bins
> > > > >> > > > >> > (cat_thresh = [ ==0.01 ];) which produced a Brier score of
> > > > >> > > > >> > 0.01193.
> > > > >> > > > >> >
> > > > >> > > > >> > I suppose that doesn't explain the exact discrepancy, but could
> > > > >> > > > >> > definitely be involved.
> > > >> > > > >> >
> > > > >> > > > >> > Notice on this line of the brier score computation in MET:
> > > > >> > > > >> >
> > > > >> > > > >> > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/contable_nx2.cc#L647
> > > > >> > > > >> >
> > > > >> > > > >> > That the "probability" value returned by "row_proby()" is the
> > > > >> > > > >> > mid-point of the bin.
> > > > >> > > > >> > So all of your forecast probability values of 0% which fall
> > > > >> > > > >> > into the first bin are actually evaluated as having a
> > > > >> > > > >> > probability value of 0.025 which is the mid-point between 0 and
> > > > >> > > > >> > 0.05 for the first bin.
> > > >> > > > >> >
> > > > >> > > > >> > Rerunning using the following to minimize that effect on the
> > > > >> > > > >> > 0's:
> > > > >> > > > >> > cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
> > > > >> > > > >> > produces a brier score of 0.011489.
> > > > >> > > > >> >
> > > > >> > > > >> > So I'd say that the binning of the probability values is
> > > > >> > > > >> > impacting the Brier score out in the 4th decimal place.
> > > >> > > > >> >
> > > >> > > > >> > John
> > > >> > > > >> >
> > > >> > > > >> > On Thu, Sep 3, 2020 at 4:47 PM John Halley Gotway
<
> > > >> > johnhg at ucar.edu>
> > > >> > > > >> wrote:
> > > >> > > > >> >
> > > >> > > > >> > > Hi Mike,
> > > >> > > > >> > >
> > > > >> > > > >> > > Looks like you were able to make a lot of progress. I
> > > > >> > > > >> > > certainly don't see anything wrong based on the log messages
> > > > >> > > > >> > > you sent.
> > > > >> > > > >> > >
> > > > >> > > > >> > > I do notice that you're smoothing the observations with the
> > > > >> > > > >> > > maximum value in a circle of diameter 9... presumably for a
> > > > >> > > > >> > > good reason. And I see that smoothing step indicated in the
> > > > >> > > > >> > > log messages as well as the output .stat file.
> > > > >> > > > >> > >
> > > > >> > > > >> > > Two questions.
> > > > >> > > > >> > >
> > > > >> > > > >> > > (1) I wanted to try running locally, but didn't find the
> > > > >> > > > >> > > "climo" file on the WPC ftp site:
> > > > >> > > > >> > >
> > > > >> > > > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > > > >> > > > >> > >
> > > > >> > > > >> > > Could you add that?
> > > > >> > > > >> > >
> > > > >> > > > >> > > (2) When you say that you tried to replicate the Brier score
> > > > >> > > > >> > > computation, what was your starting point? The raw input
> > > > >> > > > >> > > files or using the NetCDF matched pairs output from Grid-Stat
> > > > >> > > > >> > > which already include the computation of the observation
> > > > >> > > > >> > > maximums?
> > > >> > > > >> > >
> > > >> > > > >> > > Thanks,
> > > >> > > > >> > > John Halley Gotway
> > > >> > > > >> > >
> > > >> > > > >> > > On Thu, Sep 3, 2020 at 2:26 PM Michael Erickson
- NOAA
> > > >> Affiliate
> > > >> > > via
> > > >> > > > >> RT <
> > > >> > > > >> > > met_help at ucar.edu> wrote:
> > > >> > > > >> > >
> > > >> > > > >> > >>
> > > >> > > > >> > >> <URL:
> > > >> https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> > > >> > >
> > > >> > > > >> > >>
> > > >> > > > >> > >> Thank you Minna!
> > > >> > > > >> > >>
> > > >> > > > >> > >> Mike
> > > >> > > > >> > >>
> > > >> > > > >> > >> On Thu, Sep 3, 2020 at 4:12 PM Minna Win via RT
<
> > > >> > > met_help at ucar.edu
> > > >> > > > >
> > > >> > > > >> > >> wrote:
> > > >> > > > >> > >>
> > > >> > > > >> > >> > Hi Mike,
> > > >> > > > >> > >> >
> > > > >> > > > >> > >> > It looks like you have a few questions associated with
> > > > >> > > > >> > >> > calculating Brier Skill Scores.  I'm assigning this ticket
> > > > >> > > > >> > >> > to John Halley Gotway.
> > > >> > > > >> > >> >
> > > >> > > > >> > >> > Regards,
> > > >> > > > >> > >> > Minna
> > > >> > > > >> > >> > ---------------
> > > >> > > > >> > >> > Minna Win
> > > >> > > > >> > >> > National Center for Atmospheric Research
> > > >> > > > >> > >> > Developmental Testbed Center
> > > >> > > > >> > >> > Phone: 303-497-8423
> > > >> > > > >> > >> > Fax:   303-497-8401
> > > >> > > > >> > >> >
> > > >> > > > >> > >> >
> > > >> > > > >> > >> >
> > > >> > > > >> > >> > On Thu, Sep 3, 2020 at 1:13 PM Michael
Erickson -
> NOAA
> > > >> > > Affiliate
> > > >> > > > >> via
> > > >> > > > >> > RT
> > > >> > > > >> > >> <
> > > >> > > > >> > >> > met_help at ucar.edu> wrote:
> > > >> > > > >> > >> >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > > Thu Sep 03 13:13:26 2020: Request 96562 was
acted
> > > upon.
> > > >> > > > >> > >> > > Transaction: Ticket created by
> > > >> michael.j.erickson at noaa.gov
> > > >> > > > >> > >> > >        Queue: met_help
> > > >> > > > >> > >> > >      Subject: Including Climatology in
grid_stat
> > > Config
> > > >> > File
> > > >> > > > >> > >> > >        Owner: Nobody
> > > >> > > > >> > >> > >   Requestors: michael.j.erickson at noaa.gov
> > > >> > > > >> > >> > >       Status: new
> > > >> > > > >> > >> > >  Ticket <URL:
> > > >> > > > >> >
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> > > >> > > > >> > >> >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > > Greetings,
> > > >> > > > >> > >> > >
> > > > >> > > > >> > >> > > For the first time I am attempting to calculate Brier
> > > > >> > > > >> > >> > > Skill Score using grid_stat from an input climatology
> > > > >> > > > >> > >> > > file. I have created a probabilistic flooding climatology
> > > > >> > > > >> > >> > > file (spans from zero to one; image is here:
> > > > >> > > > >> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/test10/ALL_noMRGL_UFVS_ALL.png).
> > > > >> > > > >> > >> > > This climatology is static, so it doesn't change with
> > > > >> > > > >> > >> > > time when inputting the "model" and "observation" data. I
> > > > >> > > > >> > >> > > believe I have successfully gotten this to work using the
> > > > >> > > > >> > >> > > command:
> > > > >> > > > >> > >> > >
> > > > >> > > > >> > >> > > /opt/MET/90/bin/grid_stat ERO_s2020083112_e2020090112_vhr09.nc
> > > > >> > > > >> > >> > > ST4gFFG_s2020083112_e2020090112_vhr09.nc usethis -outdir ~
> > > > >> > > > >> > >> > >
> > > > >> > > > >> > >> > > where grid_stat ERO_s2020083112_e2020090112_vhr09.nc are
> > > > >> > > > >> > >> > > discrete forecast probabilities of 0, 0.05, 0.1, 0.2, and
> > > > >> > > > >> > >> > > 0.5
> > > > >> > > > >> > >> > > where ST4gFFG_s2020083112_e2020090112_vhr09.nc are
> > > > >> > > > >> > >> > > observation values of 0 or 1
> > > > >> > > > >> > >> > > and usethis is the configuration file
> > > > >> > > > >> > >> > >
> > > > >> > > > >> > >> > > Finally the climatology file that consists of "almost"
> > > > >> > > > >> > >> > > continuous values between 0 and 1 is named:
> > > > >> > > > >> > >> > > UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > > > >> > > > >> > >> > >
> > > > >> > > > >> > >> > > I have put all of these files at
> > > > >> > > > >> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/ for your
> > > > >> > > > >> > >> > > reference.
> > > > >> > > > >> > >> > >
> > > > >> > > > >> > >> > > As for my questions:
> > > > >> > > > >> > >> > >
> > > > >> > > > >> > >> > > 1) I was wondering if the climatology file was properly
> > > > >> > > > >> > >> > > ingested and calculated for my example? I believe it is
> > > > >> > > > >> > >> > > correct given the output below, but I wanted to make
> > > > >> > > > >> > >> > > sure, since this is my first time doing this:
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > > *DEBUG 1: Forecast File:
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> >
> > > >> > > > >> > >>
> > > >> > > > >> >
> > > >> > > > >>
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> /export/hpc-lw-
dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ERO_s2020083112_e2020090112_vhr09.ncDEBUG
> > > >> > > > >> > >> > > 1: Observation File:
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> >
> > > >> > > > >> > >>
> > > >> > > > >> >
> > > >> > > > >>
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> /export/hpc-lw-
dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ST4gFFG_s2020083112_e2020090112_vhr09.ncDEBUG
> > > >> > > > >> > >> > > 3: Reading forecast data for
EROSurface.DEBUG 3:
> > > Reading
> > > >> > > > >> observation
> > > >> > > > >> > >> data
> > > >> > > > >> > >> > > for ST4gFFGSurface.DEBUG 4:
> > > >> > > > >> > >> Met2dDataFileFactory::new_met_2d_data_file()
> > > >> > > > >> > >> > ->
> > > >> > > > >> > >> > > created new Met2dDataFile object of type
> > > >> > > "FileType_NcMet".DEBUG
> > > >> > > > >> > >> 4:DEBUG
> > > >> > > > >> > >> > 4:
> > > >> > > > >> > >> > > Latitude/Longitude Grid Data:DEBUG 4:
lat_ll:
> > > >> 25DEBUG
> > > >> > 4:
> > > >> > > > >> > >> > lon_ll:
> > > >> > > > >> > >> > > 129.8DEBUG 4:   delta_lat: 0.09DEBUG 4:
>  delta_lon:
> > > >> > > 0.09DEBUG
> > > >> > > > 4:
> > > >> > > > >> > >> > >  Nlat: 276DEBUG 4:        Nlon: 721DEBUG
4:DEBUG
> 4:
> > > >> > > > >> > >> > > VarInfoFactory::new_var_info() -> created
new
> > VarInfo
> > > >> > object
> > > >> > > of
> > > >> > > > >> type
> > > >> > > > >> > >> > > "FileType_NcMet".DEBUG 3: For forecast
valid at
> > > >> > > > 20200901_120000,
> > > >> > > > >> > >> found 1
> > > >> > > > >> > >> > > climatology field(s) with valid time(s):
> > > >> > 20201231_230000DEBUG
> > > >> > > > 3:
> > > >> > > > >> > >> Found 1
> > > >> > > > >> > >> > > climatology fields.DEBUG 3: Found 1
climatology
> mean
> > > >> and 0
> > > >> > > > >> > climatology
> > > >> > > > >> > >> > > standard deviation field(s) for forecast
> > > >> EROSurface.DEBUG
> > > >> > 2:
> > > >> > > > >> > >> Processing
> > > >> > > > >> > >> > > masking regions.DEBUG 3: Processing grid
mask:
> > > >> FULLDEBUG 4:
> > > >> > > > >> > >> > > parse_grid_mask() -> parsing grid mask
"FULL"DEBUG
> > > >> 2:DEBUG
> > > >> > 2:
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> >
> > > >> > > > >> > >>
> > > >> > > > >> >
> > > >> > > > >>
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
>
--------------------------------------------------------------------------------DEBUG
> > > >> > > > >> > >> > > 2:DEBUG 3: Smoothing field using the
MAX(49)
> > > >> CircleTemplate
> > > >> > > > >> > >> interpolation
> > > >> > > > >> > >> > > method.DEBUG 2: Processing EROSurface
versus
> > > >> > ST4gFFGSurface,
> > > >> > > > for
> > > >> > > > >> > >> > smoothing
> > > >> > > > >> > >> > > method MAX_CIRCLE(49), over region FULL,
using
> > 190638
> > > >> > matched
> > > >> > > > >> > >> pairs.DEBUG
> > > >> > > > >> > >> > > 2: Computing Probabilistic Statistics.DEBUG
> 2:DEBUG
> > 2:
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> >
> > > >> > > > >> > >>
> > > >> > > > >> >
> > > >> > > > >>
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
>
--------------------------------------------------------------------------------DEBUG
> > > >> > > > >> > >> > > 2:DEBUG 1: Output file:
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> >
> > > >> > > > >> > >>
> > > >> > > > >> >
> > > >> > > > >>
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> /export/hpc-lw-
dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V.statDEBUG
> > > >> > > > >> > >> > > 1: Output file:
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> >
> > > >> > > > >> > >>
> > > >> > > > >> >
> > > >> > > > >>
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> /export/hpc-lw-
dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V_pairs.nc*
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > >
> > > > >> > > > >> > >> > > 2) This question is a bit more basic. I am unable to
> > > > >> > > > >> > >> > > manually calculate a Brier Score value for the forecast
> > > > >> > > > >> > >> > > and observation that properly matches that in the stat
> > > > >> > > > >> > >> > > file. My manually calculated Brier Score is
> > > > >> > > > >> > >> > > systematically lower. For this event, the stat file BS is
> > > > >> > > > >> > >> > > 0.0119 and my value is 0.0116. I've looked at C3 in the
> > > > >> > > > >> > >> > > MET Tutorial guide
> > > > >> > > > >> > >> > > <https://dtcenter.org/sites/default/files/community-code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf>,
> > > > >> > > > >> > >> > > but I'm still at a bit of a loss. Is there a simple way I
> > > > >> > > > >> > >> > > can replicate the calculation seen in the stat file?
> > > > >> > > > >> > >> > >
> > > > >> > > > >> > >> > > Thank you again for your help and please let me know if
> > > > >> > > > >> > >> > > you have any questions.
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > > Mike
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > > --
> > > >> > > > >> > >> > > Michael J. Erickson
> > > >> > > > >> > >> > >
> > > >> > > > >> > >> > > Research Scientist
> > > >> > > > >> > >> > > Cooperative Institute for Research in
> Environmental
> > > >> > Sciences
> > > >> > > > >> (CIRES)
> > > >> > > > >> > >> > > NOAA/NWS/Weather Prediction Center
> > > >> > > > >> > >> > > Phone:  301-683-1546

--
Michael J. Erickson

Research Scientist
Cooperative Institute for Research in Environmental Sciences (CIRES)
NOAA/NWS/Weather Prediction Center
Phone:  301-683-1546

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: Michael Erickson - NOAA Affiliate
Time: Wed Sep 16 07:44:32 2020

Hi All,

I have redone my configuration file "usethis" to reflect the new threshold
values. After that I have run grid_stat in the same manner as my initial
email:

/opt/MET/90/bin/grid_stat ERO_s2020083112_e2020090112_vhr21.nc
ST4gFFG_s2020083112_e2020090112_vhr21.nc usethis -outdir ~

I've put the input/output files here:
https://ftp.wpc.ncep.noaa.gov/erickson/DTC/.

I still have the issue where my Brier Score value is 0.0116 and the stat
file value is ~0.0120. My method for computing BS is simpler than the one
in MET: I just compute the mean squared difference between the model
probabilities and the observations (i.e. no summing through contingency
table counts as is done on p. 446 of the MET User's Guide
<https://dtcenter.org/sites/default/files/community-code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf>).
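
For reference, here is a minimal sketch of the two calculations being compared: the mean squared difference on the raw values, and the same score after each forecast probability is replaced by the mid-point of its bin, as John described. The function names are illustrative, and the bin-edge handling (each bin treated as [lo, hi), with the last bin closed on the right) is an assumption rather than a copy of MET's code.

```python
import numpy as np

def brier_raw(prob, obs):
    """Mean squared difference between forecast probabilities and 0/1 observations."""
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.mean((prob - obs) ** 2))

def brier_binned(prob, obs, edges):
    """Brier score after replacing each probability with its bin mid-point.

    edges holds the bin boundaries, e.g. [0.0, 0.05, 0.1, 0.2, 0.5, 1.0].
    Assumption: bins are [lo, hi), and a value of exactly 1.0 goes in the last bin.
    """
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs, dtype=float)
    edges = np.asarray(edges, dtype=float)
    mids = 0.5 * (edges[:-1] + edges[1:])
    # Index of the bin containing each probability, clipped to the valid range.
    idx = np.clip(np.searchsorted(edges, prob, side="right") - 1, 0, len(mids) - 1)
    return float(np.mean((mids[idx] - obs) ** 2))
```

With edges of 0.0, 0.05, 0.1, 0.2, 0.5, 1.0, a forecast of 0 paired with an observation of 0 contributes nothing to the raw score but 0.025^2 = 0.000625 to the binned one, so the two numbers generally differ when most grid points are zeros.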

I am wondering if the Brier Score and Brier Skill Score values in the MET
stat file look correct for this case? If so, then I am content to proceed
with implementing this new setup at WPC.
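
For context on the skill score, below is a minimal sketch of the textbook definition, BSS = 1 - BS / BS_climo, with the static climatology as the reference forecast. The function name is illustrative, and the sketch works on raw values. Since Grid-Stat applies the forecast probability bins to the climo data as well, its reported BSS may differ slightly from this.

```python
import numpy as np

def brier_skill_score(prob, climo, obs):
    """Textbook BSS: 1 - BS(forecast) / BS(climatology), on raw (unbinned) values."""
    prob = np.asarray(prob, dtype=float)
    climo = np.asarray(climo, dtype=float)
    obs = np.asarray(obs, dtype=float)
    bs_fcst = np.mean((prob - obs) ** 2)
    bs_climo = np.mean((climo - obs) ** 2)
    if bs_climo == 0.0:
        # The reference is perfect, so the skill score is undefined.
        return float("nan")
    return float(1.0 - bs_fcst / bs_climo)
```

A positive value means the forecast beats the climatology, zero means no improvement, and a negative value means the forecast is worse than the climatology.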

Thank you for all of your help with this!

Mike

On Fri, Sep 11, 2020 at 7:44 AM Michael Erickson - NOAA Affiliate <
michael.j.erickson at noaa.gov> wrote:

> Hi John,
>
> Thank you for your help here! I do appreciate it. I will gradually work
> your recommended changes into my python scripts.
>
> Regarding your options, these are good suggestions and I can understand
> how complicated this is. I would advise against 2) since this would change
> the results from previous versions. Option 1) is appealing to me, but I'm
> not sure if there are many other users with discrete thresholds in their
> gridded data. I could see the utility of a -left, -middle, -right option
> which defaults to mid-point binning when unspecified. It's unfortunate
> that the user will lose either the left-most or right-most category with
> this option, but if the user is savvy enough to get to this level of
> detail, they can probably modify either their data or the thresholds to
> fit within the constraints of left/right binning. Another option is to
> calculate BS without summing through the thresholds, but this loses a
> layer of complexity that I like.
>
> I hope this helps and thank you!
>
> Mike
>
> On Thu, Sep 10, 2020 at 3:28 PM John Halley Gotway via RT <
> met_help at ucar.edu> wrote:
>
>> Sorry it took me so long to answer. So we know that MET uses the
>> centerpoint of the bin as the probability value. And we know that your
>> data is already binned with the only valid probability values being:
>> 0.0, 0.05, 0.1, 0.2, 0.5, 1.
>>
>> So we want to choose bins whose centerpoints correspond to these
>> probability values. However, we're a little constrained because MET
>> requires the first and last ones to be 0 and 1, respectively, and that
>> everything in between be monotonically increasing.
>>
>> The most concise way I can think of uses 7 bins defined by:
>> cat_thresh = [ >=0.0, >=0.001, >=0.1, >=0.1001, >=0.3, >=0.7, >=0.999, >=1.0 ];
>>
>> Bin 1 for prob = 0: 0 to 0.001
>> Bin 2 for prob = 0.05: 0.001 to 0.1
>> Bin 3 for prob = 0.1: 0.1 to 0.1001
>> Bin 4 for prob = 0.2: 0.1001 to 0.3
>> Bin 5 for prob = 0.5: 0.3 to 0.7
>> Bin 6 as a placeholder: 0.7 to 0.999
>> Bin 7 for prob = 1.0: 0.999 to 1.0
>>
>> But perhaps it'd be more clear with:
>> cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.0501, >=0.10, >=0.101, >=0.2, >=0.201, >=0.5, >=0.501, >=0.999, >=1.0 ];
>>
>> But all these mental gymnastics seem way too confusing!
>> So what changes can we make to Point-Stat and Grid-Stat to better handle
>> this situation in the future?
>>
>> No very obvious solution occurs to me, but some options include:
>>
>> (1) Add a config option to switch from using the mid-point of the
>> probability bin to using the left or right side.
>> But for the first bin, you'd want the left side. And for the last bin,
>> you'd want the right side! We could consider 0 to be a special case?
>> And this requires the user to be very savvy to understand all these
>> details.
>>
>> (2) Consider changing the logic to ALWAYS include bins for 0 to 0 and
>> 1 to 1 since the endpoints are kind of special cases?
>> But that'd change existing results which is not good.
>>
>> (3) Pre-process the input probability values before any smoothing or
>> interpolation to point observations occurs.
>> Keep track of the unique values to determine if the data is binned.
>> But what qualifies as being binned? 5 unique probabilities? 10? 20? 50?
>> 100?
>> Potentially print a warning message if they've chosen probability bins
>> poorly?
>> What does poorly mean?
>>
>> If we can define some very specific solutions, we can make the code do
>> whatever we want.
>>
>> But ideally the changes would not change existing results, be intuitive
>> for a user to understand, and be easy to document.
>>
>> Please let me know.
>>
>> Thanks,
>> John
>>
>> On Wed, Sep 9, 2020 at 3:50 PM Michael Erickson - NOAA Affiliate
via RT <
>> met_help at ucar.edu> wrote:
>>
>> >
>> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
>> >
>> > Thanks Everyone for your helpful responses.
>> >
>> > I have been using grid_stat for WPC's Excessive Rainfall Outlook
>> > (consisting of probabilities of 0, 0.05, 0.1, 0.2, and 0.5) for years.
>> > These results are dependent upon MET, so I wanted to make sure I am
>> > following best practices.
>> >
>> > What is DTC's guidance on how to proceed with this? Should I change
>> > my cat_thresh to "= [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5,
>> > >=1.0 ];" or is my current setting fine given that both the "forecast"
>> > and "observation" are broken down by the same discrete increments? I
>> > can also just calculate Brier Score manually outside of grid_stat.
>> >
>> > Thanks,
>> >
>> > Mike
>> >
>> > On Tue, Sep 8, 2020 at 4:56 PM Barbara Brown via RT
<met_help at ucar.edu>
>> > wrote:
>> >
>> > > I agree with Eric and John. The way MET does this generally makes
>> > > sense for ensemble forecasts (or other cases when you want MET to
>> > > select the thresholds) but not for the case when the probabilities
>> > > for specific categories are provided by the user. I'm not sure what
>> > > the work-around might be (John may have ideas!) but in the long run
>> > > it would be good to allow for this option.
>> > >
>> > > Barb
>> > > ---
>> > > Barbara Brown, Senior Research Associate
>> > > Research Applications Laboratory
>> > > NCAR PO Box 3000
>> > > Boulder CO 80307-3000 USA
>> > > Ph: +1 303 497 8468  FAX: +1 303 497 8401
>> > >
>> > >
>> > > On Tue, Sep 8, 2020 at 2:14 PM Michael Erickson - NOAA
Affiliate <
>> > > michael.j.erickson at noaa.gov> wrote:
>> > >
>> > > > Hi Eric and John,
>> > > >
>> > > > Thank you for your response to this matter. What would be the
>> > > > best practice to take in this situation?
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Mike
>> > > >
>> > > > On Tue, Sep 8, 2020 at 3:41 PM Eric Gilleland via RT <
>> > met_help at ucar.edu>
>> > > > wrote:
>> > > >
>> > > >> Hi John,
>> > > >>
>> > > >> I agree that if the probabilities have already been binned, then
>> > > >> it is strange to then take the midpoint (re-binning).
>> > > >>
>> > > >> Eric
>> > > >>
>> > > >> On Fri, Sep 4, 2020 at 11:14 AM John Halley Gotway via RT <
>> > > >> met_help at ucar.edu>
>> > > >> wrote:
>> > > >>
>> > > >> > Barb and Eric,
>> > > >> >
>> > > >> > I've added you to this met-help ticket from Mike Erickson from
>> > > >> > NOAA/WPC. We're hoping to get some advice from one or both of you
>> > > >> > about probabilistic verification.
>> > > >> >
>> > > >> > Mike is running Grid-Stat to verify WPC's Excessive Rainfall
>> > > >> > Outlooks against StageIV precip. The forecast probability values
>> > > >> > are always 0, 0.05, 0.1, 0.2, 0.5, or 1.0.
>> > > >> > When Mike computes the Brier score by hand, it differs from the
>> > > >> > results reported by Grid-Stat out in the 3rd decimal place.
>> > > >> >
>> > > >> > My theory is that the difference is caused by the fact that MET
>> > > >> > does not compute the Brier score directly on the probability
>> > > >> > values. Instead, it bins them into an Nx2 probabilistic
>> > > >> > contingency table and computes the Brier score from that table.
>> > > >> > And the mid-point of each bin is used in the Brier score
>> > > >> > computations. So different probability bins will result in a
>> > > >> > slightly different Brier score.
>> > > >> >
>> > > >> > Mike is currently using probability thresholds as follows:
>> > > >> >    cat_thresh = [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
>> > > >> >
>> > > >> > And that's consistent with the probability values. But when you
>> > > >> > think about it...
>> > > >> > - Forecasts of 0% fall into the first bin and are evaluated as
>> > > >> > being a value of 0.025 (mid-point of the 0.0 to 0.05 bin)
>> > > >> > - Forecasts of 5% fall into the second bin and are evaluated as
>> > > >> > being a value of 0.075 (mid-point of the 0.05 to 0.1 bin)
>> > > >> > - Forecasts of 10% fall into the third bin and are evaluated as
>> > > >> > being a value of 0.150 (mid-point of the 0.1 to 0.2 bin).
>> > > >> > - and so on for the other probability values
>> > > >> >
>> > > >> > Seems like the binning of probability values works better for
>> > > >> > continuous probability values and not so well for probabilities
>> > > >> > that have already been binned!
>> > > >> >
>> > > >> > I'm wondering if you have any thoughts or advice about this
>> > > >> > situation?
>> > > >> >
>> > > >> > Thanks,
>> > > >> > John
>> > > >> >
>> > > >> > On Fri, Sep 4, 2020 at 10:47 AM Michael Erickson - NOAA
Affiliate
>> > via
>> > > >> RT <
>> > > >> > met_help at ucar.edu> wrote:
>> > > >> >
>> > > >> > >
>> > > >> > > <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
>> >
>> > > >> > >
>> > > >> > > Hi John,
>> > > >> > >
>> > > >> > > Thanks for your answers and sounds good! That is strange that
>> > > >> > > the climo file was not found for your setting. The only detail
>> > > >> > > I can think of is that within the climo field, the file_name
>> > > >> > > specification is static:
>> > > >> > >
>> > > >> > > file_name = [
>> > > >> > > "/usr1/wpc_cpgffh/gribs/ERO_verif/static/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc"
>> > > >> > > ];
>> > > >> > >
>> > > >> > >
>> > > >> > > I believe you concluded that my climo read-in looked correct?
>> > > >> > >
>> > > >> > > Thanks,
>> > > >> > >
>> > > >> > > Mike
>> > > >> > >
>> > > >> > >
>> > > >> > > On Fri, Sep 4, 2020 at 12:42 PM John Halley Gotway via
RT <
>> > > >> > > met_help at ucar.edu>
>> > > >> > > wrote:
>> > > >> > >
>> > > >> > > > Mike,
>> > > >> > > >
>> > > >> > > > 2 more things I forgot to address.
>> > > >> > > >
>> > > >> > > > First, I pulled that climo field but when I ran grid_stat
>> > > >> > > > with your usethis config file, it did not actually read the
>> > > >> > > > climo data.
>> > > >> > > >
>> > > >> > > > DEBUG 3: Found 0 climatology fields.
>> > > >> > > >
>> > > >> > > > I'm wondering what additional configuration settings you
>> > > >> > > > used to make this work?
>> > > >> > > >
>> > > >> > > > Second, the answer to your question is yes. The exact same
>> > > >> > > > binning logic used for the forecast probabilities is applied
>> > > >> > > > to the climo data. In fact, the forecast probability bins are
>> > > >> > > > applied to both the forecast and climo data. So you do not
>> > > >> > > > need to define separate "cat_thresh" settings for the climo.
>> > > >> > > > They won't be used anyway.
>> > > >> > > >
>> > > >> > > > Here's the spot in the library code where the climo
>> > > >> > > > probabilistic contingency table is created using the forecast
>> > > >> > > > probability bins:
>> > > >> > > >
>> > > >> > > > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/compute_stats.cc#L767
>> > > >> > > >
>> > > >> > > >
>> > > >> > > > Thanks,
>> > > >> > > > John
>> > > >> > > >
>> > > >> > > > On Fri, Sep 4, 2020 at 10:25 AM John Halley Gotway <
>> > > johnhg at ucar.edu
>> > > >> >
>> > > >> > > > wrote:
>> > > >> > > >
>> > > >> > > > > Mike,
>> > > >> > > > >
>> > > >> > > > > I don't really have a recommendation on best practices
>> > > >> > > > > with regards to the binning of probability values.
>> > > >> > > > >
>> > > >> > > > > I can say that I more commonly see people choose fixed bin
>> > > >> > > > > widths, like "==0.10" (for 10 bins) or "==0.05" (for 20
>> > > >> > > > > bins) instead of variable width bins, such as:
>> > > >> > > > > [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ]
>> > > >> > > > >
>> > > >> > > > > But I suspect that's more out of convenience than anything
>> > > >> > > > > else. With regards to your chosen bins, I suspect you set
>> > > >> > > > > them up this way since you have lots of low probability
>> > > >> > > > > values closer to 0.0 and relatively few probability values
>> > > >> > > > > closer to 1.0. While this may be a good choice for
>> > > >> > > > > relatively rare events, it wouldn't be as good of a choice
>> > > >> > > > > for very common events resulting in high probability values.
>> > > >> > > > >
>> > > >> > > > > Choosing 20 bins (==0.05) would include all of your current
>> > > >> > > > > bin boundaries and enable you to sample evenly across the
>> > > >> > > > > probability space, regardless of whether the values are
>> > > >> > > > > bunched near 0 or 1. And mathematically, your current bins
>> > > >> > > > > would be derivable from these.
>> > > >> > > > >
>> > > >> > > > > But if your chosen bins follow some existing WPC
>> > > >> > > > > convention, I don't see an obvious reason to change them.
>> > > >> > > > >
>> > > >> > > > > Please let me know if you'd like me to forward this
>> > > >> > > > > question to one of the statisticians in our group for their
>> > > >> > > > > advice.
>> > > >> > > > >
>> > > >> > > > > Thanks,
>> > > >> > > > > John
>> > > >> > > > >
>> > > >> > > > > On Fri, Sep 4, 2020 at 5:08 AM Michael Erickson -
NOAA
>> > Affiliate
>> > > >> via
>> > > >> > > RT <
>> > > >> > > > > met_help at ucar.edu> wrote:
>> > > >> > > > >
>> > > >> > > > >>
>> > > >> > > > >> <URL:
>> > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
>> > > >
>> > > >> > > > >>
>> > > >> > > > >> Hi John,
>> > > >> > > > >>
>> > > >> > > > >> Thank you for your quick and helpful response! To answer
>> > > >> > > > >> your questions from the first email:
>> > > >> > > > >>
>> > > >> > > > >> 1) I have included the climo file in case you wanted to
>> > > >> > > > >> see it:
>> > > >> > > > >>
>> > > >> > > > >> https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
>> > > >> > > > >>
>> > > >> > > > >> 2) I start from the netcdf output from grid_stat, load
>> > > >> > > > >> that data into the python workspace, and compute the brier
>> > > >> > > > >> score from that.
>> > > >> > > > >>
>> > > >> > > > >> Also the circle diameter of 9 in the observation file is
>> > > >> > > > >> to draw a 40 km radius around the "observation."
>> > > >> > > > >>
>> > > >> > > > >> From your latter email, it sounds like I may not be able
>> > > >> > > > >> to exactly replicate the Brier Score calculation. In the
>> > > >> > > > >> spirit of best practices, would you recommend I change
>> > > >> > > > >> cat_thresh to "= [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2,
>> > > >> > > > >> >=0.5, >=1.0 ];" or keep my cat_thresh as it currently is
>> > > >> > > > >> as long as I am consistent? I was also wondering if
>> > > >> > > > >> grid_stat bins the probabilities for the climo field as it
>> > > >> > > > >> does for the probabilities in the forecast field?
>> > > >> > > > >>
>> > > >> > > > >> Thanks again!
>> > > >> > > > >>
>> > > >> > > > >> Mike
>> > > >> > > > >>
>> > > >> > > > >> On Thu, Sep 3, 2020 at 7:12 PM John Halley Gotway
via RT <
>> > > >> > > > >> met_help at ucar.edu>
>> > > >> > > > >> wrote:
>> > > >> > > > >>
>> > > >> > > > >> > Actually, I have a reasonable guess as to why you may be
>> > > >> > > > >> > seeing a difference.
>> > > >> > > > >> >
>> > > >> > > > >> > All probabilistic verification in MET is based on an Nx2
>> > > >> > > > >> > probabilistic contingency table. Those are the counts in
>> > > >> > > > >> > the PCT line type. We do this to make it easier to
>> > > >> > > > >> > aggregate statistics across multiple cases, by summing up
>> > > >> > > > >> > contingency tables before recomputing statistics. But the
>> > > >> > > > >> > pros/cons of this approach would probably be better
>> > > >> > > > >> > addressed by a statistician. So the stats are computed
>> > > >> > > > >> > using probability bins and not raw probability values.
>> > > >> > > > >> >
>> > > >> > > > >> > If you went and computed the Brier score by hand, you
>> > > >> > > > >> > probably did so using raw probability values and not
>> > > >> > > > >> > binning them first.
>> > > >> > > > >> >
>> > > >> > > > >> > And this difference could explain the type of discrepancy
>> > > >> > > > >> > you're seeing.
>> > > >> > > > >> >
>> > > >> > > > >> > To test this out, I reran your case...
>> > > >> > > > >> > (1) Using your original settings to confirm your Brier
>> > > >> > > > >> > score of 0.011934.
>> > > >> > > > >> > (2) Using 10 equally-spaced probability bins
>> > > >> > > > >> > (cat_thresh = [ ==0.1 ];) which produced a Brier score of
>> > > >> > > > >> > 0.013747.
>> > > >> > > > >> > (3) Using 50 equally-spaced probability bins
>> > > >> > > > >> > (cat_thresh = [ ==0.02 ];) which produced a Brier score of
>> > > >> > > > >> > 0.01197.
>> > > >> > > > >> > (4) Using 100 equally-spaced probability bins
>> > > >> > > > >> > (cat_thresh = [ ==0.01 ];) which produced a Brier score of
>> > > >> > > > >> > 0.01193.
>> > > >> > > > >> >
>> > > >> > > > >> > I suppose that doesn't explain the exact discrepancy, but
>> > > >> > > > >> > it could definitely be involved.
>> > > >> > > > >> >
>> > > >> > > > >> > Notice on this line of the Brier score computation in MET:
>> > > >> > > > >> >
>> > > >> > > > >> > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/contable_nx2.cc#L647
>> > > >> > > > >> >
>> > > >> > > > >> > that the "probability" value returned by "row_proby()" is
>> > > >> > > > >> > the mid-point of the bin.
>> > > >> > > > >> > So all of your forecast probability values of 0% which
>> > > >> > > > >> > fall into the first bin are actually evaluated as having
>> > > >> > > > >> > a probability value of 0.025, which is the mid-point
>> > > >> > > > >> > between 0 and 0.05 for the first bin.
>> > > >> > > > >> >
>> > > >> > > > >> > Rerunning using the following to minimize that effect on
>> > > >> > > > >> > the 0's:
>> > > >> > > > >> > cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
>> > > >> > > > >> > produces a Brier score of 0.011489.
>> > > >> > > > >> >
>> > > >> > > > >> > So I'd say that the binning of the probability values is
>> > > >> > > > >> > impacting the Brier score out in the 4th decimal place.
>> > > >> > > > >> >
>> > > >> > > > >> > John
>> > > >> > > > >> >
>> > > >> > > > >> > On Thu, Sep 3, 2020 at 4:47 PM John Halley Gotway
<
>> > > >> > johnhg at ucar.edu>
>> > > >> > > > >> wrote:
>> > > >> > > > >> >
>> > > >> > > > >> > > Hi Mike,
>> > > >> > > > >> > >
>> > > >> > > > >> > > Looks like you were able to make a lot of progress. I
>> > > >> > > > >> > > certainly don't see anything wrong based on the log
>> > > >> > > > >> > > messages you sent.
>> > > >> > > > >> > >
>> > > >> > > > >> > > I do notice that you're smoothing the observations with
>> > > >> > > > >> > > the maximum value in a circle of diameter 9...
>> > > >> > > > >> > > presumably for a good reason. And I see that smoothing
>> > > >> > > > >> > > step indicated in the log messages as well as the
>> > > >> > > > >> > > output .stat file.
>> > > >> > > > >> > >
>> > > >> > > > >> > > Two questions.
>> > > >> > > > >> > >
>> > > >> > > > >> > > (1) I wanted to try running locally, but didn't find
>> > > >> > > > >> > > the "climo" file on the WPC ftp site:
>> > > >> > > > >> > >
>> > > >> > > > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
>> > > >> > > > >> > >
>> > > >> > > > >> > > Could you add that?
>> > > >> > > > >> > >
>> > > >> > > > >> > > (2) When you say that you tried to replicate the Brier
>> > > >> > > > >> > > score computation, what was your starting point? The
>> > > >> > > > >> > > raw input files or using the NetCDF matched pairs
>> > > >> > > > >> > > output from Grid-Stat which already include the
>> > > >> > > > >> > > computation of the observation maximums?
>> > > >> > > > >> > >
>> > > >> > > > >> > > Thanks,
>> > > >> > > > >> > > John Halley Gotway
>> > > >> > > > >> > >
>> > > >> > > > >> > > On Thu, Sep 3, 2020 at 2:26 PM Michael Erickson
- NOAA
>> > > >> Affiliate
>> > > >> > > via
>> > > >> > > > >> RT <
>> > > >> > > > >> > > met_help at ucar.edu> wrote:
>> > > >> > > > >> > >
>> > > >> > > > >> > >>
>> > > >> > > > >> > >> <URL:
>> > > >> https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
>> > > >> > >
>> > > >> > > > >> > >>
>> > > >> > > > >> > >> Thank you Minna!
>> > > >> > > > >> > >>
>> > > >> > > > >> > >> Mike
>> > > >> > > > >> > >>
>> > > >> > > > >> > >> On Thu, Sep 3, 2020 at 4:12 PM Minna Win via
RT <
>> > > >> > > met_help at ucar.edu
>> > > >> > > > >
>> > > >> > > > >> > >> wrote:
>> > > >> > > > >> > >>
>> > > >> > > > >> > >> > Hi Mike,
>> > > >> > > > >> > >> >
>> > > >> > > > >> > >> > It looks like you have a few questions associated
>> > > >> > > > >> > >> > with calculating Brier Skill Scores.  I'm assigning
>> > > >> > > > >> > >> > this ticket to John Halley Gotway.
>> > > >> > > > >> > >> >
>> > > >> > > > >> > >> > Regards,
>> > > >> > > > >> > >> > Minna
>> > > >> > > > >> > >> > ---------------
>> > > >> > > > >> > >> > Minna Win
>> > > >> > > > >> > >> > National Center for Atmospheric Research
>> > > >> > > > >> > >> > Developmental Testbed Center
>> > > >> > > > >> > >> > Phone: 303-497-8423
>> > > >> > > > >> > >> > Fax:   303-497-8401
>> > > >> > > > >> > >> >
>> > > >> > > > >> > >> >
>> > > >> > > > >> > >> >
>> > > >> > > > >> > >> > On Thu, Sep 3, 2020 at 1:13 PM Michael
Erickson -
>> NOAA
>> > > >> > > Affiliate
>> > > >> > > > >> via
>> > > >> > > > >> > RT
>> > > >> > > > >> > >> <
>> > > >> > > > >> > >> > met_help at ucar.edu> wrote:
>> > > >> > > > >> > >> >
>> > > >> > > > >> > >> > >
>> > > >> > > > >> > >> > > Thu Sep 03 13:13:26 2020: Request 96562 was acted upon.
>> > > >> > > > >> > >> > > Transaction: Ticket created by michael.j.erickson at noaa.gov
>> > > >> > > > >> > >> > >        Queue: met_help
>> > > >> > > > >> > >> > >      Subject: Including Climatology in grid_stat Config File
>> > > >> > > > >> > >> > >        Owner: Nobody
>> > > >> > > > >> > >> > >   Requestors: michael.j.erickson at noaa.gov
>> > > >> > > > >> > >> > >       Status: new
>> > > >> > > > >> > >> > >  Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
>> > > >> > > > >> > >> > >
>> > > >> > > > >> > >> > >
>> > > >> > > > >> > >> > > Greetings,
>> > > >> > > > >> > >> > >
>> > > >> > > > >> > >> > > For the first time I am attempting to calculate Brier
>> > > >> > > > >> > >> > > Skill Score using grid_stat from an input climatology
>> > > >> > > > >> > >> > > file. I have created a probabilistic flooding
>> > > >> > > > >> > >> > > climatology file (spans from zero to one; image is here:
>> > > >> > > > >> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/test10/ALL_noMRGL_UFVS_ALL.png).
>> > > >> > > > >> > >> > > This climatology is static, so it doesn't change with
>> > > >> > > > >> > >> > > time when inputting the "model" and "observation" data.
>> > > >> > > > >> > >> > > I believe I have successfully gotten this to work using
>> > > >> > > > >> > >> > > the command:
>> > > >> > > > >> > >> > >
>> > > >> > > > >> > >> > > /opt/MET/90/bin/grid_stat ERO_s2020083112_e2020090112_vhr09.nc
>> > > >> > > > >> > >> > > ST4gFFG_s2020083112_e2020090112_vhr09.nc usethis -outdir ~
>> > > >> > > > >> > >> > >
>> > > >> > > > >> > >> > > where grid_stat ERO_s2020083112_e2020090112_vhr09.nc are
>> > > >> > > > >> > >> > > discrete forecast probabilities of 0, 0.05, 0.1, 0.2,
>> > > >> > > > >> > >> > > and 0.5
>> > > >> > > > >> > >> > > where ST4gFFG_s2020083112_e2020090112_vhr09.nc are
>> > > >> > > > >> > >> > > observation values of 0 or 1
>> > > >> > > > >> > >> > > and usethis is the configuration file
>> > > >> > > > >> > >> > >
>> > > >> > > > >> > >> > > Finally the climatology file that consists of "almost"
>> > > >> > > > >> > >> > > continuous values between 0 and 1 is named:
>> > > >> > > > >> > >> > > UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
>> > > >> > > > >> > >> > >
>> > > >> > > > >> > >> > > I have put all of these files at
>> > > >> > > > >> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/ for
>> > > >> > > > >> > >> > > your reference.
>> > > >> > > > >> > >> > >
>> > > >> > > > >> > >> > > As for my questions:
>> > > >> > > > >> > >> > >
>> > > >> > > > >> > >> > > 1) I was wondering if the climatology file was properly
>> > > >> > > > >> > >> > > ingested and calculated for my example? I believe it is
>> > > >> > > > >> > >> > > correct given the output below, but I wanted to make
>> > > >> > > > >> > >> > > sure, since this is my first time doing this:
>> > > >> > > > >> > >> > >
>> > > >> > > > >> > >> > > *DEBUG 1: Forecast File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ERO_s2020083112_e2020090112_vhr09.nc
>> > > >> > > > >> > >> > > DEBUG 1: Observation File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ST4gFFG_s2020083112_e2020090112_vhr09.nc
>> > > >> > > > >> > >> > > DEBUG 3: Reading forecast data for EROSurface.
>> > > >> > > > >> > >> > > DEBUG 3: Reading observation data for ST4gFFGSurface.
>> > > >> > > > >> > >> > > DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_NcMet".
>> > > >> > > > >> > >> > > DEBUG 4:
>> > > >> > > > >> > >> > > DEBUG 4: Latitude/Longitude Grid Data:
>> > > >> > > > >> > >> > > DEBUG 4:    lat_ll: 25
>> > > >> > > > >> > >> > > DEBUG 4:    lon_ll: 129.8
>> > > >> > > > >> > >> > > DEBUG 4:    delta_lat: 0.09
>> > > >> > > > >> > >> > > DEBUG 4:    delta_lon: 0.09
>> > > >> > > > >> > >> > > DEBUG 4:    Nlat: 276
>> > > >> > > > >> > >> > > DEBUG 4:    Nlon: 721
>> > > >> > > > >> > >> > > DEBUG 4:
>> > > >> > > > >> > >> > > DEBUG 4: VarInfoFactory::new_var_info() -> created new VarInfo object of type "FileType_NcMet".
>> > > >> > > > >> > >> > > DEBUG 3: For forecast valid at 20200901_120000, found 1 climatology field(s) with valid time(s): 20201231_230000
>> > > >> > > > >> > >> > > DEBUG 3: Found 1 climatology fields.
>> > > >> > > > >> > >> > > DEBUG 3: Found 1 climatology mean and 0 climatology standard deviation field(s) for forecast EROSurface.
>> > > >> > > > >> > >> > > DEBUG 2: Processing masking regions.
>> > > >> > > > >> > >> > > DEBUG 3: Processing grid mask: FULL
>> > > >> > > > >> > >> > > DEBUG 4: parse_grid_mask() -> parsing grid mask "FULL"
>> > > >> > > > >> > >> > > DEBUG 2:
>> > > >> > > > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
>> > > >> > > > >> > >> > > DEBUG 2:
>> > > >> > > > >> > >> > > DEBUG 3: Smoothing field using the MAX(49) CircleTemplate interpolation method.
>> > > >> > > > >> > >> > > DEBUG 2: Processing EROSurface versus ST4gFFGSurface, for smoothing method MAX_CIRCLE(49), over region FULL, using 190638 matched pairs.
>> > > >> > > > >> > >> > > DEBUG 2: Computing Probabilistic Statistics.
>> > > >> > > > >> > >> > > DEBUG 2:
>> > > >> > > > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
>> > > >> > > > >> > >> > > DEBUG 2:
>> > > >> > > > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V.stat
>> > > >> > > > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V_pairs.nc*
>> > > >> > > > >> > >> > >
>> > > >> > > > >> > >> > > 2) This question is a bit more basic. I am unable to manually calculate a
>> > > >> > > > >> > >> > > Brier Score value for the forecast and observation that properly matches
>> > > >> > > > >> > >> > > that in the stat file. My manually calculated Brier Score is systematically
>> > > >> > > > >> > >> > > lower. For this event, the stat file BS is 0.0119 and my value is 0.0116.
>> > > >> > > > >> > >> > > I've looked at C3 in the MET Tutorial guide
>> > > >> > > > >> > >> > > <https://dtcenter.org/sites/default/files/community-code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf>,
>> > > >> > > > >> > >> > > but I'm still at a bit of a loss. Is there a simple way I can replicate the
>> > > >> > > > >> > >> > > calculation seen in the stat file?
>> > > >> > > > >> > >> > >
>> > > >> > > > >> > >> > > Thank you again for your help and please let me know if you have any
>> > > >> > > > >> > >> > > questions.
>> > > >> > > > >> > >> > >
>> > > >> > > > >> > >> > > Mike
>> > > >> > > > >> > >> > >
>> > > >> > > > >> > >> > > --
>> > > >> > > > >> > >> > > Michael J. Erickson
>> > > >> > > > >> > >> > >
>> > > >> > > > >> > >> > > Research Scientist
>> > > >> > > > >> > >> > > Cooperative Institute for Research in
>> Environmental
>> > > >> > Sciences
>> > > >> > > > >> (CIRES)
>> > > >> > > > >> > >> > > NOAA/NWS/Weather Prediction Center
>> > > >> > > > >> > >> > > Phone:  301-683-1546
>> > > >> > > > >> > >> > >

--
Michael J. Erickson

Research Scientist
Cooperative Institute for Research in Environmental Sciences (CIRES)
NOAA/NWS/Weather Prediction Center
Phone:  301-683-1546

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: Michael Erickson - NOAA Affiliate
Time: Wed Sep 16 08:03:40 2020

Hello,

I have one additional question: when extracting the BSS value from the stat
file based on my input climatology, should I extract field 40 (Brier Skill
Score relative to external climatology)?

Thanks again,

Mike

On Wed, Sep 16, 2020 at 9:44 AM Michael Erickson - NOAA Affiliate <
michael.j.erickson at noaa.gov> wrote:

> Hi All,
>
> I have redone my configuration file "usethis" to reflect the new threshold
> values. After that I have run grid_stat in the same manner as my initial
> email:
>
> /opt/MET/90/bin/grid_stat ERO_s2020083112_e2020090112_vhr21.nc
> ST4gFFG_s2020083112_e2020090112_vhr21.nc usethis -outdir ~
>
> I've put the input/output files here:
> https://ftp.wpc.ncep.noaa.gov/erickson/DTC/.
>
> I still have the issue where my Brier Score value is 0.0116 and the stat
> file value is ~0.0120. My method for computing BS is simpler than that in
> MET: I just compute the mean squared difference between model and
> observation probabilities (i.e., no summing through contingency table counts
> as is done on p. 446 of the MET tutorial
> <https://dtcenter.org/sites/default/files/community-code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf>).
>
> I am wondering if the Brier Score and Brier Skill Score values in the MET
> stat file look correct for this case? If so, then I am content to proceed
> with implementing this new setup at WPC.
>
> Thank you for all of your help with this!
>
> Mike
>
>
>
> On Fri, Sep 11, 2020 at 7:44 AM Michael Erickson - NOAA Affiliate <
> michael.j.erickson at noaa.gov> wrote:
>
>> Hi John,
>>
>> Thank you for your help here! I do appreciate it. I will gradually work
>> in your recommended changes to my python scripts.
>>
>> Regarding your options, these are good suggestions and I can understand
>> how complicated this is. I would advise against 2) since this would change
>> the results from previous versions. Option 1) is appealing to me, but I'm
>> not sure if there are many other users with discrete thresholds in their
>> gridded data. I could see the utility of a -left, -middle, -right option
>> which will default to mid-point binning when unspecified. It's unfortunate
>> that the user will lose either the left-most or right-most category with
>> this option, but if the user is savvy enough to get to this level of detail,
>> they can probably modify either their data or the thresholds to fit within
>> the constraints of left/right binning. Another option is to calculate BS
>> without summing through the thresholds, but this loses a layer of
>> complexity that I like.
>>
>> I hope this helps and thank you!
>>
>> Mike
>>
>> On Thu, Sep 10, 2020 at 3:28 PM John Halley Gotway via RT <
>> met_help at ucar.edu> wrote:
>>
>>> Sorry it took me so long to answer. So we know that MET uses the
>>> centerpoint of the bin as the probability value. And we know that your
>>> data is already binned with the only valid probability values being:
>>> 0.0, 0.05, 0.1, 0.2, 0.5, 1.
>>>
>>> So we want to choose bins whose centerpoints correspond to these
>>> probability values. However, we're a little constrained because MET
>>> requires the first and last thresholds to be 0 and 1, respectively, and
>>> that everything in between be monotonically increasing.
>>>
>>> The most concise way I can think of uses 7 bins defined by:
>>> cat_thresh = [ >=0.0, >=0.001, >=0.1, >=0.1001, >=0.3, >=0.7, >=0.999, >=1.0 ];
>>>
>>> Bin 1 for prob = 0: 0 to 0.001
>>> Bin 2 for prob = 0.05: 0.001 to 0.1
>>> Bin 3 for prob = 0.1: 0.1 to 0.1001
>>> Bin 4 for prob = 0.2: 0.1001 to 0.3
>>> Bin 5 for prob = 0.5: 0.3 to 0.7
>>> Bin 6 as a placeholder: 0.7 to 0.999
>>> Bin 7 for prob = 1.0: 0.999 to 1.0
>>>
>>> But perhaps it'd be more clear with:
>>> cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.0501, >=0.10, >=0.101, >=0.2, >=0.201, >=0.5, >=0.501, >=0.999, >=1.0 ];
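
The bin choices above can be checked with a few lines of Python. This is a minimal sketch, not MET code: it assumes each bin covers thresholds[i] <= p < thresholds[i+1] (with the last bin also including 1.0) and that the bin mid-point stands in for every probability that falls in the bin, as described in this thread. The function names are hypothetical.

```python
import bisect


def bin_midpoints(thresholds):
    """Mid-point of each bin defined by consecutive thresholds."""
    return [(lo + hi) / 2.0 for lo, hi in zip(thresholds[:-1], thresholds[1:])]


def assigned_midpoint(prob, thresholds):
    """Mid-point of the bin that prob falls into.

    Assumed rule (not taken from the MET source): bin i covers
    thresholds[i] <= p < thresholds[i + 1]; the last bin also includes 1.0.
    """
    i = bisect.bisect_right(thresholds, prob) - 1
    i = min(i, len(thresholds) - 2)  # keep p == 1.0 in the last bin
    return bin_midpoints(thresholds)[i]


# Thresholds quoted in this thread.
original = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]
seven_bin = [0.0, 0.001, 0.1, 0.1001, 0.3, 0.7, 0.999, 1.0]

# The only probability values present in the forecast data.
for p in [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]:
    print(p, assigned_midpoint(p, original), assigned_midpoint(p, seven_bin))
```

Each printed row shows a forecast value followed by the value it would be evaluated as under the original thresholds and under the 7-bin thresholds.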
>>>
>>> But all these mental gymnastics seem way too confusing!
>>> So what changes can we make to Point-Stat and Grid-Stat to better handle
>>> this situation in the future?
>>>
>>> No very obvious solution occurs to me, but some options include:
>>>
>>> (1) Add a config option to switch from using the mid-point of the
>>> probability bin to using the left or right side.
>>> But for the first bin, you'd want the left side. And for the last bin,
>>> you'd want the right side! We could consider 0 to be a special case?
>>> And this requires the user to be very savvy to understand all these
>>> details.
>>>
>>> (2) Consider changing the logic to ALWAYS include bins for 0 to 0 and 1 to
>>> 1 since the endpoints are kind of special cases?
>>> But that'd change existing results which is not good.
>>>
>>> (3) Pre-process the input probability values before any smoothing or
>>> interpolation to point observations occurs.
>>> Keep track of the unique values to determine if the data is binned.
>>> But what qualifies as being binned? 5 unique probabilities? 10? 20? 50?
>>> 100?
>>> Potentially print a warning message if they've chosen probability bins
>>> poorly?
>>> What does poorly mean?
>>>
>>> If we can define some very specific solutions, we can make the code do
>>> whatever we want.
>>>
>>> But ideally the changes would not change existing results, be intuitive
>>> for a user to understand, and be easy to document.
>>>
>>> Please let me know.
>>>
>>> Thanks,
>>> John
>>>
>>> On Wed, Sep 9, 2020 at 3:50 PM Michael Erickson - NOAA Affiliate
via RT <
>>> met_help at ucar.edu> wrote:
>>>
>>> >
>>> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
>>> >
>>> > Thanks Everyone for your helpful responses.
>>> >
>>> > I have been using grid_stat for WPC's Excessive Rainfall Outlook
>>> > (consisting of probabilities of 0, 0.05, 0.1, 0.2, and 0.5) for years.
>>> > These results are dependent upon MET, so I wanted to make sure I am
>>> > following best practices.
>>> >
>>> > What is DTC's guidance on how to proceed forward with this? Should I
>>> > change my cat_thresh to
>>> > "= [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];"
>>> > or is my current setting fine given that both the "forecast" and
>>> > "observation" are broken down by the same discrete increments? I can also
>>> > just calculate Brier Score manually outside of grid_stat.
>>> >
>>> > Thanks,
>>> >
>>> > Mike
>>> >
>>> > On Tue, Sep 8, 2020 at 4:56 PM Barbara Brown via RT
<met_help at ucar.edu
>>> >
>>> > wrote:
>>> >
>>> > > I agree with Eric and John. The way MET does this generally makes sense
>>> > > for ensemble forecasts (or other cases when you want MET to select the
>>> > > thresholds) but not for the case when the probabilities for specific
>>> > > categories are provided by the user. I'm not sure what the work-around
>>> > > might be (John may have ideas!) but in the long run it would be good to
>>> > > allow for this option.
>>> > >
>>> > > Barb
>>> > > ---
>>> > > Barbara Brown, Senior Research Associate
>>> > > Research Applications Laboratory
>>> > > NCAR PO Box 3000
>>> > > Boulder CO 80307-3000 USA
>>> > > Ph: +1 303 497 8468  FAX: +1 303 497 8401
>>> > >
>>> > >
>>> > > On Tue, Sep 8, 2020 at 2:14 PM Michael Erickson - NOAA
Affiliate <
>>> > > michael.j.erickson at noaa.gov> wrote:
>>> > >
>>> > > > Hi Eric and John,
>>> > > >
>>> > > > Thank you for your response to this matter. What would be the best
>>> > > > practice to follow in this situation?
>>> > > >
>>> > > > Thanks,
>>> > > >
>>> > > > Mike
>>> > > >
>>> > > > On Tue, Sep 8, 2020 at 3:41 PM Eric Gilleland via RT <
>>> > met_help at ucar.edu>
>>> > > > wrote:
>>> > > >
>>> > > >> Hi John,
>>> > > >>
>>> > > >> I agree that if the probabilities have already been binned, then it is
>>> > > >> strange to then take the midpoint (re-binning).
>>> > > >>
>>> > > >> Eric
>>> > > >>
>>> > > >> On Fri, Sep 4, 2020 at 11:14 AM John Halley Gotway via RT <
>>> > > >> met_help at ucar.edu>
>>> > > >> wrote:
>>> > > >>
>>> > > >> > Barb and Eric,
>>> > > >> >
>>> > > >> > I've added you to this met-help ticket from Mike Erickson from NOAA/WPC.
>>> > > >> > We're hoping to get some advice from one or both of you about
>>> > > >> > probabilistic verification.
>>> > > >> >
>>> > > >> > Mike is running Grid-Stat to verify WPC's Excessive Rainfall Outlooks
>>> > > >> > against StageIV precip. The forecast probability values are always 0,
>>> > > >> > 0.05, 0.1, 0.2, 0.5, or 1.0.
>>> > > >> > When Mike computes the Brier score by hand, it differs from the results
>>> > > >> > reported by Grid-Stat out in the 3rd decimal place.
>>> > > >> >
>>> > > >> > My theory is that the difference is caused by the fact that MET does not
>>> > > >> > compute the Brier score directly on the probability values. Instead, it
>>> > > >> > bins them into an Nx2 probabilistic contingency table and computes the
>>> > > >> > Brier score from that table. And the mid-point of each bin is used in the
>>> > > >> > Brier score computations. So different probability bins will result in a
>>> > > >> > slightly different Brier score.
>>> > > >> >
>>> > > >> > Mike is currently using probability thresholds as follows:
>>> > > >> >    cat_thresh = [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
>>> > > >> >
>>> > > >> > And that's consistent with the probability values. But when you think
>>> > > >> > about it...
>>> > > >> > - Forecasts of 0% fall into the first bin and are evaluated as being a
>>> > > >> > value of 0.025 (mid-point of the 0.0 to 0.05 bin)
>>> > > >> > - Forecasts of 5% fall into the second bin and are evaluated as being a
>>> > > >> > value of 0.075 (mid-point of the 0.05 to 0.1 bin)
>>> > > >> > - Forecasts of 10% fall into the third bin and are evaluated as being a
>>> > > >> > value of 0.150 (mid-point of the 0.1 to 0.2 bin).
>>> > > >> > - and so on for the other probability values
>>> > > >> >
>>> > > >> > Seems like the binning of probability values works better for continuous
>>> > > >> > probability values and not so well for probabilities that have already
>>> > > >> > been binned!
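
The mid-point effect described above can be reproduced on a toy data set. This is a minimal sketch, not MET's implementation: the forecast and observation values are invented for illustration, the function names are hypothetical, and the bin-membership rule (thresholds[i] <= p < thresholds[i+1], last bin closed at 1.0) is an assumption. It compares the Brier score computed directly on the raw probabilities with one computed from an Nx2 table of counts using bin mid-points.

```python
import bisect


def brier_raw(fcst, obs):
    """Mean squared difference between forecast probability and outcome."""
    return sum((f - o) ** 2 for f, o in zip(fcst, obs)) / len(fcst)


def brier_from_table(fcst, obs, thresholds):
    """Brier score from an Nx2 table of counts, using bin mid-points."""
    n_bins = len(thresholds) - 1
    table = [[0, 0] for _ in range(n_bins)]  # [event count, non-event count]
    for f, o in zip(fcst, obs):
        # Assumed rule: bin i covers thresholds[i] <= p < thresholds[i + 1].
        i = min(bisect.bisect_right(thresholds, f) - 1, n_bins - 1)
        table[i][0 if o == 1 else 1] += 1
    total = sum(n_event + n_non for n_event, n_non in table)
    score = 0.0
    for i, (n_event, n_non) in enumerate(table):
        mid = 0.5 * (thresholds[i] + thresholds[i + 1])
        score += n_event * (mid - 1.0) ** 2 + n_non * mid ** 2
    return score / total


# Invented sample data: pre-binned forecast probabilities and 0/1 outcomes.
fcst = [0.0] * 6 + [0.05, 0.05, 0.1, 0.5]
obs = [0] * 6 + [0, 1, 0, 1]

coarse = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]
narrow = [0.0, 0.001, 0.05, 0.0501, 0.10, 0.101,
          0.2, 0.201, 0.5, 0.501, 0.999, 1.0]

print("raw:          ", brier_raw(fcst, obs))
print("coarse bins:  ", brier_from_table(fcst, obs, coarse))
print("narrow bins:  ", brier_from_table(fcst, obs, narrow))
```

With narrow bins placed just above each valid probability value, the table-based score stays close to the raw score, while the coarse bins shift it noticeably.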
>>> > > >> >
>>> > > >> > I'm wondering if you have any thoughts or advice about this situation?
>>> > > >> >
>>> > > >> > Thanks,
>>> > > >> > John
>>> > > >> >
>>> > > >> > On Fri, Sep 4, 2020 at 10:47 AM Michael Erickson - NOAA
>>> Affiliate
>>> > via
>>> > > >> RT <
>>> > > >> > met_help at ucar.edu> wrote:
>>> > > >> >
>>> > > >> > >
>>> > > >> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
>>> > > >> > >
>>> > > >> > > Hi John,
>>> > > >> > >
>>> > > >> > > Thanks for your answers and sounds good! That is strange that the climo
>>> > > >> > > file was not found for your setting. The only detail I can think of is
>>> > > >> > > that within the climo field, the file_name specification is static:
>>> > > >> > >
>>> > > >> > > file_name = [
>>> > > >> > > "/usr1/wpc_cpgffh/gribs/ERO_verif/static/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc"
>>> > > >> > > ];
>>> > > >> > >
>>> > > >> > >
>>> > > >> > > I believe you concluded that my climo read-in looked correct?
>>> > > >> > >
>>> > > >> > > Thanks,
>>> > > >> > >
>>> > > >> > > Mike
>>> > > >> > >
>>> > > >> > >
>>> > > >> > > On Fri, Sep 4, 2020 at 12:42 PM John Halley Gotway via
RT <
>>> > > >> > > met_help at ucar.edu>
>>> > > >> > > wrote:
>>> > > >> > >
>>> > > >> > > > Mike,
>>> > > >> > > >
>>> > > >> > > > 2 more things I forgot to address.
>>> > > >> > > >
>>> > > >> > > > First, I pulled that climo field but when I ran grid_stat with your
>>> > > >> > > > usethis config file, it did not actually read the climo data.
>>> > > >> > > >
>>> > > >> > > > DEBUG 3: Found 0 climatology fields.
>>> > > >> > > >
>>> > > >> > > >
>>> > > >> > > > I'm wondering what additional configuration settings you used to make
>>> > > >> > > > this work?
>>> > > >> > > >
>>> > > >> > > >
>>> > > >> > > > Second, the answer to your question is yes. The exact same binning
>>> > > >> > > > logic used for the forecast probabilities is applied to the climo data.
>>> > > >> > > > In fact, the forecast probability bins are applied to both the forecast
>>> > > >> > > > and climo data. So you do not need to define separate "cat_thresh"
>>> > > >> > > > settings for the climo. They won't be used anyway.
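
A minimal sketch of what applying the same bins to both fields implies for a skill score. It assumes the textbook definition BSS = 1 - BS_forecast / BS_climo and the mid-point binning described in this thread; it is not taken from the MET source, the sample values are invented, and the function names are hypothetical.

```python
import bisect


def binned_brier(probs, obs, thresholds):
    """Brier score with each probability replaced by its bin mid-point."""
    total = 0.0
    for p, o in zip(probs, obs):
        # Assumed rule: bin i covers thresholds[i] <= p < thresholds[i + 1],
        # with the last bin also including 1.0.
        i = min(bisect.bisect_right(thresholds, p) - 1, len(thresholds) - 2)
        mid = 0.5 * (thresholds[i] + thresholds[i + 1])
        total += (mid - o) ** 2
    return total / len(probs)


def brier_skill_score(fcst, climo, obs, thresholds):
    """BSS relative to climatology, binning both fields with the same bins."""
    bs_fcst = binned_brier(fcst, obs, thresholds)
    bs_climo = binned_brier(climo, obs, thresholds)
    if bs_climo == 0.0:
        raise ValueError("reference Brier score is zero; BSS is undefined")
    return 1.0 - bs_fcst / bs_climo


# Invented sample data: forecast, static climatology, and 0/1 observations.
thresholds = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]
fcst = [0.0, 0.0, 0.05, 0.5]
climo = [0.02, 0.02, 0.02, 0.02]
obs = [0, 0, 0, 1]

print(brier_skill_score(fcst, climo, obs, thresholds))
```

Because the climatology is pushed through the forecast bins, a nearly continuous climatology field is coarsened to a handful of mid-point values before it enters the reference score.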
>>> > > >> > > >
>>> > > >> > > >
>>> > > >> > > > Here's the spot in the library code where the climo probabilistic
>>> > > >> > > > contingency table is created using the forecast probability bins:
>>> > > >> > > >
>>> > > >> > > > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/compute_stats.cc#L767
>>> > > >> > > >
>>> > > >> > > >
>>> > > >> > > > Thanks,
>>> > > >> > > > John
>>> > > >> > > >
>>> > > >> > > > On Fri, Sep 4, 2020 at 10:25 AM John Halley Gotway <
>>> > > johnhg at ucar.edu
>>> > > >> >
>>> > > >> > > > wrote:
>>> > > >> > > >
>>> > > >> > > > > Mike,
>>> > > >> > > > >
>>> > > >> > > > > I don't really have a recommendation on best practices with regard to
>>> > > >> > > > > the binning of probability values.
>>> > > >> > > > >
>>> > > >> > > > > I can say that I more commonly see people choose fixed bin widths, like
>>> > > >> > > > > "==0.10" (for 10 bins) or "==0.05" (for 20 bins) instead of variable
>>> > > >> > > > > width bins, such as:
>>> > > >> > > > > [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ]
>>> > > >> > > > >
>>> > > >> > > > > But I suspect that's more out of convenience than anything else. With
>>> > > >> > > > > regard to your chosen bins, I suspect you set them up this way since you
>>> > > >> > > > > have lots of low probability values closer to 0.0 and relatively few
>>> > > >> > > > > probability values closer to 1.0. While this may be a good choice for
>>> > > >> > > > > relatively rare events, it wouldn't be as good of a choice for very
>>> > > >> > > > > common events resulting in high probability values.
>>> > > >> > > > >
>>> > > >> > > > > Choosing 20 bins (==0.05) would include all of your current bin
>>> > > >> > > > > boundaries and enable you to sample evenly across the probability space,
>>> > > >> > > > > regardless of whether the values are bunched near 0 or 1. And
>>> > > >> > > > > mathematically, your current bins would be derivable from these.
>>> > > >> > > > >
>>> > > >> > > > > But if your chosen bins follow some existing WPC convention, I don't see
>>> > > >> > > > > an obvious reason to change them.
>>> > > >> > > > >
>>> > > >> > > > > Please let me know if you'd like me to forward this question to one of
>>> > > >> > > > > the statisticians in our group for their advice.
>>> > > >> > > > >
>>> > > >> > > > > Thanks,
>>> > > >> > > > > John
>>> > > >> > > > >
>>> > > >> > > > > On Fri, Sep 4, 2020 at 5:08 AM Michael Erickson -
NOAA
>>> > Affiliate
>>> > > >> via
>>> > > >> > > RT <
>>> > > >> > > > > met_help at ucar.edu> wrote:
>>> > > >> > > > >
>>> > > >> > > > >>
>>> > > >> > > > >> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
>>> > > >> > > > >>
>>> > > >> > > > >> Hi John,
>>> > > >> > > > >>
>>> > > >> > > > >> Thank you for your quick and helpful response! To answer your questions
>>> > > >> > > > >> from the first email:
>>> > > >> > > > >>
>>> > > >> > > > >> 1) I have included the climo file in case you wanted to see it:
>>> > > >> > > > >>
>>> > > >> > > > >> https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
>>> > > >> > > > >>
>>> > > >> > > > >> 2) I start from the netcdf output from grid_stat, load that data into
>>> > > >> > > > >> the python workspace, and compute the brier score from that.
>>> > > >> > > > >>
>>> > > >> > > > >> Also the circle diameter of 9 in the observation file is to draw a 40 km
>>> > > >> > > > >> radius around the "observation."
>>> > > >> > > > >>
>>> > > >> > > > >> From your latter email, it sounds like I may not be able to exactly
>>> > > >> > > > >> replicate the Brier Score calculation. In the spirit of best practices,
>>> > > >> > > > >> would you recommend I change cat_thresh to
>>> > > >> > > > >> "= [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];"
>>> > > >> > > > >> or keep my cat_thresh as it currently is as long as I am consistent? I
>>> > > >> > > > >> was also wondering if grid_stat bins the probabilities for the climo
>>> > > >> > > > >> field as it does for the probabilities in the forecast field?
>>> > > >> > > > >>
>>> > > >> > > > >> Thanks again!
>>> > > >> > > > >>
>>> > > >> > > > >> Mike
>>> > > >> > > > >>
>>> > > >> > > > >> On Thu, Sep 3, 2020 at 7:12 PM John Halley Gotway
via RT
>>> <
>>> > > >> > > > >> met_help at ucar.edu>
>>> > > >> > > > >> wrote:
>>> > > >> > > > >>
>>> > > >> > > > >> > Actually, I have a reasonable guess as to why you may be seeing a
>>> > > >> > > > >> > difference.
>>> > > >> > > > >> >
>>> > > >> > > > >> > All probabilistic verification in MET is based on an Nx2 probabilistic
>>> > > >> > > > >> > contingency table. Those are the counts in the PCT line type. We do this
>>> > > >> > > > >> > to make it easier to aggregate statistics across multiple cases, by
>>> > > >> > > > >> > summing up contingency tables before recomputing statistics. But the
>>> > > >> > > > >> > pros/cons of this approach would probably be better addressed by a
>>> > > >> > > > >> > statistician. So the stats are computed using probability bins and not
>>> > > >> > > > >> > raw probability values.
>>> > > >> > > > >> >
>>> > > >> > > > >> > If you went and computed the Brier score by hand, you probably did so
>>> > > >> > > > >> > using raw probability values and not binning them first.
>>> > > >> > > > >> >
>>> > > >> > > > >> > And this difference could explain the type of discrepancy you're seeing.
>>> > > >> > > > >> >
>>> > > >> > > > >> > To test this out, I reran your case...
>>> > > >> > > > >> > (1) Using your original settings to confirm your Brier score of 0.011934.
>>> > > >> > > > >> > (2) Using 10 equally-spaced probability bins (cat_thresh = [ ==0.1 ];)
>>> > > >> > > > >> > which produced a Brier score of 0.013747.
>>> > > >> > > > >> > (3) Using 50 equally-spaced probability bins (cat_thresh = [ ==0.02 ];)
>>> > > >> > > > >> > which produced a Brier score of 0.01197.
>>> > > >> > > > >> > (4) Using 100 equally-spaced probability bins (cat_thresh = [ ==0.01 ];)
>>> > > >> > > > >> > which produced a Brier score of 0.01193.
>>> > > >> > > > >> >
>>> > > >> > > > >> > I suppose that doesn't explain the exact discrepancy, but could
>>> > > >> > > > >> > definitely be involved.
>>> > > >> > > > >> >
>>> > > >> > > > >> > Notice on this line of the brier score computation in MET:
>>> > > >> > > > >> >
>>> > > >> > > > >> > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/contable_nx2.cc#L647
>>> > > >> > > > >> >
>>> > > >> > > > >> > That the "probability" value returned by "row_proby()" is the mid-point
>>> > > >> > > > >> > of the bin.
>>> > > >> > > > >> > So all of your forecast probability values of 0% which fall into the
>>> > > >> > > > >> > first bin are actually evaluated as having a probability value of 0.025
>>> > > >> > > > >> > which is the mid-point between 0 and 0.05 for the first bin.
>>> > > >> > > > >> >
>>> > > >> > > > >> > Rerunning using the following to minimize that effect on the 0's:
>>> > > >> > > > >> > cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
>>> > > >> > > > >> > produces a brier score of 0.011489.
>>> > > >> > > > >> >
>>> > > >> > > > >> > So I'd say that the binning of the probability values is impacting the
>>> > > >> > > > >> > Brier score out in the 4th decimal place.
>>> > > >> > > > >> >
>>> > > >> > > > >> > John
>>> > > >> > > > >> >
>>> > > >> > > > >> > On Thu, Sep 3, 2020 at 4:47 PM John Halley
Gotway <
>>> > > >> > johnhg at ucar.edu>
>>> > > >> > > > >> wrote:
>>> > > >> > > > >> >
>>> > > >> > > > >> > > Hi Mike,
>>> > > >> > > > >> > >
>>> > > >> > > > >> > > Looks like you were able to make a lot of progress. I certainly don't
>>> > > >> > > > >> > > see anything wrong based on the log messages you sent.
>>> > > >> > > > >> > >
>>> > > >> > > > >> > > I do notice that you're smoothing the observations with the maximum
>>> > > >> > > > >> > > value in a circle of diameter 9... presumably for a good reason. And I
>>> > > >> > > > >> > > see that smoothing step indicated in the log messages as well as the
>>> > > >> > > > >> > > output .stat file.
>>> > > >> > > > >> > >
>>> > > >> > > > >> > > Two questions.
>>> > > >> > > > >> > >
>>> > > >> > > > >> > > (1) I wanted to try running locally, but didn't find the "climo" file
>>> > > >> > > > >> > > on the WPC ftp site:
>>> > > >> > > > >> > >
>>> > > >> > > > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
>>> > > >> > > > >> > >
>>> > > >> > > > >> > > Could you add that?
>>> > > >> > > > >> > >
>>> > > >> > > > >> > > (2) When you say that you tried to replicate the Brier score
>>> > > >> > > > >> > > computation, what was your starting point? The raw input files or using
>>> > > >> > > > >> > > the NetCDF matched pairs output from Grid-Stat which already include the
>>> > > >> > > > >> > > computation of the observation maximums?
>>> > > >> > > > >> > >
>>> > > >> > > > >> > > Thanks,
>>> > > >> > > > >> > > John Halley Gotway
>>> > > >> > > > >> > >
>>> > > >> > > > >> > > On Thu, Sep 3, 2020 at 2:26 PM Michael Erickson - NOAA Affiliate
>>> > > >> > > > >> > > via RT <met_help at ucar.edu> wrote:
>>> > > >> > > > >> > >
>>> > > >> > > > >> > >>
>>> > > >> > > > >> > >> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
>>> > > >> > > > >> > >>
>>> > > >> > > > >> > >> Thank you Minna!
>>> > > >> > > > >> > >>
>>> > > >> > > > >> > >> Mike
>>> > > >> > > > >> > >>
>>> > > >> > > > >> > >> On Thu, Sep 3, 2020 at 4:12 PM Minna Win via RT <met_help at ucar.edu>
>>> > > >> > > > >> > >> wrote:
>>> > > >> > > > >> > >>
>>> > > >> > > > >> > >> > Hi Mike,
>>> > > >> > > > >> > >> >
>>> > > >> > > > >> > >> > It looks like you have a few questions associated with
>>> > > >> > > > >> > >> > calculating Brier Skill Scores. I'm assigning this ticket to
>>> > > >> > > > >> > >> > John Halley Gotway.
>>> > > >> > > > >> > >> >
>>> > > >> > > > >> > >> > Regards,
>>> > > >> > > > >> > >> > Minna
>>> > > >> > > > >> > >> > ---------------
>>> > > >> > > > >> > >> > Minna Win
>>> > > >> > > > >> > >> > National Center for Atmospheric Research
>>> > > >> > > > >> > >> > Developmental Testbed Center
>>> > > >> > > > >> > >> > Phone: 303-497-8423
>>> > > >> > > > >> > >> > Fax:   303-497-8401
>>> > > >> > > > >> > >> >
>>> > > >> > > > >> > >> >
>>> > > >> > > > >> > >> >
>>> > > >> > > > >> > >> > On Thu, Sep 3, 2020 at 1:13 PM Michael Erickson - NOAA Affiliate
>>> > > >> > > > >> > >> > via RT <met_help at ucar.edu> wrote:
>>> > > >> > > > >> > >> >
>>> > > >> > > > >> > >> > >
>>> > > >> > > > >> > >> > > Thu Sep 03 13:13:26 2020: Request 96562 was acted upon.
>>> > > >> > > > >> > >> > > Transaction: Ticket created by michael.j.erickson at noaa.gov
>>> > > >> > > > >> > >> > >        Queue: met_help
>>> > > >> > > > >> > >> > >      Subject: Including Climatology in grid_stat Config File
>>> > > >> > > > >> > >> > >        Owner: Nobody
>>> > > >> > > > >> > >> > >   Requestors: michael.j.erickson at noaa.gov
>>> > > >> > > > >> > >> > >       Status: new
>>> > > >> > > > >> > >> > >  Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
>>> > > >> > > > >> > >> > >
>>> > > >> > > > >> > >> > >
>>> > > >> > > > >> > >> > > Greetings,
>>> > > >> > > > >> > >> > >
>>> > > >> > > > >> > >> > > For the first time I am attempting to calculate Brier Skill Score
>>> > > >> > > > >> > >> > > using grid_stat from an input climatology file. I have created a
>>> > > >> > > > >> > >> > > probabilistic flooding climatology file (spans from zero to one;
>>> > > >> > > > >> > >> > > image is here:
>>> > > >> > > > >> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/test10/ALL_noMRGL_UFVS_ALL.png).
>>> > > >> > > > >> > >> > > This climatology is static, so it doesn't change with time when
>>> > > >> > > > >> > >> > > inputting the "model" and "observation" data. I believe I have
>>> > > >> > > > >> > >> > > successfully gotten this to work using the command:
>>> > > >> > > > >> > >> > >
>>> > > >> > > > >> > >> > > /opt/MET/90/bin/grid_stat ERO_s2020083112_e2020090112_vhr09.nc
>>> > > >> > > > >> > >> > > ST4gFFG_s2020083112_e2020090112_vhr09.nc usethis -outdir ~
>>> > > >> > > > >> > >> > >
>>> > > >> > > > >> > >> > > where grid_stat ERO_s2020083112_e2020090112_vhr09.nc are discrete
>>> > > >> > > > >> > >> > > forecast probabilities of 0, 0.05, 0.1, 0.2, and 0.5
>>> > > >> > > > >> > >> > > where ST4gFFG_s2020083112_e2020090112_vhr09.nc are observation
>>> > > >> > > > >> > >> > > values of 0 or 1
>>> > > >> > > > >> > >> > > and usethis is the configuration file
>>> > > >> > > > >> > >> > >
>>> > > >> > > > >> > >> > > Finally the climatology file that consists of "almost" continuous
>>> > > >> > > > >> > >> > > values between 0 and 1 is named:
>>> > > >> > > > >> > >> > > UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
>>> > > >> > > > >> > >> > >
>>> > > >> > > > >> > >> > > I have put all of these files at
>>> > > >> > > > >> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/ for
>>> > > >> > > > >> > >> > > your reference.
>>> > > >> > > > >> > >> > >
>>> > > >> > > > >> > >> > > As for my questions:
>>> > > >> > > > >> > >> > >
>>> > > >> > > > >> > >> > > 1) I was wondering if the climatology file was properly ingested
>>> > > >> > > > >> > >> > > and calculated for my example? I believe it is correct given the
>>> > > >> > > > >> > >> > > output below, but I wanted to make sure, since this is my first
>>> > > >> > > > >> > >> > > time doing this:
>>> > > >> > > > >> > >> > >
>>> > > >> > > > >> > >> > > DEBUG 1: Forecast File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ERO_s2020083112_e2020090112_vhr09.nc
>>> > > >> > > > >> > >> > > DEBUG 1: Observation File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ST4gFFG_s2020083112_e2020090112_vhr09.nc
>>> > > >> > > > >> > >> > > DEBUG 3: Reading forecast data for EROSurface.
>>> > > >> > > > >> > >> > > DEBUG 3: Reading observation data for ST4gFFGSurface.
>>> > > >> > > > >> > >> > > DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_NcMet".
>>> > > >> > > > >> > >> > > DEBUG 4:
>>> > > >> > > > >> > >> > > DEBUG 4: Latitude/Longitude Grid Data:
>>> > > >> > > > >> > >> > > DEBUG 4: lat_ll: 25
>>> > > >> > > > >> > >> > > DEBUG 4: lon_ll: 129.8
>>> > > >> > > > >> > >> > > DEBUG 4: delta_lat: 0.09
>>> > > >> > > > >> > >> > > DEBUG 4: delta_lon: 0.09
>>> > > >> > > > >> > >> > > DEBUG 4: Nlat: 276
>>> > > >> > > > >> > >> > > DEBUG 4: Nlon: 721
>>> > > >> > > > >> > >> > > DEBUG 4:
>>> > > >> > > > >> > >> > > DEBUG 4: VarInfoFactory::new_var_info() -> created new VarInfo object of type "FileType_NcMet".
>>> > > >> > > > >> > >> > > DEBUG 3: For forecast valid at 20200901_120000, found 1 climatology field(s) with valid time(s): 20201231_230000
>>> > > >> > > > >> > >> > > DEBUG 3: Found 1 climatology fields.
>>> > > >> > > > >> > >> > > DEBUG 3: Found 1 climatology mean and 0 climatology standard deviation field(s) for forecast EROSurface.
>>> > > >> > > > >> > >> > > DEBUG 2: Processing masking regions.
>>> > > >> > > > >> > >> > > DEBUG 3: Processing grid mask: FULL
>>> > > >> > > > >> > >> > > DEBUG 4: parse_grid_mask() -> parsing grid mask "FULL"
>>> > > >> > > > >> > >> > > DEBUG 2:
>>> > > >> > > > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
>>> > > >> > > > >> > >> > > DEBUG 2:
>>> > > >> > > > >> > >> > > DEBUG 3: Smoothing field using the MAX(49) CircleTemplate interpolation method.
>>> > > >> > > > >> > >> > > DEBUG 2: Processing EROSurface versus ST4gFFGSurface, for smoothing method MAX_CIRCLE(49), over region FULL, using 190638 matched pairs.
>>> > > >> > > > >> > >> > > DEBUG 2: Computing Probabilistic Statistics.
>>> > > >> > > > >> > >> > > DEBUG 2:
>>> > > >> > > > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
>>> > > >> > > > >> > >> > > DEBUG 2:
>>> > > >> > > > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V.stat
>>> > > >> > > > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V_pairs.nc
>>> > > >> > > > >> > >> > >
>>> > > >> > > > >> > >> > >
>>> > > >> > > > >> > >> > > 2) This question is a bit more basic. I am unable to manually
>>> > > >> > > > >> > >> > > calculate a Brier Score value for the forecast and observation that
>>> > > >> > > > >> > >> > > properly matches that in the stat file. My manually calculated Brier
>>> > > >> > > > >> > >> > > Score is systematically lower. For this event, the stat file BS is
>>> > > >> > > > >> > >> > > 0.0119 and my value is 0.0116. I've looked at C3 in the MET Tutorial
>>> > > >> > > > >> > >> > > guide
>>> > > >> > > > >> > >> > > <https://dtcenter.org/sites/default/files/community-code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf>,
>>> > > >> > > > >> > >> > > but I'm still at a bit of a loss. Is there a simple way I can
>>> > > >> > > > >> > >> > > replicate the calculation seen in the stat file?
>>> > > >> > > > >> > >> > >
>>> > > >> > > > >> > >> > > Thank you again for your help and please let me know if you have any
>>> > > >> > > > >> > >> > > questions.
>>> > > >> > > > >> > >> > >
>>> > > >> > > > >> > >> > > Mike
>>> > > >> > > > >> > >> > >
>>> > > >> > > > >> > >> > > --
>>> > > >> > > > >> > >> > > Michael J. Erickson
>>> > > >> > > > >> > >> > >
>>> > > >> > > > >> > >> > > Research Scientist
>>> > > >> > > > >> > >> > > Cooperative Institute for Research in
>>> Environmental
>>> > > >> > Sciences
>>> > > >> > > > >> (CIRES)
>>> > > >> > > > >> > >> > > NOAA/NWS/Weather Prediction Center
>>> > > >> > > > >> > >> > > Phone:  301-683-1546
>>> > > >> > > > >> > >> > >
>>> > > >> > > > >> > >> > >

--
Michael J. Erickson

Research Scientist
Cooperative Institute for Research in Environmental Sciences (CIRES)
NOAA/NWS/Weather Prediction Center
Phone:  301-683-1546

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: John Halley Gotway
Time: Thu Sep 17 12:33:15 2020

Mike,

Yes, that's correct:
   https://dtcenter.github.io/MET/Users_Guide/point-stat.html#id15

Field 40 is BSS relative to external climatology whereas BSS_SMPL is
relative to the sample climatology.
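
For pulling those two values out of a .stat file, a minimal Python sketch follows. It assumes the output line is whitespace-delimited and that fields are counted from 1, so field 40 is BSS as noted above. Reading BSS_SMPL from field 41 is an assumption that should be checked against the column table in the User's Guide, and `read_field` and the synthetic demo line are hypothetical, for illustration only.

```python
def read_field(stat_line, field_number):
    """Return one whitespace-delimited field (1-based) of a .stat line as a float."""
    fields = stat_line.split()
    if not 1 <= field_number <= len(fields):
        raise ValueError(f"line has {len(fields)} fields, asked for {field_number}")
    return float(fields[field_number - 1])


# Synthetic line for illustration only; this is NOT real Grid-Stat output.
# Fields 1-39 are filler, field 40 stands in for BSS, field 41 for BSS_SMPL.
demo_line = " ".join(["NA"] * 39 + ["0.25", "0.10"])

bss = read_field(demo_line, 40)        # BSS relative to external climatology
bss_smpl = read_field(demo_line, 41)   # assumed position of BSS_SMPL
print(bss, bss_smpl)
```

In practice the line would first be selected by its line type before indexing into it, since other line types have different column layouts.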

Sounds like you're still seeing a difference of 0.0004 in the computed
value (0.0120 vs 0.0116). I can't provide an explanation or defense for
that difference but am glad to see that it's a very small number!

I wrote up this GitHub issue to improve the probabilistic vx in MET in
this case:
https://github.com/dtcenter/MET/issues/1495

If you're already on GitHub and would like to join the DTCenter
organization, just let me know your GitHub user name. I could add you to
the "METplus Team" and then tag you as the "scientist" on this issue. The
level at which you'd like to collaborate is entirely up to you.

Thanks,
John

On Wed, Sep 16, 2020 at 8:04 AM Michael Erickson - NOAA Affiliate via RT
<met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
>
> Hello,
>
> I have one additional question: when extracting the BSS value from the
> stat file based on my input climatology, would I extract field 40
> (Brier Skill Score relative to external climatology)?
>
> Thanks again,
>
> Mike
>
> On Wed, Sep 16, 2020 at 9:44 AM Michael Erickson - NOAA Affiliate <
> michael.j.erickson at noaa.gov> wrote:
>
> > Hi All,
> >
> > I have redone my configuration file "usethis" to reflect the new
> > threshold values. After that I ran grid_stat in the same manner as in
> > my initial email:
> >
> > /opt/MET/90/bin/grid_stat ERO_s2020083112_e2020090112_vhr21.nc
> > ST4gFFG_s2020083112_e2020090112_vhr21.nc usethis -outdir ~
> >
> > I've put the input/output files here:
> > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/.
> >
> > I still have the issue where my Brier Score value is 0.0116 and the
> > stat file is ~0.0120. My method for computing BS is simpler than that
> > in MET: I just compute the mean squared difference between model and
> > observation probabilities (i.e., no summing through contingency table
> > counts as is done on p. 446 of the MET Users Guide
> > <https://dtcenter.org/sites/default/files/community-code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf>).
> >
> > I am wondering if the Brier Score and Brier Skill Score values in the
> > MET stat file look correct for this case? If so, then I am content to
> > proceed with implementing this new setup at WPC.
> >
> > Thank you for all of your help with this!
> >
> > Mike
> >
> >
> >
> > On Fri, Sep 11, 2020 at 7:44 AM Michael Erickson - NOAA Affiliate <
> > michael.j.erickson at noaa.gov> wrote:
> >
> >> Hi John,
> >>
> >> Thank you for your help here! I do appreciate it. I will gradually work
> >> your recommended changes into my python scripts.
> >>
> >> Regarding your options, these are good suggestions and I can
> >> understand how complicated this is. I would advise against 2) since
> >> this would change the results from previous versions. Option 1) is
> >> appealing to me, but I'm not sure if there are many other users with
> >> discrete thresholds in their gridded data. I could see the utility of
> >> a -left, -middle, -right option which defaults to mid-point binning
> >> when unspecified. It's unfortunate that the user will lose either the
> >> left-most or right-most category with this option, but a user savvy
> >> enough to get to this level of detail can probably modify either
> >> their data or the thresholds to work within the constraints of
> >> left/right binning. Another option is to calculate BS without summing
> >> through the thresholds, but this loses a layer of complexity that I
> >> like.
> >>
> >> I hope this helps and thank you!
> >>
> >> Mike
> >>
> >> On Thu, Sep 10, 2020 at 3:28 PM John Halley Gotway via RT <
> >> met_help at ucar.edu> wrote:
> >>
> >>> Sorry it took me so long to answer. So we know that MET uses the
> >>> centerpoint of the bin as the probability value. And we know that your
> >>> data is already binned, with the only valid probability values being:
> >>> 0.0, 0.05, 0.1, 0.2, 0.5, 1.
> >>>
> >>> So we want to choose bins whose centerpoints correspond to these
> >>> probability values. However, we're a little constrained because MET
> >>> requires the first and last thresholds to be 0 and 1, respectively, and
> >>> that everything in between be monotonically increasing.
> >>>
> >>> The most concise way I can think of uses 7 bins defined by:
> >>> cat_thresh = [ >=0.0, >=0.001, >=0.1, >=0.1001, >=0.3, >=0.7, >=0.999, >=1.0 ];
> >>>
> >>> Bin 1 for prob = 0: 0 to 0.001
> >>> Bin 2 for prob = 0.05: 0.001 to 0.1
> >>> Bin 3 for prob = 0.1: 0.1 to 0.1001
> >>> Bin 4 for prob = 0.2: 0.1001 to 0.3
> >>> Bin 5 for prob = 0.5: 0.3 to 0.7
> >>> Bin 6 as a placeholder: 0.7 to 0.999
> >>> Bin 7 for prob = 1.0: 0.999 to 1.0
> >>>
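
The centerpoints implied by a threshold list like the one above can be checked with a few lines of Python. This is only a sketch: `bin_midpoints` is a hypothetical helper, not part of MET, and it assumes each bin spans two consecutive thresholds and is evaluated at its midpoint, as described in this thread.

```python
def bin_midpoints(thresholds):
    """Midpoint of each bin formed by consecutive, strictly increasing thresholds."""
    pairs = list(zip(thresholds, thresholds[1:]))
    if any(hi <= lo for lo, hi in pairs):
        raise ValueError("thresholds must be strictly increasing")
    return [(lo + hi) / 2.0 for lo, hi in pairs]


# Thresholds taken from the 7-bin cat_thresh suggestion quoted above.
seven_bins = [0.0, 0.001, 0.1, 0.1001, 0.3, 0.7, 0.999, 1.0]
midpoints = bin_midpoints(seven_bins)

for lo, hi, mid in zip(seven_bins, seven_bins[1:], midpoints):
    print(f"{lo:.4f} to {hi:.4f} -> midpoint {mid:.5f}")
```

With these thresholds each midpoint lands within about 0.0005 of the intended probability value, apart from the placeholder bin from 0.7 to 0.999.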
> >>> But perhaps it'd be more clear with:
> >>> cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.0501, >=0.10, >=0.101, >=0.2, >=0.201, >=0.5, >=0.501, >=0.999, >=1.0 ];
> >>>
> >>> But all these mental gymnastics seem way too confusing!
> >>> So what changes can we make to Point-Stat and Grid-Stat to better
> >>> handle this situation in the future?
> >>>
> >>> No very obvious solution occurs to me, but some options include:
> >>>
> >>> (1) Add a config option to switch from using the mid-point of the
> >>> probability bin to using the left or right side.
> >>> But for the first bin, you'd want the left side. And for the last bin,
> >>> you'd want the right side! We could consider 0 to be a special case?
> >>> And this requires the user to be very savvy to understand all these
> >>> details.
> >>>
> >>> (2) Consider changing the logic to ALWAYS include bins for 0 to 0 and
> >>> 1 to 1 since the endpoints are kind of special cases?
> >>> But that'd change existing results which is not good.
> >>>
> >>> (3) Pre-process the input probability values before any smoothing or
> >>> interpolation to point observations occurs.
> >>> Keep track of the unique values to determine if the data is binned.
> >>> But what qualifies as being binned? 5 unique probabilities? 10? 20? 50?
> >>> 100?
> >>> Potentially print a warning message if they've chosen probability bins
> >>> poorly?
> >>> What does poorly mean?
> >>>
> >>> If we can define some very specific solutions, we can make the code do
> >>> whatever we want.
> >>>
> >>> But ideally the changes would not change existing results, be intuitive
> >>> for a user to understand, and be easy to document.
> >>>
> >>> Please let me know.
> >>>
> >>> Thanks,
> >>> John
> >>>
> >>> On Wed, Sep 9, 2020 at 3:50 PM Michael Erickson - NOAA Affiliate via RT
> >>> <met_help at ucar.edu> wrote:
> >>>
> >>> >
> >>> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> >>> >
> >>> > Thanks Everyone for your helpful responses.
> >>> >
> >>> > I have been using grid_stat for WPC's Excessive Rainfall Outlook
> >>> > (consisting of probabilities of 0, 0.05, 0.1, 0.2, and 0.5) for years.
> >>> > These results are dependent upon MET, so I wanted to make sure I am
> >>> > following best practices.
> >>> >
> >>> > What is DTC's guidance on how to proceed with this? Should I change
> >>> > my cat_thresh to "= [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];"
> >>> > or is my current setting fine given that both the "forecast" and
> >>> > "observation" are broken down by the same discrete increments? I can
> >>> > also just calculate Brier Score manually outside of grid_stat.
> >>> >
> >>> > Thanks,
> >>> >
> >>> > Mike
> >>> >
> >>> > On Tue, Sep 8, 2020 at 4:56 PM Barbara Brown via RT <met_help at ucar.edu>
> >>> > wrote:
> >>> >
> >>> > > I agree with Eric and John. The way MET does this generally makes
> >>> > > sense for ensemble forecasts (or other cases when you want MET to
> >>> > > select the thresholds) but not for the case when the probabilities
> >>> > > for specific categories are provided by the user. I'm not sure what
> >>> > > the work-around might be (John may have ideas!) but in the long run
> >>> > > it would be good to allow for this option.
> >>> > >
> >>> > > Barb
> >>> > > ---
> >>> > > Barbara Brown, Senior Research Associate
> >>> > > Research Applications Laboratory
> >>> > > NCAR PO Box 3000
> >>> > > Boulder CO 80307-3000 USA
> >>> > > Ph: +1 303 497 8468  FAX: +1 303 497 8401
> >>> > >
> >>> > >
> >>> > > On Tue, Sep 8, 2020 at 2:14 PM Michael Erickson - NOAA Affiliate <
> >>> > > michael.j.erickson at noaa.gov> wrote:
> >>> > >
> >>> > > > Hi Eric and John,
> >>> > > >
> >>> > > > Thank you for your response to this matter. What would be the best
> >>> > > > practice to take in this situation?
> >>> > > >
> >>> > > > Thanks,
> >>> > > >
> >>> > > > Mike
> >>> > > >
> >>> > > > On Tue, Sep 8, 2020 at 3:41 PM Eric Gilleland via RT <met_help at ucar.edu>
> >>> > > > wrote:
> >>> > > >
> >>> > > >> Hi John,
> >>> > > >>
> >>> > > >> I agree that if the probabilities have already been binned, then it
> >>> > > >> is strange to then take the midpoint (re-binning).
> >>> > > >>
> >>> > > >> Eric
> >>> > > >>
> >>> > > >> On Fri, Sep 4, 2020 at 11:14 AM John Halley Gotway via RT
> >>> > > >> <met_help at ucar.edu> wrote:
> >>> > > >>
> >>> > > >> > Barb and Eric,
> >>> > > >> >
> >>> > > >> > I've added you to this met-help ticket from Mike Erickson from NOAA/WPC.
> >>> > > >> > We're hoping to get some advice from one or both of you about
> >>> > > >> > probabilistic verification.
> >>> > > >> >
> >>> > > >> > Mike is running Grid-Stat to verify WPC's Excessive Rainfall Outlooks
> >>> > > >> > against StageIV precip. The forecast probability values are always 0,
> >>> > > >> > 0.05, 0.1, 0.2, 0.5, or 1.0.
> >>> > > >> > When Mike computes the Brier score by hand, it differs from the results
> >>> > > >> > reported by Grid-Stat out in the 3rd decimal place.
> >>> > > >> >
> >>> > > >> > My theory is that the difference is caused by the fact that MET does
> >>> > > >> > not compute the Brier score directly on the probability values.
> >>> > > >> > Instead, it bins them into an Nx2 probabilistic contingency table and
> >>> > > >> > computes the Brier score from that table. And the mid-point of each
> >>> > > >> > bin is used in the Brier score computations. So different probability
> >>> > > >> > bins will result in a slightly different Brier score.
> >>> > > >> >
> >>> > > >> > Mike is currently using probability thresholds as follows:
> >>> > > >> >    cat_thresh = [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
> >>> > > >> >
> >>> > > >> > And that's consistent with the probability values. But when you think
> >>> > > >> > about it...
> >>> > > >> > - Forecasts of 0% fall into the first bin and are evaluated as being a
> >>> > > >> > value of 0.025 (mid-point of the 0.0 to 0.05 bin)
> >>> > > >> > - Forecasts of 5% fall into the second bin and are evaluated as being a
> >>> > > >> > value of 0.075 (mid-point of the 0.05 to 0.1 bin)
> >>> > > >> > - Forecasts of 10% fall into the third bin and are evaluated as being a
> >>> > > >> > value of 0.150 (mid-point of the 0.1 to 0.2 bin).
> >>> > > >> > - and so on for the other probability values
> >>> > > >> >
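
The effect described above can be illustrated with a small Python sketch that computes the Brier score both directly and after replacing each forecast with its bin midpoint. This is not MET's code: `brier_direct` and `brier_binned` are hypothetical helpers, the four forecast/observation pairs are made up, and the sketch assumes every forecast lies between the first and last threshold, with a value equal to the last threshold falling in the last bin.

```python
from bisect import bisect_right


def brier_direct(forecasts, observations):
    """Mean squared difference between forecast probabilities and 0/1 observations."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, observations)) / len(forecasts)


def brier_binned(forecasts, observations, thresholds):
    """Brier score after replacing each forecast by the midpoint of its bin."""
    n_bins = len(thresholds) - 1
    total = 0.0
    for f, o in zip(forecasts, observations):
        if not thresholds[0] <= f <= thresholds[-1]:
            raise ValueError(f"forecast {f} is outside the threshold range")
        # Bin i covers thresholds[i] <= f < thresholds[i + 1]; a forecast equal
        # to the last threshold is assigned to the last bin.
        i = min(bisect_right(thresholds, f) - 1, n_bins - 1)
        midpoint = (thresholds[i] + thresholds[i + 1]) / 2.0
        total += (midpoint - o) ** 2
    return total / len(forecasts)


# Made-up pairs for illustration only.
thresholds = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]
forecasts = [0.0, 0.0, 0.05, 0.5]
observations = [0, 0, 0, 1]

bs_direct = brier_direct(forecasts, observations)
bs_binned = brier_binned(forecasts, observations, thresholds)
print(bs_direct, bs_binned)
```

The two numbers differ because the forecasts of 0, 0.05, and 0.5 are scored as 0.025, 0.075, and 0.75 in the binned version. The size and sign of the difference depend on the data.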
> >>> > > >> > Seems like the binning of probability values works better for
> >>> > > >> > continuous probability values and not so well for probabilities that
> >>> > > >> > have already been binned!
> >>> > > >> >
> >>> > > >> > I'm wondering if you have any thoughts or advice about this
> >>> > > >> > situation?
> >>> > > >> >
> >>> > > >> > Thanks,
> >>> > > >> > John
> >>> > > >> >
> >>> > > >> > On Fri, Sep 4, 2020 at 10:47 AM Michael Erickson - NOAA Affiliate
> >>> > > >> > via RT <met_help at ucar.edu> wrote:
> >>> > > >> >
> >>> > > >> > >
> >>> > > >> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> >>> > > >> > >
> >>> > > >> > > Hi John,
> >>> > > >> > >
> >>> > > >> > > Thanks for your answers and sounds good! That is strange that
> >>> > > >> > > the climo file was not found for your setting. The only detail I
> >>> > > >> > > can think of is that within the climo field, the file_name
> >>> > > >> > > specification is static:
> >>> > > >> > >
> >>> > > >> > > file_name = [ "/usr1/wpc_cpgffh/gribs/ERO_verif/static/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc" ];
> >>> > > >> > >
> >>> > > >> > >
> >>> > > >> > > I believe you concluded that my climo read-in looked correct?
> >>> > > >> > >
> >>> > > >> > > Thanks,
> >>> > > >> > >
> >>> > > >> > > Mike
> >>> > > >> > >
> >>> > > >> > >
> >>> > > >> > > On Fri, Sep 4, 2020 at 12:42 PM John Halley Gotway via RT
> >>> > > >> > > <met_help at ucar.edu> wrote:
> >>> > > >> > >
> >>> > > >> > > > Mike,
> >>> > > >> > > >
> >>> > > >> > > > 2 more things I forgot to address.
> >>> > > >> > > >
> >>> > > >> > > > First, I pulled that climo file but when I ran grid_stat with
> >>> > > >> > > > your usethis config file, it did not actually read the climo
> >>> > > >> > > > data.
> >>> > > >> > > >
> >>> > > >> > > > DEBUG 3: Found 0 climatology fields.
> >>> > > >> > > >
> >>> > > >> > > >
> >>> > > >> > > > I'm wondering what additional configuration settings you used
> >>> > > >> > > > to make this work?
> >>> > > >> > > >
> >>> > > >> > > >
> >>> > > >> > > > Second, the answer to your question is yes. The exact same
> >>> > > >> > > > binning logic used for the forecast probabilities is applied to
> >>> > > >> > > > the climo data. In fact, the forecast probability bins are
> >>> > > >> > > > applied to both the forecast and climo data. So you do not need
> >>> > > >> > > > to define separate "cat_thresh" settings for the climo. They
> >>> > > >> > > > won't be used anyway.
> >>> > > >> > > >
> >>> > > >> > > >
> >>> > > >> > > > Here's the spot in the library code where the climo
> >>> > > >> > > > probabilistic contingency table is created using the forecast
> >>> > > >> > > > probability bins:
> >>> > > >> > > >
> >>> > > >> > > > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/compute_stats.cc#L767
> >>> > > >> > > >
> >>> > > >> > > >
> >>> > > >> > > > Thanks,
> >>> > > >> > > > John
> >>> > > >> > > >
> >>> > > >> > > > On Fri, Sep 4, 2020 at 10:25 AM John Halley Gotway <johnhg at ucar.edu> wrote:
> >>> > > >> > > >
> >>> > > >> > > > > Mike,
> >>> > > >> > > > >
> >>> > > >> > > > > I don't really have a recommendation on best practices
> >>> > > >> > > > > with regards to the binning of probability values.
> >>> > > >> > > > >
> >>> > > >> > > > > I can say that I more commonly see people choose fixed
> >>> > > >> > > > > bin widths, like "==0.10" (for 10 bins) or "==0.05"
> >>> > > >> > > > > (for 20 bins) instead of variable width bins, such as:
> >>> > > >> > > > > [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ]
> >>> > > >> > > > >
> >>> > > >> > > > > But I suspect that's more out of convenience than
> >>> > > >> > > > > anything else. With regards to your chosen bins, I
> >>> > > >> > > > > suspect you set them up this way since you have lots of
> >>> > > >> > > > > low probability values closer to 0.0 and relatively few
> >>> > > >> > > > > probability values closer to 1.0. While this may be a
> >>> > > >> > > > > good choice for relatively rare events, it wouldn't be
> >>> > > >> > > > > as good of a choice for very common events resulting in
> >>> > > >> > > > > high probability values.
> >>> > > >> > > > >
> >>> > > >> > > > > Choosing 20 bins (==0.05) would include all of your
> >>> > > >> > > > > current bin boundaries and enable you to sample evenly
> >>> > > >> > > > > across the probability space, regardless of whether the
> >>> > > >> > > > > values are bunched near 0 or 1. And mathematically, your
> >>> > > >> > > > > current bins would be derivable from these.
> >>> > > >> > > > >
> >>> > > >> > > > > But if your chosen bins follow some existing WPC
> >>> > > >> > > > > convention, I don't see an obvious reason to change them.
> >>> > > >> > > > >
> >>> > > >> > > > > Please let me know if you'd like me to forward this
> >>> > > >> > > > > question to one of the statisticians in our group for
> >>> > > >> > > > > their advice.
> >>> > > >> > > > >
> >>> > > >> > > > > Thanks,
> >>> > > >> > > > > John
> >>> > > >> > > > >
> >>> > > >> > > > > On Fri, Sep 4, 2020 at 5:08 AM Michael Erickson - NOAA Affiliate via RT <met_help at ucar.edu> wrote:
> >>> > > >> > > > >
> >>> > > >> > > > >>
> >>> > > >> > > > >> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> >>> > > >> > > > >>
> >>> > > >> > > > >> Hi John,
> >>> > > >> > > > >>
> >>> > > >> > > > >> Thank you for your quick and helpful response! To answer
> >>> > > >> > > > >> your questions from the first email:
> >>> > > >> > > > >>
> >>> > > >> > > > >> 1) I have included the climo file in case you wanted to
> >>> > > >> > > > >> see it:
> >>> > > >> > > > >> https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> >>> > > >> > > > >>
> >>> > > >> > > > >> 2) I start from the netcdf output from grid_stat, load
> >>> > > >> > > > >> that data into the python workspace, and compute the
> >>> > > >> > > > >> brier score from that.
> >>> > > >> > > > >>
> >>> > > >> > > > >> Also the circle diameter of 9 in the observation file is
> >>> > > >> > > > >> to draw a 40 km radius around the "observation."
> >>> > > >> > > > >>
> >>> > > >> > > > >> From your latter email, it sounds like I may not be able
> >>> > > >> > > > >> to exactly replicate the Brier Score calculation. In the
> >>> > > >> > > > >> spirit of best practices, would you recommend I change
> >>> > > >> > > > >> cat_thresh to
> >>> > > >> > > > >> "= [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];"
> >>> > > >> > > > >> or keep my cat_thresh as it currently is as long as I am
> >>> > > >> > > > >> consistent? I was also wondering if grid_stat bins the
> >>> > > >> > > > >> probabilities for the climo field as it does for the
> >>> > > >> > > > >> probabilities in the forecast field?
> >>> > > >> > > > >>
> >>> > > >> > > > >> Thanks again!
> >>> > > >> > > > >>
> >>> > > >> > > > >> Mike
> >>> > > >> > > > >>
> >>> > > >> > > > >> On Thu, Sep 3, 2020 at 7:12 PM John Halley Gotway via RT <met_help at ucar.edu> wrote:
> >>> > > >> > > > >>
> >>> > > >> > > > >> > Actually, I have a reasonable guess as to why you may
> >>> > > >> > > > >> > be seeing a difference.
> >>> > > >> > > > >> >
> >>> > > >> > > > >> > All probabilistic verification in MET is based on an
> >>> > > >> > > > >> > Nx2 probabilistic contingency table. Those are the
> >>> > > >> > > > >> > counts in the PCT line type. We do this to make it
> >>> > > >> > > > >> > easier to aggregate statistics across multiple cases,
> >>> > > >> > > > >> > by summing up contingency tables before recomputing
> >>> > > >> > > > >> > statistics. But the pros/cons of this approach would
> >>> > > >> > > > >> > probably be better addressed by a statistician. So the
> >>> > > >> > > > >> > stats are computed using probability bins and not raw
> >>> > > >> > > > >> > probability values.
> >>> > > >> > > > >> >
> >>> > > >> > > > >> > If you went and computed the Brier score by hand, you
> >>> > > >> > > > >> > probably did so using raw probability values and not
> >>> > > >> > > > >> > binning them first.
> >>> > > >> > > > >> >
> >>> > > >> > > > >> > And this difference could explain the type of
> >>> > > >> > > > >> > discrepancy you're seeing.
> >>> > > >> > > > >> >
> >>> > > >> > > > >> > To test this out, I reran your case...
> >>> > > >> > > > >> > (1) Using your original settings to confirm your Brier
> >>> > > >> > > > >> > score of 0.011934.
> >>> > > >> > > > >> > (2) Using 10 equally-spaced probability bins
> >>> > > >> > > > >> > (cat_thresh = [ ==0.1 ];) which produced a Brier score
> >>> > > >> > > > >> > of 0.013747.
> >>> > > >> > > > >> > (3) Using 50 equally-spaced probability bins
> >>> > > >> > > > >> > (cat_thresh = [ ==0.02 ];) which produced a Brier score
> >>> > > >> > > > >> > of 0.01197.
> >>> > > >> > > > >> > (4) Using 100 equally-spaced probability bins
> >>> > > >> > > > >> > (cat_thresh = [ ==0.01 ];) which produced a Brier score
> >>> > > >> > > > >> > of 0.01193.
> >>> > > >> > > > >> >
> >>> > > >> > > > >> > I suppose that doesn't explain the exact discrepancy,
> >>> > > >> > > > >> > but could definitely be involved.
> >>> > > >> > > > >> >
> >>> > > >> > > > >> > Notice on this line of the Brier score computation in
> >>> > > >> > > > >> > MET:
> >>> > > >> > > > >> >
> >>> > > >> > > > >> > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/contable_nx2.cc#L647
> >>> > > >> > > > >> >
> >>> > > >> > > > >> > that the "probability" value returned by "row_proby()"
> >>> > > >> > > > >> > is the mid-point of the bin. So all of your forecast
> >>> > > >> > > > >> > probability values of 0% which fall into the first bin
> >>> > > >> > > > >> > are actually evaluated as having a probability value of
> >>> > > >> > > > >> > 0.025, which is the mid-point between 0 and 0.05 for
> >>> > > >> > > > >> > the first bin.
> >>> > > >> > > > >> >
> >>> > > >> > > > >> > Rerunning using the following to minimize that effect
> >>> > > >> > > > >> > on the 0's:
> >>> > > >> > > > >> > cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
> >>> > > >> > > > >> > produces a Brier score of 0.011489.
> >>> > > >> > > > >> >
> >>> > > >> > > > >> > So I'd say that the binning of the probability values
> >>> > > >> > > > >> > is impacting the Brier score out in the 4th decimal
> >>> > > >> > > > >> > place.
> >>> > > >> > > > >> >
> >>> > > >> > > > >> > John
> >>> > > >> > > > >> >
> >>> > > >> > > > >> > On Thu, Sep 3, 2020 at 4:47 PM John Halley Gotway <johnhg at ucar.edu> wrote:
> >>> > > >> > > > >> >
> >>> > > >> > > > >> > > Hi Mike,
> >>> > > >> > > > >> > >
> >>> > > >> > > > >> > > Looks like you were able to make a lot of progress. I
> >>> > > >> > > > >> > > certainly don't see anything wrong based on the log
> >>> > > >> > > > >> > > messages you sent.
> >>> > > >> > > > >> > >
> >>> > > >> > > > >> > > I do notice that you're smoothing the observations
> >>> > > >> > > > >> > > with the maximum value in a circle of diameter 9...
> >>> > > >> > > > >> > > presumably for a good reason. And I see that smoothing
> >>> > > >> > > > >> > > step indicated in the log messages as well as the
> >>> > > >> > > > >> > > output .stat file.
> >>> > > >> > > > >> > >
> >>> > > >> > > > >> > > Two questions.
> >>> > > >> > > > >> > >
> >>> > > >> > > > >> > > (1) I wanted to try running locally, but didn't find
> >>> > > >> > > > >> > > the "climo" file on the WPC ftp site:
> >>> > > >> > > > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> >>> > > >> > > > >> > >
> >>> > > >> > > > >> > > Could you add that?
> >>> > > >> > > > >> > >
> >>> > > >> > > > >> > > (2) When you say that you tried to replicate the Brier
> >>> > > >> > > > >> > > score computation, what was your starting point? The
> >>> > > >> > > > >> > > raw input files or using the NetCDF matched pairs
> >>> > > >> > > > >> > > output from Grid-Stat which already include the
> >>> > > >> > > > >> > > computation of the observation maximums?
> >>> > > >> > > > >> > >
> >>> > > >> > > > >> > > Thanks,
> >>> > > >> > > > >> > > John Halley Gotway
> >>> > > >> > > > >> > >
> >>> > > >> > > > >> > > On Thu, Sep 3, 2020 at 2:26 PM Michael Erickson - NOAA Affiliate via RT <met_help at ucar.edu> wrote:
> >>> > > >> > > > >> > >
> >>> > > >> > > > >> > >>
> >>> > > >> > > > >> > >> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> >>> > > >> > > > >> > >>
> >>> > > >> > > > >> > >> Thank you Minna!
> >>> > > >> > > > >> > >>
> >>> > > >> > > > >> > >> Mike
> >>> > > >> > > > >> > >>
> >>> > > >> > > > >> > >> On Thu, Sep 3, 2020 at 4:12 PM Minna Win via RT <met_help at ucar.edu> wrote:
> >>> > > >> > > > >> > >>
> >>> > > >> > > > >> > >> > Hi Mike,
> >>> > > >> > > > >> > >> >
> >>> > > >> > > > >> > >> > It looks like you have a few questions associated
> >>> > > >> > > > >> > >> > with calculating Brier Skill Scores.  I'm assigning
> >>> > > >> > > > >> > >> > this ticket to John Halley Gotway.
> >>> > > >> > > > >> > >> >
> >>> > > >> > > > >> > >> > Regards,
> >>> > > >> > > > >> > >> > Minna
> >>> > > >> > > > >> > >> > ---------------
> >>> > > >> > > > >> > >> > Minna Win
> >>> > > >> > > > >> > >> > National Center for Atmospheric Research
> >>> > > >> > > > >> > >> > Developmental Testbed Center
> >>> > > >> > > > >> > >> > Phone: 303-497-8423
> >>> > > >> > > > >> > >> > Fax:   303-497-8401
> >>> > > >> > > > >> > >> >
> >>> > > >> > > > >> > >> >
> >>> > > >> > > > >> > >> >
> >>> > > >> > > > >> > >> >
> >>> > > >> > > > >> > >> >
> >>> > > >> > > > >> > >>
> --
> Michael J. Erickson
>
> Research Scientist
> Cooperative Institute for Research in Environmental Sciences (CIRES)
> NOAA/NWS/Weather Prediction Center
> Phone:  301-683-1546
>
>
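
The binning effect discussed in the quoted thread above can be reproduced
with a small sketch. The Python below is an illustration only, not MET's
code: it assumes each bin is [lo, hi) with the final bin closed at 1.0,
scores every forecast at the midpoint of its bin (as described for
row_proby() in the thread), and runs on made-up data rather than the files
from this ticket. The function names are hypothetical.

```python
from bisect import bisect_right


def bin_midpoints(edges):
    """Midpoint of each bin defined by consecutive threshold edges."""
    return [(lo + hi) / 2.0 for lo, hi in zip(edges[:-1], edges[1:])]


def raw_brier(probs, obs):
    """Mean squared difference between raw probabilities and 0/1 observations."""
    return sum((p - o) ** 2 for p, o in zip(probs, obs)) / len(probs)


def binned_brier(probs, obs, edges):
    """Brier score after replacing each forecast by the midpoint of its bin.

    Assumption: bins are [lo, hi); a forecast equal to the last edge is
    placed in the last bin.
    """
    mids = bin_midpoints(edges)
    total = 0.0
    for p, o in zip(probs, obs):
        i = min(bisect_right(edges, p) - 1, len(mids) - 1)
        total += (mids[i] - o) ** 2
    return total / len(probs)


# Synthetic forecasts and observations (NOT the data from this ticket).
probs = [0.0] * 90 + [0.05] * 5 + [0.2] * 3 + [0.5] * 2
obs = [0] * 90 + [0, 0, 0, 0, 1] + [0, 0, 1] + [1, 0]

# Threshold sets taken from the thread.
edges_orig = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]
edges_fine = [0.0, 0.001, 0.05, 0.1, 0.2, 0.5, 1.0]

print("raw BS:              ", raw_brier(probs, obs))
print("binned BS (original):", binned_brier(probs, obs, edges_orig))
print("binned BS (>=0.001): ", binned_brier(probs, obs, edges_fine))
```

With the extra >=0.001 threshold, forecasts of exactly 0 are scored as
0.0005 rather than 0.025, which moves the binned score closer to the raw
one in this toy case.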

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: Michael Erickson - NOAA Affiliate
Time: Mon Sep 21 05:14:05 2020

Hi John,

Thank you for checking my code and for putting in a fix to the
probabilistic vx! I'd be happy to join the DTCenter on GitHub. My user is
typhonmike.

Thanks,

Mike

On Thu, Sep 17, 2020 at 2:34 PM John Halley Gotway via RT <met_help at ucar.edu> wrote:

> Mike,
>
> Yes, that's correct:
>    https://dtcenter.github.io/MET/Users_Guide/point-stat.html#id15
>
> Field 40 is BSS relative to external climatology whereas BSS_SMPL is
> relative to the sample climatology.
>
> Sounds like you're still seeing a difference of 0.0004 in the computed
> value (0.0120 vs 0.0116). I can't provide an explanation or defense for
> that difference but am glad to see that it's a very small number!
>
> I wrote up this GitHub issue to improve the probabilistic vx in MET in
> this case:
> https://github.com/dtcenter/MET/issues/1495
>
> If you're already on GitHub and would like to join the DTCenter
> organization, just let me know your GitHub user name. I could add you to
> the "METplus Team" and then tag you as the "scientist" on this issue.
> It's just up to you the level at which you'd like to collaborate.
>
> Thanks,
> John
>
> On Wed, Sep 16, 2020 at 8:04 AM Michael Erickson - NOAA Affiliate via RT <met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> >
> > Hello,
> >
> > I have one additional question, and that is when extracting the BSS
> > value from the stat file based on my input climatology, would I extract
> > field 40 (Brier Skill Score relative to external climatology)?
> >
> > Thanks again,
> >
> > Mike
> >
> > On Wed, Sep 16, 2020 at 9:44 AM Michael Erickson - NOAA Affiliate <michael.j.erickson at noaa.gov> wrote:
> >
> > > Hi All,
> > >
> > > I have redone my configuration file "usethis" to reflect the new
> > > threshold values. After that I have run grid_stat in the same manner
> > > as my initial email:
> > >
> > > /opt/MET/90/bin/grid_stat ERO_s2020083112_e2020090112_vhr21.nc
> > > ST4gFFG_s2020083112_e2020090112_vhr21.nc usethis -outdir ~
> > >
> > > I've put the input/output files here:
> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/.
> > >
> > > I still have the issue where my Brier Score value is 0.0116 and the
> > > stat file is ~0.0120. My method for computing BS is simpler than that
> > > in MET, in that I just compute the mean squared difference between
> > > model and observation probabilities (e.g. no summing through
> > > contingency table counts as is done on p 446 of the MET User's Guide
> > > <https://dtcenter.org/sites/default/files/community-code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf>).
> > >
> > > I am wondering if the Brier Score and Brier Skill Score values in the
> > > MET stat file look correct for this case? If so, then I am content to
> > > proceed with implementing this new setup at WPC.
> > >
> > > Thank you for all of your help with this!
> > >
> > > Mike
> > >
> > >
> > >
> > > On Fri, Sep 11, 2020 at 7:44 AM Michael Erickson - NOAA Affiliate <michael.j.erickson at noaa.gov> wrote:
> > >
> > >> Hi John,
> > >>
> > >> Thank you for your help here! I do appreciate it. I will gradually
> > >> work in your recommended changes to my python scripts.
> > >>
> > >> Regarding your options, these are good suggestions and I can
> > >> understand how complicated this is. I would advise against 2) since
> > >> this would change the results from previous versions. Option 1) is
> > >> appealing to me, but I'm not sure if there are many other users with
> > >> discrete thresholds to their gridded data. I could see the utility of
> > >> a -left, -middle, -right option which will default to mid point
> > >> binning when unspecified. It's unfortunate that the user will lose
> > >> either the left or right most category with this option, but if the
> > >> user is this savvy to get to this level of detail, they can probably
> > >> modify either their data or the threshold to meet within the
> > >> constraints of left/right binning. Another option is to calculate BS
> > >> without summing through the thresholds, but this loses a layer of
> > >> complexity that I like.
> > >>
> > >> I hope this helps and thank you!
> > >>
> > >> Mike
> > >>
> > >> On Thu, Sep 10, 2020 at 3:28 PM John Halley Gotway via RT <
> > >> met_help at ucar.edu> wrote:
> > >>
> > >>> Sorry it took me so long to answer. So we know that MET uses
the
> > >>> centerpoint of the bin as the probability value. And we know
that
> your
> > >>> data
> > >>> is already binned with the only valid probability values
being:
> > >>> 0.0, 0.05, 0.1, 0.2, 0.5, 1.
> > >>>
> > >>> So we want to choose bins whose centerpoints correspond to
these
> > >>> probability values. However, we're a little constrained
because MET
> > >>> requires the first and last ones to be 0 and 1, respectively,
and
> that
> > >>> everything in between be monotonically increasing.
> > >>>
> > >>> The most concise way I can think of uses 7 bins defined by:
> > >>> cat_thresh = [ >=0.0, >=0.001, >=0.1, >=0.1001, >=0.3, >=0.7,
> >=0.999,
> > >>> >=1.0 ];
> > >>>
> > >>> Bin 1 for prob = 0: 0 to 0.001
> > >>> Bin 2 for prob = 0.05: 0.001 to 0.1
> > >>> Bin 3 for prob = 0.1: 0.1 to 0.1001
> > >>> Bin 4 for prob = 0.2: 0.1001 to 0.3
> > >>> Bin 5 for prob = 0.5: 0.3 to 0.7
> > >>> Bin 6 as a placeholder: 0.7 to 0.999
> > >>> Bin 7 for prob = 1.0: 0.999 to 1.0
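The bin list above can be checked with a few lines of Python. This sketch only computes the centerpoint of each bin from the threshold list quoted in the message; the helper name is hypothetical, and nothing here calls MET.

```python
def bin_midpoints(thresholds):
    """Return the centerpoint of each bin defined by consecutive thresholds."""
    return [(lo + hi) / 2.0 for lo, hi in zip(thresholds[:-1], thresholds[1:])]


# Thresholds copied from the 7-bin cat_thresh suggestion above.
seven_bins = [0.0, 0.001, 0.1, 0.1001, 0.3, 0.7, 0.999, 1.0]
midpoints = bin_midpoints(seven_bins)

for number, (lo, hi, mid) in enumerate(zip(seven_bins[:-1], seven_bins[1:], midpoints), start=1):
    print(f"Bin {number}: {lo} to {hi}, centerpoint {mid:.5f}")
```

The centerpoints land close to, but not exactly on, the valid probability values (for example, the second bin's centerpoint is 0.0505 rather than 0.05), so a small residual difference from a hand calculation would still be expected.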
> > >>>
> > >>> But perhaps it'd be more clear with:
> > >>> cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.0501, >=0.10,
>=0.101,
> > >=0.2,
> > >>> >=0.201, >=0.5, >=0.501, >=0.999, >=1.0 ];
> > >>>
> > >>> But all these mental gymnastics seem way too confusing!
> > >>> So what changes can we make to Point-Stat and Grid-Stat to
better
> > handle
> > >>> this situation in the future?
> > >>>
> > >>> No very obvious solution occurs to me, but some options
include:
> > >>>
> > >>> (1) Add a config option to switch from using the mid-point of
the
> > >>> probability bin to using the left or right side.
> > >>> But for the first bin, you'd want the left side. And for the
last
> bin,
> > >>> you'd want the right side! We could consider 0 to be a special
case?
> > >>> And this requires the user to be very savvy to understand all
these
> > >>> details.
> > >>>
> > >>> (2) Consider changing the logic to ALWAYS include bins for 0
to 0
> and 1
> > >>> to
> > >>> 1 since the endpoints are kind of special cases?
> > >>> But that'd change existing results which is not good.
> > >>>
> > >>> (3) Pre-process the input probability values before any
smoothing or
> > >>> interpolation to point observations occurs.
> > >>> Keep track of the unique values to determine if the data is
binned.
> > >>> But what qualifies as being binned? 5 unique probabilities?
10? 20?
> 50?
> > >>> 100?
> > >>> Potentially print a warning message if they've chosen
probability
> bins
> > >>> poorly?
> > >>> What does poorly mean?
> > >>>
> > >>> If we can define some very specific solutions, we can make the
code
> do
> > >>> whatever we want.
> > >>>
> > >>> But ideally the changes would not change existing results, be
> intuitive
> > >>> for
> > >>> a user to understand, and be easy to document.
> > >>>
> > >>> Please let me know.
> > >>>
> > >>> Thanks,
> > >>> John
> > >>>
> > >>> On Wed, Sep 9, 2020 at 3:50 PM Michael Erickson - NOAA
Affiliate via
> > RT <
> > >>> met_help at ucar.edu> wrote:
> > >>>
> > >>> >
> > >>> > <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> > >>> >
> > >>> > Thanks Everyone for your helpful responses.
> > >>> >
> > >>> > I have been using grid_stat for WPC's Excessive Rainfall
Outlook
> > >>> > (consisting of probabilities of 0, 0.05, 0.1, 0.2, and 0.5)
for
> > years.
> > >>> > These results are dependent upon MET, so I wanted to make
sure I am
> > >>> > following best practices.
> > >>> >
> > >>> > What is DTC's guidance on how to proceed forward with this?
Should
> I
> > >>> change
> > >>> > my cat_thresh to "= [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2,
>=0.5,
> > >>> >=1.0
> > >>> > ];" or is my current setting fine given that both the
"forecast"
> and
> > >>> > "observation" are broken down by the same discrete
increments? I
> can
> > >>> also
> > >>> > just calculate Brier Score manually outside of grid_stat.
> > >>> >
> > >>> > Thanks,
> > >>> >
> > >>> > Mike
> > >>> >
> > >>> > On Tue, Sep 8, 2020 at 4:56 PM Barbara Brown via RT <
> > met_help at ucar.edu
> > >>> >
> > >>> > wrote:
> > >>> >
> > >>> > > I agree with Eric and John. The way MET does this
generally makes
> > >>> sense
> > >>> > for
> > >>> > > ensemble forecasts (or other cases when you want MET to
select
> the
> > >>> > > thresholds) but not for the case when the probabilities for
> specific
> > >>> > > categories are provided by the user.  I'm not sure what
the
> > >>> work-around
> > >>> > > might be (John may have ideas!) but in the long-run it
would be
> > good
> > >>> to
> > >>> > > allow for this option.
> > >>> > >
> > >>> > > Barb
> > >>> > > ---
> > >>> > > Barbara Brown, Senior Research Associate
> > >>> > > Research Applications Laboratory
> > >>> > > NCAR PO Box 3000
> > >>> > > Boulder CO 80307-3000 USA
> > >>> > > Ph: +1 303 497 8468  FAX: +1 303 497 8401
> > >>> > >
> > >>> > >
> > >>> > > On Tue, Sep 8, 2020 at 2:14 PM Michael Erickson - NOAA
Affiliate
> <
> > >>> > > michael.j.erickson at noaa.gov> wrote:
> > >>> > >
> > >>> > > > Hi Eric and John,
> > >>> > > >
> > >>> > > > Thank you for your response to this matter. What would
be the
> > best
> > >>> > > > practice to take in this situation?
> > >>> > > >
> > >>> > > > Thanks,
> > >>> > > >
> > >>> > > > Mike
> > >>> > > >
> > >>> > > > On Tue, Sep 8, 2020 at 3:41 PM Eric Gilleland via RT <
> > >>> > met_help at ucar.edu>
> > >>> > > > wrote:
> > >>> > > >
> > >>> > > >> Hi John,
> > >>> > > >>
> > >>> > > >> I agree that if the probabilities have already been
binned,
> then
> > >>> it is
> > >>> > > >> strange to then take the midpoint (re-binning).
> > >>> > > >>
> > >>> > > >> Eric
> > >>> > > >>
> > >>> > > >> On Fri, Sep 4, 2020 at 11:14 AM John Halley Gotway via
RT <
> > >>> > > >> met_help at ucar.edu>
> > >>> > > >> wrote:
> > >>> > > >>
> > >>> > > >> > Barb and Eric,
> > >>> > > >> >
> > >>> > > >> > I've added you to this met-help ticket from Mike
Erickson
> from
> > >>> > > NOAA/WPC.
> > >>> > > >> > We're hoping to get some advice from one or both of
you
> about
> > >>> > > >> probabilistic
> > >>> > > >> > verification.
> > >>> > > >> >
> > >>> > > >> > Mike is running Grid-Stat to verify WPC's Excessive
Rainfall
> > >>> > Outlooks
> > >>> > > >> > against StageIV precip. The forecast probability
values are
> > >>> always
> > >>> > 0,
> > >>> > > >> 0.05,
> > >>> > > >> > 0.1, 0.2, 0.5, or 1.0.
> > >>> > > >> > When Mike computes the Brier score by hand, it
differs from
> > the
> > >>> > > results
> > >>> > > >> > reported by Grid-Stat out in the 3rd decimal place.
> > >>> > > >> >
> > >>> > > >> > My theory is that the difference is caused by the
fact that
> > MET
> > >>> does
> > >>> > > not
> > >>> > > >> > compute the Brier score directly on the probability
values.
> > >>> Instead,
> > >>> > > it
> > >>> > > >> > bins them into an Nx2 probabilistic contingency table
and
> > >>> computes
> > >>> > the
> > >>> > > >> > Brier score from that table. And the mid-point of
each bin
> is
> > >>> used
> > >>> > in
> > >>> > > >> the
> > >>> > > >> > Brier score computations. So different probability
bins will
> > >>> result
> > >>> > > in a
> > >>> > > >> > slightly different Brier score.
> > >>> > > >> >
> > >>> > > >> > Mike is currently using probability thresholds as
follows:
> > >>> > > >> >    cat_thresh = [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5,
>=1.0
> ];
> > >>> > > >> >
> > >>> > > >> > And that's consistent with the probability values.
But when
> > you
> > >>> > think
> > >>> > > >> about
> > >>> > > >> > it...
> > >>> > > >> > - Forecasts of 0% fall into the first bin and are
evaluated
> as
> > >>> > being a
> > >>> > > >> > value of 0.025 (mid-point of the 0.0 to 0.05 bin)
> > >>> > > >> > - Forecasts of 5% fall into the second bin and are
evaluated
> > as
> > >>> > being
> > >>> > > a
> > >>> > > >> > value of 0.075 (mid-point of the 0.05 to 0.1 bin)
> > >>> > > >> > - Forecasts of 10% fall into the third bin and are
evaluated
> > as
> > >>> > being
> > >>> > > a
> > >>> > > >> > value of 0.150 (mid-point of the 0.1 to 0.2 bin).
> > >>> > > >> > - and so on for the other probability values
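The effect described in this list can be reproduced with a short sketch that replaces each forecast value by the mid-point of its bin before scoring. It illustrates the idea only and is not MET's implementation: the function names are hypothetical, and the bin-edge handling (each bin includes its lower edge, and the last bin also includes 1.0) is an assumption chosen to be consistent with the ">=" thresholds and the examples above.

```python
from bisect import bisect_right


def to_bin_midpoint(prob, thresholds):
    """Map a probability to the mid-point of the bin it falls into."""
    n_bins = len(thresholds) - 1
    # Assumed edge handling: lower edge inclusive, last bin also holds 1.0.
    index = min(bisect_right(thresholds, prob) - 1, n_bins - 1)
    return (thresholds[index] + thresholds[index + 1]) / 2.0


def binned_brier_score(forecast_probs, observations, thresholds):
    """Brier score computed after snapping forecasts to bin mid-points."""
    errors = [(to_bin_midpoint(f, thresholds) - o) ** 2
              for f, o in zip(forecast_probs, observations)]
    return sum(errors) / len(errors)


thresholds = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]
for prob in [0.0, 0.05, 0.1, 0.2, 0.5]:
    print(prob, "is evaluated as", round(to_bin_midpoint(prob, thresholds), 3))

# A correct forecast of 0% still picks up a small penalty after binning.
print(binned_brier_score([0.0, 0.0], [0, 0], thresholds))
```

With these thresholds a grid full of correct 0% forecasts scores 0.025 squared instead of zero, which is the kind of offset being discussed.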
> > >>> > > >> >
> > >>> > > >> > Seems like the binning of probability values works
better
> for
> > >>> > > continuous
> > >>> > > >> > probability values and not so well for probabilities
that
> have
> > >>> > already
> > >>> > > >> been
> > >>> > > >> > binned!
> > >>> > > >> >
> > >>> > > >> > I'm wondering if you have any thoughts or advice
about this
> > >>> > situation?
> > >>> > > >> >
> > >>> > > >> > Thanks,
> > >>> > > >> > John
> > >>> > > >> >
> > >>> > > >> > On Fri, Sep 4, 2020 at 10:47 AM Michael Erickson -
NOAA
> > >>> Affiliate
> > >>> > via
> > >>> > > >> RT <
> > >>> > > >> > met_help at ucar.edu> wrote:
> > >>> > > >> >
> > >>> > > >> > >
> > >>> > > >> > > <URL:
> > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> > >>> >
> > >>> > > >> > >
> > >>> > > >> > > Hi John,
> > >>> > > >> > >
> > >>> > > >> > > Thanks for your answers and sounds good! That is
strange
> > that
> > >>> the
> > >>> > > >> climo
> > >>> > > >> > > file was not found for your setting. The only
detail I can
> > >>> think
> > >>> > of
> > >>> > > is
> > >>> > > >> > that
> > >>> > > >> > > within the climo field, the file_name specification
is
> > static:
> > >>> > > >> > >
> > >>> > > >> > > file_name = [
> > >>> > > >> > >   "/usr1/wpc_cpgffh/gribs/ERO_verif/static/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc"
> > >>> > > >> > > ];
> > >>> > > >> > >
> > >>> > > >> > >
> > >>> > > >> > > I believe you concluded that my climo read-in
looked
> > correct?
> > >>> > > >> > >
> > >>> > > >> > > Thanks,
> > >>> > > >> > >
> > >>> > > >> > > Mike
> > >>> > > >> > >
> > >>> > > >> > >
> > >>> > > >> > > On Fri, Sep 4, 2020 at 12:42 PM John Halley Gotway
via RT
> <
> > >>> > > >> > > met_help at ucar.edu>
> > >>> > > >> > > wrote:
> > >>> > > >> > >
> > >>> > > >> > > > Mike,
> > >>> > > >> > > >
> > >>> > > >> > > > 2 more things I forgot to address.
> > >>> > > >> > > >
> > >>> > > >> > > > First, I pulled that climo field but when I ran
> grid_stat
> > >>> with
> > >>> > > your
> > >>> > > >> > > usethis
> > >>> > > >> > > > config file, it did not actually read the climo
data.
> > >>> > > >> > > >
> > >>> > > >> > > > DEBUG 3: Found 0 climatology fields.
> > >>> > > >> > > >
> > >>> > > >> > > >
> > >>> > > >> > > > I'm wondering what additional configuration
settings you
> > >>> used to
> > >>> > > >> make
> > >>> > > >> > > this
> > >>> > > >> > > > work?
> > >>> > > >> > > >
> > >>> > > >> > > >
> > >>> > > >> > > > Second, the answer to your question is yes. The
exact
> same
> > >>> > binning
> > >>> > > >> > logic
> > >>> > > >> > > > used for the forecast probabilities is applied to
the
> > climo
> > >>> > data.
> > >>> > > In
> > >>> > > >> > > fact,
> > >>> > > >> > > > the forecast probability bins are applied to both
the
> > >>> forecast
> > >>> > and
> > >>> > > >> > climo
> > >>> > > >> > > > data. So you do not need to define separate
"cat_thresh"
> > >>> > settings
> > >>> > > >> for
> > >>> > > >> > the
> > >>> > > >> > > > climo. They won't be used anyway.
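As a rough illustration of what "the forecast probability bins are applied to both the forecast and climo data" means for a skill score, the sketch below bins both fields with the same thresholds and then forms the textbook Brier Skill Score, 1 - BS_forecast / BS_reference. The function names, the bin-edge handling, and the sample values are all assumptions made for this example; MET's own BSS computation may differ in detail.

```python
from bisect import bisect_right


def to_bin_midpoint(prob, thresholds):
    """Map a probability to the mid-point of its bin (assumed edge handling)."""
    n_bins = len(thresholds) - 1
    index = min(bisect_right(thresholds, prob) - 1, n_bins - 1)
    return (thresholds[index] + thresholds[index + 1]) / 2.0


def brier(probs, observations):
    """Mean squared difference between probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, observations)) / len(observations)


def brier_skill_score(forecast, climo, observations, thresholds):
    """Textbook BSS with forecast and climo binned by the same thresholds."""
    bs_forecast = brier([to_bin_midpoint(p, thresholds) for p in forecast], observations)
    bs_climo = brier([to_bin_midpoint(p, thresholds) for p in climo], observations)
    return 1.0 - bs_forecast / bs_climo


# Hypothetical values, not taken from the ERO or UFVS files.
thresholds = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]
forecast = [0.0, 0.5, 0.05, 0.2]
climo = [0.03, 0.03, 0.03, 0.03]
observations = [0, 1, 0, 0]
print(brier_skill_score(forecast, climo, observations, thresholds))
```

Because only one threshold list is used for both fields, a nearly continuous climatology is coarsened to the same handful of mid-points as the forecast.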
> > >>> > > >> > > >
> > >>> > > >> > > >
> > >>> > > >> > > > Here's the spot in the library code where the
climo
> > >>> > probabilistic
> > >>> > > >> > > > contingency table is created using the forecast
> > probability
> > >>> > bins:
> > >>> > > >> > > >
> > >>> > > >> > > >
> > >>> > > >> > > > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/compute_stats.cc#L767
> > >>> > > >> > > >
> > >>> > > >> > > >
> > >>> > > >> > > > Thanks,
> > >>> > > >> > > > John
> > >>> > > >> > > >
> > >>> > > >> > > > On Fri, Sep 4, 2020 at 10:25 AM John Halley
Gotway <
> > >>> > > johnhg at ucar.edu
> > >>> > > >> >
> > >>> > > >> > > > wrote:
> > >>> > > >> > > >
> > >>> > > >> > > > > Mike,
> > >>> > > >> > > > >
> > >>> > > >> > > > > I don't really have a recommendation on best
practices
> > >>> with
> > >>> > > >> regards
> > >>> > > >> > to
> > >>> > > >> > > > the
> > >>> > > >> > > > > binning of probability values.
> > >>> > > >> > > > >
> > >>> > > >> > > > > I can say that I more commonly see people
choose fixed
> > bin
> > >>> > > widths,
> > >>> > > >> > like
> > >>> > > >> > > > > "==0.10" (for 10 bins) or "==0.05" (for 20
bins)
> instead
> > >>> of
> > >>> > > >> variable
> > >>> > > >> > > > width
> > >>> > > >> > > > > bins, such as:
> > >>> > > >> > > > > [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ]
> > >>> > > >> > > > >
> > >>> > > >> > > > > But I suspect that's more out of convenience
than
> > anything
> > >>> > else.
> > >>> > > >> With
> > >>> > > >> > > > > regards to your chosen bins, I suspect you set
them up
> > >>> this
> > >>> > way
> > >>> > > >> since
> > >>> > > >> > > you
> > >>> > > >> > > > > have lots of low probability values closer to
0.0 and
> > >>> > relatively
> > >>> > > >> few
> > >>> > > >> > > > > probability values closer to 1.0. While this
may be a
> > good
> > >>> > > choice
> > >>> > > >> for
> > >>> > > >> > > > > relatively rare events, it wouldn't be as good
of a
> > >>> choice for
> > >>> > > >> very
> > >>> > > >> > > > common
> > >>> > > >> > > > > events resulting in high probability values.
> > >>> > > >> > > > >
> > >>> > > >> > > > > Choosing 20 bins (==0.05) would include all of
your
> > >>> current
> > >>> > bin
> > >>> > > >> > > > boundaries
> > >>> > > >> > > > > and enable you to sample evenly across the
probability
> > >>> space,
> > >>> > > >> > > regardless
> > >>> > > >> > > > of
> > >>> > > >> > > > > whether the values are bunched near 0 or 1. And
> > >>> > mathematically,
> > >>> > > >> your
> > >>> > > >> > > > > current bins would be derivable from these.
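The claim that 20 equal-width bins contain all of the current bin boundaries is easy to check. In the sketch below, the helper that expands a width into a threshold list is hypothetical and only mimics the "==0.05" shorthand as it is described in this message; it is not MET's parser.

```python
def equal_width_thresholds(width):
    """Expand a bin width (e.g. 0.05) into thresholds running from 0 to 1."""
    n_bins = round(1.0 / width)
    return [i / n_bins for i in range(n_bins + 1)]


def is_subset(boundaries, grid, tol=1e-9):
    """True if every boundary matches some grid value within a tolerance."""
    return all(any(abs(b - g) <= tol for g in grid) for b in boundaries)


current = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]
grid_20 = equal_width_thresholds(0.05)
grid_10 = equal_width_thresholds(0.10)

print(len(grid_20) - 1, "bins contain current boundaries:", is_subset(current, grid_20))
print(len(grid_10) - 1, "bins contain current boundaries:", is_subset(current, grid_10))
```

The 10-bin grid fails the check only because it has no boundary at 0.05, which is why the 20-bin choice is the one that lets the current bins be rebuilt by merging neighbors.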
> > >>> > > >> > > > >
> > >>> > > >> > > > > But if your chosen bins follow some existing
WPC
> > >>> convention, I
> > >>> > > >> don't
> > >>> > > >> > > see
> > >>> > > >> > > > > an obvious reason to change them.
> > >>> > > >> > > > >
> > >>> > > >> > > > > Please let me know if you'd like me to forward
this
> > >>> question
> > >>> > to
> > >>> > > >> one
> > >>> > > >> > of
> > >>> > > >> > > > the
> > >>> > > >> > > > > statisticians in our group for their advice.
> > >>> > > >> > > > >
> > >>> > > >> > > > > Thanks,
> > >>> > > >> > > > > John
> > >>> > > >> > > > >
> > >>> > > >> > > > > On Fri, Sep 4, 2020 at 5:08 AM Michael Erickson
- NOAA
> > >>> > Affiliate
> > >>> > > >> via
> > >>> > > >> > > RT <
> > >>> > > >> > > > > met_help at ucar.edu> wrote:
> > >>> > > >> > > > >
> > >>> > > >> > > > >>
> > >>> > > >> > > > >> <URL:
> > >>> > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> > >>> > > >
> > >>> > > >> > > > >>
> > >>> > > >> > > > >> Hi John,
> > >>> > > >> > > > >>
> > >>> > > >> > > > >> Thank you for your quick and helpful response!
To
> > answer
> > >>> your
> > >>> > > >> > > questions
> > >>> > > >> > > > >> from the first email:
> > >>> > > >> > > > >>
> > >>> > > >> > > > >> 1) I have included the climo file in case you
wanted
> to
> > >>> see
> > >>> > it:
> > >>> > > >> > > > >>
> > >>> > > >> > > > >> https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > >>> > > >> > > > >>
> > >>> > > >> > > > >> 2) I start from the netcdf output from
grid_stat,
> load
> > >>> that
> > >>> > > data
> > >>> > > >> > into
> > >>> > > >> > > > the
> > >>> > > >> > > > >> python workspace, and compute the brier score
from
> > that.
> > >>> > > >> > > > >>
> > >>> > > >> > > > >> Also the circle diameter of 9 in the
observation file
> > is
> > >>> to
> > >>> > > draw
> > >>> > > >> a
> > >>> > > >> > 40
> > >>> > > >> > > km
> > >>> > > >> > > > >> radius around the "observation."
> > >>> > > >> > > > >>
> > >>> > > >> > > > >> From your latter email, it sounds like I may
not be
> > able
> > >>> to
> > >>> > > >> exactly
> > >>> > > >> > > > >> replicate the Brier Score calculation. In the
spirit
> of
> > >>> best
> > >>> > > >> > > practices,
> > >>> > > >> > > > >> would you recommend I change cat_thresh  to "=
[
> >=0.0,
> > >>> > > >=0.001,
> > >>> > > >> > > >=0.05,
> > >>> > > >> > > > >> >=0.1, >=0.2, >=0.5, >=1.0 ];" or keep my
cat_thresh
> as
> > >>> it
> > >>> > > >> currently
> > >>> > > >> > > is
> > >>> > > >> > > > as
> > >>> > > >> > > > >> long as I am consistent? I was also wondering
if
> > >>> grid_stat
> > >>> > bins
> > >>> > > >> the
> > >>> > > >> > > > >> probabilities for the climo field as it does
for the
> > >>> > > >> probabilities
> > >>> > > >> > in
> > >>> > > >> > > > the
> > >>> > > >> > > > >> forecast field?
> > >>> > > >> > > > >>
> > >>> > > >> > > > >> Thanks again!
> > >>> > > >> > > > >>
> > >>> > > >> > > > >> Mike
> > >>> > > >> > > > >>
> > >>> > > >> > > > >> On Thu, Sep 3, 2020 at 7:12 PM John Halley
Gotway via
> > RT
> > >>> <
> > >>> > > >> > > > >> met_help at ucar.edu>
> > >>> > > >> > > > >> wrote:
> > >>> > > >> > > > >>
> > >>> > > >> > > > >> > Actually, I have a reasonable guess as to
why you
> may
> > >>> be
> > >>> > > >> seeing a
> > >>> > > >> > > > >> > difference.
> > >>> > > >> > > > >> >
> > >>> > > >> > > > >> > All probabilistic verification in MET is
based on
> an
> > >>> Nx2
> > >>> > > >> > > > probabilistic
> > >>> > > >> > > > >> > contingency table. Those are the counts in
the PCT
> > line
> > >>> > type.
> > >>> > > >> We
> > >>> > > >> > do
> > >>> > > >> > > > >> this to
> > >>> > > >> > > > >> > make it easier to aggregate statistics
across
> > multiple
> > >>> > cases,
> > >>> > > >> by
> > >>> > > >> > > > >> summing
> > >>> > > >> > > > >> > up contingency tables before recomputing
> statistics.
> > >>> But
> > >>> > the
> > >>> > > >> > > pros/cons
> > >>> > > >> > > > >> of
> > >>> > > >> > > > >> > this approach would probably be better
addressed
> by a
> > >>> > > >> > statistician.
> > >>> > > >> > > So
> > >>> > > >> > > > >> the
> > >>> > > >> > > > >> > stats are computed using probability bins
and not
> raw
> > >>> > > >> probability
> > >>> > > >> > > > >> values.
> > >>> > > >> > > > >> >
> > >>> > > >> > > > >> > If you went and computed the Brier score by
hand,
> you
> > >>> > > probably
> > >>> > > >> did
> > >>> > > >> > > so
> > >>> > > >> > > > >> using
> > >>> > > >> > > > >> > raw probability values and not binning them
first.
> > >>> > > >> > > > >> >
> > >>> > > >> > > > >> > And this difference could explain the type
of
> > >>> discrepancy
> > >>> > > >> you're
> > >>> > > >> > > > seeing.
> > >>> > > >> > > > >> >
> > >>> > > >> > > > >> > To test this out, I reran your case...
> > >>> > > >> > > > >> > (1) Using your original settings to confirm
your
> > Brier
> > >>> > score
> > >>> > > of
> > >>> > > >> > > > >> 0.011934.
> > >>> > > >> > > > >> > (2) Using 10 equally-spaced probability bins
> > >>> (cat_thresh =
> > >>> > [
> > >>> > > >> ==0.1
> > >>> > > >> > > ];)
> > >>> > > >> > > > >> > which produced a Brier score of 0.013747.
> > >>> > > >> > > > >> > (3) Using 50 equally-spaced probability bins
> > >>> (cat_thresh =
> > >>> > [
> > >>> > > >> ==0.02
> > >>> > > >> > > ];)
> > >>> > > >> > > > >> > which produced a Brier score of 0.01197.
> > >>> > > >> > > > >> > (4) Using 100 equally-spaced probability
bins
> > >>> (cat_thresh
> > >>> > = [
> > >>> > > >> > ==0.01
> > >>> > > >> > > > ];)
> > >>> > > >> > > > >> > which produced a Brier score of 0.01193.
> > >>> > > >> > > > >> >
> > >>> > > >> > > > >> > I suppose that doesn't explain the exact
> discrepancy,
> > >>> but
> > >>> > > could
> > >>> > > >> > > > >> definitely
> > >>> > > >> > > > >> > be involved.
> > >>> > > >> > > > >> >
> > >>> > > >> > > > >> > Notice on this line of the brier score
computation
> in
> > >>> MET:
> > >>> > > >> > > > >> >
> > >>> > > >> > > > >> > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/contable_nx2.cc#L647
> > >>> > > >> > > > >> >
> > >>> > > >> > > > >> > That the "probability" value returned by
> > "row_proby()"
> > >>> is
> > >>> > the
> > >>> > > >> > > > mid-point
> > >>> > > >> > > > >> of
> > >>> > > >> > > > >> > the bin.
> > >>> > > >> > > > >> > So all of your forecast probability values
of 0%
> > which
> > >>> fall
> > >>> > > >> into
> > >>> > > >> > the
> > >>> > > >> > > > >> first
> > >>> > > >> > > > >> > bin are actually evaluated as having a
probability
> > >>> value of
> > >>> > > >> 0.025
> > >>> > > >> > > > which
> > >>> > > >> > > > >> is
> > >>> > > >> > > > >> > the mid-point between 0 and 0.05 for the
first bin.
> > >>> > > >> > > > >> >
> > >>> > > >> > > > >> > Rerunning using the following to minimize
that
> effect
> > >>> on
> > >>> > the
> > >>> > > >> 0's:
> > >>> > > >> > > > >> > cat_thresh = [ >=0.0, >=0.001, >=0.05,
>=0.1,
> >=0.2,
> > >>> >=0.5,
> > >>> > > >> >=1.0
> > >>> > > >> > ];
> > >>> > > >> > > > >> > produces a brier score of 0.011489.
> > >>> > > >> > > > >> >
> > >>> > > >> > > > >> > So I'd say that the binning of the
probability
> values
> > >>> is
> > >>> > > >> impacting
> > >>> > > >> > > the
> > >>> > > >> > > > >> > Brier score out in the 4th decimal place.
> > >>> > > >> > > > >> >
> > >>> > > >> > > > >> > John
> > >>> > > >> > > > >> >
> > >>> > > >> > > > >> > On Thu, Sep 3, 2020 at 4:47 PM John Halley
Gotway <
> > >>> > > >> > johnhg at ucar.edu>
> > >>> > > >> > > > >> wrote:
> > >>> > > >> > > > >> >
> > >>> > > >> > > > >> > > Hi Mike,
> > >>> > > >> > > > >> > >
> > >>> > > >> > > > >> > > Looks like you were able to make a lot of
> > progress. I
> > >>> > > >> certainly
> > >>> > > >> > > > don't
> > >>> > > >> > > > >> see
> > >>> > > >> > > > >> > > anything wrong based on the log messages
you
> sent.
> > >>> > > >> > > > >> > >
> > >>> > > >> > > > >> > > I do notice that you're smoothing the
> observations
> > >>> with
> > >>> > the
> > >>> > > >> > > maximum
> > >>> > > >> > > > >> value
> > >>> > > >> > > > >> > > in a circle of diameter 9... presumably
for a
> good
> > >>> > reason.
> > >>> > > >> And I
> > >>> > > >> > > see
> > >>> > > >> > > > >> that
> > >>> > > >> > > > >> > > smoothing step indicated in the log
messages as
> > well
> > >>> as
> > >>> > the
> > >>> > > >> > output
> > >>> > > >> > > > >> .stat
> > >>> > > >> > > > >> > > file.
> > >>> > > >> > > > >> > >
> > >>> > > >> > > > >> > > Two questions.
> > >>> > > >> > > > >> > >
> > >>> > > >> > > > >> > > (1) I wanted to try running locally, but
didn't
> > find
> > >>> the
> > >>> > > >> "climo"
> > >>> > > >> > > > file
> > >>> > > >> > > > >> on
> > >>> > > >> > > > >> > > the WPC ftp site:
> > >>> > > >> > > > >> > >
> > >>> > > >> > > > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > >>> > > >> > > > >> > >
> > >>> > > >> > > > >> > > Could you add that?
> > >>> > > >> > > > >> > >
> > >>> > > >> > > > >> > > (2) When you say that you tried to
replicate the
> > >>> Brier
> > >>> > > score
> > >>> > > >> > > > >> computation,
> > >>> > > >> > > > >> > > what was your starting point? The raw
input files
> > or
> > >>> > using
> > >>> > > >> the
> > >>> > > >> > > > NetCDF
> > >>> > > >> > > > >> > > matched pairs output from Grid-Stat which
already
> > >>> include
> > >>> > > the
> > >>> > > >> > > > >> computation
> > >>> > > >> > > > >> > > of the observation maximums?
> > >>> > > >> > > > >> > >
> > >>> > > >> > > > >> > > Thanks,
> > >>> > > >> > > > >> > > John Halley Gotway
> > >>> > > >> > > > >> > >
> > >>> > > >> > > > >> > > On Thu, Sep 3, 2020 at 2:26 PM Michael
Erickson -
> > >>> NOAA
> > >>> > > >> Affiliate
> > >>> > > >> > > via
> > >>> > > >> > > > >> RT <
> > >>> > > >> > > > >> > > met_help at ucar.edu> wrote:
> > >>> > > >> > > > >> > >
> > >>> > > >> > > > >> > >>
> > >>> > > >> > > > >> > >> <URL:
> > >>> > > >> https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> > >>> > > >> > >
> > >>> > > >> > > > >> > >>
> > >>> > > >> > > > >> > >> Thank you Minna!
> > >>> > > >> > > > >> > >>
> > >>> > > >> > > > >> > >> Mike
> > >>> > > >> > > > >> > >>
> > >>> > > >> > > > >> > >> On Thu, Sep 3, 2020 at 4:12 PM Minna Win
via RT
> <
> > >>> > > >> > > met_help at ucar.edu
> > >>> > > >> > > > >
> > >>> > > >> > > > >> > >> wrote:
> > >>> > > >> > > > >> > >>
> > >>> > > >> > > > >> > >> > Hi Mike,
> > >>> > > >> > > > >> > >> >
> > >>> > > >> > > > >> > >> > It looks like you have a few questions
> > associated
> > >>> with
> > >>> > > >> > > > calculating
> > >>> > > >> > > > >> > Brier
> > >>> > > >> > > > >> > >> > Skill Scores.  I'm assigning this
ticket to
> John
> > >>> > Halley
> > >>> > > >> > Gotway.
> > >>> > > >> > > > >> > >> >
> > >>> > > >> > > > >> > >> > Regards,
> > >>> > > >> > > > >> > >> > Minna
> > >>> > > >> > > > >> > >> > ---------------
> > >>> > > >> > > > >> > >> > Minna Win
> > >>> > > >> > > > >> > >> > National Center for Atmospheric
Research
> > >>> > > >> > > > >> > >> > Developmental Testbed Center
> > >>> > > >> > > > >> > >> > Phone: 303-497-8423
> > >>> > > >> > > > >> > >> > Fax:   303-497-8401
> > >>> > > >> > > > >> > >> >
> > >>> > > >> > > > >> > >> >
> > >>> > > >> > > > >> > >> >
> > >>> > > >> > > > >> > >> > On Thu, Sep 3, 2020 at 1:13 PM Michael
> Erickson
> > -
> > >>> NOAA
> > >>> > > >> > > Affiliate
> > >>> > > >> > > > >> via
> > >>> > > >> > > > >> > RT
> > >>> > > >> > > > >> > >> <
> > >>> > > >> > > > >> > >> > met_help at ucar.edu> wrote:
> > >>> > > >> > > > >> > >> >
> > >>> > > >> > > > >> > >> > >
> > >>> > > >> > > > >> > >> > > Thu Sep 03 13:13:26 2020: Request
96562 was
> > >>> acted
> > >>> > > upon.
> > >>> > > >> > > > >> > >> > > Transaction: Ticket created by
> > >>> > > >> michael.j.erickson at noaa.gov
> > >>> > > >> > > > >> > >> > >        Queue: met_help
> > >>> > > >> > > > >> > >> > >      Subject: Including Climatology
in
> > grid_stat
> > >>> > > Config
> > >>> > > >> > File
> > >>> > > >> > > > >> > >> > >        Owner: Nobody
> > >>> > > >> > > > >> > >> > >   Requestors:
michael.j.erickson at noaa.gov
> > >>> > > >> > > > >> > >> > >       Status: new
> > >>> > > >> > > > >> > >> > >  Ticket <URL:
> > >>> > > >> > > > >> >
> > >>> https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> > >>> > > >> > > > >> > >> >
> > >>> > > >> > > > >> > >> > >
> > >>> > > >> > > > >> > >> > >
> > >>> > > >> > > > >> > >> > > Greetings,
> > >>> > > >> > > > >> > >> > >
> > >>> > > >> > > > >> > >> > > For the first time I am attempting to
> > calculate
> > >>> > Brier
> > >>> > > >> Skill
> > >>> > > >> > > > Score
> > >>> > > >> > > > >> > >> using
> > >>> > > >> > > > >> > >> > > grid_stat from an input climatology
file. I
> > have
> > >>> > > >> created a
> > >>> > > >> > > > >> > >> probabilistic
> > >>> > > >> > > > >> > >> > > flooding climatology file (spans from
zero
> to
> > >>> one;
> > >>> > > >> image is
> > >>> > > >> > > > here:
> > >>> > > >> > > > >> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/test10/ALL_noMRGL_UFVS_ALL.png).
> > >>> > > >> > > > >> > >> > > This climatology is static, so it
doesn't
> > change
> > >>> > with
> > >>> > > >> time
> > >>> > > >> > > when
> > >>> > > >> > > > >> > >> inputting
> > >>> > > >> > > > >> > >> > > the "model" and "observation" data. I
> believe
> > I
> > >>> have
> > >>> > > >> > > > successfully
> > >>> > > >> > > > >> > >> gotten
> > >>> > > >> > > > >> > >> > > this to work using the command:
> > >>> > > >> > > > >> > >> > >
> > >>> > > >> > > > >> > >> > > /opt/MET/90/bin/grid_stat
> > >>> > ERO_s2020083112_e2020090112_
> > >>> > > >> > > vhr09.nc
> > >>> > > >> > > > >> > >> > >
ST4gFFG_s2020083112_e2020090112_vhr09.nc
> > >>> usethis
> > >>> > > >> -outdir ~
> > >>> > > >> > > > >> > >> > >
> > >>> > > >> > > > >> > >> > > where grid_stat
ERO_s2020083112_e2020090112_
> > >>> > vhr09.nc
> > >>> > > >> are
> > >>> > > >> > > > >> discrete
> > >>> > > >> > > > >> > >> > forecast
> > >>> > > >> > > > >> > >> > > probabilities of 0, 0.05, 0.1, 0.2,
and 0.5
> > >>> > > >> > > > >> > >> > > where ST4gFFG_s2020083112_
> > e2020090112_vhr09.nc
> > >>> are
> > >>> > > >> > > observation
> > >>> > > >> > > > >> > values
> > >>> > > >> > > > >> > >> > of 0
> > >>> > > >> > > > >> > >> > > or 1
> > >>> > > >> > > > >> > >> > > and usethis is the configuration file
> > >>> > > >> > > > >> > >> > >
> > >>> > > >> > > > >> > >> > > Finally the climatology file that
consists
> of
> > >>> > "almost"
> > >>> > > >> > > > continuous
> > >>> > > >> > > > >> > >> values
> > >>> > > >> > > > >> > >> > > between 0 and 1 is named:
> > >>> UFVS_ST4gFFG_s2015010100_
> > >>> > > >> > > > >> > >> e2019123123_vhr12.nc
> > >>> > > >> > > > >> > >> > >
> > >>> > > >> > > > >> > >> > > I have put all of these files at
> > >>> > > >> > > > >> > >> > >
https://ftp.wpc.ncep.noaa.gov/erickson/DTC/
> > for
> > >>> > > >> > > > >> > >> > > your reference.
> > >>> > > >> > > > >> > >> > >
> > >>> > > >> > > > >> > >> > > As for my questions:
> > >>> > > >> > > > >> > >> > >
> > >>> > > >> > > > >> > >> > > 1) I was wondering if the climatology
file
> was
> > >>> > > properly
> > >>> > > >> > > > ingested
> > >>> > > >> > > > >> and
> > >>> > > >> > > > >> > >> > > calculated for my example? I believe
it is
> > >>> correct
> > >>> > > given
> > >>> > > >> > the
> > >>> > > >> > > > >> output
> > >>> > > >> > > > >> > >> > below,
> > >>> > > >> > > > >> > >> > > but I wanted to make sure, since this
is my
> > >>> first
> > >>> > time
> > >>> > > >> > doing
> > >>> > > >> > > > >> this:
> > >>> > > >> > > > >> > >> > >
> > >>> > > >> > > > >> > >> > >
> > >>> > > >> > > > >> > >> > > *DEBUG 1: Forecast File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ERO_s2020083112_e2020090112_vhr09.nc
> > >>> > > >> > > > >> > >> > > DEBUG 1: Observation File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ST4gFFG_s2020083112_e2020090112_vhr09.nc
> > >>> > > >> > > > >> > >> > > DEBUG 3: Reading forecast data for EROSurface.
> > >>> > > >> > > > >> > >> > > DEBUG 3: Reading observation data for ST4gFFGSurface.
> > >>> > > >> > > > >> > >> > > DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_NcMet".
> > >>> > > >> > > > >> > >> > > DEBUG 4:
> > >>> > > >> > > > >> > >> > > DEBUG 4: Latitude/Longitude Grid Data:
> > >>> > > >> > > > >> > >> > > DEBUG 4:   lat_ll: 25
> > >>> > > >> > > > >> > >> > > DEBUG 4:   lon_ll: 129.8
> > >>> > > >> > > > >> > >> > > DEBUG 4:   delta_lat: 0.09
> > >>> > > >> > > > >> > >> > > DEBUG 4:   delta_lon: 0.09
> > >>> > > >> > > > >> > >> > > DEBUG 4:   Nlat: 276
> > >>> > > >> > > > >> > >> > > DEBUG 4:   Nlon: 721
> > >>> > > >> > > > >> > >> > > DEBUG 4:
> > >>> > > >> > > > >> > >> > > DEBUG 4: VarInfoFactory::new_var_info() -> created new VarInfo object of type "FileType_NcMet".
> > >>> > > >> > > > >> > >> > > DEBUG 3: For forecast valid at 20200901_120000, found 1 climatology field(s) with valid time(s): 20201231_230000
> > >>> > > >> > > > >> > >> > > DEBUG 3: Found 1 climatology fields.
> > >>> > > >> > > > >> > >> > > DEBUG 3: Found 1 climatology mean and 0 climatology standard deviation field(s) for forecast EROSurface.
> > >>> > > >> > > > >> > >> > > DEBUG 2: Processing masking regions.
> > >>> > > >> > > > >> > >> > > DEBUG 3: Processing grid mask: FULL
> > >>> > > >> > > > >> > >> > > DEBUG 4: parse_grid_mask() -> parsing grid mask "FULL"
> > >>> > > >> > > > >> > >> > > DEBUG 2:
> > >>> > > >> > > > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
> > >>> > > >> > > > >> > >> > > DEBUG 2:
> > >>> > > >> > > > >> > >> > > DEBUG 3: Smoothing field using the MAX(49) CircleTemplate interpolation method.
> > >>> > > >> > > > >> > >> > > DEBUG 2: Processing EROSurface versus ST4gFFGSurface, for smoothing method MAX_CIRCLE(49), over region FULL, using 190638 matched pairs.
> > >>> > > >> > > > >> > >> > > DEBUG 2: Computing Probabilistic Statistics.
> > >>> > > >> > > > >> > >> > > DEBUG 2:
> > >>> > > >> > > > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
> > >>> > > >> > > > >> > >> > > DEBUG 2:
> > >>> > > >> > > > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V.stat
> > >>> > > >> > > > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V_pairs.nc*
> > >>> > > >> > > > >> > >> > >
> > >>> > > >> > > > >> > >> > >
> > >>> > > >> > > > >> > >> > > 2) This question is a bit more basic. I am unable to manually calculate a
> > >>> > > >> > > > >> > >> > > Brier Score value for the forecast and observation that properly matches
> > >>> > > >> > > > >> > >> > > that in the stat file. My manually calculated Brier Score is systematically
> > >>> > > >> > > > >> > >> > > lower. For this event, the stat file BS is 0.0119 and my value is 0.0116.
> > >>> > > >> > > > >> > >> > > I've looked at C3 in the MET Tutorial guide
> > >>> > > >> > > > >> > >> > > <https://dtcenter.org/sites/default/files/community-code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf>,
> > >>> > > >> > > > >> > >> > > but I'm still at a bit of a loss. Is there a simple way I can replicate the
> > >>> > > >> > > > >> > >> > > calculation seen in the stat file?
> > >>> > > >> > > > >> > >> > >
> > >>> > > >> > > > >> > >> > > Thank you again for your help and please let me know if you have any
> > >>> > > >> > > > >> > >> > > questions.
> > >>> > > >> > > > >> > >> > >
> > >>> > > >> > > > >> > >> > > Mike
> > >>> > > >> > > > >> > >> > >
> > >>> > > >> > > > >> > >> > > --
> > >>> > > >> > > > >> > >> > > Michael J. Erickson
> > >>> > > >> > > > >> > >> > >
> > >>> > > >> > > > >> > >> > > Research Scientist
> > >>> > > >> > > > >> > >> > > Cooperative Institute for Research in Environmental Sciences (CIRES)
> > >>> > > >> > > > >> > >> > > NOAA/NWS/Weather Prediction Center
> > >>> > > >> > > > >> > >> > > Phone:  301-683-1546

--
Michael J. Erickson

Research Scientist
Cooperative Institute for Research in Environmental Sciences (CIRES)
NOAA/NWS/Weather Prediction Center
Phone:  301-683-1546

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: John Halley Gotway
Time: Mon Sep 21 11:18:42 2020

Mike,

Aha, that username sounded familiar. We've already added you to the
dtcenter organization previously. So I just assigned you as the scientist
on that issue.
https://github.com/dtcenter/MET/issues/1495

We like to have at least one engineer and scientist on each issue... unless
there's a good reason not to.

That "alert: NEED PROJECT ASSIGNMENT" label means that we need to decide
how to slot this work into a development cycle (i.e. GitHub project from
https://github.com/dtcenter/MET/projects).

I'll go ahead and resolve this issue. But I'd recommend that future details
and specifics related to this work be added as comments to that issue.

Thanks,
John

On Mon, Sep 21, 2020 at 5:14 AM Michael Erickson - NOAA Affiliate via RT <
met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
>
> Hi John,
>
> Thank you for checking my code and for putting in a fix to the
> probabilistic vx! I'd be happy to join the DTCenter on GitHub. My user is
> typhonmike.
>
> Thanks,
>
> Mike
>
> On Thu, Sep 17, 2020 at 2:34 PM John Halley Gotway via RT <
> met_help at ucar.edu>
> wrote:
>
> > Mike,
> >
> > Yes, that's correct:
> >    https://dtcenter.github.io/MET/Users_Guide/point-stat.html#id15
> >
> > Field 40 is BSS relative to external climatology whereas BSS_SMPL is
> > relative to the sample climatology.
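
The distinction between the two skill scores can be sketched in Python as follows. This is only an illustration, assuming the usual definition BSS = 1 - BS / BS_ref computed directly on the probabilities; MET derives its values from the binned contingency table, so its numbers may differ slightly. The function names and sample arrays are hypothetical.

```python
import numpy as np

def brier_score(prob, obs):
    """Mean squared difference between probabilities and 0/1 outcomes."""
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.mean((prob - obs) ** 2))

def brier_skill_score(fcst_prob, obs, ref_prob):
    """BSS = 1 - BS_fcst / BS_ref (undefined when BS_ref is zero)."""
    bs_ref = brier_score(ref_prob, obs)
    if bs_ref == 0.0:
        raise ValueError("reference Brier score is zero; BSS is undefined")
    return 1.0 - brier_score(fcst_prob, obs) / bs_ref

# Hypothetical sample: four grid points.
obs = np.array([0.0, 0.0, 1.0, 0.0])
fcst = np.array([0.05, 0.1, 0.5, 0.0])

# Reference 1: an external climatology field (made-up values here).
external_climo = np.array([0.02, 0.05, 0.10, 0.01])
# Reference 2: the sample climatology, i.e. the observed base rate.
sample_climo = np.full_like(obs, obs.mean())

print("BSS vs external climo:", brier_skill_score(fcst, obs, external_climo))
print("BSS vs sample climo:  ", brier_skill_score(fcst, obs, sample_climo))
```

The only difference between the two calls is the reference forecast, which is why the two columns in the stat file can disagree for the same forecast and observation.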
> >
> > Sounds like you're still seeing a difference of 0.0004 in the computed
> > value (0.0120 vs 0.0116). I can't provide an explanation or defense for
> > that difference but am glad to see that it's a very small number!
> >
> > I wrote up this GitHub issue to improve the probabilistic vx in MET in
> > this case:
> > https://github.com/dtcenter/MET/issues/1495
> >
> > If you're already on GitHub and would like to join the DTCenter
> > organization, just let me know your GitHub user name. I could add you to
> > the "METplus Team" and then tag you as the "scientist" on this issue. It's
> > just up to you the level at which you'd like to collaborate.
> >
> > Thanks,
> > John
> >
> > On Wed, Sep 16, 2020 at 8:04 AM Michael Erickson - NOAA Affiliate via RT <
> > met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> > >
> > > Hello,
> > >
> > > I have one additional question, and that is when extracting the BSS
> > > value from the stat file based on my input climatology, would I extract
> > > field 40 (Brier Skill Score relative to external climatology)?
> > >
> > > Thanks again,
> > >
> > > Mike
> > >
> > > On Wed, Sep 16, 2020 at 9:44 AM Michael Erickson - NOAA Affiliate <
> > > michael.j.erickson at noaa.gov> wrote:
> > >
> > > > Hi All,
> > > >
> > > > I have redone my configuration file "usethis" to reflect the new
> > > > threshold values. After that I have run grid_stat in the same manner
> > > > as my initial email:
> > > >
> > > > /opt/MET/90/bin/grid_stat ERO_s2020083112_e2020090112_vhr21.nc
> > > > ST4gFFG_s2020083112_e2020090112_vhr21.nc usethis -outdir ~
> > > >
> > > > I've put the input/output files here:
> > > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/.
> > > >
> > > > I still have the issue where my Brier Score value is 0.0116 and the
> > > > stat file is ~0.0120. My method for computing BS is simpler than that
> > > > in MET, where I just compute the mean squared difference between model
> > > > and observation probabilities (e.g. no summing through contingency
> > > > table counts as is done on p 446 of the MET tutorial
> > > > <https://dtcenter.org/sites/default/files/community-code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf>).
> > > >
> > > > I am wondering if the Brier Score and Brier Skill Score values in the
> > > > MET stat file look correct for this case? If so, then I am content to
> > > > proceed with implementing this new setup at WPC.
> > > >
> > > > Thank you for all of your help with this!
> > > >
> > > > Mike
> > > >
> > > >
> > > >
> > > > On Fri, Sep 11, 2020 at 7:44 AM Michael Erickson - NOAA Affiliate <
> > > > michael.j.erickson at noaa.gov> wrote:
> > > >
> > > >> Hi John,
> > > >>
> > > >> Thank you for your help here! I do appreciate it. I will gradually
> > > >> work in your recommended changes to my python scripts.
> > > >>
> > > >> Regarding your options, these are good suggestions and I can
> > > >> understand how complicated this is. I would advise against 2) since
> > > >> this would change the results from previous versions. Option 1) is
> > > >> appealing to me, but I'm not sure if there are many other users with
> > > >> discrete thresholds to their gridded data. I could see the utility of
> > > >> a -left, -middle, -right option which will default to mid point
> > > >> binning when unspecified. It's unfortunate that the user will lose
> > > >> either the left or right most category with this option, but if the
> > > >> user is this savvy to get to this level of detail, they can probably
> > > >> modify either their data or the threshold to meet within the
> > > >> constraints of left/right binning. Another option is to calculate BS
> > > >> without summing through the thresholds, but this loses a layer of
> > > >> complexity that I like.
> > > >>
> > > >> I hope this helps and thank you!
> > > >>
> > > >> Mike
> > > >>
> > > >> On Thu, Sep 10, 2020 at 3:28 PM John Halley Gotway via RT <
> > > >> met_help at ucar.edu> wrote:
> > > >>
> > > >>> Sorry it took me so long to answer. So we know that MET uses the
> > > >>> centerpoint of the bin as the probability value. And we know that
> > > >>> your data is already binned with the only valid probability values
> > > >>> being:
> > > >>> 0.0, 0.05, 0.1, 0.2, 0.5, 1.
> > > >>>
> > > >>> So we want to choose bins whose centerpoints correspond to these
> > > >>> probability values. However, we're a little constrained because MET
> > > >>> requires the first and last ones to be 0 and 1, respectively, and
> > > >>> that everything in between be monotonically increasing.
> > > >>>
> > > >>> The most concise way I can think of uses 7 bins defined by:
> > > >>> cat_thresh = [ >=0.0, >=0.001, >=0.1, >=0.1001, >=0.3, >=0.7,
> > > >>> >=0.999, >=1.0 ];
> > > >>>
> > > >>> Bin 1 for prob = 0: 0 to 0.001
> > > >>> Bin 2 for prob = 0.05: 0.001 to 0.1
> > > >>> Bin 3 for prob = 0.1: 0.1 to 0.1001
> > > >>> Bin 4 for prob = 0.2: 0.1001 to 0.3
> > > >>> Bin 5 for prob = 0.5: 0.3 to 0.7
> > > >>> Bin 6 as a placeholder: 0.7 to 0.999
> > > >>> Bin 7 for prob = 1.0: 0.999 to 1.0
> > > >>>
> > > >>> But perhaps it'd be more clear with:
> > > >>> cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.0501, >=0.10, >=0.101,
> > > >>> >=0.2, >=0.201, >=0.5, >=0.501, >=0.999, >=1.0 ];
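
One way to sanity-check a proposed set of thresholds is to compute the mid-point that each of the discrete probability values would be assigned. The Python sketch below does that for the original thresholds and for the two sets proposed above. The helper name is hypothetical, and the bin rule (a value belongs to the bin whose lower edge it meets or exceeds, with 1.0 placed in the last bin) is an assumption about how the ">=" thresholds are applied rather than a copy of MET's logic.

```python
import numpy as np

def assigned_midpoints(values, edges):
    """Return the mid-point of the bin that each value falls into."""
    values = np.asarray(values, dtype=float)
    edges = np.asarray(edges, dtype=float)
    mids = 0.5 * (edges[:-1] + edges[1:])
    # Index of the last edge that is <= value; clip so 1.0 lands in the last bin.
    idx = np.searchsorted(edges, values, side="right") - 1
    idx = np.clip(idx, 0, len(mids) - 1)
    return mids[idx]

ero_probs = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]

threshold_sets = {
    "original":  [0.0, 0.05, 0.1, 0.2, 0.5, 1.0],
    "7 bins":    [0.0, 0.001, 0.1, 0.1001, 0.3, 0.7, 0.999, 1.0],
    "11 bins":   [0.0, 0.001, 0.05, 0.0501, 0.10, 0.101,
                  0.2, 0.201, 0.5, 0.501, 0.999, 1.0],
}

for name, edges in threshold_sets.items():
    mids = assigned_midpoints(ero_probs, edges)
    worst = np.max(np.abs(mids - np.asarray(ero_probs)))
    print(f"{name:9s} mid-points: {np.round(mids, 5)}  max offset: {worst:.5f}")
```

Under these assumptions the original thresholds shift every probability value noticeably, while both proposed sets keep each assigned mid-point within about 0.001 of the intended value.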
> > > >>>
> > > >>> But all these mental gymnastics seem way too confusing!
> > > >>> So what changes can we make to Point-Stat and Grid-Stat to better
> > > >>> handle this situation in the future?
> > > >>>
> > > >>> No very obvious solution occurs to me, but some options include:
> > > >>>
> > > >>> (1) Add a config option to switch from using the mid-point of the
> > > >>> probability bin to using the left or right side.
> > > >>> But for the first bin, you'd want the left side. And for the last
> > > >>> bin, you'd want the right side! We could consider 0 to be a special
> > > >>> case?
> > > >>> And this requires the user to be very savvy to understand all these
> > > >>> details.
> > > >>>
> > > >>> (2) Consider changing the logic to ALWAYS include bins for 0 to 0
> > > >>> and 1 to 1 since the endpoints are kind of special cases?
> > > >>> But that'd change existing results which is not good.
> > > >>>
> > > >>> (3) Pre-process the input probability values before any smoothing or
> > > >>> interpolation to point observations occurs.
> > > >>> Keep track of the unique values to determine if the data is binned.
> > > >>> But what qualifies as being binned? 5 unique probabilities? 10? 20?
> > > >>> 50? 100?
> > > >>> Potentially print a warning message if they've chosen probability
> > > >>> bins poorly?
> > > >>> What does poorly mean?
> > > >>>
> > > >>> If we can define some very specific solutions, we can make the code
> > > >>> do whatever we want.
> > > >>>
> > > >>> But ideally the changes would not change existing results, be
> > > >>> intuitive for a user to understand, and be easy to document.
> > > >>>
> > > >>> Please let me know.
> > > >>>
> > > >>> Thanks,
> > > >>> John
> > > >>>
> > > >>> On Wed, Sep 9, 2020 at 3:50 PM Michael Erickson - NOAA Affiliate via RT <
> > > >>> met_help at ucar.edu> wrote:
> > > >>>
> > > >>> >
> > > >>> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> > > >>> >
> > > >>> > Thanks Everyone for your helpful responses.
> > > >>> >
> > > >>> > I have been using grid_stat for WPC's Excessive Rainfall Outlook
> > > >>> > (consisting of probabilities of 0, 0.05, 0.1, 0.2, and 0.5) for
> > > >>> > years. These results are dependent upon MET, so I wanted to make
> > > >>> > sure I am following best practices.
> > > >>> >
> > > >>> > What is DTC's guidance on how to proceed forward with this? Should
> > > >>> > I change my cat_thresh to "= [ >=0.0, >=0.001, >=0.05, >=0.1,
> > > >>> > >=0.2, >=0.5, >=1.0 ];" or is my current setting fine given that
> > > >>> > both the "forecast" and "observation" are broken down by the same
> > > >>> > discrete increments? I can also just calculate Brier Score manually
> > > >>> > outside of grid_stat.
> > > >>> >
> > > >>> > Thanks,
> > > >>> >
> > > >>> > Mike
> > > >>> >
> > > >>> > On Tue, Sep 8, 2020 at 4:56 PM Barbara Brown via RT <
> > > >>> > met_help at ucar.edu> wrote:
> > > >>> >
> > > >>> > > I agree with Eric and John. The way MET does this generally makes
> > > >>> > > sense for ensemble forecasts (or other cases when you want MET to
> > > >>> > > select the thresholds) but not for the case when the probabilities
> > > >>> > > for specific categories are provided by the user.  I'm not sure
> > > >>> > > what the work-around might be (John may have ideas!) but in the
> > > >>> > > long-run it would be good to allow for this option.
> > > >>> > >
> > > >>> > > Barb
> > > >>> > > ---
> > > >>> > > Barbara Brown, Senior Research Associate
> > > >>> > > Research Applications Laboratory
> > > >>> > > NCAR PO Box 3000
> > > >>> > > Boulder CO 80307-3000 USA
> > > >>> > > Ph: +1 303 497 8468  FAX: +1 303 497 8401
> > > >>> > >
> > > >>> > >
> > > >>> > > On Tue, Sep 8, 2020 at 2:14 PM Michael Erickson - NOAA Affiliate <
> > > >>> > > michael.j.erickson at noaa.gov> wrote:
> > > >>> > >
> > > >>> > > > Hi Eric and John,
> > > >>> > > >
> > > >>> > > > Thank you for your response to this matter. What would be the
> > > >>> > > > best practice to take in this situation?
> > > >>> > > >
> > > >>> > > > Thanks,
> > > >>> > > >
> > > >>> > > > Mike
> > > >>> > > >
> > > >>> > > > On Tue, Sep 8, 2020 at 3:41 PM Eric Gilleland via RT <
> > > >>> > > > met_help at ucar.edu> wrote:
> > > >>> > > >
> > > >>> > > >> Hi John,
> > > >>> > > >>
> > > >>> > > >> I agree that if the probabilities have already been binned, then
> > > >>> > > >> it is strange to then take the midpoint (re-binning).
> > > >>> > > >>
> > > >>> > > >> Eric
> > > >>> > > >>
> > > >>> > > >> On Fri, Sep 4, 2020 at 11:14 AM John Halley Gotway via RT <
> > > >>> > > >> met_help at ucar.edu> wrote:
> > > >>> > > >>
> > > >>> > > >> > Barb and Eric,
> > > >>> > > >> >
> > > >>> > > >> > I've added you to this met-help ticket from Mike Erickson from
> > > >>> > > >> > NOAA/WPC. We're hoping to get some advice from one or both of
> > > >>> > > >> > you about probabilistic verification.
> > > >>> > > >> >
> > > >>> > > >> > Mike is running Grid-Stat to verify WPC's Excessive Rainfall
> > > >>> > > >> > Outlooks against StageIV precip. The forecast probability values
> > > >>> > > >> > are always 0, 0.05, 0.1, 0.2, 0.5, or 1.0.
> > > >>> > > >> > When Mike computes the Brier score by hand, it differs from the
> > > >>> > > >> > results reported by Grid-Stat out in the 3rd decimal place.
> > > >>> > > >> >
> > > >>> > > >> > My theory is that the difference is caused by the fact that MET
> > > >>> > > >> > does not compute the Brier score directly on the probability
> > > >>> > > >> > values. Instead, it bins them into an Nx2 probabilistic
> > > >>> > > >> > contingency table and computes the Brier score from that table.
> > > >>> > > >> > And the mid-point of each bin is used in the Brier score
> > > >>> > > >> > computations. So different probability bins will result in a
> > > >>> > > >> > slightly different Brier score.
> > > >>> > > >> >
> > > >>> > > >> > Mike is currently using probability thresholds as follows:
> > > >>> > > >> >    cat_thresh = [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
> > > >>> > > >> >
> > > >>> > > >> > And that's consistent with the probability values. But when you
> > > >>> > > >> > think about it...
> > > >>> > > >> > - Forecasts of 0% fall into the first bin and are evaluated as
> > > >>> > > >> > being a value of 0.025 (mid-point of the 0.0 to 0.05 bin)
> > > >>> > > >> > - Forecasts of 5% fall into the second bin and are evaluated as
> > > >>> > > >> > being a value of 0.075 (mid-point of the 0.05 to 0.1 bin)
> > > >>> > > >> > - Forecasts of 10% fall into the third bin and are evaluated as
> > > >>> > > >> > being a value of 0.150 (mid-point of the 0.1 to 0.2 bin).
> > > >>> > > >> > - and so on for the other probability values
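
The effect described in this list can be reproduced with a small Python sketch that computes the Brier score two ways: directly on the forecast probabilities, and after replacing each forecast with the mid-point of its bin. The function names and the four-point sample are hypothetical. The sketch also assumes that a Brier score built from the Nx2 table is equivalent to averaging squared differences against the bin mid-points, and that a value belongs to the bin whose lower edge it meets or exceeds (with 1.0 placed in the last bin); neither assumption is taken from MET's source.

```python
import numpy as np

def brier_direct(prob, obs):
    """Brier score on the raw probabilities: mean((p - o)^2)."""
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.mean((prob - obs) ** 2))

def brier_binned(prob, obs, edges):
    """Brier score after mapping each probability to its bin mid-point."""
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs, dtype=float)
    edges = np.asarray(edges, dtype=float)
    mids = 0.5 * (edges[:-1] + edges[1:])
    # Last edge that is <= the value; clip so 1.0 lands in the last bin.
    idx = np.clip(np.searchsorted(edges, prob, side="right") - 1,
                  0, len(mids) - 1)
    return float(np.mean((mids[idx] - obs) ** 2))

# Thresholds from the cat_thresh setting quoted above.
edges = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]

# Hypothetical sample: mostly zero forecasts, one observed event.
prob = [0.0, 0.0, 0.05, 0.1]
obs = [0, 0, 0, 1]

print("direct BS:", brier_direct(prob, obs))
print("binned BS:", brier_binned(prob, obs, edges))
```

The two values differ even though the forecasts sit exactly on the bin edges. The size and sign of the gap depend on how the forecasts and events are distributed, so this toy sample is not expected to match the 0.0116 versus 0.0119 difference reported in the ticket.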
> > > >>> > > >> >
> > > >>> > > >> > Seems like the binning of probability values works better for
> > > >>> > > >> > continuous probability values and not so well for probabilities
> > > >>> > > >> > that have already been binned!
> > > >>> > > >> >
> > > >>> > > >> > I'm wondering if you have any thoughts or advice about this
> > > >>> > > >> > situation?
> > > >>> > > >> >
> > > >>> > > >> > Thanks,
> > > >>> > > >> > John
> > > >>> > > >> >
> > > >>> > > >> > On Fri, Sep 4, 2020 at 10:47 AM Michael Erickson - NOAA Affiliate via RT <
> > > >>> > > >> > met_help at ucar.edu> wrote:
> > > >>> > > >> >
> > > >>> > > >> > >
> > > >>> > > >> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> > > >>> > > >> > >
> > > >>> > > >> > > Hi John,
> > > >>> > > >> > >
> > > >>> > > >> > > Thanks for your answers and sounds good! That is strange that
> > > >>> > > >> > > the climo file was not found for your setting. The only detail
> > > >>> > > >> > > I can think of is that within the climo field, the file_name
> > > >>> > > >> > > specification is static:
> > > >>> > > >> > >
> > > >>> > > >> > > file_name = [
> > > >>> > > >> > > "/usr1/wpc_cpgffh/gribs/ERO_verif/static/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc"
> > > >>> > > >> > > ];
> > > >>> > > >> > >
> > > >>> > > >> > >
> > > >>> > > >> > > I believe you concluded that my climo read-in looked correct?
> > > >>> > > >> > >
> > > >>> > > >> > > Thanks,
> > > >>> > > >> > >
> > > >>> > > >> > > Mike
> > > >>> > > >> > >
> > > >>> > > >> > >
> > > >>> > > >> > > On Fri, Sep 4, 2020 at 12:42 PM John Halley Gotway via RT <
> > > >>> > > >> > > met_help at ucar.edu> wrote:
> > > >>> > > >> > >
> > > >>> > > >> > > > Mike,
> > > >>> > > >> > > >
> > > >>> > > >> > > > 2 more things I forgot to address.
> > > >>> > > >> > > >
> > > >>> > > >> > > > First, I pulled that climo field but when I ran grid_stat with
> > > >>> > > >> > > > your usethis config file, it did not actually read the climo
> > > >>> > > >> > > > data.
> > > >>> > > >> > > >
> > > >>> > > >> > > > DEBUG 3: Found 0 climatology fields.
> > > >>> > > >> > > >
> > > >>> > > >> > > >
> > > >>> > > >> > > > I'm wondering what additional configuration settings you used
> > > >>> > > >> > > > to make this work?
> > > >>> > > >> > > >
> > > >>> > > >> > > >
> > > >>> > > >> > > > Second, the answer to your question is yes. The exact same
> > > >>> > > >> > > > binning logic used for the forecast probabilities is applied to
> > > >>> > > >> > > > the climo data. In fact, the forecast probability bins are
> > > >>> > > >> > > > applied to both the forecast and climo data. So you do not need
> > > >>> > > >> > > > to define separate "cat_thresh" settings for the climo. They
> > > >>> > > >> > > > won't be used anyway.
> > > >>> > > >> > > >
> > > >>> > > >> > > >
> > > >>> > > >> > > > Here's the spot in the library code where the climo
> > > >>> > > >> > > > probabilistic contingency table is created using the forecast
> > > >>> > > >> > > > probability bins:
> > > >>> > > >> > > >
> > > >>> > > >> > > > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/compute_stats.cc#L767
> > > >>> > > >> > > >
> > > >>> > > >> > > >
> > > >>> > > >> > > > Thanks,
> > > >>> > > >> > > > John
> > > >>> > > >> > > >
> > > >>> > > >> > > > On Fri, Sep 4, 2020 at 10:25 AM John Halley Gotway <johnhg at ucar.edu>
> > > >>> > > >> > > > wrote:
> > > >>> > > >> > > >
> > > >>> > > >> > > > > Mike,
> > > >>> > > >> > > > >
> > > >>> > > >> > > > > I don't really have a recommendation on best practices with
> > > >>> > > >> > > > > regards to the binning of probability values.
> > > >>> > > >> > > > >
> > > >>> > > >> > > > > I can say that I more commonly see people choose fixed bin
> > > >>> > > >> > > > > widths, like "==0.10" (for 10 bins) or "==0.05" (for 20 bins)
> > > >>> > > >> > > > > instead of variable width bins, such as:
> > > >>> > > >> > > > > [ >=0.0, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ]
> > > >>> > > >> > > > >
> > > >>> > > >> > > > > But I suspect that's more out of convenience than anything else.
> > > >>> > > >> > > > > With regards to your chosen bins, I suspect you set them up this
> > > >>> > > >> > > > > way since you have lots of low probability values closer to 0.0
> > > >>> > > >> > > > > and relatively few probability values closer to 1.0. While this
> > > >>> > > >> > > > > may be a good choice for relatively rare events, it wouldn't be
> > > >>> > > >> > > > > as good of a choice for very common events resulting in high
> > > >>> > > >> > > > > probability values.
> > > >>> > > >> > > > >
> > > >>> > > >> > > > > Choosing 20 bins (==0.05) would include all of your current bin
> > > >>> > > >> > > > > boundaries and enable you to sample evenly across the
> > > >>> > > >> > > > > probability space, regardless of whether the values are bunched
> > > >>> > > >> > > > > near 0 or 1. And mathematically, your current bins would be
> > > >>> > > >> > > > > derivable from these.
> > > >>> > > >> > > > >
> > > >>> > > >> > > > > But if your chosen bins follow some existing WPC convention, I
> > > >>> > > >> > > > > don't see an obvious reason to change them.
> > > >>> > > >> > > > >
> > > >>> > > >> > > > > Please let me know if you'd like me to forward this question to
> > > >>> > > >> > > > > one of the statisticians in our group for their advice.
> > > >>> > > >> > > > >
> > > >>> > > >> > > > > Thanks,
> > > >>> > > >> > > > > John
> > > >>> > > >> > > > >
> > > >>> > > >> > > > > On Fri, Sep 4, 2020 at 5:08 AM Michael
Erickson -
> NOAA
> > > >>> > Affiliate
> > > >>> > > >> via
> > > >>> > > >> > > RT <
> > > >>> > > >> > > > > met_help at ucar.edu> wrote:
> > > >>> > > >> > > > >
> > > >>> > > >> > > > >>
> > > >>> > > >> > > > >> <URL:
> > > >>> > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> > > >>> > > >
> > > >>> > > >> > > > >>
> > > >>> > > >> > > > >> Hi John,
> > > >>> > > >> > > > >>
> > > >>> > > >> > > > >> Thank you for your quick and helpful response! To
> > > >>> > > >> > > > >> answer your questions from the first email:
> > > >>> > > >> > > > >>
> > > >>> > > >> > > > >> 1) I have included the climo file in case you wanted
> > > >>> > > >> > > > >> to see it:
> > > >>> > > >> > > > >>
> > > >>> > > >> > > > >> https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > > >>> > > >> > > > >>
> > > >>> > > >> > > > >> 2) I start from the NetCDF output from grid_stat,
> > > >>> > > >> > > > >> load that data into the Python workspace, and
> > > >>> > > >> > > > >> compute the Brier score from that.
> > > >>> > > >> > > > >>
> > > >>> > > >> > > > >> Also, the circle diameter of 9 in the observation
> > > >>> > > >> > > > >> file is to draw a 40 km radius around the
> > > >>> > > >> > > > >> "observation."
> > > >>> > > >> > > > >>
> > > >>> > > >> > > > >> From your latter email, it sounds like I may not be
> > > >>> > > >> > > > >> able to exactly replicate the Brier Score
> > > >>> > > >> > > > >> calculation. In the spirit of best practices, would
> > > >>> > > >> > > > >> you recommend I change cat_thresh to
> > > >>> > > >> > > > >> "= [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];"
> > > >>> > > >> > > > >> or keep my cat_thresh as it currently is as long as
> > > >>> > > >> > > > >> I am consistent? I was also wondering if grid_stat
> > > >>> > > >> > > > >> bins the probabilities for the climo field as it
> > > >>> > > >> > > > >> does for the probabilities in the forecast field?
> > > >>> > > >> > > > >>
> > > >>> > > >> > > > >> Thanks again!
> > > >>> > > >> > > > >>
> > > >>> > > >> > > > >> Mike
> > > >>> > > >> > > > >>
> > > >>> > > >> > > > >> On Thu, Sep 3, 2020 at 7:12 PM John Halley
Gotway
> via
> > > RT
> > > >>> <
> > > >>> > > >> > > > >> met_help at ucar.edu>
> > > >>> > > >> > > > >> wrote:
> > > >>> > > >> > > > >>
> > > >>> > > >> > > > >> > Actually, I have a reasonable guess as to why you
> > > >>> > > >> > > > >> > may be seeing a difference.
> > > >>> > > >> > > > >> >
> > > >>> > > >> > > > >> > All probabilistic verification in MET is based on
> > > >>> > > >> > > > >> > an Nx2 probabilistic contingency table. Those are
> > > >>> > > >> > > > >> > the counts in the PCT line type. We do this to
> > > >>> > > >> > > > >> > make it easier to aggregate statistics across
> > > >>> > > >> > > > >> > multiple cases, by summing up contingency tables
> > > >>> > > >> > > > >> > before recomputing statistics. But the pros/cons
> > > >>> > > >> > > > >> > of this approach would probably be better
> > > >>> > > >> > > > >> > addressed by a statistician. So the stats are
> > > >>> > > >> > > > >> > computed using probability bins and not raw
> > > >>> > > >> > > > >> > probability values.
> > > >>> > > >> > > > >> >
> > > >>> > > >> > > > >> > If you went and computed the Brier score by hand,
> > > >>> > > >> > > > >> > you probably did so using raw probability values
> > > >>> > > >> > > > >> > and not binning them first.
> > > >>> > > >> > > > >> >
> > > >>> > > >> > > > >> > And this difference could explain the type of
> > > >>> > > >> > > > >> > discrepancy you're seeing.
> > > >>> > > >> > > > >> >
> > > >>> > > >> > > > >> > To test this out, I reran your case...
> > > >>> > > >> > > > >> > (1) Using your original settings to confirm your
> > > >>> > > >> > > > >> > Brier score of 0.011934.
> > > >>> > > >> > > > >> > (2) Using 10 equally-spaced probability bins
> > > >>> > > >> > > > >> > (cat_thresh = [ ==0.1 ];) which produced a Brier
> > > >>> > > >> > > > >> > score of 0.013747.
> > > >>> > > >> > > > >> > (3) Using 50 equally-spaced probability bins
> > > >>> > > >> > > > >> > (cat_thresh = [ ==0.02 ];) which produced a Brier
> > > >>> > > >> > > > >> > score of 0.01197.
> > > >>> > > >> > > > >> > (4) Using 100 equally-spaced probability bins
> > > >>> > > >> > > > >> > (cat_thresh = [ ==0.01 ];) which produced a Brier
> > > >>> > > >> > > > >> > score of 0.01193.
> > > >>> > > >> > > > >> >
> > > >>> > > >> > > > >> > I suppose that doesn't explain the exact
> > > >>> > > >> > > > >> > discrepancy, but could definitely be involved.
> > > >>> > > >> > > > >> >
> > > >>> > > >> > > > >> > Notice on this line of the Brier score computation
> > > >>> > > >> > > > >> > in MET:
> > > >>> > > >> > > > >> >
> > > >>> > > >> > > > >> > https://github.com/dtcenter/MET/blob/2c9ae440a84024fbf62caa64c1b747f9a912236f/met/src/libcode/vx_statistics/contable_nx2.cc#L647
> > > >>> > > >> > > > >> >
> > > >>> > > >> > > > >> > that the "probability" value returned by
> > > >>> > > >> > > > >> > "row_proby()" is the mid-point of the bin.
> > > >>> > > >> > > > >> > So all of your forecast probability values of 0%
> > > >>> > > >> > > > >> > which fall into the first bin are actually
> > > >>> > > >> > > > >> > evaluated as having a probability value of 0.025,
> > > >>> > > >> > > > >> > which is the mid-point between 0 and 0.05 for the
> > > >>> > > >> > > > >> > first bin.
> > > >>> > > >> > > > >> >
> > > >>> > > >> > > > >> > Rerunning using the following to minimize that
> > > >>> > > >> > > > >> > effect on the 0's:
> > > >>> > > >> > > > >> > cat_thresh = [ >=0.0, >=0.001, >=0.05, >=0.1, >=0.2, >=0.5, >=1.0 ];
> > > >>> > > >> > > > >> > produces a Brier score of 0.011489.
> > > >>> > > >> > > > >> >
> > > >>> > > >> > > > >> > So I'd say that the binning of the probability
> > > >>> > > >> > > > >> > values is impacting the Brier score out in the 4th
> > > >>> > > >> > > > >> > decimal place.
> > > >>> > > >> > > > >> >
> > > >>> > > >> > > > >> > John
> > > >>> > > >> > > > >> >
> > > >>> > > >> > > > >> > On Thu, Sep 3, 2020 at 4:47 PM John Halley
> Gotway <
> > > >>> > > >> > johnhg at ucar.edu>
> > > >>> > > >> > > > >> wrote:
> > > >>> > > >> > > > >> >
> > > >>> > > >> > > > >> > > Hi Mike,
> > > >>> > > >> > > > >> > >
> > > >>> > > >> > > > >> > > Looks like you were able to make a lot of
> > > >>> > > >> > > > >> > > progress. I certainly don't see anything wrong
> > > >>> > > >> > > > >> > > based on the log messages you sent.
> > > >>> > > >> > > > >> > >
> > > >>> > > >> > > > >> > > I do notice that you're smoothing the
> > > >>> > > >> > > > >> > > observations with the maximum value in a circle
> > > >>> > > >> > > > >> > > of diameter 9... presumably for a good reason.
> > > >>> > > >> > > > >> > > And I see that smoothing step indicated in the
> > > >>> > > >> > > > >> > > log messages as well as the output .stat file.
> > > >>> > > >> > > > >> > >
> > > >>> > > >> > > > >> > > Two questions.
> > > >>> > > >> > > > >> > >
> > > >>> > > >> > > > >> > > (1) I wanted to try running locally, but didn't
> > > >>> > > >> > > > >> > > find the "climo" file on the WPC ftp site:
> > > >>> > > >> > > > >> > >
> > > >>> > > >> > > > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > > >>> > > >> > > > >> > >
> > > >>> > > >> > > > >> > > Could you add that?
> > > >>> > > >> > > > >> > >
> > > >>> > > >> > > > >> > > (2) When you say that you tried to replicate the
> > > >>> > > >> > > > >> > > Brier score computation, what was your starting
> > > >>> > > >> > > > >> > > point? The raw input files or using the NetCDF
> > > >>> > > >> > > > >> > > matched pairs output from Grid-Stat which
> > > >>> > > >> > > > >> > > already include the computation of the
> > > >>> > > >> > > > >> > > observation maximums?
> > > >>> > > >> > > > >> > >
> > > >>> > > >> > > > >> > > Thanks,
> > > >>> > > >> > > > >> > > John Halley Gotway
> > > >>> > > >> > > > >> > >
> > > >>> > > >> > > > >> > > On Thu, Sep 3, 2020 at 2:26 PM Michael
> Erickson -
> > > >>> NOAA
> > > >>> > > >> Affiliate
> > > >>> > > >> > > via
> > > >>> > > >> > > > >> RT <
> > > >>> > > >> > > > >> > > met_help at ucar.edu> wrote:
> > > >>> > > >> > > > >> > >
> > > >>> > > >> > > > >> > >>
> > > >>> > > >> > > > >> > >> <URL:
> > > >>> > > >>
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562
> > > >>> > > >> > >
> > > >>> > > >> > > > >> > >>
> > > >>> > > >> > > > >> > >> Thank you Minna!
> > > >>> > > >> > > > >> > >>
> > > >>> > > >> > > > >> > >> Mike
> > > >>> > > >> > > > >> > >>
> > > >>> > > >> > > > >> > >> On Thu, Sep 3, 2020 at 4:12 PM Minna
Win via
> RT
> > <
> > > >>> > > >> > > met_help at ucar.edu
> > > >>> > > >> > > > >
> > > >>> > > >> > > > >> > >> wrote:
> > > >>> > > >> > > > >> > >>
> > > >>> > > >> > > > >> > >> > Hi Mike,
> > > >>> > > >> > > > >> > >> >
> > > >>> > > >> > > > >> > >> > It looks like you have a few questions
> > > >>> > > >> > > > >> > >> > associated with calculating Brier Skill
> > > >>> > > >> > > > >> > >> > Scores.  I'm assigning this ticket to John
> > > >>> > > >> > > > >> > >> > Halley Gotway.
> > > >>> > > >> > > > >> > >> >
> > > >>> > > >> > > > >> > >> > Regards,
> > > >>> > > >> > > > >> > >> > Minna
> > > >>> > > >> > > > >> > >> > ---------------
> > > >>> > > >> > > > >> > >> > Minna Win
> > > >>> > > >> > > > >> > >> > National Center for Atmospheric
Research
> > > >>> > > >> > > > >> > >> > Developmental Testbed Center
> > > >>> > > >> > > > >> > >> > Phone: 303-497-8423
> > > >>> > > >> > > > >> > >> > Fax:   303-497-8401
> > > >>> > > >> > > > >> > >> >
> > > >>> > > >> > > > >> > >> >
> > > >>> > > >> > > > >> > >> >
> > > >>> > > >> > > > >> > >> > On Thu, Sep 3, 2020 at 1:13 PM
Michael
> > Erickson
> > > -
> > > >>> NOAA
> > > >>> > > >> > > Affiliate
> > > >>> > > >> > > > >> via
> > > >>> > > >> > > > >> > RT
> > > >>> > > >> > > > >> > >> <
> > > >>> > > >> > > > >> > >> > met_help at ucar.edu> wrote:
> > > >>> > > >> > > > >> > >> >
> > > >>> > > >> > > > >> > >> > >
> > > >>> > > >> > > > >> > >> > > Thu Sep 03 13:13:26 2020: Request 96562 was acted upon.
> > > >>> > > >> > > > >> > >> > > Transaction: Ticket created by michael.j.erickson at noaa.gov
> > > >>> > > >> > > > >> > >> > >        Queue: met_help
> > > >>> > > >> > > > >> > >> > >      Subject: Including Climatology in grid_stat Config File
> > > >>> > > >> > > > >> > >> > >        Owner: Nobody
> > > >>> > > >> > > > >> > >> > >   Requestors: michael.j.erickson at noaa.gov
> > > >>> > > >> > > > >> > >> > >       Status: new
> > > >>> > > >> > > > >> > >> > >  Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96562 >
> > > >>> > > >> > > > >> > >> > >
> > > >>> > > >> > > > >> > >> > >
> > > >>> > > >> > > > >> > >> > > Greetings,
> > > >>> > > >> > > > >> > >> > >
> > > >>> > > >> > > > >> > >> > > For the first time I am attempting to calculate Brier Skill Score using
> > > >>> > > >> > > > >> > >> > > grid_stat from an input climatology file. I have created a probabilistic
> > > >>> > > >> > > > >> > >> > > flooding climatology file (spans from zero to one; image is here:
> > > >>> > > >> > > > >> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/test10/ALL_noMRGL_UFVS_ALL.png).
> > > >>> > > >> > > > >> > >> > > This climatology is static, so it doesn't change with time when inputting
> > > >>> > > >> > > > >> > >> > > the "model" and "observation" data. I believe I have successfully gotten
> > > >>> > > >> > > > >> > >> > > this to work using the command:
> > > >>> > > >> > > > >> > >> > >
> > > >>> > > >> > > > >> > >> > > /opt/MET/90/bin/grid_stat ERO_s2020083112_e2020090112_vhr09.nc
> > > >>> > > >> > > > >> > >> > > ST4gFFG_s2020083112_e2020090112_vhr09.nc usethis -outdir ~
> > > >>> > > >> > > > >> > >> > >
> > > >>> > > >> > > > >> > >> > > where grid_stat ERO_s2020083112_e2020090112_vhr09.nc are discrete forecast
> > > >>> > > >> > > > >> > >> > > probabilities of 0, 0.05, 0.1, 0.2, and 0.5
> > > >>> > > >> > > > >> > >> > > where ST4gFFG_s2020083112_e2020090112_vhr09.nc are observation values of 0
> > > >>> > > >> > > > >> > >> > > or 1
> > > >>> > > >> > > > >> > >> > > and usethis is the configuration file
> > > >>> > > >> > > > >> > >> > >
> > > >>> > > >> > > > >> > >> > > Finally the climatology file that consists of "almost" continuous values
> > > >>> > > >> > > > >> > >> > > between 0 and 1 is named: UFVS_ST4gFFG_s2015010100_e2019123123_vhr12.nc
> > > >>> > > >> > > > >> > >> > >
> > > >>> > > >> > > > >> > >> > > I have put all of these files at
> > > >>> > > >> > > > >> > >> > > https://ftp.wpc.ncep.noaa.gov/erickson/DTC/ for
> > > >>> > > >> > > > >> > >> > > your reference.
> > > >>> > > >> > > > >> > >> > >
> > > >>> > > >> > > > >> > >> > > As for my questions:
> > > >>> > > >> > > > >> > >> > >
> > > >>> > > >> > > > >> > >> > > 1) I was wondering if the climatology file was properly ingested and
> > > >>> > > >> > > > >> > >> > > calculated for my example? I believe it is correct given the output below,
> > > >>> > > >> > > > >> > >> > > but I wanted to make sure, since this is my first time doing this:
> > > >>> > > >> > > > >> > >> > >
> > > >>> > > >> > > > >> > >> > > *DEBUG 1: Forecast File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ERO_s2020083112_e2020090112_vhr09.nc
> > > >>> > > >> > > > >> > >> > > DEBUG 1: Observation File: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL/ST4gFFG_s2020083112_e2020090112_vhr09.nc
> > > >>> > > >> > > > >> > >> > > DEBUG 3: Reading forecast data for EROSurface.
> > > >>> > > >> > > > >> > >> > > DEBUG 3: Reading observation data for ST4gFFGSurface.
> > > >>> > > >> > > > >> > >> > > DEBUG 4: Met2dDataFileFactory::new_met_2d_data_file() -> created new Met2dDataFile object of type "FileType_NcMet".
> > > >>> > > >> > > > >> > >> > > DEBUG 4:
> > > >>> > > >> > > > >> > >> > > DEBUG 4: Latitude/Longitude Grid Data:
> > > >>> > > >> > > > >> > >> > > DEBUG 4:      lat_ll: 25
> > > >>> > > >> > > > >> > >> > > DEBUG 4:      lon_ll: 129.8
> > > >>> > > >> > > > >> > >> > > DEBUG 4:   delta_lat: 0.09
> > > >>> > > >> > > > >> > >> > > DEBUG 4:   delta_lon: 0.09
> > > >>> > > >> > > > >> > >> > > DEBUG 4:        Nlat: 276
> > > >>> > > >> > > > >> > >> > > DEBUG 4:        Nlon: 721
> > > >>> > > >> > > > >> > >> > > DEBUG 4:
> > > >>> > > >> > > > >> > >> > > DEBUG 4: VarInfoFactory::new_var_info() -> created new VarInfo object of type "FileType_NcMet".
> > > >>> > > >> > > > >> > >> > > DEBUG 3: For forecast valid at 20200901_120000, found 1 climatology field(s) with valid time(s): 20201231_230000
> > > >>> > > >> > > > >> > >> > > DEBUG 3: Found 1 climatology fields.
> > > >>> > > >> > > > >> > >> > > DEBUG 3: Found 1 climatology mean and 0 climatology standard deviation field(s) for forecast EROSurface.
> > > >>> > > >> > > > >> > >> > > DEBUG 2: Processing masking regions.
> > > >>> > > >> > > > >> > >> > > DEBUG 3: Processing grid mask: FULL
> > > >>> > > >> > > > >> > >> > > DEBUG 4: parse_grid_mask() -> parsing grid mask "FULL"
> > > >>> > > >> > > > >> > >> > > DEBUG 2:
> > > >>> > > >> > > > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
> > > >>> > > >> > > > >> > >> > > DEBUG 2:
> > > >>> > > >> > > > >> > >> > > DEBUG 3: Smoothing field using the MAX(49) CircleTemplate interpolation method.
> > > >>> > > >> > > > >> > >> > > DEBUG 2: Processing EROSurface versus ST4gFFGSurface, for smoothing method MAX_CIRCLE(49), over region FULL, using 190638 matched pairs.
> > > >>> > > >> > > > >> > >> > > DEBUG 2: Computing Probabilistic Statistics.
> > > >>> > > >> > > > >> > >> > > DEBUG 2:
> > > >>> > > >> > > > >> > >> > > DEBUG 2: --------------------------------------------------------------------------------
> > > >>> > > >> > > > >> > >> > > DEBUG 2:
> > > >>> > > >> > > > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V.stat
> > > >>> > > >> > > > >> > >> > > DEBUG 1: Output file: /export/hpc-lw-dtbdev5/merickson/ERO_verif/ERO_verif_day2_ALL//grid_stat_ST4gFFG_ERO_s2020090112_e2020090212_vhr09_240000L_20200901_120000V_pairs.nc*
> > > >>> > > >> > > > >> > >> > >
> > > >>> > > >> > > > >> > >> > >
> > > >>> > > >> > > > >> > >> > > 2) This question is a bit more basic. I am unable to manually calculate a
> > > >>> > > >> > > > >> > >> > > Brier Score value for the forecast and observation that properly matches
> > > >>> > > >> > > > >> > >> > > that in the stat file. My manually calculated Brier Score is systematically
> > > >>> > > >> > > > >> > >> > > lower. For this event, the stat file BS is 0.0119 and my value is 0.0116.
> > > >>> > > >> > > > >> > >> > > I've looked at C3 in the MET Tutorial guide
> > > >>> > > >> > > > >> > >> > > <https://dtcenter.org/sites/default/files/community-code/met/docs/user-guide/MET_Users_Guide_v9.0.pdf>,
> > > >>> > > >> > > > >> > >> > > but I'm still at a bit of a loss. Is there a simple way I can replicate the
> > > >>> > > >> > > > >> > >> > > calculation seen in the stat file?
> > > >>> > > >> > > > >> > >> > >
> > > >>> > > >> > > > >> > >> > > Thank you again for your help and please let me know if you have any
> > > >>> > > >> > > > >> > >> > > questions.
> > > >>> > > >> > > > >> > >> > >
> > > >>> > > >> > > > >> > >> > > Mike
> > > >>> > > >> > > > >> > >> > >
> > > >>> > > >> > > > >> > >> > > --
> > > >>> > > >> > > > >> > >> > > Michael J. Erickson
> > > >>> > > >> > > > >> > >> > >
> > > >>> > > >> > > > >> > >> > > Research Scientist
> > > >>> > > >> > > > >> > >> > > Cooperative Institute for Research
in
> > > >>> Environmental
> > > >>> > > >> > Sciences
> > > >>> > > >> > > > >> (CIRES)
> > > >>> > > >> > > > >> > >> > > NOAA/NWS/Weather Prediction Center
> > > >>> > > >> > > > >> > >> > > Phone:  301-683-1546
> > > >>> > > >> > > > >> > >> > >

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: John Halley Gotway
Time: Mon Sep 28 08:37:08 2020

Mike,

Sorry for the long delay in responding to this. It is always good to look
carefully at the log messages!

When doing probabilistic verification with a defined climatology as
reference, Point-Stat and Grid-Stat apply the same forecast probability
bins to the climatology values. In fact, it is not possible to define them
separately.

Here's an excerpt starting on line 756 of this file:
https://github.com/dtcenter/MET/blob/3804255cf682ae6582f6505e29f82f29c7c29589/met/src/libcode/vx_statistics/compute_stats.cc#L756

//
// Set up the forecast Nx2ContingencyTable
//
pct_info.pct.clear();
pct_info.pct.set_size(n_thresh-1);
*pct_info.pct.set_thresholds(p_thresh.vals());*
//
// Set up the climatology Nx2ContingencyTable
//
pct_info.climo_pct.clear();
pct_info.climo_pct.set_size(n_thresh-1);
*pct_info.climo_pct.set_thresholds(p_thresh.vals());*

The two lines shown in bold apply the same set of thresholds for verifying
the forecast and climo probabilities.
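
To illustrate what sharing one set of thresholds means in practice, here is
a minimal Python sketch (not MET code). It assumes, as noted earlier in this
thread, that each probability is replaced by the mid-point of its bin before
the Brier score is computed, and it uses the conventional definition
BSS = 1 - BS_fcst / BS_climo, which may not match MET's exact code path. The
function names binned_brier and binned_bss and the sample values are
hypothetical.

```python
import numpy as np


def binned_brier(prob, obs, thresholds):
    """Brier score after replacing each probability by its bin mid-point."""
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs, dtype=float)
    edges = np.asarray(thresholds, dtype=float)
    mids = 0.5 * (edges[:-1] + edges[1:])
    # Bin index per value; a value equal to the last edge goes in the last bin.
    idx = np.clip(np.searchsorted(edges, prob, side="right") - 1,
                  0, len(mids) - 1)
    return float(np.mean((mids[idx] - obs) ** 2))


def binned_bss(fcst, climo, obs, thresholds):
    """Skill score with forecast and climatology binned by the SAME edges."""
    bs_fcst = binned_brier(fcst, obs, thresholds)
    bs_climo = binned_brier(climo, obs, thresholds)
    return 1.0 - bs_fcst / bs_climo


# Made-up sample data (assumption, not taken from the files in this ticket).
thresholds = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]
fcst = [0.0, 0.05, 0.2, 0.5]
climo = [0.01, 0.02, 0.07, 0.12]
obs = [0, 0, 0, 1]

print("BS  forecast:", binned_brier(fcst, obs, thresholds))
print("BS  climo   :", binned_brier(climo, obs, thresholds))
print("BSS         :", binned_bss(fcst, climo, obs, thresholds))
```

Because the "almost" continuous climatology values pass through the same
coarse bins as the discrete forecast values, the choice of cat_thresh
affects both terms of the skill score in this sketch.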

OK, so what does this cryptic log message mean?

*DEBUG 4: parse_conf_climo_cdf() -> For "cdf_bins" (1) and
"center_bins"(false), defined climatology CDF thresholds: >=0.00000,>=1.00000*

This is related to the binned climatology logic used by NOAA/EMC. It
basically means that we're just verifying the climo values in one big
group. For global verification, EMC applies climatological bins. Let's say
you have 10,000 matched pairs and have defined 3 climo bins: 0 to 0.33,
0.33 to 0.66, and 0.66 to 1. Those 10,000 pairs are subdivided into 3
groups by seeing where the current obs value falls within the climo
distribution (bin 1, 2, or 3). Then statistics are computed for each of
those 3 subsets, and the final output statistics are reported as the mean
of the statistics across those bins.
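
The binning described above can be sketched in Python as follows. This is
only an illustration under stated assumptions, not MET's implementation: the
position of each observation within the climo distribution (a value between
0 and 1) is assumed to be precomputed, mean squared error stands in for
whatever statistic is being reported, empty bins are simply skipped, and the
name mean_stat_over_climo_bins is hypothetical.

```python
import numpy as np


def mean_stat_over_climo_bins(fcst, obs, obs_climo_cdf, n_bins=3):
    """Average a per-bin statistic (here MSE) over equal-width climo CDF bins."""
    fcst = np.asarray(fcst, dtype=float)
    obs = np.asarray(obs, dtype=float)
    cdf = np.asarray(obs_climo_cdf, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Assign each matched pair to a bin; a CDF value of 1.0 goes in the last bin.
    idx = np.clip(np.digitize(cdf, edges) - 1, 0, n_bins - 1)
    per_bin = []
    for b in range(n_bins):
        sel = idx == b
        if sel.any():  # assumption: bins with no pairs are skipped
            per_bin.append(np.mean((fcst[sel] - obs[sel]) ** 2))
    return float(np.mean(per_bin))


# Made-up matched pairs (assumption, for illustration only).
fcst = [1.0, 2.0, 3.0, 4.0]
obs = [1.0, 1.0, 5.0, 4.0]
obs_cdf = [0.1, 0.2, 0.5, 0.9]

print("3 climo bins:", mean_stat_over_climo_bins(fcst, obs, obs_cdf, n_bins=3))
print("1 climo bin :", mean_stat_over_climo_bins(fcst, obs, obs_cdf, n_bins=1))
```

With n_bins=1 every pair lands in the same group, so the result is just the
statistic over all pairs, which corresponds to verifying "in one big group".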

The log message you see basically means that we are NOT applying binned
climatology logic, which is the default.

I propose that I check for the default setting of "climo_cdf.cdf_bins = 1"
and print a clearer log message to avoid future confusion, such as:

*DEBUG 4: parse_conf_climo_cdf() -> Since "cdf_bins" = 1, no climatology
bins will be applied.*

Do you agree?

Thanks,
John

On Wed, Sep 23, 2020 at 10:20 AM George McCabe via RT <met_help at ucar.edu>
wrote:

>
> Wed Sep 23 10:19:52 2020: Request 96813 was acted upon.
> Transaction: Given to johnhg (John Halley Gotway) by mccabe
>        Queue: met_help
>      Subject: Including Climatology in grid_stat Config File - Part
2
>        Owner: johnhg
>   Requestors: michael.j.erickson at noaa.gov
>       Status: new
>  Ticket <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=96813 >
>
>
> This transaction appears to have no content
>

------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: John Halley Gotway
Time: Mon Sep 28 09:39:38 2020

Mike,

FYI, here's the GitHub issue which describes the update to the log
message for this:
https://github.com/dtcenter/MET/issues/1502

And here's the Pull Request to merge the code changes into the develop
branch:
https://github.com/dtcenter/MET/pull/1503

I assigned Julie Prestopnik, another developer in the DTC, to review the
PR to double-check my work.

If you have any advice about the updated log message, please add it as a
comment on the issue.

Thanks,
John


------------------------------------------------
Subject: Including Climatology in grid_stat Config File
From: Michael Erickson - NOAA Affiliate
Time: Mon Sep 28 11:09:31 2020

Hi John,

Everything you said sounds good, and thank you for clarifying my question.

Mike


--
Michael J. Erickson

Research Scientist
Cooperative Institute for Research in Environmental Sciences (CIRES)
NOAA/NWS/Weather Prediction Center
Phone:  301-683-1546

------------------------------------------------