[Met_help] [rt.rap.ucar.edu #62194] History for [Fwd: Possible 4.1 small error in contingency statistics]

Thu Jul 11 15:08:30 MDT 2013

----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Hi Wally,

I'm on vacation until next Monday and this is more of a met_help question
anyway so I'm forwarding it to them.

Sorry I'm not much help right now.
Cheers, Tara

---------------------------- Original Message ----------------------------
Subject: Possible 4.1 small error in contingency statistics
From:    "Wallace Clark - NOAA Affiliate" <wallace.l.clark at noaa.gov>
Date:    Mon, July 8, 2013 5:04 pm
To:      "Tara Jensen" <jensen at ucar.edu>
--------------------------------------------------------------------------

Hi Tara,

There may still be an error in the cts output. I have attached the cts file
for a comparison of GFS analysis with itself. Note the values of FY_OY
through FMEAN in the third line do not agree with the top two lines. I have
turned off merging and smoothing, though there is merging during the
matching if there is overlap. I have also attached the ps and obj.txt files
for context.

Cheers I think,
Wally

-- 
----------------------------------------------------------
Wallace L. Clark, Senior Data Analyst 3
Science and Technology Corporation (STC)
NOAA Earth System Research Laboratory
325 Broadway R/PSD2
Boulder, CO 80305
303.497.3101 (ph)
Wallace.L.Clark at noaa.gov
----------------------------------------------------------

----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Re: [rt.rap.ucar.edu #62194] [Fwd: Possible 4.1 small error in contingency statistics]
From: John Halley Gotway
Time: Thu Jul 11 10:12:52 2013

Wally (and Tara),

I don't think that there is a problem here in the output of MODE.
It's just that the CTS output from MODE bears some explanation.

The CTS output contains 3 lines, labelled in the "FIELD" column as
RAW, FILTER, and OBJECT.  The contingency table information in the RAW
line is computed over the raw input fields themselves.
They're thresholded using the convolution threshold from the config
file (>=250 in your example).

The contingency table information in the OBJECT line is computed
instead using the resolved objects.  Anywhere an object is defined is
considered to be an event, and anywhere there's no object is
considered to be a non-event.  Then we add up the counts and compute
the statistics.
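
As a rough illustration of the counting described above (a sketch only, not
MODE's actual code; the function name `contingency_counts` and the toy masks
are made up for this example), the four cells can be tallied from two
boolean event masks like this:

```python
def contingency_counts(fcst_event, obs_event):
    """Tally the 2x2 contingency table from two equal-length event masks."""
    counts = {"FY_OY": 0, "FY_ON": 0, "FN_OY": 0, "FN_ON": 0}
    for f, o in zip(fcst_event, obs_event):
        # FY/FN = forecast yes/no, OY/ON = observation yes/no
        key = ("FY" if f else "FN") + "_" + ("OY" if o else "ON")
        counts[key] += 1
    return counts


# Toy 1-D "grid": True where an event (or an object) is present.
fcst = [True, True, False, False, True]
obs = [True, False, False, True, True]
table = contingency_counts(fcst, obs)
print(table)
```

For the RAW line the masks would come from thresholding the fields; for the
OBJECT line they would come from "is this grid point inside an object".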

The FILTER line will be identical to the RAW line, unless you've used
the "raw_thresh" option, in which case it might differ slightly.

So why are the RAW and OBJECT lines different?  The object definition
process includes a convolution (or smoothing) step.  In your example,
it looks like you may have the convolution radius set to 0,
so that shouldn't be in play here.  But you've set the raw
threshold >= 10 and the area threshold >= 12 grid squares.  Perhaps
there was a very small object (with area = 7) that was thrown out
because of the area threshold.  That could explain the fact that the
RAW and OBJECT counts differ by 7.  You could try rerunning this case
with no object area threshold to see how it changes things.
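
To make the area-threshold effect concrete, here is a minimal sketch. It
assumes no smoothing, 4-connected objects, and a made-up toy field; the
helper names `find_objects` and `object_mask` are hypothetical, and MODE's
real object definition may differ in detail:

```python
def find_objects(mask):
    """Group True cells into 4-connected objects; return a list of cell lists."""
    nrows, ncols = len(mask), len(mask[0])
    seen = [[False] * ncols for _ in range(nrows)]
    objects = []
    for r in range(nrows):
        for c in range(ncols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood fill outward from this unvisited event cell.
            stack, cells = [(r, c)], []
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                cells.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < nrows and 0 <= nx < ncols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            objects.append(cells)
    return objects


def object_mask(field, thresh, min_area):
    """Threshold the field, then keep only objects with >= min_area cells."""
    raw = [[v >= thresh for v in row] for row in field]
    kept = [[False] * len(row) for row in field]
    for cells in find_objects(raw):
        if len(cells) >= min_area:
            for y, x in cells:
                kept[y][x] = True
    return raw, kept


# Toy 8x10 field: a 3x5 blob (15 cells) and a separate 7-cell strip.
field = [[0] * 10 for _ in range(8)]
for y in range(3):
    for x in range(5):
        field[y][x] = 300
for x in range(7):
    field[6][x] = 260

raw, kept = object_mask(field, thresh=250, min_area=12)
raw_events = sum(map(sum, raw))   # events in the thresholded raw field
obj_events = sum(map(sum, kept))  # events left after the area filter
print(raw_events, obj_events, raw_events - obj_events)
```

In this toy case the 7-cell strip passes the value threshold but fails the
area threshold, so the raw mask has 7 more event points than the object mask.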

Hope that helps clarify.

Thanks,
John

On 07/10/2013 01:48 PM, Tara Jensen via RT wrote:
>
> Wed Jul 10 13:48:38 2013: Request 62194 was acted upon.
> Transaction: Ticket created by jensen
>         Queue: met_help
>       Subject: [Fwd: Possible 4.1 small error in contingency statistics]
>         Owner: Nobody
>    Requestors: jensen at ucar.edu
>        Status: new
>   Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=62194 >
>

------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #62194] [Fwd: Possible 4.1 small error in contingency statistics]
From: wallace.l.clark at noaa.gov
Time: Thu Jul 11 12:16:06 2013

Hi John,
Your clarifications are useful, as I have been somewhat uncertain about
the exact differences between the three lines.

My concern, though, is that I used the GFS analysis field for both
the forecast and observed fields. (This seemed to be an easy way to
count total objects in the analysis.) But this should yield perfect
consensus numbers and perfect agreement between forecast and
observation, which it didn't. ???

W

------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #62194] [Fwd: Possible 4.1 small error in contingency statistics]
From: John Halley Gotway
Time: Thu Jul 11 12:29:49 2013

Wally,

I think you're interpreting this incorrectly.  It did result in
perfect agreement between the fields.  Take a look at the FY_ON and
FN_OY columns in the CTS file.  They're all 0.  The FY_ON and FN_OY
columns tell you the number of grid points at which the forecast and
observation *disagree*.  Since these are all 0, the forecast and
observation agree at every single point.

The difference between the RAW and OBJECT lines is in the FY_OY and
FN_ON columns.  They differ by 7 grid points.  So when computing a
contingency table over the RAW data, there are 7 more "events"
than when computing a contingency table over the OBJECT data.  But the
forecast and observation fields completely agree; it's just that the
logic for defining events differs between the RAW and OBJECT data.

So there's no problem here.  Make sense?

Thanks,
John

------------------------------------------------
Subject: [Fwd: Possible 4.1 small error in contingency statistics]
From: wallace.l.clark at noaa.gov
Time: Thu Jul 11 14:04:42 2013

John,

Ah! This time I do get it.  The minimum area threshold I left set to 12
caused 'event' pixels to be reclassified as 'non-event' pixels, thus
changing FY_OY, FN_ON, and the base rate. But because this occurred
identically in both data sets, the skill scores remained perfect. So
all is as it should be!
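
A small numeric sketch of that conclusion, using hypothetical counts for an
80-point grid (22 raw events versus 15 object events; these are not the
values from the attached CTS file) and the standard textbook definitions of
the statistics. The function name `cts_stats` is made up for illustration:

```python
def cts_stats(fy_oy, fy_on, fn_oy, fn_on):
    """Compute a few contingency-table statistics from the four counts."""
    total = fy_oy + fy_on + fn_oy + fn_on
    return {
        "BASER": (fy_oy + fn_oy) / total,        # observed event frequency
        "FMEAN": (fy_oy + fy_on) / total,        # forecast event frequency
        "ACC": (fy_oy + fn_on) / total,          # fraction of points that agree
        "CSI": fy_oy / (fy_oy + fy_on + fn_oy),  # critical success index
    }


# Identical forecast and observation: the off-diagonal counts are 0.
raw_stats = cts_stats(fy_oy=22, fy_on=0, fn_oy=0, fn_on=58)
# Same grid after 7 event points are reclassified as non-events.
obj_stats = cts_stats(fy_oy=15, fy_on=0, fn_oy=0, fn_on=65)

for name in ("BASER", "FMEAN", "ACC", "CSI"):
    print(name, raw_stats[name], obj_stats[name])
```

With these made-up counts the base rate and forecast mean drop from 0.275 to
0.1875, while ACC and CSI stay at 1.0 in both cases.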

Much Thanks,

Wally

------------------------------------------------