[Met_help] [rt.rap.ucar.edu #88095] History for MET-TC homogeneous verification issue

Tue Dec 11 11:26:09 MST 2018

----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Hello,

I have been trying to do a homogeneous track comparison with tc_stat, but I
keep ending up with too many cases.  I did a simple test case and found
that MET-TC is verifying forecasts from the adeck file that were issued at
times *prior* to genesis (i.e., where storm classification is "DB" at hour
0).  I am using the "column_str" flag to filter out any lines where the
storm is classified as DB, but this does not take care of lead times where
it is classified as a real storm and that particular forecast started with
a "DB" classification.  It is NHC policy to *not* verify such forecasts at
all since they were issued prior to genesis.  If I use the "init_str" flag
in addition to the "column_str" flag, I get their intersection, not their
union, so I end up with more cases at each lead time than I should.

./tc_stat -lookin $inp -job summary -by AMODEL,LEAD -column ${field} \
  -out $out -event_equal TRUE \
  -column_str LEVEL TD,TS,TY,TC,HU,SD,SS,ST \
  -init_str LEVEL TD,TS,TY,TC,HU,SD,SS,ST

I am wondering if there is a way to get the union of init_str and
column_str or some other way to avoid including forecasts that were
initialized prior to official storm formation as well as those times when
the storm was classified as "DB".

Thank you,
Shannon
-- 
Shannon Rees
UCAR Visiting Scientist
Geophysical Fluid Dynamics Lab
Princeton University Forrestal Campus
201 Forrestal Rd Princeton, NJ
609-452-5384

----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: MET-TC homogeneous verification issue
From: John Halley Gotway
Time: Mon Dec 10 13:15:07 2018

Shannon,

I read through your email, and I understand that you want to discard any
ADECK tracks where LEVEL = DB for lead hour = 0.  And you're right, you can
use the "-init_str" job command option to do so:
   -init_str LEVEL TD,TS,TY,TC,HU,SD,SS,ST

This will only keep tracks where the value of LEVEL for the 0-hour forecast
is included in that list.

But you lost me after that...
- Using the "-column_str" option in addition will *subset* tracks down to
the set of points where LEVEL shows up in the list.
- Taking the intersection of the "-init_str" and "-column_str" options
would result in *less* data, not more.
- Taking the union would result in *more* data, not less.

Are you saying that the job listed above results in the tracks where
LEVEL=DB for the 0-hour forecast?  That should not be the case.

Could you please send me some of your data and sample commands to better
illustrate the problem?

Thanks
John
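
The difference between filtering whole tracks and filtering individual
track points can be sketched with a toy model in Python.  This is not MET
code: the function names, the tuple layout (init time, lead hour, LEVEL)
and the sample data are all assumptions made up for illustration.  Only the
LEVEL list comes from the command above.

```python
# Toy model of the two filters; NOT MET code.
# Each tuple is one track point: (init_time, lead_hour, level).
# All sample data is invented for illustration.
KEEP = {"TD", "TS", "TY", "TC", "HU", "SD", "SS", "ST"}

points = [
    ("2018090100", 0, "DB"),   # forecast issued before genesis
    ("2018090100", 24, "TD"),
    ("2018090112", 0, "TD"),   # forecast issued after genesis
    ("2018090112", 24, "TS"),
    ("2018090112", 48, "DB"),
]

def init_str_filter(pts, keep):
    """Keep whole tracks whose lead-hour-0 LEVEL is in `keep`."""
    good_inits = {init for init, lead, level in pts
                  if lead == 0 and level in keep}
    return [p for p in pts if p[0] in good_inits]

def column_str_filter(pts, keep):
    """Keep individual track points whose own LEVEL is in `keep`."""
    return [p for p in pts if p[2] in keep]

only_init = init_str_filter(points, KEEP)
only_column = column_str_filter(points, KEEP)
both = column_str_filter(init_str_filter(points, KEEP), KEEP)

# Point counts: all, init filter only, column filter only, both filters.
print(len(points), len(only_init), len(only_column), len(both))
# prints: 5 3 3 2
```

In this sketch, applying both filters never yields more points than
applying either one alone, which is the sense in which the intersection
results in less data.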

------------------------------------------------
Subject: MET-TC homogeneous verification issue
From: Shannon Rees - NOAA Affiliate
Time: Tue Dec 11 07:59:43 2018

Hi John,

I've realized that you are correct about the intersection of the
"-init_str" and "-column_str" options resulting in less data.  That is what
I now see in my test case using just two models, GFSO and HWRF.  My
original case used three other models which I artificially added to the
adeck files.  I realized the added models didn't have the correct "LEVEL"
codes, which is likely causing the problem I saw.

Just so I understand... I thought the intersection of the two would be a
problem because I want to exclude all DB entries as well as all entries
from forecasts that had DB at lead time = 0 hours.  Wouldn't the
intersection of the two mean that the only entries to be excluded would
have to fall into both categories?  So only the entries from forecasts that
had DB at lead time = 0 *and* were also labelled DB at that particular lead
time would be excluded?  I thought the union of the two would be the only
way to select all of the entries I want to exclude.  This is not what I'm
seeing after all so I know I must be wrong in thinking this.

Thanks for your help!
Shannon

------------------------------------------------
Subject: MET-TC homogeneous verification issue
From: John Halley Gotway
Time: Tue Dec 11 09:40:50 2018

Shannon,

I think the confusion here is stemming from the difference between
inclusion/exclusion and intersection/union.  Let me clarify what the
following two options are doing:

(1) -init_str LEVEL TD,TS,TY,TC,HU,SD,SS,ST ... says to keep (i.e.
"include") tracks where the LEVEL at forecast hour 0 is one of the ones
listed.
(2) -column_str LEVEL TD,TS,TY,TC,HU,SD,SS,ST ... says to keep (i.e.
"include") track points where the LEVEL is one of the ones listed.

All of these filtering options tell tc_stat what to "include"... not
"exclude".

The output should only contain data that meets both criteria (1) and (2)...
(i.e. their "intersection").  So only track points from the specified LEVEL
list from tracks whose 0-hour LEVEL was one of the ones listed.

I think that's exactly what you want.  But if the output doesn't match
that, please let me know.

Thanks,
John
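
The inclusion/exclusion point can be checked with a small set-based sketch
in Python.  It is only a toy model, not how tc_stat is implemented: the
track ids, the tuple layout (track id, lead hour, LEVEL) and the sample
points are invented.  It shows that keeping the intersection of the two
"include" criteria is the same as excluding the union of the two failure
cases.

```python
# Toy illustration, NOT MET code.
# Each point is (track_id, lead_hour, level); the data is invented.
KEEP = {"TD", "TS", "TY", "TC", "HU", "SD", "SS", "ST"}

points = [
    ("A", 0, "DB"), ("A", 24, "TD"), ("A", 48, "TS"),
    ("B", 0, "TD"), ("B", 24, "DB"), ("B", 48, "HU"),
]

# Criterion (1): the LEVEL at lead hour 0 of the point's track is in KEEP.
init_level = {tid: lvl for tid, lead, lvl in points if lead == 0}
passes_init = {p for p in points if init_level[p[0]] in KEEP}

# Criterion (2): the point's own LEVEL is in KEEP.
passes_column = {p for p in points if p[2] in KEEP}

# Points that are kept must satisfy both criteria (intersection).
kept = passes_init & passes_column
excluded = set(points) - kept

# Points that are dropped fail at least one criterion (union of failures).
fails_init = set(points) - passes_init
fails_column = set(points) - passes_column
assert excluded == fails_init | fails_column

print(sorted(kept))
print(sorted(excluded))
```

In this toy data, every point of track A is dropped because its 0-hour
LEVEL is DB, and the DB point of track B is dropped on its own LEVEL, so
both kinds of unwanted entries are excluded.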

------------------------------------------------
Subject: MET-TC homogeneous verification issue
From: Shannon Rees - NOAA Affiliate
Time: Tue Dec 11 10:35:37 2018

John,

Thanks for the explanation.  That clears it up.  I was thinking about the
exclusion/inclusion in a backwards way.  I believe the filtering is working
as you explained; I just have a problem with my adeck data.  You can close
the ticket.

------------------------------------------------