[Met_help] [rt.rap.ucar.edu #67048] History for MET-TC paired tests

Mon Jun 2 14:12:56 MDT 2014

----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Hi MET team,

I used plot_tcmpr.R to plot TK_ERR vs lead time with confidence 
intervals.  Plus, I plotted relative performance and rank.  Very nice, 
and easy to use! (coming from someone who doesn't even use R. :) )

But can I use MET-TC's tc_stat to get pairwise differences between 2 models, 
such as described in 
http://www.ral.ucar.edu/projects/hfip/includes/h2013/2013-Stream-15-methodology-20May2013-final.pdf, 
so that I can address whether the track error differences are 
statistically significant?

It seems like this was done for the 2013 Stream 1.5 candidate 
evaluation, but I'm not sure if the pairwise differences can be done by 
MET-TC or if they have to be done separately.

dave

----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Re: [rt.rap.ucar.edu #67048] MET-TC paired tests
From: John Halley Gotway
Time: Mon May 12 12:09:15 2014

Dave,

Yep, that plotting script can handle pair-wise differences.  There's a
command line option named "-series" to specify what lines are drawn on
the plot.  For example, if your models are named GFSI and OFCL, you
could use "-series AMODEL GFSI,OFCL,GFSI-OFCL".

That'll plot 3 lines on the plot - one for GFSI, one for OFCL, and a
third for their pairwise difference.  If you only want the difference,
you'd use "-series AMODEL GFSI-OFCL".
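
As a rough illustration of what a pairwise difference means here, the
sketch below matches cases between two models and differences their
errors.  It is a Python sketch written for this note, not code from
MET-TC or plot_tcmpr.R.  The storm IDs, init times, error values, and
the helper names paired_differences and mean_with_ci are all made up,
and the confidence interval is a plain normal approximation, which is
an assumption and not necessarily what the plotting script computes.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical track errors keyed by (storm_id, init_time, lead_hour).
gfsi = {
    ("AL01", "2013060100", 24): 40.0,
    ("AL01", "2013060106", 24): 55.0,
    ("AL02", "2013070100", 24): 62.0,
    ("AL02", "2013070106", 24): 48.0,
    ("AL03", "2013080100", 24): 70.0,  # no OFCL match, so it is dropped
}
ofcl = {
    ("AL01", "2013060100", 24): 35.0,
    ("AL01", "2013060106", 24): 50.0,
    ("AL02", "2013070100", 24): 60.0,
    ("AL02", "2013070106", 24): 51.0,
}


def paired_differences(a, b):
    """Return a - b for every case present in both dictionaries."""
    common = sorted(a.keys() & b.keys())
    return [a[key] - b[key] for key in common]


def mean_with_ci(values, level=0.95):
    """Mean and normal-approximation confidence interval for the mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * stdev(values) / sqrt(len(values))
    m = mean(values)
    return m, m - half_width, m + half_width


diffs = paired_differences(gfsi, ofcl)
m, lo, hi = mean_with_ci(diffs)
print(diffs)
print(f"mean GFSI-OFCL difference = {m:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

In this made-up sample the interval straddles zero, so the difference
would not be called significant at that level.  Only cases present for
both models enter the difference, which is why the unmatched AL03 case
is ignored.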

Give that a shot and let me know if you run into any problems.

Thanks,
John

On 05/11/2014 12:35 PM, David Ahijevych via RT wrote:
>
> Sun May 11 12:35:30 2014: Request 67048 was acted upon.
> Transaction: Ticket created by ahijevyc
>         Queue: met_help
>       Subject: MET-TC paired tests
>         Owner: Nobody
>    Requestors: ahijevyc at ucar.edu
>        Status: new
>   Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=67048 >

------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #67048] MET-TC paired tests
From: David Ahijevych
Time: Mon May 12 15:01:06 2014

Thanks... very powerful this script is.

------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #67048] MET-TC paired tests
From: David Ahijevych
Time: Fri May 16 15:49:36 2014

Occasionally plot_tcmpr.R adds spurious numbers and non-colored numbers
on my rank plots.  Is there a way to share an image with you?

------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #67048] MET-TC paired tests
From: John Halley Gotway
Time: Fri May 16 16:39:31 2014

Dave,

Are there little black numbers that appear above/below the lines for
the best/worst ranks?  If so, those aren't "spurious" - they're there
by design.  I'll try to give you an explanation, but for more
detail I'll refer you to Tressa or one of the people working on HFIP.

These are due to "ties" in the data.  When ranking performance across
multiple models, sometimes models have the exact same performance.
Ties are more likely for intensity errors than track errors since the
intensity differences are more "binned".  You have to decide how to
assign a rank when there's a tie.  But of course, we couldn't decide,
so we handle them in two different ways.  Generally speaking, when
there are ties, we let R randomly assign the rank to be used.  Those
results are shown in the colored numbers and lines on the plot.
However, we wanted to see how much ties were affecting performance.
So the black numbers show what the best/worst rank lines would be if,
instead of randomly assigning the rank, we gave the model we're
ranking the benefit of the doubt by giving it the best possible rank
available.

Here are 2 lines taken from plot_tcmpr_util.R where this is being done:
    rank_random = function(x) { return(rank(x, na.last="keep", ties.method="random")[1]); }
    rank_min    = function(x) { return(rank(x, na.last="keep", ties.method="min")[1]); }

Notice the difference in "ties.method".
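
To make the two tie-handling rules concrete, here is a small Python
sketch of the same idea.  It is an illustration written for this note,
not a port of plot_tcmpr_util.R.  The function name rank_first and the
error values are hypothetical, and it skips the missing-value handling
that na.last="keep" provides in R.  Like the [1] index in the R lines
above, it returns only the rank of the first element, which stands for
the model being ranked.

```python
import random


def rank_first(values, ties="min", rng=None):
    """Rank of values[0] among values, where 1 means the smallest value.

    ties="min":    tied values all receive the lowest rank in their group.
    ties="random": tied values receive the group's ranks in random order,
                   so values[0] gets one of those ranks uniformly at random.
    """
    target = values[0]
    smaller = sum(1 for v in values if v < target)
    tied = sum(1 for v in values if v == target)  # counts values[0] itself
    if ties == "min":
        return smaller + 1
    if ties == "random":
        rng = rng or random
        return smaller + rng.randint(1, tied)
    raise ValueError(f"unknown ties method: {ties}")


# Hypothetical intensity errors; the model being ranked comes first.
errors = [15.0, 15.0, 20.0, 15.0, 10.0]

best_case = rank_first(errors, ties="min")
random_case = rank_first(errors, ties="random", rng=random.Random(0))
print("min rank:", best_case)
print("random rank:", random_case)
```

With three models tied at 15.0 and one model doing better, the "min"
rule always gives rank 2, while the "random" rule gives 2, 3, or 4
depending on the draw.  The gap between the two is the kind of tie
effect the black numbers are meant to show.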

If you want to see this in action, you could look on this DTC website
showing our 2013 retrospective testing:
    http://www.ral.ucar.edu/projects/hfip/h2013/verify/
In the "Plot Type" box select "Rank Plots" and then hit "View Plot".

Thanks,
John

------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #67048] MET-TC paired tests
From: David Ahijevych
Time: Fri May 16 16:47:17 2014

John,

Yes - I think that's what I'm seeing... Black numbers are the best a
model could do, given the most favorable rank among ties.

Dave

------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #67048] MET-TC paired tests
From: John Halley Gotway
Time: Fri May 16 17:13:28 2014

Dave,

Yes, that's correct.  And we only plot them for the best and worst
lines.  Otherwise, the plot would be littered with a bunch of little
black numbers and be difficult to read.

John

------------------------------------------------