[Met_help] MODE Advice: [Fwd: Re: Measuring model skill in mode_analysis]

Sat May 9 13:07:56 MDT 2009

Hi Jonathan,

John has done a good job of giving you some insights into interpreting
MODE output.  As he said, this is an area where we are still learning
what works best, and the best approach will probably always depend on
the user's application.  Nevertheless, I'll try to give you a few more
thoughts about interpreting MODE output.  Let me know if any of them
are helpful!

First, I've attached a recent paper that Chris Davis, Randy Bullock, and 
I just submitted for 2nd reviews.  This paper talks more about the MMI 
as well as an object-based Gilbert Skill Score (i.e., ETS).  Of course, 
these scores are like other single-valued skill scores - they can't tell 
the whole story by themselves, so should be used in conjunction with 
other types of metrics.
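
For reference, the MMI is described in John's message quoted further
down: each simple object gets the maximum interest over all simple
object pairs it belongs to (0 if it was never compared with anything),
and MMI is the median of those maxima, taken over the forecast objects,
the observation objects, or both.  A minimal sketch of that
description, not the mode_summary.R script itself: the function name
`mmi`, the object ids, and the layout of the `interest` dictionary are
assumptions made for illustration.

```python
from statistics import median


def mmi(fcst_ids, obs_ids, interest):
    """Median of the Maximum Interest (MMI), computed three ways.

    fcst_ids, obs_ids: ids of the simple forecast/observation objects.
    interest: dict mapping (fcst_id, obs_id) -> interest value, holding
        only the pairs that were actually compared (hypothetical layout).
        Every id used in `interest` must appear in fcst_ids or obs_ids.
    """
    # An object that was never compared keeps a maximum interest of 0.
    fcst_max = {f: 0.0 for f in fcst_ids}
    obs_max = {o: 0.0 for o in obs_ids}
    for (f, o), value in interest.items():
        fcst_max[f] = max(fcst_max[f], value)
        obs_max[o] = max(obs_max[o], value)

    f_vals = list(fcst_max.values())
    o_vals = list(obs_max.values())
    return {
        "fcst": median(f_vals),
        "obs": median(o_vals),
        "all": median(f_vals + o_vals),
    }


# Toy case: forecast object F3 is far from everything, so its maximum is 0.
example = mmi(
    ["F1", "F2", "F3"],
    ["O1", "O2"],
    {("F1", "O1"): 0.9, ("F1", "O2"): 0.6, ("F2", "O2"): 0.8},
)
print(example)
```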

A type of object-based CSI that we've also used is based on
area weighting (called "Area-Weighted CSI" or AWCSI).  It works
something like the following:

AWCSI = (hit area weight * # hits) /
        [ (hit area weight * # hits)
        + (miss area weight * # misses)
        + (false alarm area weight * # false alarms) ]

Each area weight is the ratio of the size of the (hit, miss, or false
alarm) objects to the total area of all objects, where # hits = number
of matched objects, # misses = number of unmatched observed objects,
and # false alarms = number of unmatched forecast objects.  This
measure answers the question: How well did the forecast "yes" objects
correspond to the observed "yes" objects?
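
The AWCSI formula above can be sketched in a few lines of Python.  This
is only an illustration under stated assumptions, not code from MET:
the function name `awcsi` and its three inputs (lists of object areas
for matched objects, unmatched observed objects, and unmatched forecast
objects) are hypothetical, and how matched areas are tallied (forecast
objects, observed objects, or both) is left to the caller.

```python
def awcsi(hit_areas, miss_areas, false_alarm_areas):
    """Area-Weighted CSI from three lists of object areas.

    hit_areas:         areas of matched objects
    miss_areas:        areas of unmatched observed objects
    false_alarm_areas: areas of unmatched forecast objects
    Returns NaN when there are no objects at all.
    """
    total_area = sum(hit_areas) + sum(miss_areas) + sum(false_alarm_areas)
    if total_area <= 0:
        return float("nan")

    def weighted_count(areas):
        # (area weight) * (number of objects) for one category
        return (sum(areas) / total_area) * len(areas)

    hits = weighted_count(hit_areas)
    misses = weighted_count(miss_areas)
    false_alarms = weighted_count(false_alarm_areas)
    denominator = hits + misses + false_alarms
    return hits / denominator if denominator > 0 else float("nan")


# Two matched objects, one missed object, one false-alarm object.
print(awcsi([100, 300], [50], [50]))
```

In this toy case the area weights are 0.8, 0.1, and 0.1, so
AWCSI = 1.6 / 1.8, or about 0.89, whereas a CSI based on object counts
alone would give 2 / 4 = 0.5.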

The three things I've mentioned above are summary measures.  It also 
might be useful to look at distributions of values of some of the 
individual object measures, such as
o  Intersection area divided by union area for matched objects
o  Ratios of - or differences between - forecast and observed intensity
   percentiles for matched pairs of objects - the 50th and 90th
   percentiles are the ones we normally focus on
o  Differences in any of the other attributes
o  Attributes associated with the missed and false alarm objects

It seems like it would be useful to look at these distributions as a 
function of time of day as well as level of activity (e.g., measured by 
the total area of observed objects).
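
One way to look at such distributions by time of day is sketched below.
It is a rough illustration only: the record layout (dictionaries with
the keys `valid_hour`, `intersection_area`, `union_area`, `fcst_p90`,
and `obs_p90`) is an assumption made for this example and would need to
be mapped onto the columns of the actual MODE object statistics files.
The same grouping could be done on bins of total observed object area
instead of the hour.

```python
from collections import defaultdict
from statistics import median


def summarize_by_hour(pairs):
    """Group matched-pair measures by valid hour and report medians.

    pairs: iterable of dicts with the hypothetical keys valid_hour,
        intersection_area, union_area, fcst_p90, obs_p90.
    """
    groups = defaultdict(lambda: {"iou": [], "p90_ratio": []})
    for p in pairs:
        # Skip pairs for which the ratios are undefined.
        if p["union_area"] <= 0 or p["obs_p90"] <= 0:
            continue
        g = groups[p["valid_hour"]]
        g["iou"].append(p["intersection_area"] / p["union_area"])
        g["p90_ratio"].append(p["fcst_p90"] / p["obs_p90"])

    return {
        hour: {
            "n": len(g["iou"]),
            "median_iou": median(g["iou"]),
            "median_p90_ratio": median(g["p90_ratio"]),
        }
        for hour, g in sorted(groups.items())
    }


# Made-up numbers, purely to show the call.
sample_pairs = [
    {"valid_hour": 0, "intersection_area": 50, "union_area": 100,
     "fcst_p90": 10.0, "obs_p90": 5.0},
    {"valid_hour": 0, "intersection_area": 30, "union_area": 120,
     "fcst_p90": 4.0, "obs_p90": 8.0},
    {"valid_hour": 12, "intersection_area": 80, "union_area": 100,
     "fcst_p90": 6.0, "obs_p90": 6.0},
]
for hour, stats in summarize_by_hour(sample_pairs).items():
    print(hour, stats)
```

Medians are used here only for brevity; the full lists in each group
could just as well be passed to a box plot or histogram routine.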

We would very much like your feedback on things that you find to be 
useful.  In fact, it is our hope to post some new analysis approaches on 
the MET website.

Also - please let us know if there are other attributes that you would 
like to be able to look at, that might be particularly useful for your 
application.  We're always looking for ways to improve the methods in MET.

I apologize for taking a while to respond to your email.  Please let me
know if you have any questions or comments.

Barb Brown

> -------- Original Message --------
> Subject: Re: Measuring model skill in mode_analysis
> Date: Thu, 30 Apr 2009 08:10:08 -0600
> From: John Halley Gotway <johnhg at rap.ucar.edu>
> To: Case, Jonathan (MSFC-VP61)[Other] <jonathan.case-1 at nasa.gov>
> References: <C1333A631BF21841864F141194ED3D4A1522011D2F at NDMSSCC02.ndc.nasa.gov> <49F5B659.1000904 at rap.ucar.edu> <C1333A631BF21841864F141194ED3D4A199DDAEDE2 at NDMSSCC02.ndc.nasa.gov>
> <49F5C810.5030603 at rap.ucar.edu> <C1333A631BF21841864F141194ED3D4A199E064EC7 at NDMSSCC02.ndc.nasa.gov>
>
> Jonathan,
>
> Exactly - Those contingency table stats are just the traditional verification method.  They're meant to serve as a point of reference for the MODE object output and answer the question "How would this
> forecast have performed using a traditional verification approach?"  To get at what you're after, we could consider a 4th line of output that contains contingency table counts computed in a different
> way based on the results of the object matching.  The difficulty is defining how exactly to construct that contingency table.
>
> Unfortunately, I don't have an answer for you about how "BEST" to interpret the MODE output.  It's an open area of research, and it's one reason why we wanted to distribute MODE to the community - to see
> how people use it.  And I sympathize with you that by simply averaging, most of the differences in object attributes get smeared away.  We saw the same thing when applying MODE to NCWF2 convective cells.
>
> So while I don't know the best way of interpreting the results, I can tell you what people have done in the past:
>
> (1) Chris Davis (NCAR) has used MODE to define objects without even paying attention to the matching.  Instead, he looked at the distributions of the object areas and intensities based on time of day.
>  By comparing the forecast object distribution to the observation object distribution, he was able to diagnose some timing and intensity errors in the model.
>
> (2) Dave Ahijevych (NCAR) has used the MODE matching output to construct contingency tables as you've suggested.
>
> (3) There are several metrics you could look at for a single case.  Please take a look at the sample R script I posted recently for post-processing the MODE output files:
> http://www.dtcenter.org/met/users/downloads/analysis_scripts.php.  The "mode_summary.R" script computes counts and areas of matched/unmatched objects and computes a metric we're calling MMI (Median of
> the Maximum Interest).  Listed at the bottom of this message is a description of MMI.
>
> (4) The issue of scale (radius and threshold) is a very important one.  You may want to use MODE to analyze your data at multiple scales to determine at which scale the forecast is useful (you can use
> the Fractions Skill Score from the neighborhood methods for this as well).  On that same scripts page, take a look at the "mode_quilt_plot.R" R script and the "sample_PDF_file" that it creates.  One thing you
> can do is use many different choices of radii and thresholds and plot metrics for each scale.
>
> Hopefully that'll help.  I'm forwarding this message to the scientists in our group to see if they have any additional advice.
>
> Thanks,
> John
>
> MMI stands for "Median of the Maximum Interest".  Take a look at the example PostScript file here:
> .../METv2.0/out/mode/mode_RH_ISBL_500_vs_RH_ISBL_500_120000L_20050807_120000V_000000A.ps
>
> In this example, MMI is computed as follows:
> - There are 8 simple forecast objects and 12 simple observation objects.
> - For each one of those objects, the maximum interest is computed by looking at the interest values for all simple object pairs in which it's a member.
> - So we have one maximum interest value for each object: 8 forecast values and 12 observation values.
> - If an object is way off by itself and wasn't compared to any other objects because it's so far away, its maximum interest value is 0.
> - Next we compute MMI 3 ways:
>    (1) MMI with respect to forecast objects is the median of those 8 forecast maximum interest values (MMI w.r.t fcst = 0.8800).
>    (2) MMI with respect to observation objects is the median of those 12 observation maximum interest values (MMI w.r.t. obs = 0.8723).
>    (3) MMI with respect to all objects is the median of all 20 forecast and observation maximum interest values (MMI w.r.t all = 0.8765).
>
>
> Case, Jonathan (MSFC-VP61)[Other] wrote:
>   
>> Hi John,
>>
>> I'm still not entirely sure what can be done with these contingency stats in MODE output.  
>> It seems from your description below that the FCST/OBS objects must overlap at a particular grid point in order to receive a "hit".  At face value, these contingency stats do not appear to provide much added value beyond traditional or neighborhood stats.  
>>
>> So, is there a recommended way to measure the "skill" of these matched/unmatched objects besides summarizing and comparing relatively obscure object attributes such as complexity ratio, convex hull distance, etc.?  I have found that the mean object attributes are VERY similar to one another when computed over a series of many forecasts.  The most obvious method (which I've done so far) is to simply compare the matched and unmatched object areas using -bycase, in order to see which one has more hits and fewer false alarms.  
>>
>> Also, some object attributes are inherently dependent on the configurable parameters set prior to running MODE.  For example, I found that by reducing the obs/fcst_conv_radius (I ended up using a value of "3"), MODE produced a more stringent, realistic matching of the objects.  When the radii are reduced, the mean CENTROID_DIST is subsequently reduced as well because objects must be closer to one another in order to be matched.  [Motivation: If these radii are too large, then convective precipitation systems on the west coast of Florida would be paired up with convection on the east coast of the Florida peninsula, which to a forecaster wouldn't be considered a "hit".]
>>
>> So, as you can see, I'm still trying to get my hands around how best to summarize the MODE output.
>>
>> Thanks for your insight, as always!
>> Jonathan
>>
>>     
>>> -----Original Message-----
>>> From: John Halley Gotway [mailto:johnhg at rap.ucar.edu]
>>> Sent: Monday, April 27, 2009 9:58 AM
>>> To: Case, Jonathan (MSFC-VP61)[Other]
>>> Cc: met_help at ucar.edu
>>> Subject: Re: [Met_help] mode_analysis question
>>>
>>> Jonathan,
>>>
>>> Actually, the counts in those files have nothing to do with the
>>> matching that was performed by MODE.  The intent of the file is to give
>>> you an easy way of seeing what you'd get with a traditional
>>> verification of the input fields as opposed to an object-based
>>> verification.  Let me point you to the first paragraph of section 6.3.3
>>> of the MET User's Guide for a description of these lines.  Here's
>>> some more detail based on the contents of the "FIELD" column.
>>>
>>> FIELD Column...
>>> (1) RAW: Apply any masking of bad data, grid masking, or polygon
>>> masking to the raw fields.  Threshold the raw fields using the
>>> "fcst_conv_thresh" and "obs_conv_thresh" config values to define 0/1
>>> fields.  Compute the contingency table counts by comparing these 0/1
>>> fields grid point by grid point.
>>>
>>> (2) FILTER: Apply any masking of bad data, grid masking, or polygon
>>> masking to the raw fields.  In addition, apply the "fcst_raw_thresh"
>>> and "obs_raw_thresh" to filter out any additional values.
>>> Then use the "fcst_conv_thresh" and "obs_conv_thresh" config values to
>>> define 0/1 fields, and compute a contingency table from them.
>>>
>>> (3) OBJECT: Once objects have been defined in the forecast and
>>> observation fields, consider any grid point inside of an object to have
>>> a value of 1 and any grid point outside of an object to have a
>>> value of 0.  Compute the contingency table counts by comparing these
>>> 0/1 fields grid point by grid point.
>>>
>>> So really object matching has nothing to do with it.
>>>
>>> Make sense?
>>>
>>> John
>>>
>>>
>>> For the RAW line:
>>>
>>> Case, Jonathan (MSFC-VP61)[Other] wrote:
>>>       
>>>> Hi John,
>>>>
>>>> You hit the nail on the head with #1.  I saw the counts in the MODE
>>>> contingency files (*_cts.txt) and was wondering if we could summarize
>>>> those in mode_analysis (to which the answer is no as you said).
>>>>
>>>> It seems like we could fairly easily read in the contingency counts
>>>> in a shell script, with a little management of filenames.
>>>>
>>>> I also wonder how those contingency files are calculated in the MODE
>>>> output?  Is any forecast object that matches an observed object
>>>> considered a "hit", and then the individual grid points within that
>>>> object are added to the total of FY_OY?  And similarly with the
>>>> unmatched forecast/observed objects, are their grid points summed to be
>>>> FY_ON and FN_OY?  It seems like it could get a bit unclear when dealing
>>>> with FCST/OBS objects that don't overlap when trying to determine
>>>> FY_ON, FN_OY, and FN_ON.  Perhaps you could elaborate for me so I can
>>>> make more sense of the MODE contingency stat files?
>>>>
>>>> Thanks for your help,
>>>> Jonathan
>>>>
>>>>         
>>>>> -----Original Message-----
>>>>> From: John Halley Gotway [mailto:johnhg at rap.ucar.edu]
>>>>> Sent: Monday, April 27, 2009 8:43 AM
>>>>> To: Case, Jonathan (MSFC-VP61)[Other]
>>>>> Cc: met_help at mailman.ucar.edu
>>>>> Subject: Re: [Met_help] mode_analysis question
>>>>>
>>>>> Jonathan,
>>>>>
>>>>> I'm a little unclear exactly what you're asking here.  It could be one
>>>>> of two things... but the answer to both is no.
>>>>>
>>>>> The MODE-Analysis tool currently performs 2 types of jobs by reading
>>>>> the MODE object statistics files:
>>>>> (1) A "summary" job in which you select one or more MODE output columns
>>>>> of interest and it summarizes the data in those columns.
>>>>> (2) A "bycase" job that produces summary information for each MODE run
>>>>> consisting of counts and areas of matched and unmatched objects.
>>>>>
>>>>> Let me try to understand exactly what you're asking though.  Are you
>>>>> asking:
>>>>> (1) Can MODE-Analysis read those MODE contingency table statistics
>>>>> files (*_cts.txt) output and aggregate them across many cases?
>>>>> (2) Can MODE-Analysis read the MODE object statistics file, treat the
>>>>> matched/unmatched object counts or areas as the elements of a
>>>>> contingency table and derive some contingency table statistics based
>>>>> on that "object matching" contingency table?
>>>>>
>>>>> Like I said, the answer to both is no.  People here have done (2)
>>>>> before, but it's still kind of an open question how best to use the
>>>>> object counts or areas to populate the elements of a
>>>>> contingency table.  You have to decide how to define the hits, misses,
>>>>> false alarms, and correct nulls.  And doing so gets a bit messy -
>>>>> especially defining the correct nulls.
>>>>>
>>>>> Thanks,
>>>>> John
>>>>>
>>>>> Case, Jonathan (MSFC-VP61)[Other] wrote:
>>>>>           
>>>>>> Dear Met_help,
>>>>>>
>>>>>> Does mode_analysis let the user summarize contingency statistics of
>>>>>> the paired objects, or can it only summarize the various object
>>>>>> attributes?
>>>>>>
>>>>>> Thanks for the help,
>>>>>> Jonathan
>>>>>>
>>>>>> ***********************************************************
>>>>>> Jonathan Case, ENSCO, Inc.
>>>>>> Aerospace Sciences & Engineering Division
>>>>>> Short-term Prediction Research and Transition Center
>>>>>> 320 Sparkman Drive, Room 3062
>>>>>> Huntsville, AL 35805-1912
>>>>>> Voice: (256) 961-7504   Fax: (256) 961-7788
>>>>>> Emails: Jonathan.Case-1 at nasa.gov
>>>>>>              case.jonathan at ensco.com
>>>>>> ***********************************************************
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> ----------------------------------------------------------------------
>>>>>> _______________________________________________
>>>>>> Met_help mailing list
>>>>>> Met_help at mailman.ucar.edu
>>>>>> http://mailman.ucar.edu/mailman/listinfo/met_help
>>>>>>             
>

-- 
*************************************************************
Barbara G. Brown                 Phone: (303) 497-8468
NCAR/P.O. Box 3000               FAX: (303) 497-8386
Boulder CO 80307-3000 U.S.A.     e-mail: bgb at ucar.edu 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Davis et al 2009 24Apr2009.pdf
Type: application/pdf
Size: 4561115 bytes
Desc: not available
Url : http://mailman.ucar.edu/pipermail/met_help/attachments/20090509/64e0aeb7/attachment-0001.pdf