[Met_help] MODE Advice: [Fwd: Re: Measuring model skill in mode_analysis]

John Halley Gotway johnhg at rap.ucar.edu
Wed May 13 11:36:03 MDT 2009


Jonathan and Barb,

As of METv2.0, the MMI, MMIF, and MMIO values are written on the first page of the MODE PostScript output, in the bottom-middle of the page.  However, they are not written to any of the ASCII
output files.

The MODE-Analysis tool does not currently compute MMI directly.  However, we have posted an Rscript to the MET website that will compute MMI (as well as other summary info):
http://www.dtcenter.org/met/users/downloads/Rscripts/mode_summary.R

This Rscript may be run on the output of one or more MODE runs.

John

Case, Jonathan (MSFC-VP61)[Other] wrote:
> Hi Barbara,
> 
> I finished reading the draft MET article you sent me.  It seems like the interest matrix is key to summarizing model skill based on MMI, MMIF, and MMIO.
> 
> Given the standard output from MODE and the mode_analysis tool, how would I go about calculating the MMI, MMIF, and MMIO stats?  I have figured out how to output the interest summary, but I don't see how this can result in the matrix of output alluded to in the paper.
> 
> Here is a sample of output I've produced:
> ********************************************
> lc2:/raid1/casejl/MET/MODE > mode_analysis -lookin ./seus_control_ge10mm/JUN/2008060103 -summary -column INTEREST -fcst_accum 01
> 
> 
> 
> Total mode lines read =  397
> Total mode lines kept =   16
> 
>    Field    N     Min     Max    Mean   StdDev     P10     P25     P50     P75     P90      Sum
> --------   --   -----   -----   -----   ------   -----   -----   -----   -----   -----   ------
> interest   16   0.525   0.963   0.729    0.131   0.574   0.626   0.728   0.857   0.885   11.668
> ********************************************
> 
> Thanks for the help,
> Jonathan
> 
>> -----Original Message-----
>> From: Barbara Brown [mailto:bgb at ucar.edu]
>> Sent: Saturday, May 09, 2009 2:08 PM
>> To: Case, Jonathan (MSFC-VP61)[Other]
>> Cc: met_help
>> Subject: Re: [Met_help] MODE Advice: [Fwd: Re: Measuring model skill in
>> mode_analysis]
>>
>>
>> Hi Jonathan,
>>
>> John has done a good job giving you some insights into interpretation of
>> MODE output.  As he said, this is an area where we are still learning what
>> works best.  And probably the best things to do will always be user-
>> dependent. Nevertheless, I'll try to give you a few more thoughts about
>> interpretation of MODE output. Let me know if any of them are helpful!
>>
>> First, I've attached a recent paper that Chris Davis, Randy Bullock, and I
>> just submitted for 2nd reviews.  This paper talks more about the MMI as
>> well as an object-based Gilbert Skill Score (i.e., ETS).  Of course, these
>> scores are like other single-valued skill scores - they can't tell the
>> whole story by themselves, so should be used in conjunction with other
>> types of metrics.
>>
>> A type of object-based CSI that we've also used is based on area-weighting
>> (called "Area-Weighted CSI" or AWCSI).  It works something like the
>> following:
>>
>> AWCSI = [(hit area weight) * #hits ] / [(hit area weight * # hits) + (miss
>> area weight * # misses) + (false alarm area weight * # false alarms) ]
>>
>> Each area weight is the ratio of the total size of the (hit, miss, or
>> false alarm) objects to the total area of all objects; # hits = number of
>> matched objects, # misses = number of unmatched observed objects, and
>> # false alarms = number of unmatched forecast objects.  This measure
>> answers the question: How well did the forecast "yes" objects correspond
>> to the observed "yes" objects?
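>> A minimal sketch of the AWCSI computation described above (the object
>> areas below are made-up illustrative numbers, not MET output):

```python
# Sketch of the Area-Weighted CSI (AWCSI) described above.
# Each weight is the ratio of the total (hit/miss/false alarm) object area
# to the total area of all objects; counts are numbers of objects.
# The areas passed in below are hypothetical illustrative values.

def awcsi(hit_areas, miss_areas, fa_areas):
    total = sum(hit_areas) + sum(miss_areas) + sum(fa_areas)
    w_hit = sum(hit_areas) / total
    w_miss = sum(miss_areas) / total
    w_fa = sum(fa_areas) / total
    n_hit, n_miss, n_fa = len(hit_areas), len(miss_areas), len(fa_areas)
    return (w_hit * n_hit) / (w_hit * n_hit + w_miss * n_miss + w_fa * n_fa)

# Example: 3 matched (hit) objects, 1 miss, 1 false alarm
print(round(awcsi([120, 80, 60], [40], [20]), 3))  # -> 0.929
```

>> Because the weights favor the categories that cover more area, a few
>> small false alarms pull AWCSI down much less than one large one would.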
>>
>> The three things I've mentioned above are summary measures.  It also might
>> be useful to look at distributions of values of some of the individual
>> object measures, such as
>>   o  Intersection area divided by union area for matched objects
>>   o  Ratios of - or differences between - forecast and observed intensity
>>      percentiles for matched pairs of objects - the 50th and 90th
>>      percentiles are the ones we normally focus on
>>   o  Differences in any of the other attributes
>>   o  Attributes associated with the missed and false alarm objects
>>
>> It seems like it would be useful to look at these distributions as a
>> function of time of day as well as level of activity (e.g., measured by the
>> total area of observed objects).
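>> As a minimal sketch of computing a couple of these per-pair diagnostics
>> (the attribute values and field names below are hypothetical, not the
>> exact MODE ASCII column names):

```python
# Sketch: two of the per-object-pair diagnostics suggested above, computed
# from hypothetical matched-pair attributes.  Field names are illustrative.
pairs = [
    {"intersection_area": 50.0, "union_area": 120.0,
     "fcst_p90": 12.0, "obs_p90": 15.0},
    {"intersection_area": 30.0, "union_area": 40.0,
     "fcst_p90": 9.0, "obs_p90": 8.0},
]

# Intersection area divided by union area for each matched pair
iou = [p["intersection_area"] / p["union_area"] for p in pairs]

# Difference between forecast and observed 90th-percentile intensities
p90_diff = [p["fcst_p90"] - p["obs_p90"] for p in pairs]

for r, d in zip(iou, p90_diff):
    print(f"IOU = {r:.3f}   P90 diff (fcst - obs) = {d:+.1f}")
```

>> Collecting these values over many cases gives the distributions to
>> stratify by time of day or by level of activity.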
>>
>> We would very much like your feedback on things that you find to be useful.
>> In fact, it is our hope to post some new analysis approaches on the MET
>> website.
>>
>> Also - please let us know if there are other attributes that you would like
>> to be able to look at, that might be particularly useful for your
>> application.  We're always looking for ways to improve the methods in MET.
>>
>> I apologize for taking a while to respond to your email.  Please let me know
>> if you have any questions or comments.
>>
>> Barb Brown
>>
>>
>>
>>> -------- Original Message --------
>>> Subject: Re: Measuring model skill in mode_analysis
>>> Date: Thu, 30 Apr 2009 08:10:08 -0600
>>> From: John Halley Gotway <johnhg at rap.ucar.edu>
>>> To: Case, Jonathan (MSFC-VP61)[Other] <jonathan.case-1 at nasa.gov>
>>> References:
>>> <C1333A631BF21841864F141194ED3D4A1522011D2F at NDMSSCC02.ndc.nasa.gov>
>>> <49F5B659.1000904 at rap.ucar.edu>
>>> <C1333A631BF21841864F141194ED3D4A199DDAEDE2 at NDMSSCC02.ndc.nasa.gov>
>>> <49F5C810.5030603 at rap.ucar.edu>
>>> <C1333A631BF21841864F141194ED3D4A199E064EC7 at NDMSSCC02.ndc.nasa.gov>
>>>
>>> Jonathan,
>>>
>>> Exactly - Those contingency table stats are just the traditional
>>> verification method.  They're meant to serve as a point of reference
>>> for the MODE object output and answer the question "How would this
>>> forecast have performed using a traditional verification approach?"  To
>>> get at what you're after, we could consider a 4th line of output that
>>> contains contingency table counts computed in a different way based on
>>> the results of the object matching.  The difficulty is defining how
>>> exactly to construct that contingency table.
>>> Unfortunately, I don't have an answer for you about how "BEST" to
>>> interpret the MODE output.  It's an open area of research, and it's one
>>> reason why we wanted to distribute MODE to the community - to see how
>>> people use it.  And I sympathize with you that by simply averaging, most
>>> of the differences in object attributes get smeared away.  We saw the
>>> same thing when applying MODE to NCWF2 convective cells.
>>> So while I don't know the best way of interpreting the results, I can
>>> tell you what people have done in the past:
>>> (1) Chris Davis (NCAR) has used MODE to define objects without even
>>> paying attention to the matching.  Instead, he looked at the
>>> distributions of the object areas and intensities based on time of day.
>>> By comparing the forecast object distribution to the observation object
>>> distribution, he was able to diagnose some timing and intensity errors
>>> in the model.
>>> (2) Dave Ahijevych (NCAR) has used the MODE matching output to
>>> construct contingency tables as you've suggested.
>>> (3) There are several metrics you could look at for a single case.
>>> Please take a look at the sample R script I posted recently for
>>> post-processing the MODE output files:
>>> http://www.dtcenter.org/met/users/downloads/analysis_scripts.php.  The
>>> "mode_summary.R" script computes counts and areas of matched/unmatched
>>> objects and computes a metric we're calling MMI (Median of the Maximum
>>> Interest).  Listed at the bottom of this message is a description of
>>> MMI.
>>> (4) The issue of scale (radius and threshold) is a very important one.
>>> You may want to use MODE to analyze your data at multiple scales to
>>> determine at which scale the forecast is useful (you can use the
>>> neighborhood methods' Fractions Skill Score for this as well).  On that
>>> same scripts page, take a look at the "mode_quilt_plot.R" Rscript and
>>> the "sample_PDF_file" that it creates.  One thing you can do is use many
>>> different choices of radii and thresholds and plot metrics for each
>>> scale.
>>> Hopefully that'll help.  I'm forwarding this message to the scientists
>>> in our group to see if they have any additional advice.
>>> Thanks,
>>> John
>>>
>>> MMI stands for "Median of the Maximum Interest".  Take a look at the
>>> example PostScript file here:
>>> .../METv2.0/out/mode/mode_RH_ISBL_500_vs_RH_ISBL_500_120000L_20050807_
>>> 120000V_000000A.ps
>>>
>>> In this example, MMI is computed as follows:
>>> - There are 8 simple forecast objects and 12 simple observation objects.
>>> - For each one of those objects, the maximum interest is computed by
>>>   looking at the interest values for all simple object pairs in which
>>>   it's a member.
>>> - So we have one maximum interest value for each object: 8 forecast
>>>   values and 12 observation values.
>>> - If an object is way off by itself and wasn't compared to any other
>>>   objects because it's so far away, its maximum interest value is 0.
>>> - Next we compute MMI 3 ways:
>>>   (1) MMI with respect to forecast objects is the median of those 8
>>>       forecast maximum interest values (MMI w.r.t. fcst = 0.8800).
>>>   (2) MMI with respect to observation objects is the median of those 12
>>>       observation maximum interest values (MMI w.r.t. obs = 0.8723).
>>>   (3) MMI with respect to all objects is the median of all 20 forecast
>>>       and observation maximum interest values (MMI w.r.t. all = 0.8765).
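>>> The steps above can be sketched in a few lines (the maximum interest
>>> values below are hypothetical, not the ones from this example):

```python
from statistics import median

# Sketch of the MMI ("Median of the Maximum Interest") computation described
# above.  The values are hypothetical; in practice each one is the maximum
# interest over all simple object pairs in which that object is a member.

# Max interest per simple forecast object (an isolated object gets 0.0)
fcst_max_int = [0.95, 0.91, 0.88, 0.88, 0.85, 0.90, 0.0, 0.93]
# Max interest per simple observation object
obs_max_int = [0.92, 0.89, 0.87, 0.88, 0.86, 0.90,
               0.84, 0.0, 0.91, 0.85, 0.88, 0.87]

mmi_fcst = median(fcst_max_int)                # MMI w.r.t. forecast objects
mmi_obs = median(obs_max_int)                  # MMI w.r.t. observation objects
mmi_all = median(fcst_max_int + obs_max_int)   # MMI w.r.t. all objects

print(f"MMI fcst={mmi_fcst:.4f}  obs={mmi_obs:.4f}  all={mmi_all:.4f}")
```

>>> Note that an unmatched object's 0.0 drags the median down, so MMI
>>> penalizes misses and false alarms as well as poor matches.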
>>>
>>> Case, Jonathan (MSFC-VP61)[Other] wrote:
>>>
>>>> Hi John,
>>>>
>>>> I'm still not entirely sure what can be done with these contingency
>>>> stats in MODE output.
>>>> It seems from your description below that the FCST/OBS objects must
>>>> overlap at a particular grid point in order to receive a "hit".  At
>>>> face value, these contingency stats do not appear to provide much added
>>>> value beyond traditional or neighborhood stats.
>>>> So, is there a recommended way to measure the "skill" of these
>>>> matched/unmatched objects besides summarizing and comparing relatively
>>>> obscure object attributes such as complexity ratio, convex hull
>>>> distance, etc.?  I have found that the mean object attributes are VERY
>>>> similar to one another when computed over a series of many forecasts.
>>>> The most obvious method (which I've done so far) is to simply compare
>>>> the matched and unmatched object areas using -bycase, in order to see
>>>> which one has more hits and fewer false alarms.
>>>> Also, some object attributes are inherently dependent on the
>>>> configurable parameters set prior to running MODE.  For example, I
>>>> found that by reducing the obs/fcst_conv_radius (I ended up using a
>>>> value of "3"), MODE produced a more stringent, realistic matching of
>>>> the objects.  When the radii are reduced, the mean CENTROID_DIST is
>>>> subsequently reduced as well because objects must be closer to one
>>>> another in order to be matched.  [Motivation: If these radii are too
>>>> large, then convective precipitation systems on the west coast of
>>>> Florida would be paired up with convection on the east coast of the
>>>> Florida peninsula, which to a forecaster wouldn't be considered a
>>>> "hit".]
>>>>
>>>> So, as you can see, I'm still trying to get my head around how best to
>>>> summarize the MODE output.
>>>> Thanks for your insight, as always!
>>>> Jonathan
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: John Halley Gotway [mailto:johnhg at rap.ucar.edu]
>>>>> Sent: Monday, April 27, 2009 9:58 AM
>>>>> To: Case, Jonathan (MSFC-VP61)[Other]
>>>>> Cc: met_help at ucar.edu
>>>>> Subject: Re: [Met_help] mode_analysis question
>>>>>
>>>>> Jonathan,
>>>>>
>>>>> Actually, the counts in those files have nothing to do with the
>>>>> matching that was performed by MODE.  The intent of the file is to
>>>>> give you an easy way of seeing what you'd get with a traditional
>>>>> verification of the input fields as opposed to an object-based
>>>>> verification.  Let me point you to the first paragraph of section
>>>>> 6.3.3 of the MET User's Guide for a description of these lines.
>>>>> Here's some more detail based on the contents of the "FIELD" column.
>>>>>
>>>>> FIELD Column...
>>>>> (1) RAW: Apply any masking of bad data, grid masking, or polygon
>>>>> matching to the raw fields.  Threshold the raw fields using the
>>>>> "fcst_conv_thresh" and "obs_conv_thresh" config values to define 0/1
>>>>> fields.  Compute the contingency table counts by comparing these 0/1
>>>>> fields grid point by grid point.
>>>>>
>>>>> (2) FILTER: Apply any masking of bad data, grid masking, or polygon
>>>>> matching to the raw fields.  In addition, apply the "fcst_raw_thresh"
>>>>> and "obs_raw_thresh" to filter out any additional values.
>>>>> Then use the "fcst_conv_thresh" and "obs_conv_thresh" config values
>>>>> to define 0/1 fields, and compute a contingency table from them.
>>>>>
>>>>> (3) OBJECT: Once objects have been defined in the forecast and
>>>>> observation fields, consider any grid point inside of an object to
>>>>> have a value of 1 and any grid point outside of an object to have a
>>>>> value of 0.  Compute the contingency table counts by comparing these
>>>>> 0/1 fields grid point by grid point.
>>>>>
>>>>> So really object matching has nothing to do with it.
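>>>>> As a minimal sketch of the grid-point comparison described above (the
>>>>> tiny 0/1 fields below are made up, not MODE output):

```python
# Sketch of computing contingency table counts (FY_OY, FY_ON, FN_OY, FN_ON)
# by comparing 0/1 forecast and observation fields grid point by grid point,
# as described for the RAW/FILTER/OBJECT lines above.  The fields here are
# made-up illustrative data.
fcst = [[1, 1, 0],
        [0, 1, 0],
        [0, 0, 0]]
obs = [[1, 0, 0],
       [0, 1, 1],
       [0, 0, 0]]

counts = {"FY_OY": 0, "FY_ON": 0, "FN_OY": 0, "FN_ON": 0}
for frow, orow in zip(fcst, obs):
    for f, o in zip(frow, orow):
        # FY_OY = hit, FY_ON = false alarm, FN_OY = miss, FN_ON = correct null
        key = ("FY" if f else "FN") + "_" + ("OY" if o else "ON")
        counts[key] += 1

print(counts)
```

>>>>> The same loop applies to all three FIELD lines; only how the 0/1
>>>>> fields are defined (thresholded raw, filtered, or object coverage)
>>>>> changes.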
>>>>>
>>>>> Make sense?
>>>>>
>>>>> John
>>>>>
>>>>>
>>>>>
>>>>> Case, Jonathan (MSFC-VP61)[Other] wrote:
>>>>>
>>>>>> Hi John,
>>>>>>
>>>>>> You hit the nail on the head with #1.  I saw the counts in the MODE
>>>>>> contingency files (*_cts.txt) and was wondering if we could summarize
>>>>>> those in mode_analysis (to which the answer is no, as you said).
>>>>>>
>>>>>> It seems like we could fairly easily read in the contingency counts
>>>>>> in a shell script, with a little management of filenames.
>>>>>
>>>>>> I also wonder how those contingency files are calculated in the MODE
>>>>>> output.  Is any forecast object that matches an observed object
>>>>>> considered a "hit", and then the individual grid points within that
>>>>>> object are added to the total of FY_OY?  And similarly with the
>>>>>> unmatched forecast/observed objects, are their grid points summed to
>>>>>> be FY_ON and FN_OY?  It seems like it could get a bit unclear when
>>>>>> dealing with FCST/OBS objects that don't overlap when trying to
>>>>>> determine FY_ON, FN_OY, and FN_ON.  Perhaps you could elaborate for
>>>>>> me so I can make more sense of the MODE contingency stat files?
>>>>>
>>>>>> Thanks for your help,
>>>>>> Jonathan
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: John Halley Gotway [mailto:johnhg at rap.ucar.edu]
>>>>>>> Sent: Monday, April 27, 2009 8:43 AM
>>>>>>> To: Case, Jonathan (MSFC-VP61)[Other]
>>>>>>> Cc: met_help at mailman.ucar.edu
>>>>>>> Subject: Re: [Met_help] mode_analysis question
>>>>>>>
>>>>>>> Jonathan,
>>>>>>>
>>>>>>> I'm a little unclear exactly what you're asking here.  It could be
>>>>>>> one of two things... but the answer to both is no.
>>>>>>>
>>>>>>> The MODE-Analysis tool currently performs 2 types of jobs by
>>>>>>> reading the MODE object statistics files:
>>>>>>> (1) A "summary" job in which you select one or more MODE output
>>>>>>> columns of interest and it summarizes the data in those columns.
>>>>>>> (2) A "bycase" job that produces summary information for each MODE
>>>>>>> run consisting of counts and areas of matched and unmatched objects.
>>>>>>>
>>>>>>> Let me try to understand exactly what you're asking, though.  Are
>>>>>>> you asking:
>>>>>>> (1) Can MODE-Analysis read those MODE contingency table statistics
>>>>>>> output files (*_cts.txt) and aggregate them across many cases?
>>>>>>> (2) Can MODE-Analysis read the MODE object statistics file, treat
>>>>>>> the matched/unmatched object counts or areas as the elements of a
>>>>>>> contingency table, and derive some contingency table statistics
>>>>>>> based on that "object matching" contingency table?
>>>>>>>
>>>>>>> Like I said, the answer to both is no.  People here have done (2)
>>>>>>> before, but it's still kind of an open question how best to use
>>>>>>> the object counts or areas to populate the elements of a
>>>>>>> contingency table.  You have to decide how to define the hits,
>>>>>>> misses, false alarms, and correct nulls.  And doing so gets a bit
>>>>>>> messy - especially defining the correct nulls.
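>>>>>>> As one possible illustration of (2) - not an official MET
>>>>>>> definition - the object counts could feed a CSI, which sidesteps
>>>>>>> the correct-null problem entirely:

```python
# Sketch of one possible way to populate a contingency table from MODE
# object counts, as discussed above.  This is NOT an official MET
# definition; in particular there is no natural "correct null" count for
# objects, which is exactly the messy part mentioned above, so a score
# without that cell (CSI) is used.  The counts are hypothetical.
n_matched_fcst = 6    # matched forecast objects   -> hits
n_unmatched_fcst = 2  # unmatched forecast objects -> false alarms
n_unmatched_obs = 3   # unmatched observed objects -> misses

hits, false_alarms, misses = n_matched_fcst, n_unmatched_fcst, n_unmatched_obs
csi = hits / (hits + misses + false_alarms)  # CSI needs no correct-null term
print(round(csi, 3))  # -> 0.545
```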
>>>>>>>
>>>>>>> Thanks,
>>>>>>> John
>>>>>>>
>>>>>>> Case, Jonathan (MSFC-VP61)[Other] wrote:
>>>>>>>
>>>>>>>> Dear Met_help,
>>>>>>>>
>>>>>>>> Does mode_analysis let the user summarize contingency statistics
>>>>>>>> of the paired objects, or can it only summarize the various object
>>>>>>>> attributes?
>>>>>>>
>>>>>>>> Thanks for the help,
>>>>>>>> Jonathan
>>>>>>>>
>>>>>>>> ***********************************************************
>>>>>>>> Jonathan Case, ENSCO, Inc.
>>>>>>>> Aerospace Sciences & Engineering Division Short-term Prediction
>>>>>>>> Research and Transition Center 320 Sparkman Drive, Room 3062
>>>>>>>> Huntsville, AL 35805-1912
>>>>>>>> Voice: (256) 961-7504   Fax: (256) 961-7788
>>>>>>>> Emails: Jonathan.Case-1 at nasa.gov
>>>>>>>>              case.jonathan at ensco.com
>>>>>>>> ***********************************************************
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Met_help mailing list
>>>>>>>> Met_help at mailman.ucar.edu
>>>>>>>> http://mailman.ucar.edu/mailman/listinfo/met_help
>>>>>>>>
>>>
>> --
>> *************************************************************
>> Barbara G. Brown                 Phone: (303) 497-8468
>> NCAR/P.O. Box 3000               FAX: (303) 497-8386
>> Boulder CO 80307-3000 U.S.A.     e-mail: bgb at ucar.edu
> 

