[Met_help] grid_stat question on neighborhood verification

John Halley Gotway johnhg at rap.ucar.edu
Mon Mar 16 08:19:48 MDT 2009


Jonathan,

First, a note about the speed.  You're running a beta version of METv2.0, which had a performance issue that we have since discovered and fixed.  When you begin using the released version of
METv2.0, you'll find that it runs much faster.  We're using some new classes for generating the output ASCII files.  They're supposed to do a bit of book-keeping to figure out column widths and
formatting.  However, we realized that instead of doing a "little bit" of book-keeping, they were doing way too much of it!  With that fix, it runs much quicker.

Also, as you noted, turning off the correlation coefficients speeds it up, and setting the "n_boot_rep" to 0 to turn off bootstrapping speeds it up.

As for neighborhood methods, here's how it works:
(1) You define the raw threshold values in which you're interested using the "fcst_thresh" and "obs_thresh" parameters.
(2) You define the neighborhood sizes of interest using the "nbr_width" parameter.
(3) For each combination of raw threshold and neighborhood size, a fractional coverage field is computed.  For example, take the threshold ">=5.0" and a neighborhood size of 5.  For each grid point in the
forecast field, the raw value at that grid point is replaced by a fractional coverage value as follows.  A 5-by-5 box is drawn around the current grid point, and we count how many of those 25 points
have a value >=5.0.  Suppose 10 of them do; the fractional coverage value for that point is then 10/25, or 0.4.  The same process is applied to the observation field to compute its
fractional coverage field.
(4) The "nbr_threshold" is used in the computation of the fractional coverage fields to decide what to do with bad data values.  This determines the percentage of points that need to be valid in order
for a fractional coverage value to be computed.  Since it's set to 1, or 100%, all 25 of the neighborhood points have to be valid for a valid fractional coverage value to be computed.
(5) Now we have a fractional coverage field for the forecast and observation.  Those two fields can be compared directly to compute scores like the Fractions Brier Score and Fractions Skill Score in
the NBRCNT output line.
(6) Alternatively, you could threshold the fractional coverage fields to compute the NBRCTC and NBRCTS output lines.  The "nbr_frac_threshold" parameter sets which thresholds between 0
and 1 are applied to those fields.  In your case, you've chosen >=0.5, so a grid point counts as an event only when at least half of the points in its neighborhood meet the raw threshold.
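
To make steps (3) through (5) concrete, here is a minimal sketch in Python/NumPy.  It is not MET's actual code.  The function names are made up for illustration, bad data is assumed to be marked with NaN, points whose box runs off the edge of the grid are simply left as bad data, and when "valid_frac" is below 1 the coverage is assumed to be divided by the number of valid points.  The FBS and FSS formulas are the standard definitions (mean squared difference of the coverage fields, and one minus that divided by the sum of the mean squared coverages).

```python
import numpy as np

def fractional_coverage(field, thresh, width, valid_frac=1.0):
    """Fraction of points >= thresh in a width-by-width box around each point.

    Assumptions (not taken from MET): NaN marks bad data, edge points whose
    box would leave the grid stay NaN, and the fraction is taken over the
    valid points in the box.
    """
    half = width // 2
    ny, nx = field.shape
    out = np.full((ny, nx), np.nan)
    for j in range(half, ny - half):
        for i in range(half, nx - half):
            box = field[j - half:j + half + 1, i - half:i + half + 1]
            valid = ~np.isnan(box)
            # valid_frac plays the role of "nbr_threshold" in step (4).
            if valid.any() and valid.sum() / box.size >= valid_frac:
                out[j, i] = (box[valid] >= thresh).sum() / valid.sum()
    return out

def fbs_fss(fcst_cov, obs_cov):
    """Fractions Brier Score and Fractions Skill Score of two coverage fields."""
    ok = ~np.isnan(fcst_cov) & ~np.isnan(obs_cov)
    pf, po = fcst_cov[ok], obs_cov[ok]
    fbs = np.mean((pf - po) ** 2)
    ref = np.mean(pf ** 2) + np.mean(po ** 2)
    fss = 1.0 - fbs / ref if ref > 0 else np.nan
    return fbs, fss

# The example from step (3): 10 of the 25 points in a 5-by-5 box meet >=5.0.
fcst = np.zeros((5, 5))
fcst.flat[:10] = 6.0
cov = fractional_coverage(fcst, thresh=5.0, width=5)
print(cov[2, 2])  # 0.4
```

With "valid_frac" left at 1.0, a single NaN anywhere in the box makes the coverage value at that point bad data, which matches the behavior described in step (4).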

I'd suggest rerunning this case with multiple values for nbr_frac_threshold, like "gt0.0 ge0.25 ge0.50 ge0.75", and then seeing how the results change.  When doing this much processing on the
fields, the interpretation can get a bit confusing.  But for example, for a raw threshold >=5.0mm, neighborhood size of 5-by-5, and fractional coverage threshold of >0.0... you're really asking a question
like "When I forecast precip of >=5.0mm somewhere nearby (among the surrounding 25 grid points), does precip >=5.0mm actually occur anywhere nearby (among the surrounding 25 grid points)?".  Also, you may want to read up about the
Fractions Skill Score and interpretations of that.
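
To see why the choice of nbr_frac_threshold matters, here is a small Python/NumPy sketch of step (6).  Again, this is only an illustration and not MET's code: the function name, the tiny threshold-string parser, and the four made-up coverage values are all assumptions.  It thresholds a forecast and an observed fractional coverage field and counts the contingency table cells for each of the suggested thresholds.

```python
import operator
import numpy as np

# Hypothetical parser for threshold strings such as "gt0.0" or "ge0.50".
OPS = {"gt": operator.gt, "ge": operator.ge}

def contingency_counts(fcst_cov, obs_cov, thresh):
    """Threshold two fractional coverage fields and count the 2x2 table cells."""
    op, value = OPS[thresh[:2]], float(thresh[2:])
    ok = ~np.isnan(fcst_cov) & ~np.isnan(obs_cov)  # skip bad data (NaN)
    f = op(fcst_cov[ok], value)
    o = op(obs_cov[ok], value)
    return {
        "hits": int(np.sum(f & o)),
        "false_alarms": int(np.sum(f & ~o)),
        "misses": int(np.sum(~f & o)),
        "correct_negatives": int(np.sum(~f & ~o)),
    }

# Made-up coverage values at four grid points, for illustration only.
fcst_cov = np.array([0.00, 0.04, 0.40, 0.80])
obs_cov = np.array([0.00, 0.30, 0.08, 0.60])

for t in ["gt0.0", "ge0.25", "ge0.50", "ge0.75"]:
    print(t, contingency_counts(fcst_cov, obs_cov, t))
```

In this toy data, "gt0.0" gives three hits and no misses or false alarms, because any event anywhere in the neighborhood counts, while the stricter thresholds turn some of those points into misses, false alarms, or correct negatives.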

For more information on methods and statistics, and for interpretation of results, let me refer you to Tressa Fowler, tressa at ucar.edu, the statistician who's leading the development of MET.

Good luck,
John

Case, Jonathan (MSFC-VP61)[Other] wrote:
> Hello John,
> 
>  
> 
> I must not be applying the neighborhood verification method properly because of the preliminary numbers I'm seeing.  
> 
> I've looked over a few different sets of output, and so far the neighborhood verification numbers are nearly the same as the standard CTS numbers for various thresholds of precipitation.   I was under the impression that applying a neighborhood box would relax the stringency of the verification so that the categorical and skill scores would be improved.  However, that doesn't seem to be the case.  
> 
>  
> 
> Here is what I applied for my 4-km grid results: (for 3-hour accumulated precipitation)
> 
>  
> 
> ·         fcst_thresh[] = [ "ge5 ge10 ge25" ]; 
> 
> ·         obs_thresh[]  = [ "ge5 ge10 ge25" ];
> 
> ·         nbr_width[] = [ 5, 13 ];  -> corresponding to ~20km and 50km neighborhood "boxes"
> 
> ·         nbr_threshold = 1.0;
> 
> ·         nbr_frac_threshold[] = [ "ge0.5" ];
> 
>  
> 
> I'm wondering whether the nbr_frac_threshold needs to be reduced/relaxed?  I don't fully understand what this parameter does and how to set it effectively based on the neighborhood width values.  
> 
>  
> 
> Let's first see if my interpretation is correct.  If I set nbr_frac_threshold to "ge0.5", does this mean that at least 50% of all the neighborhood grid points have to meet or exceed the list of fcst/obs_thresh[] in order for a "hit" to occur?  I was thinking initially that if ANY grid point in the OBS/FCST paired neighborhood meets or exceeds the thresholds, then it should be considered a hit.  But that doesn't appear to be the case the way I have configured this run.  
> 
>  
> 
> Finally, on a side note, the grid_stat program is running *extremely* slowly, even with the correlation coefficients turned off.  Granted, I am running on the large STIV grid, but I am masking it based on a WRF.poly file I created only over the SE U.S., amounting to about a 350x350 grid.  I ran through only half a month's worth of control+experimental forecasts over this past weekend.   It takes several minutes to get through just one set of forecast/observation calculations at a single output time.  Is there a way to optimize the grid_stat program, or is this program known to run very slowly because of the number of computations being made?
> 
>  
> 
> Thanks again for the help,
> 
> Jonathan 
> 
>  
> 
>  
> 
> *********************************************************** 
> Jonathan Case, ENSCO, Inc. 
> Aerospace Sciences & Engineering Division 
> Short-term Prediction Research and Transition Center 
> 320 Sparkman Drive, Room 3062 
> Huntsville, AL 35805-1912 
> Voice: (256) 961-7504   Fax: (256) 961-7788 
> Emails: Jonathan.Case-1 at nasa.gov
> 
>              case.jonathan at ensco.com
> 
> ***********************************************************
> 
>  
> 
> 

