[Met_help] help in obtaining archived PREPBUFR observations

John Halley Gotway johnhg at rap.ucar.edu
Tue Jan 27 10:18:07 MST 2009


Jonathan,

Great, I'll get in touch with you hopefully in a few weeks to test out a beta release of METv2.0.

As for references for MODE, I've asked some of the scientists here for a list of what's been published.  However, I don't think you'll find exactly what you're looking for.  The intent with the
configurable parameters is to allow users to tune MODE to do what they'd like it to do.  And we've tried to provide good default values.

I count 65 parameters in the METv1.1 configuration file.  Based on my own experiences, here's some advice about a few of them:

- mask_missing_flag = 3;
Set mask_missing_flag to 3 so that missing data in the fcst or obs field doesn't lead to the definition of "false alarm" objects or "misses" that don't have any chance of matching.

There are other masking options you may choose if you'd like to limit the area in which objects are defined.  It's up to you to determine and define your region of interest with those parameters.

- fcst_conv_radius    = 5;
- obs_conv_radius     = 5;
- fcst_conv_threshold = "ge5.0";
- obs_conv_threshold  = "ge5.0";
These are the 4 most important parameters in my opinion.  They define the scale of objects that you'd like to look at.  You could set them to define very small, high-resolution objects or very large,
smooth objects.  You'll find, though, that MODE doesn't perform very well on high-resolution objects.  For example, if you resolve objects the size of convective cores, MODE will take a very LONG time
to run, and you may not like how it does the matching - it's too messy to understand what's going on.  If you define objects too large, you'll lose any useful information that was in the original
field.  I'd suggest trying to define "medium" sized objects.  For example, if you have a line of storms stretching from Chicago to St. Louis, they'd all be included in the same object.  MODE tends to
perform very well on objects defined at that scale.

Ultimately, you'll need to play around with the convolution radius and threshold to get the objects you like.  Try looking at the raw field, blurring your eyes, and picking out what objects you see.
Then choose settings that define those types of objects.

On a 4-km domain, I'd suggest trying convolution radii of 5, 10, and 15.  And the convolution threshold will depend on what type of field you're using.  Start by choosing a threshold that has some
meteorological meaning.  For example, perhaps precipitation > 5mm means something to you.
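
For example, a starting point on a 4-km precipitation grid might look like the following (these values are meant as a first guess to tune, not a recommendation, and the threshold assumes the field is
precipitation in mm):

  fcst_conv_radius    = 10;
  obs_conv_radius     = 10;
  fcst_conv_threshold = "ge5.0";
  obs_conv_threshold  = "ge5.0";

If the convolution radius is given in grid squares, a radius of 10 on a 4-km grid smooths over roughly 40 km, which is about the "medium" object scale described above.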

- fcst_area_threshold  = "ge0";
- obs_area_threshold   = "ge0";
Regardless of what convolution radii and thresholds you choose, you may end up defining some very small objects.  You could set these parameters to throw those away if you like.  Perhaps you're not
interested in objects that are smaller than 50 grid squares?  It's up to you.
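
For example (the values here are just illustrative), the following would discard any object covering fewer than 50 grid squares:

  fcst_area_threshold = "ge50";
  obs_area_threshold  = "ge50";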

- fcst_merge_flag = 1;
- obs_merge_flag = 1;
- match_flag = 1;
I suggest the settings above.  I really like how the "double thresholding" merging performs.  And it's very intuitive.  Also, setting match_flag to 1 gives the best performance in my opinion.
Basically, it gives the "benefit of the doubt" to the forecast.  It allows the definition of the objects in one field to inform the merging in the other field.  Others have criticized MODE for this,
but I see it as a strength.

- fcst_merge_threshold = "ge1.25";
- obs_merge_threshold  = "ge1.25";
If you use the double threshold merging approach, you need to set these to a reasonable value.  I'd suggest starting with them at "conv_threshold * 0.25" - with the "ge5.0" convolution threshold above, that works out to the "ge1.25" values shown here.  Then play with them from there.

So once you have your objects defined in the way that makes sense to you, and you're happy with the matching and merging logic you've selected, the next step would be to play with the fuzzy engine
weights.  Start by using the default settings.  But you can tweak them to get MODE to perform how you like.  Ultimately, it's up to you to decide what object attributes you consider to be most
important.  For example, you may decide that you only want objects that actually overlap to match... in that case, you dial up the int_area_ratio_weight.
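
As a sketch only (the number is made up - check the defaults in your configuration file and scale from there), emphasizing overlap would mean increasing that weight relative to the others, e.g.:

  int_area_ratio_weight = 4.0;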

- total_interest_threshold  = 0.7;
The last important parameter is the total_interest_threshold.  By default we have it set at 0.7, but you could turn it up or down.  Ultimately, it determines what's "close enough" to consider a match.
You should try keeping everything else the same for a single case while turning this up and down - the effect will be pretty evident.  I'd say that reasonable values are between 0.5 and 0.9.

Lastly, I'd suggest trying your settings on many cases.  You'll find that you can configure MODE to do exactly what you want for one case, but when you do, it "breaks" all of your other cases.  Any
settings change will have some effect, and you won't be able to get exactly the behavior you want every time.  The best you can hope for is to find settings that provide results you're pretty happy
with most of the time.

Good luck and feel free to write with more questions.

John


Case, Jonathan (MSFC)[] wrote:
> Dear John,
> 
> Thank you two-fold for (1) your very prompt response, and (2) the
> helpful information!  
> This is indeed useful to me, as the periods of record for my current
> experiments are FEB-AUG 2007 and JUN-AUG 2008.  I suppose for dates
> prior to DEC 2006, the MADIS archive and some type of conversion script
> would be the way to go.   
> 
> *I have somewhat limited experience with wget, so I'll definitely read
> up more on its capabilities.  I appreciate you reminding me of the wget
> command.  
> 
> *I could certainly be a beta tester if you'd like; however, we're still
> pretty green in using MET, so we might not know what improvements to
> look for!   
> 
> *Finally, I would like to take advantage of the MODE object-oriented
> technique for verifying convective precip over the SE U.S. for my
> JUN-AUG 2008 project.  Right now, I'm pretty overwhelmed with all the
> tunable parameters, object merging, and output parameters.  I'd like to
> make sense of how best to tune the input parameters based on the type of
> object verification we're interested in (fine-scale hourly to 3-hourly
> precip from daily 4-km explicit WRF forecasts).  Besides the user's
> guide and tutorial, do you have any recommended publications that
> describe how best to tune MODE, and interpret the non-standard output
> statistics?  You probably will be hearing from me in the near future as
> we spin-up more on MODE.  I'd be more than happy to include you (or
> other MET personnel) as co-authors in future publications if you can
> help us configure MODE to produce some meaningful summary stats.  This
> is certainly the way to go in today's era of high-resolution models!
> 
> All the best,
> Jonathan
> 
>> -----Original Message-----
>> From: John Halley Gotway [mailto:johnhg at rap.ucar.edu]
>> Sent: Monday, January 26, 2009 4:14 PM
>> To: Case, Jonathan (MSFC)[]
>> Cc: met_help at mailman.ucar.edu
>> Subject: Re: [Met_help] help in obtaining archived PREPBUFR
>> observations
>>
>> Jonathan,
>>
>> I have good news and bad news for you.
>>
>> First, the data you're actually looking for are the files that contain
>> "prepbufr" in them - not simply "bufr".  And the GDAS prepbufr data is
>> stored in 6 hours chunks - 00Z, 06Z, 12Z, and 18Z.  I
>> believe each file contains +/- 3 hours of data around the time.  So
> the
>> 06Z file contains observations between 03Z and 09Z.
>>
>> Here's one of the files you're looking for:
>>
>> http://nomads.ncdc.noaa.gov/data/gdas/200803/20080330/gdas1.t06z.prepbufr.nr
>>
>> Each one of these prepbufr files contains all of the observation types
>> put together.  But unfortunately, you may find that this prepbufr
>> archive does not go back in time as far as you need.  I think
>> it's only available here back to 12/14/2006.  Does that work for you?
>>
>> Retrieving the data is actually pretty easy.  You can use the "wget"
>> unix command to grab a whole bunch of files from the web.  So you'd
>> just need to write a script to construct the full path for the
>> files you'd like to retrieve, save them to a file, and pass it to
>> wget.
>> For example, suppose a file named "my_prepbufr_files.txt" contains the
>> following 4 lines:
>>
>> http://nomads.ncdc.noaa.gov/data/gdas/200612/20061231/gdas1.t00z.prepbufr.nr
>> http://nomads.ncdc.noaa.gov/data/gdas/200612/20061231/gdas1.t06z.prepbufr.nr
>> http://nomads.ncdc.noaa.gov/data/gdas/200612/20061231/gdas1.t12z.prepbufr.nr
>> http://nomads.ncdc.noaa.gov/data/gdas/200612/20061231/gdas1.t18z.prepbufr.nr
>>
>> Run the following command to grab all those files:
>> wget -i my_prepbufr_files.txt
>>
>> It should be pretty straightforward to generate a list of files you'd
>> like.  And then you can just run the wget command overnight.
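>>
>> For example, a small shell loop along these lines (the date and list
>> file name are just placeholders to adjust) would build the list for one
>> day and fetch the files:
>>
>>   #!/bin/sh
>>   # Build the list of GDAS prepbufr URLs for one day (20061231 here)
>>   for hh in 00 06 12 18; do
>>     echo "http://nomads.ncdc.noaa.gov/data/gdas/200612/20061231/gdas1.t${hh}z.prepbufr.nr"
>>   done > my_prepbufr_files.txt
>>
>>   # Then download everything in the list
>>   wget -i my_prepbufr_files.txt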
>>
>> Since the observations are in 6-hour chunks, you'll need to run them
>> through the PB2NC tool to generate the 1-hour files you'd like.
>>
>> In METv1.1 (the current released version), you'd need to run PB2NC 6
>> times to generate the 6 1-hour files.  In METv2.0 (to be released in
>> Feb/March), you'd need to run it through PB2NC only once, and
>> then use command line arguments to Point-Stat to control the time range
>> of observations to be used for each Point-Stat run.
>>
>> If you'd like, in a couple of weeks, you'd be welcome to run the beta
>> version of METv2.0.  Having more people test it out is always a good
>> thing prior to a release.
>>
>> Hope this helps.  Let me know if you still have questions.
>>
>> John Halley Gotway
>> johnhg at ucar.edu
>>
>> Case, Jonathan (MSFC)[] wrote:
>>> Dear MET help,
>>>
>>>
>>>
>>> I'd like to get started in running the standard verification statistics
>>> programs such as point-stat.
>>>
>>> On your web site, you point to http://nomads.ncdc.noaa.gov/data/gdas/ as
>>> a source of archived PREPBUFR observations that are used in the GDAS.
>>> However, there are numerous files with the string "bufr", the data only
>>> go back to 2006, and it would be cumbersome to download each file needed
>>> for verification.
>>>
>>>
>>>
>>> Therefore, I'd like to ask what is the best way to obtain hourly or
>>> sub-hourly PREPBUFR surface observations that can be used in point-stat
>>> to compute typical surface verification, and what datasets should I be
>>> looking for?
>>>
>>>
>>>
>>> I appreciate your assistance!
>>> Jonathan
>>>
>>>
>>>
>>> ***********************************************************
>>> Jonathan Case, ENSCO, Inc.
>>> Aerospace Sciences & Engineering Division
>>> Short-term Prediction Research and Transition Center
>>> 320 Sparkman Drive, Room 3062
>>> Huntsville, AL 35805-1912
>>> Voice: (256) 961-7504   Fax: (256) 961-7788
>>> Emails: Jonathan.Case-1 at nasa.gov
>>>
>>>              case.jonathan at ensco.com
>>>
>>> ***********************************************************
>>>
>>>
>>>
>>>
>>>
>>>
>>>
> _______________________________________________
> Met_help mailing list
> Met_help at mailman.ucar.edu
> http://mailman.ucar.edu/mailman/listinfo/met_help

