[Met_help] [rt.rap.ucar.edu #54891] History for point_stat error when using -obs_valid_beg / -obs_valid_end in METv3.1

John Halley Gotway via RT met_help at ucar.edu
Tue Apr 10 13:59:26 MDT 2012


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Hello again Paul/methelp,

It appears that the -obs_valid_beg and -obs_valid_end options are not working properly in the public METv3.1.
Once I removed the options, the point_stat program ran to completion.

Could you please investigate a correction so that I can use these options again?
                "-obs_valid_beg time" in YYYYMMDD[_HH[MMSS]] sets the beginning of the matching time window (optional).
                "-obs_valid_end time" in YYYYMMDD[_HH[MMSS]] sets the end of the matching time window (optional).

Thank you,
Jonathan

-----Original Message-----
From: Paul Oldenburg via RT [mailto:met_help at ucar.edu]
Sent: Thursday, March 01, 2012 2:50 PM
To: Case, Jonathan (MSFC-ZP11)[ENSCO INC]
Cc: tcram at ucar.edu
Subject: Re: [rt.rap.ucar.edu #52626] RE: Help with ds337.0

Jonathan,

I developed a patch for handling point observations in MET that will allow the user to optionally throw out duplicate observations.  The process that I implemented does not use the observation with the timestamp closest to the forecast valid time as you suggested, because of complications in how the code handles obs.  Then, when I thought about this, it occurred to me that it doesn't matter because the observation is a duplicate anyway.  (right?)

A duplicate observation is defined as an observation with identical message type, station id, grib code (forecast field), latitude, longitude, level, height and observed value to an observation that has already been processed.

Please deploy the latest MET patches and then the attached patch:

1. Deploy latest METv3.1 patches from http://www.dtcenter.org/met/users/support/known_issues/METv3.1/index.php
2. Save attached tarball to base MET directory
3. Untar it, which should overwrite four source files
4. Run 'make clean' and then 'make'
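
In terms of commands, steps 2-4 amount to roughly the following (the tarball name is a placeholder for the attached file):

    cd /path/to/METv3.1                    # base MET directory
    tar -xvf met_point_obs_patch.tar       # placeholder name; should overwrite four source files
    make clean
    make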

When this is complete, you should notice a new command-line option for point_stat: -unique.  When you use this, point_stat should detect and throw out duplicate observations.  If your verbosity level is set to 3 or higher, it will report which observations are being thrown out.  Please test this and let me know if you have any trouble or if it does not work in the way that you expected.
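
For example (again with placeholder file names):

    point_stat fcst.grb gdas_obs.nc PointStatConfig -unique -v 3

At verbosity 3 or higher, the discarded duplicates will be listed in the log output.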

Thanks,

Paul


On 02/13/2012 03:10 PM, Case, Jonathan[ENSCO INC] via RT wrote:
>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=52626>
>
> Hello John/Tim/Methelp,
>
> I finally got back into looking at this issue with duplicate obs showing up in the PB2NC output, resulting in duplicate fcst-obs pairs being processed by point_stat.
>
> I found a single obs site in Central America that is generating a problem on 1 Oct 2011 at 12z (stid "MSSS").
> I processed ONLY this obs through pb2nc to see what the result is in the netcdf file.
>
> Here is what I see (from an ncdump of the netcdf file):
> .
> .
> .
> data:
>
>   obs_arr =
>    0, 33, 942.5, -9999, 0,
>    0, 34, 942.5, -9999, -1,
>    0, 32, 942.5, -9999, 1,
>    1, 51, 942.5, 619.9501, 0.016576,
>    1, 11, 942.5, 619.9501, 297.15,
>    1, 17, 942.5, 619.9501, 293.9545,
>    1, 2, 942.5, 619.9501, 101349.7,
>    2, 33, 942.5, -9999, 0,
>    2, 34, 942.5, -9999, -1,
>    2, 32, 942.5, -9999, 1,
>    3, 51, 942.5, 619.9501, 0.016576,
>    3, 11, 942.5, 619.9501, 297.15,
>    3, 17, 942.5, 619.9501, 293.9545,
>    3, 2, 942.5, 619.9501, 101349.7 ;
>
>   hdr_typ =
>    "ADPSFC",
>    "ADPSFC",
>    "ADPSFC",
>    "ADPSFC" ;
>
>   hdr_sid =
>    "MSSS",
>    "MSSS",
>    "MSSS",
>    "MSSS" ;
>
>   hdr_vld =
>    "20111001_115000",
>    "20111001_115000",
>    "20111001_115501",
>    "20111001_115501" ;
>
>   hdr_arr =
>    13.7, -89.12, 621,
>    13.7, -89.12, 621,
>    13.7, -89.12, 621,
>    13.7, -89.12, 621 ;
> }
>
> So, from what I can tell, the station is reporting the same obs at 2
> different times,
> 1150(00) UTC and 1155(01) UTC.  Do you have any recommendation on how I can retain only one of these obs, preferably the one closest to the top of the hour?  I know I could dramatically narrow down the time window (e.g. +/- 5 min), but I suspect this would likely miss out on most observations that report about 10 minutes before the hour.
>
> I value your feedback on this matter.
> Sincerely,
> Jonathan
>
> -----Original Message-----
> From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
> Sent: Thursday, January 19, 2012 1:40 PM
> To: Case, Jonathan (MSFC-VP61)[ENSCO INC]
> Cc: tcram at ucar.edu
> Subject: Re: [rt.rap.ucar.edu #52626] RE: Help with ds337.0
>
> Jonathan,
>
> OK, I reran my analysis using the setting you suggested:
>      in_report_type[] = [ 512, 522, 531, 540, 562 ];
>
> Here's what I see:
>
>     - For qm=2, there are 29 locations, all with unique header information, and all with unique lat/lons.
>       It looks like the station id's are all alphabetical.  So the "in_report_type" setting has filtered out the numeric station id's.
>
>     - For qm=9, there are 57 locations - but only 29 of them have unique header information!
>
> So I'll need to look more closely at what PB2NC is doing here.  It looks like setting qm=9 really is causing duplicate observations to be retained.
>
> When I get a chance, I'll run it through the debugger to investigate.
>
> Thanks,
> John
>
>
> On 01/19/2012 12:33 PM, Case, Jonathan[ENSCO INC] via RT wrote:
>>
>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=52626>
>>
>> John,
>>
>> I noticed that even with specifying the input report types, there are still a few duplicate observations in the final netcdf dataset.
>> So, I'm seeing the same thing as in your analysis.
>>
>> -Jonathan
>>
>> -----Original Message-----
>> From: John Halley Gotway via RT [mailto:met_help at ucar.edu]
>> Sent: Thursday, January 19, 2012 1:01 PM
>> To: Case, Jonathan (MSFC-VP61)[ENSCO INC]
>> Cc: tcram at ucar.edu
>> Subject: Re: [rt.rap.ucar.edu #52626] RE: Help with ds337.0
>>
>> Jonathan,
>>
>> I apologize for the long delay in getting back to you on this.  We've been scrambling over the last couple of weeks to finish up development on a new release.  Here's my recollection of what's going on with this issue:
>>
>>      - You're using the GDAS PrepBUFR observation dataset, but you're finding that PB2NC retains very few ADPSFC observations when you use a quality marker of 2.
>>      - We advised via MET-Help that the algorithm employed by NCEP in the GDAS processing sets most ADPSFC observations' quality marker to a value of 9.  NCEP does that to prevent those observations from being used in the data assimilation.  So the use of quality marker = 9 is more an artifact of the data assimilation process and not really saying anything about the quality of those observations.
>>      - When you switched to using a quality marker of 9 in PB2NC, you got many matches, but ended up with more "duplicate" observations.
>>
>> So is using a quality marker = 9 in PB2NC causing "duplicate" observations to be retained?
>>
>> I did some investigation on this issue this morning.  Here's what I did:
>>
>> - Retrieved this file: http://nomads.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gdas.20120112/gdas1.t12z.prepbufr.nr
>> - Ran it through PB2NC with message type = ADPSFC, time window = +/- 0 seconds, and quality markers of 2 and 9.
>> - For both, I used the updated version of the plot_point_obs tool to create a plot of the data and dump header information about the points being plotted.
>> - I also used the -dump option for PB2NC to dump all of the ADPSFC observations to ASCII format.
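>>
>> In command form, those steps were roughly as follows (the output, dump, and config file names are illustrative; the config files carry the ADPSFC, time window, and quality marker settings described above):
>>
>>     wget http://nomads.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gdas.20120112/gdas1.t12z.prepbufr.nr
>>     pb2nc gdas1.t12z.prepbufr.nr gdas_adpsfc_qm2.nc PB2NCConfig_qm2 -dump dump_qm2 -v 2
>>     pb2nc gdas1.t12z.prepbufr.nr gdas_adpsfc_qm9.nc PB2NCConfig_qm9 -dump dump_qm9 -v 2
>>     plot_point_obs gdas_adpsfc_qm2.nc adpsfc_qm2.ps
>>     plot_point_obs gdas_adpsfc_qm9.nc adpsfc_qm9.ps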
>>
>> I've attached several things to this message:
>> - The postscript output of plot_point_obs for qm = 2 and qm = 9, after first converting to png format.
>> - The output from the plot_point_obs tool for both runs.
>>
>> For qm=2, there were 51 locations plotted in your domain.
>>      - Of those 51...
>>         - All 51 header entries are unique.
>>         - There are only 36 unique combinations of lat/lon.
>> For qm=9, there were 101 locations plotted in your domain.
>>      - Of those 101...
>>         - There are only 52 unique header entries.
>>         - There are only 37 unique combinations of lat/lon.
>>
>> I think there are two issues occurring here:
>>
>> (1) When using qm=2, you'll often see two observing locations that look the same except for the station ID.  For example:
>>     [ ADPSFC, 78792, 20120112_120000, 9.05, -79.37, 11 ]
>>     [ ADPSFC, MPTO,  20120112_120000, 9.05, -79.37, 11 ]
>>
>> I looked at the observations that correspond to these and found that they do actually differ slightly.
>>
>> (2) The second, larger issue here is when using qm=9.  It does appear that we're really getting duplicate observations.  For example:
>>     [ ADPSFC, 78792, 20120112_120000, 9.05, -79.37, 11 ]
>>     [ ADPSFC, 78792, 20120112_120000, 9.05, -79.37, 11 ]
>>
>> This will likely require further debugging of the PB2NC tool to figure out what's going on.
>>
>> I just wanted to let you know what I've found so far.
>>
>> Thanks,
>> John Halley Gotway
>>
>>
>>
>> On 01/13/2012 03:50 PM, Case, Jonathan[ENSCO INC] via RT wrote:
>>>
>>> Fri Jan 13 15:50:08 2012: Request 52626 was acted upon.
>>> Transaction: Ticket created by jonathan.case-1 at nasa.gov
>>>           Queue: met_help
>>>         Subject: RE: Help with ds337.0
>>>           Owner: Nobody
>>>      Requestors: jonathan.case-1 at nasa.gov
>>>          Status: new
>>>     Ticket<URL:
>>> https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=52626>
>>>
>>>
>>> Hi Tom/MET help,
>>>
>>> Thanks for the fantastically quick reply, Tom!
>>>
>>> It turns out that I'm specifically referring to the netcdf output from the pb2nc program.
>>>
>>> I already sent a help ticket to the MET team, asking if they have a means for removing the duplicate obs from their PB2NC process.  At the time, they didn't refer to the history of the data as it undergoes QC, so this might help me track down the reason for the duplicate obs.  So, I have CC'd met_help on this email.
>>>
>>> It turns out that when I initially ran pb2nc with the default quality control flag set to "2" (i.e. quality_mark_thresh in the PB2NCConfig_default file), I did not get ANY surface observations in my final netcdf file over Central America.  Upon email exchanges with the MET team, it was recommended that I set the quality control flag to "9" to be able to accept more observations into the netcdf outfile.
>>>
>>> From what it sounds like, I need to better understand what the "happy medium" should be in setting the quality_mark_thresh flag in pb2nc.  2 is too restrictive, while 9 appears to be allowing duplicate observations into the mix as a result of the QC process.
>>>
>>> Any recommendations are greatly welcome!
>>>
>>> Thanks much,
>>> Jonathan
>>>
>>>
>>> From: Thomas Cram [mailto:tcram at ucar.edu]
>>> Sent: Friday, January 13, 2012 4:39 PM
>>> To: Case, Jonathan (MSFC-VP61)[ENSCO INC]
>>> Subject: Re: Help with ds337.0
>>>
>>> Hi Jonathan,
>>>
>>> the only experience I have working with the MET software is using the pb2nc utility to convert PREPBUFR observations into a NetCDF dataset, so my knowledge of MET is limited.  However, the one reason I can think of for the duplicate observations is that you're seeing the same observation after several stages of quality-control pre-processing.  The PREPBUFR files contain a complete history of the data as it's modified during QC, so each station will have multiple reports at a single time.  There's a quality control flag appended to each PREPBUFR message; you want to keep the observation with the lowest QC number.
>>>
>>> Can you send me the date and time for the examples you list below?  I'll take a look at the PREPBUFR messages and see if this is the case.
>>>
>>> If this doesn't explain it, then I'll forward your question on to MET support desk and see if they know the reason for duplicate observations.  They are intimately familiar with the PREPBUFR obs, so I'm sure they can help you out.
>>>
>>> - Tom
>>>
>>> On Jan 13, 2012, at 3:16 PM, Case, Jonathan (MSFC-VP61)[ENSCO INC] wrote:
>>>
>>>
>>> Dear Thomas,
>>>
>>> This is Jonathan Case of the NASA SPoRT Center (http://weather.msfc.nasa.gov/sport/) in Huntsville, AL.
>>> I am conducting some weather model verification using the MET verification software (NCAR's Model Evaluation Tools) and the NCEP GDAS PREPBUFR point observation files for ground truth.  I have accessed archived GDAS PREPBUFR files from NCAR's repository at http://dss.ucar.edu/datasets/ds337.0/ and have begun producing difference stats over Central America between the model forecast and observations obtained from the PREPBUFR files.
>>>
>>> Now here is the interesting part:  When I examined the textual difference files generated by the MET software, I noticed that there were several stations with "duplicate" observations that led to duplicate forecast-observation difference pairs.  I put duplicate in quotes because the observed values were not necessarily the same but usually very close to one another.
>>> The duplicate observations arose from the fact that at the same observation location, there would be a 5-digit WMO identifier as well as a 4-digit text station ID at a given hour.
>>> I stumbled on these duplicate station data when I made a table of stations and mapped them, revealing the duplicates.
>>>
>>> Some examples I stumbled on include:
>>> *         78720/MHTG (both at 14.05N, 87.22W)
>>> *         78641/MGGT (both at 14.58N, 90.52W)
>>> *         78711/MHPL (both at 15.22N, 83.80W)
>>> *         78708/MHLM (both at 15.45N, 87.93W)
>>>
>>> There are others, but I thought I'd provide a few examples to start.
>>>
>>> If the source of the duplicates is NCEP/EMC, I wonder if it would be helpful to send them a note as well?
>>>
>>> Let me know how you would like to proceed.
>>>
>>> Most sincerely,
>>> Jonathan
>>>
>>> --------------------------------------------------------------------------------------------------
>>> Jonathan Case, ENSCO Inc.
>>> NASA Short-term Prediction Research and Transition Center (aka SPoRT Center)
>>> 320 Sparkman Drive, Room 3062, Huntsville, AL 35805
>>> Voice: 256.961.7504
>>> Fax: 256.961.7788
>>> Emails: Jonathan.Case-1 at nasa.gov<mailto:Jonathan.Case-1 at nasa.gov> / case.jonathan at ensco.com<mailto:case.jonathan at ensco.com>
>>> --------------------------------------------------------------------------------------------------
>>>
>>> "Whether the weather is cold, or whether the weather is hot, we'll weather
>>>      the weather whether we like it or not!"
>>>
>>>
>>> Thomas Cram
>>> NCAR / CISL / DSS
>>> 303-497-1217
>>> tcram at ucar.edu<mailto:tcram at ucar.edu>
>>>
>>>
>>>
>>
>
>




----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Re: [rt.rap.ucar.edu #54891] point_stat error when using -obs_valid_beg / -obs_valid_end in METv3.1
From: John Halley Gotway
Time: Mon Mar 05 11:31:48 2012

Jonathan,

You are right!  Sorry for the trouble and thanks for letting us know
about it.

I've posted a bugfix for the problem here:
    http://www.dtcenter.org/met/users/support/known_issues/METv3.1/index.php

In METv3.1, we standardized the processing of command line options
across all the MET tools.  In doing this, we made an error in the
function called for the "-obs_valid_end" command line option in
point_stat.  It's an easy one-line fix to call the correct function.

Please give this fix a shot and let us know if you have any more
problems in your use of METv3.1.

Thanks,
John


On 03/05/2012 10:10 AM, Case, Jonathan[ENSCO INC] via RT wrote:
>
> Mon Mar 05 10:10:02 2012: Request 54891 was acted upon.
> Transaction: Ticket created by jonathan.case-1 at nasa.gov
>         Queue: met_help
>       Subject: point_stat error when using -obs_valid_beg / -obs_valid_end in METv3.1
>         Owner: Nobody
>    Requestors: jonathan.case-1 at nasa.gov
>        Status: new
>   Ticket<URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=54891>

------------------------------------------------


More information about the Met_help mailing list