[Met_help] [rt.rap.ucar.edu #51931] History for duplicate observation data in PB2NC

Tressa Fowler via RT met_help at ucar.edu
Fri Mar 9 13:20:53 MST 2012


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Dear MET help,

I am using PB2NC to convert the GDAS prepbufr observation files to netcdf obs files for use in point_stat.
One thing I have come across after running point_stat is that there are duplicate observations with different station IDs in the output mpr files.
I came across this when I tried to make a table of the station IDs of observations being used in verification over Central America by parsing the contents of a sample mpr text file.

Oddly enough, while the "duplicate" observations at a particular time have different station IDs (one with the 5-digit WMO ID and the other being a 4-character station ID) but identical lat/lon values, they often have slight differences in the data values for temperature, dewpoint, and sea level pressure, at least for a sample station we've queried.  These differences sometimes look like decimal truncation, while other times they are simply different in the tenths or hundredths place.   Also, the obs elevation shows up as "NA" for the wind variables, which doesn't seem right.

So, I'd like to see what you recommend for removing the "redundant" observations or picking the best one to use in verification.
There were perhaps 10% or more of the Central American surface observations that had these redundant observations in the sample dataset I worked with.
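
For reference, the station table and duplicate check described above can be sketched as a grouping on valid time, variable, and exact lat/lon. This is only a minimal sketch: it assumes the mpr lines have already been parsed into dictionaries, and the keys used here (sid, valid, var, lat, lon, obs) as well as the station IDs and values are hypothetical placeholders, not actual MET column names or real data.

```python
from collections import defaultdict


def find_exact_duplicates(records):
    """Return groups of records that share valid time, variable, and
    lat/lon but were reported under more than one station ID."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["valid"], rec["var"], rec["lat"], rec["lon"])
        groups[key].append(rec)
    # Keep only the groups that contain at least two distinct station IDs.
    return {
        key: recs
        for key, recs in groups.items()
        if len({r["sid"] for r in recs}) > 1
    }


# Made-up records for illustration only (hypothetical IDs and values).
records = [
    {"sid": "99901", "valid": "20111208_120000", "var": "TMP",
     "lat": 14.05, "lon": -87.22, "obs": 293.15},
    {"sid": "XXAA", "valid": "20111208_120000", "var": "TMP",
     "lat": 14.05, "lon": -87.22, "obs": 293.2},
    {"sid": "99902", "valid": "20111208_120000", "var": "TMP",
     "lat": 15.45, "lon": -87.93, "obs": 295.0},
]

dups = find_exact_duplicates(records)
for key, recs in dups.items():
    print(key, sorted(r["sid"] for r in recs))
```

Matching on exact lat/lon only catches the case described here, where both reports carry identical coordinates. It does not decide which of the two reports should be kept.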

Thank you for your time,
Jonathan

--------------------------------------------------------------------------------------------------
Jonathan Case, ENSCO Inc.
NASA Short-term Prediction Research and Transition Center (aka SPoRT Center)
320 Sparkman Drive, Room 3062
Huntsville, AL 35805
Voice: 256.961.7504
Fax: 256.961.7788
Emails: Jonathan.Case-1 at nasa.gov / case.jonathan at ensco.com
--------------------------------------------------------------------------------------------------

"Whether the weather is cold, or whether the weather is hot, we'll weather
  the weather whether we like it or not!"



----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: duplicate observation data in PB2NC 
From: Tressa Fowler
Time: Fri Dec 16 10:59:00 2011

Hi Jonathan,

This is a hard problem and we don't have an answer. It is pretty easy
for a person to tell on a case by case basis which observations are
duplicates, but trying to make up a set of criteria for automating is
a huge can of worms.

We have done some duplicate elimination in the past, using 'close
enough' measures for location and variable, but they are slow. I think
the better person to ask would be an expert in your particular
measurement area and type.
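
To illustrate what a 'close enough' check might look like, here is a minimal Python sketch. It is not the method we used and it is not part of MET. The record keys (sid, valid, var, lat, lon, obs), the tolerance values, and the "keep the first report seen" rule are all assumptions made up for the example, and choosing them well is exactly the case-dependent part.

```python
def is_close_enough(a, b, loc_tol=0.01, val_tol=0.5):
    """True if two records have the same valid time and variable, and
    their location and observed value agree within the tolerances.
    Tolerances are illustrative assumptions (degrees, variable units)."""
    return (
        a["valid"] == b["valid"]
        and a["var"] == b["var"]
        and abs(a["lat"] - b["lat"]) <= loc_tol
        and abs(a["lon"] - b["lon"]) <= loc_tol
        and abs(a["obs"] - b["obs"]) <= val_tol
    )


def drop_near_duplicates(records, loc_tol=0.01, val_tol=0.5):
    """Keep the first record of each near-duplicate set.
    Every record is compared against all records kept so far."""
    kept = []
    for rec in records:
        if not any(is_close_enough(rec, k, loc_tol, val_tol) for k in kept):
            kept.append(rec)
    return kept


# Made-up records for illustration only (hypothetical IDs and values).
obs_records = [
    {"sid": "99901", "valid": "20111208_120000", "var": "TMP",
     "lat": 14.05, "lon": -87.22, "obs": 293.15},
    {"sid": "XXAA", "valid": "20111208_120000", "var": "TMP",
     "lat": 14.05, "lon": -87.22, "obs": 293.2},   # near-duplicate of 99901
    {"sid": "99902", "valid": "20111208_120000", "var": "TMP",
     "lat": 15.45, "lon": -87.93, "obs": 295.0},   # different location
    {"sid": "XXBB", "valid": "20111208_120000", "var": "TMP",
     "lat": 14.05, "lon": -87.22, "obs": 298.0},   # same place, value too far off
]

kept = drop_near_duplicates(obs_records)
print([r["sid"] for r in kept])
```

The pairwise comparison grows with the square of the number of observations, which is one reason this kind of check is slow. The sketch also ignores longitude wrap-around and does nothing to pick the "better" of two reports.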

We have no plans to implement anything like this in our software,
primarily because it is so case dependent.

Let me know if you have further questions.

Tressa

On Thu Dec 08 11:51:24 2011, jonathan.case-1 at nasa.gov wrote:
> Dear MET help,
>
> I am using PB2NC to convert the GDAS prepbufr observation files to
> netcdf obs files for use in point_stat.
> One thing I have come across after running point_stat is that there
> are duplicate observations with different station IDs in the output
> mpr files.
> I came across this when I tried to make a table of the station IDs of
> observations being used in verification over Central America by
> parsing the contents of a sample mpr text file.
>
> Oddly enough, while the "duplicate" observations at a particular time
> have different station IDs (one with the 5-digit WMO ID and the
> other being a 4-character station ID) but identical lat/lon values,
> they often have slight differences in the data values for
> temperature, dewpoint, and sea level pressure, at least for a
> sample station we've queried.  These differences sometimes look
> like decimal truncation, while other times they are simply
> different in the tenths or hundredths place.   Also, the obs
> elevation shows up as "NA" for the wind variables, which doesn't
> seem right.
>
> So, I'd like to see what you recommend for removing the "redundant"
> observations or picking the best one to use in verification.
> There were perhaps 10% or more of the Central American surface
> observations that had these redundant observations in the sample
> dataset I worked with.
>
> Thank you for your time,
> Jonathan
>
> --------------------------------------------------------------------------------------------------
> Jonathan Case, ENSCO Inc.
> NASA Short-term Prediction Research and Transition Center (aka SPoRT Center)
> 320 Sparkman Drive, Room 3062
> Huntsville, AL 35805
> Voice: 256.961.7504
> Fax: 256.961.7788
> Emails: Jonathan.Case-1 at nasa.gov / case.jonathan at ensco.com
> --------------------------------------------------------------------------------------------------
>
> "Whether the weather is cold, or whether the weather is hot, we'll weather
>   the weather whether we like it or not!"
>



------------------------------------------------
Subject: RE: [rt.rap.ucar.edu #51931] duplicate observation data in PB2NC  
From: Case, Jonathan[ENSCO INC]
Time: Fri Dec 16 11:02:23 2011

Thanks for the response, Tressa.

This at least gives us something to go by.
We have thought about using the netcdf kitchen sink utility to modify
the resulting netcdf obs file if possible.  I haven't explored this
option in detail yet, however.

I suppose a broader question to ask is "why do these duplicate obs
show up in the NCEP PREPBUFR files"?

Regards,
Jonathan


-----Original Message-----
From: Tressa Fowler via RT [mailto:met_help at ucar.edu]
Sent: Friday, December 16, 2011 11:59 AM
To: Case, Jonathan (MSFC-VP61)[ENSCO INC]
Cc: bullock at ucar.edu; Srikishen, Jayanthi (MSFC-VP61)[Universities Space Research Association (USRA)]
Subject: [rt.rap.ucar.edu #51931] duplicate observation data in PB2NC


------------------------------------------------

