[Met_help] [rt.rap.ucar.edu #82430] History for time window discrepancies

Howard Soh via RT met_help at ucar.edu
Thu May 16 14:14:46 MDT 2019


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Hi, there,

I'm still having issues with discrepancies between VSDB and MET, and in
this case it has to do with the time window.

I'm running a current comparison and have been using 30 minutes as the time
window.  In editbufr, the time window is given in hundredths of an hour (so
it is set to 50 hundredths) and is compared to the value of DHR in the
observation (also in hundredths of an hour).  I just checked the code to
see if it is doing that correctly and it is.

In PB2NC, in the time window section, I set the window to -1800 and 1800,
because there are 1800 seconds in 30 minutes.  I believe the obs_window
values need to be set in seconds, is that correct?

OK, so when I do that, I get 232604 time-window rejections for PB2NC and
173316 time-window rejections for editbufr.  I don't know if this is because
editbufr checks the domain first, so reports rejected by domain aren't
rejected again for the time window, whereas PB2NC seems to do the time
window rejections first (which could account for the larger number of
rejections for PB2NC).

In all, editbufr retains 10878 reports and PB2NC retains 10120 reports,
which is still a discrepancy.  Both use the same sets of report types (in
this case 181, 281, 284, 187, and 287).  Both use the G212 domain to set
the domain limits.  Is there something else I can be looking at in PB2NC to
ensure that I am comparing apples to apples?

Also, I am curious:  PB2NC lists 44479 total observations retained or
derived.  What does this number represent?  Does that mean that the total
number of observations includes stuff like T and Q for the same station as
two observations?  Or is that something else?  That number doesn't appear
anywhere in what's retained in the editbufr report.  The 10120 in PB2NC is
of the same order as the 10878 in editbufr, so I am guessing those are the
numbers we should be comparing.  Can you confirm that?

Thanks for your assistance.

Perry


----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Re: [rt.rap.ucar.edu #82430] time window discrepancies
From: Howard Soh
Time: Thu Oct 19 17:28:07 2017


Question 1.

I believe the obs_window values need to be set in seconds, is that
correct?

Yes, the unit is seconds.
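
As a quick cross-check of the two conventions, here is a minimal sketch that
converts a window given in hundredths of an hour (the editbufr convention
described in the request) to the seconds used by obs_window. The function
name is made up for illustration; it is not part of editbufr or MET.

```python
def hundredths_of_hour_to_seconds(hundredths):
    """Convert a time window given in hundredths of an hour to seconds."""
    # One hour is 3600 s, so one hundredth of an hour is 36 s.
    return hundredths * 36


# A 30-minute half-width is 50 hundredths of an hour.
half_width = hundredths_of_hour_to_seconds(50)
print(-half_width, half_width)  # -1800 1800
```

So 50 hundredths of an hour and -1800/1800 seconds describe the same
30-minute window on either side of the time center.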

Question 2.

Does that mean that the total number of observations includes stuff like
T and Q for the same station as two observations?

Yes, that's correct. The variables are defined in the configuration file:
- obs_bufr_var for 6.1
- obs_grib_code for 6.0 or earlier versions

For example:
obs_grib_code = [ "SPFH", "TMP",  "HGT",  "UGRD", "VGRD",
                  "DPT",  "WIND", "RH",   "MIXR" ];
==>
The first five variables come from the PrepBufr file and the other four
variables are derived.

In this case (9 variables), the maximum total number of observations
(if none are rejected) is:
"Total PrepBufr Messages retained" * 9 * "the number of vertical levels"
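
The upper bound above can be sketched in a few lines. This is only an
illustration of the formula, not PB2NC code; the function name and the
sample message and level counts are hypothetical.

```python
# Variable list copied from the obs_grib_code example above.
OBS_VARS = ["SPFH", "TMP", "HGT", "UGRD", "VGRD",
            "DPT", "WIND", "RH", "MIXR"]


def max_total_observations(messages_retained, n_levels, variables=OBS_VARS):
    """Upper bound on the observation count if nothing is rejected."""
    return messages_retained * len(variables) * n_levels


# Hypothetical input: 100 retained messages with 3 vertical levels each.
print(max_total_observations(100, 3))  # 2700
```

The actual "Total observations retained or derived" can only be equal to
or smaller than this bound, because individual observations may still be
dropped.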

FYI: The order of rejections (if configured):
     1. message type
     2. station_id
     3. valid time
     4. grid mask
     5. poly mask
     6. elevation
     7. PrepBufr report type
     8. input report type
     9. instrument type
     10. vertical level
     11. the pressure level is invalid
     12. the virtual temperature is invalid
     13. the observation value is invalid
     14. the quality mark is invalid
     15. the quality mark is greater than the quality mark threshold
     16. data level categories are listed in the configuration file and
         the observation's category is not in the list
     17. the associated GRIB code (or the variable index) is not in the
         configuration file
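
Because each message is counted only under the first check it fails, the
order of the checks changes the per-category rejection counts but not the
number of messages retained. Here is a minimal sketch of that effect. The
reports, the helper name, and the two checks are invented for illustration,
and the "domain first" order is only what the request describes for
editbufr, not something verified here.

```python
from collections import Counter


def tally(reports, checks):
    """Apply checks in order; count each rejected report under the
    first check it fails, and count reports that pass every check."""
    rejected = Counter()
    retained = 0
    for report in reports:
        for name, passes in checks:
            if not passes(report):
                rejected[name] += 1
                break
        else:
            retained += 1
    return rejected, retained


# Hypothetical reports: (seconds from the time center, inside the domain?)
reports = [(0, True), (600, False), (3600, True), (7200, False), (-900, True)]

in_window = ("valid time", lambda r: abs(r[0]) <= 1800)
in_domain = ("grid mask", lambda r: r[1])

time_first = tally(reports, [in_window, in_domain])    # order in the list above
domain_first = tally(reports, [in_domain, in_window])  # order described for editbufr

print(time_first)    # valid time: 2, grid mask: 1, retained: 2
print(domain_first)  # grid mask: 2, valid time: 1, retained: 2
```

The report that fails both checks is charged to whichever check runs
first, so the "valid time" counts of two tools are only comparable when
both apply their checks in the same order.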


I believe you already know about the additional debug output from the
"-v 2" option. Here is an example:

DEBUG 2: PrepBufr Time Center:          20120409_120000
DEBUG 2: Searching Time Window:         20120409_113000 to 20120409_123000
DEBUG 2: Processing 76306 PrepBufr messages...
...
DEBUG 2: Total PrepBufr Messages processed      = 76306
DEBUG 2: Rejected based on message type         = 0
DEBUG 2: Rejected based on station id           = 0
DEBUG 2: Rejected based on valid time           = 50115
DEBUG 2: Rejected based on masking grid         = 0
DEBUG 2: Rejected based on masking polygon      = 17244
DEBUG 2: Rejected based on elevation            = 0
DEBUG 2: Rejected based on pb report type       = 0
DEBUG 2: Rejected based on input report type    = 0
DEBUG 2: Rejected based on instrument type      = 0
DEBUG 2: Rejected based on zero observations    = 505
DEBUG 2: Total PrepBufr Messages retained       = 8442
DEBUG 2: Total observations retained or derived = 75771

"Total PrepBufr Messages processed" - "the sum of rejected counts" =
"Total PrepBufr Messages retained":
76306 - 50115 - 17244 - 505 = 8442 retained records

Possible observation count if the number of vertical levels is 1, using
the above configuration (9 variables):
8442 * 9 * 1 = 75,978

Total observations retained or derived = 75771
75,978 - 75,771 = 207   ==> 207 observations were NOT saved because of
an invalid value, an invalid quality mark, or the quality mark threshold.
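
The same bookkeeping can be checked automatically against a saved log.
The sketch below parses the non-zero summary lines from the example above
with a regular expression and repeats the arithmetic. The parsing helper
is hypothetical (it is not a MET utility), and the factors 9 and 1 are the
variable count and vertical level count assumed in this example.

```python
import re

LOG = """\
DEBUG 2: Total PrepBufr Messages processed      = 76306
DEBUG 2: Rejected based on valid time           = 50115
DEBUG 2: Rejected based on masking polygon      = 17244
DEBUG 2: Rejected based on zero observations    = 505
DEBUG 2: Total PrepBufr Messages retained       = 8442
DEBUG 2: Total observations retained or derived = 75771
"""


def parse_counts(log):
    """Map each 'DEBUG 2: <label> = <number>' line to {label: number}."""
    counts = {}
    for line in log.splitlines():
        match = re.match(r"DEBUG 2: (.+?)\s*=\s*(\d+)$", line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


counts = parse_counts(LOG)
rejected = sum(n for label, n in counts.items()
               if label.startswith("Rejected based on"))
processed = counts["Total PrepBufr Messages processed"]
retained = counts["Total PrepBufr Messages retained"]

# Messages processed minus all rejections must equal messages retained.
assert processed - rejected == retained

upper_bound = retained * 9 * 1  # 9 variables, 1 vertical level
not_saved = upper_bound - counts["Total observations retained or derived"]
print(rejected, retained, upper_bound, not_saved)  # 67864 8442 75978 207
```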

Cheers,
Howard

On 10/19/2017 9:47 AM, perry.shafran at noaa.gov via RT wrote:
> Thu Oct 19 09:47:56 2017: Request 82430 was acted upon.
> Transaction: Ticket created by perry.shafran at noaa.gov
>         Queue: met_help
>       Subject: time window discrepancies
>         Owner: Nobody
>    Requestors: perry.shafran at noaa.gov
>        Status: new
>   Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=82430 >


------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #82430] time window discrepancies
From: Howard Soh
Time: Thu Oct 19 17:44:56 2017

Suggestion for comparing apples to apples:

Apply just one filter at a time:
         Trial 1: time window without grid masking
         Trial 2: grid masking only (with a time window wide enough to
                 include all observation times)
                 For example, set plus/minus one day (86400) or one
                 week (604800) as the time window

Cheers,
Howard




------------------------------------------------
Subject: time window discrepancies
From: perry.shafran at noaa.gov
Time: Fri Oct 20 06:56:54 2017

Hi, Howard,

Thanks!  I will try these and see what I get, both in VSDB and in MET.

Perry


------------------------------------------------
Subject: time window discrepancies
From: perry.shafran at noaa.gov
Time: Fri Oct 20 08:24:43 2017

OK, this was helpful.

Test 1:  When I removed the grid masking, after the rejections based on
report type I got an identical number for both VSDB and MET.

MET adds another level of rejections that VSDB doesn't have:

DEBUG 2: Rejected based on zero observations    = 1219

Is there a way to turn this off in PB2NC so I don't have this rejection?

Now to look at the second test.

Perry


------------------------------------------------
Subject: time window discrepancies
From: Howard Soh
Time: Fri Oct 20 09:39:39 2017

No, it cannot be turned off.
It's just information about how the PrepBufr input and the NetCDF output
are handled.

Invalid data is not saved to NetCDF.
"Rejected based on zero observations" is the count of records (messages)
for which all observation data was filtered out by items 10 to 17
(including derived variables).
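
To illustrate what that count means, here is a minimal sketch under
invented assumptions: each message is a list of (variable, value, quality
mark) tuples, and an observation is kept only if its value is present and
its quality mark does not exceed a threshold of 2. The data layout, the
helper names, and the threshold are all hypothetical; this is not the
PB2NC implementation.

```python
def summarize(messages, keep):
    """Count messages with no surviving observations, messages retained,
    and observations saved, given a per-observation keep() test."""
    zero_obs = 0
    retained = 0
    saved = 0
    for observations in messages:
        kept = [ob for ob in observations if keep(ob)]
        if kept:
            retained += 1
            saved += len(kept)
        else:
            # Every observation was filtered out, so the whole message
            # is counted as "zero observations".
            zero_obs += 1
    return zero_obs, retained, saved


def is_valid(ob, qm_threshold=2):
    """Hypothetical per-observation test: value present, quality mark OK."""
    _name, value, quality_mark = ob
    return value is not None and quality_mark <= qm_threshold


messages = [
    [("TMP", 285.1, 2), ("SPFH", 0.004, 2)],  # both observations kept
    [("TMP", None, 2), ("SPFH", 0.004, 9)],   # nothing kept
    [("UGRD", 3.2, 1), ("VGRD", None, 1)],    # one observation kept
]

print(summarize(messages, is_valid))  # (1, 2, 3)
```

A tool that counts a report as retained before looking at its individual
observations would report 3 here rather than 2, which may be the kind of
difference seen between the VSDB and MET totals.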

Cheers,
Howard

>>> On 10/19/2017 9:47 AM, perry.shafran at noaa.gov via RT wrote:
>>>
>>>> Thu Oct 19 09:47:56 2017: Request 82430 was acted upon.
>>>> Transaction: Ticket created by perry.shafran at noaa.gov
>>>>         Queue: met_help
>>>>       Subject: time window discrepancies
>>>>         Owner: Nobody
>>>>    Requestors: perry.shafran at noaa.gov
>>>>        Status: new
>>>>   Ticket <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=82430
>>>> >
>>>>
>>>>
>>>> Hi, there,
>>>>
>>>> I'm still having issues with discrepancies between VSDB and MET,
and in
>>>> this case it has to do with the time window.
>>>>
>>>> I'm running a current comparison and have been using 30 minutes
as the
>>>> time
>>>> window.  In editbufr, it compares the value of the time window
(in
>>>> hundredths of an hour, so that is set to 50 hundredths), and
compares
>>>> that
>>>> to the value of DHR in the observation (also in hundredths of an
>>>> hour).  I
>>>> just checked the code to see if it is doing that correctly and it
does.
>>>>
>>>> In PB2NC, in the time window section, I set the window to -1800
and
>>>> 1800,
>>>> because there are 1800 seconds in 30 minutes.  I believe the
obs_window
>>>> values need to be set in seconds, is that correct?
>>>>
>>>> OK, so when I do that,  I get 232604 time-window rejections for
PB2NC
>>>> and
>>>> 173316 time-window rejections for editbufr.  I don't know if this
is
>>>> due to
>>>> the fact that editbufr checks the domain first and if they are
rejected
>>>> by
>>>> domain, they aren't rejected again for the time window, where it
seems
>>>> in
>>>> PB2NC, time window rejections are done first (which could make up
for
>>>> the
>>>> larger number of rejections for PB2NC).
>>>>
>>>> In all, editbufr retains 10878 reports and PB2NC retains 10120
reports,
>>>> which is still a discrepancy.  Both use the same sets of report
types
>>>> (in
>>>> this case 181, 281, 284, 187, and 287).  Both use the G212 domain
to set
>>>> the domain limits.  Is there something else I can be looking at
in
>>>> PB2NC to
>>>> ensure that I am comparing apples to apples?
>>>>
>>>> Also, I am curious:  PB2NC lists 44479 total observations
retained or
>>>> derived.  What does this number represent?  Does that mean that
the
>>>> total
>>>> number of observations includes stuff like T and Q for the same
station
>>>> as
>>>> two observations?  Or is that something else?  That number
doesn't
>>>> appear
>>>> anywhere in what's retained in the editbufr report.  The 10120 in
PB2NC
>>>> is
>>>> the same order of the 10878 in editbufr, so I am guessing those
numbers
>>>> are
>>>> the numbers we should be comparing.  Can you confirm that?
>>>>
>>>> Thanks for your assistance.
>>>>
>>>> Perry
>>>>
>>>>
>>>
>>
>
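
The bookkeeping in the quoted "-v 2" summary can be re-checked with a few
lines of Python. This is only a sketch that re-does the arithmetic from the
quoted numbers; the variable names are made up for illustration and it does
not call PB2NC. The single vertical level is the assumption stated in the
quoted example.

```python
# Counts copied from the quoted "-v 2" sample summary.
processed = 76306
rejected = {
    "message type": 0,
    "station id": 0,
    "valid time": 50115,
    "masking grid": 0,
    "masking polygon": 17244,
    "elevation": 0,
    "pb report type": 0,
    "input report type": 0,
    "instrument type": 0,
    "zero observations": 505,
}
n_vars = 9            # entries in the quoted obs_grib_code example
n_levels = 1          # assumption from the quoted example
reported_obs = 75771  # "Total observations retained or derived"

# Messages retained = messages processed minus every rejection count.
retained = processed - sum(rejected.values())

# Upper bound on observations if nothing is dropped at the observation level.
max_obs = retained * n_vars * n_levels

# Observations dropped by the observation-level checks.
dropped_obs = max_obs - reported_obs

print(retained, max_obs, dropped_obs)  # 8442 75978 207
```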

------------------------------------------------
Subject: time window discrepancies
From: perry.shafran at noaa.gov
Time: Fri Oct 20 09:59:34 2017

OK, thanks.  This does seem to be one source of discrepancy though.  I
suppose we'll have to accept that.

Thanks!

Perry

On Fri, Oct 20, 2017 at 11:39 AM, Howard Soh <hsoh at ucar.edu> wrote:

> No, it cannot be turned off.
> It's just information for handling PrepBufr input and the NetCDF output.
>
> The invalid data is not saved to NetCDF.
> "Rejected based on zero observations" is the count of the records
> (messages) for which all observation data was filtered out by items 10
> to 17 (including derived variables).
>
> Cheers,
> Howard
>
> On Fri, Oct 20, 2017 at 8:24 AM, Perry Shafran - NOAA Affiliate <
> perry.shafran at noaa.gov> wrote:
>
>> OK, this was helpful.
>>
>> Test 1:  When I removed the grid masking, what happened is that, after
>> the rejections based on report type, I got an identical number for both
>> VSDB and MET.
>>
>> MET adds another level of rejections that VSDB doesn't have:
>>
>> DEBUG 2: Rejected based on zero observations    = 1219
>>
>> Is there a way to turn this off in PB2NC so I don't have this
>> rejection?
>>
>> Now to look at the second test.
>>
>> Perry
>>
>> On Thu, Oct 19, 2017 at 7:44 PM, hsoh <hsoh at ucar.edu> wrote:
>>
>>> Suggestion for comparing apples to apples:
>>>
>>> Apply just one filter at a time:
>>>         Trial 1: time window without grid masking
>>>         Trial 2: grid masking only (time window wide enough to include
>>>                 all times)
>>>                 For example, set plus/minus one day (86400) or one week
>>> (604800) as the time window
>>>
>>> Cheers,
>>> Howard
>>>
>>>
>>>
>>> On 10/19/2017 5:27 PM, hsoh wrote:
>>>
>>>>
>>>> Question 1.
>>>>
>>>> I believe the obs_window values need to be set in seconds, is that
>>>> correct?
>>>>
>>>> Yes, the unit is seconds.
>>>>
>>>> Question 2.
>>>>
>>>> Does that mean that the total number of observations includes
stuff like
>>>> T and Q for the same station as two observations?
>>>>
>>>> Yes, that's correct. The variables are defined in the configuration:
>>>> - obs_bufr_var for 6.1
>>>> - obs_grib_code for 6.0 or earlier versions
>>>>
>>>> For example :
>>>> obs_grib_code = [ "SPFH", "TMP",  "HGT",  "UGRD", "VGRD",
>>>>                   "DPT",  "WIND", "RH",   "MIXR" ];
>>>> ==>
>>>> The first five variables come from the PrepBufr and the other four
>>>> variables are derived.
>>>>
>>>> In this case (9 variables), the maximum total number of observations is
>>>> (if not rejected):
>>>> "Total PrepBufr Messages retained" * 9 * "the number of vertical levels"
>>>> FYI: The order of rejections (if configured):
>>>>     1. message type
>>>>     2. station_id
>>>>     3. valid time
>>>>     4. grid mask
>>>>     5. poly mask
>>>>     6. elevation
>>>>     7. PrepBufr report type
>>>>     8. input report type
>>>>     9. instrument type
>>>>     10. vertical level
>>>>     11. If the pressure level is invalid
>>>>     12. If virtual temperature is invalid
>>>>     13. if the observation value is invalid.
>>>>     14. if the quality mark is invalid.
>>>>     15. If the quality mark is greater than the quality mark
threshold
>>>>     16. If the data level category is listed in the configuration
file,
>>>> and it is not in the list
>>>>     17. If the associated GRIB code (or the variable index) is
not in
>>>> the configuration file
>>>>
>>>>
>>>> I believe you know the additional debug output with "-v 2"
option.
>>>> Here is an example:
>>>>
>>>> DEBUG 2: PrepBufr Time Center:          20120409_120000
>>>> DEBUG 2: Searching Time Window:         20120409_113000 to
>>>> 20120409_123000
>>>> DEBUG 2: Processing 76306 PrepBufr messages...
>>>> ...
>>>> DEBUG 2: Total PrepBufr Messages processed      = 76306
>>>> DEBUG 2: Rejected based on message type         = 0
>>>> DEBUG 2: Rejected based on station id           = 0
>>>> DEBUG 2: Rejected based on valid time           = 50115
>>>> DEBUG 2: Rejected based on masking grid         = 0
>>>> DEBUG 2: Rejected based on masking polygon      = 17244
>>>> DEBUG 2: Rejected based on elevation            = 0
>>>> DEBUG 2: Rejected based on pb report type       = 0
>>>> DEBUG 2: Rejected based on input report type    = 0
>>>> DEBUG 2: Rejected based on instrument type      = 0
>>>> DEBUG 2: Rejected based on zero observations    = 505
>>>> DEBUG 2: Total PrepBufr Messages retained       = 8442
>>>> DEBUG 2: Total observations retained or derived = 75771
>>>>
>>>> "Total PrepBufr Messages" - "the sum of rejected counts" = "Total
>>>> PrepBufr Messages retained".
>>>> 76306 - 50115 - 17244 - 505 = 8442 retained record
>>>>
>>>> Possible observation count if the vertical level is 1 and using
above
>>>> configuration (9 variables).
>>>> 8442 * 9 * 1 = 75,978
>>>>
>>>> Total observations retained or derived = 75771
>>>> 75,978 - 75,771 = 207   ==> 207 observation data was NOT saved
because
>>>> of invalid value, invalid quality mark, or quality mark
threshold.
>>>>
>>>> Cheers,
>>>> Howard
>>>>
>>>> On 10/19/2017 9:47 AM, perry.shafran at noaa.gov via RT wrote:
>>>>
>>>>> Thu Oct 19 09:47:56 2017: Request 82430 was acted upon.
>>>>> Transaction: Ticket created by perry.shafran at noaa.gov
>>>>>         Queue: met_help
>>>>>       Subject: time window discrepancies
>>>>>         Owner: Nobody
>>>>>    Requestors: perry.shafran at noaa.gov
>>>>>        Status: new
>>>>>   Ticket <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=82430
>>>>> >
>>>>>
>>>>>
>>>>> Hi, there,
>>>>>
>>>>> I'm still having issues with discrepancies between VSDB and MET,
and in
>>>>> this case it has to do with the time window.
>>>>>
>>>>> I'm running a current comparison and have been using 30 minutes
as the
>>>>> time
>>>>> window.  In editbufr, it compares the value of the time window
(in
>>>>> hundredths of an hour, so that is set to 50 hundredths), and
compares
>>>>> that
>>>>> to the value of DHR in the observation (also in hundredths of an
>>>>> hour).  I
>>>>> just checked the code to see if it is doing that correctly and
it does.
>>>>>
>>>>> In PB2NC, in the time window section, I set the window to -1800
and
>>>>> 1800,
>>>>> because there are 1800 seconds in 30 minutes.  I believe the
obs_window
>>>>> values need to be set in seconds, is that correct?
>>>>>
>>>>> OK, so when I do that,  I get 232604 time-window rejections for
PB2NC
>>>>> and
>>>>> 173316 time-window rejections for editbufr.  I don't know if
this is
>>>>> due to
>>>>> the fact that editbufr checks the domain first and if they are
>>>>> rejected by
>>>>> domain, they aren't rejected again for the time window, where it
seems
>>>>> in
>>>>> PB2NC, time window rejections are done first (which could make
up for
>>>>> the
>>>>> larger number of rejections for PB2NC).
>>>>>
>>>>> In all, editbufr retains 10878 reports and PB2NC retains 10120
reports,
>>>>> which is still a discrepancy.  Both use the same sets of report
types
>>>>> (in
>>>>> this case 181, 281, 284, 187, and 287).  Both use the G212
domain to
>>>>> set
>>>>> the domain limits.  Is there something else I can be looking at
in
>>>>> PB2NC to
>>>>> ensure that I am comparing apples to apples?
>>>>>
>>>>> Also, I am curious:  PB2NC lists 44479 total observations
retained or
>>>>> derived.  What does this number represent?  Does that mean that
the
>>>>> total
>>>>> number of observations includes stuff like T and Q for the same
>>>>> station as
>>>>> two observations?  Or is that something else?  That number
doesn't
>>>>> appear
>>>>> anywhere in what's retained in the editbufr report.  The 10120
in
>>>>> PB2NC is
>>>>> the same order of the 10878 in editbufr, so I am guessing those
>>>>> numbers are
>>>>> the numbers we should be comparing.  Can you confirm that?
>>>>>
>>>>> Thanks for your assistance.
>>>>>
>>>>> Perry
>>>>>
>>>>>
>>>>
>>>
>>
>
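
Since the quoted exchange turns on units (seconds for the PB2NC obs_window,
hundredths of an hour for editbufr), a small conversion sketch may help when
setting up matching windows. The helper names below are hypothetical and
nothing here reads a MET or editbufr configuration; it only shows that a
30-minute half-width is 1800 seconds or 50 hundredths of an hour, and
reproduces the search window printed in the quoted debug output.

```python
from datetime import datetime, timedelta

FMT = "%Y%m%d_%H%M%S"  # timestamp layout used in the quoted debug lines

def minutes_to_seconds(minutes):
    """Half-width of the window in seconds (the unit confirmed for obs_window)."""
    return int(round(minutes * 60))

def minutes_to_hundredths_of_hour(minutes):
    """Same half-width in hundredths of an hour (the unit described for editbufr)."""
    return minutes / 60 * 100

def window_bounds(center, beg_sec, end_sec):
    """Return (start, end) strings for a window given as offsets in seconds."""
    t0 = datetime.strptime(center, FMT)
    start = t0 + timedelta(seconds=beg_sec)
    end = t0 + timedelta(seconds=end_sec)
    return start.strftime(FMT), end.strftime(FMT)

half = minutes_to_seconds(30)
print(half, minutes_to_hundredths_of_hour(30))        # 1800 50.0
print(window_bounds("20120409_120000", -half, half))  # 113000 to 123000
print(24 * 3600, 7 * 24 * 3600)                       # 86400 604800 (Trial 2 widths)
```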

------------------------------------------------
Subject: time window discrepancies
From: perry.shafran at noaa.gov
Time: Fri Oct 20 12:03:11 2017

Just did Test 2 and all looks pretty good.  Each code, MET and VSDB,
rejected the exact same number of observations with the masking
rejections and no time window rejections.

I'm going to see what happens when both rejections are applied and do a
good comparison.

Thanks!

Perry


------------------------------------------------
Subject: time window discrepancies
From: perry.shafran at noaa.gov
Time: Fri Oct 20 12:42:54 2017

OK, I just did the run that I was doing before, with both the time window
check and the grid domain check.  I found that the discrepancy I have been
seeing in ob counts is indeed due to the zero observation rejection.  I
guess with that knowledge we can at least explain any discrepancy between
VSDB and MET.

Considering that you cannot change the MET code by removing this rejection
check, might you be able to show me what you look for so perhaps I can add
a rejection to the VSDB code?  That way we'll for sure have the same sets
of observations for both codes.

Thanks!

Perry


------------------------------------------------
Subject: time window discrepancies
From: John Halley Gotway
Time: Fri Oct 20 13:19:06 2017

Perry,

The PB2NC tool processes each PREPBUFR message separately.  As it processes
each one, it keeps track of the number of observations retained or derived
from it.  If that number is > 0, it writes out the header information (i.e.
station name, elevation, and lat/lon) to the output NetCDF file.  If not,
it increments the counter for zero-obs rejections.  It would be pretty
straightforward to add a config option for PB2NC to tell it to skip the
application of that logic.  But I don't think it'll be as helpful as you
imagine.
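
The per-message logic described above could be sketched roughly as follows.
This is not PB2NC source code: the function name, the (header, observations)
message layout, and the keep_obs predicate (standing in for the
observation-level checks) are all hypothetical and only illustrate how a
message ends up counted as a zero-obs rejection.

```python
def process_messages(messages, keep_obs):
    """Mimic the described bookkeeping for a list of (header, obs_list) pairs.

    keep_obs is a hypothetical predicate standing in for the
    observation-level checks; it returns True for values to retain.
    """
    headers_written = []  # headers that would go to the NetCDF output
    n_obs = 0             # observations retained
    n_rej_zero_obs = 0    # messages with nothing left after filtering

    for header, obs_list in messages:
        kept = [value for value in obs_list if keep_obs(value)]
        if kept:
            # At least one observation survived: write the header once.
            headers_written.append(header)
            n_obs += len(kept)
        else:
            # Nothing survived: count the message as a zero-obs rejection.
            n_rej_zero_obs += 1

    return headers_written, n_obs, n_rej_zero_obs


# Toy input: None marks a value that fails the observation-level checks.
toy = [("STN1", [1.0, None]), ("STN2", [None, None]), ("STN3", [2.5])]
result = process_messages(toy, lambda value: value is not None)
print(result)  # (['STN1', 'STN3'], 2, 1)
```

A VSDB-side check would need the same idea: drop a report only when none of
its values survive the observation-level filters.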

Ultimately, it's the number of individual observation values used for each
verification task that needs to match.  Having MET write out some additional
header location information to which no observations actually correspond
won't have any impact on that goal.

When comparing the number of PREPBUFR messages processed by PB2NC and
editbufr, can't we just say that PB2NC # Messages + PB2NC # Rejected for
Zero Obs = # Editbufr?
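
That relation is easy to turn into a check. The sketch below uses
hypothetical function names. The first example uses the sample summary quoted
earlier in this thread (8442 retained, 505 zero-obs rejections). The second
takes the retained counts from the original request (10120 for PB2NC, 10878
for editbufr) and computes the zero-obs count that run would have to report
for the relation to hold; that count itself is not quoted in the thread.

```python
def editbufr_equivalent(pb2nc_retained, pb2nc_zero_obs):
    """Message count to compare against editbufr: retained plus zero-obs rejections."""
    return pb2nc_retained + pb2nc_zero_obs

def counts_reconcile(pb2nc_retained, pb2nc_zero_obs, editbufr_retained):
    """True when the proposed relation holds exactly."""
    return editbufr_equivalent(pb2nc_retained, pb2nc_zero_obs) == editbufr_retained

# Sample summary quoted earlier: 8442 retained + 505 zero-obs rejections.
print(editbufr_equivalent(8442, 505))  # 8947

# Original request: zero-obs count that would be needed to close the gap.
needed_zero_obs = 10878 - 10120
print(needed_zero_obs)                                  # 758
print(counts_reconcile(10120, needed_zero_obs, 10878))  # True
```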

Thanks,
John


On Fri, Oct 20, 2017 at 12:42 PM, perry.shafran at noaa.gov via RT <
met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=82430 >
>
> OK, I just did a run that I was doing before - with both the time
window
> check and the grid domain check.  I found that the discrepancies I
have
> been seeing in ob counts is indeed the zero observation rejection.
I guess
> with that knowledge we can at least explain any discrepancy between
VSDB
> and MET.
>
> Considering that you cannot change the MET code by removing this
rejection
> check, might you be able to show me what you look for so perhaps I
can add
> a rejection to the VSDB code?  That way we'll for sure have the same
sets
> of observations for both codes.
>
> Thanks!
>
> Perry
>
> On Fri, Oct 20, 2017 at 2:03 PM, Perry Shafran - NOAA Affiliate <
> perry.shafran at noaa.gov> wrote:
>
> > Just did Test 2 and all looks pretty good.  Each code, MET and
VSDB
> > rejected the exact same number of observations with the masking
> rejections
> > and no time window rejections.
> >
> > I'm going to see what happens when both are rejected and do a good
> > comparison.
> >
> > Thanks!
> >
> > Perry
> >
> > On Fri, Oct 20, 2017 at 11:39 AM, Howard Soh <hsoh at ucar.edu>
wrote:
> >
> >> No, it can not be turned off.
> >> It's just information for handling PrepBufr input and the NetCDF
output.
> >>
> >> The invalid data is not saved to NetCDF.
> >> "Rejected based on zero observation" is the count of the records
> >> (message) which all observation data was filtered by item 10 to
17
> >> (including derived variables).
> >>
> >> Cheers,
> >> Howard
> >>
> >> On Fri, Oct 20, 2017 at 8:24 AM, Perry Shafran - NOAA Affiliate <
> >> perry.shafran at noaa.gov> wrote:
> >>
> >>> OK, this was helpful.
> >>>
> >>> Test 1:  When I removed the grid masking, what happened is that,
after
> >>> the rejections based on report type, I got an identical number
for both
> >>> VSDB and MET.
> >>>
> >>> MET adds another level of rejections that VSDB doesn't have:
> >>>
> >>> DEBUG 2: Rejected based on zero observations    = 1219
> >>>
> >>> Is there a way to turn this off in PB2NC so I don't have this
> rejection?
> >>>
> >>> Now to look at the second test.
> >>>
> >>> Perry
> >>>
> >>> On Thu, Oct 19, 2017 at 7:44 PM, hsoh <hsoh at ucar.edu> wrote:
> >>>
> >>>> Suggestion for comparing apples to apples:
> >>>>
> >>>> Apply just one filtering:
> >>>>         Trial 1: time window without grid masking
> >>>>         Trial 2: grid masking only (including all time window)
> >>>>                 For example, set plus/minus one day (86400) or
one
> week
> >>>> (604800) as time window
> >>>>
> >>>> Cheers,
> >>>> Howard
> >>>>
> >>>>
> >>>>
> >>>> On 10/19/2017 5:27 PM, hsoh wrote:
> >>>>
> >>>>>
> >>>>> Question 1.
> >>>>>
> >>>>> I believe the obs_window values need to be set in seconds, is
that
> >>>>> correct?
> >>>>>
> >>>>> Yes, the unit is second.
> >>>>>
> >>>>> Question 2.
> >>>>>
> >>>>> Does that mean that the total number of observations includes stuff
> >>>>> like T and Q for the same station as two observations?
> >>>>>
> >>>>> Yes, that's correct. The variables are defined in the configuration:
> >>>>> - obs_bufr_var for 6.1
> >>>>> - obs_grib_code for 6.0 or earlier versions
> >>>>>
> >>>>> For example:
> >>>>> obs_grib_code = [ "SPFH", "TMP",  "HGT",  "UGRD", "VGRD",
> >>>>>                   "DPT",  "WIND", "RH",   "MIXR" ];
> >>>>> ==>
> >>>>> The first five variables come from the PrepBufr file and the other four
> >>>>> variables are derived.
> >>>>>
> >>>>> In this case (9 variables), the maximum total number of observations
> >>>>> (if none are rejected) is:
> >>>>> "Total PrepBufr Messages retained" * 9 * "the number of vertical levels"
> >>>>>
> >>>>> FYI: The order of rejections (if configured):
> >>>>>     1. message type
> >>>>>     2. station_id
> >>>>>     3. valid time
> >>>>>     4. grid mask
> >>>>>     5. poly mask
> >>>>>     6. elevation
> >>>>>     7. PrepBufr report type
> >>>>>     8. input report type
> >>>>>     9. instrument type
> >>>>>     10. vertical level
> >>>>>     11. if the pressure level is invalid
> >>>>>     12. if the virtual temperature is invalid
> >>>>>     13. if the observation value is invalid
> >>>>>     14. if the quality mark is invalid
> >>>>>     15. if the quality mark is greater than the quality mark threshold
> >>>>>     16. if the data level category is listed in the configuration
> >>>>>         file, and it is not in the list
> >>>>>     17. if the associated GRIB code (or the variable index) is not in
> >>>>>         the configuration file
> >>>>>
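Because each message is counted only at the first check it fails, the
order of the checks changes the per-check rejection counts even when the
retained set is identical. The sketch below illustrates this with
made-up messages and only two checks. The function and field names are
hypothetical, and this is not the PB2NC or editbufr code.

```python
from collections import Counter


def count_rejections(messages, checks):
    """Apply checks in order; count each message at its first failing check."""
    rejected = Counter()
    retained = 0
    for msg in messages:
        for name, passes in checks:
            if not passes(msg):
                rejected[name] += 1
                break  # later checks never see this message
        else:
            retained += 1  # passed every check
    return rejected, retained


# Made-up messages for illustration only.
messages = [
    {"in_window": True, "in_grid": True},
    {"in_window": False, "in_grid": True},
    {"in_window": True, "in_grid": False},
    {"in_window": False, "in_grid": False},
    {"in_window": False, "in_grid": False},
]
time_check = ("valid time", lambda m: m["in_window"])
grid_check = ("masking grid", lambda m: m["in_grid"])

# Time window checked first (the order listed above).
time_first = count_rejections(messages, [time_check, grid_check])
# Domain checked first (the order described for editbufr in the request).
grid_first = count_rejections(messages, [grid_check, time_check])

print(time_first)  # valid time: 3, masking grid: 1, retained: 1
print(grid_first)  # masking grid: 3, valid time: 1, retained: 1
```

The time-window count is larger when the time check runs first, but the
number of retained messages is the same either way. This is why the
per-check counts of two codes are only comparable when one filter is
applied at a time.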
> >>>>>
> >>>>> I believe you know about the additional debug output from the "-v 2"
> >>>>> option. Here is an example:
> >>>>>
> >>>>> DEBUG 2: PrepBufr Time Center:          20120409_120000
> >>>>> DEBUG 2: Searching Time Window:         20120409_113000 to 20120409_123000
> >>>>> DEBUG 2: Processing 76306 PrepBufr messages...
> >>>>> ...
> >>>>> DEBUG 2: Total PrepBufr Messages processed      = 76306
> >>>>> DEBUG 2: Rejected based on message type         = 0
> >>>>> DEBUG 2: Rejected based on station id           = 0
> >>>>> DEBUG 2: Rejected based on valid time           = 50115
> >>>>> DEBUG 2: Rejected based on masking grid         = 0
> >>>>> DEBUG 2: Rejected based on masking polygon      = 17244
> >>>>> DEBUG 2: Rejected based on elevation            = 0
> >>>>> DEBUG 2: Rejected based on pb report type       = 0
> >>>>> DEBUG 2: Rejected based on input report type    = 0
> >>>>> DEBUG 2: Rejected based on instrument type      = 0
> >>>>> DEBUG 2: Rejected based on zero observations    = 505
> >>>>> DEBUG 2: Total PrepBufr Messages retained       = 8442
> >>>>> DEBUG 2: Total observations retained or derived = 75771
> >>>>>
> >>>>> "Total PrepBufr Messages processed" - "the sum of rejected counts" =
> >>>>> "Total PrepBufr Messages retained".
> >>>>> 76306 - 50115 - 17244 - 505 = 8442 retained records
> >>>>>
> >>>>> Possible observation count if there is 1 vertical level, using the
> >>>>> above configuration (9 variables):
> >>>>> 8442 * 9 * 1 = 75,978
> >>>>>
> >>>>> Total observations retained or derived = 75771
> >>>>> 75,978 - 75,771 = 207   ==> 207 observations were NOT saved because
> >>>>> of an invalid value, an invalid quality mark, or the quality mark
> >>>>> threshold.
> >>>>>
> >>>>> Cheers,
> >>>>> Howard
> >>>>>
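The bookkeeping in the example above can be reproduced with a few lines
of arithmetic. This is a minimal sketch that simply re-derives the
numbers from the quoted "-v 2" output. The variable names are
illustrative, and a single vertical level is assumed as in the example.

```python
# Counts copied from the example "-v 2" output above.
processed = 76306
rejected = {
    "valid time": 50115,
    "masking polygon": 17244,
    "zero observations": 505,
}  # all other rejection counts in the example are 0

# Messages retained = messages processed - sum of all rejection counts.
retained = processed - sum(rejected.values())

# Upper bound on observations: 9 configured variables, 1 vertical level.
n_variables = 9
n_levels = 1
max_observations = retained * n_variables * n_levels

# Reported "Total observations retained or derived".
saved_observations = 75771
not_saved = max_observations - saved_observations

print(retained, max_observations, not_saved)  # 8442 75978 207
```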

------------------------------------------------
Subject: time window discrepancies
From: perry.shafran at noaa.gov
Time: Fri Oct 20 13:27:30 2017

Hi, John,

Yes, I think I have come to that conclusion.  I just wanted to find out
potential reasons for ob count differences between MET and VSDB, and I
think this is one of them.  Would you agree?

Thanks!

Perry

------------------------------------------------
Subject: time window discrepancies
From: John Halley Gotway
Time: Fri Oct 20 16:30:15 2017

Perry,

No, I don't think that PB2NC discarding PREPBUFR messages from which no
observations are retained or derived would contribute to differences in the
number of matched pairs downstream.  I suspect that the main differences
would be caused by which grid points are included in each masking region.

But we could discuss it more thoroughly next week.

Thanks,
John
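
The zero-observation logic discussed in this exchange can be sketched as
below. This is a simplified illustration of the behavior described
earlier in the thread (a header is written only when a message yields at
least one observation), not the actual PB2NC code. The function name and
the per-message counts are made up.

```python
def summarize_messages(obs_counts):
    """Tally messages given the observations retained or derived from each."""
    retained_messages = 0
    zero_obs_rejections = 0
    total_observations = 0
    for n_obs in obs_counts:
        if n_obs > 0:
            retained_messages += 1       # header would be written
            total_observations += n_obs
        else:
            zero_obs_rejections += 1     # counted as "zero observations"
    return retained_messages, zero_obs_rejections, total_observations


# Made-up counts for five messages that passed all header-level checks.
obs_counts = [3, 0, 5, 0, 1]
retained, zero_obs, total_obs = summarize_messages(obs_counts)

# A code without the zero-observation check would report all five
# messages, i.e. retained + zero_obs.
print(retained, zero_obs, retained + zero_obs, total_obs)  # 3 2 5 9
```

In this sketch the message counts differ by exactly the zero-observation
count, while the total number of observations is unaffected because the
discarded messages contribute none.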

------------------------------------------------
Subject: time window discrepancies
From: perry.shafran at noaa.gov
Time: Mon Oct 23 06:39:26 2017

John,

But in the tests that Howard advised me to run, the difference between
VSDB and MET is always equal to the number of prepbufr messages discarded
because no obs are retained or derived from them.  See the evidence
higher up in this email chain.

Perry

On Fri, Oct 20, 2017 at 6:30 PM, John Halley Gotway via RT <
met_help at ucar.edu> wrote:

> Perry,
>
> No, I don't think that PB2NC discarding PREPBUFR messages from which no
> observations are retained or derived would contribute to differences in the
> number of matched pairs downstream.  I suspect that the main differences
> would be caused by which grid points are included in each masking region.
>
> But we could discuss it more thoroughly next week.
>
> Thanks,
> John
>
> On Fri, Oct 20, 2017 at 1:27 PM, perry.shafran at noaa.gov via RT <
> met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=82430 >
> >
> > Hi, John,
> >
> > Yes, I think I have come to that conclusion.  I just wanted to find out
> > potential reasons for ob count differences between MET and VSDB, and I
> > think this is one of them.  Would you agree?
> >
> > Thanks!
> >
> > Perry
> >
> > On Fri, Oct 20, 2017 at 3:19 PM, John Halley Gotway via RT <
> > met_help at ucar.edu> wrote:
> >
> > > Perry,
> > >
> > > The PB2NC tool processes each PREPBUFR message separately.  As it processes
> > > each one, it keeps track of the number of observations retained or derived
> > > from it.  If that number > 0, it writes out the header information (i.e.
> > > station name, elevation, and lat/lon) to the output NetCDF file.  If not,
> > > it increments the rejection for zero obs counter.  It would be pretty
> > > straight-forward to add a config option for PB2NC to tell it to skip the
> > > application of that logic.  But I don't think it'll be as helpful as you
> > > imagine.
> > >
> > > Ultimately, it's the number of individual observation values used for each
> > > verification task that need to match.  Having MET write out some additional
> > > header location information to which no observations actually correspond
> > > won't have any impact on that goal.
> > >
> > > When comparing the number of PREPBUFR messages processed by PB2NC and
> > > editbufr, can't we just say that PB2NC # Messages + PB2NC # Rejected for
> > > Zero Obs = # Editbufr?
> > >
> > > Thanks,
> > > John
> > >
> > >
> > > On Fri, Oct 20, 2017 at 12:42 PM, perry.shafran at noaa.gov via RT <
> > > met_help at ucar.edu> wrote:
> > >
> > > >
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=82430 >
> > > >
> > > > OK, I just did a run that I was doing before - with both the time window
> > > > check and the grid domain check.  I found that the discrepancies I have
> > > > been seeing in ob counts is indeed the zero observation rejection.  I guess
> > > > with that knowledge we can at least explain any discrepancy between VSDB
> > > > and MET.
> > > >
> > > > Considering that you cannot change the MET code by removing this rejection
> > > > check, might you be able to show me what you look for so perhaps I can add
> > > > a rejection to the VSDB code?  That way we'll for sure have the same sets
> > > > of observations for both codes.
> > > >
> > > > Thanks!
> > > >
> > > > Perry
> > > >
> > > > On Fri, Oct 20, 2017 at 2:03 PM, Perry Shafran - NOAA Affiliate <
> > > > perry.shafran at noaa.gov> wrote:
> > > >
> > > > > Just did Test 2 and all looks pretty good.  Each code, MET and VSDB
> > > > > rejected the exact same number of observations with the masking rejections
> > > > > and no time window rejections.
> > > > >
> > > > > I'm going to see what happens when both are rejected and do a good
> > > > > comparison.
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Perry
> > > > >
> > > > > On Fri, Oct 20, 2017 at 11:39 AM, Howard Soh <hsoh at ucar.edu> wrote:
> > > > >
> > > > >> No, it can not be turned off.
> > > > >> It's just information for handling PrepBufr input and the NetCDF output.
> > > > >>
> > > > >> The invalid data is not saved to NetCDF.
> > > > >> "Rejected based on zero observation" is the count of the records
> > > > >> (message) which all observation data was filtered by item 10 to 17
> > > > >> (including derived variables).
> > > > >>
> > > > >> Cheers,
> > > > >> Howard
> > > > >>
> > > > >> On Fri, Oct 20, 2017 at 8:24 AM, Perry Shafran - NOAA Affiliate <
> > > > >> perry.shafran at noaa.gov> wrote:
> > > > >>
> > > > >>> OK, this was helpful.
> > > > >>>
> > > > >>> Test 1:  When I removed the grid masking, what happened is that, after
> > > > >>> the rejections based on report type, I got an identical number for both
> > > > >>> VSDB and MET.
> > > > >>>
> > > > >>> MET adds another level of rejections that VSDB doesn't have:
> > > > >>>
> > > > >>> DEBUG 2: Rejected based on zero observations    = 1219
> > > > >>>
> > > > >>> Is there a way to turn this off in PB2NC so I don't have this rejection?
> > > > >>>
> > > > >>> Now to look at the second test.
> > > > >>>
> > > > >>> Perry
> > > > >>>
> > > > >>> On Thu, Oct 19, 2017 at 7:44 PM, hsoh <hsoh at ucar.edu> wrote:
> > > > >>>
> > > > >>>> Suggestion for comparing apples to apples:
> > > > >>>>
> > > > >>>> Apply just one filtering:
> > > > >>>>         Trial 1: time window without grid masking
> > > > >>>>         Trial 2: grid masking only (including all time window)
> > > > >>>>                 For example, set plus/minus one day (86400) or one week
> > > > >>>>                 (604800) as time window
> > > > >>>>
> > > > >>>> Cheers,
> > > > >>>> Howard
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>> On 10/19/2017 5:27 PM, hsoh wrote:
> > > > >>>>
> > > > >>>>>
> > > > >>>>> Question 1.
> > > > >>>>>
> > > > >>>>> I believe the obs_window values need to be set in seconds, is that
> > > > >>>>> correct?
> > > > >>>>>
> > > > >>>>> Yes, the unit is second.
> > > > >>>>>
> > > > >>>>> Question 2.
> > > > >>>>>
> > > > >>>>> Does that mean that the total number of observations includes stuff like
> > > > >>>>> T and Q for the same station as two observations?
> > > > >>>>>
> > > > >>>>> Yes, that's correct. The variables are defined at the configuration:
> > > > >>>>> - obs_bufr_var for 6.1
> > > > >>>>> - obs_grib_code for 6.0 or early versions
> > > > >>>>>
> > > > >>>>> For example :
> > > > >>>>> obs_grib_code = [ "SPFH", "TMP",  "HGT",  "UGRD", "VGRD",
> > > > >>>>>                   "DPT",  "WIND", "RH",   "MIXR" ];
> > > > >>>>> ==>
> > > > >>>>> The first five variables come from the PrepBufr and the other four
> > > > >>>>> variables are derived.
> > > > >>>>>
> > > > >>>>> In this case (9 variables), the maximum total number of observations
> > > > >>>>> is (if not rejected):
> > > > >>>>> "Total PrepBufr Messages retained" * 9 * "the number of vertical levels"
> > > > >>>>>
> > > > >>>>> FYI: The order of rejections (if configured):
> > > > >>>>>     1. message type
> > > > >>>>>     2. station_id
> > > > >>>>>     3. valid time
> > > > >>>>>     4. grid mask
> > > > >>>>>     5. poly mask
> > > > >>>>>     6. elevation
> > > > >>>>>     7. PrepBufr report type
> > > > >>>>>     8. input report type
> > > > >>>>>     9. instrument type
> > > > >>>>>     10. vertical level
> > > > >>>>>     11. If the pressure level is invalid
> > > > >>>>>     12. If virtual temperature is invalid
> > > > >>>>>     13. if the observation value is invalid.
> > > > >>>>>     14. if the quality mark is invalid.
> > > > >>>>>     15. If the quality mark is greater than the quality mark threshold
> > > > >>>>>     16. If the data level category is listed in the configuration
> > > > >>>>>         file, and it is not in the list
> > > > >>>>>     17. If the associated GRIB code (or the variable index) is not in
> > > > >>>>>         the configuration file
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> I believe you know the additional debug output with "-v 2" option.
> > > > >>>>> Here is an example:
> > > > >>>>>
> > > > >>>>> DEBUG 2: PrepBufr Time Center:          20120409_120000
> > > > >>>>> DEBUG 2: Searching Time Window:         20120409_113000 to 20120409_123000
> > > > >>>>> DEBUG 2: Processing 76306 PrepBufr messages...
> > > > >>>>> ...
> > > > >>>>> DEBUG 2: Total PrepBufr Messages processed      = 76306
> > > > >>>>> DEBUG 2: Rejected based on message type         = 0
> > > > >>>>> DEBUG 2: Rejected based on station id           = 0
> > > > >>>>> DEBUG 2: Rejected based on valid time           = 50115
> > > > >>>>> DEBUG 2: Rejected based on masking grid         = 0
> > > > >>>>> DEBUG 2: Rejected based on masking polygon      = 17244
> > > > >>>>> DEBUG 2: Rejected based on elevation            = 0
> > > > >>>>> DEBUG 2: Rejected based on pb report type       = 0
> > > > >>>>> DEBUG 2: Rejected based on input report type    = 0
> > > > >>>>> DEBUG 2: Rejected based on instrument type      = 0
> > > > >>>>> DEBUG 2: Rejected based on zero observations    = 505
> > > > >>>>> DEBUG 2: Total PrepBufr Messages retained       = 8442
> > > > >>>>> DEBUG 2: Total observations retained or derived = 75771
> > > > >>>>>
> > > > >>>>> "Total PrepBufr Messages" - "the sum of rejected counts" = "Total
> > > > >>>>> PrepBufr Messages retained".
> > > > >>>>> 76306 - 50115 - 17244 - 505 = 8442 retained record
> > > > >>>>>
> > > > >>>>> Possible observation count if the vertical level is 1 and using above
> > > > >>>>> configuration (9 variables).
> > > > >>>>> 8442 * 9 * 1 = 75,978
> > > > >>>>>
> > > > >>>>> Total observations retained or derived = 75771
> > > > >>>>> 75,978 - 75,771 = 207   ==> 207 observation data was NOT saved because
> > > > >>>>> of invalid value, invalid quality mark, or quality mark threshold.
> > > > >>>>> Cheers,
> > > > >>>>> Howard
> > > > >>>>>
> > > > >>>>> On 10/19/2017 9:47 AM, perry.shafran at noaa.gov via RT wrote:
> > > > >>>>>
> > > > >>>>>> Thu Oct 19 09:47:56 2017: Request 82430 was acted upon.
> > > > >>>>>> Transaction: Ticket created by perry.shafran at noaa.gov
> > > > >>>>>>         Queue: met_help
> > > > >>>>>>       Subject: time window discrepancies
> > > > >>>>>>         Owner: Nobody
> > > > >>>>>>    Requestors: perry.shafran at noaa.gov
> > > > >>>>>>        Status: new
> > > > >>>>>>   Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=82430 >
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> Hi, there,
> > > > >>>>>>
> > > > >>>>>> I'm still having issues with discrepancies between VSDB and MET, and in
> > > > >>>>>> this case it has to do with the time window.
> > > > >>>>>>
> > > > >>>>>> I'm running a current comparison and have been using 30 minutes as the time
> > > > >>>>>> window.  In editbufr, it compares the value of the time window (in
> > > > >>>>>> hundredths of an hour, so that is set to 50 hundredths), and compares that
> > > > >>>>>> to the value of DHR in the observation (also in hundredths of an hour).  I
> > > > >>>>>> just checked the code to see if it is doing that correctly and it does.
> > > > >>>>>>
> > > > >>>>>> In PB2NC, in the time window section, I set the window to -1800 and 1800,
> > > > >>>>>> because there are 1800 seconds in 30 minutes.  I believe the obs_window
> > > > >>>>>> values need to be set in seconds, is that correct?
> > > > >>>>>>
> > > > >>>>>> OK, so when I do that,  I get 232604 time-window rejections for PB2NC and
> > > > >>>>>> 173316 time-window rejections for editbufr.  I don't know if this is due to
> > > > >>>>>> the fact that editbufr checks the domain first and if they are rejected by
> > > > >>>>>> domain, they aren't rejected again for the time window, where it seems in
> > > > >>>>>> PB2NC, time window rejections are done first (which could make up for the
> > > > >>>>>> larger number of rejections for PB2NC).
> > > > >>>>>>
> > > > >>>>>> In all, editbufr retains 10878 reports and PB2NC retains 10120 reports,
> > > > >>>>>> which is still a discrepancy.  Both use the same sets of report types (in
> > > > >>>>>> this case 181, 281, 284, 187, and 287).  Both use the G212 domain to set
> > > > >>>>>> the domain limits.  Is there something else I can be looking at in PB2NC to
> > > > >>>>>> ensure that I am comparing apples to apples?
> > > > >>>>>>
> > > > >>>>>> Also, I am curious:  PB2NC lists 44479 total observations retained or
> > > > >>>>>> derived.  What does this number represent?  Does that mean that the total
> > > > >>>>>> number of observations includes stuff like T and Q for the same station as
> > > > >>>>>> two observations?  Or is that something else?  That number doesn't appear
> > > > >>>>>> anywhere in what's retained in the editbufr report.  The 10120 in PB2NC is
> > > > >>>>>> the same order of the 10878 in editbufr, so I am guessing those numbers are
> > > > >>>>>> the numbers we should be comparing.  Can you confirm that?
> > > > >>>>>>
> > > > >>>>>> Thanks for your assistance.
> > > > >>>>>>
> > > > >>>>>> Perry
> > > > >>>>>>
>
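
The count bookkeeping discussed in the quoted thread above can be checked
with a few lines of Python.  This is only a sketch: the variable and
function names are made up for illustration and are not part of MET, PB2NC,
or editbufr.  The numbers are copied from the "-v 2" example in the thread,
and the last value applies the suggested relation "PB2NC # Messages + PB2NC
# Rejected for Zero Obs = # Editbufr".

```python
# Illustrative only: these names are not part of MET, PB2NC, or editbufr.

SECONDS_PER_HUNDREDTH_HOUR = 36  # 3600 seconds / 100


def hundredths_of_hour_to_seconds(hundredths):
    """Convert a window given in hundredths of an hour to seconds."""
    return hundredths * SECONDS_PER_HUNDREDTH_HOUR


# Message counts copied from the "-v 2" example quoted in the thread.
processed = 76306
rejected = {
    "message type": 0,
    "station id": 0,
    "valid time": 50115,
    "masking grid": 0,
    "masking polygon": 17244,
    "elevation": 0,
    "pb report type": 0,
    "input report type": 0,
    "instrument type": 0,
    "zero observations": 505,
}
retained = processed - sum(rejected.values())

# Upper bound on observations: retained messages * variables * levels.
n_variables = 9
n_levels = 1
max_obs = retained * n_variables * n_levels
obs_written = 75771
obs_dropped = max_obs - obs_written

# Message count to compare with a tool that has no zero-observation check.
editbufr_equivalent = retained + rejected["zero observations"]

print(hundredths_of_hour_to_seconds(50))  # 1800
print(retained)                           # 8442
print(obs_dropped)                        # 207
print(editbufr_equivalent)                # 8947
```

The first line confirms that a window of 50 hundredths of an hour matches
an obs_window of -1800 to 1800 seconds.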

------------------------------------------------
Subject: time window discrepancies
From: John Halley Gotway
Time: Mon Oct 23 12:55:33 2017

Perry,

OK, let's resolve any more questions about this in person.

Thanks,
John

------------------------------------------------

