[Met_help] [rt.rap.ucar.edu #46189] History for Masking in MET

RAL HelpDesk {for John Halley Gotway} met_help at ucar.edu
Thu Apr 21 13:15:30 MDT 2011


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Hello,

Our customer is asking for clarification on how MET does its masking. I 
have generated ncdf files for the masks used in MET using US census 
data. For a specific WRF domain 2, I retain grid points and thus can 
compute statistics for Washington DC (shown in the attachments); 
however, for the coarser domain 1, the mask that MET generates does not 
retain any grid points. Can you please clarify exactly how MET computes 
its masks? I surmise that MET retains grid points only within the 
polygon defined by the census data. How it matches obs to these 
locations is another question, given the chosen interpolation technique.

Ultimately, the customer wants to ensure that only observations within a 
certain state are actually used for statistics for that state. 
Inspection of 3 randomly chosen point stat files (contents of one are 
attached) shows that: 1) there are (always) duplicate entries (both 
ADPSFC) for 'DC' obs; 2) the station retained (apparently KDCA) is 
actually in VA, despite what the attached station listing states; and 
3) there is a third KDCA observation with a slightly different pressure 
and observed temperature.

Can you please comment on the above three items?

I should note that these files were generated using MET v2 and before 
the inverted-stack (ordering) problem was identified.

In general, is it possible that MET retains multiple copies of the same 
observation?

The polygon based on the census data may have included the lat/lon 
location of KDCA. Is that why it could have been used or are there other 
criteria? As you can see, I'm using DW_MEAN weighting.

Any clarification would be very helpful.

Thanks.

John Henderson





----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Re: [rt.rap.ucar.edu #46189] Masking in MET
From: John Halley Gotway
Time: Mon Apr 18 11:09:45 2011

John,

It sounds like you're running Point-Stat using a masking polyline.
When you say: "I have generated ncdf files for the masks used in MET
using US census data", I assume that means you've generated a
NetCDF mask using Gen-Poly-Mask or some other method for defining your
masks for MET.  Since you're running Point-Stat, I think there are a
few points to make...

(1) First, regarding how the masks are defined.  Point-Stat can either
read a lat/lon polyline mask directly or read the NetCDF output of
Gen-Poly-Mask.  In the latter case, it simply reads the
masking information from the NetCDF file.  In the former case, it
reads in the lat/lon polyline.  Then it loops through each grid point
in your domain, retrieves the lat/lon value for it, and checks
to see if that lat/lon falls within the lat/lon polyline.  Grid points
falling inside the polyline are assigned a value of 1 and points
outside are assigned a value of 0.
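
To make that concrete, here is a rough Python sketch of that procedure
(the grid_to_latlon() helper and the ray-casting test are illustrative
stand-ins for what Gen-Poly-Mask does, not its actual code):

    # Illustrative ray-casting test: is (lat, lon) inside the lat/lon polygon?
    def point_in_polygon(lat, lon, poly):
        inside = False
        n = len(poly)
        for i in range(n):
            lat1, lon1 = poly[i]
            lat2, lon2 = poly[(i + 1) % n]
            # Toggle on each edge that an eastward ray from (lat, lon) crosses.
            if (lat1 > lat) != (lat2 > lat):
                lon_cross = lon1 + (lat - lat1) / (lat2 - lat1) * (lon2 - lon1)
                if lon < lon_cross:
                    inside = not inside
        return inside

    # Build the 0/1 mask: 1 for grid points inside the polyline, 0 outside.
    def build_mask(nx, ny, grid_to_latlon, poly):
        mask = [[0] * nx for _ in range(ny)]
        for y in range(ny):
            for x in range(nx):
                lat, lon = grid_to_latlon(x, y)  # grid (x, y) -> lat/lon
                if point_in_polygon(lat, lon, poly):
                    mask[y][x] = 1
        return mask

This is also why a small region like DC can produce an empty mask on a
coarse domain: if no grid point center falls inside the polyline, every
point in the mask is 0.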

(2) Second, regarding how masks are applied to point observations in
Point-Stat.  Point-Stat processes the observations one by one.  For
each, it converts the observation's lat/lon value to an x/y
value in the grid.  Then it checks the closest grid point in the mask
(using the nearest integer function, nint(x) and nint(y)).  If the
mask has a zero value at that grid point, the observation value
is not used.
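
In rough pseudocode (latlon_to_xy() is a hypothetical stand-in for the
grid projection; MET does the equivalent conversion internally):

    # Screen one point observation against the 0/1 mask.
    def obs_in_mask(obs_lat, obs_lon, mask, latlon_to_xy):
        x, y = latlon_to_xy(obs_lat, obs_lon)  # real-valued grid coordinates
        ix, iy = int(round(x)), int(round(y))  # nearest grid point, like nint()
        ny, nx = len(mask), len(mask[0])
        if not (0 <= ix < nx and 0 <= iy < ny):
            return False                       # observation falls off the grid
        return mask[iy][ix] == 1               # keep only where the mask is 1

So a station like KDCA is kept or dropped based on the mask value at its
nearest grid point, not on whether the station itself lies inside the
polyline.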

(3) Third, there are other factors which affect which observations
are/are not used by Point-Stat.  Most notably, how you define the time
window has a big impact on which observations are actually
used.  Suppose that the forecast you're evaluating is valid at time T.
Point-Stat uses the beg_ds and end_ds settings in the config file to
define a matching time window as (T+beg_ds, T+end_ds).  All
observations whose valid time falls within that window are used.  I'm
guessing that where you see multiple observations at the same station,
there are multiple reports falling within that time window.
 Unfortunately in METv2.0, in the MPR line type, the observation time
columns just show how the matching time window was defined, so their
values stay constant for all the MPR lines.  In METv3.0, we've
updated that logic so that the actual valid time of that observation
is written out in those columns.  That would give you more information
about the timing of the observations.
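
The test itself is simple.  A sketch, with beg_ds and end_ds given in
seconds relative to the forecast valid time:

    from datetime import datetime, timedelta

    # Keep an observation only if its valid time is in [T+beg_ds, T+end_ds].
    def obs_in_time_window(obs_valid, fcst_valid, beg_ds, end_ds):
        return (fcst_valid + timedelta(seconds=beg_ds) <= obs_valid
                <= fcst_valid + timedelta(seconds=end_ds))

    # Example: a +/- 30 minute window around a 12 UTC forecast valid time.
    T = datetime(2011, 4, 15, 12, 0)
    print(obs_in_time_window(datetime(2011, 4, 15, 12, 20), T, -1800, 1800))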

When multiple observations occur at the same location in your time
window, Point-Stat simply uses all of them as if they were independent
observations.  I've proposed some logic to our advisory board
that would provide some control over what to do in that situation so
that only one observation value would be used - like, take their
average or only use the observation which is closest in time to
the forecast.  But they have decided not to implement any support for
that.
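
For illustration only, here is what those two options might look like as
a post-processing step on your own matched pairs (hypothetical code, not
anything MET currently does):

    # Reduce duplicate reports at one station to a single value.
    # reports is a list of (obs_valid_time, obs_value) tuples.
    def reduce_duplicates(reports, fcst_valid, method="nearest"):
        if method == "average":
            return sum(value for _, value in reports) / len(reports)
        # "nearest": keep the report closest in time to the forecast.
        nearest = min(reports,
                      key=lambda r: abs((r[0] - fcst_valid).total_seconds()))
        return nearest[1]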

Now I don't know that I've actually answered any/all of your
questions.  But hopefully that helps.

If you have specific questions about your data, feel free to send me
some sample files to run through MET, and I can help you track down
what's going on.

Thanks,
John Halley Gotway
met_help at ucar.edu


------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #46189] Masking in MET
From: jhenders at aer.com
Time: Thu Apr 21 12:32:22 2011

Hello John,

My apologies for the delay in responding.

You have answered all the questions that I have for now. I will forward
your email on to our customer and discuss the situation with them. For
their needs, I think they prefer a mask that is based on the observation
location and not the location of the closest model grid point, as is
implemented in MET. Of course, the current method in MET is perfectly
defensible.

Thanks for your detailed and prompt response.

Regards,

John


------------------------------------------------

