[Met_help] [rt.rap.ucar.edu #90685] History for Issue with using MET Point_Stat with GDAS files after 2019061200

John Halley Gotway via RT met_help at ucar.edu
Mon Jul 22 11:05:02 MDT 2019


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

To Whom It May Concern:

Hi! I have been using MET to verify my WRF forecasts, specifically the PB2NC and POINT_STAT commands. I have been verifying the WRF forecasts against operational PREPBUFR GDAS files, which had been working well until the 2019061206 GDAS file, which doesn't seem to produce any matches for my POINT_STAT commands, while everything before and including 2019061200 works.

I also tried GDAS files after 2019061206, such as 2019061212 and 2019061400, and the same problem exists. I wonder if this has something to do with the recent GFS update? If someone could please help me address this data issue, that would be greatly appreciated.

Thank you for your prompt attention and support!

Jeremy



----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Issue with using MET Point_Stat with GDAS files after 2019061200
From: John Halley Gotway
Time: Tue Jun 18 16:50:34 2019

Jeremy,

I see you have a question about running the pb2nc and point_stat tools.
That's interesting that you found your verification logic broke on
2019061206.  To check the behavior, I retrieved a PREPBUFR file before
and after that time:

BEFORE: wget ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gdas.20190611/gdas.t12z.prepbufr.nr
AFTER:  wget ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gdas.20190612/12/gdas.t12z.prepbufr.nr

I ran them both through pb2nc, extracting ADPUPA and ADPSFC message
types.  pb2nc retained/derived 308,034 observations on the 11th but
only 175,427 on the 12th.

One obvious difference I did note is that they've changed the directory
structure on the FTP site.  On 20190612, they started organizing the
data into subdirectories by hour (00, 06, 12, 18).  If you're running a
script to pull PREPBUFR files from the FTP site, please make sure it's
finding them in the right subdirectory.
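
For example, here's a small sketch of building the new hour-aware path.
The base URL and file-name pattern follow the links above; the date and
hour variables are placeholders to fill in:

```shell
# Build the post-20190612 FTP path, which now includes an hour
# subdirectory (00/06/12/18) between the date directory and the file.
ymd=20190612   # valid date (YYYYMMDD)
hh=12          # cycle hour: 00, 06, 12, or 18
url="ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gdas.${ymd}/${hh}/gdas.t${hh}z.prepbufr.nr"
echo "$url"
# wget "$url"
```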

Doing some more digging I found...
The number of ADPUPA observations remains unchanged... 136,114 for
both days.
The number of ADPSFC (surface) observations is dramatically reduced...
from 171,920 down to 34,247!

So next I reran pb2nc with quality_mark_thresh = 15; set, to retain all
observations regardless of the quality flag.  That results in 332,235
and 337,818 observations on the 11th and 12th, respectively.
The ADPUPA obs are very similar: 150,057 vs 155,787
The ADPSFC obs are also similar: 182,178 vs 182,031

So the big difference is in the quality mark values.
On the 11th...
182,178 observations = 171,920 with QM=2, 10 with QM=9, and 10,248
with other QMs.
On the 12th...
182,031 observations = 34,247 with QM=2, 139,047 with QM=9, and 8,737
with other QMs.

I'm guessing that with the GFS upgrade, they changed their GDAS
assimilation logic back to setting the quality marker = 9 for surface
obs to avoid assimilating them.

So I'd recommend 2 things:
(1) In your PB2NC config file, set quality_mark_thresh = 9;
(2) In your Point-Stat config file, when verifying against surface obs,
set obs_quality = [0,1,2,9]; to use surface observations with any of
these quality marks.
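
For reference, those two settings would look roughly like this in the
config files (a minimal fragment; everything else stays at your current
values, and note that some MET versions expect the quality values as
quoted strings):

```
// PB2NCConfig: retain obs with quality mark <= 9 instead of the default.
quality_mark_thresh = 9;

// PointStatConfig: when verifying surface obs, match only these quality
// marks (some MET versions write them as strings: [ "0", "1", "2", "9" ]).
obs_quality = [ 0, 1, 2, 9 ];
```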

Hope that helps get you going.

Thanks,
John Halley Gotway

On Tue, Jun 18, 2019 at 4:06 PM Randy Bullock via RT
<met_help at ucar.edu>
wrote:

>
> Tue Jun 18 16:05:37 2019: Request 90685 was acted upon.
> Transaction: Given to johnhg (John Halley Gotway) by bullock
>        Queue: met_help
>      Subject: Issue with using MET Point_Stat with GDAS files after
> 2019061200
>        Owner: johnhg
>   Requestors: jdberman at albany.edu
>       Status: new
>  Ticket <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
>
>
> This transaction appears to have no content
>

------------------------------------------------
Subject: Issue with using MET Point_Stat with GDAS files after 2019061200
From: Berman, Jeremy D
Time: Fri Jun 21 00:27:56 2019

Hi John,


Thank you so much for your help! I was bogged down trying to solve this
for a few days - I'm impressed you did it in two hours!

I made the changes and it worked for a case I did for 2019061800. I
should have mentioned I was looking at 2-meter Temperature and 10-meter
Wind Speed, both of which use ADPSFC; based on your observation counts,
that explains why point_stat could not find any matched pairs.

I have another question if you don't mind me asking: if I want to
compute verification for a different vertical level, such as 100-meter
Wind Speed, can MET do that for a WRF forecast file, even if the
forecast file does not have a 100-meter level? Would MET be able to do
a vertical interpolation in order to assess that level (or any level
above ground)?

Additionally: if I wanted to do verification with point_stat for an
entire vertical profile (say, from the surface to 1000 meters above
ground level), could MET do that as well? I know the example in the MET
tutorial verifies Temperature over the 850-750 hPa range, but I was
wondering if this could also work for a range of heights above ground?


Thank you for all your help!


Best,

Jeremy



------------------------------------------------
Subject: Issue with using MET Point_Stat with GDAS files after 2019061200
From: John Halley Gotway
Time: Fri Jun 21 10:17:19 2019

Jeremy,

Unfortunately, no, MET does not include logic for handling WRF's hybrid
vertical coordinate or interpolating to pressure or height levels.  In
addition, when using raw WRFOUT files, MET does not handle winds well
since they're defined on the staggered grid.

For these two reasons, we recommend that users run their WRFOUT files
through the Unified Post Processor.  It destaggers the winds and
interpolates to pressure and height levels.  Here's a link for info
about UPP:
https://dtcenter.org/community-code/unified-post-processor-upp

UPP writes GRIB1 or GRIB2 output files, and the MET tools handle those
well.
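
As a sketch, the surface fields discussed earlier in this thread
(2-meter temperature and 10-meter wind) could then be requested from
UPP GRIB output in a Point-Stat config fragment like this (the GRIB
abbreviations and level strings should be verified against your own
files, e.g. with wgrib):

```
// Point-Stat fcst dictionary sketch for UPP GRIB output.
fcst = {
   field = [
      { name = "TMP";  level = [ "Z2"  ]; },  // 2-m temperature
      { name = "WIND"; level = [ "Z10" ]; }   // 10-m wind speed
   ];
};
```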

Thanks,
John


------------------------------------------------
Subject: Issue with using MET Point_Stat with GDAS files after 2019061200
From: Berman, Jeremy D
Time: Fri Jun 28 15:48:58 2019

Hi John,


Thanks for your reply! The UPP information was very helpful, and I've
used it successfully on my WRF runs.

I've encountered another minor issue, if you could provide some
insight. I've been trying to compare GFS forecasts vs my WRF forecasts
over the same verification region, such as only over my WRF domain
(e.g., extending over the Pacific and California). For reference, my
WRF is a 12-km single-domain run.

I've noticed that after doing the steps below, I am verifying over my
intended region (the WRF domain), but I'm getting slightly more
matched pairs for my WRF run than for the GFS (54 vs 52 obs). Do you
have any idea why this could be happening and how I could get the
number of matched observations to be the same?

Here is my procedure:

I use the Gen-Vx-Mask tool with -type grid on a UPP'ed WRF forecast
file to create a mask saved as a .nc file:

gen_vx_mask -type grid WRFPRS_d01_f001 WRFPRS_d01_f001 WRF_domain.nc

Then I modified my pb2nc config file ("PB2NCConfig_PREPBUFR_GDAS") by
setting: poly = "WRF_domain.nc"

Then I run the "pb2nc" command on the PREPBUFR GDAS data to select
observations over my masked area:

pb2nc PREPBUFR_GDAS_f000.nr PREPBUFR_GDAS_f000_pb.nc PB2NCConfig_PREPBUFR_GDAS -v 2

This creates a file called "PREPBUFR_GDAS_f000_pb.nc", which I checked
contains observations over my WRF mask.

Then I created point_stat config files for both GFS and WRF which have
the same default settings but use grid = [ "FULL" ]; poly = []; and
interpolation methods NEAREST and BILIN. In my understanding, this is
fine since pb2nc should have only selected observations within the
masked region.

I ran the point_stat command for both GFS and WRF forecast data, which
worked fine, but I noticed the discrepancy in the total number of
matched pairs.

I also tried setting grid = []; poly = [ "WRF_domain.nc" ]; in the
point_stat config files, but the same observation discrepancy
occurred.
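
That is, the mask dictionary in both Point-Stat configs was set roughly
like this (a minimal fragment; other entries in the dictionary left at
their defaults):

```
// Point-Stat mask dictionary: use the gen_vx_mask output as the
// verification region for both the WRF and GFS runs.
mask = {
   grid = [];
   poly = [ "WRF_domain.nc" ];
};
```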


Am I maybe using the mask incorrectly? Or could it be that, since GFS
and WRF may have different horizontal resolutions, there could be
fewer matched pairs?

I should mention I also tried creating a mask.poly file with just a
boxed region (e.g., Northern California) and included poly =
"mask.poly" in the PB2NC config file. After doing this I got the same
number of observations for GFS and WRF. So I'm not sure why using the
mask.poly worked fine, but using the WRF domain as a mask did not.

Thanks for your time and thoughts! If you need to see any example
config files or WRF output, let me know and I can provide that.

Thanks!

Jeremy


------------------------------------------------
Subject: Issue with using MET Point_Stat with GDAS files after 2019061200
From: Berman, Jeremy D
Time: Fri Jun 28 15:58:22 2019

Hi John,


I realized it's probably most helpful if I provide the files and mask
I was using, so you can see what I was doing. Please let me know if
you can see these files. (I didn't include the WRF output file in this
email since it is large (>45 MB), but I can send it another way if you
need to see it.)


Jeremy




------------------------------------------------
Subject: Issue with using MET Point_Stat with GDAS files after 2019061200
From: John Halley Gotway
Time: Mon Jul 01 11:27:23 2019

Jeremy,

I see you have a question about why you get a slightly different
number of matched pairs when verifying your WRF output and GFS output
over the same spatial area.  Thanks for describing your processing
logic.

So you're getting 54 matched pairs for WRF but only 52 for GFS.  I'm
trying to reconcile in my mind why that would be the case.  Differences
like this tend to occur along the edge of the domain.  But I would have
guessed the result would go the other way around... two fewer pairs for
the limited-area WRF run and two more for the global GFS.

I'd suggest digging a little deeper, and here's how...

(1) Rerun Point-Stat at verbosity level 3 by specifying "-v 3" on the
command line.  For each verification task, that'll dump out counts for
why
obs were or were not used.  Diffing the logs between WRF and GFS, you
might
find a difference which could point to an explanation.
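If it helps, here's a minimal sketch in Python of that log-diffing step. The helper names and the sample log strings are hypothetical, but the "Rejected: <reason> = <count>" line format matches what Point-Stat prints at verbosity level 3 (e.g. "DEBUG 3: Rejected: off the grid   = 1"):

```python
import re

# Matches Point-Stat "-v 3" counter lines such as:
#   "DEBUG 3: Rejected: off the grid   = 1"
REJECT_RE = re.compile(r"Rejected: (.+?)\s*=\s*(\d+)")

def reject_counts(log_text):
    """Map each rejection reason to its count."""
    return {m.group(1).strip(): int(m.group(2))
            for m in REJECT_RE.finditer(log_text)}

def diff_counts(wrf_log, gfs_log):
    """List reasons whose counts differ between the two logs."""
    wrf, gfs = reject_counts(wrf_log), reject_counts(gfs_log)
    return {r: (wrf.get(r, 0), gfs.get(r, 0))
            for r in sorted(set(wrf) | set(gfs))
            if wrf.get(r, 0) != gfs.get(r, 0)}

# Hypothetical excerpts from two point_stat logs:
wrf_log = """DEBUG 3: Rejected: off the grid   = 1
DEBUG 3: Rejected: bad fcst value = 1"""
gfs_log = """DEBUG 3: Rejected: off the grid   = 0
DEBUG 3: Rejected: bad fcst value = 0"""

print(diff_counts(wrf_log, gfs_log))
```

Any reason that shows up in the output is a place where the two runs disagree, which narrows down which rejection criterion to investigate.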

(2) I see from your Point-Stat config file that you already have the
MPR
output line type turned on.  You could look at that output to
determine
which stations were included for WRF but not GFS.  That might yield
another
clue.

Do those methods shed any light on the differences?

Also, I plotted the WRF domain you sent and see that it's over the
eastern
pacific.  Be aware that the ADPSFC obs are only over land, a small
fraction
of your domain.  Surface water points are encoded as "SFCSHP" in
PREPBUFR.
So I'd suggest setting:

message_type   = [ "ADPSFC", "SFCSHP" ];

That'll verify them as two separate tasks.  If you want to process
them in
one big group, just use:

message_type   = [ "ONLYSF" ]; // That's defined in
message_type_group_map
as the union of ADPSFC and SFCSHP

Thanks,
John


On Fri, Jun 28, 2019 at 3:58 PM Berman, Jeremy D via RT
<met_help at ucar.edu>
wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
>
> Hi John,
>
>
> I realized it's probably most helpful to you if I provided the files
and
> mask that I was using, so you can see what I was doing. Please let
me know
> if you can see these files (I didn't include in this email the WRF
output
> file since it is large (>45 mb) but I can via another way if you
need to
> see it).
>
>
> Jeremy
>
>
>
> ________________________________
> From: Berman, Jeremy D
> Sent: Friday, June 28, 2019 5:48:53 PM
> To: met_help at ucar.edu
> Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
Point_Stat
> with GDAS files after 2019061200
>
>
> Hi John,
>
>
> Thanks for your reply! The UPP information was very helpful, and
I've used
> it successfully on my WRF runs.
>
>
> I've encountered another minor issue, if you could provide some
insight.
> I've been trying to compare GFS forecasts vs my WRF forecasts over
the same
> verification region, such as only over my WRF domain (e.g.,
extending over
> the Pacific and California). For reference, my WRF is a 12km single
domain
> run.
>
>
> I've noticed that after doing the steps below, I am verifying over
my
> intended region (the WRF domain), but I'm getting slightly more
matched
> pairs for my WRF run than the GFS (54 vs 52 obs). Do you have any
idea of
> why this could be happening and how I could get the number of
matched
> observations to be the same?
>
>
> Here is my procedure:
>
>
> I use the "Gen-Poly-Mask Tool" and the -type grid on a UPP'ed WRF
forecast
> file to create a mask saved as a .nc file:
>
>
> gen_vx_mask -type grid WRFPRS_d01_f001 WRFPRS_d01_f001 WRF_domain.nc
>
>
> Then, I modified my pb2nc config file ("PB2NCConfig_PREPBUFR_GDAS")
by
> setting: poly = "WRF_domain.nc"
>
>
> Then I run the "pb2nc" command on PREPBUFR_GDAS data to select
> observations over my masked area:
>
>
> pb2nc PREPBUFR_GDAS_f000.nr PREPBUFR_GDAS_f000_pb.nc
> PB2NCConfig_PREPBUFR_GDAS -v 2
>
>
> This creates a file called "PREPBUFR_GDAS_f000_pb.nc" which I
checked
> contains observations over my WRF mask.
>
>
> Then I created point_stat config files for both GFS and WRF which
have the
> same default settings but have "grid = [ "FULL" ]; poly = [];" and
the
> interpolation method for NEAREST and BILIN. In my understanding,
this is
> fine since the pb2nc should have only selected observations within
the
> masked region.
>
>
> I ran the point_stat command for both GFS and WRF forecast data,
which
> worked fine, but I noticed the discrepancy in the total number of
matched
> pairs.
>
>
> I also tried having the point_stat config files having " grid = [];
poly =
> ["WRF_domain.nc"] " , but the same observation discrepancy occurred.
>
>
> Am I maybe using the mask incorrectly? Or could it be that since the
GFS
> and WRF may have different horizontal resolutions, that there could
be less
> matched pairs?
>
>
> I should mention I also tried creating a mask.poly file with just a
boxed
> region (e.g., Northern California) and included poly = "mask.poly"
in the
> PB2NC Config File. After doing this I got the same number of
observations
> for GFS and WRF. So I'm not sure why using the mask.poly worked
fine, but
> using the WRF domain as a mask did not.
>
>
>
> Thanks for your time and thoughts! If you need to see any example
config
> files or WRF output, let me know and I can provide that.
>
>
> Thanks!
>
> Jeremy
>
>
>
>
>
>
> ________________________________
> From: John Halley Gotway via RT <met_help at ucar.edu>
> Sent: Friday, June 21, 2019 12:17:19 PM
> To: Berman, Jeremy D
> Cc: harrold at ucar.edu; jwolff at ucar.edu
> Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
Point_Stat
> with GDAS files after 2019061200
>
> Jeremy,
>
> Unfortunately, no, MET does not include logic for handling WRF's
hybrid
> vertical coordinate and interpolating to pressure or height levels.
In
> addition, when using raw WRFOUT files, MET does not handle winds
well since
> they're defined on the staggered grid.
>
> For these two reasons, we recommend that users run their WRFOUT
files
> through the Unified Post Processor.  It destaggers the winds and
> interpolates to the pressure level and height levels.  Here's a link
for
> info about UPP:
> https://dtcenter.org/community-code/unified-post-processor-upp
>
> UPP writes GRIB1 or GRIB2 output files and the MET tools can handle
those
> well.
>
> Thanks,
> John
>
> On Fri, Jun 21, 2019 at 12:28 AM Berman, Jeremy D via RT <
> met_help at ucar.edu>
> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
> >
> > Hi John,
> >
> >
> > Thank you so much for your help! I was bogged down trying to solve
this
> > for a few days - I'm impressed you did it in two hours!
> >
> >
> > I made the changes and it worked for a case I did for 2019061800.
I
> should
> > have mentioned I was looking at 2-meter Temperature and 10-meter
Wind
> > Speed, both of which use ADPSFC, which based on your observation
counts
> > explains why point_stat could not get any matched pairs.
> >
> >
> > I have another question if you don't mind me asking: if I want to
compute
> > verification for a different vertical level, such as 100-meter
Wind
> Speed,
> > can MET do that for a WRF forecast file, even if the forecast file
does
> not
> > have a 100-meter level? Would MET be able to do a vertical
interpolation
> in
> > order to assess for that level (or any vertical above ground
level)?
> >
> >
> > Additionally: if I wanted to do verification with point_stat for
an
> entire
> > vertical profile (let's say from the surface to 1000 meters above
ground
> > level) could MET do that as well? I know in the example MET
tutorial
> there
> > is a range of Temperature from 850-750hPa, but I was wondering if
this
> > could work for a range of vertical meters above ground?
> >
> >
> > Thank you for all your help!
> >
> >
> > Best,
> >
> > Jeremy
> >
> > ________________________________
> > From: John Halley Gotway via RT <met_help at ucar.edu>
> > Sent: Tuesday, June 18, 2019 6:50:34 PM
> > To: Berman, Jeremy D
> > Cc: harrold at ucar.edu; jwolff at ucar.edu
> > Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
Point_Stat
> > with GDAS files after 2019061200
> >
> > Jeremy,
> >
> > I see you have a question about running the pb2nc and point_stat
tools.
> > That's interesting that you found that your verification logic
broke on
> > 2019061206.  To check the behavior, I retrieved a PREPBUFR file
before
> and
> > after that time:
> >
> > BEFORE: wget
> >
> >
>
ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gdas.20190611/gdas.t12z.prepbufr.nr
> > AFTER:   wget
> >
> >
>
ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gdas.20190612/12/gdas.t12z.prepbufr.nr
> >
> > I ran them both through pb2nc, extracting ADPUPA and ADPSFC
message
> types.
> > pb2nc retained/derived 308,034 observations on the 11th but only
175,427
> on
> > the 12th.
> >
> > One obvious difference I did note is that they've changed the
directory
> > structure on the ftp site.  On 20190612, they started organizing
the data
> > into subdirectories for the hour (00, 06, 12, 18).  If you're
running a
> > script to pull prepbufr files from the FTP site, please make sure
it's
> > finding them in the right sub-directory.
> >
> > Doing some more digging I found...
> > The number of ADPUPA observations remains unchanged... 136,114 for
both
> > days.
> > The number of ADPSFC (surface) observations is dramatically
reduced...
> from
> > 171,920 down to 34,247!
> >
> > So next I reran pb2nc but setting *quality_mark_thresh = 15; *to
retain
> all
> > observations regardless of the quality flag.
> > And that results in 332,235 and 337,818 observations on the 11th
and
> 12th,
> > respectively.
> > The ADPUPA obs are very similar: 150,057 vs 155,787
> > The ADPSFC obs are also similar: 182,178 vs 182,031
> >
> > So the big difference is in the quality mark values.
> > On the 11th...
> > 182,178 observations = 171,920 with QM=2, 10 with QM=9, and 10,248
with
> > other QM's.
> > On the 12th...
> > 182,031 observations = 34,247 with QM=2, 139,047 with QM=9, and
8,737
> with
> > other QM's.
> >
> > I'm guessing that with the GFS upgrade, they changed their GDAS
> > assimilation logic back to setting the quality marker = 9 for
surface obs
> > to avoid assimilating them.
> >
> > So I'd recommend 2 things:
> > (1) In your PB2NC config file, set *quality_mark_thresh = 9;*
> > (2) In your Point-Stat config file, when verifying against surface
> > obs, set *obs_quality
> > = [0,1,2,9]; *to use surface observations with any of these
quality
> marks.
> >
> > Hope that helps get you going.
> >
> > Thanks,
> > John Halley Gotway
> >
> > On Tue, Jun 18, 2019 at 4:06 PM Randy Bullock via RT
<met_help at ucar.edu>
> > wrote:
> >
> > >
> > > Tue Jun 18 16:05:37 2019: Request 90685 was acted upon.
> > > Transaction: Given to johnhg (John Halley Gotway) by bullock
> > >        Queue: met_help
> > >      Subject: Issue with using MET Point_Stat with GDAS files
after
> > > 2019061200
> > >        Owner: johnhg
> > >   Requestors: jdberman at albany.edu
> > >       Status: new
> > >  Ticket <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685
> >
> > >
> > >
> > > This transaction appears to have no content
> > >
> >
> >
> >
>
>
>

------------------------------------------------
Subject: Issue with using MET Point_Stat with GDAS files after 2019061200
From: Berman, Jeremy D
Time: Wed Jul 03 15:47:12 2019

Hi John,


Thanks for your input! I did as you suggested and found that the
missing points between my WRF and GFS were all along the northern or
eastern boundary of my WRF domain.


When doing the "diff" command between the Point_Stat files for WRF and
GFS with -v 3, I noticed the lines for the WRF file:


"Processing WIND/Z10 versus WIND/Z10, for observation type ADPSFC,
over region FULL, for interpolation method BILIN(4), using 52 pairs"

"DEBUG 3: Rejected: off the grid   = 1"

"Rejected: bad fcst value = 1"


While for GFS:


"Processing WIND/Z10 versus WIND/Z10, for observation type ADPSFC,
over region FULL, for interpolation method BILIN(4), using 54 pairs"

"DEBUG 3: Rejected: off the grid   = 0"

"Rejected: bad fcst value = 0"


I looked into the MPR files and found two stations that were
different. Those two stations were near the northern or eastern
boundary of my WRF domain, and one of them was actually outside the
WRF domain.


So my thinking is that the discrepancy occurs because (1) an
observation is close to a boundary and (2) the bilinear interpolation
approach can't find all the neighbor values because it is too close to
a boundary. I tested these same MET commands, but for the NEAREST
interpolation method: there still was 1 "rejected: off the grid", but
0 "rejected: bad fcst value". So I think this supports my hypothesis.
I think the reason the GFS has more observation matches is because it
has a global domain, so there is no boundary issue for observations,
like there is for my WRF simulation.


I think to fix this issue, and to get the same number of WRF to GFS
observation matches, I'll have to only use a polyline described by
"mask.poly", and I'll have to write a script that checks that the poly
line is not too close to the WRF domain (e.g., within 5 grid points).
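A minimal sketch of that boundary check, assuming the mask.poly vertices have already been mapped to fractional grid coordinates (x, y) on an nx-by-ny domain. The function names and example vertices are placeholders, not part of MET:

```python
def far_from_boundary(x, y, nx, ny, buffer=5):
    """True if grid point (x, y) is at least `buffer` grid points
    away from every edge of an nx-by-ny domain."""
    return (buffer <= x <= nx - 1 - buffer and
            buffer <= y <= ny - 1 - buffer)

def check_poly(vertices_xy, nx, ny, buffer=5):
    """Return the vertices that fall within `buffer` grid points of
    the domain boundary (the ones to move inward or drop)."""
    return [(x, y) for (x, y) in vertices_xy
            if not far_from_boundary(x, y, nx, ny, buffer)]

# Example: on a 400x300 grid, one vertex sits only 2 points from an edge.
bad = check_poly([(2.0, 150.0), (200.0, 150.0)], nx=400, ny=300)
print(bad)  # the (2.0, 150.0) vertex is flagged
```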



I have another, related, question, if you don't mind me asking :)


1. Is it possible to select specific METAR locations to run
PB2NC/Point_Stat on, instead of using all the observations within a
poly mask? For example, if I am looking at Northern California, but I
only want to validate averaged over the stations: KAAT, KACV, KBFL,
KBLU, KFAT, KPRB, KSAC for CNT/CTS files? I know the individual
stations are listed in MPR, but I wondered how I could do this for the
CNT/CTS calculations.


2. If I have multiple wind speed vertical levels from WRF run under
UPP such as Z10, Z30, Z80, Z100, Z305, which observation type would be
most appropriate to validate those higher altitude levels? I had used
ADPSFC, and it actually got matches for those higher altitude levels
of Z100 and Z305, but the observed wind speeds don't seem believable,
so perhaps I am using that incorrectly. Is there a more appropriate
observation type you'd suggest to verify against? Maybe profilers?



Thank you so much for all your help through this process. I look
forward to hearing your thoughts on these other questions.


Have a Happy Fourth of July!!


Jeremy




________________________________
From: John Halley Gotway via RT <met_help at ucar.edu>
Sent: Monday, July 1, 2019 1:27:23 PM
To: Berman, Jeremy D
Cc: harrold at ucar.edu; jwolff at ucar.edu
Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET Point_Stat
with GDAS files after 2019061200

Jeremy,

I see you have a question about why you get a slightly different number of
matched pairs when verifying your WRF output and GFS output over the
same
spatial area.  Thanks for describing your processing logic.

So you're getting 54 matched pairs for WRF and only 52 for GFS.  I'm
trying
to reconcile in my mind why that would be the case.  Differences like
this
tend to occur along the edge of the domain.  But I would have guessed the
result would be the other way around... you'd get 2 fewer pairs for the
limited area of the WRF run and two more for the global GFS.

I'd suggest digging a little deeper, and here's how...

(1) Rerun Point-Stat at verbosity level 3 by specifying "-v 3" on the
command line.  For each verification task, that'll dump out counts for
why
obs were or were not used.  Diffing the logs between WRF and GFS, you
might
find a difference which could point to an explanation.

(2) I see from your Point-Stat config file that you already have the
MPR
output line type turned on.  You could look at that output to
determine
which stations were included for WRF but not GFS.  That might yield
another
clue.

Do those methods shed any light on the differences?

Also, I plotted the WRF domain you sent and see that it's over the
eastern
pacific.  Be aware that the ADPSFC obs are only over land, a small
fraction
of your domain.  Surface water points are encoded as "SFCSHP" in
PREPBUFR.
So I'd suggest setting:

message_type   = [ "ADPSFC", "SFCSHP" ];

That'll verify them as two separate tasks.  If you want to process
them in
one big group, just use:

message_type   = [ "ONLYSF" ]; // That's defined in
message_type_group_map
as the union of ADPSFC and SFCSHP

Thanks,
John


On Fri, Jun 28, 2019 at 3:58 PM Berman, Jeremy D via RT
<met_help at ucar.edu>
wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
>
> Hi John,
>
>
> I realized it's probably most helpful to you if I provided the files
and
> mask that I was using, so you can see what I was doing. Please let
me know
> if you can see these files (I didn't include in this email the WRF
output
> file since it is large (>45 mb) but I can via another way if you
need to
> see it).
>
>
> Jeremy
>
>
>
> ________________________________
> From: Berman, Jeremy D
> Sent: Friday, June 28, 2019 5:48:53 PM
> To: met_help at ucar.edu
> Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
Point_Stat
> with GDAS files after 2019061200
>
>
> Hi John,
>
>
> Thanks for your reply! The UPP information was very helpful, and
I've used
> it successfully on my WRF runs.
>
>
> I've encountered another minor issue, if you could provide some
insight.
> I've been trying to compare GFS forecasts vs my WRF forecasts over
the same
> verification region, such as only over my WRF domain (e.g.,
extending over
> the Pacific and California). For reference, my WRF is a 12km single
domain
> run.
>
>
> I've noticed that after doing the steps below, I am verifying over
my
> intended region (the WRF domain), but I'm getting slightly more
matched
> pairs for my WRF run than the GFS (54 vs 52 obs). Do you have any
idea of
> why this could be happening and how I could get the number of
matched
> observations to be the same?
>
>
> Here is my procedure:
>
>
> I use the "Gen-Poly-Mask Tool" and the -type grid on a UPP'ed WRF
forecast
> file to create a mask saved as a .nc file:
>
>
> gen_vx_mask -type grid WRFPRS_d01_f001 WRFPRS_d01_f001 WRF_domain.nc
>
>
> Then, I modified my pb2nc config file ("PB2NCConfig_PREPBUFR_GDAS")
by
> setting: poly = "WRF_domain.nc"
>
>
> Then I run the "pb2nc" command on PREPBUFR_GDAS data to select
> observations over my masked area:
>
>
> pb2nc PREPBUFR_GDAS_f000.nr PREPBUFR_GDAS_f000_pb.nc
> PB2NCConfig_PREPBUFR_GDAS -v 2
>
>
> This creates a file called "PREPBUFR_GDAS_f000_pb.nc" which I
checked
> contains observations over my WRF mask.
>
>
> Then I created point_stat config files for both GFS and WRF which
have the
> same default settings but have "grid = [ "FULL" ]; poly = [];" and
the
> interpolation method for NEAREST and BILIN. In my understanding,
this is
> fine since the pb2nc should have only selected observations within
the
> masked region.
>
>
> I ran the point_stat command for both GFS and WRF forecast data,
which
> worked fine, but I noticed the discrepancy in the total number of
matched
> pairs.
>
>
> I also tried having the point_stat config files having " grid = [];
poly =
> ["WRF_domain.nc"] " , but the same observation discrepancy occurred.
>
>
> Am I maybe using the mask incorrectly? Or could it be that since the
GFS
> and WRF may have different horizontal resolutions, that there could
be less
> matched pairs?
>
>
> I should mention I also tried creating a mask.poly file with just a
boxed
> region (e.g., Northern California) and included poly = "mask.poly"
in the
> PB2NC Config File. After doing this I got the same number of
observations
> for GFS and WRF. So I'm not sure why using the mask.poly worked
fine, but
> using the WRF domain as a mask did not.
>
>
>
> Thanks for your time and thoughts! If you need to see any example
config
> files or WRF output, let me know and I can provide that.
>
>
> Thanks!
>
> Jeremy
>
>
>
>
>
>
> ________________________________
> From: John Halley Gotway via RT <met_help at ucar.edu>
> Sent: Friday, June 21, 2019 12:17:19 PM
> To: Berman, Jeremy D
> Cc: harrold at ucar.edu; jwolff at ucar.edu
> Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
Point_Stat
> with GDAS files after 2019061200
>
> Jeremy,
>
> Unfortunately, no, MET does not include logic for handling WRF's
hybrid
> vertical coordinate and interpolating to pressure or height levels.
In
> addition, when using raw WRFOUT files, MET does not handle winds
well since
> they're defined on the staggered grid.
>
> For these two reasons, we recommend that users run their WRFOUT
files
> through the Unified Post Processor.  It destaggers the winds and
> interpolates to the pressure level and height levels.  Here's a link
for
> info about UPP:
> https://dtcenter.org/community-code/unified-post-processor-upp
>
> UPP writes GRIB1 or GRIB2 output files and the MET tools can handle
those
> well.
>
> Thanks,
> John
>
> On Fri, Jun 21, 2019 at 12:28 AM Berman, Jeremy D via RT <
> met_help at ucar.edu>
> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
> >
> > Hi John,
> >
> >
> > Thank you so much for your help! I was bogged down trying to solve
this
> > for a few days - I'm impressed you did it in two hours!
> >
> >
> > I made the changes and it worked for a case I did for 2019061800.
I
> should
> > have mentioned I was looking at 2-meter Temperature and 10-meter
Wind
> > Speed, both of which use ADPSFC, which based on your observation
counts
> > explains why point_stat could not get any matched pairs.
> >
> >
> > I have another question if you don't mind me asking: if I want to
compute
> > verification for a different vertical level, such as 100-meter
Wind
> Speed,
> > can MET do that for a WRF forecast file, even if the forecast file
does
> not
> > have a 100-meter level? Would MET be able to do a vertical
interpolation
> in
> > order to assess for that level (or any vertical above ground
level)?
> >
> >
> > Additionally: if I wanted to do verification with point_stat for
an
> entire
> > vertical profile (let's say from the surface to 1000 meters above
ground
> > level) could MET do that as well? I know in the example MET
tutorial
> there
> > is a range of Temperature from 850-750hPa, but I was wondering if
this
> > could work for a range of vertical meters above ground?
> >
> >
> > Thank you for all your help!
> >
> >
> > Best,
> >
> > Jeremy
> >
> > ________________________________
> > From: John Halley Gotway via RT <met_help at ucar.edu>
> > Sent: Tuesday, June 18, 2019 6:50:34 PM
> > To: Berman, Jeremy D
> > Cc: harrold at ucar.edu; jwolff at ucar.edu
> > Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
Point_Stat
> > with GDAS files after 2019061200
> >
> > Jeremy,
> >
> > I see you have a question about running the pb2nc and point_stat
tools.
> > That's interesting that you found that your verification logic
broke on
> > 2019061206.  To check the behavior, I retrieved a PREPBUFR file
before
> and
> > after that time:
> >
> > BEFORE: wget
> >
> >
>
ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gdas.20190611/gdas.t12z.prepbufr.nr
> > AFTER:   wget
> >
> >
>
ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gdas.20190612/12/gdas.t12z.prepbufr.nr
> >
> > I ran them both through pb2nc, extracting ADPUPA and ADPSFC
message
> types.
> > pb2nc retained/derived 308,034 observations on the 11th but only
175,427
> on
> > the 12th.
> >
> > One obvious difference I did note is that they've changed the
directory
> > structure on the ftp site.  On 20190612, they started organizing
the data
> > into subdirectories for the hour (00, 06, 12, 18).  If you're
running a
> > script to pull prepbufr files from the FTP site, please make sure
it's
> > finding them in the right sub-directory.
> >
> > Doing some more digging I found...
> > The number of ADPUPA observations remains unchanged... 136,114 for
both
> > days.
> > The number of ADPSFC (surface) observations is dramatically
reduced...
> from
> > 171,920 down to 34,247!
> >
> > So next I reran pb2nc but setting *quality_mark_thresh = 15; *to
retain
> all
> > observations regardless of the quality flag.
> > And that results in 332,235 and 337,818 observations on the 11th
and
> 12th,
> > respectively.
> > The ADPUPA obs are very similar: 150,057 vs 155,787
> > The ADPSFC obs are also similar: 182,178 vs 182,031
> >
> > So the big difference is in the quality mark values.
> > On the 11th...
> > 182,178 observations = 171,920 with QM=2, 10 with QM=9, and 10,248
with
> > other QM's.
> > On the 12th...
> > 182,031 observations = 34,247 with QM=2, 139,047 with QM=9, and
8,737
> with
> > other QM's.
> >
> > I'm guessing that with the GFS upgrade, they changed their GDAS
> > assimilation logic back to setting the quality marker = 9 for
surface obs
> > to avoid assimilating them.
> >
> > So I'd recommend 2 things:
> > (1) In your PB2NC config file, set *quality_mark_thresh = 9;*
> > (2) In your Point-Stat config file, when verifying against surface
> > obs, set *obs_quality
> > = [0,1,2,9]; *to use surface observations with any of these
quality
> marks.
> >
> > Hope that helps get you going.
> >
> > Thanks,
> > John Halley Gotway
> >
> > On Tue, Jun 18, 2019 at 4:06 PM Randy Bullock via RT
<met_help at ucar.edu>
> > wrote:
> >
> > >
> > > Tue Jun 18 16:05:37 2019: Request 90685 was acted upon.
> > > Transaction: Given to johnhg (John Halley Gotway) by bullock
> > >        Queue: met_help
> > >      Subject: Issue with using MET Point_Stat with GDAS files
after
> > > 2019061200
> > >        Owner: johnhg
> > >   Requestors: jdberman at albany.edu
> > >       Status: new
> > >  Ticket <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685
> >
> > >
> > >
> > > This transaction appears to have no content
> > >
> >
> >
> >
>
>
>


------------------------------------------------
Subject: Issue with using MET Point_Stat with GDAS files after 2019061200
From: John Halley Gotway
Time: Thu Jul 04 07:32:32 2019

Jeremy,

I’m glad you’ve been able to make sense of the discrepancies in the
number
of pairs.  And I’m glad to hear that it’s GFS with more pairs than
WRF.
Your earlier email saying the opposite had me confused!

“but I'm getting slightly more matched pairs for my WRF run than the
GFS
(54 vs 52 obs).”

There has got to be a good way of handling this using MET but I just
don’t
know what it is off the top of my head.

Regarding (1), yes definitely.  In the Point-Stat config file, look in
the
“mask” dictionary and use the “sid” entry to specify a list of station
id’s
to be included in the stats.  Be sure to read about that in the users
guide
or the data/config/README file.
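As a sketch only (do double-check the exact "sid" syntax against the users guide for your MET version), the mask dictionary with your station list might look like:

```
mask = {
   grid = [];
   poly = [];
   sid  = [ "KAAT", "KACV", "KBFL", "KBLU", "KFAT", "KPRB", "KSAC" ];
};
```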

Regarding (2), ADPSFC is not a good choice to verify vertical levels above
ground.  Please try ADPUPA (upper-air) instead. Or perhaps some other
obs
type?  I know a lot about the software but don’t know a lot about
breadth
of available obs.

Here’s why ADPSFC is a bad choice.  MET inherited this logic for surface
verification from the verification done at NOAA/EMC: when verifying a
vertical-level forecast against ADPSFC or SFCSHP obs, it just matches
those obs without actually checking the height info.  That works for 2-m
temp and 10-m winds but not for 305-m winds!

Thanks
John


On Wed, Jul 3, 2019 at 3:47 PM Berman, Jeremy D via RT
<met_help at ucar.edu>
wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
>
> Hi John,
>
>
> Thanks for your input! I did as you suggested and found that the
missing
> points between my WRF and GFS were all along the northern or eastern
> boundary of my WRF domain.
>
>
> When doing the "diff" command between the Point_Stat files for WRF
and GFS
> with -v 3, I noticed the lines for the WRF file:
>
>
> "Processing WIND/Z10 versus WIND/Z10, for observation type ADPSFC,
over
> region FULL, for interpolation method BILIN(4), using 52 pairs"
>
> "DEBUG 3: Rejected: off the grid   = 1"
>
> "Rejected: bad fcst value = 1"
>
>
> While for GFS:
>
>
> "Processing WIND/Z10 versus WIND/Z10, for observation type ADPSFC,
over
> region FULL, for interpolation method BILIN(4), using 54 pairs"
>
> "DEBUG 3: Rejected: off the grid   = 0"
>
> "Rejected: bad fcst value = 0"
>
>
> I looked into the MPR files and found two stations that were
different.
> Those two stations were near the northern or eastern boundary of my
WRF
> domain, and one of them was actually outside the WRF domain.
>
>
> So my thinking is that the discrepancy occurs because (1) an
observation
> is close to a boundary and (2) the bilinear interpolation approach
can't
> find all the neighbor values because it is too close to a boundary.
I
> tested these same MET commands, but for the NEAREST interpolation
method:
> there still was 1 "rejected: off the grid", but 0 "rejected: bad
fcst
> value". So I think this supports my hypothesis. I think the reason
the GFS
> has more observation matches is because it has a global domain, so
there is
> no boundary issue for observations, like there is for my WRF
simulation.
>
>
> I think to fix this issue, and to get the same number of WRF to GFS
> observation matches, I'll have to only use a polyline described by
> "mask.poly", and I'll have to write a script that checks that the
poly line
> is not too close to the WRF domain (e.g., within 5 grid points).
>
>
>
> I have another, related, question, if you don't mind me asking :)
>
>
> 1. is it possible to select specific METAR locations to run
> PB2NC/Point_Stat on, instead of using all the observations within a
poly
> mask? For example, if I am looking at Northern California, but I
only want
> to validate averaged over the stations: KAAT, KACV, KBFL, KBLU,
KFAT, KPRB,
> KSAC for CNT/CTS files? I know the individual stations are listed in
MPR,
> but I wondered how I could do this for the CNT/CTS calculations.
>
>
> 2. If I have multiple wind speed vertical levels from WRF run under
UPP
> such as Z10, Z30, Z80, Z100, Z305, which observation type would be
most
> appropriate to validate those higher altitude levels? I had used
ADPSFC,
> and it actually got matches for those higher altitude levels of Z100
and
> Z305, but the observed wind speeds don't seem believable, so perhaps
I am
> using that incorrectly. Is there a more appropriate observation type
you'd
> suggest to verify against? Maybe profilers?
>
>
>
> Thank you so much for all your help through this process. I look
forward
> to hearing your thoughts on these other questions.
>
>
> Have a Happy Fourth of July!!
>
>
> Jeremy
>
>
>
>
> ________________________________
> From: John Halley Gotway via RT <met_help at ucar.edu>
> Sent: Monday, July 1, 2019 1:27:23 PM
> To: Berman, Jeremy D
> Cc: harrold at ucar.edu; jwolff at ucar.edu
> Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
Point_Stat
> with GDAS files after 2019061200
>
> Jeremy,
>
> I see you have a question about why you get a slightly different
> number of
> matched pairs when verifying your WRF output and GFS output over the
same
> spatial area.  Thanks for describing your processing logic.
>
> So you're getting 54 matched pairs for WRF and only 52 for GFS.  I'm
trying
> to reconcile in my mind why that would be the case.  Differences
like this
> tend to occur along the edge of the domain.  But I would have guessed
> the result would be the other way around... you'd get 2 fewer pairs for
> the limited area of the WRF run and two more for the global GFS.
>
> I'd suggest digging a little deeper, and here's how...
>
> (1) Rerun Point-Stat at verbosity level 3 by specifying "-v 3" on
the
> command line.  For each verification task, that'll dump out counts
for why
> obs were or were not used.  Diffing the logs between WRF and GFS,
you might
> find a difference which could point to an explanation.
>
> (2) I see from your Point-Stat config file that you already have the
MPR
> output line type turned on.  You could look at that output to
determine
> which stations were included for WRF but not GFS.  That might yield
another
> clue.
>
> Do those methods shed any light on the differences?
>
> Also, I plotted the WRF domain you sent and see that it's over the
eastern
> pacific.  Be aware that the ADPSFC obs are only over land, a small
fraction
> of your domain.  Surface water points are encoded as "SFCSHP" in
PREPBUFR.
> So I'd suggest setting:
>
> message_type   = [ "ADPSFC", "SFCSHP" ];
>
> That'll verify them as two separate tasks.  If you want to process
them in
> one big group, just use:
>
> message_type   = [ "ONLYSF" ]; // That's defined in
message_type_group_map
> as the union of ADPSFC and SFCSHP
>
> Thanks,
> John
>
>
> On Fri, Jun 28, 2019 at 3:58 PM Berman, Jeremy D via RT
<met_help at ucar.edu
> >
> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
> >
> > Hi John,
> >
> >
> > I realized it's probably most helpful to you if I provided the files
> > and mask that I was using, so you can see what I was doing. Please let
> > me know if you can see these files (I didn't include the WRF output
> > file in this email since it is large (>45 MB), but I can provide it
> > another way if you need to see it).
> >
> >
> > Jeremy
> >
> >
> >
> > ________________________________
> > From: Berman, Jeremy D
> > Sent: Friday, June 28, 2019 5:48:53 PM
> > To: met_help at ucar.edu
> > Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET Point_Stat
> > with GDAS files after 2019061200
> >
> >
> > Hi John,
> >
> >
> > Thanks for your reply! The UPP information was very helpful, and I've
> > used it successfully on my WRF runs.
> >
> >
> > I've encountered another minor issue, if you could provide some
> > insight.  I've been trying to compare GFS forecasts vs. my WRF
> > forecasts over the same verification region, such as only over my WRF
> > domain (e.g., extending over the Pacific and California). For
> > reference, my WRF is a 12-km single-domain run.
> >
> >
> > I've noticed that after doing the steps below, I am verifying over my
> > intended region (the WRF domain), but I'm getting slightly more
> > matched pairs for my WRF run than for GFS (54 vs. 52 obs). Do you have
> > any idea why this could be happening and how I could get the number of
> > matched observations to be the same?
> >
> >
> > Here is my procedure:
> >
> >
> > I use the "Gen-Poly-Mask Tool" and the -type grid on a UPP'ed WRF
> forecast
> > file to create a mask saved as a .nc file:
> >
> >
> > gen_vx_mask -type grid WRFPRS_d01_f001 WRFPRS_d01_f001
WRF_domain.nc
> >
> >
> > Then, I modified my pb2nc config file
("PB2NCConfig_PREPBUFR_GDAS") by
> > setting: poly = "WRF_domain.nc"
> >
> >
> > Then I run the "pb2nc" command on PREPBUFR_GDAS data to select
> > observations over my masked area:
> >
> >
> > pb2nc PREPBUFR_GDAS_f000.nr PREPBUFR_GDAS_f000_pb.nc
> > PB2NCConfig_PREPBUFR_GDAS -v 2
> >
> >
> > This creates a file called "PREPBUFR_GDAS_f000_pb.nc", which I checked
> > contains observations over my WRF mask.
> >
> >
> > Then I created point_stat config files for both GFS and WRF which have
> > the same default settings but have "grid = [ "FULL" ]; poly = [];" and
> > the interpolation methods NEAREST and BILIN. In my understanding, this
> > is fine since pb2nc should have only selected observations within the
> > masked region.
> >
> >
> > I ran the point_stat command for both GFS and WRF forecast data, which
> > worked fine, but I noticed the discrepancy in the total number of
> > matched pairs.
> >
> >
> > I also tried setting " grid = []; poly = ["WRF_domain.nc"] " in the
> > point_stat config files, but the same observation discrepancy
> > occurred.
> >
> >
> > Am I maybe using the mask incorrectly? Or could it be that since the
> > GFS and WRF may have different horizontal resolutions, there could be
> > fewer matched pairs?
> >
> >
> > I should mention I also tried creating a mask.poly file with just a
> > boxed region (e.g., Northern California) and included poly =
> > "mask.poly" in the PB2NC config file. After doing this I got the same
> > number of observations for GFS and WRF. So I'm not sure why using the
> > mask.poly worked fine, but using the WRF domain as a mask did not.
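[A lat/lon polyline mask file like the "mask.poly" mentioned above is just a region name on the first line followed by one "lat lon" pair per line. This sketch of a box over Northern California is illustrative only; the corner coordinates are assumptions, not values from this thread:]

```
NORCAL_BOX
42.0 -124.5
42.0 -120.0
38.0 -120.0
38.0 -124.5
```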
> >
> >
> >
> > Thanks for your time and thoughts! If you need to see any example
> > config files or WRF output, let me know and I can provide that.
> >
> >
> > Thanks!
> >
> > Jeremy
> >
> >
> >
> >
> >
> >
> > ________________________________
> > From: John Halley Gotway via RT <met_help at ucar.edu>
> > Sent: Friday, June 21, 2019 12:17:19 PM
> > To: Berman, Jeremy D
> > Cc: harrold at ucar.edu; jwolff at ucar.edu
> > Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET Point_Stat
> > with GDAS files after 2019061200
> >
> > Jeremy,
> >
> > Unfortunately, no, MET does not include logic for handling WRF's
> > hybrid vertical coordinate and interpolating to pressure or height
> > levels.  In addition, when using raw WRFOUT files, MET does not handle
> > winds well since they're defined on the staggered grid.
> >
> > For these two reasons, we recommend that users run their WRFOUT files
> > through the Unified Post Processor.  It destaggers the winds and
> > interpolates to pressure and height levels.  Here's a link for info
> > about UPP:
> > https://dtcenter.org/community-code/unified-post-processor-upp
> >
> > UPP writes GRIB1 or GRIB2 output files, and the MET tools can handle
> > those well.
> >
> > Thanks,
> > John
> >
> > On Fri, Jun 21, 2019 at 12:28 AM Berman, Jeremy D via RT
> > <met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
> > >
> > > Hi John,
> > >
> > >
> > > Thank you so much for your help! I was bogged down trying to solve
> > > this for a few days - I'm impressed you did it in two hours!
> > >
> > >
> > > I made the changes and it worked for a case I did for 2019061800. I
> > > should have mentioned I was looking at 2-meter Temperature and
> > > 10-meter Wind Speed, both of which use ADPSFC, which based on your
> > > observation counts explains why point_stat could not get any matched
> > > pairs.
> > >
> > >
> > > I have another question if you don't mind me asking: if I want to
> > > compute verification for a different vertical level, such as
> > > 100-meter Wind Speed, can MET do that for a WRF forecast file, even
> > > if the forecast file does not have a 100-meter level? Would MET be
> > > able to do a vertical interpolation in order to assess that level
> > > (or any vertical level above ground)?
> > >
> > >
> > > Additionally: if I wanted to do verification with point_stat for an
> > > entire vertical profile (let's say from the surface to 1000 meters
> > > above ground level), could MET do that as well? I know in the example
> > > MET tutorial there is a range of Temperature from 850-750 hPa, but I
> > > was wondering if this could work for a range of vertical meters above
> > > ground?
> > >
> > >
> > > Thank you for all your help!
> > >
> > >
> > > Best,
> > >
> > > Jeremy
> > >
> > > ________________________________
> > > From: John Halley Gotway via RT <met_help at ucar.edu>
> > > Sent: Tuesday, June 18, 2019 6:50:34 PM
> > > To: Berman, Jeremy D
> > > Cc: harrold at ucar.edu; jwolff at ucar.edu
> > > Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET Point_Stat
> > > with GDAS files after 2019061200
> > >
> > > Jeremy,
> > >
> > > I see you have a question about running the pb2nc and point_stat
> > > tools.  That's interesting that you found that your verification
> > > logic broke on 2019061206.  To check the behavior, I retrieved a
> > > PREPBUFR file before and after that time:
> > >
> > > BEFORE: wget
> > > ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gdas.20190611/gdas.t12z.prepbufr.nr
> > > AFTER:  wget
> > > ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gdas.20190612/12/gdas.t12z.prepbufr.nr
> > >
> > > I ran them both through pb2nc, extracting ADPUPA and ADPSFC message
> > > types.  pb2nc retained/derived 308,034 observations on the 11th but
> > > only 175,427 on the 12th.
> > >
> > > One obvious difference I did note is that they've changed the
> > > directory structure on the ftp site.  On 20190612, they started
> > > organizing the data into subdirectories for the hour (00, 06, 12,
> > > 18).  If you're running a script to pull prepbufr files from the FTP
> > > site, please make sure it's finding them in the right sub-directory.
> > >
> > > Doing some more digging I found...
> > > The number of ADPUPA observations remains unchanged... 136,114 for
> > > both days.
> > > The number of ADPSFC (surface) observations is dramatically
> > > reduced... from 171,920 down to 34,247!
> > >
> > > So next I reran pb2nc but set quality_mark_thresh = 15; to retain
> > > all observations regardless of the quality flag.
> > > That results in 332,235 and 337,818 observations on the 11th and
> > > 12th, respectively.
> > > The ADPUPA obs are very similar: 150,057 vs 155,787
> > > The ADPSFC obs are also similar: 182,178 vs 182,031
> > >
> > > So the big difference is in the quality mark values.
> > > On the 11th...
> > > 182,178 observations = 171,920 with QM=2, 10 with QM=9, and 10,248
> > > with other QM's.
> > > On the 12th...
> > > 182,031 observations = 34,247 with QM=2, 139,047 with QM=9, and
> > > 8,737 with other QM's.
> > >
> > > I'm guessing that with the GFS upgrade, they changed their GDAS
> > > assimilation logic back to setting the quality marker = 9 for
> > > surface obs to avoid assimilating them.
> > >
> > > So I'd recommend 2 things:
> > > (1) In your PB2NC config file, set quality_mark_thresh = 9;
> > > (2) In your Point-Stat config file, when verifying against surface
> > > obs, set obs_quality = [ "0", "1", "2", "9" ]; to use surface
> > > observations with any of these quality marks.
> > >
> > > Hope that helps get you going.
> > >
> > > Thanks,
> > > John Halley Gotway
> > >
> > > On Tue, Jun 18, 2019 at 4:06 PM Randy Bullock via RT
> > > <met_help at ucar.edu> wrote:
> > >
> > > >
> > > > Tue Jun 18 16:05:37 2019: Request 90685 was acted upon.
> > > > Transaction: Given to johnhg (John Halley Gotway) by bullock
> > > >        Queue: met_help
> > > >      Subject: Issue with using MET Point_Stat with GDAS files
after
> > > > 2019061200
> > > >        Owner: johnhg
> > > >   Requestors: jdberman at albany.edu
> > > >       Status: new
> > > >  Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
> > > >
> > > >
> > > > This transaction appears to have no content
> > > >
> > >
> > >
> > >
> >
> >
> >
>
>
>

------------------------------------------------
Subject: Issue with using MET Point_Stat with GDAS files after 2019061200
From: Berman, Jeremy D
Time: Wed Jul 17 14:49:49 2019

Hi John,


Thanks so much for your help earlier! I hope you did not have to work
all day on July 4th, and that you got to enjoy the holiday!


Sorry about the earlier confusion about the GFS vs. WRF pairs. To
maintain the same observations for comparing GFS and WRF, I decided to
just use the mask.poly feature and be careful not to prescribe a poly
line too close to my WRF domain boundary. So far it works!


One quick question if I may ask you. As I described earlier, I'm
comparing my WRF forecasts vs. GFS forecasts over a region of northern
California. I'm using pb2nc for my observations from PREPBUFR, and
point_stat for the matched pairs. I've been looking at the continuous
statistics (CNT) and categorical statistics (CTS) for 10m Wind Speed,
and I'm seeing that GFS routinely has lower RMSE, BIAS, etc over the
region for a number of forecast lead times.


Of course there are many reasons that could explain why my WRF
forecast doesn't perform as well. One thing I would like to
investigate is whether the GFS performs better because there are
observations with near 0 m/s values. So my question is: how can I
filter the observations to only compute the CNT statistics for
observations with values greater than, say, 0.5 m/s?


I know in the Point-Stat configuration files there are settings called
"cnt_thresh[]" and "cnt_logic", however do these settings apply only to
the forecast value or the observed value? For example, if the
observation value is 0.7 m/s but the forecast value is 0 m/s, and I set
cnt_thresh to >0.5, would it count that observation and find a matched
pair, or not? I want to filter just the observation values, so that my
GFS and WRF forecasts are "apples to apples" comparisons.


My hope is that I can try to better understand my GFS vs. WRF
statistical comparison by trying to filter the observation data by
only looking at CNT for high wind speeds. The GFS has a coarser
spatial resolution, so I would think a priori that my WRF simulation
(3 km) would provide better wind forecasts. So hopefully filtering the
observation dataset will help diagnose the differences.


Thanks!

Jeremy




________________________________
From: John Halley Gotway via RT <met_help at ucar.edu>
Sent: Thursday, July 4, 2019 9:32 AM
To: Berman, Jeremy D
Cc: harrold at ucar.edu; jwolff at ucar.edu
Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET Point_Stat
with GDAS files after 2019061200

Jeremy,

I'm glad you've been able to make sense of the discrepancies in the
number of pairs.  And I'm glad to hear that it's GFS with more pairs
than WRF.  Your earlier email saying the opposite had me confused!

"but I'm getting slightly more matched pairs for my WRF run than the
GFS (54 vs 52 obs)."

There has got to be a good way of handling this using MET, but I just
don't know what it is off the top of my head.

Regarding (1), yes, definitely.  In the Point-Stat config file, look in
the "mask" dictionary and use the "sid" entry to specify a list of
station IDs to be included in the stats.  Be sure to read about that in
the users guide or the data/config/README file.
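[A minimal sketch of that "mask" dictionary, using the Northern California METAR stations mentioned elsewhere in this thread; check the users guide or data/config/README for the exact "sid" syntax in your MET version:]

```
mask = {
   grid = [];
   poly = [];
   sid  = [ "KAAT", "KACV", "KBFL", "KBLU", "KFAT", "KPRB", "KSAC" ];
}
```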

Regarding (2), ADPSFC is not a good choice to verify vertical levels
above ground.  Please try ADPUPA (upper-air) instead.  Or perhaps some
other obs type?  I know a lot about the software but don't know a lot
about the breadth of available obs.

Here's why ADPSFC is a bad choice.  MET inherited this logic for
surface verification from the verification done at NOAA/EMC.  When
verifying a vertical level forecast against ADPSFC or SFCSHP obs, MET
just matches those obs without actually checking the height info.  This
works for 2-m temp and 10-m winds but not 305-m winds!

Thanks
John


On Wed, Jul 3, 2019 at 3:47 PM Berman, Jeremy D via RT
<met_help at ucar.edu>
wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
>
> Hi John,
>
>
> Thanks for your input! I did as you suggested and found that the
> missing points between my WRF and GFS were all along the northern or
> eastern boundary of my WRF domain.
>
>
> When doing the "diff" command between the Point_Stat files for WRF
and GFS
> with -v 3, I noticed the lines for the WRF file:
>
>
> "Processing WIND/Z10 versus WIND/Z10, for observation type ADPSFC,
over
> region FULL, for interpolation method BILIN(4), using 52 pairs"
>
> "DEBUG 3: Rejected: off the grid   = 1"
>
> "Rejected: bad fcst value = 1"
>
>
> While for GFS:
>
>
> "Processing WIND/Z10 versus WIND/Z10, for observation type ADPSFC,
over
> region FULL, for interpolation method BILIN(4), using 54 pairs"
>
> "DEBUG 3: Rejected: off the grid   = 0"
>
> "Rejected: bad fcst value = 0"
>
>
> I looked into the MPR files and found two stations that were
> different.  Those two stations were near the northern or eastern
> boundary of my WRF domain, and one of them was actually outside the
> WRF domain.
>
>
> So my thinking is that the discrepancy occurs because (1) an
> observation is close to a boundary and (2) the bilinear interpolation
> approach can't find all the neighbor values because it is too close to
> a boundary.  I tested these same MET commands with the NEAREST
> interpolation method: there was still 1 "rejected: off the grid", but
> 0 "rejected: bad fcst value".  So I think this supports my hypothesis.
> I think the reason the GFS has more observation matches is that it has
> a global domain, so there is no boundary issue for observations, like
> there is for my WRF simulation.
>
>
> I think to fix this issue, and to get the same number of WRF and GFS
> observation matches, I'll have to use only a polyline described by
> "mask.poly", and I'll have to write a script that checks that the poly
> line is not too close to the WRF domain boundary (e.g., within 5 grid
> points).
>
>
>
> I have another, related, question, if you don't mind me asking :)
>
>
> 1. Is it possible to select specific METAR locations to run
> PB2NC/Point_Stat on, instead of using all the observations within a
> poly mask? For example, if I am looking at Northern California but
> only want to validate averaged over the stations KAAT, KACV, KBFL,
> KBLU, KFAT, KPRB, and KSAC for the CNT/CTS files? I know the
> individual stations are listed in MPR, but I wondered how I could do
> this for the CNT/CTS calculations.
>
>
> 2. If I have multiple wind speed vertical levels from a WRF run
> through UPP, such as Z10, Z30, Z80, Z100, and Z305, which observation
> type would be most appropriate to validate those higher altitude
> levels? I had used ADPSFC, and it actually got matches for those
> higher altitude levels of Z100 and Z305, but the observed wind speeds
> don't seem believable, so perhaps I am using that incorrectly. Is
> there a more appropriate observation type you'd suggest to verify
> against? Maybe profilers?
>
>
>
> Thank you so much for all your help through this process. I look
> forward to hearing your thoughts on these other questions.
>
>
> Have a Happy Fourth of July!!
>
>
> Jeremy
>
>
>
>


------------------------------------------------
Subject: Issue with using MET Point_Stat with GDAS files after 2019061200
From: John Halley Gotway
Time: Fri Jul 19 15:59:10 2019

Jeremy,

I see that you'd like to filter the observations and see how the
continuous statistics change as the wind speed increases.  You're
definitely on the right track.  The "cnt_thresh" option is used to
subset the matched pairs before computing statistics.  And it's a great
question as to whether that filter is applied to the fcst values, obs
values, or both.

Here's a selection from the data/config/README file which describes
this:
https://github.com/NCAR/MET/blob/master_v8.1/met/data/config/README

// - The "cnt_thresh" entry is an array of thresholds for filtering
//   data prior to computing continuous statistics and partial sums.
//
// - The "cnt_logic" entry may be set to UNION, INTERSECTION, or
//   SYMDIFF and controls the logic for how the forecast and observed
//   cnt_thresh settings are combined when filtering matched pairs of
//   forecast and observed values.

Let's say your Point-Stat config file looks like this to verify
10-meter winds:

fcst = {
   field = [ { name = "WIND"; level = [ "Z10" ]; } ];
}

obs = {
   field = [ { name = "WIND"; level = [ "Z10" ]; } ];
}

If you add "cnt_thresh = [ >1.0 ];" outside of both the fcst and obs
dictionaries, the setting will apply to both fcst and obs values.  Put
it
inside the "obs", and it'll only apply to the obs.  And the default
cnt_logic setting is union:
*   cnt_logic = UNION;*

For what you're doing, I think applying it only to the obs makes for
better
comparison between models.  So I'd suggest:

*cnt_logic = INTERSECTION;*

*fcst = {*

*   field = [ { *
*           name = "WIND"; level = [ "Z10" ];*

*           cnt_thresh = [ >1.0, >2.0, >3.0 ];*


*     } ];}*

*obs = {*
*   field = [ {*
*           name = "WIND"; level = [ "Z10" ];*
*           cnt_thresh = [ NA, NA, NA ];*
*   } ];*
*}*

If you define 3 CNT thresh settings for the forecast, then you need 3
for
the obs too.  The NA threshold always evaluated to true.  So that's
why you
want the intersection logic.
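[A small Python sketch, not MET source code, illustrating the filtering logic described above: with INTERSECTION and an NA threshold on the forecast side, only pairs whose observed value passes the threshold are kept. The pair values are made up for illustration.]

```python
# Illustration of MET's cnt_thresh / cnt_logic filtering of matched
# pairs before continuous statistics are computed.  A threshold of
# None stands in for MET's NA threshold, which always evaluates to true.

def keep_pair(fcst, obs, fcst_thresh, obs_thresh, logic):
    f_ok = True if fcst_thresh is None else fcst_thresh(fcst)
    o_ok = True if obs_thresh is None else obs_thresh(obs)
    return (f_ok or o_ok) if logic == "UNION" else (f_ok and o_ok)

# Matched pairs of (forecast, observed) 10-m wind speeds in m/s.
pairs = [(0.0, 0.7), (1.2, 0.3), (2.0, 2.5)]

# Filter on the obs only: fcst threshold NA, obs > 0.5, INTERSECTION.
kept = [p for p in pairs
        if keep_pair(p[0], p[1], None, lambda o: o > 0.5, "INTERSECTION")]
print(kept)  # the (1.2, 0.3) pair is dropped; the others are kept
```

Note that with UNION instead, the NA forecast threshold would keep every pair, defeating the obs-only filter.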

Hope that helps.

Thanks,
John

On Wed, Jul 17, 2019 at 2:50 PM Berman, Jeremy D via RT
<met_help at ucar.edu>
wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
>
> Hi John,
>
>
> Thanks so much for your help earlier! I hope you did not have to
work all
> day on July 4th, and that you got to enjoy the holiday!
>
>
> Sorry about the earlier confusion about the GFS vs WRF pairs. I
decided
> that to maintain the same observations for comparing GFS and WRF, I
would
> just use the mask_poly feature and just be careful to not prescribe
a poly
> line too close to my WRF domain boundary. So far it works!
>
>
> One quick question if I may ask you. As I described earlier, I'm
comparing
> my WRF forecasts vs. GFS forecasts over a region of northern
California.
> I'm using pb2nc for my observations from PREPBUFR, and point_stat
for the
> matched pairs. I've been looking at the continuous statistics (CNT)
and
> categorical statistics (CTS) for 10m Wind Speed, and I'm seeing that
GFS
> routinely has lower RMSE, BIAS, etc over the region for a number of
> forecast lead times.
>
>
> Of course there are many reasons that could explain why my WRF
forecast
> doesn't perform as well. One thing I would like to investigate is
whether
> the GFS performs better because there are observations with near 0
m/s
> values. So my question is: how can I filter the observations to only
> compute the CNT statistics for observations with values greater
than, say,
> 0.5 m/s?
>
>
> I know in the Point_Stat Configuration Files there is a setting
called
> "cnt_thresh[]" and "cnt_logic", however do these settings only apply
to the
> forecasted value or the observed value? For example, if the
observation
> value is 0.7 m/s, but the forecasted value is 0 m/s, if I set the
cnt_thresh
> to 0.5, would it count that observation, and find a matched pair, or
not? I
> want to filter just the observation values, so that my GFS and WRF
> forecasts are being "apples to apples" comparisons.
>
>
> My hope is that I can try to better understand my GFS vs. WRF
statistical
> comparison by trying to filter the observation data by only looking
at CNT
> for high wind speeds. The GFS has a coarser spatial resolution, so I
would
> think a priori that my WRF simulation (3 km) would provide better
wind
> forecasts. So hopefully filtering the observation dataset will help
> diagnose the differences.
>
>
> Thanks!
>
> Jeremy
>
>
>
>
> ________________________________
> From: John Halley Gotway via RT <met_help at ucar.edu>
> Sent: Thursday, July 4, 2019 9:32 AM
> To: Berman, Jeremy D
> Cc: harrold at ucar.edu; jwolff at ucar.edu
> Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
Point_Stat
> with GDAS files after 2019061200
>
> Jeremy,
>
> I’m glad you’ve been able to make sense of the discrepancies in the
number
> of pairs.  And I’m glad to hear that it’s GFS with more pairs than
WRF.
> Your earlier email saying the opposite had me confused!
>
> “but I'm getting slightly more matched pairs for my WRF run than the
GFS
> (54 vs 52 obs).”
>
> There has got to be a good way of handling this using MET but I just
don’t
> know what it is off the top of my head.
>
> Regarding (1), yes definitely.  In the Point-Stat config file, look
in the
> “mask” dictionary and use the “sid” entry to specify a list of
station id’s
> to be included in the stats.  Be sure to read about that in the
users guide
> or the data/config/README file.
>
> Regarding (2), ADPSFC is not a good choice to verify vertical levels
above
> ground.  Please try ADPUPA (upper-air) instead. Or perhaps some
other obs
> type?  I know a lot about the software but don’t know a lot about
breadth
> of available obs.
>
> Here’s why ADPSFC is a bad choice.  MET inherited this logic for
surface
> verification from the verification done at NOAA/EMC.  When verifying
> vertical level forecasts against ADPSFC or SFCSHP obs, MET just matches
those obs
> without actually checking the height info.  This works for 2-m temp
and
> 10-m winds but not 305-m winds!
>
> Thanks
> John
>
>
> On Wed, Jul 3, 2019 at 3:47 PM Berman, Jeremy D via RT
<met_help at ucar.edu>
> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
> >
> > Hi John,
> >
> >
> > Thanks for your input! I did as you suggested and found that the
missing
> > points between my WRF and GFS were all along the northern or
eastern
> > boundary of my WRF domain.
> >
> >
> > When doing the "diff" command between the Point_Stat files for WRF
and
> GFS
> > with -v 3, I noticed the lines for the WRF file:
> >
> >
> > "Processing WIND/Z10 versus WIND/Z10, for observation type ADPSFC,
over
> > region FULL, for interpolation method BILIN(4), using 52 pairs"
> >
> > "DEBUG 3: Rejected: off the grid   = 1"
> >
> > "Rejected: bad fcst value = 1"
> >
> >
> > While for GFS:
> >
> >
> > "Processing WIND/Z10 versus WIND/Z10, for observation type ADPSFC,
over
> > region FULL, for interpolation method BILIN(4), using 54 pairs"
> >
> > "DEBUG 3: Rejected: off the grid   = 0"
> >
> > "Rejected: bad fcst value = 0"
> >
> >
> > I looked into the MPR files and found two stations that were
different.
> > Those two stations were near the northern or eastern boundary of
my WRF
> > domain, and one of them was actually outside the WRF domain.
> >
> >
> > So my thinking is that the discrepancy occurs because (1) an
observation
> > is close to a boundary and (2) the bilinear interpolation approach
can't
> > find all the neighbor values because it is too close to a
boundary. I
> > tested these same MET commands, but for the NEAREST interpolation
method:
> > there still was 1 "rejected: off the grid", but 0 "rejected: bad
fcst
> > value". So I think this supports my hypothesis. I think the reason
the
> GFS
> > has more observation matches is because it has a global domain, so
there
> is
> > no boundary issue for observations, like there is for my WRF
simulation.
> >
> >
> > I think to fix this issue, and to get the same number of WRF to
GFS
> > observation matches, I'll have to only use a polyline described by
> > "mask.poly", and I'll have to write a script that checks that the
poly
> line
> > is not too close to the WRF domain (e.g., within 5 grid points).
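The boundary check described here could be sketched roughly as follows (hypothetical Python; `latlon_to_ij` stands in for a real map-projection routine, e.g. from wrf-python or pyproj, and the toy projection below is only for illustration):

```python
# Hypothetical sketch: reject a mask polyline if any vertex maps to a grid
# point within `buffer` points of the WRF domain edge.

def latlon_to_ij(lat, lon):
    """Toy projection for illustration: 1 degree = 1 grid point."""
    return int(lon + 130.0), int(lat - 30.0)

def poly_inside_domain(poly, nx=100, ny=100, buffer=5):
    """True if every (lat, lon) vertex is >= buffer points from all edges."""
    for lat, lon in poly:
        i, j = latlon_to_ij(lat, lon)
        if not (buffer <= i < nx - buffer and buffer <= j < ny - buffer):
            return False
    return True

poly = [(40.0, -123.0), (41.0, -122.0), (39.5, -121.5)]
print(poly_inside_domain(poly))  # True
```

In a real script you would read nx, ny, and the projection from the WRFPRS file itself rather than hard-coding them.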
> >
> >
> >
> > I have another, related, question, if you don't mind me asking :)
> >
> >
> > 1. is it possible to select specific METAR locations to run
> > PB2NC/Point_Stat on, instead of using all the observations within
a poly
> > mask? For example, if I am looking at Northern California, but I
only
> want
> > to validate averaged over the stations: KAAT, KACV, KBFL, KBLU,
KFAT,
> KPRB,
> > KSAC for CNT/CTS files? I know the individual stations are listed
in MPR,
> > but I wondered how I could do this for the CNT/CTS calculations.
> >
> >
> > 2. If I have multiple wind speed vertical levels from WRF run
under UPP
> > such as Z10, Z30, Z80, Z100, Z305, which observation type would be
most
> > appropriate to validate those higher altitude levels? I had used
ADPSFC,
> > and it actually got matches for those higher altitude levels of
Z100 and
> > Z305, but the observed wind speeds don't seem believable, so
perhaps I am
> > using that incorrectly. Is there a more appropriate observation
type
> you'd
> > suggest to verify against? Maybe profilers?
> >
> >
> >
> > Thank you so much for all your help through this process. I look
forward
> > to hearing your thoughts on these other questions.
> >
> >
> > Have a Happy Fourth of July!!
> >
> >
> > Jeremy
> >
> >
> >
> >
> > ________________________________
> > From: John Halley Gotway via RT <met_help at ucar.edu>
> > Sent: Monday, July 1, 2019 1:27:23 PM
> > To: Berman, Jeremy D
> > Cc: harrold at ucar.edu; jwolff at ucar.edu
> > Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
Point_Stat
> > with GDAS files after 2019061200
> >
> > Jeremy,
> >
> > I see you have a question about why you get slightly different
number of
> > matched pairs when verifying your WRF output and GFS output over
the same
> > spatial area.  Thanks for describing your processing logic.
> >
> > So you're getting 54 matched pairs for WRF and only 52 for GFS.
I'm
> trying
> > to reconcile in my mind why that would be the case.  Differences
like
> this
> > tend to occur along the edge of the domain.  But I would have
guessed the
> > result would be the other way around... you'd get 2 fewer pairs for
the
> > limited area of the WRF run and two more for the global GFS.
> >
> > I'd suggest digging a little deeper, and here's how...
> >
> > (1) Rerun Point-Stat at verbosity level 3 by specifying "-v 3" on
the
> > command line.  For each verification task, that'll dump out counts
for
> why
> > obs were or were not used.  Diffing the logs between WRF and GFS,
you
> might
> > find a difference which could point to an explanation.
> >
> > (2) I see from your Point-Stat config file that you already have
the MPR
> > output line type turned on.  You could look at that output to
determine
> > which stations were included for WRF but not GFS.  That might
yield
> another
> > clue.
> >
> > Do those methods shed any light on the differences?
> >
> > Also, I plotted the WRF domain you sent and see that it's over the
> eastern
> > pacific.  Be aware that the ADPSFC obs are only over land, a small
> fraction
> > of your domain.  Surface water points are encoded as "SFCSHP" in
> PREPBUFR.
> > So I'd suggest setting:
> >
> > message_type   = [ "ADPSFC", "SFCSHP" ];
> >
> > That'll verify them as two separate tasks.  If you want to process
them
> in
> > one big group, just use:
> >
> > message_type   = [ "ONLYSF" ]; // That's defined in
> message_type_group_map
> > as the union of ADPSFC and SFCSHP
> >
> > Thanks,
> > John
> >
> >
> > On Fri, Jun 28, 2019 at 3:58 PM Berman, Jeremy D via RT <
> met_help at ucar.edu
> > >
> > wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
> > >
> > > Hi John,
> > >
> > >
> > > I realized it's probably most helpful to you if I provided the
files
> and
> > > mask that I was using, so you can see what I was doing. Please
let me
> > know
> > > if you can see these files (I didn't include in this email the
WRF
> output
> > > file since it is large (>45 mb) but I can via another way if you
need
> to
> > > see it).
> > >
> > >
> > > Jeremy
> > >
> > >
> > >
> > > ________________________________
> > > From: Berman, Jeremy D
> > > Sent: Friday, June 28, 2019 5:48:53 PM
> > > To: met_help at ucar.edu
> > > Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
Point_Stat
> > > with GDAS files after 2019061200
> > >
> > >
> > > Hi John,
> > >
> > >
> > > Thanks for your reply! The UPP information was very helpful, and
I've
> > used
> > > it successfully on my WRF runs.
> > >
> > >
> > > I've encountered another minor issue, if you could provide some
> insight.
> > > I've been trying to compare GFS forecasts vs my WRF forecasts
over the
> > same
> > > verification region, such as only over my WRF domain (e.g.,
extending
> > over
> > > the Pacific and California). For reference, my WRF is a 12km
single
> > domain
> > > run.
> > >
> > >
> > > I've noticed that after doing the steps below, I am verifying
over my
> > > intended region (the WRF domain), but I'm getting slightly more
matched
> > > pairs for my WRF run than the GFS (54 vs 52 obs). Do you have
any idea
> of
> > > why this could be happening and how I could get the number of
matched
> > > observations to be the same?
> > >
> > >
> > > Here is my procedure:
> > >
> > >
> > > I use the "Gen-Poly-Mask Tool" and the -type grid on a UPP'ed
WRF
> > forecast
> > > file to create a mask saved as a .nc file:
> > >
> > >
> > > gen_vx_mask -type grid WRFPRS_d01_f001 WRFPRS_d01_f001
WRF_domain.nc
> > >
> > >
> > > Then, I modified my pb2nc config file
("PB2NCConfig_PREPBUFR_GDAS") by
> > > setting: poly = "WRF_domain.nc"
> > >
> > >
> > > Then I run the "pb2nc" command on PREPBUFR_GDAS data to select
> > > observations over my masked area:
> > >
> > >
> > > pb2nc PREPBUFR_GDAS_f000.nr PREPBUFR_GDAS_f000_pb.nc
> > > PB2NCConfig_PREPBUFR_GDAS -v 2
> > >
> > >
> > > This creates a file called "PREPBUFR_GDAS_f000_pb.nc" which I
checked
> > > contains observations over my WRF mask.
> > >
> > >
> > > Then I created point_stat config files for both GFS and WRF
which have
> > the
> > > same default settings but have "grid = [ "FULL" ]; poly = [];"
and the
> > > interpolation method for NEAREST and BILIN. In my understanding,
this
> is
> > > fine since the pb2nc should have only selected observations
within the
> > > masked region.
> > >
> > >
> > > I ran the point_stat command for both GFS and WRF forecast data,
which
> > > worked fine, but I noticed the discrepancy in the total number
of
> matched
> > > pairs.
> > >
> > >
> > > I also tried having the point_stat config files having " grid =
[];
> poly
> > =
> > > ["WRF_domain.nc"] " , but the same observation discrepancy
occurred.
> > >
> > >
> > > Am I maybe using the mask incorrectly? Or could it be that since
the
> GFS
> > > and WRF may have different horizontal resolutions, that there
could be
> > fewer
> > > matched pairs?
> > >
> > >
> > > I should mention I also tried creating a mask.poly file with
just a
> boxed
> > > region (e.g., Northern California) and included poly =
"mask.poly" in
> the
> > > PB2NC Config File. After doing this I got the same number of
> observations
> > > for GFS and WRF. So I'm not sure why using the mask.poly worked
fine,
> but
> > > using the WRF domain as a mask did not.
> > >
> > >
> > >
> > > Thanks for your time and thoughts! If you need to see any
example
> config
> > > files or WRF output, let me know and I can provide that.
> > >
> > >
> > > Thanks!
> > >
> > > Jeremy
> > >
> > >
> > >
> > >
> > >
> > >
> > > ________________________________
> > > From: John Halley Gotway via RT <met_help at ucar.edu>
> > > Sent: Friday, June 21, 2019 12:17:19 PM
> > > To: Berman, Jeremy D
> > > Cc: harrold at ucar.edu; jwolff at ucar.edu
> > > Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
Point_Stat
> > > with GDAS files after 2019061200
> > >
> > > Jeremy,
> > >
> > > Unfortunately, no, MET does not include logic for handling WRF's
hybrid
> > > vertical coordinate and interpolating to pressure or height
levels.  In
> > > addition, when using raw WRFOUT files, MET does not handle winds
well
> > since
> > > they're defined on the staggered grid.
> > >
> > > For these two reasons, we recommend that users run their WRFOUT
files
> > > through the Unified Post Processor.  It destaggers the winds and
> > > interpolates to the pressure level and height levels.  Here's a
link
> for
> > > info about UPP:
> > > https://dtcenter.org/community-code/unified-post-processor-upp
> > >
> > > UPP writes GRIB1 or GRIB2 output files and the MET tools can
handle
> those
> > > well.
> > >
> > > Thanks,
> > > John
> > >
> > > On Fri, Jun 21, 2019 at 12:28 AM Berman, Jeremy D via RT <
> > > met_help at ucar.edu>
> > > wrote:
> > >
> > > >
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685
>
> > > >
> > > > Hi John,
> > > >
> > > >
> > > > Thank you so much for your help! I was bogged down trying to
solve
> this
> > > > for a few days - I'm impressed you did it in two hours!
> > > >
> > > >
> > > > I made the changes and it worked for a case I did for
2019061800. I
> > > should
> > > > have mentioned I was looking at 2-meter Temperature and 10-
meter Wind
> > > > Speed, both of which use ADPSFC, which based on your
observation
> counts
> > > > explains why point_stat could not get any matched pairs.
> > > >
> > > >
> > > > I have another question if you don't mind me asking: if I want
to
> > compute
> > > > verification for a different vertical level, such as 100-meter
Wind
> > > Speed,
> > > > can MET do that for a WRF forecast file, even if the forecast
file
> does
> > > not
> > > > have a 100-meter level? Would MET be able to do a vertical
> > interpolation
> > > in
> > > > order to assess for that level (or any vertical above ground
level)?
> > > >
> > > >
> > > > Additionally: if I wanted to do verification with point_stat
for an
> > > entire
> > > > vertical profile (let's say from the surface to 1000 meters
above
> > ground
> > > > level) could MET do that as well? I know in the example MET
tutorial
> > > there
> > > > is a range of Temperature from 850-750hPa, but I was wondering
if
> this
> > > > could work for a range of vertical meters above ground?
> > > >
> > > >
> > > > Thank you for all your help!
> > > >
> > > >
> > > > Best,
> > > >
> > > > Jeremy
> > > >
> > > > ________________________________
> > > > From: John Halley Gotway via RT <met_help at ucar.edu>
> > > > Sent: Tuesday, June 18, 2019 6:50:34 PM
> > > > To: Berman, Jeremy D
> > > > Cc: harrold at ucar.edu; jwolff at ucar.edu
> > > > Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
> Point_Stat
> > > > with GDAS files after 2019061200
> > > >
> > > > Jeremy,
> > > >
> > > > I see you have a question about running the pb2nc and
point_stat
> tools.
> > > > That's interesting that you found that your verification
logic
> broke
> > on
> > > > 2019061206.  To check the behavior, I retrieved a PREPBUFR
file
> before
> > > and
> > > > after that time:
> > > >
> > > > BEFORE: wget
> > > >
> > > >
> > >
> >
>
ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gdas.20190611/gdas.t12z.prepbufr.nr
> > > > AFTER:   wget
> > > >
> > > >
> > >
> >
>
ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gdas.20190612/12/gdas.t12z.prepbufr.nr
> > > >
> > > > I ran them both through pb2nc, extracting ADPUPA and ADPSFC
message
> > > types.
> > > > pb2nc retained/derived 308,034 observations on the 11th but
only
> > 175,427
> > > on
> > > > the 12th.
> > > >
> > > > One obvious difference I did note is that they've changed the
> directory
> > > > structure on the ftp site.  On 20190612, they started
organizing the
> > data
> > > > into subdirectories for the hour (00, 06, 12, 18).  If you're
> running a
> > > > script to pull prepbufr files from the FTP site, please make
sure
> it's
> > > > finding them in the right sub-directory.
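If a script builds these paths, the date cutover can be handled explicitly. A minimal Python sketch (the function name is hypothetical; the URLs come from the links above):

```python
# Sketch of building the prepbufr URL for either FTP directory layout.
def prepbufr_url(date, hh):
    base = "ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod"
    if int(date) >= 20190612:  # new layout adds an HH subdirectory
        return f"{base}/gdas.{date}/{hh}/gdas.t{hh}z.prepbufr.nr"
    return f"{base}/gdas.{date}/gdas.t{hh}z.prepbufr.nr"

print(prepbufr_url("20190611", "12"))  # old layout, no hour subdirectory
print(prepbufr_url("20190612", "12"))  # new layout, with /12/ subdirectory
```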
> > > >
> > > > Doing some more digging I found...
> > > > The number of ADPUPA observations remains unchanged... 136,114
for
> both
> > > > days.
> > > > The number of ADPSFC (surface) observations is dramatically
> reduced...
> > > from
> > > > 171,920 down to 34,247!
> > > >
> > > > So next I reran pb2nc but setting quality_mark_thresh = 15; to
> retain
> > > all
> > > > observations regardless of the quality flag.
> > > > And that results in 332,235 and 337,818 observations on the
11th and
> > > 12th,
> > > > respectively.
> > > > The ADPUPA obs are very similar: 150,057 vs 155,787
> > > > The ADPSFC obs are also similar: 182,178 vs 182,031
> > > >
> > > > So the big difference is in the quality mark values.
> > > > On the 11th...
> > > > 182,178 observations = 171,920 with QM=2, 10 with QM=9, and
10,248
> with
> > > > other QM's.
> > > > On the 12th...
> > > > 182,031 observations = 34,247 with QM=2, 139,047 with QM=9,
and 8,737
> > > with
> > > > other QM's.
> > > >
> > > > I'm guessing that with the GFS upgrade, they changed their
GDAS
> > > > assimilation logic back to setting the quality marker = 9 for
surface
> > obs
> > > > to avoid assimilating them.
> > > >
> > > > So I'd recommend 2 things:
> > > > (1) In your PB2NC config file, set quality_mark_thresh = 9;
> > > > (2) In your Point-Stat config file, when verifying against surface
> > > > obs, set obs_quality = [0,1,2,9]; to use surface observations with
> > > > any of these quality marks.
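As a rough sketch of why recommendation (1) matters, here is an illustrative Python snippet (assumed semantics, not MET source: pb2nc retains obs whose quality mark is less than or equal to quality_mark_thresh, and a non-empty obs_quality list further restricts to the listed values):

```python
# Hypothetical sketch of the two quality-mark filters described above.

def retained(qm_values, qm_thresh=2, obs_quality=None):
    """Count observations surviving the quality-mark filters."""
    keep = [q for q in qm_values if q <= qm_thresh]
    if obs_quality:
        keep = [q for q in keep if q in obs_quality]
    return len(keep)

# Rough 2019-06-12 ADPSFC breakdown from the counts above
# (the "other QM" obs are lumped under a placeholder value of 15).
qms = [2] * 34247 + [9] * 139047 + [15] * 8737

print(retained(qms))  # default thresh=2 drops all the QM=9 surface obs
print(retained(qms, qm_thresh=9, obs_quality=[0, 1, 2, 9]))
```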
> > > >
> > > > Hope that helps get you going.
> > > >
> > > > Thanks,
> > > > John Halley Gotway
> > > >
> > > > On Tue, Jun 18, 2019 at 4:06 PM Randy Bullock via RT <
> > met_help at ucar.edu>
> > > > wrote:
> > > >
> > > > >
> > > > > Tue Jun 18 16:05:37 2019: Request 90685 was acted upon.
> > > > > Transaction: Given to johnhg (John Halley Gotway) by bullock
> > > > >        Queue: met_help
> > > > >      Subject: Issue with using MET Point_Stat with GDAS
files after
> > > > > 2019061200
> > > > >        Owner: johnhg
> > > > >   Requestors: jdberman at albany.edu
> > > > >       Status: new
> > > > >  Ticket <URL:
> > https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685
> > > >
> > > > >
> > > > >
> > > > > This transaction appears to have no content
> > > > >
> > > >
> > > >
> > > >
> > >
> > >
> > >
> >
> >
> >
>
>
>

------------------------------------------------
Subject: Issue with using MET Point_Stat with GDAS files after 2019061200
From: Berman, Jeremy D
Time: Mon Jul 22 10:58:26 2019

Thanks John! That was exactly what I needed and it worked! Thank you
for the very clear explanation.


Best!

Jeremy

________________________________
From: John Halley Gotway via RT <met_help at ucar.edu>
Sent: Friday, July 19, 2019 5:59:10 PM
To: Berman, Jeremy D <jdberman at albany.edu>
Cc: harrold at ucar.edu <harrold at ucar.edu>; jwolff at ucar.edu
<jwolff at ucar.edu>
Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET Point_Stat
with GDAS files after 2019061200

Jeremy,

I see that you'd like to filter the observations and see how the
continuous
statistics change as the wind speed increases.  You're definitely on
the
right track.  The "cnt_thresh" option is used to subset the matched
pairs
before computing statistics.  And it's a great question as to whether
that
filter is applied to the fcst values, obs values, or both.

Here's a selection from the data/config/README file which describes
this:
https://github.com/NCAR/MET/blob/master_v8.1/met/data/config/README
// - The "cnt_thresh" entry is an array of thresholds for filtering
// data prior to computing continuous statistics and partial sums.
//
// - The "cnt_logic" entry may be set to UNION, INTERSECTION, or
SYMDIFF
// and controls the logic for how the forecast and observed cnt_thresh
// settings are combined when filtering matched pairs of forecast and
// observed values.

Let's say your Point-Stat config file looks like this to verify 10-
meter
winds:




*fcst = {   field = [ { name = "WIND"; level = [ "Z10" ]; } ];}*

*obs = {*

*   field = [ { name = "WIND"; level = [ "Z10" ]; } ];}*

If you add "cnt_thresh = [ >1.0 ];" outside of both the fcst and obs
dictionaries, the setting will apply to both fcst and obs values.  Put
it
inside the "obs", and it'll only apply to the obs.  And the default
cnt_logic setting is union:
*   cnt_logic = UNION;*

For what you're doing, I think applying it only to the obs makes for
better
comparison between models.  So I'd suggest:

*cnt_logic = INTERSECTION;*

*fcst = {*

*   field = [ { *
*           name = "WIND"; level = [ "Z10" ];*

*           cnt_thresh = [ >1.0, >2.0, >3.0 ];*


*     } ];}*

*obs = {*
*   field = [ {*
*           name = "WIND"; level = [ "Z10" ];*
*           cnt_thresh = [ NA, NA, NA ];*
*   } ];*
*}*

If you define 3 CNT thresh settings for the forecast, then you need 3
for
the obs too.  The NA threshold always evaluated to true.  So that's
why you
want the intersection logic.

Hope that helps.

Thanks,
John

On Wed, Jul 17, 2019 at 2:50 PM Berman, Jeremy D via RT
<met_help at ucar.edu>
wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
>
> Hi John,
>
>
> Thanks so much for your help earlier! I hope you did not have to
work all
> day on July 4th, and that you got to enjoy the holiday!
>
>
> Sorry about the earlier confusion about the GFS vs WRF pairs. I
decided
> that to maintain the same observations for comparing GFS and WRF, I
would
> just use the mask_poly feature and just be careful to not prescribe
a poly
> line too close to my WRF domain boundary. So far it works!
>
>
> One quick question if I may ask you. As I described earlier, I'm
comparing
> my WRF forecasts vs. GFS forecasts over a region of northern
California.
> I'm using pb2nc for my observations from PREPBUFR, and point_stat
for the
> matched pairs. I've been looking at the continuous statistics (CNT)
and
> categorical statistics (CTS) for 10m Wind Speed, and I'm seeing that
GFS
> routinely has lower RMSE, BIAS, etc over the region for a number of
> forecast lead times.
>
>
> Of course there are many reasons that could explain why my WRF
forecast
> doesn't perform as well. One thing I would like to investigate is
whether
> the GFS performs better because there are observations with near 0
m/s
> values. So my question is: how can I filter the observations to only
> compute the CNT statistics for observations with values greater
than, say,
> 0.5 m/s?
>
>
> I know in the Point_Stat Configuration Files there is a setting
called
> "cnt_thres[]" and "cnt_logic", however do these settings only apply
to the
> forecasted value or the observed value? For example, if the
observation
> value is 0.7 m/s, but the forecasted value is 0 m/s, if I set the
cnt_thres
> to 0.5, would it count that observation, and find a matched pair, or
not? I
> want to filter just the observation values, so that my GFS and WRF
> forecasts are being "apple to apple" comparisons.
>
>
> My hope is that I can try to better understand my GFS vs. WRF
statistical
> comparison by trying to filter the observation data by only looking
at CNT
> for high wind speeds. The GFS has a coarser spatial resolution, so I
would
> think a priori that my WRF simulation (3 km) would provide better
wind
> forecasts. So hopefully filtering the observation dataset will help
> diagnose the differences.
>
>
> Thanks!
>
> Jeremy
>
>
>
>
> ________________________________
> From: John Halley Gotway via RT <met_help at ucar.edu>
> Sent: Thursday, July 4, 2019 9:32 AM
> To: Berman, Jeremy D
> Cc: harrold at ucar.edu; jwolff at ucar.edu
> Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
Point_Stat
> with GDAS files after 2019061200
>
> Jeremy,
>
> I’m glad you’ve been able to make sense of the discrepancies in the
number
> of pairs.  And I’m glad to hear that it’s GFS with more pairs than
WRF.
> Your earlier email saying the opposite had me confused!
>
> “but I'm getting slightly more matched pairs for my WRF run than the
GFS
> (54 vs 52 obs).”
>
> There has got to be a good way of handling this using MET but I just
don’t
> know what it is off the top of my head.
>
> Regarding (1), yes definitely.  In the Point-Stat config file, look
in the
> “mask” dictionary and use the “sid” entry to specify a list of
station id’s
> to be included in the stats.  Be sure to read about that in the
users guide
> or the data/config/README file.
>
> Regarding (2), AFPSFC is not a good choice to verify vertical levels
above
> ground.  Please try ADPUPA (upper-air) instead. Or perhaps some
other obs
> type?  I know a lot about the software but don’t know a lot about
breadth
> of available obs.
>
> Here’s why ADPSFC is a bad choice.  MET inherited this logic for
surface
> verification from the verification done at NOAA/EMC.  When verifying
> vertical level forecast against ADFSFC or SFCSHP obs, just match
those obs
> without actually checking the height info.  This works for 2-m temp
and
> 10-m winds but not 305-m winds!
>
> Thanks
> John
>
>
> On Wed, Jul 3, 2019 at 3:47 PM Berman, Jeremy D via RT
<met_help at ucar.edu>
> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
> >
> > Hi John,
> >
> >
> > Thanks for your input! I did as you suggested and found that the
missing
> > points between my WRF and GFS were all along the northern or
eastern
> > boundary of my WRF domain.
> >
> >
> > When doing the "diff" command between the Point_Stat files for WRF
and
> GFS
> > with -v 3, I noticed the lines for the WRF file:
> >
> >
> > "Processing WIND/Z10 versus WIND/Z10, for observation type ADPSFC,
over
> > region FULL, for interpolation method BILIN(4), using 52 pairs"
> >
> > "DEBUG 3: Rejected: off the grid   = 1"
> >
> > "Rejected: bad fcst value = 1"
> >
> >
> > While for GFS:
> >
> >
> > "Processing WIND/Z10 versus WIND/Z10, for observation type ADPSFC,
over
> > region FULL, for interpolation method BILIN(4), using 54 pairs"
> >
> > "DEBUG 3: Rejected: off the grid   = 0"
> >
> > "Rejected: bad fcst value = 0"
> >
> >
> > I looked into the MPR files and found two stations that were
different.
> > Those two stations were near the northern or eastern boundary of
my WRF
> > domain, and one of them was actually outside the WRF domain.
> >
> >
> > So my thinking is that the discrepancy occurs because (1) an
observation
> > is close to a boundary and (2) the bilinear interpolation approach
can't
> > find all the neighbor values because it is too close to a
boundary. I
> > tested these same MET commands, but for the NEAREST interpolation
method:
> > there still was 1 "rejected: off the grid", but 0 "rejected: bad
fcst
> > value". So I think this supports my hypothesis. I think the reason
the
> GFS
> > has more observation matches is because it has a global domain, so
there
> is
> > no boundary issue for observations, like there is for my WRF
simulation.
> >
> >
> > I think to fix this issue, and to get the same number of WRF to
GFS
> > observation matches, I'll have to only use a polyline described by
> > "mask.poly", and I'll have to write a script that checks that the
poly
> line
> > is not too close to the WRF domain (e.g., within 5 grid points).
> >
> >
> >
> > I have another, related, question, if you don't mind me asking :)
> >
> >
> > 1. is it possible to select specific METAR locations to run
> > PB2NC/Point_Stat on, instead of using all the observations within
a poly
> > mask? For example, if I am looking at Northern California, but I
only
> want
> > to validate averaged over the stations: KAAT, KACV, KBFL, KBLU,
KFAT,
> KPRB,
> > KSAC for CNT/CTS files? I know the individual stations are listed
in MPR,
> > but I wondered how I could do this for the CNT/CTS calculations.
> >
> >
> > 2. If I have multiple wind speed vertical levels from WRF run
under UPP
> > such as Z10, Z30, Z80, Z100, Z305, which observation type would be
most
> > appropriate to validate those higher altitude levels? I had used
ADPSFC,
> > and it actually got matches for those higher altitude levels of
Z100 and
> > Z305, but the observed wind speeds don't seem believable, so
perhaps I am
> > using that incorrectly. Is there a more appropriate observation
type
> you'd
> > suggest to verify against? Maybe profilers?
> >
> >
> >
> > Thank you so much for all your help through this process. I look
forward
> > to hearing your thoughts on these other questions.
> >
> >
> > Have a Happy Fourth of July!!
> >
> >
> > Jeremy
> >
> >
> >
> >
> > ________________________________
> > From: John Halley Gotway via RT <met_help at ucar.edu>
> > Sent: Monday, July 1, 2019 1:27:23 PM
> > To: Berman, Jeremy D
> > Cc: harrold at ucar.edu; jwolff at ucar.edu
> > Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
Point_Stat
> > with GDAS files after 2019061200
> >
> > Jeremy,
> >
> > I see you have a question about why you get slightly different
number of
> > matched pairs when verifying your WRF output and GFS output over
the same
> > spatial area.  Thanks for describing your processing logic.
> >
> > So you're getting 54 matched pairs for WRF and only 52 for GFS.
I'm
> trying
> > to reconcile in my mind why that would be the case.  Differences
like
> this
> > tend to occur along the edge of the domain.  But I would have
guessed the
> > result be the other way around... you'd get 2 fewer pairs for your
the
> > limited area of the WRF run and two more for the global GFS.
> >
> > I'd suggest digging a little deeper, and here's how...
> >
> > (1) Rerun Point-Stat at verbosity level 3 by specifying "-v 3" on
the
> > command line.  For each verification task, that'll dump out counts
for
> why
> > obs were or were not used.  Diffing the logs between WRF and GFS,
you
> might
> > find a difference which could point to an explanation.
> >
> > (2) I see from your Point-Stat config file that you already have
the MPR
> > output line type turned on.  You could look at that output to
determine
> > which stations were included for WRF but not GFS.  That might
yield
> another
> > clue.
> >
> > Do those methods shed any light on the differences?
> >
> > Also, I plotted the WRF domain you sent and see that it's over the
> eastern
> > pacific.  Be aware that the ADPSFC obs are only over land, a small
> fraction
> > of your domain.  Surface water points are encoded as "SFCSHP" in
> PREPBUFR.
> > So I'd suggest setting:
> >
> > message_type   = [ "ADPSFC", "SFCSHP" ];
> >
> > That'll verify them as two separate tasks.  If you want to process
them
> in
> > one big group, just use:
> >
> > message_type   = [ "ONLYSF" ]; // That's defined in
> message_type_group_map
> > as the union of ADPSFC and SFCSHP
> >
> > Thanks,
> > John
> >
> >
> > On Fri, Jun 28, 2019 at 3:58 PM Berman, Jeremy D via RT
> > <met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
> > >
> > > Hi John,
> > >
> > >
> > > I realized it's probably most helpful to you if I provided the files
> > > and mask that I was using, so you can see what I was doing.  Please
> > > let me know if you can see these files (I didn't include the WRF
> > > output file in this email since it is large (>45 MB), but I can send
> > > it another way if you need to see it).
> > >
> > >
> > > Jeremy
> > >
> > >
> > >
> > > ________________________________
> > > From: Berman, Jeremy D
> > > Sent: Friday, June 28, 2019 5:48:53 PM
> > > To: met_help at ucar.edu
> > > Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET Point_Stat
> > > with GDAS files after 2019061200
> > >
> > >
> > > Hi John,
> > >
> > >
> > > Thanks for your reply! The UPP information was very helpful, and I've
> > > used it successfully on my WRF runs.
> > >
> > >
> > > I've encountered another minor issue, if you could provide some
> > > insight.  I've been trying to compare GFS forecasts vs. my WRF
> > > forecasts over the same verification region, such as only over my WRF
> > > domain (e.g., extending over the Pacific and California).  For
> > > reference, my WRF is a 12-km single-domain run.
> > >
> > >
> > > I've noticed that after doing the steps below, I am verifying over
> > > my intended region (the WRF domain), but I'm getting slightly more
> > > matched pairs for my WRF run than the GFS (54 vs. 52 obs).  Do you
> > > have any idea why this could be happening and how I could get the
> > > number of matched observations to be the same?
> > >
> > >
> > > Here is my procedure:
> > >
> > >
> > > I use the Gen-Vx-Mask tool (gen_vx_mask) with "-type grid" on a
> > > UPP'ed WRF forecast file to create a mask saved as a .nc file:
> > >
> > >
> > > gen_vx_mask -type grid WRFPRS_d01_f001 WRFPRS_d01_f001 WRF_domain.nc
> > >
> > >
> > > Then, I modified my pb2nc config file ("PB2NCConfig_PREPBUFR_GDAS")
> > > by setting: poly = "WRF_domain.nc"
> > >
> > >
> > > Then I run the "pb2nc" command on the PREPBUFR GDAS data to select
> > > observations over my masked area:
> > >
> > >
> > > pb2nc PREPBUFR_GDAS_f000.nr PREPBUFR_GDAS_f000_pb.nc PB2NCConfig_PREPBUFR_GDAS -v 2
> > >
> > >
> > > This creates a file called "PREPBUFR_GDAS_f000_pb.nc", which I
> > > checked contains observations over my WRF mask.
> > >
> > >
> > > Then I created point_stat config files for both GFS and WRF, which
> > > have the same default settings but with "grid = [ "FULL" ]; poly =
> > > [];" and the interpolation methods NEAREST and BILIN.  In my
> > > understanding, this is fine since pb2nc should have only selected
> > > observations within the masked region.
> > >
> > >
> > > I ran the point_stat command for both GFS and WRF forecast data,
> > > which worked fine, but I noticed the discrepancy in the total number
> > > of matched pairs.
> > >
> > >
> > > I also tried having the point_stat config files use " grid = [];
> > > poly = ["WRF_domain.nc"] ", but the same observation discrepancy
> > > occurred.
> > >
> > >
> > > Am I maybe using the mask incorrectly?  Or could it be that, since
> > > the GFS and WRF may have different horizontal resolutions, there
> > > could be fewer matched pairs?
> > >
> > >
> > > I should mention I also tried creating a mask.poly file with just a
> > > boxed region (e.g., Northern California) and included poly =
> > > "mask.poly" in the PB2NC config file.  After doing this I got the
> > > same number of observations for GFS and WRF.  So I'm not sure why
> > > using the mask.poly worked fine, but using the WRF domain as a mask
> > > did not.
> > >
> > >
> > >
> > > Thanks for your time and thoughts! If you need to see any example
> > > config files or WRF output, let me know and I can provide that.
> > >
> > >
> > > Thanks!
> > >
> > > Jeremy
> > >
> > >
> > >
> > >
> > >
> > >
> > > ________________________________
> > > From: John Halley Gotway via RT <met_help at ucar.edu>
> > > Sent: Friday, June 21, 2019 12:17:19 PM
> > > To: Berman, Jeremy D
> > > Cc: harrold at ucar.edu; jwolff at ucar.edu
> > > Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET Point_Stat
> > > with GDAS files after 2019061200
> > >
> > > Jeremy,
> > >
> > > Unfortunately, no, MET does not include logic for handling WRF's
> > > hybrid vertical coordinate and interpolating to pressure or height
> > > levels.  In addition, when using raw WRFOUT files, MET does not
> > > handle winds well since they're defined on the staggered grid.
> > >
> > > For these two reasons, we recommend that users run their WRFOUT files
> > > through the Unified Post Processor.  It destaggers the winds and
> > > interpolates to pressure and height levels.  Here's a link for info
> > > about UPP:
> > > https://dtcenter.org/community-code/unified-post-processor-upp
> > >
> > > UPP writes GRIB1 or GRIB2 output files, and the MET tools can handle
> > > those well.
> > >
> > > Thanks,
> > > John
> > >
> > > On Fri, Jun 21, 2019 at 12:28 AM Berman, Jeremy D via RT
> > > <met_help at ucar.edu> wrote:
> > >
> > > >
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685
>
> > > >
> > > > Hi John,
> > > >
> > > >
> > > > Thank you so much for your help! I was bogged down trying to solve
> > > > this for a few days - I'm impressed you did it in two hours!
> > > >
> > > >
> > > > I made the changes and it worked for a case I did for 2019061800.
> > > > I should have mentioned I was looking at 2-meter Temperature and
> > > > 10-meter Wind Speed, both of which use ADPSFC, which based on your
> > > > observation counts explains why point_stat could not get any
> > > > matched pairs.
> > > >
> > > >
> > > > I have another question if you don't mind me asking: if I want to
> > > > compute verification for a different vertical level, such as
> > > > 100-meter Wind Speed, can MET do that for a WRF forecast file,
> > > > even if the forecast file does not have a 100-meter level?  Would
> > > > MET be able to do a vertical interpolation in order to assess that
> > > > level (or any vertical level above ground)?
> > > >
> > > >
> > > > Additionally: if I wanted to do verification with point_stat for
> > > > an entire vertical profile (let's say from the surface to 1000
> > > > meters above ground level), could MET do that as well?  I know the
> > > > example MET tutorial has a range of Temperature from 850-750 hPa,
> > > > but I was wondering if this could work for a range of vertical
> > > > meters above ground?
> > > >
> > > >
> > > > Thank you for all your help!
> > > >
> > > >
> > > > Best,
> > > >
> > > > Jeremy
> > > >
> > > > ________________________________
> > > > From: John Halley Gotway via RT <met_help at ucar.edu>
> > > > Sent: Tuesday, June 18, 2019 6:50:34 PM
> > > > To: Berman, Jeremy D
> > > > Cc: harrold at ucar.edu; jwolff at ucar.edu
> > > > Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
> > > > Point_Stat with GDAS files after 2019061200
> > > >
> > > > Jeremy,
> > > >
> > > > I see you have a question about running the pb2nc and point_stat
> > > > tools.  That's interesting that you found that your verification
> > > > logic broke on 2019061206.  To check the behavior, I retrieved a
> > > > PREPBUFR file before and after that time:
> > > >
> > > > BEFORE: wget
> > > > ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gdas.20190611/gdas.t12z.prepbufr.nr
> > > > AFTER:  wget
> > > > ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gdas.20190612/12/gdas.t12z.prepbufr.nr
> > > >
> > > > I ran them both through pb2nc, extracting ADPUPA and ADPSFC
> > > > message types.  pb2nc retained/derived 308,034 observations on the
> > > > 11th but only 175,427 on the 12th.
> > > >
> > > > One obvious difference I did note is that they've changed the
> > > > directory structure on the FTP site.  On 20190612, they started
> > > > organizing the data into subdirectories by hour (00, 06, 12, 18).
> > > > If you're running a script to pull prepbufr files from the FTP
> > > > site, please make sure it's finding them in the right
> > > > subdirectory.
> > > >
> > > > Doing some more digging I found...
> > > > The number of ADPUPA observations remains unchanged... 136,114 for
> > > > both days.
> > > > The number of ADPSFC (surface) observations is dramatically
> > > > reduced... from 171,920 down to 34,247!
> > > >
> > > > So next I reran pb2nc but set quality_mark_thresh = 15; to retain
> > > > all observations regardless of the quality flag.
> > > > That results in 332,235 and 337,818 observations on the 11th and
> > > > 12th, respectively.
> > > > The ADPUPA obs are very similar: 150,057 vs 155,787
> > > > The ADPSFC obs are also similar: 182,178 vs 182,031
> > > >
> > > > So the big difference is in the quality mark values.
> > > > On the 11th...
> > > > 182,178 observations = 171,920 with QM=2, 10 with QM=9, and 10,248
> > > > with other QMs.
> > > > On the 12th...
> > > > 182,031 observations = 34,247 with QM=2, 139,047 with QM=9, and
> > > > 8,737 with other QMs.
> > > >
> > > > I'm guessing that with the GFS upgrade, they changed their GDAS
> > > > assimilation logic back to setting the quality marker = 9 for
> > > > surface obs to avoid assimilating them.
> > > >
> > > > So I'd recommend 2 things:
> > > > (1) In your PB2NC config file, set quality_mark_thresh = 9;
> > > > (2) In your Point-Stat config file, when verifying against surface
> > > > obs, set obs_quality = [ "0", "1", "2", "9" ]; to use surface
> > > > observations with any of these quality marks.
> > > >
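> > > > In config-file form, those two settings would look something like
> > > > this (a sketch; note that the obs_quality entries are quoted
> > > > strings):
> > > >
> > > > PB2NC config:      quality_mark_thresh = 9;
> > > > Point-Stat config: obs_quality = [ "0", "1", "2", "9" ];
> > > >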
> > > > Hope that helps get you going.
> > > >
> > > > Thanks,
> > > > John Halley Gotway
> > > >
> > > > On Tue, Jun 18, 2019 at 4:06 PM Randy Bullock via RT
> > > > <met_help at ucar.edu> wrote:
> > > >
> > > > >
> > > > > Tue Jun 18 16:05:37 2019: Request 90685 was acted upon.
> > > > > Transaction: Given to johnhg (John Halley Gotway) by bullock
> > > > >        Queue: met_help
> > > > >      Subject: Issue with using MET Point_Stat with GDAS
files after
> > > > > 2019061200
> > > > >        Owner: johnhg
> > > > >   Requestors: jdberman at albany.edu
> > > > >       Status: new
> > > > >  Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
> > > > >
> > > > >
> > > > > This transaction appears to have no content
> > > > >
> > > >
> > > >
> > > >
> > >
> > >
> > >
> >
> >
> >
>
>
>


------------------------------------------------
Subject: Issue with using MET Point_Stat with GDAS files after 2019061200
From: John Halley Gotway
Time: Mon Jul 22 11:04:51 2019

Great, glad to hear that did the trick.  I'll resolve the ticket now.

John

On Mon, Jul 22, 2019 at 10:58 AM Berman, Jeremy D via RT
<met_help at ucar.edu>
wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
>
> Thanks John! That was exactly what I needed and it worked! Thank you for
> the very clear explanation.
>
>
> Best!
>
> Jeremy
>
> ________________________________
> From: John Halley Gotway via RT <met_help at ucar.edu>
> Sent: Friday, July 19, 2019 5:59:10 PM
> To: Berman, Jeremy D <jdberman at albany.edu>
> Cc: harrold at ucar.edu <harrold at ucar.edu>; jwolff at ucar.edu <jwolff at ucar.edu>
> Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET Point_Stat
> with GDAS files after 2019061200
>
> Jeremy,
>
> I see that you'd like to filter the observations and see how the
> continuous statistics change as the wind speed increases.  You're
> definitely on the right track.  The "cnt_thresh" option is used to
> subset the matched pairs before computing statistics.  And it's a great
> question as to whether that filter is applied to the fcst values, obs
> values, or both.
>
> Here's a selection from the data/config/README file which describes this:
> https://github.com/NCAR/MET/blob/master_v8.1/met/data/config/README
>
> // - The "cnt_thresh" entry is an array of thresholds for filtering
> //   data prior to computing continuous statistics and partial sums.
> //
> // - The "cnt_logic" entry may be set to UNION, INTERSECTION, or SYMDIFF
> //   and controls the logic for how the forecast and observed cnt_thresh
> //   settings are combined when filtering matched pairs of forecast and
> //   observed values.
>
> Let's say your Point-Stat config file looks like this to verify 10-meter
> winds:
>
> fcst = {
>    field = [ { name = "WIND"; level = [ "Z10" ]; } ];
> }
>
> obs = {
>    field = [ { name = "WIND"; level = [ "Z10" ]; } ];
> }
>
> If you add "cnt_thresh = [ >1.0 ];" outside of both the fcst and obs
> dictionaries, the setting will apply to both fcst and obs values.  Put it
> inside the "obs", and it'll only apply to the obs.  And the default
> cnt_logic setting is union:
>
> cnt_logic = UNION;
>
> For what you're doing, I think applying it only to the obs makes for a
> better comparison between models.  So I'd suggest:
>
> cnt_logic = INTERSECTION;
>
> fcst = {
>    field = [ {
>            name = "WIND"; level = [ "Z10" ];
>            cnt_thresh = [ NA, NA, NA ];
>    } ];
> }
>
> obs = {
>    field = [ {
>            name = "WIND"; level = [ "Z10" ];
>            cnt_thresh = [ >1.0, >2.0, >3.0 ];
>    } ];
> }
>
> If you define 3 cnt_thresh settings for the forecast, then you need 3 for
> the obs too.  The NA threshold always evaluates to true.  So that's why
> you want the intersection logic.
>
> Hope that helps.
>
> Thanks,
> John
>
> On Wed, Jul 17, 2019 at 2:50 PM Berman, Jeremy D via RT
> <met_help at ucar.edu> wrote:
>
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
> >
> > Hi John,
> >
> >
> > Thanks so much for your help earlier! I hope you did not have to work
> > all day on July 4th, and that you got to enjoy the holiday!
> >
> >
> > Sorry about the earlier confusion about the GFS vs. WRF pairs.  I
> > decided that, to maintain the same observations for comparing GFS and
> > WRF, I would just use the mask_poly feature and be careful not to
> > prescribe a polyline too close to my WRF domain boundary.  So far it
> > works!
> >
> >
> > One quick question if I may ask you.  As I described earlier, I'm
> > comparing my WRF forecasts vs. GFS forecasts over a region of northern
> > California.  I'm using pb2nc for my observations from PREPBUFR, and
> > point_stat for the matched pairs.  I've been looking at the continuous
> > statistics (CNT) and categorical statistics (CTS) for 10-m Wind Speed,
> > and I'm seeing that GFS routinely has lower RMSE, BIAS, etc. over the
> > region for a number of forecast lead times.
> >
> >
> > Of course there are many reasons that could explain why my WRF
> > forecast doesn't perform as well.  One thing I would like to
> > investigate is whether the GFS performs better because there are
> > observations with near-0 m/s values.  So my question is: how can I
> > filter the observations to only compute the CNT statistics for
> > observations with values greater than, say, 0.5 m/s?
> >
> >
> > I know in the Point-Stat configuration files there are settings called
> > "cnt_thresh[]" and "cnt_logic"; however, do these settings apply only
> > to the forecast value or to the observed value?  For example, if the
> > observation value is 0.7 m/s but the forecast value is 0 m/s, and I set
> > cnt_thresh to 0.5, would it count that observation and find a matched
> > pair, or not?  I want to filter just the observation values, so that my
> > GFS and WRF forecasts are "apples to apples" comparisons.
> >
> >
> > My hope is that I can try to better understand my GFS vs. WRF
> > statistical comparison by filtering the observation data and only
> > looking at CNT for high wind speeds.  The GFS has a coarser spatial
> > resolution, so I would think a priori that my WRF simulation (3 km)
> > would provide better wind forecasts.  So hopefully filtering the
> > observation dataset will help diagnose the differences.
> >
> >
> > Thanks!
> >
> > Jeremy
> >
> >
> >
> >
> > ________________________________
> > From: John Halley Gotway via RT <met_help at ucar.edu>
> > Sent: Thursday, July 4, 2019 9:32 AM
> > To: Berman, Jeremy D
> > Cc: harrold at ucar.edu; jwolff at ucar.edu
> > Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET Point_Stat
> > with GDAS files after 2019061200
> >
> > Jeremy,
> >
> > I'm glad you've been able to make sense of the discrepancies in the
> > number of pairs.  And I'm glad to hear that it's GFS with more pairs
> > than WRF.  Your earlier email saying the opposite had me confused!
> >
> > "but I'm getting slightly more matched pairs for my WRF run than the
> > GFS (54 vs 52 obs)."
> >
> > There has got to be a good way of handling this using MET, but I just
> > don't know what it is off the top of my head.
> >
> > Regarding (1), yes definitely.  In the Point-Stat config file, look in
> > the "mask" dictionary and use the "sid" entry to specify a list of
> > station IDs to be included in the stats.  Be sure to read about that in
> > the users guide or the data/config/README file.
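> >
> > For example, a sketch (double-check the exact "sid" syntax in the
> > README for your MET version), using the station IDs of interest:
> >
> > mask = {
> >    grid = [];
> >    poly = [];
> >    sid  = [ "KAAT", "KACV", "KBFL", "KBLU", "KFAT", "KPRB", "KSAC" ];
> > }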
> >
> > Regarding (2), ADPSFC is not a good choice to verify vertical levels
> > above ground.  Please try ADPUPA (upper-air) instead.  Or perhaps some
> > other obs type?  I know a lot about the software but not a lot about
> > the breadth of available obs.
> >
> > Here's why ADPSFC is a bad choice.  MET inherited this logic for
> > surface verification from the verification done at NOAA/EMC.  When
> > verifying a vertical level forecast against ADPSFC or SFCSHP obs, it
> > just matches those obs without actually checking the height info.
> > That works for 2-m temp and 10-m winds, but not 305-m winds!
> >
> > Thanks
> > John
> >
> >
> > On Wed, Jul 3, 2019 at 3:47 PM Berman, Jeremy D via RT
> > <met_help at ucar.edu> wrote:
> >
> > >
> > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
> > >
> > > Hi John,
> > >
> > >
> > > Thanks for your input! I did as you suggested and found that the
> > > missing points between my WRF and GFS were all along the northern or
> > > eastern boundary of my WRF domain.
> > >
> > >
> > > When doing the "diff" command between the Point_Stat files for WRF
> > > and GFS with -v 3, I noticed these lines for the WRF file:
> > >
> > >
> > > "Processing WIND/Z10 versus WIND/Z10, for observation type ADPSFC,
> > > over region FULL, for interpolation method BILIN(4), using 52 pairs"
> > >
> > > "DEBUG 3: Rejected: off the grid   = 1"
> > >
> > > "Rejected: bad fcst value = 1"
> > >
> > >
> > > While for GFS:
> > >
> > >
> > > "Processing WIND/Z10 versus WIND/Z10, for observation type ADPSFC,
> > > over region FULL, for interpolation method BILIN(4), using 54 pairs"
> > >
> > > "DEBUG 3: Rejected: off the grid   = 0"
> > >
> > > "Rejected: bad fcst value = 0"
> > >
> > >
> > > I looked into the MPR files and found two stations that were
> > > different.  Those two stations were near the northern or eastern
> > > boundary of my WRF domain, and one of them was actually outside the
> > > WRF domain.
> > >
> > >
> > > So my thinking is that the discrepancy occurs because (1) an
> > > observation is close to a boundary, and (2) the bilinear
> > > interpolation approach can't find all the neighbor values because it
> > > is too close to a boundary.  I tested the same MET commands with the
> > > NEAREST interpolation method: there was still 1 "rejected: off the
> > > grid", but 0 "rejected: bad fcst value", so I think this supports my
> > > hypothesis.  I think the reason the GFS has more observation matches
> > > is that it has a global domain, so there is no boundary issue for
> > > observations like there is for my WRF simulation.
> > >
> > >
> > > I think to fix this issue, and to get the same number of WRF and GFS
> > > observation matches, I'll have to use only a polyline described by
> > > "mask.poly", and I'll have to write a script that checks that the
> > > polyline is not too close to the WRF domain boundary (e.g., within 5
> > > grid points).
> > >
> > >
> > >
> > > I have another, related question, if you don't mind me asking :)
> > >
> > >
> > > 1. Is it possible to select specific METAR locations to run
> > > PB2NC/Point_Stat on, instead of using all the observations within a
> > > poly mask?  For example, if I am looking at Northern California but
> > > only want to validate averaged over the stations KAAT, KACV, KBFL,
> > > KBLU, KFAT, KPRB, and KSAC for the CNT/CTS files?  I know the
> > > individual stations are listed in MPR, but I wondered how I could do
> > > this for the CNT/CTS calculations.
> > >
> > >
> > > 2. If I have multiple wind speed vertical levels from WRF run
> > > through UPP, such as Z10, Z30, Z80, Z100, and Z305, which
> > > observation type would be most appropriate to validate those
> > > higher-altitude levels?  I had used ADPSFC, and it actually got
> > > matches for the higher-altitude levels of Z100 and Z305, but the
> > > observed wind speeds don't seem believable, so perhaps I am using it
> > > incorrectly.  Is there a more appropriate observation type you'd
> > > suggest verifying against?  Maybe profilers?
> > >
> > >
> > >
> > > Thank you so much for all your help through this process.  I look
> > > forward to hearing your thoughts on these other questions.
> > >
> > >
> > > Have a Happy Fourth of July!!
> > >
> > >
> > > Jeremy
> > >
> > >
> > >
> > >
> > > ________________________________
> > > From: John Halley Gotway via RT <met_help at ucar.edu>
> > > Sent: Monday, July 1, 2019 1:27:23 PM
> > > To: Berman, Jeremy D
> > > Cc: harrold at ucar.edu; jwolff at ucar.edu
> > > Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
Point_Stat
> > > with GDAS files after 2019061200
> > >
> > > Jeremy,
> > >
> > > I see you have a question about why you get slightly different
number
> of
> > > matched pairs when verifying your WRF output and GFS output over
the
> same
> > > spatial area.  Thanks for describing your processing logic.
> > >
> > > So you're getting 54 matched pairs for WRF and only 52 for GFS.
I'm
> > trying
> > > to reconcile in my mind why that would be the case.  Differences
like
> > this
> > > tend to occur along the edge of the domain.  But I would have
guessed
> the
> > > result be the other way around... you'd get 2 fewer pairs for
your the
> > > limited area of the WRF run and two more for the global GFS.
> > >
> > > I'd suggest digging a little deeper, and here's how...
> > >
> > > (1) Rerun Point-Stat at verbosity level 3 by specifying "-v 3"
on the
> > > command line.  For each verification task, that'll dump out
counts for
> > why
> > > obs were or were not used.  Diffing the logs between WRF and
GFS, you
> > might
> > > find a difference which could point to an explanation.
> > >
> > > (2) I see from your Point-Stat config file that you already have
the
> MPR
> > > output line type turned on.  You could look at that output to
determine
> > > which stations were included for WRF but not GFS.  That might
yield
> > another
> > > clue.
> > >
> > > Do those methods shed any light on the differences?
> > >
> > > Also, I plotted the WRF domain you sent and see that it's over
the
> > eastern
> > > pacific.  Be aware that the ADPSFC obs are only over land, a
small
> > fraction
> > > of your domain.  Surface water points are encoded as "SFCSHP" in
> > PREPBUFR.
> > > So I'd suggest setting:
> > >
> > > message_type   = [ "ADPSFC", "SFCSHP" ];
> > >
> > > That'll verify them as two separate tasks.  If you want to
process them
> > in
> > > one big group, just use:
> > >
> > > message_type   = [ "ONLYSF" ]; // That's defined in
> > message_type_group_map
> > > as the union of ADPSFC and SFCSHP
> > >
> > > Thanks,
> > > John
> > >
> > >
> > > On Fri, Jun 28, 2019 at 3:58 PM Berman, Jeremy D via RT <
> > met_help at ucar.edu
> > > >
> > > wrote:
> > >
> > > >
> > > > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685
>
> > > >
> > > > Hi John,
> > > >
> > > >
> > > > I realized it's probably most helpful to you if I provided the
files
> > and
> > > > mask that I was using, so you can see what I was doing. Please
let me
> > > know
> > > > if you can see these files (I didn't include in this email the
WRF
> > output
> > > > file since it is large (>45 mb) but I can via another way if
you need
> > to
> > > > see it).
> > > >
> > > >
> > > > Jeremy
> > > >
> > > >
> > > >
> > > > ________________________________
> > > > From: Berman, Jeremy D
> > > > Sent: Friday, June 28, 2019 5:48:53 PM
> > > > To: met_help at ucar.edu
> > > > Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
> Point_Stat
> > > > with GDAS files after 2019061200
> > > >
> > > >
> > > > Hi John,
> > > >
> > > >
> > > > Thanks for your reply! The UPP information was very helpful,
and I've
> > > used
> > > > it successfully on my WRF runs.
> > > >
> > > >
> > > > I've encountered another minor issue, if you could provide
some
> > insight.
> > > > I've been trying to compare GFS forecasts vs my WRF forecasts
over
> the
> > > same
> > > > verification region, such as only over my WRF domain (e.g.,
extending
> > > over
> > > > the Pacific and California). For reference, my WRF is a 12km
single
> > > domain
> > > > run.
> > > >
> > > >
> > > > I've noticed that after doing the steps below, I am verifying
over my
> > > > intended region (the WRF domain), but I'm getting slightly
more
> matched
> > > > pairs for my WRF run than the GFS (54 vs 52 obs). Do you have
any
> idea
> > of
> > > > why this could be happening and how I could get the number of
matched
> > > > observations to be the same?
> > > >
> > > >
> > > > Here is my procedure:
> > > >
> > > >
> > > > I use the "Gen-Poly-Mask Tool" and the -type grid on a UPP'ed
WRF
> > > forecast
> > > > file to create a mask saved as a .nc file:
> > > >
> > > >
> > > > gen_vx_mask -type grid WRFPRS_d01_f001 WRFPRS_d01_f001
WRF_domain.nc
> > > >
> > > >
> > > > Then, I modified my pb2nc config file
("PB2NCConfig_PREPBUFR_GDAS")
> by
> > > > setting: poly = "WRF_domain.nc"
> > > >
> > > >
> > > > Then I run the "pb2nc" command on PREPBUFR_GDAS data to select
> > > > observations over my masked area:
> > > >
> > > >
> > > > pb2nc PREPBUFR_GDAS_f000.nr PREPBUFR_GDAS_f000_pb.nc
> > > > PB2NCConfig_PREPBUFR_GDAS -v 2
> > > >
> > > >
> > > > This creates a file called "PREPBUFR_GDAS_f000_pb.nc" which I
checked
> > > > contains observations over my WRF mask.
> > > >
> > > >
> > > > Then I created point_stat config files for both GFS and WRF
which
> have
> > > the
> > > > same default settings but have "grid = [ "FULL" ]; poly = [];"
and
> the
> > > > interpolation method for NEAREST and BILIN. In my
understanding, this
> > is
> > > > fine since the pb2nc should have only selected observations
within
> the
> > > > masked region.
> > > >
> > > >
> > > > I ran the point_stat command for both GFS and WRF forecast
data,
> which
> > > > worked fine, but I noticed the discrepancy in the total number
of
> > matched
> > > > pairs.
> > > >
> > > >
> > > > I also tried having the point_stat config files having " grid
= [];
> > poly
> > > =
> > > > ["WRF_domain.nc"] " , but the same observation discrepancy
occurred.
> > > >
> > > >
> > > > Am I maybe using the mask incorrectly? Or could it be that
since the
> > GFS
> > > > and WRF may have different horizontal resolutions, that there
could
> be
> > > less
> > > > matched pairs?
> > > >
> > > >
> > > > I should mention I also tried creating a mask.poly file with
just a
> > boxed
> > > > region (e.g., Northern California) and included poly =
"mask.poly" in
> > the
> > > > PB2NC Config File. After doing this I got the same number of
> > observations
> > > > for GFS and WRF. So I'm not sure why using the mask.poly
worked fine,
> > but
> > > > using the WRF domain as a mask did not.
> > > >
> > > >
> > > >
> > > > Thanks for your time and thoughts! If you need to see any
example
> > config
> > > > files or WRF output, let me know and I can provide that.
> > > >
> > > >
> > > > Thanks!
> > > >
> > > > Jeremy
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > ________________________________
> > > > From: John Halley Gotway via RT <met_help at ucar.edu>
> > > > Sent: Friday, June 21, 2019 12:17:19 PM
> > > > To: Berman, Jeremy D
> > > > Cc: harrold at ucar.edu; jwolff at ucar.edu
> > > > Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
> Point_Stat
> > > > with GDAS files after 2019061200
> > > >
> > > > Jeremy,
> > > >
> > > > Unfortunately, no, MET does not include logic for handling
WRF's
> hybrid
> > > > vertical coordinate and interpolating to pressure or height
levels.
> In
> > > > addition, when using raw WRFOUT files, MET does not handle
winds well
> > > since
> > > > they're defined on the staggered grid.
> > > >
> > > > For these two reasons, we recommend that users run their
WRFOUT files
> > > > through the Unified Post Processor.  It destaggers the winds
and
> > > > interpolates to the pressure level and height levels.  Here's
a link
> > for
> > > > info about UPP:
> > > > https://dtcenter.org/community-code/unified-post-processor-upp
> > > >
> > > > UPP writes GRIB1 or GRIB2 output files and the MET tools can
handle
> > those
> > > > well.
> > > >
> > > > Thanks,
> > > > John
> > > >
> > > > On Fri, Jun 21, 2019 at 12:28 AM Berman, Jeremy D via RT <
> > > > met_help at ucar.edu>
> > > > wrote:
> > > >
> > > > >
> > > > > <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
> > > > >
> > > > > Hi John,
> > > > >
> > > > >
> > > > > Thank you so much for your help! I was bogged down trying to
solve
> > this
> > > > > for a few days - I'm impressed you did it in two hours!
> > > > >
> > > > >
> > > > > I made the changes and it worked for a case I did for
2019061800. I
> > > > should
> > > > > have mentioned I was looking at 2-meter Temperature and 10-
meter
> Wind
> > > > > Speed, both of which use ADPSFC, which based on your
observation
> > counts
> > > > > explains why point_stat could not get any matched pairs.
> > > > >
> > > > >
> > > > > I have another question if you don't mind me asking: if I
want to
> > > compute
> > > > > verification for a different vertical level, such as 100-
meter Wind
> > > > Speed,
> > > > > can MET do that for a WRF forecast file, even if the
forecast file
> > does
> > > > not
> > > > > have a 100-meter level? Would MET be able to do a vertical
> > > interpolation
> > > > in
> > > > > order to assess for that level (or any vertical above ground
> level)?
> > > > >
> > > > >
> > > > > Additionally: if I wanted to do verification with point_stat
for an
> > > > entire
> > > > > vertical profile (let's say from the surface to 1000 meters
above
> > > ground
> > > > > level) could MET do that as well? I know in the example MET
> tutorial
> > > > there
> > > > > is a range of Temperature from 850-750hPa, but I was
wondering if
> > this
> > > > > could work for a range of vertical meters above ground?
> > > > >
> > > > >
> > > > > Thank you for all your help!
> > > > >
> > > > >
> > > > > Best,
> > > > >
> > > > > Jeremy
> > > > >
> > > > > ________________________________
> > > > > From: John Halley Gotway via RT <met_help at ucar.edu>
> > > > > Sent: Tuesday, June 18, 2019 6:50:34 PM
> > > > > To: Berman, Jeremy D
> > > > > Cc: harrold at ucar.edu; jwolff at ucar.edu
> > > > > Subject: Re: [rt.rap.ucar.edu #90685] Issue with using MET
> > Point_Stat
> > > > > with GDAS files after 2019061200
> > > > >
> > > > > Jeremy,
> > > > >
> > > > > I see you have a question about running the pb2nc and point_stat
> > > > > tools. That's interesting that you found that your verification logic
> > > > > broke on 2019061206.  To check the behavior, I retrieved a PREPBUFR
> > > > > file before and after that time:
> > > > >
> > > > > BEFORE: wget
> > > > > ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gdas.20190611/gdas.t12z.prepbufr.nr
> > > > > AFTER:  wget
> > > > > ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gdas.20190612/12/gdas.t12z.prepbufr.nr
> > > > >
> > > > > I ran them both through pb2nc, extracting ADPUPA and ADPSFC message
> > > > > types. pb2nc retained/derived 308,034 observations on the 11th but
> > > > > only 175,427 on the 12th.
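The message-type filtering described here is controlled in the PB2NC config file; a minimal fragment (standard MET config syntax, surrounding entries omitted) might be:

```
// PB2NCConfig: retain only upper-air (ADPUPA) and surface land (ADPSFC) reports
message_type = [ "ADPUPA", "ADPSFC" ];
```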
> > > > >
> > > > > One obvious difference I did note is that they've changed the
> > > > > directory structure on the ftp site.  On 20190612, they started
> > > > > organizing the data into subdirectories by hour (00, 06, 12, 18).  If
> > > > > you're running a script to pull prepbufr files from the FTP site,
> > > > > please make sure it's finding them in the right sub-directory.
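The old and new FTP layouts differ only in the extra hour subdirectory, so a download script can branch on the date. A sketch, with the cutover date 20190612 taken from the two URLs quoted in this message:

```shell
#!/bin/sh
# Build the GDAS prepbufr URL, accounting for the 20190612 layout change:
# the hour (00/06/12/18) became an extra subdirectory on that date.
ymd=20190612
hh=12
base="ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod"
if [ "$ymd" -ge 20190612 ]; then
  url="${base}/gdas.${ymd}/${hh}/gdas.t${hh}z.prepbufr.nr"   # new layout
else
  url="${base}/gdas.${ymd}/gdas.t${hh}z.prepbufr.nr"         # old layout
fi
echo "$url"
```

Passing the resulting URL to wget reproduces the AFTER command above.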
> > > > >
> > > > > Doing some more digging I found...
> > > > > The number of ADPUPA observations remains unchanged... 136,114 for
> > > > > both days.
> > > > > The number of ADPSFC (surface) observations is dramatically
> > > > > reduced... from 171,920 down to 34,247!
> > > > >
> > > > > So next I reran pb2nc but setting *quality_mark_thresh = 15;* to
> > > > > retain all observations regardless of the quality flag.
> > > > > And that results in 332,235 and 337,818 observations on the 11th and
> > > > > 12th, respectively.
> > > > > The ADPUPA obs are very similar: 150,057 vs 155,787.
> > > > > The ADPSFC obs are also similar: 182,178 vs 182,031.
> > > > >
> > > > > So the big difference is in the quality mark values.
> > > > > On the 11th... 182,178 observations = 171,920 with QM=2, 10 with
> > > > > QM=9, and 10,248 with other QM's.
> > > > > On the 12th... 182,031 observations = 34,247 with QM=2, 139,047 with
> > > > > QM=9, and 8,737 with other QM's.
> > > > >
> > > > > I'm guessing that with the GFS upgrade, they changed their GDAS
> > > > > assimilation logic back to setting the quality marker = 9 for surface
> > > > > obs to avoid assimilating them.
> > > > >
> > > > > So I'd recommend 2 things:
> > > > > (1) In your PB2NC config file, set *quality_mark_thresh = 9;*
> > > > > (2) In your Point-Stat config file, when verifying against surface
> > > > > obs, set *obs_quality = [0,1,2,9];* to use surface observations with
> > > > > any of these quality marks.
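Spelled out, the two recommended settings would look roughly like this in the respective config files (a sketch with surrounding entries omitted; the obs_quality values are written here as strings, which is how MET config files typically list quality marks):

```
// PB2NCConfig: keep observations with quality marks up to and including 9
quality_mark_thresh = 9;

// PointStatConfig: accept surface obs carrying any of these quality marks
obs_quality = [ "0", "1", "2", "9" ];
```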
> > > > >
> > > > > Hope that helps get you going.
> > > > >
> > > > > Thanks,
> > > > > John Halley Gotway
> > > > >
> > > > > On Tue, Jun 18, 2019 at 4:06 PM Randy Bullock via RT
> > > > > <met_help at ucar.edu> wrote:
> > > > >
> > > > > >
> > > > > > Tue Jun 18 16:05:37 2019: Request 90685 was acted upon.
> > > > > > Transaction: Given to johnhg (John Halley Gotway) by bullock
> > > > > >        Queue: met_help
> > > > > >      Subject: Issue with using MET Point_Stat with GDAS files
> > > > > > after 2019061200
> > > > > >        Owner: johnhg
> > > > > >   Requestors: jdberman at albany.edu
> > > > > >       Status: new
> > > > > >  Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=90685 >
> > > > > >
> > > > > >
> > > > > > This transaction appears to have no content
> > > > > >
> > > > >

------------------------------------------------


More information about the Met_help mailing list