[Met_help] [rt.rap.ucar.edu #63455] History for some small questions regarding the basic concepts about MET

John Halley Gotway via RT met_help at ucar.edu
Thu Oct 24 10:00:46 MDT 2013


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Dear Sir/Madam,

I am a new user of MET and I have several small questions to ask.

(1): In Table 4-3 of the manual, I am a little confused about the
Forecast and Observation Rate. According to the contingency table, is
forecast rate=(n11+n00)/T? And what about the observation rate? I don't
know what it should be compared to.

(2): My second question is about the grid/poly setting in the
PointStatConfig file. I know that poly is used to select a specific
area for verification. However, I don't know exactly what the grid
option is used for. Is it also used to choose the area for
verification? If so, how does it differ from poly?

(3): I am a little confused about what 'matched pair' means. I have
two possible understandings: (A) If I set temp>273 at Z2, both the
observation file and the forecast output have a valid value at Z2,
whether or not it is larger than 273; or (B) the values at Z2 from
both the observation and the forecast meet the temp>273 requirement?

(4): Another question is about the beg/end time setting. The
observation data I downloaded is DS 337.0 NCEP ADP Global Upper Air
and Surface Observations. Does a single file of this type only contain
valid observations for a specific time? For example, if I download the
data for 2006.07.18_18:00:00, does the file only contain observations
for that time? If so, when combining it with single-hour WRF output,
can I just set this option to 0 and 0? Could you tell me exactly what
beg/end are used for?

(5) My fifth question is about time series comparison. If I want to
plot MSE versus time at a specific observation point, what should I
do? And how do I extract the data for that specific point?

(6) My last question is about comparing two WRF outputs. I want to
know how to compare the results of two WRF runs at the same time and
location but with different physics schemes. Do I just put these two
results into the grid-to-grid comparison and set one WRF output as the
observation? Or is there another way to do this?

Thank you so much in advance for your time and help, I really appreciate it!

Best,

Jason


----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Re: [rt.rap.ucar.edu #63455] some small questions regarding the basic concepts about MET
From: John Halley Gotway
Time: Thu Oct 17 13:35:26 2013

Jason,

I've answered your questions inline below.

Thanks,
John Halley Gotway
met_help at ucar.edu

On 10/17/2013 08:54 AM, Xingcheng Lu via RT wrote:
>
> Thu Oct 17 08:54:28 2013: Request 63455 was acted upon.
> Transaction: Ticket created by xingchenglu2011 at u.northwestern.edu
>         Queue: met_help
>       Subject: some small questions regarding the basic concepts about MET
>         Owner: Nobody
>    Requestors: xingchenglu2011 at u.northwestern.edu
>        Status: new
>   Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
>
>
> Dear Sir/Madam,
>
> I am a new user of MET and I have several small questions to ask.
>
> (1): In Table 4-3 of the manual, I am a little confused about the
> Forecast and Observation Rate. According to the contingency table, is
> forecast rate=(n11+n00)/T? And what about the observation rate? I don't
> know what it should be compared to.

The forecast rate is just the fraction of grid points in the forecast
domain at which the event is occurring.  The observation rate is the
fraction of grid points in the observation domain at which
the event is occurring.  The observation rate is also known as the
base rate (BASER from the CTS line shown in Table 4-5).  The FHO and
CTC output lines really contain the same information.  At the
DTC, we prefer to use the counts from the CTC line, while NCEP prefers
to use the ratios of the counts given in the FHO line.

If your forecast were perfect, the F_RATE would be identical to the
O_RATE.
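
As a small worked example of the rates described above, here is a
sketch in Python. It assumes the usual labelling of the 2x2 table,
where n11 counts forecast-yes/observed-yes, n10 forecast-yes/observed-no,
n01 forecast-no/observed-yes, and n00 forecast-no/observed-no. The
function name and the counts are made up for illustration. Under that
labelling the forecast rate is (n11+n10)/T and the observation rate is
(n11+n01)/T, whereas (n11+n00)/T is the accuracy.

```python
def fho_rates(n11, n10, n01, n00):
    """Return (F_RATE, H_RATE, O_RATE) from 2x2 contingency table counts.

    Assumed labelling: first index = forecast event (1 yes, 0 no),
    second index = observed event (1 yes, 0 no).
    """
    total = n11 + n10 + n01 + n00
    if total == 0:
        raise ValueError("contingency table is empty")
    f_rate = (n11 + n10) / total  # fraction of pairs where the forecast event occurs
    h_rate = n11 / total          # fraction where forecast and observed event both occur
    o_rate = (n11 + n01) / total  # fraction where the observed event occurs (base rate)
    return f_rate, h_rate, o_rate


# Illustrative counts only, not taken from any real run.
f_rate, h_rate, o_rate = fho_rates(n11=50, n10=30, n01=20, n00=100)
print(f_rate, h_rate, o_rate)
```

The counts going in are what the CTC line reports, and the ratios
coming out are what the FHO line reports, which is why the two lines
carry the same information.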

>
> (2): My second question is about the grid/poly setting in the
> PointStatConfig file. I know that poly is used to select a specific
> area for verification. However, I don't know exactly what the grid
> option is used for. Is it also used to choose the area for
> verification? If so, how does it differ from poly?

Yes, the grid masking behaves the same way that the polyline masking
does.  It's just another way of specifying a geographic subset of the
domain.  It's generally much less useful than the polyline
masking since only the pre-defined NCEP grids are supported, and most
users don't find that very helpful.

>
> (3): I am a little confused about what 'matched pair' means. I have
> two possible understandings: (A) If I set temp>273 at Z2, both the
> observation file and the forecast output have a valid value at Z2,
> whether or not it is larger than 273; or (B) the values at Z2 from
> both the observation and the forecast meet the temp>273 requirement?

A "matched pair" just means a pair of forecast and observation values
that go together.  Suppose you have 200 point observations of 2-meter
temperature that fall in your domain.  For each observation
value, Point-Stat computes an interpolated forecast value for that
observation location.  So you now have 200 pairs of forecast and
observation values.  Using those 200 matched pairs, you could define
continuous statistics directly (CNT output line).  Or you could choose
a threshold (like >273) and define a 2x2 contingency table.  With that
2x2 contingency table, you can dump out the counts in the
CTC line and/or the corresponding statistics in the CTS line.
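
To make the distinction concrete, here is a rough Python sketch of
that flow. Every forecast/observation pair counts as a matched pair
whether or not it exceeds the threshold (understanding A above); the
threshold is only applied afterwards to sort the pairs into the 2x2
table. The function names and the four sample pairs are hypothetical,
and this is not how Point-Stat is implemented internally.

```python
def contingency_counts(pairs, threshold):
    """Sort (forecast, observation) pairs into a 2x2 table for 'value > threshold'."""
    n11 = n10 = n01 = n00 = 0
    for fcst, obs in pairs:
        f_event = fcst > threshold
        o_event = obs > threshold
        if f_event and o_event:
            n11 += 1  # forecast yes, observed yes
        elif f_event:
            n10 += 1  # forecast yes, observed no
        elif o_event:
            n01 += 1  # forecast no, observed yes
        else:
            n00 += 1  # forecast no, observed no
    return n11, n10, n01, n00


def mean_squared_error(pairs):
    """A continuous statistic computed directly from the pairs, no threshold needed."""
    return sum((fcst - obs) ** 2 for fcst, obs in pairs) / len(pairs)


# Hypothetical 2-m temperature pairs in Kelvin: (forecast, observation).
pairs = [(274.0, 275.0), (272.0, 274.0), (275.0, 272.0), (270.0, 271.0)]
counts = contingency_counts(pairs, 273.0)
mse = mean_squared_error(pairs)
print(counts, mse)
```

All four pairs contribute to both results, even though only one of
them has both values above 273.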

I'm not sure if that answers your question.

>
> (4): Another question is about the beg/end time setting. The
> observation data I downloaded is DS 337.0 NCEP ADP Global Upper Air
> and Surface Observations. Does a single file of this type only contain
> valid observations for a specific time? For example, if I download the
> data for 2006.07.18_18:00:00, does the file only contain observations
> for that time? If so, when combining it with single-hour WRF output,
> can I just set this option to 0 and 0? Could you tell me exactly what
> beg/end are used for?
>

There are 4 GDAS PREPBUFR files per day - 00Z, 06Z, 12Z, and 18Z.
Each file contains 6 hours' worth of observations, 3 hours +/- the time
indicated in the file name.  So the 12Z file contains
observations from 09Z to 15Z.  When you run Point-Stat, you need to
pass it one or more observation files that contain the observations
you'd like to use to evaluate the forecast.  The time window you
set in the Point-Stat configuration file is set relative to the
forecast valid time.  Suppose your forecast is valid at 06Z and you've
set beg = -3600 and end = 3600 (in seconds).  So that's +/- 1
hour around your forecast valid time.  When Point-Stat sifts through
the observations, it'll only use the ones whose valid times fall
between 05Z and 07Z.  It'll throw all the others out.  It's up to
you to decide how close in time your observations need to be to your
forecast time.  If you set beg = 0 and end = 0, only those
observations with exactly the same time as the forecast will be used.
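
The window logic can be sketched in a few lines of Python. This
assumes the window is inclusive at both ends, which is an assumption
of the sketch rather than something stated above; the function name
and the observation times are made up for illustration.

```python
from datetime import datetime, timedelta


def in_time_window(obs_time, fcst_valid, beg, end):
    """True if obs_time lies in [fcst_valid + beg, fcst_valid + end], beg/end in seconds."""
    window_start = fcst_valid + timedelta(seconds=beg)
    window_end = fcst_valid + timedelta(seconds=end)
    return window_start <= obs_time <= window_end


fcst_valid = datetime(2006, 7, 18, 6, 0)  # forecast valid at 06Z
obs_times = [
    datetime(2006, 7, 18, hour, minute)
    for hour, minute in [(4, 30), (5, 0), (5, 45), (6, 0), (7, 0), (7, 15)]
]

# beg = -3600, end = 3600 keeps observations from 05Z through 07Z.
kept = [t for t in obs_times if in_time_window(t, fcst_valid, -3600, 3600)]

# beg = 0, end = 0 keeps only observations at exactly 06Z.
exact = [t for t in obs_times if in_time_window(t, fcst_valid, 0, 0)]

print(len(kept), len(exact))
```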

> (5) My fifth question is about time series comparison. If I want to
> plot MSE versus time at a specific observation point, what should I
> do? And how do I extract the data for that specific point?
>

When you run Point-Stat, you can output the individual matched pair
values by turning on the MPR output line.  The individual
forecast-observation pairs are contained in that MPR line.  Since
Point-Stat is run once for each valid time, the time-series of an
individual point is scattered across many different output files.  If
you'd like, you could use the STAT-Analysis tool to filter out the
MPR lines that correspond to a single station id.  For example, here's
how you might filter out the matched pairs for 2-m temperature for a
station named KDEN:

   stat_analysis -job filter -lookin point_stat/out -line_type MPR \
     -fcst_var TMP -fcst_lev Z2 -column_str OBS_SID KDEN \
     -dump_row TMP_Z2_KDEN_MPR.txt

That will read all of the files ending in ".stat" from the directory
named "point_stat/out", pick out the MPR lines, only with "TMP" in the
FCST_VAR column, only with "Z2" in the FCST_LEV column, only
with "KDEN" in the OBS_SID column, and write the output to a file
named "TMP_Z2_KDEN_MPR.txt".  There are filtering switches for each of
the 21 header columns common to each line type.  And you can
use the -column_min, -column_max, -column_eq, and -column_str options
to filter by the data columns (as we've done here for the OBS_SID
column).

Once you have the data stored this way, it's up to you to make the
time series plot with whatever software you'd like.
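
As one possible starting point, here is a Python sketch that turns the
filtered pairs into an MSE-versus-time series. It assumes the dump
file is whitespace-delimited and begins with a header row naming the
columns, including FCST_VALID_BEG, FCST, and OBS; please check that
against your own output. The sample rows are invented and show only
four columns, where a real MPR line has many more.

```python
def squared_errors_by_time(lines):
    """Map each FCST_VALID_BEG value to the squared errors of its matched pairs."""
    rows = [line.split() for line in lines if line.strip()]
    header, data = rows[0], rows[1:]
    # Look columns up by name so the extra columns in a real file do not matter.
    i_time = header.index("FCST_VALID_BEG")
    i_fcst = header.index("FCST")
    i_obs = header.index("OBS")
    by_time = {}
    for row in data:
        error = float(row[i_fcst]) - float(row[i_obs])
        by_time.setdefault(row[i_time], []).append(error ** 2)
    return by_time


def mse_series(lines):
    """Return [(valid_time, mse), ...] sorted by valid time."""
    by_time = squared_errors_by_time(lines)
    return [(t, sum(v) / len(v)) for t, v in sorted(by_time.items())]


# Invented sample standing in for the contents of TMP_Z2_KDEN_MPR.txt.
sample = [
    "FCST_VALID_BEG  OBS_SID FCST  OBS",
    "20060718_060000 KDEN    290.0 291.0",
    "20060718_070000 KDEN    292.0 289.0",
]
series = mse_series(sample)
print(series)
```

With a real file, the same function can be fed an open file handle,
for example "with open('TMP_Z2_KDEN_MPR.txt') as f: series = mse_series(f)".
For a single station with one observation per valid time, the MSE at
each time is just that one squared error.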

> (6) My last question is about comparing two WRF outputs. I want to
> know how to compare the results of two WRF runs at the same time and
> location but with different physics schemes. Do I just put these two
> results into the grid-to-grid comparison and set one WRF output as the
> observation? Or is there another way to do this?
>

Sure, you can easily treat one of them as the "forecast" and the other
as the "observation".  That is not strictly verification - more of
just a comparison.  Hopefully, the output from MET will help
you quantify the differences.  You should just think carefully about
it when trying to make sense of the output.

> Thank you so much in advance for your time and help, I really
> appreciate it!
>
> Best,
>
> Jason
>

------------------------------------------------
Subject: some small questions regarding the basic concepts about MET
From: Xingcheng Lu
Time: Fri Oct 18 09:14:54 2013

Hi John,

Thank you so much for your detailed reply. I have learned a lot from
your answers.

Based on your answers, I still have 3 questions:

For questions 3 and 4, can I say that if there are 200 observation
points within my domain and I set beg=0/end=0, then there are 200
matched pairs? However, when I set beg=3600/end=3600, if each
observation point has 3 values during this time period, then there
should be 600 matched pairs. Is my understanding correct?

For question 5, if I want to do the time series comparison for 12
hours at the same observation point (with single-hour WRF output and
beg/end=0), do I just need to rerun MET 12 times, once for each hour?

For question 5, "stat_analysis -job filter -lookin point_stat/out
-line_type MPR -fcst_var TMP -fcst_lev Z2 -column_str OBS_SID KDEN
-dump_row TMP_Z2_KDEN_MPR.txt": can I type this on the Linux command
line, or do I need to set it in the config file for STAT-Analysis? I
found that this line is a bit similar to the jobs setting in the
config file, but I cannot find the format you typed in the manual. Is
there any resource that further describes this command line format?

By the way, do I have to use grids with the same resolution if I want
to do the grid-to-grid comparison? Also, because my research focus is
on the global scale, do you know whether any daily gridded observation
data are available at the global scale?

Thanks again for your kind help!

Sincerely,

Jason







------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #63455] some small questions regarding the basic concepts about MET
From: John Halley Gotway
Time: Fri Oct 18 11:01:30 2013

Jason,

Answers are inline.

Thanks,
John

On 10/18/2013 09:14 AM, Xingcheng Lu via RT wrote:
>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
>
> Hi John,
>
> Thank you so much for your detailed reply. I have learned a lot from
> your answers.
>
> Based on your answers, I still have 3 questions:
>
> For questions 3 and 4, can I say that if there are 200 observation
> points within my domain and I set beg=0/end=0, then there are 200
> matched pairs? However, when I set beg=3600/end=3600, if each
> observation point has 3 values during this time period, then there
> should be 600 matched pairs. Is my understanding correct?
>

Yes, your understanding is correct.  If multiple observations occur at
the same location during the time window, by default, they will all be
used.  *HOWEVER*, we realize that this behavior is not
always desirable, so there's an option in the configuration file to
control this logic.  Please take a look in the file
"METv4.1/data/config/README" for a description of the "duplicate_flag"
option.  Setting its value to "SINGLE" will cause only a single
observation value for each location to be used.  The one whose valid
time is closest to the forecast valid time is chosen.
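
The effect of keeping a single observation per location can be
sketched in Python as follows. This is only an illustration of the
"closest in time" idea: it keys on station ID and uses times in
seconds since 00Z, both of which are simplifying assumptions, and it
does not reproduce how MET itself decides which observations count as
duplicates or how it breaks ties.

```python
def keep_closest_per_station(observations, fcst_valid):
    """Keep, per station, the observation whose time is nearest fcst_valid.

    observations: iterable of (station_id, obs_time_seconds, value).
    Returns {station_id: (obs_time_seconds, value)}.
    On a tie, the observation seen first is kept.
    """
    best = {}
    for sid, obs_time, value in observations:
        current = best.get(sid)
        if current is None or abs(obs_time - fcst_valid) < abs(current[0] - fcst_valid):
            best[sid] = (obs_time, value)
    return best


fcst_valid = 6 * 3600  # forecast valid at 06Z, in seconds since 00Z

# Hypothetical observations inside a +/- 1 hour window.
observations = [
    ("KDEN", 5 * 3600, 290.0),        # 05:00Z
    ("KDEN", 6 * 3600, 291.0),        # 06:00Z
    ("KDEN", 7 * 3600, 292.5),        # 07:00Z
    ("KBOU", 5 * 3600 + 1800, 288.5), # 05:30Z
    ("KBOU", 6 * 3600 + 3000, 289.0), # 06:50Z
]

best = keep_closest_per_station(observations, fcst_valid)
print(best)
```

Five observations go in and two come out, one per station, so the
number of matched pairs drops back to the number of locations.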

> For question 5, if I want to do the time series comparison for 12
> hours at the same observation point (with single-hour WRF output and
> beg/end=0), do I just need to rerun MET 12 times, once for each hour?

The Point-Stat tool is intended to be run once for each valid time
being evaluated.  So if you're verifying 12 different times, you'd run
the tool 12 times.  Of course, we don't intend that people run
this manually themselves on the command line.  Instead, we expect that
you'd run it via a script of some sort that loops through the times
you'd like to evaluate.

Often MET users are interested in the performance of their model at
more than just a single location and for more than a single
variable/level.  So there's more "work to do" than just computing a
single matched pair value and writing it out.

Here are a few pieces of info you may find helpful...
- The MET configuration files support the use of environment
variables.  I often find it helpful when scripting up calls to the MET
tools to set some environment variable values and then reference
them in the configuration files I pass to the tools.  That will enable
you to control the behavior of the tools without having to maintain
many different versions of the config files.
- If you happen to have GRIB files that contain data for multiple
forecast hours (all 12 of your output times for example), you could
actually call Point-Stat once and evaluate them all at once.  But
the configuration files get a lot messier - in the "fcst.field"
setting you'd need to explicitly specify the valid time of the data to
be evaluated so that Point-Stat knows which fields to use.  I
typically find calling Point-Stat once per valid time is easier.
- If you're interested in the performance of your model at a specific
set of station IDs, consider using the "mask.sid" option.  Rather
than defining your verification area spatially (as the "grid"
and "poly" options do), the station id (sid) option is just a list of
the stations over which you'd like to compute statistics.
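
A looping script of the kind described above might look roughly like
this Python sketch. It only builds the commands and prints them, so
nothing is run. The file names, the directory layout, and the
VALID_TIME environment variable are all hypothetical; the argument
order follows the Point-Stat usage of forecast file, observation file,
config file, then optional -outdir.

```python
from datetime import datetime, timedelta


def build_point_stat_runs(first_valid, n_hours, fcst_pattern,
                          obs_file, config_file, out_dir):
    """Return one (env, argv) tuple per hourly valid time, without running anything."""
    runs = []
    for hour in range(n_hours):
        valid = first_valid + timedelta(hours=hour)
        # Hypothetical variable that the config file could reference.
        env = {"VALID_TIME": valid.strftime("%Y%m%d_%H%M%S")}
        fcst_file = valid.strftime(fcst_pattern)
        argv = ["point_stat", fcst_file, obs_file, config_file,
                "-outdir", out_dir]
        runs.append((env, argv))
    return runs


runs = build_point_stat_runs(
    first_valid=datetime(2006, 7, 18, 6),
    n_hours=12,
    fcst_pattern="wrf/wrfprs_%Y%m%d_%H.grb",   # made-up naming scheme
    obs_file="obs/prepbufr_20060718.nc",       # made-up observation file
    config_file="PointStatConfig_custom",      # made-up config name
    out_dir="point_stat/out",
)

for env, argv in runs:
    print(env["VALID_TIME"], " ".join(argv))
    # To actually run each call (after "import os, subprocess"):
    # subprocess.run(argv, env={**os.environ, **env}, check=True)
```

Keeping the per-run settings in environment variables means a single
config file can serve all 12 calls.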

>
> For question 5, "stat_analysis -job filter -lookin point_stat/out
> -line_type MPR -fcst_var TMP -fcst_lev Z2 -column_str OBS_SID KDEN
> -dump_row TMP_Z2_KDEN_MPR.txt": can I type this on the Linux command
> line, or do I need to set it in the config file for STAT-Analysis? I
> found that this line is a bit similar to the jobs setting in the
> config file, but I cannot find the format you typed in the manual. Is
> there any resource that further describes this command line format?

The STAT-Analysis tool can be run with or without a config file.  If
you provide a config file, you have to define the job(s) to be run in it.
If not, you define a *SINGLE* job to be run on the command
line.  Usually, I run single jobs on the command line until I figure
out the set of analysis jobs I'd like to run via a script.  Then I
move them into a config file and run them all with one call to
STAT-Analysis.  Look in the file "METv4.1/data/config/README" for the
section on "Job command FILTERING".

>
> By the way, do I have to use grids with the same resolution if I want
> to do the grid-to-grid comparison? Also, because my research focus is
> on the global scale, do you know whether any daily gridded observation
> data are available at the global scale?

Yes, for the grid-to-grid comparisons performed by the Grid-Stat,
MODE, Series-Analysis, and Wavelet-Stat tools, it's the user's
responsibility to put their forecast and observation data on the same
grid.  In future versions of MET, we'd like to add tools to help users
do this.  But currently there isn't any support for this directly in
MET.  But for GRIB1 data, the copygb utility can be used to
regrid things.  Here's a portion of the MET online tutorial that
discusses this:
    http://www.dtcenter.org/met/users/support/online_tutorial/METv4.1/copygb/index.php

Availability of appropriate observations is always an issue.  And I
don't have a magic bullet for you.  You could always compare your
model output to a model analysis - like the global GFS analysis.
But that'll just tell you how well your model output matches GFS,
which has its own set of errors.

>
> Thanks again for your kind help!
>
> Sincerely,
>
> Jason

------------------------------------------------
Subject: some small questions regarding the basic concepts about MET
From: Xingcheng Lu
Time: Mon Oct 21 07:44:03 2013

Hi John,

Thank you for your answers. You said that I could write a script to
re-run the tools for a time series. What type of script can I write? Do
you mean a shell script? If you have a sample script, it would be very
helpful to me. Thanks again for your help!

cheers,

Jason


2013/10/19 John Halley Gotway via RT <met_help at ucar.edu>

> Jason,
>
> Answers are inline.
>
> Thanks,
> John
>
> On 10/18/2013 09:14 AM, Xingcheng Lu via RT wrote:
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
> >
> > Hi John,
> >
> > Thank you so much for your detail reply and I have learned a lot
from
> your
> > answers.
> >
> > According to your answers, I still have 3 questions:
> >
> > For the question 3 and 4, can I say, if there are 200 observation
points
> > within my domain, and I set the time beg=0/end=0, then the match
pair is
> > 200. However, when I set beg=3600/end=3600, if each observation
point
> has 3
> > values during this time period, then the matching pair should be
600. Is
> my
> > understanding correct?
> >
>
> Yes, your understanding is correct.  If multiple observations occur at
> the same location during the time window, by default, they will all be
> used.  *HOWEVER*, we realize that this behavior is not always
> desirable, so there's an option in the configuration file to control
> this logic.  Please take a look in the file
> "METv4.1/data/config/README" for a description of the "duplicate_flag"
> option.  Setting its value to "SINGLE" will cause only a single
> observation value for each location to be used.  The one whose valid
> time is closest to the forecast valid time is chosen.
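
To make the "closest in time" rule concrete, here is a minimal shell sketch of the selection idea. It is not MET's implementation; the station IDs, time offsets (seconds relative to the forecast valid time), and temperatures are made-up sample values.

```shell
#!/bin/sh
# Made-up observations: station ID, offset from the forecast valid time
# in seconds, and observed value.
OBS="KDEN -1800 271.9
KDEN 0 273.4
KDEN 1800 274.8
KBOU -3600 270.2
KBOU 600 270.9"

# For each station, keep only the observation with the smallest absolute
# time offset, which mimics the idea behind duplicate_flag = SINGLE.
CLOSEST=$(printf '%s\n' "${OBS}" | awk '
  { d = ($2 < 0) ? -$2 : $2
    if (!($1 in best) || d < best[$1]) { best[$1] = d; line[$1] = $0 } }
  END { for (s in line) print line[s] }' | sort)

printf '%s\n' "${CLOSEST}"
```

With all five observations kept there would be five matched pairs; after the selection only one per station remains.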
>
> > For the question 5, if I want to do the time series comparison for
12
> > hours at the same observation spot(with single hour wrf output and
set
> > beg/end=0), whether I just need to rerun the MET for 12 times for
> > single different hour?
>
> The Point-Stat tool is intended to be run once for each valid time
> being evaluated.  So if you're verifying 12 different times, you'd run
> the tool 12 times.  Of course, we don't intend that people run this
> manually on the command line.  Instead, we expect that you'd run it
> via a script of some sort that loops through the times you'd like to
> evaluate.
>
> Often MET users are interested in the performance of their model at
> more than just a single location and for more than a single
> variable/level.  So there's more "work to do" than just computing a
> single matched pair value and writing it out.
>
> Here are a few pieces of info you may find helpful...
> - The MET configuration files support the use of environment
>   variables.  I often find it helpful when scripting up calls to the
>   MET tools to set some environment variable values and then reference
>   them in the configuration files I pass to the tools.  That will
>   enable you to control the behavior of the tools without having to
>   maintain many different versions of the config files.
> - If you happen to have GRIB files that contain data for multiple
>   forecast hours (all 12 of your output times for example), you could
>   actually call Point-Stat once and evaluate them all at once.  But
>   the configuration files get a lot messier - in the "fcst.field"
>   setting you'd need to explicitly specify the valid time of the data
>   to be evaluated so that Point-Stat knows which fields to use.  I
>   typically find calling Point-Stat once per valid time is easier.
> - If you're interested in the performance of your model at a specific
>   set of station IDs, consider using the "mask.sid" option.  Rather
>   than defining your verification area spatially (as the "grid" and
>   "poly" options do), the station ID (sid) option is just a list of
>   the stations over which you'd like to compute statistics.
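
As a rough illustration of the environment-variable approach, the sketch below exports two variables and assembles a Point-Stat call as a dry run. The variable names, file paths, and the config file name PointStatConfig_env are hypothetical; the config file is assumed to reference the variables as ${FCST_VAR} and ${FCST_LEV}.

```shell
#!/bin/sh
# Hypothetical variables that a config file could reference as
# ${FCST_VAR} and ${FCST_LEV} instead of hard-coding the values.
export FCST_VAR="TMP"
export FCST_LEV="Z2"

# Hypothetical input files and config file name.
FCST_FILE="/sample/fcst/d01_2013030100_00600.grib"
OBS_FILE="/sample/obs/obs_2013030106.nc"
CONFIG_FILE="PointStatConfig_env"

# Point-Stat takes the forecast file, observation file, and config file.
CMD="point_stat ${FCST_FILE} ${OBS_FILE} ${CONFIG_FILE} -outdir point_stat/out"

# Dry run: print the command rather than executing it.
echo "Would run: ${CMD}"
```

Changing the exported values between calls lets one config file serve several variables and levels.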
>
> >
> > For the question 5, "stat_analysis -job filter -lookin
point_stat/out
> > -line_type MPR -fcst_var TMP -fcst_lev Z2 -column_str OBS_SID KDEN
> > -dump_row TMP_Z2_KDEN_MPR.txt"  Can I type this on the Linux
command line
> > or I need to set this in the config file for stat analysis?
Because I
> found
> > that this line is a little bit similar to the jobs setting in the
config
> > file, but I cannot find the format you type in the manual. Whether
there
> is
> > any resource to further introduce this command line format?
>
> The STAT-Analysis tool can be run with or without a config file.  If
> you provide a config file, you have to define the job(s) to be run in
> it.  If not, you define a *SINGLE* job to be run on the command line.
> Usually, I run single jobs on the command line until I figure out the
> set of analysis jobs I'd like to run via a script.  Then I move them
> into a config file and run them all with one call to STAT-Analysis.
> Look in the file "METv4.1/data/config/README" for the section on "Job
> command FILTERING".
>
> >
> > By the way, do I have to use the grids with the same resolution if
I want
> > to do the grid-grid comparison? Also, because my research focus is
on
> > global scale, do you know whether there is any daily grid
observation
> data
> > for the global scale?
>
> Yes, for the grid-to-grid comparisons performed by the Grid-Stat,
> MODE, Series-Analysis, and Wavelet-Stat tools, it's the user's
> responsibility to put their forecast and observation data on the same
> grid.  In future versions of MET, we'd like to add tools to help users
> do this, but currently there isn't any support for this directly in
> MET.  For GRIB1 data, the copygb utility can be used to regrid things.
> Here's a portion of the MET online tutorial that discusses this:
>
> http://www.dtcenter.org/met/users/support/online_tutorial/METv4.1/copygb/index.php
>
> Availability of appropriate observations is always an issue, and I
> don't have a magic bullet for you.  You could always compare your
> model output to a model analysis - like the global GFS analysis.  But
> that'll just tell you how well your model output matches GFS, which
> has its own set of errors.
>
> >
> > Thanks again for your kind help!
> >
> > Sincerely,
> >
> > Jason
> >
> >
> >
> >
> >
> >
> > 2013/10/18 John Halley Gotway via RT <met_help at ucar.edu>
> >
> >> Jason,
> >>
> >> I've answered your questions inline below.
> >>
> >> Thanks,
> >> John Halley Gotway
> >> met_help at ucar.edu
> >>
> >> On 10/17/2013 08:54 AM, Xingcheng Lu via RT wrote:
> >>>
> >>> Thu Oct 17 08:54:28 2013: Request 63455 was acted upon.
> >>> Transaction: Ticket created by
xingchenglu2011 at u.northwestern.edu
> >>>          Queue: met_help
> >>>        Subject: some small questions regarding the basic
concepts about
> >> MET
> >>>          Owner: Nobody
> >>>     Requestors: xingchenglu2011 at u.northwestern.edu
> >>>         Status: new
> >>>    Ticket <URL:
> https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
> >>>
> >>>
> >>> Dear Sir/Madam,
> >>>
> >>> I am a new user of MET and I have several small questions want
to ask
> >> about.
> >>>
> >>> (1): In the manual at Table 4-3, I am a little bit confused
about
> >> Forecast
> >>> and Observation Rate. According to the contingency table,
whether
> >>> forecast rate=(n11+n00)/T? And how about observation rate, I
don't know
> >>> what it should be compared to.
> >>
> >> The forecast rate is just the fraction of grid points in the
> >> forecast domain at which the event is occurring.  The observation
> >> rate is the fraction of grid points in the observation domain at
> >> which the event is occurring.  The observation rate is also known as
> >> the base rate (BASER from the CTS line shown in Table 4-5).  The FHO
> >> and CTC output lines really contain the same information.  At the
> >> DTC, we prefer to use the counts from the CTC line, while NCEP
> >> prefers to use the ratios of the counts given in the FHO line.
> >>
> >> If your forecast were perfect, the F_RATE would be identical to the
> >> O_RATE.
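
In terms of the 2x2 contingency table counts, the forecast rate is (n11 + n10)/T and the observation rate is (n11 + n01)/T; the quantity (n11 + n00)/T from the question is the accuracy instead. The sketch below works through that arithmetic using the CTC column names; the counts themselves are made-up sample values.

```shell
#!/bin/sh
# Made-up contingency table counts, named after the CTC columns:
# FY_OY = n11, FY_ON = n10, FN_OY = n01, FN_ON = n00.
FY_OY=30
FY_ON=10
FN_OY=20
FN_ON=140
TOTAL=$((FY_OY + FY_ON + FN_OY + FN_ON))

# Forecast rate: fraction of pairs where the event was forecast.
F_RATE=$(awk -v a="${FY_OY}" -v b="${FY_ON}" -v t="${TOTAL}" \
  'BEGIN { printf "%.3f", (a + b) / t }')

# Observation (base) rate: fraction of pairs where the event was observed.
O_RATE=$(awk -v a="${FY_OY}" -v c="${FN_OY}" -v t="${TOTAL}" \
  'BEGIN { printf "%.3f", (a + c) / t }')

echo "TOTAL=${TOTAL} F_RATE=${F_RATE} O_RATE=${O_RATE}"
```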
> >>
> >>>
> >>> (2):My second question is about the grid/poly setting in the
> >>> Pointstatconfig file. I know that the poly is used to select the
> specific
> >>> area for verification, however, I don't know what exactly is the
grid
> >>> option used for, is it also used to choose the place for
verification?
> If
> >>> yes, what's the difference between the poly?
> >>
> >> Yes, the grid masking behaves the same way that the polyline masking
> >> does.  It's just another way of specifying a geographic subset of
> >> the domain.  It's generally much less useful than the polyline
> >> masking since only the pre-defined NCEP grids are supported, and
> >> most users don't find that very helpful.
> >>
> >>>
> >>> (3): I am a little bit confused about the 'match pair' mean. I
have two
> >>> understandings:(A) If I set temp>273 at Z2, both the observation
file
> and
> >>> forecast output have valid value at Z2, no matter they are
larger than
> >> 273
> >>> or not;(B) Or it means that the value at Z2 from both
observation and
> >>> forecast meet the temp>273 requirement?
> >>
> >> A "matched pair" just means a pair of forecast and observation
> >> values that go together.  Suppose you have 200 point observations of
> >> 2-meter temperature that fall in your domain.  For each observation
> >> value, Point-Stat computes an interpolated forecast value for that
> >> observation location.  So you now have 200 pairs of forecast and
> >> observation values.  Using those 200 matched pairs, you could
> >> compute continuous statistics directly (CNT output line).  Or you
> >> could choose a threshold (like >273) and define a 2x2 contingency
> >> table.  With that 2x2 contingency table, you can dump out the counts
> >> in the CTC line and/or the corresponding statistics in the CTS line.
> >>
> >> I'm not sure if that answers your question.
> >>
> >>>
> >>> (4): My another question is about the beg/end time setting. The
> >> observation
> >>> data I downloaded was DS 337.0 NCEP ADP Global Upper Air and
Surface
> >>> Observations. Is the single file of this type of data only have
valid
> >>> observation for a specific time spot, like if I download the
data for
> >>> 2006.07.18_18:00:00, the file only contains the observation data
for
> that
> >>> time spot? If so, when combine with wrf single hour output, can
I just
> >> set
> >>> this option to 0 and 0? Could you tell me what exact the beg/end
used
> >> for?
> >>>
> >>
> >> There are 4 GDAS PREPBUFR files per day - 00Z, 06Z, 12Z, and 18Z.
> >> Each file contains 6 hours' worth of observations, 3 hours +/- the
> >> time indicated in the file name.  So the 12Z file contains
> >> observations from 09Z to 15Z.  When you run Point-Stat, you need to
> >> pass it one or more observation files that contain the observations
> >> you'd like to use to evaluate the forecast.  The time window you set
> >> in the Point-Stat configuration file is relative to the forecast
> >> valid time.  Suppose your forecast is valid at 06Z and you've set
> >> beg = -3600 and end = 3600 (in seconds).  That's +/- 1 hour around
> >> your forecast valid time.  When Point-Stat sifts through the
> >> observations, it'll only use those whose valid times fall between
> >> 05Z and 07Z.  It'll throw all the others out.  It's up to you to
> >> decide how close in time your observations need to be to your
> >> forecast time.  If you set beg = 0 and end = 0, only those
> >> observations with exactly the same time as the forecast will be used.
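
The window arithmetic can be checked with a few lines of shell. This is only a sketch of the calculation, not of Point-Stat itself; it assumes GNU date, and the valid time and offsets are the example values from the paragraph above.

```shell
#!/bin/sh
# Example values: forecast valid at 06Z with a +/- 1 hour window.
VALID_STR="2013-03-01 06:00:00"
BEG_SEC=-3600
END_SEC=3600

# Convert the valid time to unix time (GNU date assumed).
VALID_UT=$(date -ud "${VALID_STR}" +%s)

# The window bounds are offsets relative to the forecast valid time.
WIN_BEG_UT=$((VALID_UT + BEG_SEC))
WIN_END_UT=$((VALID_UT + END_SEC))

WIN_BEG_STR=$(date -ud "@${WIN_BEG_UT}" +%Y%m%d_%H%M%S)
WIN_END_STR=$(date -ud "@${WIN_END_UT}" +%Y%m%d_%H%M%S)

echo "Observations used: ${WIN_BEG_STR} to ${WIN_END_STR}"
```

Setting both offsets to 0 collapses the window to the valid time itself.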
> >>
> >>> (5) My 5th question is about the time series comparison. If I
want to
> >> draw
> >>> the plot of MSE versus time at a specific observation point,
what
> should
> >> I
> >>> do? And how to take the data of the specific point out?
> >>>
> >>
> >> When you run Point-Stat, you can output the individual matched pair
> >> values by turning on the MPR output line.  The individual
> >> forecast-observation pairs are contained in that MPR line.  Since
> >> Point-Stat is run once for each valid time, the time series of an
> >> individual point is scattered across many different output files.
> >> If you'd like, you could use the STAT-Analysis tool to filter out
> >> the MPR lines that correspond to a single station ID.  For example,
> >> here's how you might filter out the matched pairs for 2-m
> >> temperature for a station named KDEN:
> >>
> >>     stat_analysis -job filter -lookin point_stat/out -line_type MPR \
> >>       -fcst_var TMP -fcst_lev Z2 -column_str OBS_SID KDEN \
> >>       -dump_row TMP_Z2_KDEN_MPR.txt
> >>
> >> That will read all of the files ending in ".stat" from the directory
> >> named "point_stat/out", pick out the MPR lines, only with "TMP" in
> >> the FCST_VAR column, only with "Z2" in the FCST_LEV column, only
> >> with "KDEN" in the OBS_SID column, and write the output to a file
> >> named "TMP_Z2_KDEN_MPR.txt".  There are filtering switches for each
> >> of the 21 header columns common to each line type.  And you can use
> >> the -column_min, -column_max, -column_eq, and -column_str options to
> >> filter by the data columns (as we've done here for the OBS_SID
> >> column).
> >>
> >> Once you have the data stored this way, it's up to you to make the
> >> time series plot with whatever software you'd like.
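
One possible way to turn the filtered matched pairs into a squared-error time series is sketched below. It assumes the dump file begins with a header row naming the columns, and it uses a made-up, heavily simplified four-column stand-in for the real MPR layout; real files have many more columns, which is why the columns are located by name rather than by position.

```shell
#!/bin/sh
# Made-up, simplified stand-in for a -dump_row file (real MPR lines
# contain many more columns).
MPR_FILE=$(mktemp)
cat > "${MPR_FILE}" <<'EOF'
FCST_VALID_BEG  OBS_SID FCST  OBS
20130301_000000 KDEN    275.0 274.0
20130301_060000 KDEN    280.0 282.0
20130301_120000 KDEN    285.5 285.0
EOF

# Locate columns by header name, then print the valid time and the
# squared error (FCST - OBS)^2 for each matched pair.
SQ_ERR=$(awk '
  NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
  { err = $(col["FCST"]) - $(col["OBS"])
    printf "%s %.2f\n", $(col["FCST_VALID_BEG"]), err * err }
' "${MPR_FILE}")

printf '%s\n' "${SQ_ERR}"
rm -f "${MPR_FILE}"
```

The two-column output can be handed to any plotting package. With a single station there is one pair per valid time, so the MSE at each time is just that squared error.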
> >>
> >>> (6) My last question was about the comparison between two wrf
output.
> If
> >> I
> >>> want to know how to compare the result between two wrf output at
the
> same
> >>> time at the same location but with different physics schemes.
Whether
> it
> >> is
> >>> just put these two results into the grid-grid comparison and set
> >>> one wrfoutput as the observation? Or there are other way to
execute
> this
> >>> job?
> >>>
> >>
> >> Sure, you can easily treat one of them as the "forecast" and the
> >> other as the "observation".  That is not strictly verification -
> >> more of just a comparison.  Hopefully, the output from MET will help
> >> you quantify the differences.  You should just think carefully about
> >> it when trying to make sense of the output.
> >>
> >>> Thank you so much in advance for your time and help, I really
> appreciate
> >> it!
> >>>
> >>> Best,
> >>>
> >>> Jason
> >>>
> >>
> >>
>
>

------------------------------------------------
Subject: some small questions regarding the basic concepts about MET
From: John Halley Gotway
Time: Mon Oct 21 10:24:45 2013

Jason,

Yes, I mean shell scripting.  Users typically call the MET utilities
from within shell scripts (or any other scripting language, like Perl
or Python).  Typically, the shell script just loops over a bunch of
model initialization times and/or forecast lead times, figures out the
names of the forecast and observation files for each time, and calls
the appropriate MET tools.  I usually use the "date" command when doing
that looping.  I've attached a sample script written in the Korn shell
that does the following...

   - outer loop for the model initialization times (from 2013030100 to
     2013030212 every 12 hours)
   - inner loop for forecast lead times (from 0 to 36 hours every 6
     hours)
   - compute forecast and observation file names
   - print a line stating where you should call the MET tool(s) for
     that time

Just download and run the attached script to see what I mean.
Hopefully, this script will help get you started.

Thanks,
John


On 10/21/2013 07:44 AM, Xingcheng Lu via RT wrote:
>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
>
> Hi John,
>
> Thank you for your answers. You said that I could write some scripts
for
> time series re-run, so what type of scripts I can write, do you mean
shell
> script? If you have a script sample, it will be much helpful to me.
Thank
> again for your help!
>
> cheers,
>
> Jason
>
>
> 2013/10/19 John Halley Gotway via RT <met_help at ucar.edu>
>
>> Jason,
>>
>> Answers are inline.
>>
>> Thanks,
>> John
>>
>> On 10/18/2013 09:14 AM, Xingcheng Lu via RT wrote:
>>>
>>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
>>>
>>> Hi John,
>>>
>>> Thank you so much for your detail reply and I have learned a lot
from
>> your
>>> answers.
>>>
>>> According to your answers, I still have 3 questions:
>>>
>>> For the question 3 and 4, can I say, if there are 200 observation
points
>>> within my domain, and I set the time beg=0/end=0, then the match
pair is
>>> 200. However, when I set beg=3600/end=3600, if each observation
point
>> has 3
>>> values during this time period, then the matching pair should be
600. Is
>> my
>>> understanding correct?
>>>
>>
>> Yes, your understanding is correct.  If multiple observations occur
at the
>> same location during the time window, by default, they will all be
used.
>>   *HOWEVER*, we realize that this behavior is not
>> always desirable, so there's an option in the configuration file to
>> control this logic.  Please take a look in the file
>> "METv4.1/data/config/README" for a description of the
"duplicate_flag"
>> option.
>> Setting it's value to "SINGLE" will cause only a single observation
value
>> for each location to be used.  The one whose valid time is closest
to that
>> of the forecast time is chosen.
>>
>>> For the question 5, if I want to do the time series comparison for
12
>>> hours at the same observation spot(with single hour wrf output and
set
>>> beg/end=0), whether I just need to rerun the MET for 12 times for
>>> single different hour?
>>
>> The Point-Stat tool is intended to be run once for each valid time
being
>> evaluated.  So if you're verifying 12 different times, you'd run
the tool
>> 12 times.  Of course, we don't intend that people run
>> this manually themselves on the command line.  Instead, we expect
that
>> you'd run it via a script of some sort that loops through the times
you'd
>> like to evaluate.
>>
>> Often MET users are interested in the performance of their model at
more
>> than just a single location and for more than a single
variable/level.  So
>> there's more "work to do" than just computing a
>> single matched pair value and writing it out.
>>
>> Here's a few pieces of info you may find helpful...
>> - The MET configuration files support the use of environment
variables.  I
>> often find it helpful when scripting up calls to the MET tools to
set some
>> environment variable values and then reference
>> them in the configuration files I pass to the tools.  That will
enable you
>> to control the behavior of the tools without having to maintain
many
>> different versions of the config files.
>> - If you happen to have GRIB files that contain data for multiple
forecast
>> hours (all 12 of your output times for example), you could actually
call
>> Point-Stat once and evaluate them all at once.  But
>> the configuration files get a lot messier - in the "fcst.field"
setting
>> you'd need to explicitly specify the valid time of the data to be
evaluated
>> so that Point-Stat know which fields to use.  I
>> typically find calling Point-Stat once per valid time is easier.
>> - If you're interested in the performance of your model at a
specific set
>> of station ID's, consider using the "mask.sid" option.  Rather than
>> defining your verification area spatially (as the "grid"
>> and "poly" options do), the station id (sid) option is just a list
of the
>> stations over which you'd like to compute statistics.
>>
>>>
>>> For the question 5, "stat_analysis -job filter -lookin
point_stat/out
>>> -line_type MPR -fcst_var TMP -fcst_lev Z2 -column_str OBS_SID KDEN
>>> -dump_row TMP_Z2_KDEN_MPR.txt"  Can I type this on the Linux
command line
>>> or I need to set this in the config file for stat analysis?
Because I
>> found
>>> that this line is a little bit similar to the jobs setting in the
config
>>> file, but I cannot find the format you type in the manual. Whether
there
>> is
>>> any resource to further introduce this command line format?
>>
>> The STAT-Analysis tool can be run with or without a config file.
If you
>> provide a config file, you have define the job(s) to be run in it.
If not,
>> you define a *SINGLE* job to be run on the command
>> line.  Usually, I run single jobs on the command line until I
figure out
>> the set of analysis jobs I'd like to run via a script.  Then I move
them
>> into a config file and run them all with one call to
>> STAT-Analysis.  Look in the file "METv4.1/data/config/README" for
the
>> section on "Job command FILTERING".
>>
>>>
>>> By the way, do I have to use the grids with the same resolution if
I want
>>> to do the grid-grid comparison? Also, because my research focus is
on
>>> global scale, do you know whether there is any daily grid
observation
>> data
>>> for the global scale?
>>
>> Yes, for the grid-to-grid comparisons performed by the Grid-Stat,
MODE,
>> Series-Analysis, and Wavelet-Stat tools, it's the user's
responsibility to
>> put their forecast and observation data on the same
>> grid.  In future versions of MET, we'd like to add tools to help
users do
>> this.  But currently there isn't any support for this directly in
MET.  But
>> for GRIB1 data, the copygb utility can be used to
>> regrid things.  Here's a portion of the MET online tutorial that
discusses
>> this:
>>
>>
http://www.dtcenter.org/met/users/support/online_tutorial/METv4.1/copygb/index.php
>>
>> Availability of appropriate observations is always an issue.  And I
don't
>> have a magic bullet for you.  You could always compare your model
output to
>> a model analysis - like the global GFS analysis.
>> But that'll just tell you how well your model output matches GFS,
which
>> has it's own set of errors.
>>
>>>
>>> Thanks again for your kind help!
>>>
>>> Sincerely,
>>>
>>> Jason
>>>
>>>
>>>
>>>
>>>
>>>
>>> 2013/10/18 John Halley Gotway via RT <met_help at ucar.edu>
>>>
>>>> Jason,
>>>>
>>>> I've answered your questions inline below.
>>>>
>>>> Thanks,
>>>> John Halley Gotway
>>>> met_help at ucar.edu
>>>>
>>>> On 10/17/2013 08:54 AM, Xingcheng Lu via RT wrote:
>>>>>
>>>>> Thu Oct 17 08:54:28 2013: Request 63455 was acted upon.
>>>>> Transaction: Ticket created by
xingchenglu2011 at u.northwestern.edu
>>>>>           Queue: met_help
>>>>>         Subject: some small questions regarding the basic
concepts about
>>>> MET
>>>>>           Owner: Nobody
>>>>>      Requestors: xingchenglu2011 at u.northwestern.edu
>>>>>          Status: new
>>>>>     Ticket <URL:
>> https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
>>>>>
>>>>>
>>>>> Dear Sir/Madam,
>>>>>
>>>>> I am a new user of MET and I have several small questions want
to ask
>>>> about.
>>>>>
>>>>> (1): In the manual at Table 4-3, I am a little bit confused
about
>>>> Forecast
>>>>> and Observation Rate. According to the contingency table,
whether
>>>>> forecast rate=(n11+n00)/T? And how about observation rate, I
don't know
>>>>> what it should be compared to.
>>>>
>>>> The forecast rate is just the fraction of grid points in the
forecast
>>>> domain at which the event is occurring.  The observation rate is
the
>>>> fraction of grid points in the observation domain at which
>>>> the event is occurring.  The observation rate is also known at
the base
>>>> rate (BASER from the CTS line shown in Table 4-5).  The FHO and
CTC
>> output
>>>> lines really contain the same information.  At the
>>>> DTC, we prefer to use the counts from the CTC line, while NCEP
prefers
>> to
>>>> use the ratios of the counts given in the FHO line.
>>>>
>>>> If your forecast were perfect, the F_RATE would be identical to
the
>> O_RATE.
>>>>
>>>>>
>>>>> (2):My second question is about the grid/poly setting in the
>>>>> Pointstatconfig file. I know that the poly is used to select the
>> specific
>>>>> area for verification, however, I don't know what exactly is the
grid
>>>>> option used for, is it also used to choose the place for
verification?
>> If
>>>>> yes, what's the difference between the poly?
>>>>
>>>> Yes, the grid masking behaves the same way that the polyline
masking
>> does.
>>>>    It's just another way of specifying a geographic subset of the
domain.
>>>>    It's generally much less useful than the polyline
>>>> masking since only the pre-defined NCEP grids are supported, and
most
>>>> users don't find that very helpful.
>>>>
>>>>>
>>>>> (3): I am a little bit confused about the 'match pair' mean. I
have two
>>>>> understandings:(A) If I set temp>273 at Z2, both the observation
file
>> and
>>>>> forecast output have valid value at Z2, no matter they are
larger than
>>>> 273
>>>>> or not;(B) Or it means that the value at Z2 from both
observation and
>>>>> forecast meet the temp>273 requirement?
>>>>
>>>> A "matched pair" just means a pair of forecast and observation
values
>> that
>>>> go together.  Suppose you have 200 point observations of 2-meter
>>>> temperature that fall in your domain.  For each observation
>>>> value, Point-Stat computes an interpolated forecast value for
that
>>>> observation location.  So you now have 200 pairs of forecast and
>>>> observation values.  Using those 200 matched pairs, you could
define
>>>> continuous statistics directly (CNT output line).  Or you could
choose a
>>>> threshold (like >273) and define a 2x2 contingency table.  With
that 2x2
>>>> contingency table, you can dump out the counts in the
>>>> CTC line and/or the corresponding statistics in the CTS line.
>>>>
>>>> I'm not sure if that answers your question.
>>>>
>>>>>
>>>>> (4): My another question is about the beg/end time setting. The
>>>> observation
>>>>> data I downloaded was DS 337.0 NCEP ADP Global Upper Air and
Surface
>>>>> Observations. Is the single file of this type of data only have
valid
>>>>> observation for a specific time spot, like if I download the
data for
>>>>> 2006.07.18_18:00:00, the file only contains the observation data
for
>> that
>>>>> time spot? If so, when combine with wrf single hour output, can
I just
>>>> set
>>>>> this option to 0 and 0? Could you tell me what exact the beg/end
used
>>>> for?
>>>>>
>>>>
>>>> There are 4 GDAS PREPBUFR files per day - 00Z, 06Z, 12Z, and 18Z.
Each
>>>> file contains 6 hours worth of observations, 3 hours +/- the time
>> indicated
>>>> in the file name.  So the 12Z file contains
>>>> observations from 09Z to 15Z.  When you run Point-Stat, you need
to pass
>>>> it one or more observation files that contain the observations
you'd
>> like
>>>> to use to evaluate the forecast.  The time window you
>>>> set in the Point-Stat configuration file is set relative to the
forecast
>>>> valid time.  Suppose your forecast is valid at 06Z and you've set
beg =
>>>> -3600 and end = 3600 (in seconds).  So that's +/- 1
>>>> hour around your forecast valid time.  When Point-Stat sifts
through the
>>>> observations, it'll only use the one whose valid time falls
between 05Z
>> and
>>>> 07Z.  It'll throw all the others out.  It's up to
>>>> you to decide how close in time your observations need to be to
your
>>>> forecast time.  If you set beg = 0 and end = 0, only those
observations
>>>> with exactly the same time as the forecast will be used.
>>>>
>>>>> (5) My 5th question is about the time series comparison. If I
want to
>>>> draw
>>>>> the plot of MSE versus time at a specific observation point,
what
>> should
>>>> I
>>>>> do? And how to take the data of the specific point out?
>>>>>
>>>>
>>>> When you run Point-Stat, you can output the individual matched
pair
>> values
>>>> by turning on the MPR output line.  The individual forecast-
observation
>>>> pairs are contained in that MPR line.  Since
>>>> Point-Stat is run once for each valid time, the time-series of an
>>>> individual point is scattered across many different output files.
If
>> you'd
>>>> like, you could use the STAT-Analysis tool filter out the
>>>> MPR lines that correspond to a single station id.  For example,
here's
>> how
>>>> you might filter out the matched pairs for 2-m temperature for a
station
>>>> named KDEN:
>>>>
>>>>      stat_analysis -job filter -lookin point_stat/out -line_type
MPR
>>>> -fcst_var TMP -fcst_lev Z2 -column_str OBS_SID KDEN -dump_row
>>>> TMP_Z2_KDEN_MPR.txt
>>>>
>>>> That will read all of the files ending in ".stat" from the
directory
>> named
>>>> "point_stat/out", pick out the MPR lines, only with "TMP" in the
>> FCST_VAR
>>>> column, only with "Z2" in the FCST_LEV column, only
>>>> with "KDEN" in the OBS_SID column, and write the output to a file
named
>>>> "TMP_Z2_KDEN_MPR.txt".  There are filtering switches for each of
the 21
>>>> header columns common to each line type.  And you can
>>>> use the -column_min, -column_max, -column_eq, and -column_str
options to
>>>> filter by the data columns (as we've done here for the OBS_SID
column).
>>>>
>>>> Once you have the data stored this way, it's up to you to make
the time
>>>> series plot with whatever software you'd like.
>>>>
>>>>> (6) My last question was about the comparison between two wrf
output.
>> If
>>>> I
>>>>> want to know how to compare the result between two wrf output at
the
>> same
>>>>> time at the same location but with different physics schemes.
Whether
>> it
>>>> is
>>>>> just put these two results into the grid-grid comparison and set
>>>>> one wrfoutput as the observation? Or there are other way to
execute
>> this
>>>>> job?
>>>>>
>>>>
>>>> Sure, you can easily treat one of them as the "forecast" and the
other
>> as
>>>> the "observation".  That is not strictly verification - more of
just a
>>>> comparison.  Hopefully, the output from MET will help
>>>> you quantify the differences.  You should just think carefully
about it
>>>> when trying to make sense of the output.
>>>>
>>>>> Thank you so much in advance for your time and help, I really
>> appreciate
>>>> it!
>>>>>
>>>>> Best,
>>>>>
>>>>> Jason
>>>>>
>>>>
>>>>
>>
>>

------------------------------------------------
Subject: some small questions regarding the basic concepts about MET
From: John Halley Gotway
Time: Mon Oct 21 10:24:45 2013

#!/bin/ksh

################################################################################

# Model initialization times to be processed
INIT_BEG_STR="20130301 00" # First initialization time in "YYYYMMDD HH" format
INIT_END_STR="20130302 12" # Last initialization time in "YYYYMMDD HH" format
INIT_INC_SEC=43200         # Initialization time interval in seconds (12 hours = 43200 seconds)

# Forecast lead times to be processed
FCST_BEG_HR=0              # First lead time in hours
FCST_END_HR=36             # Last lead time in hours
FCST_INC_HR=6              # Lead time interval in hours

################################################################################

# Path to forecast files
FCST_BASE="/sample/fcst"

# Path to observation files
OBS_BASE="/sample/obs"

################################################################################

# Define the forecast hour variable to use 3 digits
# The "typeset -Z3" command in ksh prescribes the number of digits
FCST_CUR_HR=0
typeset -Z3 FCST_CUR_HR

# Convert dates from string to unix time
INIT_BEG_UT=`date -ud "${INIT_BEG_STR}" +%s`
INIT_END_UT=`date -ud "${INIT_END_STR}" +%s`

# Loop through the initialization times using unix time
INIT_CUR_UT=${INIT_BEG_UT}
while [ ${INIT_CUR_UT} -le ${INIT_END_UT} ]; do

  # Get the current initialization string
  INIT_CUR_STR=`date -ud '1970-01-01 UTC '${INIT_CUR_UT}' seconds' +%Y%m%d%H`

  echo
  echo "Processing initialization time ${INIT_CUR_STR}"
  echo

  # Loop through the forecast lead times
  FCST_CUR_HR=${FCST_BEG_HR}
  while [ ${FCST_CUR_HR} -le ${FCST_END_HR} ]; do

    # Convert the current forecast hour to seconds
    FCST_CUR_SEC=`expr ${FCST_CUR_HR} \* 3600`

    # Compute valid time
    VALID_UT=`expr ${INIT_CUR_UT} + ${FCST_CUR_SEC}`
    VALID_STR=`date -ud '1970-01-01 UTC '${VALID_UT}' seconds' +%Y%m%d%H`

    # Compute corresponding forecast and observation file names
    FCST_FILE="${FCST_BASE}/d01_${INIT_CUR_STR}_${FCST_CUR_HR}00.grib"
    OBS_FILE="${OBS_BASE}/ST4.d01.${VALID_STR}.06h"

    # Make some calls to the MET tools here
    echo "Add calls to MET tool(s) for \
initialization time = ${INIT_CUR_STR}, \
forecast lead time = ${FCST_CUR_HR} hours, \
valid time = ${VALID_STR}, \
forecast file = ${FCST_FILE}, and \
observation file = ${OBS_FILE}"

    # Increment the forecast time
    FCST_CUR_HR=`expr ${FCST_CUR_HR} + ${FCST_INC_HR}`

  done

  # Increment the initialization time
  INIT_CUR_UT=`expr ${INIT_CUR_UT} + ${INIT_INC_SEC}`

done
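
The echo placeholder in the loop above marks where a MET tool would be called. As a minimal sketch of that step, the helper below only prints the command line it would run. It assumes the usual argument order of grid_stat and point_stat (forecast file, observation file, config file, then optional -outdir and -v); check the usage statement of your MET version. The function name build_met_cmd, the config file name, and the output directory are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of MET): print the command line for one
# forecast/observation pair instead of executing it.
# Arguments: tool name, forecast file, observation file, config file, output dir.
build_met_cmd() {
  tool=$1; fcst_file=$2; obs_file=$3; config_file=$4; out_dir=$5
  echo "${tool} ${fcst_file} ${obs_file} ${config_file} -outdir ${out_dir} -v 2"
}

# Example with made-up paths that follow the naming used in the script above.
build_met_cmd grid_stat \
  /sample/fcst/d01_2013030100_00600.grib \
  /sample/obs/ST4.d01.2013030106.06h \
  GridStatConfig_sample \
  /sample/out
```

Inside the loop, the same call could take ${FCST_FILE} and ${OBS_FILE} as its second and third arguments. Printing first makes it easy to check the file names before running anything.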

------------------------------------------------
Subject: some small questions regarding the basic concepts about MET
From: Xingcheng Lu
Time: Mon Oct 21 19:46:32 2013

Hello John,

Thank you very much for your script; I will study it in detail. I also
really appreciate the many detailed answers you provided, and I have
learned a lot from them. I hope to meet you again the next time I ask a
question on met-help.

Sincerely,

Jason


2013/10/22 John Halley Gotway via RT <met_help at ucar.edu>

> Jason,
>
> Yes, I mean shell scripting.  Users typically call the MET utilities from
> within shell scripts (or any other scripting language, like PERL or
> Python).  Typically, the shell script just loops over a bunch of model
> initialization times and/or forecast lead times, figures out the name of
> the forecast and observation files for that time, and calls the
> appropriate MET tools.  I usually use the "date" command when doing that
> looping.  I've attached a sample script written in the korn shell that
> does the following...
>
>    - outer loop for the model initialization times (from 2013030100 to
>      2013030212 every 12-hours)
>    - inner loop for forecast lead times (from 0 to 36 hours every 6 hours)
>    - compute forecast and observation file names
>    - print a line stating where you should call the MET tool(s) for that
>      time
>
> Just download and run the attached script to see what I mean.  Hopefully,
> this script will help get you started.
>
> Thanks,
> John
>
>
> On 10/21/2013 07:44 AM, Xingcheng Lu via RT wrote:
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
> >
> > Hi John,
> >
> > Thank you for your answers. You said that I could write some scripts
> > to re-run MET for a time series. What type of scripts can I write? Do
> > you mean shell scripts? If you have a sample script, it would be very
> > helpful to me. Thanks again for your help!
> >
> > cheers,
> >
> > Jason
> >
> >
> > 2013/10/19 John Halley Gotway via RT <met_help at ucar.edu>
> >
> >> Jason,
> >>
> >> Answers are inline.
> >>
> >> Thanks,
> >> John
> >>
> >> On 10/18/2013 09:14 AM, Xingcheng Lu via RT wrote:
> >>>
> >>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
> >>>
> >>> Hi John,
> >>>
> >>> Thank you so much for your detailed reply; I have learned a lot from
> >>> your answers.
> >>>
> >>> According to your answers, I still have 3 questions:
> >>>
> >>> For questions 3 and 4, can I say that if there are 200 observation
> >>> points within my domain and I set beg=0/end=0, then there are 200
> >>> matched pairs? However, when I set beg=-3600/end=3600, if each
> >>> observation point has 3 values during this time period, then there
> >>> should be 600 matched pairs. Is my understanding correct?
> >>>
> >>
> >> Yes, your understanding is correct.  If multiple observations occur at
> >> the same location during the time window, by default, they will all be
> >> used.  *HOWEVER*, we realize that this behavior is not always
> >> desirable, so there's an option in the configuration file to control
> >> this logic.  Please take a look in the file
> >> "METv4.1/data/config/README" for a description of the "duplicate_flag"
> >> option.  Setting its value to "SINGLE" will cause only a single
> >> observation value for each location to be used.  The one whose valid
> >> time is closest to that of the forecast time is chosen.
> >>
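
The "closest in time" selection described above can be illustrated outside of MET. The sketch below is not MET's implementation. It assumes a made-up input format of "station_id obs_ut value" lines, and it keeps the first line read when two observations are equally close, which is an arbitrary choice. The function name closest_obs is hypothetical.

```shell
#!/bin/sh
# closest_obs VALID_UT
# Reads "station_id obs_ut value" lines on stdin and prints, for each station,
# the line whose obs_ut is closest to VALID_UT (all times in unix seconds).
# Ties keep the first line read. Output is sorted by station id.
closest_obs() {
  awk -v valid_ut="$1" '
    {
      diff = $2 - valid_ut
      if (diff < 0) diff = -diff
      if (!($1 in best) || diff < best[$1]) {
        best[$1] = diff
        line[$1] = $0
      }
    }
    END { for (sid in line) print line[sid] }
  ' | sort
}
```

With three observations per station inside the window, a selection like this turns the 600 pairs from the example back into 200.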
> >>> For question 5, if I want to do the time series comparison for 12
> >>> hours at the same observation spot (with single-hour WRF output and
> >>> beg/end=0), do I just need to rerun MET 12 times, once for each
> >>> hour?
> >>
> >> The Point-Stat tool is intended to be run once for each valid time
> >> being evaluated.  So if you're verifying 12 different times, you'd run
> >> the tool 12 times.  Of course, we don't intend that people run this
> >> manually themselves on the command line.  Instead, we expect that
> >> you'd run it via a script of some sort that loops through the times
> >> you'd like to evaluate.
> >>
> >> Often MET users are interested in the performance of their model at
> >> more than just a single location and for more than a single
> >> variable/level.  So there's more "work to do" than just computing a
> >> single matched pair value and writing it out.
> >>
> >> Here are a few pieces of info you may find helpful...
> >> - The MET configuration files support the use of environment
> >>   variables.  I often find it helpful when scripting up calls to the
> >>   MET tools to set some environment variable values and then reference
> >>   them in the configuration files I pass to the tools.  That will
> >>   enable you to control the behavior of the tools without having to
> >>   maintain many different versions of the config files.
> >> - If you happen to have GRIB files that contain data for multiple
> >>   forecast hours (all 12 of your output times for example), you could
> >>   actually call Point-Stat once and evaluate them all at once.  But the
> >>   configuration files get a lot messier - in the "fcst.field" setting
> >>   you'd need to explicitly specify the valid time of the data to be
> >>   evaluated so that Point-Stat knows which fields to use.  I typically
> >>   find calling Point-Stat once per valid time is easier.
> >> - If you're interested in the performance of your model at a specific
> >>   set of station IDs, consider using the "mask.sid" option.  Rather
> >>   than defining your verification area spatially (as the "grid" and
> >>   "poly" options do), the station id (sid) option is just a list of
> >>   the stations over which you'd like to compute statistics.
> >>
> >>>
> >>> For question 5, regarding "stat_analysis -job filter -lookin
> >>> point_stat/out -line_type MPR -fcst_var TMP -fcst_lev Z2 -column_str
> >>> OBS_SID KDEN -dump_row TMP_Z2_KDEN_MPR.txt": can I type this on the
> >>> Linux command line, or do I need to set this in the config file for
> >>> STAT-Analysis? This line looks a bit similar to the jobs setting in
> >>> the config file, but I cannot find this format in the manual. Is
> >>> there any resource that describes this command line format further?
> >>
> >> The STAT-Analysis tool can be run with or without a config file.  If
> >> you provide a config file, you have to define the job(s) to be run in
> >> it.  If not, you define a *SINGLE* job to be run on the command line.
> >> Usually, I run single jobs on the command line until I figure out the
> >> set of analysis jobs I'd like to run via a script.  Then I move them
> >> into a config file and run them all with one call to STAT-Analysis.
> >> Look in the file "METv4.1/data/config/README" for the section on "Job
> >> command FILTERING".
> >>
> >>>
> >>> By the way, do I have to use grids with the same resolution if I want
> >>> to do the grid-to-grid comparison? Also, because my research focus is
> >>> on the global scale, do you know whether there is any daily gridded
> >>> observation data at the global scale?
> >>
> >> Yes, for the grid-to-grid comparisons performed by the Grid-Stat,
> >> MODE, Series-Analysis, and Wavelet-Stat tools, it's the user's
> >> responsibility to put their forecast and observation data on the same
> >> grid.  In future versions of MET, we'd like to add tools to help users
> >> do this.  But currently there isn't any support for this directly in
> >> MET.  But for GRIB1 data, the copygb utility can be used to regrid
> >> things.  Here's a portion of the MET online tutorial that discusses
> >> this:
> >>
> >> http://www.dtcenter.org/met/users/support/online_tutorial/METv4.1/copygb/index.php
> >>
> >> Availability of appropriate observations is always an issue.  And I
> >> don't have a magic bullet for you.  You could always compare your
> >> model output to a model analysis - like the global GFS analysis.  But
> >> that'll just tell you how well your model output matches GFS, which
> >> has its own set of errors.
> >>
> >>>
> >>> Thanks again for your kind help!
> >>>
> >>> Sincerely,
> >>>
> >>> Jason
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> 2013/10/18 John Halley Gotway via RT <met_help at ucar.edu>
> >>>
> >>>> Jason,
> >>>>
> >>>> I've answered your questions inline below.
> >>>>
> >>>> Thanks,
> >>>> John Halley Gotway
> >>>> met_help at ucar.edu
> >>>>
> >>>> On 10/17/2013 08:54 AM, Xingcheng Lu via RT wrote:
> >>>>>
> >>>>> Thu Oct 17 08:54:28 2013: Request 63455 was acted upon.
> >>>>> Transaction: Ticket created by xingchenglu2011 at u.northwestern.edu
> >>>>>           Queue: met_help
> >>>>>         Subject: some small questions regarding the basic concepts about MET
> >>>>>           Owner: Nobody
> >>>>>      Requestors: xingchenglu2011 at u.northwestern.edu
> >>>>>          Status: new
> >>>>>     Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
> >>>>>
> >>>>>
> >>>>> Dear Sir/Madam,
> >>>>>
> >>>>> I am a new user of MET and I have several small questions want to
> >>>>> ask about.
> >>>>>
> >>>>> (1): In the manual at Table 4-3, I am a little bit confused about
> >>>>> Forecast and Observation Rate. According to the contingency table,
> >>>>> whether forecast rate=(n11+n00)/T? And how about observation rate, I
> >>>>> don't know what it should be compared to.
> >>>>
> >>>> The forecast rate is just the fraction of grid points in the forecast
> >>>> domain at which the event is occurring.  The observation rate is the
> >>>> fraction of grid points in the observation domain at which the event
> >>>> is occurring.  The observation rate is also known as the base rate
> >>>> (BASER from the CTS line shown in Table 4-5).  The FHO and CTC output
> >>>> lines really contain the same information.  At the DTC, we prefer to
> >>>> use the counts from the CTC line, while NCEP prefers to use the
> >>>> ratios of the counts given in the FHO line.
> >>>>
> >>>> If your forecast were perfect, the F_RATE would be identical to the
> >>>> O_RATE.
> >>>>
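
To make the rates concrete: with the 2x2 counts named as in the CTC line (FY_OY, FY_ON, FN_OY, FN_ON), the forecast rate is (FY_OY + FY_ON) / TOTAL and the observation rate is (FY_OY + FN_OY) / TOTAL. The expression (n11+n00)/T from the question is the accuracy rather than the forecast rate. The small sketch below works this through for made-up counts; the function name fho_rates is hypothetical and not part of MET.

```shell
#!/bin/sh
# fho_rates FY_OY FY_ON FN_OY FN_ON
# Prints F_RATE, H_RATE and O_RATE computed from 2x2 contingency table counts.
fho_rates() {
  awk -v fy_oy="$1" -v fy_on="$2" -v fn_oy="$3" -v fn_on="$4" 'BEGIN {
    total = fy_oy + fy_on + fn_oy + fn_on
    if (total == 0) { print "no pairs"; exit 1 }
    printf "F_RATE=%.3f H_RATE=%.3f O_RATE=%.3f\n",
           (fy_oy + fy_on) / total, fy_oy / total, (fy_oy + fn_oy) / total
  }'
}

# Made-up counts: 50 hits, 20 false alarms, 10 misses, 120 correct negatives.
fho_rates 50 20 10 120
# prints: F_RATE=0.350 H_RATE=0.250 O_RATE=0.300
```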
> >>>>>
> >>>>> (2):My second question is about the grid/poly setting in the
> >>>>> Pointstatconfig file. I know that the poly is used to select the
> >>>>> specific area for verification, however, I don't know what exactly is
> >>>>> the grid option used for, is it also used to choose the place for
> >>>>> verification? If yes, what's the difference between the poly?
> >>>>
> >>>> Yes, the grid masking behaves the same way that the polyline masking
> >>>> does.  It's just another way of specifying a geographic subset of the
> >>>> domain.  It's generally much less useful than the polyline masking
> >>>> since only the pre-defined NCEP grids are supported, and most users
> >>>> don't find that very helpful.
> >>>>
> >>>>>
> >>>>> (3): I am a little bit confused about the 'match pair' mean. I have
> >>>>> two understandings:(A) If I set temp>273 at Z2, both the observation
> >>>>> file and forecast output have valid value at Z2, no matter they are
> >>>>> larger than 273 or not;(B) Or it means that the value at Z2 from both
> >>>>> observation and forecast meet the temp>273 requirement?
> >>>>
> >>>> A "matched pair" just means a pair of forecast and observation values
> >>>> that go together.  Suppose you have 200 point observations of 2-meter
> >>>> temperature that fall in your domain.  For each observation value,
> >>>> Point-Stat computes an interpolated forecast value for that
> >>>> observation location.  So you now have 200 pairs of forecast and
> >>>> observation values.  Using those 200 matched pairs, you could define
> >>>> continuous statistics directly (CNT output line).  Or you could
> >>>> choose a threshold (like >273) and define a 2x2 contingency table.
> >>>> With that 2x2 contingency table, you can dump out the counts in the
> >>>> CTC line and/or the corresponding statistics in the CTS line.
> >>>>
> >>>> I'm not sure if that answers your question.
> >>>>
> >>>>>
> >>>>> (4): My another question is about the beg/end time setting. The
> >>>>> observation data I downloaded was DS 337.0 NCEP ADP Global Upper Air
> >>>>> and Surface Observations. Is the single file of this type of data
> >>>>> only have valid observation for a specific time spot, like if I
> >>>>> download the data for 2006.07.18_18:00:00, the file only contains the
> >>>>> observation data for that time spot? If so, when combine with wrf
> >>>>> single hour output, can I just set this option to 0 and 0? Could you
> >>>>> tell me what exact the beg/end used for?
> >>>>>
> >>>>
> >>>> There are 4 GDAS PREPBUFR files per day - 00Z, 06Z, 12Z, and 18Z.
> >>>> Each file contains 6 hours worth of observations, 3 hours +/- the
> >>>> time indicated in the file name.  So the 12Z file contains
> >>>> observations from 09Z to 15Z.  When you run Point-Stat, you need to
> >>>> pass it one or more observation files that contain the observations
> >>>> you'd like to use to evaluate the forecast.  The time window you set
> >>>> in the Point-Stat configuration file is set relative to the forecast
> >>>> valid time.  Suppose your forecast is valid at 06Z and you've set
> >>>> beg = -3600 and end = 3600 (in seconds).  So that's +/- 1 hour around
> >>>> your forecast valid time.  When Point-Stat sifts through the
> >>>> observations, it'll only use the ones whose valid time falls between
> >>>> 05Z and 07Z.  It'll throw all the others out.  It's up to you to
> >>>> decide how close in time your observations need to be to your
> >>>> forecast time.  If you set beg = 0 and end = 0, only those
> >>>> observations with exactly the same time as the forecast will be used.
> >>>>
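
The window arithmetic described above can be written out in a few lines. This is only a sketch of the idea, not MET code: the function name in_obs_window is made up, times are unix seconds, and treating both ends of the window as inclusive is an assumption.

```shell
#!/bin/sh
# in_obs_window VALID_UT BEG_SEC END_SEC OBS_UT
# Succeeds (exit status 0) when the observation time lies inside the window
# [VALID_UT + BEG_SEC, VALID_UT + END_SEC]. Treating both ends as inclusive
# is an assumption of this sketch.
in_obs_window() {
  valid_ut=$1; beg_sec=$2; end_sec=$3; obs_ut=$4
  win_beg=$(( ${valid_ut} + ${beg_sec} ))
  win_end=$(( ${valid_ut} + ${end_sec} ))
  [ "${obs_ut}" -ge "${win_beg}" ] && [ "${obs_ut}" -le "${win_end}" ]
}

# Forecast valid at 2013-03-01 06Z (1362117600), window of +/- 1 hour.
# An observation at 05:30Z (1362115800) is kept; one at 08Z (1362124800) is not.
if in_obs_window 1362117600 -3600 3600 1362115800; then echo "05:30Z kept"; fi
if ! in_obs_window 1362117600 -3600 3600 1362124800; then echo "08Z dropped"; fi
```

With beg = 0 and end = 0 the same check only succeeds when the observation time equals the forecast valid time.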
> >>>>> (5) My 5th question is about the time series comparison. If I want to
> >>>>> draw the plot of MSE versus time at a specific observation point,
> >>>>> what should I do? And how to take the data of the specific point out?
> >>>>>
> >>>>
> >>>> When you run Point-Stat, you can output the individual matched pair
> >>>> values by turning on the MPR output line.  The individual
> >>>> forecast-observation pairs are contained in that MPR line.  Since
> >>>> Point-Stat is run once for each valid time, the time-series of an
> >>>> individual point is scattered across many different output files.  If
> >>>> you'd like, you could use the STAT-Analysis tool to filter out the
> >>>> MPR lines that correspond to a single station id.  For example,
> >>>> here's how you might filter out the matched pairs for 2-m temperature
> >>>> for a station named KDEN:
> >>>>
> >>>>      stat_analysis -job filter -lookin point_stat/out \
> >>>>        -line_type MPR -fcst_var TMP -fcst_lev Z2 \
> >>>>        -column_str OBS_SID KDEN -dump_row TMP_Z2_KDEN_MPR.txt
> >>>>
> >>>> That will read all of the files ending in ".stat" from the directory
> >>>> named "point_stat/out", pick out the MPR lines, only with "TMP" in the
> >>>> FCST_VAR column, only with "Z2" in the FCST_LEV column, only with
> >>>> "KDEN" in the OBS_SID column, and write the output to a file named
> >>>> "TMP_Z2_KDEN_MPR.txt".  There are filtering switches for each of the
> >>>> 21 header columns common to each line type.  And you can use the
> >>>> -column_min, -column_max, -column_eq, and -column_str options to
> >>>> filter by the data columns (as we've done here for the OBS_SID
> >>>> column).
> >>>>
> >>>> Once you have the data stored this way, it's up to you to make the
> >>>> time series plot with whatever software you'd like.
> >>>>
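
Once the matched pairs for one station are in a single file, the squared error per valid time can be pulled out with a short awk call. The sketch below assumes the file is whitespace-delimited and that its first line is a header naming the FCST_VALID_BEG, FCST and OBS columns. Whether the -dump_row output of your MET version carries such a header is an assumption; if it does not, the column numbers from the MPR table in the user's guide would be needed instead. The function name mpr_sq_err is made up.

```shell
#!/bin/sh
# mpr_sq_err FILE
# Reads a whitespace-delimited file whose first line is a header naming the
# columns, and prints "valid_time squared_error" for every data row.
# Assumption: the header contains FCST_VALID_BEG, FCST and OBS.
mpr_sq_err() {
  awk '
    NR == 1 {
      for (i = 1; i <= NF; i++) col[$i] = i
      if (!("FCST_VALID_BEG" in col) || !("FCST" in col) || !("OBS" in col)) {
        print "expected columns not found in header" > "/dev/stderr"
        exit 1
      }
      next
    }
    {
      err = $(col["FCST"]) - $(col["OBS"])
      printf "%s %.4f\n", $(col["FCST_VALID_BEG"]), err * err
    }
  ' "$1"
}
```

Calling mpr_sq_err TMP_Z2_KDEN_MPR.txt would then give one line per valid time, ready for whatever plotting software you prefer. Averaging the second column over a set of times gives the MSE for that set.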
> >>>>> (6) My last question was about the comparison between two wrf
> >>>>> output. If I want to know how to compare the result between two wrf
> >>>>> output at the same time at the same location but with different
> >>>>> physics schemes. Whether it is just put these two results into the
> >>>>> grid-grid comparison and set one wrfoutput as the observation? Or
> >>>>> there are other way to execute this job?
> >>>>>
> >>>>
> >>>> Sure, you can easily treat one of them as the "forecast" and the other
> >>>> as the "observation".  That is not strictly verification - more of
> >>>> just a comparison.  Hopefully, the output from MET will help you
> >>>> quantify the differences.  You should just think carefully about it
> >>>> when trying to make sense of the output.
> >>>>
> >>>>> Thank you so much in advance for your time and help, I really
> >>>>> appreciate it!
> >>>>>
> >>>>> Best,
> >>>>>
> >>>>> Jason
> >>>>>
> >>>>
> >>>>
> >>
> >>
>
>

------------------------------------------------
Subject: some small questions regarding the basic concepts about MET
From: Xingcheng Lu
Time: Tue Oct 22 06:29:56 2013

Hi John,

I forgot to ask you a question about the observation data yesterday. Is
the NCEP ADP Global Upper Air and Surface Weather Observations data
recorded at an instantaneous time, or is it an average value over a
period? For example, if the data is recorded at 12:00pm, is the value at
this time an average from 11:50 to 12:10, or the exact value at 12:00?
Thanks again.

Best,

Jason


2013/10/22 Xingcheng Lu <xingchenglu2011 at u.northwestern.edu>

> Hello John,
>
> Thank you very much for your script, I will study it in detail.
Also, I
> really appreciate that you can provide me so many detailed answers,
I have
> learned a lot from them. Hope that meet you again when I ask the
question
> in met-help next time.
>
> Sincerely,
>
> Jason
>
>
> 2013/10/22 John Halley Gotway via RT <met_help at ucar.edu>
>
>> Jason,
>>
>> Yes, I mean shell scripting.  Users typically call the MET
utilities from
>> within shell scripts (or any other scripting language, like PERL or
>> Python).  Typically, the shell script just loops over a
>> bunch of model initialization times and/or forecast lead times,
figures
>> out the name of the forecast and observation files for that time,
and calls
>> the appropriate MET tools.  I usually use the "date"
>> command when doing that looping.  I've attached a sample script
written
>> in the korn shell that does the following...
>>
>>    - outer loop for the model initialization times (from 2013030100
to
>> 2013030212 every 12-hours)
>>    - inner loop for forecast lead times (from 0 to 36 hours every 6
hours)
>>    - compute forecast and observation file names
>>    - print a line stating where you should call the MET tool(s) for
that
>> time
>>
>> Just download and run the attached script to see what I mean.
Hopefully,
>> this script will help get you started.
>>
>> Thanks,
>> John
>>
>>
>> On 10/21/2013 07:44 AM, Xingcheng Lu via RT wrote:
>> >
>> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
>> >
>> > Hi John,
>> >
>> > Thank you for your answers. You said that I could write some
scripts for
>> > time series re-run, so what type of scripts I can write, do you
mean
>> shell
>> > script? If you have a script sample, it will be much helpful to
me.
>> Thank
>> > again for your help!
>> >
>> > cheers,
>> >
>> > Jason
>> >
>> >
>> > 2013/10/19 John Halley Gotway via RT <met_help at ucar.edu>
>> >
>> >> Jason,
>> >>
>> >> Answers are inline.
>> >>
>> >> Thanks,
>> >> John
>> >>
>> >> On 10/18/2013 09:14 AM, Xingcheng Lu via RT wrote:
>> >>>
>> >>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
>> >>>
>> >>> Hi John,
>> >>>
>> >>> Thank you so much for your detail reply and I have learned a
lot from
>> >> your
>> >>> answers.
>> >>>
>> >>> According to your answers, I still have 3 questions:
>> >>>
>> >>> For the question 3 and 4, can I say, if there are 200
observation
>> points
>> >>> within my domain, and I set the time beg=0/end=0, then the
match pair
>> is
>> >>> 200. However, when I set beg=3600/end=3600, if each observation
point
>> >> has 3
>> >>> values during this time period, then the matching pair should
be 600.
>> Is
>> >> my
>> >>> understanding correct?
>> >>>
>> >>
>> >> Yes, your understanding is correct.  If multiple observations
occur at
>> the
>> >> same location during the time window, by default, they will all
be
>> used.
>> >>   *HOWEVER*, we realize that this behavior is not
>> >> always desirable, so there's an option in the configuration file
to
>> >> control this logic.  Please take a look in the file
>> >> "METv4.1/data/config/README" for a description of the
"duplicate_flag"
>> >> option.
>> >> Setting it's value to "SINGLE" will cause only a single
observation
>> value
>> >> for each location to be used.  The one whose valid time is
closest to
>> that
>> >> of the forecast time is chosen.
>> >>
>> >>> For the question 5, if I want to do the time series comparison
for 12
>> >>> hours at the same observation spot(with single hour wrf output
and set
>> >>> beg/end=0), whether I just need to rerun the MET for 12 times
for
>> >>> single different hour?
>> >>
>> >> The Point-Stat tool is intended to be run once for each valid
time
>> being
>> >> evaluated.  So if you're verifying 12 different times, you'd run
the
>> tool
>> >> 12 times.  Of course, we don't intend that people run
>> >> this manually themselves on the command line.  Instead, we
expect that
>> >> you'd run it via a script of some sort that loops through the
times
>> you'd
>> >> like to evaluate.
>> >>
>> >> Often MET users are interested in the performance of their model
at
>> more
>> >> than just a single location and for more than a single
variable/level.
>>  So
>> >> there's more "work to do" than just computing a
>> >> single matched pair value and writing it out.
>> >>
>> >> Here's a few pieces of info you may find helpful...
>> >> - The MET configuration files support the use of environment
>> variables.  I
>> >> often find it helpful when scripting up calls to the MET tools
to set
>> some
>> >> environment variable values and then reference
>> >> them in the configuration files I pass to the tools.  That will
enable
>> you
>> >> to control the behavior of the tools without having to maintain
many
>> >> different versions of the config files.
>> >> - If you happen to have GRIB files that contain data for
multiple
>> forecast
>> >> hours (all 12 of your output times for example), you could
actually
>> call
>> >> Point-Stat once and evaluate them all at once.  But
>> >> the configuration files get a lot messier - in the "fcst.field"
setting
>> >> you'd need to explicitly specify the valid time of the data to
be
>> evaluated
>> >> so that Point-Stat know which fields to use.  I
>> >> typically find calling Point-Stat once per valid time is easier.
>> >> - If you're interested in the performance of your model at a
specific
>> set
>> >> of station ID's, consider using the "mask.sid" option.  Rather
than
>> >> defining your verification area spatially (as the "grid"
>> >> and "poly" options do), the station id (sid) option is just a
list of
>> the
>> >> stations over which you'd like to compute statistics.
>> >>
>> >>>
>> >>> For the question 5, "stat_analysis -job filter -lookin
point_stat/out
>> >>> -line_type MPR -fcst_var TMP -fcst_lev Z2 -column_str OBS_SID
KDEN
>> >>> -dump_row TMP_Z2_KDEN_MPR.txt"  Can I type this on the Linux
command
>> line
>> >>> or I need to set this in the config file for stat analysis?
Because I
>> >> found
>> >>> that this line is a little bit similar to the jobs setting in
the
>> config
>> >>> file, but I cannot find the format you type in the manual.
Whether
>> there
>> >> is
>> >>> any resource to further introduce this command line format?
>> >>
>> >> The STAT-Analysis tool can be run with or without a config file.
If
>> you
>> >> provide a config file, you have define the job(s) to be run in
it.  If
>> not,
>> >> you define a *SINGLE* job to be run on the command
>> >> line.  Usually, I run single jobs on the command line until I
figure
>> out
>> >> the set of analysis jobs I'd like to run via a script.  Then I
move
>> them
>> >> into a config file and run them all with one call to
>> >> STAT-Analysis.  Look in the file "METv4.1/data/config/README"
for the
>> >> section on "Job command FILTERING".
>> >>
>> >>>
>> >>> By the way, do I have to use the grids with the same resolution
if I
>> want
>> >>> to do the grid-grid comparison? Also, because my research focus
is on
>> >>> global scale, do you know whether there is any daily grid
observation
>> >> data
>> >>> for the global scale?
>> >>
>> >> Yes, for the grid-to-grid comparisons performed by the Grid-
Stat, MODE,
>> >> Series-Analysis, and Wavelet-Stat tools, it's the user's
>> responsibility to
>> >> put their forecast and observation data on the same
>> >> grid.  In future versions of MET, we'd like to add tools to help
users
>> do
>> >> this.  But currently there isn't any support for this directly
in MET.
>>  But
>> >> for GRIB1 data, the copygb utility can be used to
>> >> regrid things.  Here's a portion of the MET online tutorial that
>> discusses
>> >> this:
>> >>
>> >>
>>
http://www.dtcenter.org/met/users/support/online_tutorial/METv4.1/copygb/index.php
>> >>
>> >> Availability of appropriate observations is always an issue.
And I
>> don't
>> >> have a magic bullet for you.  You could always compare your
model
>> output to
>> >> a model analysis - like the global GFS analysis.
>> >> But that'll just tell you how well your model output matches
GFS, which
>> >> has it's own set of errors.
>> >>
>> >>>
>> >>> Thanks again for your kind help!
>> >>>
>> >>> Sincerely,
>> >>>
>> >>> Jason
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> 2013/10/18 John Halley Gotway via RT <met_help at ucar.edu>
>> >>>
>> >>>> Jason,
>> >>>>
>> >>>> I've answered your questions inline below.
>> >>>>
>> >>>> Thanks,
>> >>>> John Halley Gotway
>> >>>> met_help at ucar.edu
>> >>>>
>> >>>> On 10/17/2013 08:54 AM, Xingcheng Lu via RT wrote:
>> >>>>>
>> >>>>> Thu Oct 17 08:54:28 2013: Request 63455 was acted upon.
>> >>>>> Transaction: Ticket created by
xingchenglu2011 at u.northwestern.edu
>> >>>>>           Queue: met_help
>> >>>>>         Subject: some small questions regarding the basic
concepts
>> about
>> >>>> MET
>> >>>>>           Owner: Nobody
>> >>>>>      Requestors: xingchenglu2011 at u.northwestern.edu
>> >>>>>          Status: new
>> >>>>>     Ticket <URL:
>> >> https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
>> >>>>>
>> >>>>>
>> >>>>> Dear Sir/Madam,
>> >>>>>
>> >>>>> I am a new user of MET and I have several small questions
want to
>> ask
>> >>>> about.
>> >>>>>
>> >>>>> (1): In the manual at Table 4-3, I am a little bit confused
about
>> >>>> Forecast
>> >>>>> and Observation Rate. According to the contingency table,
whether
>> >>>>> forecast rate=(n11+n00)/T? And how about observation rate, I
don't
>> know
>> >>>>> what it should be compared to.
>> >>>>
>> >>>> The forecast rate is just the fraction of grid points in the
forecast
>> >>>> domain at which the event is occurring.  The observation rate
is the
>> >>>> fraction of grid points in the observation domain at which
>> >>>> the event is occurring.  The observation rate is also known at
the
>> base
>> >>>> rate (BASER from the CTS line shown in Table 4-5).  The FHO
and CTC
>> >> output
>> >>>> lines really contain the same information.  At the
>> >>>> DTC, we prefer to use the counts from the CTC line, while NCEP
>> prefers
>> >> to
>> >>>> use the ratios of the counts given in the FHO line.
>> >>>>
>> >>>> If your forecast were perfect, the F_RATE would be identical
to the
>> >> O_RATE.
>> >>>>
>> >>>>>
>> >>>>> (2):My second question is about the grid/poly setting in the
>> >>>>> Pointstatconfig file. I know that the poly is used to select
the
>> >> specific
>> >>>>> area for verification, however, I don't know what exactly is
the
>> grid
>> >>>>> option used for, is it also used to choose the place for
>> verification?
>> >> If
>> >>>>> yes, what's the difference between the poly?
>> >>>>
>> >>>> Yes, the grid masking behaves the same way that the polyline
masking
>> >> does.
>> >>>>    It's just another way of specifying a geographic subset of
the
>> domain.
>> >>>>    It's generally much less useful than the polyline
>> >>>> masking since only the pre-defined NCEP grids are supported,
and most
>> >>>> users don't find that very helpful.
>> >>>>
>> >>>>>
>> >>>>> (3): I am a little bit confused about the 'match pair' mean.
I have
>> two
>> >>>>> understandings:(A) If I set temp>273 at Z2, both the
observation
>> file
>> >> and
>> >>>>> forecast output have valid value at Z2, no matter they are
larger
>> than
>> >>>> 273
>> >>>>> or not;(B) Or it means that the value at Z2 from both
observation
>> and
>> >>>>> forecast meet the temp>273 requirement?
>> >>>>
>> >>>> A "matched pair" just means a pair of forecast and observation
values
>> >> that
>> >>>> go together.  Suppose you have 200 point observations of 2-
meter
>> >>>> temperature that fall in your domain.  For each observation
>> >>>> value, Point-Stat computes an interpolated forecast value for
that
>> >>>> observation location.  So you now have 200 pairs of forecast
and
>> >>>> observation values.  Using those 200 matched pairs, you could
define
>> >>>> continuous statistics directly (CNT output line).  Or you
could
>> choose a
>> >>>> threshold (like >273) and define a 2x2 contingency table.
With that
>> 2x2
>> >>>> contingency table, you can dump out the counts in the
>> >>>> CTC line and/or the corresponding statistics in the CTS line.
>> >>>>
>> >>>> I'm not sure if that answers your question.
>> >>>>
>> >>>>>
>> >>>>> (4): My another question is about the beg/end time setting.
The
>> >>>> observation
>> >>>>> data I downloaded was DS 337.0 NCEP ADP Global Upper Air and
Surface
>> >>>>> Observations. Is the single file of this type of data only
have
>> valid
>> >>>>> observation for a specific time spot, like if I download the
data
>> for
>> >>>>> 2006.07.18_18:00:00, the file only contains the observation
data for
>> >> that
>> >>>>> time spot? If so, when combine with wrf single hour output,
can I
>> just
>> >>>> set
>> >>>>> this option to 0 and 0? Could you tell me what exact the
beg/end
>> used
>> >>>> for?
>> >>>>>
>> >>>>
>> >>>> There are 4 GDAS PREPBUFR files per day - 00Z, 06Z, 12Z, and
18Z.
>>  Each
>> >>>> file contains 6 hours worth of observations, 3 hours +/- the
time
>> >> indicated
>> >>>> in the file name.  So the 12Z file contains
>> >>>> observations from 09Z to 15Z.  When you run Point-Stat, you
need to
>> pass
>> >>>> it one or more observation files that contain the observations
you'd
>> >> like
>> >>>> to use to evaluate the forecast.  The time window you
>> >>>> set in the Point-Stat configuration file is set relative to
the
>> forecast
>> >>>> valid time.  Suppose your forecast is valid at 06Z and you've
set
>> beg =
>> >>>> -3600 and end = 3600 (in seconds).  So that's +/- 1
>> >>>> hour around your forecast valid time.  When Point-Stat sifts
through
>> the
>> >>>> observations, it'll only use the one whose valid time falls
between
>> 05Z
>> >> and
>> >>>> 07Z.  It'll throw all the others out.  It's up to
>> >>>> you to decide how close in time your observations need to be
to your
>> >>>> forecast time.  If you set beg = 0 and end = 0, only those
>> observations
>> >>>> with exactly the same time as the forecast will be used.
>> >>>>
>> >>>>> (5) My 5th question is about the time series comparison. If I
want
>> to
>> >>>> draw
>> >>>>> the plot of MSE versus time at a specific observation point,
what
>> >> should
>> >>>> I
>> >>>>> do? And how to take the data of the specific point out?
>> >>>>>
>> >>>>
>> >>>> When you run Point-Stat, you can output the individual matched
pair
>> >> values
>> >>>> by turning on the MPR output line.  The individual
>> forecast-observation
>> >>>> pairs are contained in that MPR line.  Since
>> >>>> Point-Stat is run once for each valid time, the time-series of
an
>> >>>> individual point is scattered across many different output
files.  If
>> >> you'd
>> >>>> like, you could use the STAT-Analysis tool filter out the
>> >>>> MPR lines that correspond to a single station id.  For
example,
>> here's
>> >> how
>> >>>> you might filter out the matched pairs for 2-m temperature for
a
>> station
>> >>>> named KDEN:
>> >>>>
>> >>>>      stat_analysis -job filter -lookin point_stat/out -line_type MPR \
>> >>>>        -fcst_var TMP -fcst_lev Z2 -column_str OBS_SID KDEN \
>> >>>>        -dump_row TMP_Z2_KDEN_MPR.txt
>> >>>>
>> >>>> That will read all of the files ending in ".stat" from the
directory
>> >> named
>> >>>> "point_stat/out", pick out the MPR lines, only with "TMP" in
the
>> >> FCST_VAR
>> >>>> column, only with "Z2" in the FCST_LEV column, only
>> >>>> with "KDEN" in the OBS_SID column, and write the output to a
file
>> named
>> >>>> "TMP_Z2_KDEN_MPR.txt".  There are filtering switches for each
of the
>> 21
>> >>>> header columns common to each line type.  And you can
>> >>>> use the -column_min, -column_max, -column_eq, and -column_str
>> options to
>> >>>> filter by the data columns (as we've done here for the OBS_SID
>> column).
>> >>>>
>> >>>> Once you have the data stored this way, it's up to you to make
the
>> time
>> >>>> series plot with whatever software you'd like.
>> >>>>
>> >>>>> (6) My last question was about the comparison between two wrf
>> output.
>> >> If
>> >>>> I
>> >>>>> want to know how to compare the result between two wrf output
at the
>> >> same
>> >>>>> time at the same location but with different physics schemes.
>> Whether
>> >> it
>> >>>> is
>> >>>>> just put these two results into the grid-grid comparison and
set
>> >>>>> one wrfoutput as the observation? Or there are other way to
execute
>> >> this
>> >>>>> job?
>> >>>>>
>> >>>>
>> >>>> Sure, you can easily treat one of them as the "forecast" and
the
>> other
>> >> as
>> >>>> the "observation".  That is not strictly verification - more
of just
>> a
>> >>>> comparison.  Hopefully, the output from MET will help
>> >>>> you quantify the differences.  You should just think carefully
about
>> it
>> >>>> when trying to make sense of the output.
>> >>>>
>> >>>>> Thank you so much in advance for your time and help, I really
>> >> appreciate
>> >>>> it!
>> >>>>>
>> >>>>> Best,
>> >>>>>
>> >>>>> Jason
>> >>>>>
>> >>>>
>> >>>>
>> >>
>> >>
>>
>>
>

------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #63455] some small questions regarding the basic concepts about MET
From: John Halley Gotway
Time: Tue Oct 22 10:00:19 2013

Jason,

There are 4 GDAS PrepBufr files for each day, at 00, 06, 12, and 18Z.
Each one contains 6 hours of point observation data, +/- 3 hours
around the timestamp in the file name.  For example, the 12Z GDAS
PrepBufr file contains observations between 09 and 15Z.  When you run
Point-Stat, you set a matching time window in the config file using
the "obs_window" option.  Set "beg" and "end" to a number of seconds
relative to the forecast valid time.  Point-Stat looks at the valid
time of the forecast and builds the time window around it.  Suppose
your forecast is valid at 20130512 00Z and you've set beg = -3600 and
end = 3600 (that's +/- 1 hour).  The matching time window will be from
20130511 23Z to 20130512 01Z.  Point-Stat will try to use any
observations falling between those times when doing its verification.
And it's up to you to decide how big you want that time window to be,
e.g. +/- 1 hour, +/- 5 minutes, or +/- 0 seconds.
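
To make the window arithmetic concrete, here is a minimal Python
sketch (not MET code).  The function names obs_window and in_window
are made up for illustration, and treating both ends of the window as
inclusive is an assumption, chosen so that beg = 0 and end = 0 keeps
only observations at exactly the forecast valid time.

```python
from datetime import datetime, timedelta


def obs_window(fcst_valid, beg, end):
    """Return (start, stop) of the matching window around fcst_valid.

    beg and end are offsets in seconds relative to the forecast valid
    time, e.g. beg = -3600 and end = 3600 for +/- 1 hour.
    """
    return (fcst_valid + timedelta(seconds=beg),
            fcst_valid + timedelta(seconds=end))


def in_window(obs_valid, fcst_valid, beg, end):
    """True if the observation time falls inside the window.

    Both ends are treated as inclusive (an assumption of this sketch).
    """
    start, stop = obs_window(fcst_valid, beg, end)
    return start <= obs_valid <= stop


# Forecast valid at 20130512 00Z with a +/- 1 hour window.
fcst_valid = datetime(2013, 5, 12, 0)
start, stop = obs_window(fcst_valid, -3600, 3600)
print(start, stop)
```

With beg = 0 and end = 0 the window collapses to a single instant, so
only observations stamped with the forecast valid time pass the check.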

Some stations will report multiple times within those 2 hours.  By
default, all of those point observations will be used - even for
stations reporting multiple times.  However, you can change that
behavior by setting the "duplicate_flag" option in the config file.
You can read about this in METv4.1/data/config/README, but setting it
to a value of "SINGLE" will cause only one observation to be
used for each station.  Point-Stat will use the one whose valid time
is closest to the valid time of the forecast.
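
The effect of setting "duplicate_flag" to "SINGLE" can be pictured
with a small Python sketch.  This only illustrates the behavior
described above; it is not how Point-Stat is implemented.  The
function name keep_closest, the tuple layout, the tie-breaking rule,
the station KBOU, and all of the sample values are hypothetical.

```python
from datetime import datetime


def keep_closest(observations, fcst_valid):
    """Keep one observation per station: the one closest in time.

    observations is a list of (station_id, valid_time, value) tuples.
    On a tie, the observation seen first is kept (arbitrary choice).
    """
    best = {}
    for sid, valid, value in observations:
        gap = abs((valid - fcst_valid).total_seconds())
        if sid not in best or gap < best[sid][0]:
            best[sid] = (gap, valid, value)
    return {sid: (valid, value) for sid, (gap, valid, value) in best.items()}


fcst_valid = datetime(2013, 5, 12, 0)
reports = [
    # (station id, observation valid time, 2-m temperature in K)
    ("KDEN", datetime(2013, 5, 11, 23, 53), 281.2),
    ("KDEN", datetime(2013, 5, 12, 0, 10), 280.9),
    ("KDEN", datetime(2013, 5, 12, 0, 53), 280.1),
    ("KBOU", datetime(2013, 5, 12, 0, 0), 279.5),
]
single = keep_closest(reports, fcst_valid)
print(single)
```

Here three KDEN reports fall inside the +/- 1 hour window, and only
the 23:53 report (7 minutes from the forecast valid time) is kept.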

Hope that helps clarify.

Thanks,
John

On 10/22/2013 06:29 AM, Xingcheng Lu via RT wrote:
>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
>
> Hi John,
>
> Forgot to ask you a question about the observation data yesterday.
The NCEP
> ADP Global Upper Air and Surface Weather Observations data, is it
recorded
> at instantaneous time spot, or just recorded the average value over
a
> period? For example,if the data is recorded at 12:00pm, is the data
at this
> time an average from 11:50 to 12:10, or the exact value at 12:00?
Thanks
> again-
>
> Best,
>
> Jason
>
>
> 2013/10/22 Xingcheng Lu <xingchenglu2011 at u.northwestern.edu>
>
>> Hello John,
>>
>> Thank you very much for your script, I will study it in detail.
Also, I
>> really appreciate that you can provide me so many detailed answers,
I have
>> learned a lot from them. Hope that meet you again when I ask the
question
>> in met-help next time.
>>
>> Sincerely,
>>
>> Jason
>>
>>
>> 2013/10/22 John Halley Gotway via RT <met_help at ucar.edu>
>>
>>> Jason,
>>>
>>> Yes, I mean shell scripting.  Users typically call the MET utilities
>>> from within shell scripts (or any other scripting language, like Perl
>>> or Python).  Typically, the shell script just loops over a bunch of
>>> model initialization times and/or forecast lead times, figures out
>>> the name of the forecast and observation files for that time, and
>>> calls the appropriate MET tools.  I usually use the "date" command
>>> when doing that looping.  I've attached a sample script written in
>>> the Korn shell that does the following...
>>>
>>>     - outer loop for the model initialization times (from 2013030100
>>>       to 2013030212 every 12 hours)
>>>     - inner loop for forecast lead times (from 0 to 36 hours every 6
>>>       hours)
>>>     - compute forecast and observation file names
>>>     - print a line stating where you should call the MET tool(s) for
>>>       that time
>>>
>>> Just download and run the attached script to see what I mean.
>>> Hopefully, this script will help get you started.
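
The looping described above can also be sketched in Python rather than
the Korn shell.  This is not the attached script, only a rough
equivalent of the steps listed; the function name verification_times
and the file naming patterns are hypothetical and would need to be
adapted to your own data layout.

```python
from datetime import datetime, timedelta


def verification_times(init_beg, init_end, init_step_h, lead_max_h, lead_step_h):
    """Yield (init, lead_hours, valid) for each init time and lead time."""
    init = init_beg
    while init <= init_end:                                # outer loop: init times
        for lead in range(0, lead_max_h + 1, lead_step_h): # inner loop: lead times
            yield init, lead, init + timedelta(hours=lead)
        init += timedelta(hours=init_step_h)


# Init times 2013030100 to 2013030212 every 12 hours,
# lead times 0 to 36 hours every 6 hours.
cases = list(verification_times(datetime(2013, 3, 1, 0),
                                datetime(2013, 3, 2, 12), 12, 36, 6))

for init, lead, valid in cases:
    # Hypothetical file names, for illustration only.
    fcst_file = f"wrf_{init:%Y%m%d%H}_f{lead:02d}.grb"
    obs_file = f"obs_{valid:%Y%m%d%H}.nc"
    print(f"init={init:%Y%m%d%H} lead={lead:02d} valid={valid:%Y%m%d%H}: "
          f"call MET tool(s) here with {fcst_file} and {obs_file}")
```

The print statement marks the spot where a real script would call
Point-Stat or another MET tool for that valid time.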
>>>
>>> Thanks,
>>> John
>>>
>>>
>>> On 10/21/2013 07:44 AM, Xingcheng Lu via RT wrote:
>>>>
>>>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
>>>>
>>>> Hi John,
>>>>
>>>> Thank you for your answers. You said that I could write some
scripts for
>>>> time series re-run, so what type of scripts I can write, do you
mean
>>> shell
>>>> script? If you have a script sample, it will be much helpful to
me.
>>> Thank
>>>> again for your help!
>>>>
>>>> cheers,
>>>>
>>>> Jason
>>>>
>>>>
>>>> 2013/10/19 John Halley Gotway via RT <met_help at ucar.edu>
>>>>
>>>>> Jason,
>>>>>
>>>>> Answers are inline.
>>>>>
>>>>> Thanks,
>>>>> John
>>>>>
>>>>> On 10/18/2013 09:14 AM, Xingcheng Lu via RT wrote:
>>>>>>
>>>>>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
>>>>>>
>>>>>> Hi John,
>>>>>>
>>>>>> Thank you so much for your detail reply and I have learned a
lot from
>>>>> your
>>>>>> answers.
>>>>>>
>>>>>> According to your answers, I still have 3 questions:
>>>>>>
>>>>>> For the question 3 and 4, can I say, if there are 200
observation
>>> points
>>>>>> within my domain, and I set the time beg=0/end=0, then the
match pair
>>> is
>>>>>> 200. However, when I set beg=3600/end=3600, if each observation
point
>>>>> has 3
>>>>>> values during this time period, then the matching pair should
be 600.
>>> Is
>>>>> my
>>>>>> understanding correct?
>>>>>>
>>>>>
>>>>> Yes, your understanding is correct.  If multiple observations occur
>>>>> at the same location during the time window, by default, they will
>>>>> all be used.  *HOWEVER*, we realize that this behavior is not always
>>>>> desirable, so there's an option in the configuration file to control
>>>>> this logic.  Please take a look in the file
>>>>> "METv4.1/data/config/README" for a description of the
>>>>> "duplicate_flag" option.  Setting its value to "SINGLE" will cause
>>>>> only a single observation value for each location to be used.  The
>>>>> one whose valid time is closest to that of the forecast time is
>>>>> chosen.
>>>>>
>>>>>> For the question 5, if I want to do the time series comparison
for 12
>>>>>> hours at the same observation spot(with single hour wrf output
and set
>>>>>> beg/end=0), whether I just need to rerun the MET for 12 times
for
>>>>>> single different hour?
>>>>>
>>>>> The Point-Stat tool is intended to be run once for each valid
time
>>> being
>>>>> evaluated.  So if you're verifying 12 different times, you'd run
the
>>> tool
>>>>> 12 times.  Of course, we don't intend that people run
>>>>> this manually themselves on the command line.  Instead, we
expect that
>>>>> you'd run it via a script of some sort that loops through the
times
>>> you'd
>>>>> like to evaluate.
>>>>>
>>>>> Often MET users are interested in the performance of their model
at
>>> more
>>>>> than just a single location and for more than a single
variable/level.
>>>   So
>>>>> there's more "work to do" than just computing a
>>>>> single matched pair value and writing it out.
>>>>>
>>>>> Here are a few pieces of info you may find helpful...
>>>>> - The MET configuration files support the use of environment
>>>>>   variables.  I often find it helpful when scripting up calls to the
>>>>>   MET tools to set some environment variable values and then
>>>>>   reference them in the configuration files I pass to the tools.
>>>>>   That will enable you to control the behavior of the tools without
>>>>>   having to maintain many different versions of the config files.
>>>>> - If you happen to have GRIB files that contain data for multiple
>>>>>   forecast hours (all 12 of your output times for example), you could
>>>>>   actually call Point-Stat once and evaluate them all at once.  But
>>>>>   the configuration files get a lot messier - in the "fcst.field"
>>>>>   setting you'd need to explicitly specify the valid time of the data
>>>>>   to be evaluated so that Point-Stat knows which fields to use.  I
>>>>>   typically find calling Point-Stat once per valid time is easier.
>>>>> - If you're interested in the performance of your model at a specific
>>>>>   set of station IDs, consider using the "mask.sid" option.  Rather
>>>>>   than defining your verification area spatially (as the "grid" and
>>>>>   "poly" options do), the station id (sid) option is just a list of
>>>>>   the stations over which you'd like to compute statistics.
>>>>>
>>>>>>
>>>>>> For the question 5, "stat_analysis -job filter -lookin
point_stat/out
>>>>>> -line_type MPR -fcst_var TMP -fcst_lev Z2 -column_str OBS_SID
KDEN
>>>>>> -dump_row TMP_Z2_KDEN_MPR.txt"  Can I type this on the Linux
command
>>> line
>>>>>> or I need to set this in the config file for stat analysis?
Because I
>>>>> found
>>>>>> that this line is a little bit similar to the jobs setting in
the
>>> config
>>>>>> file, but I cannot find the format you type in the manual.
Whether
>>> there
>>>>> is
>>>>>> any resource to further introduce this command line format?
>>>>>
>>>>> The STAT-Analysis tool can be run with or without a config file.  If
>>>>> you provide a config file, you have to define the job(s) to be run in
>>>>> it.  If not, you define a *SINGLE* job to be run on the command
>>>>> line.  Usually, I run single jobs on the command line until I figure
>>>>> out the set of analysis jobs I'd like to run via a script.  Then I
>>>>> move them into a config file and run them all with one call to
>>>>> STAT-Analysis.  Look in the file "METv4.1/data/config/README" for the
>>>>> section on "Job command FILTERING".
>>>>>
>>>>>>
>>>>>> By the way, do I have to use the grids with the same resolution
if I
>>> want
>>>>>> to do the grid-grid comparison? Also, because my research focus
is on
>>>>>> global scale, do you know whether there is any daily grid
observation
>>>>> data
>>>>>> for the global scale?
>>>>>
>>>>> Yes, for the grid-to-grid comparisons performed by the Grid-Stat,
>>>>> MODE, Series-Analysis, and Wavelet-Stat tools, it's the user's
>>>>> responsibility to put their forecast and observation data on the
>>>>> same grid.  In future versions of MET, we'd like to add tools to help
>>>>> users do this.  But currently there isn't any support for this
>>>>> directly in MET.  But for GRIB1 data, the copygb utility can be used
>>>>> to regrid things.  Here's a portion of the MET online tutorial that
>>>>> discusses this:
>>>>>
>>>>> http://www.dtcenter.org/met/users/support/online_tutorial/METv4.1/copygb/index.php
>>>>>
>>>>> Availability of appropriate observations is always an issue.  And I
>>>>> don't have a magic bullet for you.  You could always compare your
>>>>> model output to a model analysis - like the global GFS analysis.
>>>>> But that'll just tell you how well your model output matches GFS,
>>>>> which has its own set of errors.
>>>>>
>>>>>>
>>>>>> Thanks again for your kind help!
>>>>>>
>>>>>> Sincerely,
>>>>>>
>>>>>> Jason
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2013/10/18 John Halley Gotway via RT <met_help at ucar.edu>
>>>>>>
>>>>>>> Jason,
>>>>>>>
>>>>>>> I've answered your questions inline below.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> John Halley Gotway
>>>>>>> met_help at ucar.edu
>>>>>>>
>>>>>>> On 10/17/2013 08:54 AM, Xingcheng Lu via RT wrote:
>>>>>>>>
>>>>>>>> Thu Oct 17 08:54:28 2013: Request 63455 was acted upon.
>>>>>>>> Transaction: Ticket created by
xingchenglu2011 at u.northwestern.edu
>>>>>>>>            Queue: met_help
>>>>>>>>          Subject: some small questions regarding the basic
concepts
>>> about
>>>>>>> MET
>>>>>>>>            Owner: Nobody
>>>>>>>>       Requestors: xingchenglu2011 at u.northwestern.edu
>>>>>>>>           Status: new
>>>>>>>>      Ticket <URL:
>>>>> https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
>>>>>>>>
>>>>>>>>
>>>>>>>> Dear Sir/Madam,
>>>>>>>>
>>>>>>>> I am a new user of MET and I have several small questions
want to
>>> ask
>>>>>>> about.
>>>>>>>>
>>>>>>>> (1): In the manual at Table 4-3, I am a little bit confused
about
>>>>>>> Forecast
>>>>>>>> and Observation Rate. According to the contingency table,
whether
>>>>>>>> forecast rate=(n11+n00)/T? And how about observation rate, I
don't
>>> know
>>>>>>>> what it should be compared to.
>>>>>>>
>>>>>>> The forecast rate is just the fraction of grid points in the
>>>>>>> forecast domain at which the event is occurring.  The observation
>>>>>>> rate is the fraction of grid points in the observation domain at
>>>>>>> which the event is occurring.  The observation rate is also known
>>>>>>> as the base rate (BASER from the CTS line shown in Table 4-5).  The
>>>>>>> FHO and CTC output lines really contain the same information.  At
>>>>>>> the DTC, we prefer to use the counts from the CTC line, while NCEP
>>>>>>> prefers to use the ratios of the counts given in the FHO line.
>>>>>>>
>>>>>>> If your forecast were perfect, the F_RATE would be identical to the
>>>>>>> O_RATE.
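
In terms of contingency table counts, a short Python sketch of the two
rates might look like this.  It assumes the index convention n11 =
forecast yes / observed yes, n10 = forecast yes / observed no, n01 =
forecast no / observed yes, n00 = forecast no / observed no; the
function name fho_rates and the sample counts are made up.  Under that
convention, (n11+n00)/T is the fraction of correct forecasts, not the
forecast rate.

```python
def fho_rates(n11, n10, n01, n00):
    """Return (forecast rate, observation rate) from 2x2 table counts.

    Assumed convention: first index = forecast, second = observation,
    1 = event occurred, 0 = event did not occur.
    """
    total = n11 + n10 + n01 + n00
    f_rate = (n11 + n10) / total   # event forecast, whether observed or not
    o_rate = (n11 + n01) / total   # event observed, whether forecast or not
    return f_rate, o_rate


# Hypothetical counts for 200 matched pairs.
f_rate, o_rate = fho_rates(n11=50, n10=30, n01=20, n00=100)
print(f_rate, o_rate)
```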
>>>>>>>
>>>>>>>>
>>>>>>>> (2):My second question is about the grid/poly setting in the
>>>>>>>> Pointstatconfig file. I know that the poly is used to select
the
>>>>> specific
>>>>>>>> area for verification, however, I don't know what exactly is
the
>>> grid
>>>>>>>> option used for, is it also used to choose the place for
>>> verification?
>>>>> If
>>>>>>>> yes, what's the difference between the poly?
>>>>>>>
>>>>>>> Yes, the grid masking behaves the same way that the polyline masking
>>>>>>> does.  It's just another way of specifying a geographic subset of
>>>>>>> the domain.  It's generally much less useful than the polyline
>>>>>>> masking since only the pre-defined NCEP grids are supported, and
>>>>>>> most users don't find that very helpful.
>>>>>>>
>>>>>>>>
>>>>>>>> (3): I am a little bit confused about the 'match pair' mean.
I have
>>> two
>>>>>>>> understandings:(A) If I set temp>273 at Z2, both the
observation
>>> file
>>>>> and
>>>>>>>> forecast output have valid value at Z2, no matter they are
larger
>>> than
>>>>>>> 273
>>>>>>>> or not;(B) Or it means that the value at Z2 from both
observation
>>> and
>>>>>>>> forecast meet the temp>273 requirement?
>>>>>>>
>>>>>>> A "matched pair" just means a pair of forecast and observation
>>>>>>> values that go together.  Suppose you have 200 point observations
>>>>>>> of 2-meter temperature that fall in your domain.  For each
>>>>>>> observation value, Point-Stat computes an interpolated forecast
>>>>>>> value for that observation location.  So you now have 200 pairs of
>>>>>>> forecast and observation values.  Using those 200 matched pairs,
>>>>>>> you could define continuous statistics directly (CNT output line).
>>>>>>> Or you could choose a threshold (like >273) and define a 2x2
>>>>>>> contingency table.  With that 2x2 contingency table, you can dump
>>>>>>> out the counts in the CTC line and/or the corresponding statistics
>>>>>>> in the CTS line.
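
As a rough Python illustration of the step from matched pairs to a 2x2
table (not Point-Stat's own code): every matched pair is counted, and
the threshold only decides which of the four cells it lands in.  The
function name contingency_counts and the sample pairs are hypothetical;
the cell names follow the FY_OY, FY_ON, FN_OY, FN_ON columns of the CTC
line.

```python
def contingency_counts(pairs, threshold):
    """Count the 2x2 table for the event 'value > threshold'.

    pairs is a list of (forecast, observation) values.  Every pair is
    counted in exactly one of the four cells.
    """
    counts = {"fy_oy": 0, "fy_on": 0, "fn_oy": 0, "fn_on": 0}
    for fcst, obs in pairs:
        fcst_part = "fy" if fcst > threshold else "fn"
        obs_part = "oy" if obs > threshold else "on"
        counts[fcst_part + "_" + obs_part] += 1
    return counts


# Four hypothetical 2-m temperature pairs (forecast, observation) in K.
pairs = [(275.0, 274.0), (272.0, 274.0), (276.0, 271.0), (270.0, 269.0)]
counts = contingency_counts(pairs, 273.0)
print(counts)
```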
>>>>>>>
>>>>>>> I'm not sure if that answers your question.
>>>>>>>
>>>>>>>>
>>>>>>>> (4): My another question is about the beg/end time setting.
The
>>>>>>> observation
>>>>>>>> data I downloaded was DS 337.0 NCEP ADP Global Upper Air and
Surface
>>>>>>>> Observations. Is the single file of this type of data only
have
>>> valid
>>>>>>>> observation for a specific time spot, like if I download the
data
>>> for
>>>>>>>> 2006.07.18_18:00:00, the file only contains the observation
data for
>>>>> that
>>>>>>>> time spot? If so, when combine with wrf single hour output,
can I
>>> just
>>>>>>> set
>>>>>>>> this option to 0 and 0? Could you tell me what exact the
beg/end
>>> used
>>>>>>> for?
>>>>>>>>
>>>>>>>
>>>>>>> There are 4 GDAS PREPBUFR files per day - 00Z, 06Z, 12Z, and 18Z.
>>>>>>> Each file contains 6 hours worth of observations, 3 hours +/- the
>>>>>>> time indicated in the file name.  So the 12Z file contains
>>>>>>> observations from 09Z to 15Z.  When you run Point-Stat, you need to
>>>>>>> pass it one or more observation files that contain the observations
>>>>>>> you'd like to use to evaluate the forecast.  The time window you
>>>>>>> set in the Point-Stat configuration file is set relative to the
>>>>>>> forecast valid time.  Suppose your forecast is valid at 06Z and
>>>>>>> you've set beg = -3600 and end = 3600 (in seconds).  So that's +/- 1
>>>>>>> hour around your forecast valid time.  When Point-Stat sifts
>>>>>>> through the observations, it'll only use those whose valid times
>>>>>>> fall between 05Z and 07Z.  It'll throw all the others out.  It's up
>>>>>>> to you to decide how close in time your observations need to be to
>>>>>>> your forecast time.  If you set beg = 0 and end = 0, only those
>>>>>>> observations with exactly the same time as the forecast will be
>>>>>>> used.
>>>>>>>
>>>>>>>> (5) My 5th question is about the time series comparison. If I
want
>>> to
>>>>>>> draw
>>>>>>>> the plot of MSE versus time at a specific observation point,
what
>>>>> should
>>>>>>> I
>>>>>>>> do? And how to take the data of the specific point out?
>>>>>>>>
>>>>>>>
>>>>>>> When you run Point-Stat, you can output the individual matched pair
>>>>>>> values by turning on the MPR output line.  The individual
>>>>>>> forecast-observation pairs are contained in that MPR line.  Since
>>>>>>> Point-Stat is run once for each valid time, the time-series of an
>>>>>>> individual point is scattered across many different output files.
>>>>>>> If you'd like, you could use the STAT-Analysis tool to filter out
>>>>>>> the MPR lines that correspond to a single station id.  For example,
>>>>>>> here's how you might filter out the matched pairs for 2-m
>>>>>>> temperature for a station named KDEN:
>>>>>>>
>>>>>>>       stat_analysis -job filter -lookin point_stat/out -line_type MPR \
>>>>>>>         -fcst_var TMP -fcst_lev Z2 -column_str OBS_SID KDEN \
>>>>>>>         -dump_row TMP_Z2_KDEN_MPR.txt
>>>>>>>
>>>>>>> That will read all of the files ending in ".stat" from the directory
>>>>>>> named "point_stat/out", pick out the MPR lines, only with "TMP" in
>>>>>>> the FCST_VAR column, only with "Z2" in the FCST_LEV column, only
>>>>>>> with "KDEN" in the OBS_SID column, and write the output to a file
>>>>>>> named "TMP_Z2_KDEN_MPR.txt".  There are filtering switches for each
>>>>>>> of the 21 header columns common to each line type.  And you can use
>>>>>>> the -column_min, -column_max, -column_eq, and -column_str options to
>>>>>>> filter by the data columns (as we've done here for the OBS_SID
>>>>>>> column).
>>>>>>>
>>>>>>> Once you have the data stored this way, it's up to you to make the
>>>>>>> time series plot with whatever software you'd like.
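
As one possible next step, here is a minimal Python sketch that turns
filtered MPR lines into a squared-error time series ready for plotting.
It assumes the dumped file is whitespace-delimited and begins with a
header row that names the columns, including FCST_VALID_BEG, FCST and
OBS; check your own file before relying on that.  The function name
squared_error_series and the sample lines are made up for illustration.

```python
from collections import defaultdict


def squared_error_series(lines):
    """Return [(valid_time, mean squared error)] sorted by valid time.

    lines: iterable of whitespace-delimited text lines, the first being
    a header row (assumed to include FCST_VALID_BEG, FCST and OBS).
    """
    rows = [line.split() for line in lines if line.strip()]
    header, data = rows[0], rows[1:]
    i_time = header.index("FCST_VALID_BEG")
    i_fcst = header.index("FCST")
    i_obs = header.index("OBS")
    errors = defaultdict(list)
    for row in data:
        diff = float(row[i_fcst]) - float(row[i_obs])
        errors[row[i_time]].append(diff ** 2)
    # YYYYMMDD_HHMMSS strings sort chronologically.
    return [(t, sum(v) / len(v)) for t, v in sorted(errors.items())]


# Made-up sample lines with only the columns needed here.
sample = [
    "FCST_VALID_BEG OBS_SID FCST OBS",
    "20130512_010000 KDEN 276.0 274.0",
    "20130512_000000 KDEN 275.0 274.0",
]
series = squared_error_series(sample)
print(series)
```

For a real run, the same function could be given the open file object
for TMP_Z2_KDEN_MPR.txt in place of the sample list.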
>>>>>>>
>>>>>>>> (6) My last question was about the comparison between two wrf
>>> output.
>>>>> If
>>>>>>> I
>>>>>>>> want to know how to compare the result between two wrf output
at the
>>>>> same
>>>>>>>> time at the same location but with different physics schemes.
>>> Whether
>>>>> it
>>>>>>> is
>>>>>>>> just put these two results into the grid-grid comparison and
set
>>>>>>>> one wrfoutput as the observation? Or there are other way to
execute
>>>>> this
>>>>>>>> job?
>>>>>>>>
>>>>>>>
>>>>>>> Sure, you can easily treat one of them as the "forecast" and the
>>>>>>> other as the "observation".  That is not strictly verification -
>>>>>>> more of just a comparison.  Hopefully, the output from MET will help
>>>>>>> you quantify the differences.  You should just think carefully about
>>>>>>> it when trying to make sense of the output.
>>>>>>>
>>>>>>>> Thank you so much in advance for your time and help, I really
>>>>> appreciate
>>>>>>> it!
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Jason
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>>
>>

------------------------------------------------
Subject: some small questions regarding the basic concepts about MET
From: Xingcheng Lu
Time: Tue Oct 22 20:01:19 2013

Hi John,

Thank you for your reply. I understand how beg/end work, but my
question is about the observation data: I am not sure where the values
reported in the PrepBufr file come from. My labmate told me that there
are two types of observation data: one is an instantaneous record, and
the other is an averaged record. For an instantaneous record, the data
recorded at 12:00:00 is the true value at 12:00:00. For some averaged
records, the value for 12:00:00 may be the average of the values
recorded between 11:50:00 and 12:10:00, a 20-minute average (likewise,
the value for 12:30:00 would be the average between 12:20:00 and
12:40:00). I have checked the GDAS website but cannot find any
information about this. Sorry for not expressing this clearly in my
last email. Thank you!

Best,

Jason


2013/10/23 John Halley Gotway via RT <met_help at ucar.edu>

> Jason,
>
> There are 4 GDAS PrepBufr files for each day, at 00, 06, 12, and 18Z.
> Each one contains 6 hours of point observation data, +/- 3 hours
> around the timestamp in the file name.  For example, the 12Z GDAS
> PrepBufr file contains observations between 09 and 15Z.  When you run
> Point-Stat, you set a matching time window in the config file using
> the "obs_window" option.  Set "beg" and "end" to a number of seconds
> relative to the forecast valid time.  Point-Stat looks at the valid
> time of the forecast and builds the time window around it.  Suppose
> your forecast is valid at 20130512 00Z and you've set beg = -3600 and
> end = 3600 (that's +/- 1 hour).  The matching time window will be from
> 20130511 23Z to 20130512 01Z.  Point-Stat will try to use any
> observations falling between those times when doing its verification.
> And it's up to you to decide how big you want that time window to be,
> e.g. +/- 1 hour, +/- 5 minutes, or +/- 0 seconds.
>
> Some stations will report multiple times within those 2 hours.  By
> default, all of those point observations will be used - even for
stations
> reporting multiple times.  However, you can change that
> behavior by setting the "duplicate_flag" option in the config file.
You
> can read about this in METv4.1/data/config/README, but setting it to
a
> value of "SINGLE" will cause only one observation to be
> used for each station.  Point-Stat will use the one whose valid time
is
> closest to the valid time of the forecast.
>
> Hope that helps clarify.
>
> Thanks,
> John
>
> On 10/22/2013 06:29 AM, Xingcheng Lu via RT wrote:
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
> >
> > Hi John,
> >
> > Forgot to ask you a question about the observation data yesterday.
The
> NCEP
> > ADP Global Upper Air and Surface Weather Observations data, is it
> recorded
> > at instantaneous time spot, or just recorded the average value
over a
> > period? For example,if the data is recorded at 12:00pm, is the
data at
> this
> > time an average from 11:50 to 12:10, or the exact value at 12:00?
Thanks
> > again-
> >
> > Best,
> >
> > Jason
> >
> >
> > 2013/10/22 Xingcheng Lu <xingchenglu2011 at u.northwestern.edu>
> >
> >> Hello John,
> >>
> >> Thank you very much for your script, I will study it in detail.
Also, I
> >> really appreciate that you can provide me so many detailed
answers, I
> have
> >> learned a lot from them. Hope that meet you again when I ask the
> question
> >> in met-help next time.
> >>
> >> Sincerely,
> >>
> >> Jason
> >>
> >>
> >> 2013/10/22 John Halley Gotway via RT <met_help at ucar.edu>
> >>
> >>> Jason,
> >>>
> >>> Yes, I mean shell scripting.  Users typically call the MET
utilities
> from
> >>> within shell scripts (or any other scripting language, like PERL
or
> >>> Python).  Typically, the shell script just loops over a
> >>> bunch of model initialization times and/or forecast lead times,
figures
> >>> out the name of the forecast and observation files for that
time, and
> calls
> >>> the appropriate MET tools.  I usually use the "date"
> >>> command when doing that looping.  I've attached a sample script
written
> >>> in the korn shell that does the following...
> >>>
> >>>     - outer loop for the model initialization times (from 2013030100
> >>>       to 2013030212 every 12 hours)
> >>>     - inner loop for forecast lead times (from 0 to 36 hours every 6
> >>>       hours)
> >>>     - compute forecast and observation file names
> >>>     - print a line stating where you should call the MET tool(s)
for
> that
> >>> time
> >>>
> >>> Just download and run the attached script to see what I mean.
>  Hopefully,
> >>> this script will help get you started.
> >>>
> >>> Thanks,
> >>> John
> >>>
> >>>
> >>> On 10/21/2013 07:44 AM, Xingcheng Lu via RT wrote:
> >>>>
> >>>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
> >>>>
> >>>> Hi John,
> >>>>
> >>>> Thank you for your answers. You said that I could write some
scripts
> for
> >>>> time series re-run, so what type of scripts I can write, do you
mean
> >>> shell
> >>>> script? If you have a script sample, it will be much helpful to
me.
> >>> Thank
> >>>> again for your help!
> >>>>
> >>>> cheers,
> >>>>
> >>>> Jason
> >>>>
> >>>>
> >>>> 2013/10/19 John Halley Gotway via RT <met_help at ucar.edu>
> >>>>
> >>>>> Jason,
> >>>>>
> >>>>> Answers are inline.
> >>>>>
> >>>>> Thanks,
> >>>>> John
> >>>>>
> >>>>> On 10/18/2013 09:14 AM, Xingcheng Lu via RT wrote:
> >>>>>>
> >>>>>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455
>
> >>>>>>
> >>>>>> Hi John,
> >>>>>>
> >>>>>> Thank you so much for your detail reply and I have learned a
lot
> from
> >>>>> your
> >>>>>> answers.
> >>>>>>
> >>>>>> According to your answers, I still have 3 questions:
> >>>>>>
> >>>>>> For the question 3 and 4, can I say, if there are 200
observation
> >>> points
> >>>>>> within my domain, and I set the time beg=0/end=0, then the
match
> pair
> >>> is
> >>>>>> 200. However, when I set beg=3600/end=3600, if each
observation
> point
> >>>>> has 3
> >>>>>> values during this time period, then the matching pair should
be
> 600.
> >>> Is
> >>>>> my
> >>>>>> understanding correct?
> >>>>>>
> >>>>>
> >>>>> Yes, your understanding is correct.  If multiple observations
occur
> at
> >>> the
> >>>>> same location during the time window, by default, they will
all be
> >>> used.
> >>>>>    *HOWEVER*, we realize that this behavior is not
> >>>>> always desirable, so there's an option in the configuration
file to
> >>>>> control this logic.  Please take a look in the file
> >>>>> "METv4.1/data/config/README" for a description of the
> "duplicate_flag"
> >>>>> option.
> >>>>> Setting it's value to "SINGLE" will cause only a single
observation
> >>> value
> >>>>> for each location to be used.  The one whose valid time is
closest to
> >>> that
> >>>>> of the forecast time is chosen.
> >>>>>
> >>>>>> For the question 5, if I want to do the time series
comparison for
> 12
> >>>>>> hours at the same observation spot(with single hour wrf
output and
> set
> >>>>>> beg/end=0), whether I just need to rerun the MET for 12 times
for
> >>>>>> single different hour?
> >>>>>
> >>>>> The Point-Stat tool is intended to be run once for each valid
time
> >>> being
> >>>>> evaluated.  So if you're verifying 12 different times, you'd
run the
> >>> tool
> >>>>> 12 times.  Of course, we don't intend that people run
> >>>>> this manually themselves on the command line.  Instead, we
expect
> that
> >>>>> you'd run it via a script of some sort that loops through the
times
> >>> you'd
> >>>>> like to evaluate.
> >>>>>
> >>>>> Often MET users are interested in the performance of their
model at
> >>> more
> >>>>> than just a single location and for more than a single
> variable/level.
> >>>   So
> >>>>> there's more "work to do" than just computing a
> >>>>> single matched pair value and writing it out.
> >>>>>
> >>>>> Here's a few pieces of info you may find helpful...
> >>>>> - The MET configuration files support the use of environment
> >>> variables.  I
> >>>>> often find it helpful when scripting up calls to the MET tools
to set
> >>> some
> >>>>> environment variable values and then reference
> >>>>> them in the configuration files I pass to the tools.  That
will
> enable
> >>> you
> >>>>> to control the behavior of the tools without having to
maintain many
> >>>>> different versions of the config files.
> >>>>> - If you happen to have GRIB files that contain data for
multiple
> >>> forecast
> >>>>> hours (all 12 of your output times for example), you could
actually
> >>> call
> >>>>> Point-Stat once and evaluate them all at once.  But
> >>>>> the configuration files get a lot messier - in the
"fcst.field"
> setting
> >>>>> you'd need to explicitly specify the valid time of the data to
be
> >>> evaluated
> >>>>> so that Point-Stat know which fields to use.  I
> >>>>> typically find calling Point-Stat once per valid time is
easier.
> >>>>> - If you're interested in the performance of your model at a
specific
> >>> set
> >>>>> of station ID's, consider using the "mask.sid" option.  Rather
than
> >>>>> defining your verification area spatially (as the "grid"
> >>>>> and "poly" options do), the station id (sid) option is just a
list of
> >>> the
> >>>>> stations over which you'd like to compute statistics.
> >>>>>
> >>>>>>
> >>>>>> For the question 5, "stat_analysis -job filter -lookin
> point_stat/out
> >>>>>> -line_type MPR -fcst_var TMP -fcst_lev Z2 -column_str OBS_SID
KDEN
> >>>>>> -dump_row TMP_Z2_KDEN_MPR.txt"  Can I type this on the Linux
command
> >>> line
> >>>>>> or I need to set this in the config file for stat analysis?
Because
> I
> >>>>> found
> >>>>>> that this line is a little bit similar to the jobs setting in
the
> >>> config
> >>>>>> file, but I cannot find the format you type in the manual.
Whether
> >>> there
> >>>>> is
> >>>>>> any resource to further introduce this command line format?
> >>>>>
> >>>>> The STAT-Analysis tool can be run with or without a config
file.  If
> >>> you
> >>>>> provide a config file, you have to define the job(s) to be run in
it.
>  If
> >>> not,
> >>>>> you define a *SINGLE* job to be run on the command
> >>>>> line.  Usually, I run single jobs on the command line until I
figure
> >>> out
> >>>>> the set of analysis jobs I'd like to run via a script.  Then I
move
> >>> them
> >>>>> into a config file and run them all with one call to
> >>>>> STAT-Analysis.  Look in the file "METv4.1/data/config/README"
for the
> >>>>> section on "Job command FILTERING".
> >>>>>
> >>>>>>
> >>>>>> By the way, do I have to use the grids with the same
resolution if I
> >>> want
> >>>>>> to do the grid-grid comparison? Also, because my research
focus is
> on
> >>>>>> global scale, do you know whether there is any daily grid
> observation
> >>>>> data
> >>>>>> for the global scale?
> >>>>>
> >>>>> Yes, for the grid-to-grid comparisons performed by the Grid-
Stat,
> MODE,
> >>>>> Series-Analysis, and Wavelet-Stat tools, it's the user's
> >>> responsibility to
> >>>>> put their forecast and observation data on the same
> >>>>> grid.  In future versions of MET, we'd like to add tools to
help
> users
> >>> do
> >>>>> this.  But currently there isn't any support for this directly
in
> MET.
> >>>   But
> >>>>> for GRIB1 data, the copygb utility can be used to
> >>>>> regrid things.  Here's a portion of the MET online tutorial
that
> >>> discusses
> >>>>> this:
> >>>>>
> >>>>>
> >>>
>
http://www.dtcenter.org/met/users/support/online_tutorial/METv4.1/copygb/index.php
> >>>>>
> >>>>> Availability of appropriate observations is always an issue.
And I
> >>> don't
> >>>>> have a magic bullet for you.  You could always compare your
model
> >>> output to
> >>>>> a model analysis - like the global GFS analysis.
> >>>>> But that'll just tell you how well your model output matches
GFS,
> which
> >>>>> has its own set of errors.
> >>>>>
> >>>>>>
> >>>>>> Thanks again for your kind help!
> >>>>>>
> >>>>>> Sincerely,
> >>>>>>
> >>>>>> Jason
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> 2013/10/18 John Halley Gotway via RT <met_help at ucar.edu>
> >>>>>>
> >>>>>>> Jason,
> >>>>>>>
> >>>>>>> I've answered your questions inline below.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> John Halley Gotway
> >>>>>>> met_help at ucar.edu
> >>>>>>>
> >>>>>>> On 10/17/2013 08:54 AM, Xingcheng Lu via RT wrote:
> >>>>>>>>
> >>>>>>>> Thu Oct 17 08:54:28 2013: Request 63455 was acted upon.
> >>>>>>>> Transaction: Ticket created by
xingchenglu2011 at u.northwestern.edu
> >>>>>>>>            Queue: met_help
> >>>>>>>>          Subject: some small questions regarding the basic
> concepts
> >>> about
> >>>>>>> MET
> >>>>>>>>            Owner: Nobody
> >>>>>>>>       Requestors: xingchenglu2011 at u.northwestern.edu
> >>>>>>>>           Status: new
> >>>>>>>>      Ticket <URL:
> >>>>> https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Dear Sir/Madam,
> >>>>>>>>
> >>>>>>>> I am a new user of MET and I have several small questions
want to
> >>> ask
> >>>>>>> about.
> >>>>>>>>
> >>>>>>>> (1): In the manual at Table 4-3, I am a little bit confused
about
> >>>>>>> Forecast
> >>>>>>>> and Observation Rate. According to the contingency table,
whether
> >>>>>>>> forecast rate=(n11+n00)/T? And how about observation rate,
I don't
> >>> know
> >>>>>>>> what it should be compared to.
> >>>>>>>
> >>>>>>> The forecast rate is just the fraction of grid points in the
> forecast
> >>>>>>> domain at which the event is occurring.  The observation
rate is
> the
> >>>>>>> fraction of grid points in the observation domain at which
> >>>>>>> the event is occurring.  The observation rate is also known
as the
> >>> base
> >>>>>>> rate (BASER from the CTS line shown in Table 4-5).  The FHO
and CTC
> >>>>> output
> >>>>>>> lines really contain the same information.  At the
> >>>>>>> DTC, we prefer to use the counts from the CTC line, while
NCEP
> >>> prefers
> >>>>> to
> >>>>>>> use the ratios of the counts given in the FHO line.
> >>>>>>>
> >>>>>>> If your forecast were perfect, the F_RATE would be identical
to the
> >>>>> O_RATE.
> >>>>>>>
> >>>>>>>>
> >>>>>>>> (2):My second question is about the grid/poly setting in
the
> >>>>>>>> Pointstatconfig file. I know that the poly is used to
select the
> >>>>> specific
> >>>>>>>> area for verification, however, I don't know what exactly
is the
> >>> grid
> >>>>>>>> option used for, is it also used to choose the place for
> >>> verification?
> >>>>> If
> >>>>>>>> yes, what's the difference between the poly?
> >>>>>>>
> >>>>>>> Yes, the grid masking behaves the same way that the polyline
> masking
> >>>>> does.
> >>>>>>>     It's just another way of specifying a geographic subset
of the
> >>> domain.
> >>>>>>>     It's generally much less useful than the polyline
> >>>>>>> masking since only the pre-defined NCEP grids are supported,
and
> most
> >>>>>>> users don't find that very helpful.
> >>>>>>>
> >>>>>>>>
> >>>>>>>> (3): I am a little bit confused about the 'match pair'
mean. I
> have
> >>> two
> >>>>>>>> understandings:(A) If I set temp>273 at Z2, both the
observation
> >>> file
> >>>>> and
> >>>>>>>> forecast output have valid value at Z2, no matter they are
larger
> >>> than
> >>>>>>> 273
> >>>>>>>> or not;(B) Or it means that the value at Z2 from both
observation
> >>> and
> >>>>>>>> forecast meet the temp>273 requirement?
> >>>>>>>
> >>>>>>> A "matched pair" just means a pair of forecast and
observation
> values
> >>>>> that
> >>>>>>> go together.  Suppose you have 200 point observations of 2-
meter
> >>>>>>> temperature that fall in your domain.  For each observation
> >>>>>>> value, Point-Stat computes an interpolated forecast value
for that
> >>>>>>> observation location.  So you now have 200 pairs of forecast
and
> >>>>>>> observation values.  Using those 200 matched pairs, you
could
> define
> >>>>>>> continuous statistics directly (CNT output line).  Or you
could
> >>> choose a
> >>>>>>> threshold (like >273) and define a 2x2 contingency table.
With
> that
> >>> 2x2
> >>>>>>> contingency table, you can dump out the counts in the
> >>>>>>> CTC line and/or the corresponding statistics in the CTS
line.
> >>>>>>>
> >>>>>>> I'm not sure if that answers your question.
> >>>>>>>
> >>>>>>>>
> >>>>>>>> (4): My another question is about the beg/end time setting.
The
> >>>>>>> observation
> >>>>>>>> data I downloaded was DS 337.0 NCEP ADP Global Upper Air
and
> Surface
> >>>>>>>> Observations. Is the single file of this type of data only
have
> >>> valid
> >>>>>>>> observation for a specific time spot, like if I download
the data
> >>> for
> >>>>>>>> 2006.07.18_18:00:00, the file only contains the observation
data
> for
> >>>>> that
> >>>>>>>> time spot? If so, when combine with wrf single hour output,
can I
> >>> just
> >>>>>>> set
> >>>>>>>> this option to 0 and 0? Could you tell me what exact the
beg/end
> >>> used
> >>>>>>> for?
> >>>>>>>>
> >>>>>>>
> >>>>>>> There are 4 GDAS PREPBUFR files per day - 00Z, 06Z, 12Z, and
18Z.
> >>>   Each
> >>>>>>> file contains 6 hours worth of observations, 3 hours +/- the
time
> >>>>> indicated
> >>>>>>> in the file name.  So the 12Z file contains
> >>>>>>> observations from 09Z to 15Z.  When you run Point-Stat, you
need to
> >>> pass
> >>>>>>> it one or more observation files that contain the
observations
> you'd
> >>>>> like
> >>>>>>> to use to evaluate the forecast.  The time window you
> >>>>>>> set in the Point-Stat configuration file is set relative to
the
> >>> forecast
> >>>>>>> valid time.  Suppose your forecast is valid at 06Z and
you've set
> >>> beg =
> >>>>>>> -3600 and end = 3600 (in seconds).  So that's +/- 1
> >>>>>>> hour around your forecast valid time.  When Point-Stat sifts
> through
> >>> the
> >>>>>>> observations, it'll only use those whose valid time falls
between
> >>> 05Z
> >>>>> and
> >>>>>>> 07Z.  It'll throw all the others out.  It's up to
> >>>>>>> you to decide how close in time your observations need to be
to
> your
> >>>>>>> forecast time.  If you set beg = 0 and end = 0, only those
> >>> observations
> >>>>>>> with exactly the same time as the forecast will be used.
> >>>>>>>
> >>>>>>>> (5) My 5th question is about the time series comparison. If
I want
> >>> to
> >>>>>>> draw
> >>>>>>>> the plot of MSE versus time at a specific observation
point, what
> >>>>> should
> >>>>>>> I
> >>>>>>>> do? And how to take the data of the specific point out?
> >>>>>>>>
> >>>>>>>
> >>>>>>> When you run Point-Stat, you can output the individual
matched pair
> >>>>> values
> >>>>>>> by turning on the MPR output line.  The individual
> >>> forecast-observation
> >>>>>>> pairs are contained in that MPR line.  Since
> >>>>>>> Point-Stat is run once for each valid time, the time-series
of an
> >>>>>>> individual point is scattered across many different output
files.
>  If
> >>>>> you'd
> >>>>>>> like, you could use the STAT-Analysis tool to filter out the
> >>>>>>> MPR lines that correspond to a single station id.  For
example,
> >>> here's
> >>>>> how
> >>>>>>> you might filter out the matched pairs for 2-m temperature
for a
> >>> station
> >>>>>>> named KDEN:
> >>>>>>>
> >>>>>>>       stat_analysis -job filter -lookin point_stat/out
-line_type
> MPR
> >>>>>>> -fcst_var TMP -fcst_lev Z2 -column_str OBS_SID KDEN
-dump_row
> >>>>>>> TMP_Z2_KDEN_MPR.txt
> >>>>>>>
> >>>>>>> That will read all of the files ending in ".stat" from the
> directory
> >>>>> named
> >>>>>>> "point_stat/out", pick out the MPR lines, only with "TMP" in
the
> >>>>> FCST_VAR
> >>>>>>> column, only with "Z2" in the FCST_LEV column, only
> >>>>>>> with "KDEN" in the OBS_SID column, and write the output to a
file
> >>> named
> >>>>>>> "TMP_Z2_KDEN_MPR.txt".  There are filtering switches for
each of
> the
> >>> 21
> >>>>>>> header columns common to each line type.  And you can
> >>>>>>> use the -column_min, -column_max, -column_eq, and
-column_str
> >>> options to
> >>>>>>> filter by the data columns (as we've done here for the
OBS_SID
> >>> column).
> >>>>>>>
> >>>>>>> Once you have the data stored this way, it's up to you to
make the
> >>> time
> >>>>>>> series plot with whatever software you'd like.
> >>>>>>>
> >>>>>>>> (6) My last question was about the comparison between two
wrf
> >>> output.
> >>>>> If
> >>>>>>> I
> >>>>>>>> want to know how to compare the result between two wrf
output at
> the
> >>>>> same
> >>>>>>>> time at the same location but with different physics
schemes.
> >>> Whether
> >>>>> it
> >>>>>>> is
> >>>>>>>> just put these two results into the grid-grid comparison
and set
> >>>>>>>> one wrfoutput as the observation? Or there are other way to
> execute
> >>>>> this
> >>>>>>>> job?
> >>>>>>>>
> >>>>>>>
> >>>>>>> Sure, you can easily treat one of them as the "forecast" and
the
> >>> other
> >>>>> as
> >>>>>>> the "observation".  That is not strictly verification - more
of
> just
> >>> a
> >>>>>>> comparison.  Hopefully, the output from MET will help
> >>>>>>> you quantify the differences.  You should just think
carefully
> about
> >>> it
> >>>>>>> when trying to make sense of the output.
> >>>>>>>
> >>>>>>>> Thank you so much in advance for your time and help, I
really
> >>>>> appreciate
> >>>>>>> it!
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>>
> >>>>>>>> Jason
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>
> >>>>>
> >>>
> >>>
> >>
>
>

------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #63455] some small questions regarding the basic concepts about MET
From: John Halley Gotway
Time: Wed Oct 23 09:58:44 2013

Jason,

OK, I understand your question now.  Unfortunately, I'm not able to
give you a definitive answer on this one.  But I talked to one of the
scientists here, and our assumption is that most, if not all,
of the observations contained within PREPBUFR files would be
instantaneous, not time-averaged.  For example, observations from
soundings contained in the ADPUPA message type would be instantaneous
measurements, as would surface variables like 2-meter temperature and
relative humidity from automated weather stations contained in the
ADPSFC message type.

It probably depends on the instrument type.  Some are definitely
instantaneous and others may perhaps be an average over some time
period.  Here's a link to the NCEP PrepBufr processing information:
    http://www.emc.ncep.noaa.gov/mmb/data_processing/prepbufr.doc/document.htm

This may (or may not) contain the information you're after.  At least
that's where I'd suggest you start looking.

Hope that helps.

Thanks,
John Halley Gotway
met_help at ucar.edu

On 10/22/2013 08:01 PM, Xingcheng Lu via RT wrote:
>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
>
> Hi John,
>
> Thank you for your reply, I understand the working principle of
beg/end but
> my question is about the observation data, I am not sure how the
value
> reported in the PrepBufr file comes from. My labmate told me that
there are
> two types of observation data, one is directly from instantaneous
record,
> and the other is from the average record. For example, to the
instantaneous
> record, the data recorded at 12:00:00 is the true value at 12:00:00.
While
> to some average records, the value for 12:00:00 may come from the
average
> recorded value between 11:50:00 and 12:10:00, the 20 minutes'
average
> value. (same,the value for 12:30:00 comes from the average value
between
> 12:20:00 and 12:40:00). I have checked through the 4GDAS website,
but
> cannot find any information regarding about this. Sorry for the not
> clear expression last email. Thank you!
>
> Best,
>
> Jason
>
>
> 2013/10/23 John Halley Gotway via RT <met_help at ucar.edu>
>
>> Jason,
>>
>> There are 4 GDAS PrepBufr files for each day, at 00, 06, 12, and
18Z.
>>   Each one contains 6 hours of point observation data that is +/- 3
hours
>> around the timestamp in the file name.  For example, the
>> 12Z GDAS PrepBufr file contains observations between 09 and 15Z.
When you
>> run Point-Stat, you set a matching time window in the config file
using the
>> "obs_window" option.  Set "beg" and "end" to a
>> number of seconds.  When you run Point-Stat, it'll look at the
valid time
>> of the forecast and build the time window around that.  Suppose
your
>> forecast is valid at 20130512 00Z and you've set beg =
>> -3600 and end = 3600 (that's +/- 1 hour).  The matching time window
will be
>> from 20130511 23Z to 20130512 01Z.  Point-Stat will try to use any
>> observations falling between those times when doing its
>> verification.  And it's up to you to decide how big you want that
time
>> window to be, e.g. +/- 1 hour, +/- 5 minutes, or +/- 0 seconds.
>>
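The matching window described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not MET code: the function names are hypothetical, and treating both ends of the window as inclusive is an assumption.

```python
from datetime import datetime, timedelta

def obs_window(fcst_valid, beg, end):
    """Return (start, stop) of the matching window.

    beg and end are offsets in seconds relative to the forecast valid time,
    as in the "obs_window" setting (beg is usually negative).
    """
    return fcst_valid + timedelta(seconds=beg), fcst_valid + timedelta(seconds=end)

def in_window(obs_valid, fcst_valid, beg, end):
    """True if an observation time falls inside the window (ends assumed inclusive)."""
    start, stop = obs_window(fcst_valid, beg, end)
    return start <= obs_valid <= stop

# The example from the text: forecast valid at 20130512 00Z, +/- 1 hour.
fcst_valid = datetime(2013, 5, 12, 0)
start, stop = obs_window(fcst_valid, -3600, 3600)
print(start, "to", stop)
```

With beg = 0 and end = 0 the window collapses to the forecast valid time itself, so only observations stamped with exactly that time would pass.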
>> Some stations will report multiple times within those 2 hours.  By
>> default, all of those point observations will be used - even for
stations
>> reporting multiple times.  However, you can change that
>> behavior by setting the "duplicate_flag" option in the config file.
You
>> can read about this in METv4.1/data/config/README, but setting it
to a
>> value of "SINGLE" will cause only one observation to be
>> used for each station.  Point-Stat will use the one whose valid
time is
>> closest to the valid time of the forecast.
>>
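As a rough picture of what the "SINGLE" setting does, the sketch below keeps one report per station, the one nearest in time to the forecast valid time. It is not MET's implementation: the function name, the tuple layout, and the tie-breaking rule (first report seen wins) are all assumptions, and the sample reports are made up.

```python
from datetime import datetime

def keep_closest(reports, fcst_valid):
    """Keep, for each station, the report whose valid time is nearest fcst_valid.

    reports: iterable of (station_id, valid_time, value) tuples.
    Returns {station_id: (valid_time, value)}.
    """
    best = {}
    for sid, valid, value in reports:
        gap = abs((valid - fcst_valid).total_seconds())
        # Strict "<" means an earlier-listed report wins a tie (an assumption).
        if sid not in best or gap < best[sid][0]:
            best[sid] = (gap, valid, value)
    return {sid: (valid, value) for sid, (gap, valid, value) in best.items()}

# Made-up reports: KDEN reports three times inside a +/- 1 hour window.
fcst_valid = datetime(2013, 5, 12, 0)
reports = [
    ("KDEN", datetime(2013, 5, 11, 23, 10), 281.2),
    ("KDEN", datetime(2013, 5, 11, 23, 55), 280.4),
    ("KDEN", datetime(2013, 5, 12, 0, 40), 279.9),
    ("KBOU", datetime(2013, 5, 12, 0, 20), 278.0),
]
print(keep_closest(reports, fcst_valid))
```

With the default behavior all four reports would produce matched pairs; with one report per station only two remain.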
>> Hope that helps clarify.
>>
>> Thanks,
>> John
>>
>> On 10/22/2013 06:29 AM, Xingcheng Lu via RT wrote:
>>>
>>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
>>>
>>> Hi John,
>>>
>>> Forgot to ask you a question about the observation data yesterday.
The
>> NCEP
>>> ADP Global Upper Air and Surface Weather Observations data, is it
>> recorded
>>> at instantaneous time spot, or just recorded the average value
over a
>>> period? For example,if the data is recorded at 12:00pm, is the
data at
>> this
>>> time an average from 11:50 to 12:10, or the exact value at 12:00?
Thanks
>>> again-
>>>
>>> Best,
>>>
>>> Jason
>>>
>>>
>>> 2013/10/22 Xingcheng Lu <xingchenglu2011 at u.northwestern.edu>
>>>
>>>> Hello John,
>>>>
>>>> Thank you very much for your script, I will study it in detail.
Also, I
>>>> really appreciate that you can provide me so many detailed
answers, I
>> have
>>>> learned a lot from them. Hope that meet you again when I ask the
>> question
>>>> in met-help next time.
>>>>
>>>> Sincerely,
>>>>
>>>> Jason
>>>>
>>>>
>>>> 2013/10/22 John Halley Gotway via RT <met_help at ucar.edu>
>>>>
>>>>> Jason,
>>>>>
>>>>> Yes, I mean shell scripting.  Users typically call the MET
utilities
>> from
>>>>> within shell scripts (or any other scripting language, like PERL
or
>>>>> Python).  Typically, the shell script just loops over a
>>>>> bunch of model initialization times and/or forecast lead times,
figures
>>>>> out the name of the forecast and observation files for that
time, and
>> calls
>>>>> the appropriate MET tools.  I usually use the "date"
>>>>> command when doing that looping.  I've attached a sample script
written
>>>>> in the korn shell that does the following...
>>>>>
>>>>>      - outer loop for the model initialization times (from
2013030100 to
>>>>> 2013030212 every 12-hours)
>>>>>      - inner loop for forecast lead times (from 0 to 36 hours
every 6
>> hours)
>>>>>      - compute forecast and observation file names
>>>>>      - print a line stating where you should call the MET
tool(s) for
>> that
>>>>> time
>>>>>
>>>>> Just download and run the attached script to see what I mean.
>>   Hopefully,
>>>>> this script will help get you started.
>>>>>
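The attached korn shell script is not reproduced in this archive. The same looping logic can be sketched in Python, shown here as one possible layout rather than the attached script itself. The file naming patterns are hypothetical placeholders; substitute whatever your forecast and observation files are actually called.

```python
from datetime import datetime, timedelta

def run_times(init_beg, init_end, init_step_h, lead_beg_h, lead_end_h, lead_step_h):
    """Yield (init_time, lead_hours, valid_time) for every init/lead combination."""
    init = init_beg
    while init <= init_end:
        for lead in range(lead_beg_h, lead_end_h + 1, lead_step_h):
            yield init, lead, init + timedelta(hours=lead)
        init += timedelta(hours=init_step_h)

# Initializations 2013030100 to 2013030212 every 12 hours, leads 0 to 36 every 6 hours.
for init, lead, valid in run_times(datetime(2013, 3, 1, 0), datetime(2013, 3, 2, 12), 12, 0, 36, 6):
    fcst_file = f"fcst_{init:%Y%m%d%H}_f{lead:03d}.grib"   # hypothetical naming pattern
    obs_file = f"obs_{valid:%Y%m%d%H}.nc"                   # hypothetical naming pattern
    print(f"init={init:%Y%m%d%H} lead={lead:02d}h valid={valid:%Y%m%d%H}: "
          f"call the MET tool(s) here with {fcst_file} and {obs_file}")
```

Each printed line marks the spot where a call to Point-Stat (or another MET tool) would go for that valid time.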
>>>>> Thanks,
>>>>> John
>>>>>
>>>>>
>>>>> On 10/21/2013 07:44 AM, Xingcheng Lu via RT wrote:
>>>>>>
>>>>>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
>>>>>>
>>>>>> Hi John,
>>>>>>
>>>>>> Thank you for your answers. You said that I could write some
scripts
>> for
>>>>>> time series re-run, so what type of scripts I can write, do you
mean
>>>>> shell
>>>>>> script? If you have a script sample, it will be much helpful to
me.
>>>>> Thank
>>>>>> again for your help!
>>>>>>
>>>>>> cheers,
>>>>>>
>>>>>> Jason
>>>>>>
>>>>>>
>>>>>> 2013/10/19 John Halley Gotway via RT <met_help at ucar.edu>
>>>>>>
>>>>>>> Jason,
>>>>>>>
>>>>>>> Answers are inline.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> John
>>>>>>>
>>>>>>> On 10/18/2013 09:14 AM, Xingcheng Lu via RT wrote:
>>>>>>>>
>>>>>>>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455
>
>>>>>>>>
>>>>>>>> Hi John,
>>>>>>>>
>>>>>>>> Thank you so much for your detail reply and I have learned a
lot
>> from
>>>>>>> your
>>>>>>>> answers.
>>>>>>>>
>>>>>>>> According to your answers, I still have 3 questions:
>>>>>>>>
>>>>>>>> For the question 3 and 4, can I say, if there are 200
observation
>>>>> points
>>>>>>>> within my domain, and I set the time beg=0/end=0, then the
match
>> pair
>>>>> is
>>>>>>>> 200. However, when I set beg=3600/end=3600, if each
observation
>> point
>>>>>>> has 3
>>>>>>>> values during this time period, then the matching pair should
be
>> 600.
>>>>> Is
>>>>>>> my
>>>>>>>> understanding correct?
>>>>>>>>
>>>>>>>
>>>>>>> Yes, your understanding is correct.  If multiple observations
occur
>> at
>>>>> the
>>>>>>> same location during the time window, by default, they will
all be
>>>>> used.
>>>>>>>     *HOWEVER*, we realize that this behavior is not
>>>>>>> always desirable, so there's an option in the configuration
file to
>>>>>>> control this logic.  Please take a look in the file
>>>>>>> "METv4.1/data/config/README" for a description of the
>> "duplicate_flag"
>>>>>>> option.
>>>>>>> Setting its value to "SINGLE" will cause only a single
observation
>>>>> value
>>>>>>> for each location to be used.  The one whose valid time is
closest to
>>>>> that
>>>>>>> of the forecast time is chosen.
>>>>>>>
>>>>>>>> For the question 5, if I want to do the time series
comparison for
>> 12
>>>>>>>> hours at the same observation spot(with single hour wrf
output and
>> set
>>>>>>>> beg/end=0), whether I just need to rerun the MET for 12 times
for
>>>>>>>> single different hour?
>>>>>>>
>>>>>>> The Point-Stat tool is intended to be run once for each valid
time
>>>>> being
>>>>>>> evaluated.  So if you're verifying 12 different times, you'd
run the
>>>>> tool
>>>>>>> 12 times.  Of course, we don't intend that people run
>>>>>>> this manually themselves on the command line.  Instead, we
expect
>> that
>>>>>>> you'd run it via a script of some sort that loops through the
times
>>>>> you'd
>>>>>>> like to evaluate.
>>>>>>>
>>>>>>> Often MET users are interested in the performance of their
model at
>>>>> more
>>>>>>> than just a single location and for more than a single
>> variable/level.
>>>>>    So
>>>>>>> there's more "work to do" than just computing a
>>>>>>> single matched pair value and writing it out.
>>>>>>>
>>>>>>> Here's a few pieces of info you may find helpful...
>>>>>>> - The MET configuration files support the use of environment
>>>>> variables.  I
>>>>>>> often find it helpful when scripting up calls to the MET tools
to set
>>>>> some
>>>>>>> environment variable values and then reference
>>>>>>> them in the configuration files I pass to the tools.  That
will
>> enable
>>>>> you
>>>>>>> to control the behavior of the tools without having to
maintain many
>>>>>>> different versions of the config files.
>>>>>>> - If you happen to have GRIB files that contain data for
multiple
>>>>> forecast
>>>>>>> hours (all 12 of your output times for example), you could
actually
>>>>> call
>>>>>>> Point-Stat once and evaluate them all at once.  But
>>>>>>> the configuration files get a lot messier - in the
"fcst.field"
>> setting
>>>>>>> you'd need to explicitly specify the valid time of the data to
be
>>>>> evaluated
>>>>>>> so that Point-Stat knows which fields to use.  I
>>>>>>> typically find calling Point-Stat once per valid time is
easier.
>>>>>>> - If you're interested in the performance of your model at a
specific
>>>>> set
>>>>>>> of station ID's, consider using the "mask.sid" option.  Rather
than
>>>>>>> defining your verification area spatially (as the "grid"
>>>>>>> and "poly" options do), the station id (sid) option is just a
list of
>>>>> the
>>>>>>> stations over which you'd like to compute statistics.
>>>>>>>
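The environment-variable tip above might look like this when driving Point-Stat from a script. This is a sketch under assumptions: the variable names OBS_WINDOW_BEG and OBS_WINDOW_END and the file names are hypothetical, and the config file is assumed to reference the variables (to my understanding with the ${NAME} syntax; check METv4.1/data/config/README). The point_stat argument order and the -outdir option follow the tool's usage statement.

```python
import os

def point_stat_call(fcst_file, obs_file, config_file, out_dir, settings):
    """Build the point_stat command line and the environment to run it in.

    settings: {name: value} strings that the config file is assumed to
    reference, e.g. "beg = ${OBS_WINDOW_BEG};" inside obs_window.
    """
    env = dict(os.environ)   # copy so the caller's environment is untouched
    env.update(settings)
    cmd = ["point_stat", fcst_file, obs_file, config_file, "-outdir", out_dir]
    return cmd, env

cmd, env = point_stat_call(
    "fcst.grib", "obs.nc", "PointStatConfig", "out",
    {"OBS_WINDOW_BEG": "-3600", "OBS_WINDOW_END": "3600"},
)
print(" ".join(cmd))
# To actually run it (requires MET to be installed):
#   import subprocess; subprocess.run(cmd, env=env, check=True)
```

One config file can then serve every run, with the script changing only the environment values between calls.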
>>>>>>>>
>>>>>>>> For the question 5, "stat_analysis -job filter -lookin
>> point_stat/out
>>>>>>>> -line_type MPR -fcst_var TMP -fcst_lev Z2 -column_str OBS_SID
KDEN
>>>>>>>> -dump_row TMP_Z2_KDEN_MPR.txt"  Can I type this on the Linux
command
>>>>> line
>>>>>>>> or I need to set this in the config file for stat analysis?
Because
>> I
>>>>>>> found
>>>>>>>> that this line is a little bit similar to the jobs setting in
the
>>>>> config
>>>>>>>> file, but I cannot find the format you type in the manual.
Whether
>>>>> there
>>>>>>> is
>>>>>>>> any resource to further introduce this command line format?
>>>>>>>
>>>>>>> The STAT-Analysis tool can be run with or without a config
file.  If
>>>>> you
>>>>>>> provide a config file, you have to define the job(s) to be run in
it.
>>   If
>>>>> not,
>>>>>>> you define a *SINGLE* job to be run on the command
>>>>>>> line.  Usually, I run single jobs on the command line until I
figure
>>>>> out
>>>>>>> the set of analysis jobs I'd like to run via a script.  Then I
move
>>>>> them
>>>>>>> into a config file and run them all with one call to
>>>>>>> STAT-Analysis.  Look in the file "METv4.1/data/config/README"
for the
>>>>>>> section on "Job command FILTERING".
>>>>>>>
>>>>>>>>
>>>>>>>> By the way, do I have to use the grids with the same
resolution if I
>>>>> want
>>>>>>>> to do the grid-grid comparison? Also, because my research
focus is
>> on
>>>>>>>> global scale, do you know whether there is any daily grid
>> observation
>>>>>>> data
>>>>>>>> for the global scale?
>>>>>>>
>>>>>>> Yes, for the grid-to-grid comparisons performed by the Grid-
Stat,
>> MODE,
>>>>>>> Series-Analysis, and Wavelet-Stat tools, it's the user's
>>>>> responsibility to
>>>>>>> put their forecast and observation data on the same
>>>>>>> grid.  In future versions of MET, we'd like to add tools to
help
>> users
>>>>> do
>>>>>>> this.  But currently there isn't any support for this directly
in
>> MET.
>>>>>    But
>>>>>>> for GRIB1 data, the copygb utility can be used to
>>>>>>> regrid things.  Here's a portion of the MET online tutorial
that
>>>>> discusses
>>>>>>> this:
>>>>>>>
>>>>>>>
>>>>>
>>
http://www.dtcenter.org/met/users/support/online_tutorial/METv4.1/copygb/index.php
>>>>>>>
>>>>>>> Availability of appropriate observations is always an issue.
And I
>>>>> don't
>>>>>>> have a magic bullet for you.  You could always compare your
model
>>>>> output to
>>>>>>> a model analysis - like the global GFS analysis.
>>>>>>> But that'll just tell you how well your model output matches
GFS,
>> which
>>>>>>> has its own set of errors.
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks again for your kind help!
>>>>>>>>
>>>>>>>> Sincerely,
>>>>>>>>
>>>>>>>> Jason
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2013/10/18 John Halley Gotway via RT <met_help at ucar.edu>
>>>>>>>>
>>>>>>>>> Jason,
>>>>>>>>>
>>>>>>>>> I've answered your questions inline below.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> John Halley Gotway
>>>>>>>>> met_help at ucar.edu
>>>>>>>>>
>>>>>>>>> On 10/17/2013 08:54 AM, Xingcheng Lu via RT wrote:
>>>>>>>>>>
>>>>>>>>>> Thu Oct 17 08:54:28 2013: Request 63455 was acted upon.
>>>>>>>>>> Transaction: Ticket created by
xingchenglu2011 at u.northwestern.edu
>>>>>>>>>>             Queue: met_help
>>>>>>>>>>           Subject: some small questions regarding the basic
>> concepts
>>>>> about
>>>>>>>>> MET
>>>>>>>>>>             Owner: Nobody
>>>>>>>>>>        Requestors: xingchenglu2011 at u.northwestern.edu
>>>>>>>>>>            Status: new
>>>>>>>>>>       Ticket <URL:
>>>>>>> https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Dear Sir/Madam,
>>>>>>>>>>
>>>>>>>>>> I am a new user of MET and I have several small questions
want to
>>>>> ask
>>>>>>>>> about.
>>>>>>>>>>
>>>>>>>>>> (1): In the manual at Table 4-3, I am a little bit confused
about
>>>>>>>>> Forecast
>>>>>>>>>> and Observation Rate. According to the contingency table,
whether
>>>>>>>>>> forecast rate=(n11+n00)/T? And how about observation rate,
I don't
>>>>> know
>>>>>>>>>> what it should be compared to.
>>>>>>>>>
>>>>>>>>> The forecast rate is just the fraction of grid points in the
>> forecast
>>>>>>>>> domain at which the event is occurring.  The observation
rate is
>> the
>>>>>>>>> fraction of grid points in the observation domain at which
>>>>>>>>> the event is occurring.  The observation rate is also known
as the
>>>>> base
>>>>>>>>> rate (BASER from the CTS line shown in Table 4-5).  The FHO
and CTC
>>>>>>> output
>>>>>>>>> lines really contain the same information.  At the
>>>>>>>>> DTC, we prefer to use the counts from the CTC line, while
NCEP
>>>>> prefers
>>>>>>> to
>>>>>>>>> use the ratios of the counts given in the FHO line.
>>>>>>>>>
>>>>>>>>> If your forecast were perfect, the F_RATE would be identical
to the
>>>>>>> O_RATE.
>>>>>>>>>
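To make the rates concrete, here is a small sketch that derives them from the four contingency table counts, using the CTC column names (FY_OY = forecast yes and observed yes, and so on). The function name and the sample counts are made up for illustration. Note that the formula in the question, (n11+n00)/T, is the accuracy (fraction of correct forecasts), not the forecast rate.

```python
def fho_rates(fy_oy, fy_on, fn_oy, fn_on):
    """Return (F_RATE, H_RATE, O_RATE) from the 2x2 contingency table counts.

    fy_oy = n11 (hits), fy_on = n10 (false alarms),
    fn_oy = n01 (misses), fn_on = n00 (correct negatives).
    """
    total = fy_oy + fy_on + fn_oy + fn_on
    f_rate = (fy_oy + fy_on) / total   # fraction where the event was forecast
    h_rate = fy_oy / total             # fraction where it was forecast and observed
    o_rate = (fy_oy + fn_oy) / total   # fraction where it was observed (base rate)
    return f_rate, h_rate, o_rate

# Made-up counts for 200 matched pairs.
f_rate, h_rate, o_rate = fho_rates(50, 30, 10, 110)
print(f_rate, h_rate, o_rate)
```

A perfect forecast has no false alarms and no misses, so F_RATE, H_RATE, and O_RATE all coincide. Equal F_RATE and O_RATE alone do not imply a perfect forecast.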
>>>>>>>>>>
>>>>>>>>>> (2):My second question is about the grid/poly setting in
the
>>>>>>>>>> Pointstatconfig file. I know that the poly is used to
select the
>>>>>>> specific
>>>>>>>>>> area for verification, however, I don't know what exactly
is the
>>>>> grid
>>>>>>>>>> option used for, is it also used to choose the place for
>>>>> verification?
>>>>>>> If
>>>>>>>>>> yes, what's the difference between the poly?
>>>>>>>>>
>>>>>>>>> Yes, the grid masking behaves the same way that the polyline
>> masking
>>>>>>> does.
>>>>>>>>>      It's just another way of specifying a geographic subset
of the
>>>>> domain.
>>>>>>>>>      It's generally much less useful than the polyline
>>>>>>>>> masking since only the pre-defined NCEP grids are supported,
and
>> most
>>>>>>>>> users don't find that very helpful.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> (3): I am a little bit confused about the 'match pair'
mean. I
>> have
>>>>> two
>>>>>>>>>> understandings:(A) If I set temp>273 at Z2, both the
observation
>>>>> file
>>>>>>> and
>>>>>>>>>> forecast output have valid value at Z2, no matter they are
larger
>>>>> than
>>>>>>>>> 273
>>>>>>>>>> or not;(B) Or it means that the value at Z2 from both
observation
>>>>> and
>>>>>>>>>> forecast meet the temp>273 requirement?
>>>>>>>>>
>>>>>>>>> A "matched pair" just means a pair of forecast and
observation
>> values
>>>>>>> that
>>>>>>>>> go together.  Suppose you have 200 point observations of 2-
meter
>>>>>>>>> temperature that fall in your domain.  For each observation
>>>>>>>>> value, Point-Stat computes an interpolated forecast value
for that
>>>>>>>>> observation location.  So you now have 200 pairs of forecast
and
>>>>>>>>> observation values.  Using those 200 matched pairs, you
could
>> define
>>>>>>>>> continuous statistics directly (CNT output line).  Or you
could
>>>>> choose a
>>>>>>>>> threshold (like >273) and define a 2x2 contingency table.
With
>> that
>>>>> 2x2
>>>>>>>>> contingency table, you can dump out the counts in the
>>>>>>>>> CTC line and/or the corresponding statistics in the CTS
line.
>>>>>>>>>
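The two uses of the matched pairs can be illustrated with a short sketch. It is not MET code: the function names and the four sample pairs are made up, and only two of the continuous statistics (mean error and MSE) are shown. It also shows that every pair is counted in the contingency table whether or not either value exceeds the threshold, which corresponds to understanding (A) in the question.

```python
def continuous_stats(pairs):
    """Mean error and mean squared error over (forecast, observation) pairs."""
    errors = [f - o for f, o in pairs]
    n = len(errors)
    return sum(errors) / n, sum(e * e for e in errors) / n

def contingency_counts(pairs, threshold):
    """Count each pair in one cell of the 2x2 table for the event 'value > threshold'."""
    counts = {"FY_OY": 0, "FY_ON": 0, "FN_OY": 0, "FN_ON": 0}
    for f, o in pairs:
        key = ("FY" if f > threshold else "FN") + "_" + ("OY" if o > threshold else "ON")
        counts[key] += 1
    return counts

# Made-up 2-m temperature pairs in Kelvin: (forecast, observation).
pairs = [(275.0, 274.0), (272.0, 274.0), (274.0, 272.0), (270.0, 271.0)]
me, mse = continuous_stats(pairs)
print("ME =", me, "MSE =", mse)
print(contingency_counts(pairs, 273.0))
```

The four counts always sum to the number of matched pairs, here 4.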
>>>>>>>>> I'm not sure if that answers your question.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> (4): My another question is about the beg/end time setting.
The
>>>>>>>>> observation
>>>>>>>>>> data I downloaded was DS 337.0 NCEP ADP Global Upper Air
and
>> Surface
>>>>>>>>>> Observations. Is the single file of this type of data only
have
>>>>> valid
>>>>>>>>>> observation for a specific time spot, like if I download
the data
>>>>> for
>>>>>>>>>> 2006.07.18_18:00:00, the file only contains the observation
data
>> for
>>>>>>> that
>>>>>>>>>> time spot? If so, when combine with wrf single hour output,
can I
>>>>> just
>>>>>>>>> set
>>>>>>>>>> this option to 0 and 0? Could you tell me what exact the
beg/end
>>>>> used
>>>>>>>>> for?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> There are 4 GDAS PREPBUFR files per day - 00Z, 06Z, 12Z, and
18Z.
>>>>>    Each
>>>>>>>>> file contains 6 hours worth of observations, 3 hours +/- the
time
>>>>>>> indicated
>>>>>>>>> in the file name.  So the 12Z file contains
>>>>>>>>> observations from 09Z to 15Z.  When you run Point-Stat, you
need to
>>>>> pass
>>>>>>>>> it one or more observation files that contain the
observations
>> you'd
>>>>>>> like
>>>>>>>>> to use to evaluate the forecast.  The time window you
>>>>>>>>> set in the Point-Stat configuration file is set relative to
the
>>>>> forecast
>>>>>>>>> valid time.  Suppose your forecast is valid at 06Z and
you've set
>>>>> beg =
>>>>>>>>> -3600 and end = 3600 (in seconds).  So that's +/- 1
>>>>>>>>> hour around your forecast valid time.  When Point-Stat sifts
>> through
>>>>> the
>>>>>>>>> observations, it'll only use those whose valid time falls
between
>>>>> 05Z
>>>>>>> and
>>>>>>>>> 07Z.  It'll throw all the others out.  It's up to
>>>>>>>>> you to decide how close in time your observations need to be
to
>> your
>>>>>>>>> forecast time.  If you set beg = 0 and end = 0, only those
>>>>> observations
>>>>>>>>> with exactly the same time as the forecast will be used.
>>>>>>>>>
>>>>>>>>>> (5) My 5th question is about the time series comparison. If
I want
>>>>> to
>>>>>>>>> draw
>>>>>>>>>> the plot of MSE versus time at a specific observation
point, what
>>>>>>> should
>>>>>>>>> I
>>>>>>>>>> do? And how to take the data of the specific point out?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> When you run Point-Stat, you can output the individual matched pair
>>>>>>>>> values by turning on the MPR output line.  The individual
>>>>>>>>> forecast-observation pairs are contained in that MPR line.  Since
>>>>>>>>> Point-Stat is run once for each valid time, the time-series of an
>>>>>>>>> individual point is scattered across many different output files.
>>>>>>>>> If you'd like, you could use the STAT-Analysis tool to filter out
>>>>>>>>> the MPR lines that correspond to a single station id.  For example,
>>>>>>>>> here's how you might filter out the matched pairs for 2-m
>>>>>>>>> temperature for a station named KDEN:
>>>>>>>>>
>>>>>>>>>        stat_analysis -job filter -lookin point_stat/out \
>>>>>>>>>          -line_type MPR -fcst_var TMP -fcst_lev Z2 \
>>>>>>>>>          -column_str OBS_SID KDEN -dump_row TMP_Z2_KDEN_MPR.txt
>>>>>>>>>
>>>>>>>>> That will read all of the files ending in ".stat" from the directory
>>>>>>>>> named "point_stat/out", pick out the MPR lines, only with "TMP" in
>>>>>>>>> the FCST_VAR column, only with "Z2" in the FCST_LEV column, only
>>>>>>>>> with "KDEN" in the OBS_SID column, and write the output to a file
>>>>>>>>> named "TMP_Z2_KDEN_MPR.txt".  There are filtering switches for each
>>>>>>>>> of the 21 header columns common to each line type.  And you can
>>>>>>>>> use the -column_min, -column_max, -column_eq, and -column_str
>>>>>>>>> options to filter by the data columns (as we've done here for the
>>>>>>>>> OBS_SID column).
>>>>>>>>>
>>>>>>>>> Once you have the data stored this way, it's up to you to make the
>>>>>>>>> time series plot with whatever software you'd like.
>>>>>>>>>
>>>>>>>>>> (6) My last question was about the comparison between two
wrf
>>>>> output.
>>>>>>> If
>>>>>>>>> I
>>>>>>>>>> want to know how to compare the result between two wrf
output at
>> the
>>>>>>> same
>>>>>>>>>> time at the same location but with different physics
schemes.
>>>>> Whether
>>>>>>> it
>>>>>>>>> is
>>>>>>>>>> just put these two results into the grid-grid comparison
and set
>>>>>>>>>> one wrfoutput as the observation? Or there are other way to
>> execute
>>>>>>> this
>>>>>>>>>> job?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Sure, you can easily treat one of them as the "forecast" and the
>>>>>>>>> other as the "observation".  That is not strictly verification -
>>>>>>>>> more of just a comparison.  Hopefully, the output from MET will help
>>>>>>>>> you quantify the differences.  You should just think carefully about
>>>>>>>>> it when trying to make sense of the output.
>>>>>>>>>
>>>>>>>>>> Thank you so much in advance for your time and help, I
really
>>>>>>> appreciate
>>>>>>>>> it!
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>>
>>>>>>>>>> Jason
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>
>>
>>

------------------------------------------------
Subject: some small questions regarding the basic concepts about MET
From: Xingcheng Lu
Time: Wed Oct 23 20:46:41 2013

Hi John,

OK, I see, thank you so much and you really help me a lot.

Jason


2013/10/23 John Halley Gotway via RT <met_help at ucar.edu>

> Jason,
>
> OK, I understand your question now.  Unfortunately, I'm not able to give
> you a definitive answer on this one.  But I talked to one of the
> scientists here, and our assumption is that most, if not all, of the
> observations contained within PREPBUFR files would be instantaneous, not
> time-averaged.  For example, observations from soundings contained in the
> ADPUPA message type would be instantaneous measurements, as would surface
> variables like 2-meter temperature and relative humidity from automated
> weather stations contained in the ADPSFC message type.
>
> It probably depends on the instrument type.  Some are definitely
> instantaneous and others may perhaps be an average over some time period.
> Here's a link to the NCEP PrepBufr processing information:
>
> http://www.emc.ncep.noaa.gov/mmb/data_processing/prepbufr.doc/document.htm
>
> This may (or may not) contain the information you're after.  At least
> that's where I'd suggest you start looking.
>
> Hope that helps.
>
> Thanks,
> John Halley Gotway
> met_help at ucar.edu
>
> On 10/22/2013 08:01 PM, Xingcheng Lu via RT wrote:
> >
> > <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
> >
> > Hi John,
> >
> > Thank you for your reply, I understand the working principle of
beg/end
> but
> > my question is about the observation data, I am not sure how the
value
> > reported in the PrepBufr file comes from. My labmate told me that
there
> are
> > two types of observation data, one is directly from instantaneous
record,
> > and the other is from the average record. For example, to the
> instantaneous
> > record, the data recorded at 12:00:00 is the true value at
12:00:00.
> While
> > to some average records, the value for 12:00:00 may come from the
average
> > recorded value between 11:50:00 and 12:10:00, the 20 minutes'
average
> > value. (same,the value for 12:30:00 comes from the average value
between
> > 12:20:00 and 12:40:00). I have checked through the 4GDAS website,
but
> > cannot find any information regarding about this. Sorry for the
not
> > clear expression last email. Thank you!
> >
> > Best,
> >
> > Jason
> >
> >
> > 2013/10/23 John Halley Gotway via RT <met_help at ucar.edu>
> >
> >> Jason,
> >>
> >> There are 4 GDAS PrepBufr files for each day, at 00, 06, 12, and 18Z.
> >> Each one contains 6 hours of point observation data that is +/- 3 hours
> >> around the timestamp in the file name.  For example, the 12Z GDAS
> >> PrepBufr file contains observations between 09 and 15Z.  When you run
> >> Point-Stat, you set a matching time window in the config file using the
> >> "obs_window" option.  Set "beg" and "end" to a number of seconds.  When
> >> you run Point-Stat, it'll look at the valid time of the forecast and
> >> build the time window around that.  Suppose your forecast is valid at
> >> 20130512 00Z and you've set beg = -3600 and end = 3600 (that's +/- 1
> >> hour).  The matching time window will be from 20130511 23Z to 20130512
> >> 01Z.  Point-Stat will try to use any observations falling between those
> >> times when doing its verification.  And it's up to you to decide how big
> >> you want that time window to be, e.g. +/- 1 hour, +/- 5 minutes, or +/- 0
> >> seconds.
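
The window arithmetic described above can be sketched in a few lines of
Python.  This is only an illustration of the idea, not MET's own code: the
function names are made up for this example, and treating both ends of the
window as inclusive is an assumption.

```python
from datetime import datetime, timedelta

def obs_window(valid_time, beg, end):
    """Return (start, stop) of the matching window.

    beg and end are offsets in seconds relative to the forecast valid
    time, as in the "obs_window" config entry (beg is usually <= 0).
    """
    return (valid_time + timedelta(seconds=beg),
            valid_time + timedelta(seconds=end))

def in_window(obs_time, valid_time, beg, end):
    """True if obs_time falls inside the window (ends assumed inclusive)."""
    start, stop = obs_window(valid_time, beg, end)
    return start <= obs_time <= stop

# Forecast valid at 20130512 00Z with a +/- 1 hour window.
valid = datetime(2013, 5, 12, 0)
start, stop = obs_window(valid, -3600, 3600)
print(start, "to", stop)
```

With beg = 0 and end = 0 the window collapses to the valid time itself, so
only observations stamped with exactly that time would pass the check.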
> >>
> >> Some stations will report multiple times within those 2 hours.  By
> >> default, all of those point observations will be used - even for
> >> stations reporting multiple times.  However, you can change that
> >> behavior by setting the "duplicate_flag" option in the config file.  You
> >> can read about this in METv4.1/data/config/README, but setting it to a
> >> value of "SINGLE" will cause only one observation to be used for each
> >> station.  Point-Stat will use the one whose valid time is closest to the
> >> valid time of the forecast.
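
As a toy illustration of the "closest in time" rule described for the
"SINGLE" setting, here is a short Python sketch.  It is not MET's
implementation; the tuple layout, the function name, and the tie-breaking
rule (the first report seen wins) are all assumptions made for the example.

```python
from datetime import datetime

def keep_closest(observations, valid_time):
    """Keep one report per station: the one nearest in time to valid_time.

    observations is an iterable of (station_id, obs_time, value) tuples.
    On a tie, the report encountered first is kept (an assumption).
    """
    best = {}
    for sid, obs_time, value in observations:
        gap = abs((obs_time - valid_time).total_seconds())
        if sid not in best or gap < best[sid][0]:
            best[sid] = (gap, obs_time, value)
    return {sid: (t, v) for sid, (gap, t, v) in best.items()}

valid = datetime(2013, 5, 12, 0)
reports = [
    ("KDEN", datetime(2013, 5, 11, 23, 10), 280.1),
    ("KDEN", datetime(2013, 5, 11, 23, 55), 279.4),
    ("KDEN", datetime(2013, 5, 12, 0, 50), 278.8),
    ("KBOU", datetime(2013, 5, 12, 0, 20), 281.0),
]
print(keep_closest(reports, valid))
```

Three KDEN reports fall inside the +/- 1 hour window, but only the 23:55
report survives, so each station contributes a single matched pair.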
> >>
> >> Hope that helps clarify.
> >>
> >> Thanks,
> >> John
> >>
> >> On 10/22/2013 06:29 AM, Xingcheng Lu via RT wrote:
> >>>
> >>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
> >>>
> >>> Hi John,
> >>>
> >>> Forgot to ask you a question about the observation data
yesterday. The
> >> NCEP
> >>> ADP Global Upper Air and Surface Weather Observations data, is
it
> >> recorded
> >>> at instantaneous time spot, or just recorded the average value
over a
> >>> period? For example,if the data is recorded at 12:00pm, is the
data at
> >> this
> >>> time an average from 11:50 to 12:10, or the exact value at
12:00?
> Thanks
> >>> again-
> >>>
> >>> Best,
> >>>
> >>> Jason
> >>>
> >>>
> >>> 2013/10/22 Xingcheng Lu <xingchenglu2011 at u.northwestern.edu>
> >>>
> >>>> Hello John,
> >>>>
> >>>> Thank you very much for your script, I will study it in detail.
Also,
> I
> >>>> really appreciate that you can provide me so many detailed
answers, I
> >> have
> >>>> learned a lot from them. Hope that meet you again when I ask
the
> >> question
> >>>> in met-help next time.
> >>>>
> >>>> Sincerely,
> >>>>
> >>>> Jason
> >>>>
> >>>>
> >>>> 2013/10/22 John Halley Gotway via RT <met_help at ucar.edu>
> >>>>
> >>>>> Jason,
> >>>>>
> >>>>> Yes, I mean shell scripting.  Users typically call the MET utilities
> >>>>> from within shell scripts (or any other scripting language, like PERL
> >>>>> or Python).  Typically, the shell script just loops over a bunch of
> >>>>> model initialization times and/or forecast lead times, figures out the
> >>>>> name of the forecast and observation files for that time, and calls
> >>>>> the appropriate MET tools.  I usually use the "date" command when
> >>>>> doing that looping.  I've attached a sample script written in the korn
> >>>>> shell that does the following...
> >>>>>
> >>>>>      - outer loop for the model initialization times (from 2013030100
> >>>>>        to 2013030212 every 12 hours)
> >>>>>      - inner loop for forecast lead times (from 0 to 36 hours every 6
> >>>>>        hours)
> >>>>>      - compute forecast and observation file names
> >>>>>      - print a line stating where you should call the MET tool(s) for
> >>>>>        that time
> >>>>>
> >>>>> Just download and run the attached script to see what I mean.
> >>>>> Hopefully, this script will help get you started.
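
The attached korn shell script is not reproduced in this archive.  The same
looping pattern can be sketched in Python, which is mentioned above as an
alternative scripting language.  The file name patterns below are purely
hypothetical placeholders; substitute whatever naming your own forecast and
observation files follow.

```python
from datetime import datetime, timedelta

def runs(first_init, last_init, init_step_h, max_lead_h, lead_step_h):
    """Yield (init, lead_hours, valid) for each init/lead combination."""
    init = first_init
    while init <= last_init:
        for lead in range(0, max_lead_h + 1, lead_step_h):
            yield init, lead, init + timedelta(hours=lead)
        init += timedelta(hours=init_step_h)

# Outer loop: 2013030100 to 2013030212 every 12 hours.
# Inner loop: lead times 0 to 36 hours every 6 hours.
for init, lead, valid in runs(datetime(2013, 3, 1, 0),
                              datetime(2013, 3, 2, 12), 12, 36, 6):
    # Hypothetical file names, for illustration only.
    fcst_file = f"fcst/wrf_{init:%Y%m%d%H}_f{lead:02d}.grb"
    obs_file = f"obs/obs_{valid:%Y%m%d%H}.nc"
    print(f"Call MET tool(s) here for {fcst_file} and {obs_file}")
```

Keeping the time arithmetic in one generator means the body of the loop only
has to build file names and launch the tools.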
> >>>>>
> >>>>> Thanks,
> >>>>> John
> >>>>>
> >>>>>
> >>>>> On 10/21/2013 07:44 AM, Xingcheng Lu via RT wrote:
> >>>>>>
> >>>>>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455
>
> >>>>>>
> >>>>>> Hi John,
> >>>>>>
> >>>>>> Thank you for your answers. You said that I could write some
scripts
> >> for
> >>>>>> time series re-run, so what type of scripts I can write, do
you mean
> >>>>> shell
> >>>>>> script? If you have a script sample, it will be much helpful
to me.
> >>>>> Thank
> >>>>>> again for your help!
> >>>>>>
> >>>>>> cheers,
> >>>>>>
> >>>>>> Jason
> >>>>>>
> >>>>>>
> >>>>>> 2013/10/19 John Halley Gotway via RT <met_help at ucar.edu>
> >>>>>>
> >>>>>>> Jason,
> >>>>>>>
> >>>>>>> Answers are inline.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> John
> >>>>>>>
> >>>>>>> On 10/18/2013 09:14 AM, Xingcheng Lu via RT wrote:
> >>>>>>>>
> >>>>>>>> <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
> >>>>>>>>
> >>>>>>>> Hi John,
> >>>>>>>>
> >>>>>>>> Thank you so much for your detail reply and I have learned
a lot
> >> from
> >>>>>>> your
> >>>>>>>> answers.
> >>>>>>>>
> >>>>>>>> According to your answers, I still have 3 questions:
> >>>>>>>>
> >>>>>>>> For the question 3 and 4, can I say, if there are 200
observation
> >>>>> points
> >>>>>>>> within my domain, and I set the time beg=0/end=0, then the
match
> >> pair
> >>>>> is
> >>>>>>>> 200. However, when I set beg=3600/end=3600, if each
observation
> >> point
> >>>>>>> has 3
> >>>>>>>> values during this time period, then the matching pair
should be
> >> 600.
> >>>>> Is
> >>>>>>> my
> >>>>>>>> understanding correct?
> >>>>>>>>
> >>>>>>>
> >>>>>>> Yes, your understanding is correct.  If multiple observations occur
> >>>>>>> at the same location during the time window, by default, they will
> >>>>>>> all be used.  *HOWEVER*, we realize that this behavior is not always
> >>>>>>> desirable, so there's an option in the configuration file to control
> >>>>>>> this logic.  Please take a look in the file
> >>>>>>> "METv4.1/data/config/README" for a description of the
> >>>>>>> "duplicate_flag" option.  Setting its value to "SINGLE" will cause
> >>>>>>> only a single observation value for each location to be used.  The
> >>>>>>> one whose valid time is closest to that of the forecast time is
> >>>>>>> chosen.
> >>>>>>>
> >>>>>>>> For the question 5, if I want to do the time series
comparison for
> >> 12
> >>>>>>>> hours at the same observation spot(with single hour wrf
output and
> >> set
> >>>>>>>> beg/end=0), whether I just need to rerun the MET for 12
times for
> >>>>>>>> single different hour?
> >>>>>>>
> >>>>>>> The Point-Stat tool is intended to be run once for each valid time
> >>>>>>> being evaluated.  So if you're verifying 12 different times, you'd
> >>>>>>> run the tool 12 times.  Of course, we don't intend that people run
> >>>>>>> this manually themselves on the command line.  Instead, we expect
> >>>>>>> that you'd run it via a script of some sort that loops through the
> >>>>>>> times you'd like to evaluate.
> >>>>>>>
> >>>>>>> Often MET users are interested in the performance of their model at
> >>>>>>> more than just a single location and for more than a single
> >>>>>>> variable/level.  So there's more "work to do" than just computing a
> >>>>>>> single matched pair value and writing it out.
> >>>>>>>
> >>>>>>> Here are a few pieces of info you may find helpful...
> >>>>>>> - The MET configuration files support the use of environment
> >>>>>>> variables.  I often find it helpful when scripting up calls to the
> >>>>>>> MET tools to set some environment variable values and then reference
> >>>>>>> them in the configuration files I pass to the tools.  That will
> >>>>>>> enable you to control the behavior of the tools without having to
> >>>>>>> maintain many different versions of the config files.
> >>>>>>> - If you happen to have GRIB files that contain data for multiple
> >>>>>>> forecast hours (all 12 of your output times for example), you could
> >>>>>>> actually call Point-Stat once and evaluate them all at once.  But
> >>>>>>> the configuration files get a lot messier - in the "fcst.field"
> >>>>>>> setting you'd need to explicitly specify the valid time of the data
> >>>>>>> to be evaluated so that Point-Stat knows which fields to use.  I
> >>>>>>> typically find calling Point-Stat once per valid time is easier.
> >>>>>>> - If you're interested in the performance of your model at a
> >>>>>>> specific set of station IDs, consider using the "mask.sid" option.
> >>>>>>> Rather than defining your verification area spatially (as the "grid"
> >>>>>>> and "poly" options do), the station id (sid) option is just a list
> >>>>>>> of the stations over which you'd like to compute statistics.
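
The environment-variable tip in the first bullet above can be sketched from
the scripting side as follows.  This is a sketch under assumptions: the
variable names OBS_WINDOW_BEG and OBS_WINDOW_END are invented for the
example, the file names are placeholders, and the exact syntax for
referencing a variable inside a config file (believed to be ${NAME}) should
be checked against the README.  The function only builds the command and
environment; nothing is executed.

```python
import os

def point_stat_call(fcst_file, obs_file, config_file, out_dir, settings):
    """Return (argv, env) for one Point-Stat run without executing it.

    settings maps environment variable names to string values that a
    config file could then reference instead of hard-coded numbers.
    """
    env = dict(os.environ)
    env.update(settings)
    argv = ["point_stat", fcst_file, obs_file, config_file,
            "-outdir", out_dir]
    return argv, env

# Hypothetical variable and file names, for illustration only.
argv, env = point_stat_call(
    "fcst.grb", "obs.nc", "PointStatConfig", "point_stat/out",
    {"OBS_WINDOW_BEG": "-3600", "OBS_WINDOW_END": "3600"},
)
print(" ".join(argv))
# To launch it for real: subprocess.run(argv, env=env, check=True)
```

One config file can then serve every run in a loop, with only the
environment changing between calls.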
> >>>>>>>
> >>>>>>>>
> >>>>>>>> For the question 5, "stat_analysis -job filter -lookin
> >> point_stat/out
> >>>>>>>> -line_type MPR -fcst_var TMP -fcst_lev Z2 -column_str
OBS_SID KDEN
> >>>>>>>> -dump_row TMP_Z2_KDEN_MPR.txt"  Can I type this on the
Linux
> command
> >>>>> line
> >>>>>>>> or I need to set this in the config file for stat analysis?
> Because
> >> I
> >>>>>>> found
> >>>>>>>> that this line is a little bit similar to the jobs setting
in the
> >>>>> config
> >>>>>>>> file, but I cannot find the format you type in the manual.
Whether
> >>>>> there
> >>>>>>> is
> >>>>>>>> any resource to further introduce this command line format?
> >>>>>>>
> >>>>>>> The STAT-Analysis tool can be run with or without a config file.  If
> >>>>>>> you provide a config file, you have to define the job(s) to be run
> >>>>>>> in it.  If not, you define a *SINGLE* job to be run on the command
> >>>>>>> line.  Usually, I run single jobs on the command line until I figure
> >>>>>>> out the set of analysis jobs I'd like to run via a script.  Then I
> >>>>>>> move them into a config file and run them all with one call to
> >>>>>>> STAT-Analysis.  Look in the file "METv4.1/data/config/README" for
> >>>>>>> the section on "Job command FILTERING".
> >>>>>>>
> >>>>>>>>
> >>>>>>>> By the way, do I have to use the grids with the same
resolution
> if I
> >>>>> want
> >>>>>>>> to do the grid-grid comparison? Also, because my research
focus is
> >> on
> >>>>>>>> global scale, do you know whether there is any daily grid
> >> observation
> >>>>>>> data
> >>>>>>>> for the global scale?
> >>>>>>>
> >>>>>>> Yes, for the grid-to-grid comparisons performed by the Grid-Stat,
> >>>>>>> MODE, Series-Analysis, and Wavelet-Stat tools, it's the user's
> >>>>>>> responsibility to put their forecast and observation data on the
> >>>>>>> same grid.  In future versions of MET, we'd like to add tools to
> >>>>>>> help users do this.  But currently there isn't any support for this
> >>>>>>> directly in MET.  But for GRIB1 data, the copygb utility can be used
> >>>>>>> to regrid things.  Here's a portion of the MET online tutorial that
> >>>>>>> discusses this:
> >>>>>>>
> >>>>>>> http://www.dtcenter.org/met/users/support/online_tutorial/METv4.1/copygb/index.php
> >>>>>>>
> >>>>>>> Availability of appropriate observations is always an issue.  And I
> >>>>>>> don't have a magic bullet for you.  You could always compare your
> >>>>>>> model output to a model analysis - like the global GFS analysis.
> >>>>>>> But that'll just tell you how well your model output matches GFS,
> >>>>>>> which has its own set of errors.
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks again for your kind help!
> >>>>>>>>
> >>>>>>>> Sincerely,
> >>>>>>>>
> >>>>>>>> Jason
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> 2013/10/18 John Halley Gotway via RT <met_help at ucar.edu>
> >>>>>>>>
> >>>>>>>>> Jason,
> >>>>>>>>>
> >>>>>>>>> I've answered your questions inline below.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> John Halley Gotway
> >>>>>>>>> met_help at ucar.edu
> >>>>>>>>>
> >>>>>>>>> On 10/17/2013 08:54 AM, Xingcheng Lu via RT wrote:
> >>>>>>>>>>
> >>>>>>>>>> Thu Oct 17 08:54:28 2013: Request 63455 was acted upon.
> >>>>>>>>>> Transaction: Ticket created by
> xingchenglu2011 at u.northwestern.edu
> >>>>>>>>>>             Queue: met_help
> >>>>>>>>>>           Subject: some small questions regarding the
basic
> >> concepts
> >>>>> about
> >>>>>>>>> MET
> >>>>>>>>>>             Owner: Nobody
> >>>>>>>>>>        Requestors: xingchenglu2011 at u.northwestern.edu
> >>>>>>>>>>            Status: new
> >>>>>>>>>>       Ticket <URL:
> >>>>>>> https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=63455 >
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Dear Sir/Madam,
> >>>>>>>>>>
> >>>>>>>>>> I am a new user of MET and I have several small questions
want
> to
> >>>>> ask
> >>>>>>>>> about.
> >>>>>>>>>>
> >>>>>>>>>> (1): In the manual at Table 4-3, I am a little bit
confused
> about
> >>>>>>>>> Forecast
> >>>>>>>>>> and Observation Rate. According to the contingency table,
> whether
> >>>>>>>>>> forecast rate=(n11+n00)/T? And how about observation
rate, I
> don't
> >>>>> know
> >>>>>>>>>> what it should be compared to.
> >>>>>>>>>
> >>>>>>>>> The forecast rate is just the fraction of grid points in the
> >>>>>>>>> forecast domain at which the event is occurring.  The observation
> >>>>>>>>> rate is the fraction of grid points in the observation domain at
> >>>>>>>>> which the event is occurring.  The observation rate is also known
> >>>>>>>>> as the base rate (BASER from the CTS line shown in Table 4-5).  The
> >>>>>>>>> FHO and CTC output lines really contain the same information.  At
> >>>>>>>>> the DTC, we prefer to use the counts from the CTC line, while NCEP
> >>>>>>>>> prefers to use the ratios of the counts given in the FHO line.
> >>>>>>>>>
> >>>>>>>>> If your forecast were perfect, the F_RATE would be identical to the
> >>>>>>>>> O_RATE.
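
To make the relationship between the CTC counts and the FHO rates concrete,
here is a small Python sketch.  The argument names follow the CTC column
names (FY_OY = forecast yes and observed yes, and so on); the function name
and the example counts are invented for illustration.  Note that
(n11+n00)/T from the original question is the fraction of correct forecasts
(accuracy), not the forecast rate.

```python
def fho_rates(fy_oy, fy_on, fn_oy, fn_on):
    """Return (F_RATE, H_RATE, O_RATE) from the four 2x2 table counts."""
    total = fy_oy + fy_on + fn_oy + fn_on
    f_rate = (fy_oy + fy_on) / total   # event forecast, observed or not
    h_rate = fy_oy / total             # event both forecast and observed
    o_rate = (fy_oy + fn_oy) / total   # event observed (base rate)
    return f_rate, h_rate, o_rate

# Made-up counts for 200 matched pairs.
f_rate, h_rate, o_rate = fho_rates(fy_oy=30, fy_on=10, fn_oy=20, fn_on=140)
print(f_rate, h_rate, o_rate)
```

A perfect forecast has no false alarms and no misses (FY_ON = FN_OY = 0),
which is why F_RATE and O_RATE coincide in that case.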
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> (2):My second question is about the grid/poly setting in
the
> >>>>>>>>>> Pointstatconfig file. I know that the poly is used to
select the
> >>>>>>> specific
> >>>>>>>>>> area for verification, however, I don't know what exactly
is the
> >>>>> grid
> >>>>>>>>>> option used for, is it also used to choose the place for
> >>>>> verification?
> >>>>>>> If
> >>>>>>>>>> yes, what's the difference between the poly?
> >>>>>>>>>
> >>>>>>>>> Yes, the grid masking behaves the same way that the polyline
> >>>>>>>>> masking does.  It's just another way of specifying a geographic
> >>>>>>>>> subset of the domain.  It's generally much less useful than the
> >>>>>>>>> polyline masking since only the pre-defined NCEP grids are
> >>>>>>>>> supported, and most users don't find that very helpful.
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> (3): I am a little bit confused about the 'match pair'
mean. I
> >> have
> >>>>> two
> >>>>>>>>>> understandings:(A) If I set temp>273 at Z2, both the
observation
> >>>>> file
> >>>>>>> and
> >>>>>>>>>> forecast output have valid value at Z2, no matter they
are
> larger
> >>>>> than
> >>>>>>>>> 273
> >>>>>>>>>> or not;(B) Or it means that the value at Z2 from both
> observation
> >>>>> and
> >>>>>>>>>> forecast meet the temp>273 requirement?
> >>>>>>>>>
> >>>>>>>>> A "matched pair" just means a pair of forecast and observation
> >>>>>>>>> values that go together.  Suppose you have 200 point observations
> >>>>>>>>> of 2-meter temperature that fall in your domain.  For each
> >>>>>>>>> observation value, Point-Stat computes an interpolated forecast
> >>>>>>>>> value for that observation location.  So you now have 200 pairs of
> >>>>>>>>> forecast and observation values.  Using those 200 matched pairs,
> >>>>>>>>> you could define continuous statistics directly (CNT output line).
> >>>>>>>>> Or you could choose a threshold (like >273) and define a 2x2
> >>>>>>>>> contingency table.  With that 2x2 contingency table, you can dump
> >>>>>>>>> out the counts in the CTC line and/or the corresponding statistics
> >>>>>>>>> in the CTS line.
> >>>>>>>>>
> >>>>>>>>> I'm not sure if that answers your question.
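
A short Python sketch may help show how one set of matched pairs feeds both
kinds of statistics.  It is a toy example with invented values and function
names, not MET code.  The point it illustrates is that the threshold does
not decide which pairs exist; every pair is kept, and the threshold only
sorts each pair into one of the four cells of the 2x2 table.

```python
def ctc_counts(pairs, threshold):
    """Count 2x2 table cells; the event is 'value > threshold'."""
    counts = {"FY_OY": 0, "FY_ON": 0, "FN_OY": 0, "FN_ON": 0}
    for fcst, obs in pairs:
        key = ("FY" if fcst > threshold else "FN") + "_" + \
              ("OY" if obs > threshold else "ON")
        counts[key] += 1
    return counts

def mse(pairs):
    """Mean squared error over all matched pairs (a continuous statistic)."""
    pairs = list(pairs)
    return sum((fcst - obs) ** 2 for fcst, obs in pairs) / len(pairs)

# Four made-up (forecast, observation) temperature pairs in kelvin.
pairs = [(275.0, 274.0), (272.0, 274.0), (274.0, 272.0), (270.0, 271.0)]
print(ctc_counts(pairs, 273.0))
print(mse(pairs))
```

All four pairs contribute to the mean squared error, and all four also land
somewhere in the contingency table, whether or not they exceed 273.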
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> (4): My another question is about the beg/end time
setting. The
> >>>>>>>>> observation
> >>>>>>>>>> data I downloaded was DS 337.0 NCEP ADP Global Upper Air
and
> >> Surface
> >>>>>>>>>> Observations. Is the single file of this type of data
only have
> >>>>> valid
> >>>>>>>>>> observation for a specific time spot, like if I download
the
> data
> >>>>> for
> >>>>>>>>>> 2006.07.18_18:00:00, the file only contains the
observation data
> >> for
> >>>>>>> that
> >>>>>>>>>> time spot? If so, when combine with wrf single hour
output, can
> I
> >>>>> just
> >>>>>>>>> set
> >>>>>>>>>> this option to 0 and 0? Could you tell me what exact the
beg/end
> >>>>> used
> >>>>>>>>> for?
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> There are 4 GDAS PREPBUFR files per day - 00Z, 06Z, 12Z, and 18Z.
> >>>>>>>>> Each file contains 6 hours' worth of observations, 3 hours +/- the
> >>>>>>>>> time indicated in the file name.  So the 12Z file contains
> >>>>>>>>> observations from 09Z to 15Z.  When you run Point-Stat, you need to
> >>>>>>>>> pass it one or more observation files that contain the observations
> >>>>>>>>> you'd like to use to evaluate the forecast.  The time window you
> >>>>>>>>> set in the Point-Stat configuration file is set relative to the
> >>>>>>>>> forecast valid time.  Suppose your forecast is valid at 06Z and
> >>>>>>>>> you've set beg = -3600 and end = 3600 (in seconds).  So that's +/- 1
> >>>>>>>>> hour around your forecast valid time.  When Point-Stat sifts
> >>>>>>>>> through the observations, it'll only use the ones whose valid times
> >>>>>>>>> fall between 05Z and 07Z.  It'll throw all the others out.  It's up
> >>>>>>>>> to you to decide how close in time your observations need to be to
> >>>>>>>>> your forecast time.  If you set beg = 0 and end = 0, only those
> >>>>>>>>> observations with exactly the same time as the forecast will be
> >>>>>>>>> used.
> >>>>>>>>>
> >>>>>>>>>> (5) My 5th question is about the time series comparison.
If I
> want
> >>>>> to
> >>>>>>>>> draw
> >>>>>>>>>> the plot of MSE versus time at a specific observation
point,
> what
> >>>>>>> should
> >>>>>>>>> I
> >>>>>>>>>> do? And how to take the data of the specific point out?
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> When you run Point-Stat, you can output the individual matched
> >>>>>>>>> pair values by turning on the MPR output line.  The individual
> >>>>>>>>> forecast-observation pairs are contained in that MPR line.  Since
> >>>>>>>>> Point-Stat is run once for each valid time, the time-series of an
> >>>>>>>>> individual point is scattered across many different output files.
> >>>>>>>>> If you'd like, you could use the STAT-Analysis tool to filter out
> >>>>>>>>> the MPR lines that correspond to a single station id.  For example,
> >>>>>>>>> here's how you might filter out the matched pairs for 2-m
> >>>>>>>>> temperature for a station named KDEN:
> >>>>>>>>>
> >>>>>>>>>        stat_analysis -job filter -lookin point_stat/out \
> >>>>>>>>>          -line_type MPR -fcst_var TMP -fcst_lev Z2 \
> >>>>>>>>>          -column_str OBS_SID KDEN -dump_row TMP_Z2_KDEN_MPR.txt
> >>>>>>>>>
> >>>>>>>>> That will read all of the files ending in ".stat" from the
> >>>>>>>>> directory named "point_stat/out", pick out the MPR lines, only with
> >>>>>>>>> "TMP" in the FCST_VAR column, only with "Z2" in the FCST_LEV
> >>>>>>>>> column, only with "KDEN" in the OBS_SID column, and write the
> >>>>>>>>> output to a file named "TMP_Z2_KDEN_MPR.txt".  There are filtering
> >>>>>>>>> switches for each of the 21 header columns common to each line
> >>>>>>>>> type.  And you can use the -column_min, -column_max, -column_eq,
> >>>>>>>>> and -column_str options to filter by the data columns (as we've
> >>>>>>>>> done here for the OBS_SID column).
> >>>>>>>>>
> >>>>>>>>> Once you have the data stored this way, it's up to you to make the
> >>>>>>>>> time series plot with whatever software you'd like.
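
As one possible next step, the filtered MPR file can be turned into a time
series of squared errors with a few lines of Python.  This sketch assumes
the dump file is whitespace-delimited and begins with a header row that
names the FCST_VALID_BEG, FCST, and OBS columns; if your file has no such
header, the column positions would have to be supplied by hand.  The sample
rows are invented and carry only a few of the real columns.

```python
from collections import defaultdict

def squared_error_by_time(lines, time_col="FCST_VALID_BEG",
                          fcst_col="FCST", obs_col="OBS"):
    """Return {valid_time: mean squared error} from MPR-style text lines."""
    rows = iter(lines)
    header = next(rows).split()
    i_t, i_f, i_o = (header.index(c) for c in (time_col, fcst_col, obs_col))
    sums = defaultdict(lambda: [0.0, 0])
    for line in rows:
        parts = line.split()
        if len(parts) < len(header):
            continue  # skip blank or truncated lines
        err = float(parts[i_f]) - float(parts[i_o])
        sums[parts[i_t]][0] += err * err
        sums[parts[i_t]][1] += 1
    return {t: total / n for t, (total, n) in sorted(sums.items())}

# Invented sample standing in for the contents of TMP_Z2_KDEN_MPR.txt.
sample = """FCST_VALID_BEG OBS_SID FCST OBS
20130301_000000 KDEN 275.0 274.0
20130301_060000 KDEN 272.0 274.0
"""
print(squared_error_by_time(sample.splitlines()))
```

For a real file, pass an open file object instead of the sample, for
example `with open("TMP_Z2_KDEN_MPR.txt") as fh:` followed by
`squared_error_by_time(fh)`.  The resulting dictionary can be handed to any
plotting package.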
> >>>>>>>>>
> >>>>>>>>>> (6) My last question was about the comparison between two
wrf
> >>>>> output.
> >>>>>>> If
> >>>>>>>>> I
> >>>>>>>>>> want to know how to compare the result between two wrf
output at
> >> the
> >>>>>>> same
> >>>>>>>>>> time at the same location but with different physics
schemes.
> >>>>> Whether
> >>>>>>> it
> >>>>>>>>> is
> >>>>>>>>>> just put these two results into the grid-grid comparison
and set
> >>>>>>>>>> one wrfoutput as the observation? Or there are other way
to
> >> execute
> >>>>>>> this
> >>>>>>>>>> job?
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Sure, you can easily treat one of them as the "forecast" and the
> >>>>>>>>> other as the "observation".  That is not strictly verification -
> >>>>>>>>> more of just a comparison.  Hopefully, the output from MET will
> >>>>>>>>> help you quantify the differences.  You should just think
> >>>>>>>>> carefully about it when trying to make sense of the output.
> >>>>>>>>>
> >>>>>>>>>> Thank you so much in advance for your time and help, I
really
> >>>>>>> appreciate
> >>>>>>>>> it!
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>>
> >>>>>>>>>> Jason
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>
> >>>>>
> >>>>
> >>
> >>
>
>

------------------------------------------------

