[Met_help] [rt.rap.ucar.edu #87940] History for MODE

John Halley Gotway via RT met_help at ucar.edu
Tue Jul 9 12:07:03 MDT 2019


----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Dear John,

I'm working with MODE and have several questions about its configuration and
output:

*Configure mode:*

   - In the *Verification grid* section, vld_thresh is a value between 0 and 1
   associated with the valid data used in the regridding. I understand that I
   can choose this value and that it is better to keep it close to 1. This
   parameter also appears in the *Forecast and observation fields to be
   verified* section. Are they the same setting? I used the same vld_thresh
   value in the verification grid and in the forecast and observation fields.


   -  *Forecast and observation fields to be verified:* I changed several
   settings to understand their effect, but I have questions about some of them:
   - *area_thresh*: I read in the MET User's Guide that this is a
      threshold on the area of an object and that its units are grid
      squares. Is it like *conv_radius*?  With conv_radius=2 (20 km), I
      changed area_thresh from NA to >=4. In the PostScript file the number
      of clusters did not change, but the MMI value did.
      - What is the difference between *inten_perc_value* and
      *inten_perc_thresh*?


   - *max_centroid_dist*: I understand that this is the maximum
   distance between centroids used to define an object, but I don't know whether
   that distance is measured within a single field or between objects in the
   forecast field and objects in the observation field in order to define a
   cluster.


   - In the *Fuzzy engine weights* section I didn't change anything and kept the
   default configuration, because I don't understand how the weights
   work. I read a paper that you wrote, "The Method for
   Object-Based Diagnostic Evaluation (MODE) Applied to Numerical Forecasts
   from the 2005 NSSL/SPC Spring Program". Appendix A discusses the
   weights; the weight for centroid distance separation is 24%, so if I want to
   use that value in the configuration, must I set centroid_dist = 24.0?


   - Is *total_interest_thresh* applied to the *total interest value* described
   in the MET User's Guide? If so, the appropriate *total_interest_thresh* will
   depend on the weights given to the attributes. Is that correct?

*Outputs:*

   - PostScript file: is there a rule for how the clusters are numbered? I
   thought the first cluster would be the one with the highest total interest,
   but sometimes that is not the case.
   - CTS ASCII file: this file has one line for the raw fields and one for
   the objects, but both show the same total number of matched pairs, and I
   don't understand why.

I've sent the data, the configuration file, and the output as an example.

Thanks for the help,

Natalí


----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: MODE
From: John Halley Gotway
Time: Thu Nov 29 12:34:43 2018

Natali,

I see you have several questions about MODE.  I'll answer inline after each
question in red below.

Thanks,
John

On Wed, Nov 28, 2018 at 3:29 PM natali aranda via RT <met_help at ucar.edu>
wrote:

>
> Wed Nov 28 15:29:03 2018: Request 87940 was acted upon.
> Transaction: Ticket created by natali.g.aranda at gmail.com
>        Queue: met_help
>      Subject: MODE
>        Owner: Nobody
>   Requestors: natali.g.aranda at gmail.com
>       Status: new
>  Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=87940 >
>
>
> Dear John,
>
> I'm working with MODE and have several questions about its configuration and
> output:
>
> *Configure mode:*
>
>    - In the *Verification grid* section, vld_thresh is a value between 0 and 1
>    associated with the valid data used in the regridding. I understand that I
>    can choose this value and that it is better to keep it close to 1. This
>    parameter also appears in the *Forecast and observation fields to be
>    verified* section. Are they the same setting? I used the same vld_thresh
>    value in the verification grid and in the forecast and observation fields.
>

*Before defining objects, MODE first smooths the data by convolving it.
This step is controlled by the "conv_radius" setting.  The larger the
radius, the more smoothing; setting the radius to 0 means no
smoothing.  The "vld_thresh" setting inside the "fcst" and "obs"
dictionaries is used during this smoothing step.  It controls what to do
with missing data inside the convolution area, for example when the grid
point is near the edge of the domain.  Here's a selection from the MET User's
Guide which describes this:*

//    - The "vld_thresh" entry specifies a number between 0 and 1. When
//      performing interpolation over some neighborhood of points the ratio of
//      the number of valid data points to the total number of points in the
//      neighborhood is computed. If that ratio is less than this threshold,
//      the matched pair is discarded. Setting this threshold to 1, which is the
//      default, requires that the entire neighborhood must contain valid data.
//      This variable will typically come into play only along the boundaries of
//      the verification region chosen.
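
As a rough illustration of the rule quoted above, the Python sketch below smooths a small grid and writes missing data wherever the ratio of valid points in the neighborhood falls below vld_thresh. The function name, the square window, the use of None for missing data, and the choice to count off-grid points as missing are all assumptions made for illustration; this is not MODE's actual convolution code.

```python
def smooth(grid, radius, vld_thresh):
    """Average each point over a (2*radius+1)^2 window, honoring vld_thresh."""
    ny, nx = len(grid), len(grid[0])
    window = (2 * radius + 1) ** 2          # total points in the neighborhood
    out = [[None] * nx for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            valid = []
            for j in range(y - radius, y + radius + 1):
                for i in range(x - radius, x + radius + 1):
                    # Points off the grid or flagged missing are not valid.
                    if 0 <= j < ny and 0 <= i < nx and grid[j][i] is not None:
                        valid.append(grid[j][i])
            # Keep the smoothed value only if enough of the window is valid.
            if valid and len(valid) / window >= vld_thresh:
                out[y][x] = sum(valid) / len(valid)
    return out


field = [[1.0, 2.0, 3.0],
         [4.0, None, 6.0],
         [7.0, 8.0, 9.0]]

strict = smooth(field, radius=1, vld_thresh=1.0)   # no window is fully valid
relaxed = smooth(field, radius=1, vld_thresh=0.5)  # center has 8 of 9 valid
```

With vld_thresh = 1.0 every point in this tiny example becomes missing, because each window either runs off the grid or contains the missing center value. Lowering the threshold keeps the points whose windows are mostly valid.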


>
>    -  *Forecast and observation fields to be verified:* I changed several
>    settings to understand their effect, but I have questions about some of
>    them:
>    - *area_thresh*: I read in the MET User's Guide that this is a
>       threshold on the area of an object and that its units are grid
>       squares. Is it like *conv_radius*?  With conv_radius=2 (20 km), I
>       changed area_thresh from NA to >=4. In the PostScript file the number
>       of clusters did not change, but the MMI value did.
>       - What is the difference between *inten_perc_value* and
>       *inten_perc_thresh*?
>

*MODE defines individual objects in each field.  Each of those objects has
several attributes.  One of those attributes is the object area, defined as
the number of grid squares that make up the object.
If you look at the "*_obj.txt" files, you'll see that one of the columns is
named "AREA".  The "area_thresh" option applies to the object area values
which appear in that column.  After defining objects, MODE applies the
area_thresh to the objects and throws away any objects that don't meet it.
We typically use this to discard very small objects.*

*Similarly, the "inten_perc_value" and "inten_perc_thresh" are used to
filter objects and discard ones that don't meet the specified criteria.
The "inten_perc_value" defines the percentile of interest for the values
inside each object.  For example, 50 means the median and 90 means the 90th
percentile.  For each object, MODE computes the requested percentile and
then applies the "inten_perc_thresh" threshold to it.  Any objects which
don't meet the threshold are thrown away.*

*Here's a selection from the version 7.0 MET User's Guide which describes
this:*

//    - The "area_thresh" entry specifies a threshold in grid squares for the
//      area of MODE objects.  Any objects not meeting this threshold are
//      discarded.
//
//    - The "inten_perc_value" entry specifies the intensity percentile value
//      of interest between 0 and 100.  The percentile set by this entry will
//      be output in addition to the standard intensity percentiles.
//
//    - The "inten_perc_thresh" entry specifies a threshold for the percentile
//      intensity of each MODE object.  Any objects not meeting this threshold
//      are discarded.


*However, please be aware that these object filtering options are replaced
in met-8.0 by the "filter_attr_name" and "filter_attr_thresh" options.  For
example, here's how you'd keep only objects larger than 100 grid squares and
discard the rest:*

*In met-7.0:*

area_thresh = gt100;

*In met-8.0:*

filter_attr_name   = [ "AREA" ];
filter_attr_thresh = [ gt100 ];
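
To make the two filters concrete, here is a minimal Python sketch that applies an area threshold and an intensity-percentile threshold to a few made-up objects. The object records, the nearest-rank percentile method, and the function names are assumptions for illustration only; MODE computes these attributes internally and may use a different percentile definition.

```python
import math

# Hypothetical object records. AREA is in grid squares, as in the
# "*_obj.txt" output; the other field names are invented for this sketch.
objects = [
    {"id": "F001", "AREA": 12,  "values": [1.0, 2.0, 3.5]},
    {"id": "F002", "AREA": 250, "values": [0.5, 0.8, 1.1, 1.4]},
    {"id": "F003", "AREA": 480, "values": [2.0, 6.0, 9.0, 12.0]},
]


def percentile(values, perc):
    """Nearest-rank percentile (an assumed method, chosen for simplicity)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(perc / 100 * len(ordered)))
    return ordered[rank - 1]


def keep(obj, min_area, inten_perc_value, inten_perc_thresh):
    """Return True if the object passes both filters."""
    big_enough = obj["AREA"] > min_area                  # behaves like gt100
    intensity = percentile(obj["values"], inten_perc_value)
    return big_enough and intensity >= inten_perc_thresh


# Keep objects larger than 100 grid squares whose 90th percentile is >= 5.0.
kept = [o["id"] for o in objects if keep(o, 100, 90, 5.0)]
```

Here F001 fails the area test, F002 passes the area test but fails the intensity test, and only F003 survives both.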


>    - *max_centroid_dist*: I understand that this is the maximum
>    distance between centroids used to define an object, but I don't know
>    whether that distance is measured within a single field or between objects
>    in the forecast field and objects in the observation field in order to
>    define a cluster.
>
>
*Let's say there are 10 forecast objects and 15 observation objects.  MODE
inter-compares all possible combinations of those objects: 10 x 15 = 150.
But doing all those comparisons can be slow.  When objects are far apart
they are not very likely to match each other.  The "max_centroid_dist" entry
defines how close the forecast and observation centroids need to be in
order for their attributes to be compared.  This is just a way to make MODE
run a little more efficiently.  Here's a selection from the MET User's
Guide which describes this:*


//
// The "max_centroid_dist" entry specifies the maximum allowable distance in
// grid squares between the centroids of objects for them to be compared.
// Setting this to a reasonable value speeds up the runtime enabling MODE to
// skip unreasonable object comparisons.
//
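
The pre-filtering idea can be sketched in a few lines of Python. The centroid coordinates and the function name below are hypothetical, and the sketch assumes a plain Euclidean distance in grid squares; it only shows how a distance cutoff reduces the number of forecast/observation pairs that need a full attribute comparison.

```python
import math


def pairs_to_compare(fcst_centroids, obs_centroids, max_centroid_dist):
    """Return (fcst_index, obs_index) pairs whose centroids are close enough."""
    pairs = []
    for f, (fx, fy) in enumerate(fcst_centroids):
        for o, (ox, oy) in enumerate(obs_centroids):
            # Distance is in grid squares, the same units as the setting.
            if math.hypot(fx - ox, fy - oy) <= max_centroid_dist:
                pairs.append((f, o))
    return pairs


# Hypothetical centroids in grid coordinates.
fcst = [(10.0, 10.0), (200.0, 50.0)]
obs = [(12.0, 14.0), (90.0, 90.0), (205.0, 48.0)]

near = pairs_to_compare(fcst, obs, max_centroid_dist=10)
everything = pairs_to_compare(fcst, obs, max_centroid_dist=1_000_000)
```

With a cutoff of 10 grid squares only two of the six possible pairs are compared; a very large cutoff keeps all six.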


>
>    - In the *Fuzzy engine weights* section I didn't change anything and kept
>    the default configuration, because I don't understand how the weights
>    work. I read a paper that you wrote, "The Method for
>    Object-Based Diagnostic Evaluation (MODE) Applied to Numerical Forecasts
>    from the 2005 NSSL/SPC Spring Program". Appendix A discusses the
>    weights; the weight for centroid distance separation is 24%, so if I want
>    to use that value in the configuration, must I set centroid_dist = 24.0?
>
>
*I'd recommend using the default weights at first.  Once you run MODE on
several cases, you can look at the output and decide if you like the way
MODE is matching objects between the fcst and obs or not.  Once you get a
better sense of how it's working, you could tweak the weights to affect
which objects get matched.*


>
>    - Is *total_interest_thresh* applied to the *total interest value*
>    described in the MET User's Guide? If so, the appropriate
>    *total_interest_thresh* will depend on the weights given to the
>    attributes. Is that correct?
>
>
*For each of the fcst/obs object comparisons, MODE computes a total interest
value between 0 and 1.  The higher that number, the more similar the
objects are.  The "total_interest_thresh" entry defines how large that total
interest value must be for objects to be considered matches.*
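
The following Python sketch shows one way the weights and the threshold could interact, assuming the total interest is a weighted average of per-attribute interest values normalized by the sum of the weights. The confidence factors that appear in the User's Guide formula are left out, and the weights and interest values are hypothetical, so treat this as an illustration rather than MODE's exact computation.

```python
def total_interest(weights, interests):
    """Weighted average of per-attribute interest values (each in [0, 1])."""
    total_weight = sum(weights[name] for name in interests)
    if total_weight == 0:
        return 0.0
    weighted = sum(weights[name] * interests[name] for name in interests)
    return weighted / total_weight


# Hypothetical weights and per-attribute interests for one fcst/obs pair.
weights = {"centroid_dist": 2.0, "boundary_dist": 4.0, "area_ratio": 1.0}
interests = {"centroid_dist": 0.5, "boundary_dist": 1.0, "area_ratio": 0.5}

score = total_interest(weights, interests)   # (1.0 + 4.0 + 0.5) / 7.0
is_match = score >= 0.7                      # total_interest_thresh = 0.7

# Scaling every weight by the same factor leaves the score unchanged.
scaled = {name: 10 * w for name, w in weights.items()}
scaled_score = total_interest(scaled, interests)
```

Under this assumption only the relative sizes of the weights matter, so a weight described as 24% of the total would not have to be entered as 24.0.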


> *Outputs:*
>
>    - PostScript file: is there a rule for how the clusters are numbered? I
>    thought the first cluster would be the one with the highest total
>    interest, but sometimes that is not the case.
>

*The clusters are simply numbered from one corner of the grid to the
other.  The cluster number does not tell you anything about how good the
match is.*


>    - CTS ASCII file: this file has one line for the raw fields and one for
>    the objects, but both show the same total number of matched pairs, and I
>    don't understand why.
>
>
*CTS stands for contingency table statistics.  These are the statistics you
get when you pick a threshold and do a categorical verification of gridded
data.  In met-6.1 and later, this file contains two output lines, one for
"RAW" and a second for "OBJECT".  The number of matched pairs in the
TOTAL column should be the same for both: it's the number of points in the
domain.  The RAW line contains the stats you get by doing a traditional
categorical verification of the raw input files.  The OBJECT line contains
the stats you get by doing a traditional verification of the resolved
object field.  But it's still computed grid point by grid point, not
object by object.*
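
A small Python sketch can show why the two totals agree. It counts a 2x2 contingency table point by point, once for hypothetical raw fields and once for hypothetical object masks (1 inside an object, 0 outside). The function name, the sample grids, and the dictionary keys are assumptions for illustration; this is not how MODE writes its CTS file.

```python
def contingency_counts(fcst, obs, thresh):
    """Count the four contingency table cells grid point by grid point."""
    counts = {"FY_OY": 0, "FY_ON": 0, "FN_OY": 0, "FN_ON": 0}
    for f_row, o_row in zip(fcst, obs):
        for f, o in zip(f_row, o_row):
            f_key = "FY" if f >= thresh else "FN"
            o_key = "OY" if o >= thresh else "ON"
            counts[f_key + "_" + o_key] += 1
    # TOTAL is the number of grid points, whatever the field contains.
    counts["TOTAL"] = sum(counts.values())
    return counts


# Hypothetical raw fields and the object masks derived from them.
raw_fcst = [[0.0, 2.0, 6.0], [0.0, 5.5, 7.0]]
raw_obs = [[0.0, 0.0, 5.0], [1.0, 6.0, 8.0]]
obj_fcst = [[0, 1, 1], [0, 1, 1]]
obj_obs = [[0, 0, 1], [0, 1, 1]]

raw = contingency_counts(raw_fcst, raw_obs, thresh=5.0)
obj = contingency_counts(obj_fcst, obj_obs, thresh=1)
```

Both calls visit the same six grid points, so both totals are 6 even though the individual cell counts differ.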


> I've sent the data, the configuration file, and the output as an example.
>
> Thanks for the help,
>
> Natalí
>
>

------------------------------------------------
Subject: MODE
From: natali aranda
Time: Wed Jan 30 14:12:50 2019

Dear John

thanks for the email,
It helped, but I still have questions about max_centroid_dist:

In the configuration MODE_R2_T3_9, I regrid to the observation field,
where the spatial resolution of IMERG (my observation field) is approximately
10 km.
I use max_centroid_dist = 10 (I understand that max_centroid_dist = 10
corresponds to about 100 km). I also use total_interest_thresh = 0.7.

In the PostScript file mode_300000L_20180209_120000V_240000A.ps I see that
the first cluster has a centroid distance of 113.81. Does this mean that the
distance between the centroids of the forecast object and the
observation object is 1138 km? How is that possible if I set the maximum
centroid distance to 100 km? Also, all clusters have a total interest lower
than 0.7, so I don't understand how MODE matches them when
total_interest_thresh is 0.7.

I would also like to ask about merge_thresh and merge_flag.
I ran three configurations in which I changed only merge_thresh
and merge_flag:

  1) mode_300000L_20180224_120000V_240000A.ps_1
     merge_thresh = >=1.5
     merge_flag = THRESH

  2) mode_300000L_20180224_120000V_240000A.ps_2
     merge_thresh = >=3
     merge_flag = THRESH

  3) mode_300000L_20180224_120000V_240000A.ps_3
     merge_thresh = >=1.5
     merge_flag = NONE


When I compare the PostScript outputs, 2) and 3) contain the same information,
even though the merge settings are different. When I compare 1) with 2) and
1) with 3), the only difference is the number of clusters; the MMI values are
the same. I don't see the advantage of changing these parameters.

The files are in aranda_data.


Thanks for the help,

Natalí


------------------------------------------------
Subject: MODE
From: John Halley Gotway
Time: Thu Jan 31 09:47:22 2019

Natali,

Thanks for sending sample PostScript output to illustrate your questions.

Looks like your first question is this...

(1) If max_centroid_dist = 10, how can you get matching cluster objects
with a centroid distance > 100?

The situation is actually more confusing than you're stating.  The units
for max_centroid_dist and centroid distance are the same: both are in
grid units, not km.

I'm looking at mode_300000L_20180209_120000V_240000A.ps, and on page 5 I see
that cluster pair #1 (colored red) has a centroid distance of 113.81.  Let's
step back and clarify how these clusters are created.  Your configuration
resulted in 92 simple forecast objects and 75 simple observation objects.
That means that MODE has 92 x 75 = 6900 object comparisons to do, which
must take a very long time!  The purpose of max_centroid_dist is to limit
that number of comparisons: if the centroids are too far apart, MODE doesn't
waste time comparing those objects.  Personally, I think 10 is much too
small.  Instead, I'd set it much larger, something like 0.5*max(Nx, Ny),
where Nx and Ny are the grid dimensions.

So how did the red cluster get created?  Looking at page 6, you can see the
result of the double threshold merging in the forecast field, which groups
many, many forecast objects together.  But on page 7, you can see
that the red observation object does NOT get merged with any other nearby
objects.  So why do they match?  Looking at page 1, on the top right side,
you see that simple objects F91 and O69 have an interest of 0.9901, so those
match.  Whenever one member of a group (F91) matches an object from the other
field (O69), the entire groups are matched.  That's how we end up with a
large forecast cluster matching a very small observation cluster.

The centroid distance is RECOMPUTED for that cluster, and the centroids
of the clusters are pretty far apart (113.81).
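
The grouping logic described above can be sketched in Python with a small union-find structure. The object areas and centroids are invented, the F90 object is hypothetical, and the area-weighted centroid is an assumption about how a cluster centroid could be computed; the point is only that one close simple match can produce a cluster pair whose centroids are far apart.

```python
import math


def find(parent, x):
    """Return the representative of the group containing x."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]   # path halving
        x = parent[x]
    return x


def union(parent, a, b):
    """Put a and b in the same group."""
    parent[find(parent, a)] = find(parent, b)


# Hypothetical simple objects: id -> (area, centroid_x, centroid_y).
objects = {
    "F90": (400, 100.0, 100.0),
    "F91": (100, 200.0, 100.0),
    "O69": (50, 210.0, 100.0),
}
merged = [("F90", "F91")]    # grouped by double threshold merging
matched = [("F91", "O69")]   # interest above total_interest_thresh

parent = {name: name for name in objects}
for a, b in merged + matched:
    union(parent, a, b)


def cluster_centroid(prefix, member):
    """Area-weighted centroid of one field's objects in member's cluster."""
    root = find(parent, member)
    names = [n for n in objects
             if n.startswith(prefix) and find(parent, n) == root]
    area = sum(objects[n][0] for n in names)
    cx = sum(objects[n][0] * objects[n][1] for n in names) / area
    cy = sum(objects[n][0] * objects[n][2] for n in names) / area
    return cx, cy


fx, fy = cluster_centroid("F", "F91")
ox, oy = cluster_centroid("O", "F91")
simple_dist = math.hypot(objects["F91"][1] - objects["O69"][1],
                         objects["F91"][2] - objects["O69"][2])
cluster_dist = math.hypot(fx - ox, fy - oy)
```

In this made-up case the matched simple objects are 10 grid squares apart, but the recomputed cluster centroids are 90 grid squares apart because the large merged forecast object pulls the forecast centroid away.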

(2) Next question: why don't merge_thresh = >=1.5 and merge_thresh = >=3.0
result in different output?

Open up the files mode_300000L_20180224_120000V_240000A.ps_1 and
mode_300000L_20180224_120000V_240000A.ps_2 and look at pages 6 and 7.  The
merge_thresh defines the lower threshold used to group simple objects
together.  As you can see, pages 6 and 7 do look different between the two
files, so the merge threshold is being applied correctly, as you requested in
the config file.
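
For intuition, here is a one-dimensional Python sketch of double threshold merging along a single grid row. It assumes that simple objects are contiguous runs at or above conv_thresh and that simple objects lying inside the same run at or above the lower merge_thresh are grouped together. The row values, the conv_thresh of 5, and the function names are hypothetical, and MODE itself works on two-dimensional fields.

```python
def runs(values, thresh):
    """Return (start, end) index pairs of contiguous runs with value >= thresh."""
    found, start = [], None
    for i, v in enumerate(values):
        if v >= thresh and start is None:
            start = i
        elif v < thresh and start is not None:
            found.append((start, i - 1))
            start = None
    if start is not None:
        found.append((start, len(values) - 1))
    return found


def merge_groups(values, conv_thresh, merge_thresh):
    """Group simple objects that share one run of the lower threshold."""
    simple = runs(values, conv_thresh)
    groups = []
    for lo, hi in runs(values, merge_thresh):
        inside = [s for s in simple if lo <= s[0] and s[1] <= hi]
        if inside:
            groups.append(inside)
    return groups


row = [0, 4, 6, 2, 7, 0, 0, 5, 1, 6]   # smoothed values along one grid row

loose = merge_groups(row, conv_thresh=5, merge_thresh=1.5)
tight = merge_groups(row, conv_thresh=5, merge_thresh=3)
```

With the lower merge threshold of 1.5 the first two simple objects fall into one group, giving three groups; raising it to 3 leaves all four simple objects separate. Whether such a change alters the final matches depends on the case.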

It's just that this setting ended up having no impact on the resulting object
matches for this particular case.  Not every configuration option will have
a noticeable impact on the results for every run.

Looking at the example you sent, I see a lot of very small objects.  I'd
suggest making the objects much smoother to make MODE run faster and make
the output easier to interpret:

- Set max_centroid_dist very, very large (>1000000) so that it places no
limit on the object comparisons made.
- Set the convolution radius much higher to make the objects smoother
(conv_radius = 10).
- Set the convolution threshold a little lower since you'll be doing more
smoothing (conv_thresh = >=1).
- Turn off threshold merging for now to make the logic simpler:
merge_flag = NONE
- Set mask_missing_flag = BOTH so that the areas of missing data in the
forecast field (bottom and top-right corner of the domain) are also applied
to the observation field.

Once you get MODE running on a smaller number of smoother objects and
understand how it's working, you could try reconfiguring to look at
smaller, more detailed objects.

Hope that helps.

Thanks,
John

On Wed, Jan 30, 2019 at 2:13 PM natali aranda via RT
<met_help at ucar.edu>
wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=87940 >
>
> Dear John
>
> thanks for the email,
> It helped me but I still have questions about the max_centroid_dist:
>
> In the configuration  MODE_R2_T3_9 , I regrid to observation field
where
> the spatial resolution for the imerg (my observation field) is
approx 10
> km.
> I use max_centroid_dist: 10. ( I understand that the max
_centroid_dist =
> 10 is like 100 km) Also I use total_interest_thresh=0.7
>
> In the postscript file mode_300000L_20180209_120000V_240000A.ps I
see  that
> the first cluster has a centroid distance = 113, 81; this mean the
the
> distance between the centroids of the object in the forecast and in
the
> observations field is 1138 km?? and how is possible if I put max
centroid
> dist = 100 km? Also, all clusters have total interest lower than 0.7
so I
> don't understand how the MODE do the match if I select the
> total_interest_thresh in 0.7.
>
> Also, I would like to ask about the merge_thresh and merge_flag:
> I used three different configurations where I only changed the
merge_thresh
> and merge flag:
>
>   1) mode_300000L_20180224_120000V_240000A.ps_1
>       merge_thresh= >=1.5
>      merge_flag = THRESH
>
>   2) mode_300000L_20180224_120000V_240000A.ps_2
>      merge_thresh= >=3
>      merge_flag = THRESH
>
>   3) mode_300000L_20180224_120000V_240000A.ps_3
>      merge_thresh= >=1.5
>      merge_flag = NONE
>
>
> If I compare the postscripts outputs, 2) and 3) have the same
information!
> even if the merge characteristic are different. If I compare 1) with
2) and
> 1) with 3), the only difference is the number of clusters and the
MMI are
> the same. I don' t see the advantage to change these parameters.
>
> In aranda_data are the files,
>
>
> Thanks for the help,
>
> Natalí
>
> El jue., 29 nov. 2018 a las 16:34, John Halley Gotway via RT (<
> met_help at ucar.edu>) escribió:
>
> > Natali,
> >
> > I see you have several questions about MODE.  I'll answer inline
after
> each
> > question in red below.
> >
> > Thanks,
> > John
> >
> > On Wed, Nov 28, 2018 at 3:29 PM natali aranda via RT
<met_help at ucar.edu>
> > wrote:
> >
> > >
> > > Wed Nov 28 15:29:03 2018: Request 87940 was acted upon.
> > > Transaction: Ticket created by natali.g.aranda at gmail.com
> > >        Queue: met_help
> > >      Subject: MODE
> > >        Owner: Nobody
> > >   Requestors: natali.g.aranda at gmail.com
> > >       Status: new
> > >  Ticket <URL:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=87940
> >
> > >
> > >
> > > Dear John,
> > >
> > > I 'm working on MODE and I have many questions about the
configure and
> > > output:
> > >
> > > *Configure mode:*
> > >
> > >    - In *Verification grid*, the vld_thresh is a value between 0
and 1
> > that
> > >    associated with the valid data in the regrid. I understood is
a
> value
> > > that
> > >    I could choose and is better if this value is close to 1.
Also this
> > >    parameter appear in the *Forecast and observation fields to
be
> > verified*
> > >    section. Are the same? I put the same number for vld_thresh
in
> > > verification
> > >    grid and forecast and observation fields.
> > >
> >
> > *Before defining objects, MODE first smooths the data by
convolving it.
> > This step is controlled by the "conv_radius" setting.  The larger
the
> > radius, the more smoothing.  Or setting the radius to 0 means no
> > smoothing.  The "vld_thresh" setting inside the "fcst" and "obs"
> > dictionaries is used during this smoothing step.  It controls what
to do
> > with missing data inside convolution area... like when the grid
point is
> > near the edge of the domain.  Here's a selection from the MET
User's
> Guide
> > which describes this:*
> >
> >
> >
> >
> >
> >
> >
> >
> > *//    - The "vld_thresh" entry specifies a number between 0 and
1.
> > When//      performing interpolation over some neighborhood of
points the
> > ratio of//      the number of valid data points to the total
number of
> > points in the//      neighborhood is computed. If that ratio is
less than
> > this threshold,//      the matched pair is discarded. Setting this
> > threshold to 1, which is the//      default, requires that the
entire
> > neighborhood must contain valid data.//      This variable will
typically
> > come into play only along the boundaries of//      the
verification
> region
> > chosen.*
> >
> >
> > >
> > >    -  *Forecast and observation fields to be verified:* I was
changing
> > the
> > >    variables to understand the results but I have questions whit
some
> of
> > > them:
> > >    - *area_thresh*: I read the MET User's guide and I found that
this
> is
> > a
> > >       threshold to define the area of the object and its units
are in
> > grid
> > >       squares. Is it like a *conv_radius*?  For conv_radius=2
(20 km),
> I
> > >       changed area_thresh=NA to >=4. I opened the Postscript
file and I
> > > didn't
> > >       see changes in the number of clusters but the NMI value is
> changed.
> > >       - what's is the difference between *inten_perc_value* and
i
> > >       *nten_perc_thresh*?
> > >
> >
> > *MODE defines individual objects in each field.  Each of those objects
> > has several attributes.  One of those attributes is the object area,
> > defined as the number of grid squares which comprise that object.  If
> > you look at the "*_obj.txt" files, you'll see that one of the columns is
> > named "AREA".  The "area_thresh" option applies to the object area
> > values which appear in that column.  After defining objects, MODE
> > applies the area_thresh to the objects and throws away any objects that
> > don't meet it.  We typically use this to discard very small objects.*
> >
> > *Similarly, the "inten_perc_value" and "inten_perc_thresh" are used to
> > filter objects and discard ones that don't meet the specified criteria.
> > The "inten_perc_value" defines the percentile of interest for the values
> > inside each object.  For example, 50 means the median and 90 means the
> > 90th percentile.  For each object, MODE computes the requested
> > percentile and then applies the "inten_perc_thresh" threshold to it.  Any
> > objects which don't meet the threshold are thrown away.*
> >
> > *Here's a selection from the version 7.0 MET User's Guide which
> > describes this:*
> >
> > *//    - The "area_thresh" entry specifies a threshold in grid squares for the*
> > *//      area of MODE objects.  Any objects not meeting this threshold are*
> > *//      discarded.*
> > *//*
> > *//    - The "inten_perc_value" entry specifies the intensity percentile value*
> > *//      of interest between 0 and 100.  The percentile set by this entry will*
> > *//      be output in addition to the standard intensity percentiles.*
> > *//*
> > *//    - The "inten_perc_thresh" entry specifies a threshold for the percentile*
> > *//      intensity of each MODE object.  Any objects not meeting this threshold*
> > *//      are discarded.*
> >
> >
> > *However, please be aware that these object filtering options are
> > replaced in met-8.0 by the "filter_attr_name" and "filter_attr_thresh"
> > options.  For example, here's how you'd keep only objects larger than
> > 100 grid squares and discard the rest...*
> >
> >
> > *In met-7.0:*
> >
> > *area_thresh = gt100;*
> >
> >
> > *In met-8.0:*
> >
> > *filter_attr_name   = [ "AREA" ];*
> >
> > *filter_attr_thresh = [ gt100 ];*
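
To make the two filters concrete, here is a rough Python sketch of the logic, not MODE's implementation. It assumes each object is just a list of the grid-point values inside it, uses a nearest-rank percentile (MET may compute percentiles differently), and the names percentile and filter_objects are hypothetical.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (an assumed definition)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def filter_objects(objects, min_area, inten_perc_value, min_intensity):
    """Keep objects with area > min_area whose requested intensity
    percentile is >= min_intensity; return the kept object names."""
    kept = []
    for name, values in objects.items():
        area = len(values)  # object area = number of grid squares
        if area <= min_area:
            continue  # fails the area filter (like area_thresh = gt100)
        if percentile(values, inten_perc_value) < min_intensity:
            continue  # fails the intensity percentile filter
        kept.append(name)
    return kept


objects = {
    "A": [5.0] * 150,                # large but weak
    "B": [20.0] * 30,                # intense but small
    "C": [1.0] * 100 + [12.0] * 100  # large with an intense core
}
kept_area_only = filter_objects(objects, 100, 90, float("-inf"))
kept_both = filter_objects(objects, 100, 90, 10.0)
```

Here the area filter alone drops the small object B, and adding a 90th-percentile intensity threshold of 10 also drops the weak object A.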
> >
> >
> > >    - *max_centroid_dist*: I understand that it refers to the maximum
> > >    distance between centroids to define an object but I don't know if
> > >    that distance is in the same field or if that parameter compares
> > >    the objects' distance between the forecast field and the
> > >    observation field to define a cluster.
> > >
> > >
> > *Let's say there are 10 forecast objects and 15 observation objects.
> > MODE inter-compares all possible combinations of those objects: 10 x 15
> > = 150.  But doing all those comparisons can be slow.  When objects are
> > far apart they are not very likely to match each other.  The
> > "max_centroid_dist" entry defines how close the forecast and observation
> > centroids need to be in order for their attributes to be compared.  This
> > is just a way to make MODE run a little more efficiently.  Here's a
> > selection from the MET User's Guide which describes this:*
> >
> >
> > *//*
> > *// The "max_centroid_dist" entry specifies the maximum allowable distance in*
> > *// grid squares between the centroids of objects for them to be compared.*
> > *// Setting this to a reasonable value speeds up the runtime enabling MODE to*
> > *// skip unreasonable object comparisons.*
> > *//*
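
The pruning idea can be sketched in a few lines of Python. This is only an illustration under simple assumptions: centroids are (x, y) pairs in grid units, distance is Euclidean, and the function name candidate_pairs is hypothetical.

```python
import math


def candidate_pairs(fcst_centroids, obs_centroids, max_centroid_dist):
    """Return (fcst_index, obs_index) pairs whose centroids are close
    enough, in grid squares, for their attributes to be compared."""
    pairs = []
    for i, (fx, fy) in enumerate(fcst_centroids):
        for j, (ox, oy) in enumerate(obs_centroids):
            if math.hypot(fx - ox, fy - oy) <= max_centroid_dist:
                pairs.append((i, j))
    return pairs


fcst = [(10.0, 10.0), (200.0, 50.0)]
obs = [(12.0, 14.0), (400.0, 400.0), (190.0, 55.0)]
pairs = candidate_pairs(fcst, obs, 50.0)
```

Of the 2 x 3 = 6 possible forecast/observation pairs, only the two with nearby centroids survive, so the more expensive attribute comparisons run on far fewer pairs.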
> >
> >
> > >
> > >    - In the *Fuzzy engine weights* I didn't change anything, I use the
> > >    same configuration as in the default configure file because I don't
> > >    understand how the weights work. I read a paper that you wrote, The
> > >    Method for Object-Based Diagnostic Evaluation (MODE) Applied to
> > >    Numerical Forecasts from the 2005 NSSL/SPC Spring Program: in
> > >    appendix A, it talks about the weights; for centroid distance
> > >    separation, the weight is 24% so if I put this information in the
> > >    configure file, must I put centroid_dist = 24.0?
> > >
> > >
> > *I'd recommend using the default weights at first.  Once you run MODE on
> > several cases, you can look at the output and decide if you like the way
> > MODE is matching objects between the fcst and obs or not.  Once you get
> > a better sense of how it's working, you could tweak the weights to
> > affect which objects get matched.*
> >
> >
> > >
> > >    - Is *total_interest_thresh* the *total interest value* that appears
> > >    in the MET User's guide? So the *total_interest_thresh* will depend
> > >    on the weights given to the attributes, is that correct?
> > >
> > >
> > *For each of the fcst/obs object comparisons, MODE computes a total
> > interest value between 0 and 1.  The higher that number, the more similar
> > the objects are.  The "total_interest_thresh" entry defines the
> > threshold for that total interest value, i.e. how large it must be for
> > objects to be considered matches.*
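
As a rough illustration of how weights and the threshold interact, the Python sketch below assumes the total interest is a weighted average of per-attribute interest values, normalized by the sum of the weights. It leaves out the confidence terms MODE also applies, and the weights, interest values, and 0.7 threshold are made-up numbers, not necessarily the defaults.

```python
def total_interest(weights, interests):
    """Weighted average of per-attribute interest values in [0, 1].

    Assumption: the weighted sum is divided by the sum of the weights,
    so only the relative size of the weights matters.
    """
    weight_sum = sum(weights.values())
    return sum(weights[name] * interests[name] for name in weights) / weight_sum


# Hypothetical weights and interest values for one fcst/obs object pair.
weights = {"centroid_dist": 2.0, "boundary_dist": 4.0, "area_ratio": 1.0}
interests = {"centroid_dist": 0.9, "boundary_dist": 0.5, "area_ratio": 1.0}

score = total_interest(weights, interests)

# Multiplying every weight by the same factor leaves the score unchanged.
scaled = {name: 12.0 * w for name, w in weights.items()}
scaled_score = total_interest(scaled, interests)

total_interest_thresh = 0.7  # illustrative threshold
is_match = score >= total_interest_thresh
```

Under that assumption a weight expressed as a percentage and the same weight expressed as a ratio give the same result, as long as all the weights are scaled consistently.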
> >
> >
> > > *Outputs:*
> > >
> > >    - Postscript file: is there any rule to list the clusters? I thought
> > >    that the first cluster would be the cluster with the biggest total
> > >    intr, but sometimes this doesn't happen.
> > >
> >
> > *The numbering of the clusters is just done from one corner of the grid
> > to another.  The cluster numbering does not give any information about
> > how good of a match it is.*
> >
> >
> > >    - CTS ascii file: in this file I have the information for the raw
> > >    fields and for the objects but I have the same total number of
> > >    matched pairs and I don't understand why.
> > >
> > >
> > *CTS stands for contingency table statistics.  These are the statistics
> > you get when you pick a threshold and do a categorical verification of
> > gridded data.  In met-6.1 and later, this file contains two output
> > lines, one for "RAW" and a second for "OBJECT".  The number of matched
> > pairs in the TOTAL column should be the same for both, since it's the
> > number of points in the domain.  The RAW line contains the stats you get
> > by doing a traditional categorical verification of the raw input files.
> > The OBJECT line contains the stats you get by doing a traditional
> > verification of the resolved object field.  But it's still computed grid
> > point by grid point... not object by object.*
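
A small Python sketch of why TOTAL matches on the RAW and OBJECT lines. The 2 x 3 fields, the threshold of 5, and the object masks are made-up illustration data, and contingency_counts is a hypothetical helper, not a MET function.

```python
def contingency_counts(fcst_events, obs_events):
    """Count hits, false alarms, misses and correct negatives, comparing
    two boolean grids of the same shape point by point."""
    counts = {"hits": 0, "false_alarms": 0, "misses": 0, "correct_negatives": 0}
    for f_row, o_row in zip(fcst_events, obs_events):
        for f, o in zip(f_row, o_row):
            if f and o:
                counts["hits"] += 1
            elif f:
                counts["false_alarms"] += 1
            elif o:
                counts["misses"] += 1
            else:
                counts["correct_negatives"] += 1
    counts["total"] = sum(counts.values())  # every grid point is counted once
    return counts


# RAW: threshold the raw fields directly (event = value >= 5).
raw_fcst = [[0, 6, 7], [0, 5, 0]]
raw_obs = [[0, 0, 8], [6, 0, 0]]
raw = contingency_counts(
    [[v >= 5 for v in row] for row in raw_fcst],
    [[v >= 5 for v in row] for row in raw_obs],
)

# OBJECT: event = grid point lies inside a resolved object (hypothetical masks).
fcst_obj = [[False, True, True], [False, False, False]]
obs_obj = [[False, False, True], [False, False, False]]
obj = contingency_counts(fcst_obj, obs_obj)
```

The individual counts differ between the two lines, but both totals equal the number of grid points, because each grid point falls into exactly one cell of the table in either case.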
> >
> >
> > > I've sent the data with the configure file and the output as an
> > > example.
> > >
> > > Thanks for the help,
> > >
> > > Natalí
> > >
> > >
> >
> >
>
>
