[Met_help] [rt.rap.ucar.edu #96084] History for ensemble_stat

Fri Aug 7 11:14:53 MDT 2020

----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Hello,

I am working with MET_v8.1. 
I am trying to calculate the spread/skill variance and RMSE using ensemble_stat with GRIB1 files produced by UPP and reanalysis data with a specific format attribute:

It works well for a single variable (here, HGT) at a single level (such as 500 hPa). With multiple levels, the results become unreasonable.
The command line is: ensemble_stat 21 ens_le_1 ... ens_le_21 config_file -grid_obs levP250.nc -grid_obs levP500.nc -grid_obs levP700.nc

Following is a snippet of settings for verification in the config_file 
##################################################
//
// Forecast and observation fields to be verified
//
fcst = {
   field = [
      {
         name               = "HGT";
         level              = [ "P250","P500","P750" ];
         cat_thresh = [ NA ];
         ens_ssvar_bin_size = 100;
         ens_phist_bin_size = 0.05;
      }
   ];
}
obs = fcst;
###################################################
The RMSE values for 500 hPa and 700 hPa seem unreasonable. It seems that only the first observation file (-grid_obs levP250.nc) is used for verification at all three levels. Is there any way to fix this problem, other than running the tool three times separately?

Looking forward to your reply.

Thanks in advance,
Jiao B. F.
The Institute of Atmospheric Physics, Chinese Academy of Sciences

----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: ensemble_stat
From: John Halley Gotway
Time: Mon Aug 03 11:07:45 2020

Jiao,

I see that you've found a potential problem with the Ensemble-Stat tool.
When you specify multiple -grid_obs arguments, you suspect that all 3
forecast pressure levels are being verified against the first input
pressure level.

I'm looking at line 1570 of the ensemble_stat.cc source code:
https://github.com/NCAR/MET/blob/55a83e91a550807226e82224b02230f7422d75d0/met/src/tools/core/ensemble_stat/ensemble_stat.cc#L1570

I see that it's correctly looping through the verifying gridded
observation input files (set by -grid_obs). It keeps reading data from
them until it finds a match. I wish we had a log message written at line
1570 to tell us exactly which file was used:
https://github.com/NCAR/MET/blob/55a83e91a550807226e82224b02230f7422d75d0/met/src/tools/core/ensemble_stat/ensemble_stat.cc#L1578
We do have an upcoming release. I'll add that log message so this will
be clearer in future releases.

It sounds like the problem is that all of the fields to be verified
(HGT at P250, P500, and P700) "match" with data in the first -grid_obs
input file (levP250.nc).
I'm guessing that if you reversed the order of the arguments, putting
P700 first (-grid_obs levP700.nc -grid_obs levP500.nc -grid_obs
levP250.nc), the P700 RMSE value would be reasonable, and the others
not.
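
To illustrate why the order of the -grid_obs arguments would matter, here
is a minimal Python sketch of a first-match lookup. It is not the MET
source code: the function name find_first_match and the data layout are
hypothetical, and it assumes, as suspected above, that the files are
searched in command-line order and the first one holding a field with
the requested name is used.

```python
def find_first_match(obs_files, requested_name):
    """Return the first file, in command-line order, that holds requested_name.

    obs_files is a list of (file_name, set_of_variable_names) pairs.
    This only mimics the suspected behavior; it is not MET's implementation.
    """
    for file_name, variable_names in obs_files:
        if requested_name in variable_names:
            return file_name
    return None


# Hypothetical case 1: every file stores its data under the same name.
same_names = [
    ("levP250.nc", {"HGT"}),
    ("levP500.nc", {"HGT"}),
    ("levP700.nc", {"HGT"}),
]
# Three lookups (one per pressure level) all land on the first file.
same_result = [find_first_match(same_names, "HGT") for _ in range(3)]

# Hypothetical case 2: each file uses a distinct variable name.
unique_names = [
    ("levP250.nc", {"HGT_P250"}),
    ("levP500.nc", {"HGT_P500"}),
    ("levP700.nc", {"HGT_P700"}),
]
unique_result = [
    find_first_match(unique_names, name)
    for name in ("HGT_P250", "HGT_P500", "HGT_P700")
]

print(same_result)
print(unique_result)
```

With identical names, reversing the list simply changes which single
file is picked for every level, which is the behavior guessed at above.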

I see that you list P700 on the command line but P750 in the config
file. I'll assume you intend to use P700 throughout.

Frankly, I'm surprised that you're getting any matches at all here.
You're reading GRIB ensemble forecast files and verifying against NetCDF
observations. You typically need to specify the configuration files
differently for those. Please run "ncdump -h" on each of those NetCDF
files:
    ncdump -h levP250.nc
And note the name of the output variable. I'm going to guess that
they're named HGT_P250, HGT_P500, and HGT_P700, respectively. If so,
then please try the exact configuration listed below. If not, then
substitute in the actual names:

cat_thresh = [ NA ];
ens_ssvar_bin_size = 100;
ens_phist_bin_size = 0.05;

fcst = {
   field = [
      { name = "HGT"; level = [ "P250","P500","P700" ]; }
   ];
}
obs = {
   field = [
      { name = "HGT_P250"; level="(*,*)"; },
      { name = "HGT_P500"; level="(*,*)"; },
      { name = "HGT_P700"; level="(*,*)"; }
   ];
}
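
If the actual variable names differ from the guesses above, the obs
block can be regenerated rather than edited by hand. The following is a
small sketch only: build_obs_block is a hypothetical helper, not part of
MET, and it assumes each NetCDF variable is a 2D field selected with
level="(*,*)" as in the configuration above.

```python
def build_obs_block(variable_names, level="(*,*)"):
    """Build the text of an obs dictionary with one field entry per name.

    The layout copies the configuration shown above; the names passed in
    must be the ones actually reported by "ncdump -h".
    """
    entries = [
        f'      {{ name = "{name}"; level="{level}"; }}'
        for name in variable_names
    ]
    lines = ["obs = {", "   field = ["]
    lines.append(",\n".join(entries))
    lines += ["   ];", "}"]
    return "\n".join(lines)


# Names guessed in this thread; substitute the real ones.
obs_text = build_obs_block(["HGT_P250", "HGT_P500", "HGT_P700"])
print(obs_text)
```

The order of the names should follow the order of the forecast levels,
since the entries are paired with P250, P500, and P700 in turn.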

If you find that all three NetCDF files have the same variable name,
then that's a problem. I think we'll need different variable names for
Ensemble-Stat to be able to distinguish between them.
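
One way to check for shared variable names across several files is to
scan the text printed by "ncdump -h". The sketch below is a heuristic
under stated assumptions: the helper names and the sample header are
hypothetical, only variables declared with two or more dimensions are
treated as data fields (so coordinate variables such as lat and lon are
ignored), and the header text is assumed to have been captured
beforehand, for example by redirecting the ncdump output to a file.

```python
import re
from collections import defaultdict

# Matches a CDL variable declaration such as "float HGT(lat, lon) ;".
VAR_DECL = re.compile(
    r"^\s*(?:byte|ubyte|char|short|ushort|int|uint|int64|uint64|"
    r"float|double|string)\s+(\w+)\s*\(([^)]*)\)\s*;"
)


def gridded_variable_names(header_text):
    """Return names of variables declared with at least two dimensions."""
    names = set()
    for line in header_text.splitlines():
        match = VAR_DECL.match(line)
        if match and len(match.group(2).split(",")) >= 2:
            names.add(match.group(1))
    return names


def shared_names(headers):
    """Map each variable name found in more than one file to those files.

    headers maps a file name to the text of its "ncdump -h" output.
    """
    seen = defaultdict(list)
    for file_name, text in headers.items():
        for name in gridded_variable_names(text):
            seen[name].append(file_name)
    return {n: sorted(f) for n, f in seen.items() if len(f) > 1}


def sample_header(stem, var_name):
    """Build a made-up header resembling "ncdump -h" output, for the demo."""
    return (
        "netcdf %s {\n"
        "dimensions:\n\tlat = 181 ;\n\tlon = 360 ;\n"
        "variables:\n\tfloat lat(lat) ;\n\tfloat lon(lon) ;\n"
        "\tfloat %s(lat, lon) ;\n\t\t%s:units = \"gpm\" ;\n}\n"
    ) % (stem, var_name, var_name)


clashing = {
    "levP250.nc": sample_header("levP250", "HGT"),
    "levP500.nc": sample_header("levP500", "HGT"),
}
print(shared_names(clashing))
```

An empty result means every data variable name is unique across the
files that were checked.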

If these things don't enable you to figure out what's going on, the
next step would be sending me some sample data files to replicate this
behavior.

I'd really only need data for 2 ensemble members (ens_le_1 and
ens_le_2), 2 of the verifying gridded observations (levP250.nc and
levP500.nc), and your ensemble-stat configuration file. You can post
them to our anonymous ftp site following these instructions:
https://dtcenter.org/community-code/model-evaluation-tools-met/met-help-desk#ftp

Thanks,
John Halley Gotway


------------------------------------------------
Subject: ensemble_stat
From: 焦宝峰
Time: Tue Aug 04 20:51:56 2020

Hi,

Thank you for your prompt and detailed reply.
It is true that the variable names in these NetCDF files are the
same. The problem has been fixed after changing the variable names.

Thanks,
Jiao B. F.

------------------------------------------------