[Met_help] [rt.rap.ucar.edu #97670] History for Questions about python embedding for a netcdf interface

Mon Dec 14 16:40:26 MST 2020

----------------------------------------------------------------
  Initial Request
----------------------------------------------------------------

Hi MET help,

I am with NASA’s GMAO and we are interested in using METplus to augment the verification of the GEOS model. However, due to many constraints, our output files use the netcdf format with multiple valid times and levels, which is unlikely to change. Some of these 4D output collections are large, so splitting them into single time slices or subsets of levels just for verification is not ideal. I have therefore been looking at the python embedding examples (https://dtcenter.github.io/METplus/develop/generated/met_tool_wrapper/) and trying to brainstorm ways to mimic the grib interface for our use. Specifically, I’d like to understand the best approach to looping through multiple INIT_TIME/VALID_TIME values and levels while allowing my python scripts to find the matching time and level and return the appropriate 2D “met_data” xarray object.

1) Question 1:

I can envision a scenario writing a config file similar to:
FCST_VAR1_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P500
where it would be relatively easy to select a single 2D array out of this dataset given the provided filename, variable, time, and level. However, what is unclear to me is what to do when I want to verify on multiple levels?

I’ve seen some examples similar to:
FCST_VAR1_LEVELS = "(0,*,*),(2,*,*),(4,*,*)…"
However, this seems to directly conflict with the required 2D xarray object returned by FCST_VAR<n>_NAME. How do these two config fields interact when using python embedding? Can FCST_VAR<n>_NAME return a 3D array for a given time slice while FCST_VAR<n>_LEVELS is used for level indices?

Alternatively, if I had FCST_VAR1_LEVELS = P1000,P850,P700,P500,P250,P100, does this produce any internal keyword/variable such as {cur_level} that I could reference in the argument list for FCST_VAR<n>_NAME? In this case I’d prefer to use the more straightforward grib level/accumulation syntax, which could easily be used to look up netcdf levels in a python script.

2) Question 2:

For python embedding, is there a list of acceptable key-value attrs (the metadata dictionary) for different grids? Do these correspond to those supported by xarray?

If you could provide any help or insight on these questions, it would be very helpful in thinking through a framework for our verification.

Thanks,
Scott

----------------------------------------------------------------
  Complete Ticket History
----------------------------------------------------------------

Subject: Questions about python embedding for a netcdf interface
From: George McCabe
Time: Mon Nov 30 14:34:02 2020

Hi Scott,

For Question 1, the MET tools only accept a 2D slab of data from python
embedding. **Some** of the wrappers do set {CURRENT_FCST_NAME},
{CURRENT_FCST_LEVEL}, {CURRENT_OBS_NAME}, and {CURRENT_OBS_LEVEL}, which
can be referenced by other METplus config variables. However, I don't
think that would work in this case. This functionality has only really
been used to set the output prefix in the MET config for each run. In
your case, you would have to reference CURRENT_FCST_LEVEL inside
FCST_VAR<n>_NAME, and I am fairly sure that this would not work as the
code is currently written. I do see this as a useful enhancement that
would make configuring this situation easier.

I will create a GitHub issue regarding this enhancement. To confirm the
behavior you are expecting, the following configuration:

FCST_VAR1_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} {CURRENT_FCST_LEVEL}
FCST_VAR1_LEVELS = P1000,P850,P700,P500,P250,P100

would result in 6 calls to your python script, one for each value in
the levels list. Is that correct?
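
As an illustration of the script side of such a call, here is a minimal
sketch of how a script like read_nc2xr.py might unpack the four
arguments that follow its path (file, variable, valid time, level). The
function name parse_args and the returned tuple layout are hypothetical
and are not taken from MET or METplus; reading the file and building the
2D "met_data" object is deliberately left out.

```python
from datetime import datetime


def parse_args(argv):
    """Split the arguments that follow the script path in FCST_VAR<n>_NAME.

    Hypothetical helper: expects [file, variable, valid_time, level].
    """
    if len(argv) != 4:
        raise ValueError("expected: <file> <variable> <valid_time> <level>")
    filename, variable, valid_str, level_str = argv
    # Same layout as the {valid?fmt=%Y%m%d_%H%M} template in the config line.
    valid_time = datetime.strptime(valid_str, "%Y%m%d_%H%M")
    # GRIB-style level such as P500: one letter for the type, then a number.
    level_type = level_str[0].upper()
    level_value = float(level_str[1:])
    return filename, variable, valid_time, level_type, level_value


# Example with literal arguments; inside the real script these would
# come from sys.argv[1:].
example = parse_args(["forecast_file.nc4", "TMP", "20201130_1200", "P500"])
```

The parsed time and level value could then be used to select the
matching 2D slice from the 4D file.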

In the meantime, to obtain the results you require, you could configure
the wrappers in this way:

FCST_VAR1_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P1000
FCST_VAR2_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P850
FCST_VAR3_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P700
FCST_VAR4_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P500
FCST_VAR5_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P250
FCST_VAR6_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P100

Each VAR<n> item will obtain a 2D field of data that will be processed.
Please let me know if this does not work as you expect.
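
Since the per-level lines differ only in the index and the level, they
could also be generated rather than typed by hand. The following is a
rough sketch under that assumption; level_config_lines is a hypothetical
helper, not part of METplus, and it only builds text to paste into a
config file.

```python
def level_config_lines(script, data_file, variable, levels,
                       time_template="{valid?fmt=%Y%m%d_%H%M}"):
    """Return one FCST_VAR<n>_NAME line per level (hypothetical helper)."""
    lines = []
    for n, level in enumerate(levels, start=1):
        # The METplus template braces are passed through untouched
        # because they arrive as ordinary string values.
        lines.append(
            f"FCST_VAR{n}_NAME = {script} {data_file} {variable} "
            f"{time_template} {level}"
        )
    return lines


config_lines = level_config_lines(
    "{INPUT_BASE}/myscripts/read_nc2xr.py",
    "{INPUT_BASE}/mydata/forecast_file.nc4",
    "TMP",
    ["P1000", "P850", "P700", "P500", "P250", "P100"],
)
print("\n".join(config_lines))
```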

For Question 2, we recently added more information on the grid
specifications for the supported grids in python embedding. The info is
not yet in the latest 'develop' version of the docs on the web, but you
can find it here:

https://github.com/dtcenter/MET/blob/develop/met/docs/Users_Guide/appendixF.rst#python-embedding-for-2d-data

Please let me know if there is any information missing from here that
you would like described in more detail. This is an evolving document!

Thanks,
George

--
George McCabe - Software Engineer III
National Center for Atmospheric Research
Research Applications Laboratory
303-497-2768
---
My working day may not be your working day. Please do not feel obliged
to reply to this email outside of your normal working hours.

------------------------------------------------
Subject: Re: [EXTERNAL] Re: [rt.rap.ucar.edu #97670] Questions about python embedding for a netcdf interface
From: Rabenhorst, Scott D. (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC]
Time: Fri Dec 04 10:16:23 2020

Hi George,

Thank you so much for answering these questions and for creating an
issue. Yes, your example was how I was envisioning the behavior. It
would be very convenient to have the script called once per level,
looping through the level list and keying off of a variable such as
{CURRENT_FCST_LEVEL}.

I am still working on testing some cases, but hopefully I will have
things working soon. While I was writing my read scripts, another
question occurred to me. I'm sure I missed this somewhere in the
documentation, but how can I pass missing/masked/fill values into
grid_stat, etc.? Unlike some operational centers, we normally set field
gridpoints to a fill_value where the current pressure level is above
the surface pressure. I presume grid_stat can handle this? When using
python embedding, is there an additional attribute I can pass in
"attrs" specifying a fill_value?

------------------------------------------------
Subject: Re: [EXTERNAL] Re: [rt.rap.ucar.edu #97670] Questions about python embedding for a netcdf interface
From: Rabenhorst, Scott D. (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC]
Time: Fri Dec 04 10:18:58 2020

Sorry, I clicked send before finishing my previous email, but I was
mostly done. Please let me know if you have any suggestions. I have
greatly appreciated your help!

Thanks,
Scott

------------------------------------------------
Subject: Re: [EXTERNAL] Re: [rt.rap.ucar.edu #97670] Questions about python embedding for a netcdf interface
From: Rabenhorst, Scott D. (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC]
Time: Fri Dec 04 11:55:43 2020

Hi George,

I have one more quick question, in addition to my earlier one regarding
handling missing/invalid data with python embedding. I was looking over
the link you sent with more information about python embedding:

https://github.com/dtcenter/MET/blob/develop/met/docs/Users_Guide/appendixF.rst#python-embedding-for-2d-data

It states that lead and accumulation times must follow the format
HH[MMSS]. However, we often run forecasts out to 10 days, or 240 hours.
Is the wrapper code smart enough to expand to HHH[MMSS] to accommodate
lead times of more than 99 hours?

Thanks,
Scott
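
To make the question concrete, here is a small sketch of how a read
script might build such a lead-time string from a timedelta. The helper
name format_lead is hypothetical, and whether MET accepts the
three-digit-hour form it produces for long leads is exactly the open
question above; this only shows what the string would look like.

```python
from datetime import timedelta


def format_lead(lead):
    """Format a lead time as hours, minutes, seconds with no separators.

    Hypothetical helper: hours are zero-padded to at least two digits
    and grow to three digits once the lead reaches 100 hours.
    """
    total_seconds = int(lead.total_seconds())
    if total_seconds < 0:
        raise ValueError("lead time must not be negative")
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}{minutes:02d}{seconds:02d}"


short_lead = format_lead(timedelta(hours=6))    # "060000"
long_lead = format_lead(timedelta(days=10))     # "2400000"
```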

On 12/4/20, 12:18 PM, "Rabenhorst, Scott D. (GSFC-610.1)[SCIENCE
SYSTEMS AND APPLICATIONS INC]" <scott.d.rabenhorst at nasa.gov> wrote:

    Sorry, I clicked send before finishing my email below - but I was
mostly done. Please let me know if you have any suggestions. I have
greatly appreciated your help!

    Thanks,
    Scott

    On 12/4/20, 12:16 PM, "Rabenhorst, Scott D. (GSFC-610.1)[SCIENCE
SYSTEMS AND APPLICATIONS INC]" <scott.d.rabenhorst at nasa.gov> wrote:

        Hi George,

        Thanks you so much for answering these questions and for
creating an issue. Yes, your example below was how I was envisioning
the behavior. That would be a very convenient feature to have the
script called x-times to loop through the level list keying off of a
variable {CURRENT_FCST_LEVEL}.

        I am still working on testing some cases, but hopefully I will
have things working soon. While I was writing my read scripts, another
question occurred to me. I'm sure I missed this somewhere in the
documentation, but how can I pass missing/masked/fill values into
grid_stat, etc? Unlike some operational centers, we normally set field
gridpoints to a fill_value where the current pressure level is above
the surface pressure. I presume grid_stat can handle this? When using
python embedding, is there an additional attribute I can pass in
"attrs" specifying a fill_value?

        On 11/30/20, 5:10 PM, "George McCabe via RT"
<met_help at ucar.edu> wrote:

            Hi Scott,

            For Question 1, the MET tools only accept a 2D slab of
data from python
            embedding. **Some** of the wrappers do set
{CURRENT_FCST_NAME},
            {CURRENT_FCST_LEVEL}, {CURRENT_OBS_NAME}, and
{CURRENT_OBS_LEVEL} that can
            be referenced by other METplus config variables. However,
I don't think it
            would work in this case. This functionality has only
really been used to
            set the output prefix in the MET config for each run. In
your case, you
            would have to reference CURRENT_FCST_LEVEL inside
FCST_VAR<n>_NAME. I am
            fairly positive that this would not work as the code is
written currently.
            I do see this as a useful enhancement that would make
configuration this
            situation easier.

            I will create a GitHub issue regarding this enhancement.
To confirm what
            behavior you are expecting, the following configuration:

            FCST_VAR1_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py
            {INPUT_BASE}/mydata/forecast_file.nc4 TMP
{valid?fmt=%Y%m%d_%H%M}
            {CURRENT_FCST_LEVEL}
            FCST_VAR1_LEVELS = P1000,P850,P700,P500,P250,P100
            would result in 6 calls to your python script -- one for
each value in the
            levels list. Is that correct?

            In the meantime, to obtain the results you require you
could configure the
            wrappers in this way:

            FCST_VAR1_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py
            {INPUT_BASE}/mydata/forecast_file.nc4 TMP
{valid?fmt=%Y%m%d_%H%M} P1000
            FCST_VAR2_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py
            {INPUT_BASE}/mydata/forecast_file.nc4 TMP
{valid?fmt=%Y%m%d_%H%M}  P850
            FCST_VAR3_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py
            {INPUT_BASE}/mydata/forecast_file.nc4 TMP
{valid?fmt=%Y%m%d_%H%M}  P700
            FCST_VAR4_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py
            {INPUT_BASE}/mydata/forecast_file.nc4 TMP
{valid?fmt=%Y%m%d_%H%M}  P500
            FCST_VAR5_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py
            {INPUT_BASE}/mydata/forecast_file.nc4 TMP
{valid?fmt=%Y%m%d_%H%M}  P250
            FCST_VAR6_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py
            {INPUT_BASE}/mydata/forecast_file.nc4 TMP
{valid?fmt=%Y%m%d_%H%M} P100

            Each VAR<n> item will obtain a 2D field of data that will
be processed.
            Please let me know if this does not work as you expect.

            For Question 2, we just recently added more information on the
            grid specifications for the supported grids in python embedding.
            The info is currently not in the latest 'develop' version of the
            docs on the web, but you can find it here:

            https://github.com/dtcenter/MET/blob/develop/met/docs/Users_Guide/appendixF.rst#python-embedding-for-2d-data

            Please let me know if there is any information missing from here
            that you would like to be described in more detail. This is an
            evolving document!

            Thanks,
            George

------------------------------------------------
Subject: Questions about python embedding for a netcdf interface
From: George McCabe
Time: Mon Dec 07 12:48:59 2020

Hi Scott,

-9999 is the value used for missing data in MET. I would recommend adding a
line to your python script to change all fill values to this value before
passing it into MET. Here is an example that sets negative values to
missing:

https://dtcenter.org/sites/default/files/community-code/met/python-scripts/read_3B42RT.py.txt

This script calls:

data[data<0] = -9999

Your script could include something like:

data[data == fill_value] = -9999
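
As a slightly fuller sketch of that last line, the helper below converts an
explicit fill value, NaNs, and masked points to -9999 before the array is
handed to MET. The function name to_met_missing and the example fill value of
1.0e15 are hypothetical and chosen only for illustration; only the -9999
convention comes from MET.

```python
import numpy as np

MET_MISSING = -9999.0  # missing-data value expected by MET


def to_met_missing(data, fill_value=None):
    """Return a float copy of `data` with missing points set to -9999.

    Hypothetical helper: handles numpy masked arrays, NaNs, and an
    optional explicit fill value.
    """
    # Masked arrays (e.g. as returned by some netCDF readers) are filled first
    if np.ma.isMaskedArray(data):
        data = data.filled(MET_MISSING)
    out = np.array(data, dtype=float)  # copy so the input is left untouched
    if fill_value is not None:
        # isclose tolerates float32/float64 round-off in the stored fill value
        out[np.isclose(out, fill_value)] = MET_MISSING
    out[np.isnan(out)] = MET_MISSING
    return out


# Example: one fill value and one NaN in a 2x2 field
field = np.array([[250.0, 1.0e15], [np.nan, 260.0]])
print(to_met_missing(field, fill_value=1.0e15))
```

Comparing with np.isclose rather than == is a design choice here, since a
fill value stored as a 32-bit float may not compare exactly equal to the same
number written as a Python float.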

I don't think the documentation mentions that the missing value to use in
the python embedding scripts is -9999, but I have reached out to a
co-worker to update the docs to include this information.

By the way, here is the GitHub issue I created regarding the enhancement we
discussed. Please feel free to add any comments if you are able to.

https://github.com/dtcenter/METplus/issues/719

Thanks,
George

--
George McCabe - Software Engineer III
National Center for Atmospheric Research
Research Applications Laboratory
303-497-2768
---
My working day may not be your working day. Please do not feel obliged
to
reply to this email outside of your normal working hours.

------------------------------------------------
Subject: Questions about python embedding for a netcdf interface
From: Scott Rabenhorst
Time: Mon Dec 07 12:59:00 2020

Good afternoon,

I have run a test case on my netcdf files using python embedding to
grid_stat as described earlier in this thread. I receive the message
"METplus has successfully finished running" at the end; however, no
statistics are computed. The log file shows my python script was invoked for
one file, but it is unclear if it ran for the other file. Since I did not
observe any obvious errors, I tarred up my directory and attached it to this
email in the hope that a second pair of eyes may see what I am doing wrong.
Statistics should have been generated between our analysis file
"JM-v10.16.2_C360_RPLY_E5.geosgcm_prog.20200105_0000z.nc4" and forecast file
"G5GMAO.geosgcm_fcst.20200101_0000z.nc4" at valid time 2020-01-05_00:00:00
and the 500 hPa pressure level. Since my netcdf files are very large, I have
ncdumped their contents with a *.dump extension. The g5_read_fcst.py.*.out
files are the result of running my python script alone on each netcdf file
the way they are called by the grid_stat config. I would greatly appreciate
any ideas on why this did not generate results.

Thanks,
Scott

------------------------------------------------
Subject: Questions about python embedding for a netcdf interface
From: George McCabe
Time: Mon Dec 07 13:20:59 2020

Hi Scott,

It looks like the "exit(0)" line at the end of your python embedding script
is causing execution to stop instead of reading the data into MET. Since it
exits with 0 rather than a non-zero value, METplus sees nothing wrong and
reports success. I would remove that line and try again. Also, it is good
practice to set LOG_LEVEL = DEBUG in your METplus config so that you see
additional log output that is not shown with LOG_LEVEL = INFO.

- George
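
For reference, here is a minimal sketch of a python embedding script that
selects one 2D slab by level and then simply runs to completion, with no
exit() call, so that MET can read the met_data variable afterwards. The
helper names parse_level and select_slab, the synthetic array, and the
hard-coded level argument are all assumptions for illustration; a real script
would read the netCDF file, take the level from its command-line arguments,
and also define the attrs dictionary described in Appendix F.

```python
import numpy as np


def parse_level(level_arg):
    """Convert a GRIB-style level string such as 'P500' to 500.0 (hPa)."""
    if not level_arg.upper().startswith("P"):
        raise ValueError(f"unsupported level string: {level_arg!r}")
    return float(level_arg[1:])


def select_slab(data3d, levels, level_arg):
    """Return the 2D slab whose pressure level matches level_arg."""
    wanted = parse_level(level_arg)
    matches = np.flatnonzero(np.isclose(levels, wanted))
    if matches.size == 0:
        raise ValueError(f"level {wanted} hPa not found")
    return data3d[matches[0]]


# Stand-in for one time slice read from the netCDF file:
# 3 pressure levels on a 2x4 grid (hypothetical values)
levels = np.array([1000.0, 850.0, 500.0])
data3d = np.arange(24, dtype=float).reshape(3, 2, 4)

# In a real script this would come from the command line, e.g. sys.argv[-1]
level_arg = "P500"

# MET reads met_data once the script has finished running
met_data = select_slab(data3d, levels, level_arg)
print("met_data shape:", met_data.shape)

# No exit() or sys.exit() here: let the script fall off the end.
```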

On Mon, Dec 7, 2020 at 1:01 PM Scott Rabenhorst via RT
<met_help at ucar.edu>
wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=97670 >
>
> Good afternoon,
>
> I have run a test case on my netcdf files using python embedding to
> grid_stat as described below. I receive the message "METplus has
> successfully finished running" at the end; however, no statistics are
> computed. The log file shows my python script was invoked for one file,
> but it is unclear if it ran for the other file. Since I did not observe
> any obvious errors, I tarred up my directory and attached it to this email
> in the hope that a second pair of eyes may see what I am doing wrong.
> Statistics should have been generated between our analysis file
> "JM-v10.16.2_C360_RPLY_E5.geosgcm_prog.20200105_0000z.nc4" and forecast
> file "G5GMAO.geosgcm_fcst.20200101_0000z.nc4" at valid time
> 2020-01-05_00:00:00 and the 500 hPa pressure level. Since my netcdf files
> are very large, I have ncdumped their contents with a *.dump extension.
> The g5_read_fcst.py.*.out files are the result of running my python script
> alone on each netcdf file the way they are called by the grid_stat config.
> I would greatly appreciate any ideas on why this did not generate results.
>
> Thanks,
> Scott
>
>
> On 12/4/20 1:55 PM, Rabenhorst, Scott D. (GSFC-610.1)[SCIENCE
SYSTEMS
> AND APPLICATIONS INC] wrote:
> > Hi George,
> >
> > I have one more quick question, in addition to the one below regarding
> > handling missing/invalid data with python embedding. I was looking over
> > the link you sent
> > (https://github.com/dtcenter/MET/blob/develop/met/docs/Users_Guide/appendixF.rst#python-embedding-for-2d-data)
> > with more information about python embedding. It states lead and
> > accumulation times must follow the format HH[MMSS]. However, we often run
> > forecasts out 10 days or 240 hours. Is the wrapper code smart enough to
> > expand to HHH[MMSS] to accommodate lead times more than 99 hours?
> >
> > Thanks,
> > Scott
> >
> >
> > On 12/4/20, 12:18 PM, "Rabenhorst, Scott D. (GSFC-610.1)[SCIENCE
> > SYSTEMS AND APPLICATIONS INC]" <scott.d.rabenhorst at nasa.gov> wrote:
> >
> >      Sorry, I clicked send before finishing my email below - but I was
> >      mostly done. Please let me know if you have any suggestions. I
> >      have greatly appreciated your help!
> >
> >      Thanks,
> >      Scott
> >
> >      On 12/4/20, 12:16 PM, "Rabenhorst, Scott D. (GSFC-610.1)[SCIENCE
> >      SYSTEMS AND APPLICATIONS INC]" <scott.d.rabenhorst at nasa.gov> wrote:
> >
> >          Hi George,
> >
> >          Thank you so much for answering these questions and for
> >          creating an issue. Yes, your example below was how I was
> >          envisioning the behavior. It would be very convenient to have
> >          the script called once per entry in the level list, keyed off
> >          of a variable {CURRENT_FCST_LEVEL}.
> >
> >          I am still working on testing some cases, but hopefully I will
> >          have things working soon. While I was writing my read scripts,
> >          another question occurred to me. I'm sure I missed this
> >          somewhere in the documentation, but how can I pass
> >          missing/masked/fill values into grid_stat, etc.? Unlike some
> >          operational centers, we normally set field gridpoints to a
> >          fill_value where the current pressure level is above the
> >          surface pressure. I presume grid_stat can handle this? When
> >          using python embedding, is there an additional attribute I can
> >          pass in "attrs" specifying a fill_value?
> >
> >
> >
> >          On 11/30/20, 5:10 PM, "George McCabe via RT" <met_help at ucar.edu> wrote:
> >
> >              Hi Scott,
> >
> >              For Question 1, the MET tools only accept a 2D slab of data
> >              from python embedding. **Some** of the wrappers do set
> >              {CURRENT_FCST_NAME}, {CURRENT_FCST_LEVEL},
> >              {CURRENT_OBS_NAME}, and {CURRENT_OBS_LEVEL}, which can be
> >              referenced by other METplus config variables. However, I
> >              don't think it would work in this case. This functionality
> >              has only really been used to set the output prefix in the
> >              MET config for each run. In your case, you would have to
> >              reference CURRENT_FCST_LEVEL inside FCST_VAR<n>_NAME. I am
> >              fairly positive that this would not work as the code is
> >              currently written. I do see this as a useful enhancement
> >              that would make configuring this situation easier.
> >
> >              I will create a GitHub issue regarding this enhancement. To
> >              confirm what behavior you are expecting, the following
> >              configuration:
> >
> >              FCST_VAR1_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} {CURRENT_FCST_LEVEL}
> >              FCST_VAR1_LEVELS = P1000,P850,P700,P500,P250,P100
> >
> >              would result in 6 calls to your python script -- one for
> >              each value in the levels list. Is that correct?
> >
> >              In the meantime, to obtain the results you require you
> >              could configure the wrappers in this way:
> >
> >              FCST_VAR1_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P1000
> >              FCST_VAR2_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P850
> >              FCST_VAR3_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P700
> >              FCST_VAR4_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P500
> >              FCST_VAR5_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P250
> >              FCST_VAR6_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P100
> >
> >              Each VAR<n> item will obtain a 2D field of data that will
> >              be processed. Please let me know if this does not work as
> >              you expect.
> >
> >
> >              For Question 2, we just recently added more information on
> >              the grid specifications for the supported grids in python
> >              embedding. The info is currently not in the latest
> >              'develop' version of the docs on the web, but you can find
> >              it here:
> >
> >              https://github.com/dtcenter/MET/blob/develop/met/docs/Users_Guide/appendixF.rst#python-embedding-for-2d-data
> >
> >              Please let me know if there is any information missing from
> >              here that you would like to be described in more detail.
> >              This is an evolving document!
> >
> >              Thanks,
> >              George
> >
> >

--
George McCabe - Software Engineer III
National Center for Atmospheric Research
Research Applications Laboratory
303-497-2768
---
My working day may not be your working day. Please do not feel obliged
to
reply to this email outside of your normal working hours.

------------------------------------------------
Subject: Questions about python embedding for a netcdf interface
From: George McCabe
Time: Mon Dec 07 13:58:55 2020

Hi Scott,

To follow up about the missing/fill values, it looks like MET does
handle these values properly. I ran plot_data_plane using the example
from the GridStat python embedding use case like this:

/usr/local/met/bin/plot_data_plane PYTHON_NUMPY ~/out-test.ps 'name="/home/mccabe/read_ascii_numpy.py /d1/projects/METplus/METplus_Data/met_test/data/python/obs.txt OBS";'

This generated the attached image plot_data_plane_zero.png. All of the
values outside of "obs" are 0.0 values.

I then reran the script but added the following line after the read
into met_data:

met_data[met_data==0] = np.nan

This run resulted in the attached image plot_data_plane_nan.png. It
looks like your example sets the fill value to the correct value on
read, so that should be enough to interpret the missing values properly
in MET. You can use plot_data_plane to test this out to make sure. The
grey values are missing/fill values if you are using the default color
table.
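
The masking step described above can be sketched as follows. This is
only an illustration under stated assumptions: the helper name
mask_fill and the fill value 1.0e15 are hypothetical, and the small
list stands in for the 2D field that a python embedding script would
expose as met_data.

```python
import numpy as np


def mask_fill(field, fill_value):
    """Return a float copy of `field` with fill_value entries set to NaN."""
    data = np.array(field, dtype=float)  # copy, so the input is untouched
    data[data == fill_value] = np.nan
    return data


# Hypothetical 2x3 field in which 1.0e15 marks below-surface points.
raw = [[250.0, 1.0e15, 251.5],
       [1.0e15, 249.0, 252.0]]
met_data = mask_fill(raw, 1.0e15)
print(int(np.isnan(met_data).sum()))  # number of masked points
```

Running plot_data_plane on the result, as described above, is still
the way to confirm that MET treats those points as missing.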

Let me know if you have any questions or run into any other issues.

Thanks,
George

On Mon, Dec 7, 2020 at 1:20 PM George McCabe <mccabe at ucar.edu> wrote:

> Hi Scott,
>
> It looks like the "exit(0)" line at the end of your python embedding
> script is causing execution to stop instead of reading the data into
> MET. Since it is exiting with 0 and not a non-zero value, as far as
> METplus can tell, everything went smoothly. I would remove that line
> and try again. Also, it is good practice to set LOG_LEVEL = DEBUG in
> your METplus config so that you see additional log output that is not
> shown with LOG_LEVEL=INFO.
>
> - George
>
> On Mon, Dec 7, 2020 at 1:01 PM Scott Rabenhorst via RT
> <met_help at ucar.edu> wrote:
>
>>
>> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=97670 >
>>
>> Good afternoon,
>>
>> I have run a test case on my netcdf files using python embedding to
>> grid_stat as described below. I receive the message "METplus has
>> successfully finished running" at the end; however, the problem is
>> that no statistics are computed. The log file shows my python script
>> was invoked for one file, but it is unclear if it ran for the other
>> file. Since I did not observe any obvious errors, I tarred up my
>> directory and attached it to this email in the hope that a second
>> pair of eyes may see what I am doing wrong. Statistics should have
>> been generated between our analysis file
>> "JM-v10.16.2_C360_RPLY_E5.geosgcm_prog.20200105_0000z.nc4" and
>> forecast file "G5GMAO.geosgcm_fcst.20200101_0000z.nc4" at valid time
>> 2020-01-05_00:00:00 and 500 hPa pressure level. Since my netcdf files
>> are very large, I have ncdumped their contents with a *.dump
>> extension. The g5_read_fcst.py.*.out files are the result of running
>> my python script alone on each netcdf file the way they are called by
>> the grid_stat config. I would greatly appreciate any ideas on why
>> this did not generate results.
>>
>> Thanks,
>> Scott
>>
>>


------------------------------------------------
Subject: Re: [rt.rap.ucar.edu #97670] Questions about python embedding for a netcdf interface
From: Rabenhorst, Scott D. (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC]
Time: Tue Dec 08 15:16:20 2020

Hi George,

Thanks for answering my questions. You were right, the problem was the
exit call at the end of my python script. Now it works as expected.
Also it sounds like the built-in "missing data" values in METplus can
either be np.nan or -9999. I will set accordingly. Your advice on
setting my logging to DEBUG was also helpful. All great input - thanks
for getting my basic test up and running.
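
As a minimal sketch of the script layout that avoids the problem: the
script builds met_data and attrs and then simply ends, reserving a
non-zero exit for real failures. The function name load_plane, the
variable name "H", the file name, and the hard-coded array are all
hypothetical stand-ins, and the attrs shown are only an illustrative
subset; the appendix linked earlier in this thread lists what MET
actually requires.

```python
import sys

import numpy as np


def load_plane(args):
    """Hypothetical reader; args = [file, variable, valid_time, level]."""
    if len(args) != 4:
        # Exit with a non-zero status only when something really failed.
        sys.exit(1)
    path, var, valid, level = args
    # Stand-in for the real NetCDF read of `path`: a tiny 2D plane.
    data = np.array([[5500.0, 5505.0], [5510.0, np.nan]])
    # Illustrative subset of the metadata dictionary.
    meta = {"name": var, "level": level, "valid": valid}
    return data, meta


# In a real script the arguments would come from sys.argv[1:].
met_data, attrs = load_plane(
    ["forecast_file.nc4", "H", "20200105_000000", "P500"]
)
# The script ends here without calling exit(), so met_data and attrs
# are still available once the script body has run.
```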

I was wondering about one more thing, the "GRID_STAT_ONCE_PER_FIELD"
option, as I was reading through the METplus docs. I presume setting
this to True would yield much more efficient compute times for many
levels/variables. On the downside, there doesn't appear to be a way to
do this with my g5_read_fcst.py script as the met_data variable can
only be a 2D array. I guess the only way to enable this would be to
scrap the python embedding and write a different python script to find
all appropriate time/level netcdf indices
"(T1,L1,*,*),(T1,L2,*,*),..." for FCST/OBS_GRID_STAT_VAR<n>_LEVELS in
my *.conf file prior to running METplus? I ask this because my overall
goal is to compute stats for 12+ vars, each with 20+ levels, for a
31-member ensemble of 10-day forecasts while trying to do this quickly in
parallel on our HPC system. I guess I am trying to figure out the most
efficient way to make a 2D tool work in 5D. That reminds me, is there
a way to accommodate lead times greater than 99 hours?
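
The pre-computation of index lists mentioned above could be sketched
roughly like this. It is an assumption-laden illustration: the helper
names find_indices and level_slices and the coordinate values are
hypothetical, and the exact quoting that METplus expects around each
"(t,l,*,*)" entry in a *_LEVELS setting is not confirmed in this
thread.

```python
def find_indices(coord_values, wanted):
    """Return the position of each wanted value in a coordinate list."""
    return [coord_values.index(value) for value in wanted]


def level_slices(time_index, level_indices):
    """Format one "(t,l,*,*)" slice string per level index."""
    return [f"({time_index},{lev},*,*)" for lev in level_indices]


# Hypothetical pressure coordinate (hPa) as it might appear in a file.
file_levels = [1000, 850, 700, 500, 250, 100]
lev_idx = find_indices(file_levels, [850, 500])

# Time index 4 is likewise just an example value.
levels_value = ",".join(level_slices(4, lev_idx))
print(levels_value)
```

The resulting string could then be written into the *.conf file before
METplus is launched, one setting per variable.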

Thanks,
Scott

On 12/7/20, 4:05 PM, "George McCabe via RT" <met_help at ucar.edu>
wrote:

    Hi Scott,

    To follow up about the missing/fill values, it looks like MET does
handle
    these values properly. I ran plot_data_plane using the example
from the
    GridStat python embedding use case like this:

    /usr/local/met/bin/plot_data_plane PYTHON_NUMPY ~/out-test.ps
    'name="/home/mccabe/read_ascii_numpy.py
    /d1/projects/METplus/METplus_Data/met_test/data/python/obs.txt
OBS";'

    This generated the attached image plot_data_plane_zero.png. All of
the
    values outside of "obs" are 0.0 values.

    I think reran the script but added the following line after the
read into
    met_data:

    met_data[met_data==0] = np.nan

    This run resulted in the image attached called
plot_data_plane_nan.png. It
    looks like your example sets the fill value to the correct value
on read,
    so that should be enough to interpret the missing values properly
in MET.
    You can use plot_data_plane to test this out to make sure. The
grey values
    are missing/fill values if you are using the default color table.

    Let me know if you have any questions or run into any other
issues.

    Thanks,
    George

    On Mon, Dec 7, 2020 at 1:20 PM George McCabe <mccabe at ucar.edu>
wrote:

    > Hi Scott,
    >
    > It looks like the "exit(0)" line at the end of your python
embedding
    > script is causing execution to stop instead of reading the data
into MET.
    > Since it is exiting with 0 and not a non-zero value, as far as
it can tell,
    > everything went smoothly. I would remove that line and try
again. Also, it
    > is good practice to set LOG_LEVEL = DEBUG in your METplus config
so that
    > you see additional log output that is not shown with
LOG_LEVEL=INFO.
    >
    > - George
    >
    > On Mon, Dec 7, 2020 at 1:01 PM Scott Rabenhorst via RT
<met_help at ucar.edu>
    > wrote:
    >
    >>
    >> <URL:
https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Frt.rap.ucar.edu%2Frt%2FTicket%2FDisplay.html%3Fid%3D97670&data=04%7C01%7Cscott.d.rabenhorst%40nasa.gov%7Cf5a79cedacee4e0f852f08d89af3d952%7C7005d45845be48ae8140d43da96dd17b%7C0%7C0%7C637429719398041539%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=DpvVwOAybremGu5pjNX2SLqDDClreeZ38syu%2FWnXgC4%3D&reserved=0
>
    >>
    >> Good afternoon,
    >>
    >> I have run a test case on my netcdf files using python
embedding to
    >> grid_stat as described below. I receive the message "METplus
has
    >> successfully finished running" at the end, however, the problem
is that
    >> no statistics are computed. The log file shows my python script
was
    >> invoked for one file, but it is unclear if it ran for the other
file.
    >> Since it did not observe any obvious errors, I tarred up my
directory
    >> and attached it to this email in hope that a second pair of
eyes may see
    >> what I am doing wrong. Statistics should have been generated
between our
    >> analysis file "JM-
v10.16.2_C360_RPLY_E5.geosgcm_prog.20200105_0000z.nc4"
    >> and forecast file "G5GMAO.geosgcm_fcst.20200101_0000z.nc4" at
valid time
    >> 2020-01-05_00:00:00 and 500 hPa pressure level. Since my netcdf
files
    >> are very large, I have ncdumped their contents with a *.dump
extension.
    >> The g5_read_fcst.py.*.out files are the result of running my
python
    >> script alone on each netcdf file the way they are called by the
    >> grid_stat config. I would greatly appreciate any ideas on why
this did
    >> not generate results.
    >>
    >> Thanks,
    >> Scott
    >>
    >>
    >> On 12/4/20 1:55 PM, Rabenhorst, Scott D. (GSFC-610.1)[SCIENCE SYSTEMS
    >> AND APPLICATIONS INC] wrote:
    >> > Hi George,
    >> >
    >> > I have one more quick question, in addition to the one below regarding
    >> > handling missing/invalid data with python embedding. I was looking over
    >> > the link you sent
    >> > (https://github.com/dtcenter/MET/blob/develop/met/docs/Users_Guide/appendixF.rst#python-embedding-for-2d-data)
    >> > with more information about python embedding. It states lead and
    >> > accumulation times must follow the format HH[MMSS]. However, we often
    >> > run forecasts out 10 days or 240 hours. Is the wrapper code smart enough
    >> > to expand to HHH[MMSS] to accommodate lead times more than 99 hours?
    >> >
    >> > Thanks,
    >> > Scott
    >> >
    >> >
    >> > On 12/4/20, 12:18 PM, "Rabenhorst, Scott D. (GSFC-610.1)[SCIENCE
    >> > SYSTEMS AND APPLICATIONS INC]" <scott.d.rabenhorst at nasa.gov> wrote:
    >> >
    >> >      Sorry, I clicked send before finishing my email below - but I was
    >> >      mostly done. Please let me know if you have any suggestions. I have
    >> >      greatly appreciated your help!
    >> >
    >> >      Thanks,
    >> >      Scott
    >> >
    >> >      On 12/4/20, 12:16 PM, "Rabenhorst, Scott D. (GSFC-610.1)[SCIENCE
    >> >      SYSTEMS AND APPLICATIONS INC]" <scott.d.rabenhorst at nasa.gov> wrote:
    >> >
    >> >          Hi George,
    >> >
    >> >          Thank you so much for answering these questions and for
    >> >          creating an issue. Yes, your example below was how I was
    >> >          envisioning the behavior. That would be a very convenient
    >> >          feature to have the script called x-times to loop through the
    >> >          level list keying off of a variable {CURRENT_FCST_LEVEL}.
    >> >
    >> >          I am still working on testing some cases, but hopefully I will
    >> >          have things working soon. While I was writing my read scripts,
    >> >          another question occurred to me. I'm sure I missed this
    >> >          somewhere in the documentation, but how can I pass
    >> >          missing/masked/fill values into grid_stat, etc? Unlike some
    >> >          operational centers, we normally set field gridpoints to a
    >> >          fill_value where the current pressure level is above the
    >> >          surface pressure. I presume grid_stat can handle this? When
    >> >          using python embedding, is there an additional attribute I can
    >> >          pass in "attrs" specifying a fill_value?
    >> >
    >> >
    >> >
    >> >          On 11/30/20, 5:10 PM, "George McCabe via RT" <met_help at ucar.edu> wrote:
    >> >
    >> >              Hi Scott,
    >> >
    >> >              For Question 1, the MET tools only accept a 2D slab of data
    >> >              from python embedding. **Some** of the wrappers do set
    >> >              {CURRENT_FCST_NAME}, {CURRENT_FCST_LEVEL}, {CURRENT_OBS_NAME},
    >> >              and {CURRENT_OBS_LEVEL} that can be referenced by other
    >> >              METplus config variables. However, I don't think it would
    >> >              work in this case. This functionality has only really been
    >> >              used to set the output prefix in the MET config for each
    >> >              run. In your case, you would have to reference
    >> >              CURRENT_FCST_LEVEL inside FCST_VAR<n>_NAME. I am fairly
    >> >              positive that this would not work as the code is currently
    >> >              written. I do see this as a useful enhancement that would
    >> >              make configuring this situation easier.
    >> >
    >> >              I will create a GitHub issue regarding this enhancement.
    >> >              To confirm what behavior you are expecting, the following
    >> >              configuration:
    >> >
    >> >              FCST_VAR1_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} {CURRENT_FCST_LEVEL}
    >> >              FCST_VAR1_LEVELS = P1000,P850,P700,P500,P250,P100
    >> >
    >> >              would result in 6 calls to your python script -- one for
    >> >              each value in the levels list. Is that correct?
    >> >
    >> >              In the meantime, to obtain the results you require you
    >> >              could configure the wrappers in this way:
    >> >
    >> >              FCST_VAR1_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P1000
    >> >              FCST_VAR2_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P850
    >> >              FCST_VAR3_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P700
    >> >              FCST_VAR4_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P500
    >> >              FCST_VAR5_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P250
    >> >              FCST_VAR6_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P100
    >> >
    >> >              Each VAR<n> item will obtain a 2D field of data that will
    >> >              be processed. Please let me know if this does not work as
    >> >              you expect.
    >> >
    >> >
    >> >              For Question 2, we just recently added more information on
    >> >              the grid specifications for the supported grids in python
    >> >              embedding. The info is currently not in the latest
    >> >              'develop' version of the docs on the web, but you can find
    >> >              it here:
    >> >
    >> >              https://github.com/dtcenter/MET/blob/develop/met/docs/Users_Guide/appendixF.rst#python-embedding-for-2d-data
    >> >
    >> >              Please let me know if there is any information missing
    >> >              from here that you would like to be described in more
    >> >              detail. This is an evolving document!
    >> >
    >> >              Thanks,
    >> >              George
    >> >
    >> >
    >> >              On Mon, Nov 30, 2020 at 2:01 PM Rabenhorst, Scott D.
    >> >              (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC] via RT
    >> >              <met_help at ucar.edu> wrote:
    >> >
    >> >              >
    >> >              > Mon Nov 30 14:00:42 2020: Request 97670 was acted upon.
    >> >              > Transaction: Ticket created by scott.d.rabenhorst at nasa.gov
    >> >              >        Queue: met_help
    >> >              >      Subject: Questions about python embedding for a netcdf interface
    >> >              >        Owner: Nobody
    >> >              >   Requestors: scott.d.rabenhorst at nasa.gov
    >> >              >       Status: new
    >> >              >  Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=97670 >
    >> >
    >> >              >
    >> >              >
    >> >              > Hi MET help,
    >> >              >
    >> >              > I am with NASA’s GMAO and we are interested in using
    >> >              > METplus to augment the verification of the GEOS model.
    >> >              > However, due to many constraints, our output files use
    >> >              > the netcdf format with multiple valid times and levels -
    >> >              > which is unlikely to change. Some of these 4D output
    >> >              > collections are large, and therefore, splitting them into
    >> >              > single time slices or a subset of levels is not ideal
    >> >              > either just for verification. Therefore, I have been
    >> >              > looking at python embedding examples
    >> >              > (https://dtcenter.github.io/METplus/develop/generated/met_tool_wrapper/)
    >> >              > and trying to brainstorm possible ways to mimic the grib
    >> >              > interface for our use. Specifically, I’d like to
    >> >              > understand the best approach to looping through multiple
    >> >              > INIT_TIME/VALID_TIME and levels while allowing my python
    >> >              > scripts to find the matching time and level and return
    >> >              > the appropriate 2D “met_data” xarray object.
    >> >              >
    >> >              > 1) Question 1:
    >> >              >
    >> >              > I can envision a scenario writing a config file similar to:
    >> >              > FCST_VAR1_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P500
    >> >              > where it would be relatively easy to select a single 2D
    >> >              > array out of this dataset given the provided filename,
    >> >              > variable, time, and level. However, what is unclear to me
    >> >              > is what to do when I want to verify on multiple levels?
    >> >              >
    >> >              > I’ve seen some examples similar to:
    >> >              > FCST_VAR1_LEVEL = “(0,*,*),(2,*,*),(4,*,*)…”
    >> >              > However, this seems to directly conflict with the
    >> >              > required 2D xarray object returned by FCST_VAR<n>_NAME.
    >> >              > How do these 2 config fields interplay using python
    >> >              > embedding? Can FCST_VAR<n>_NAME return a 3D array for a
    >> >              > given time slice while FCST_VAR<n>_LEVEL is used for
    >> >              > level indices?
    >> >              >
    >> >              > Alternatively, if I had FCST_VAR1_LEVEL =
    >> >              > P1000,P850,P700,P500,P250,P100, does this produce any
    >> >              > internal keyword/variable such as {cur_level} that I
    >> >              > could reference in the argument list for
    >> >              > FCST_VAR<n>_NAME? In this case I’d prefer to use the more
    >> >              > straightforward grib level/accumulation syntax which
    >> >              > could easily be used to look up netcdf levels by a
    >> >              > python script.
    >> >              >
    >> >              > 2) Question 2:
    >> >              >
    >> >              > For python embedding, is there a list of acceptable
    >> >              > key-value attrs (the metadata dictionary) for different
    >> >              > grids? Do these correspond to those supported by xarray?
    >> >              >
    >> >              > If you could provide any help or insight on these
    >> >              > questions, it would be very helpful in thinking through a
    >> >              > framework for our verification.
    >> >              >
    >> >              > Thanks,
    >> >              > Scott
    >> >              >
    >> >              >
    >> >              >
    >> >              >
    >> >              >
    >> >
    >> >              --
    >> >              George McCabe - Software Engineer III
    >> >              National Center for Atmospheric Research
    >> >              Research Applications Laboratory
    >> >              303-497-2768
    >> >              ---
    >> >              My working day may not be your working day.
Please do not
    >> feel obliged to
    >> >              reply to this email outside of your normal
working hours.
    >> >
    >> >
    >> >
    >> >
    >>
    >>
    >
    > --
    > George McCabe - Software Engineer III
    > National Center for Atmospheric Research
    > Research Applications Laboratory
    > 303-497-2768
    > ---
    > My working day may not be your working day. Please do not feel
obliged to
    > reply to this email outside of your normal working hours.
    >

    --
    George McCabe - Software Engineer III
    National Center for Atmospheric Research
    Research Applications Laboratory
    303-497-2768
    ---
    My working day may not be your working day. Please do not feel
obliged to
    reply to this email outside of your normal working hours.

------------------------------------------------
Subject: Questions about python embedding for a netcdf interface
From: George McCabe
Time: Tue Dec 08 17:11:53 2020

Hi Scott,

All of the MET tools support only 2D data; this is not specific to the
Python embedding functionality. If GRID_STAT_ONCE_PER_FIELD is True, the
wrapper produces a command for each name/level combination; otherwise it
produces a single command (for a given run time) covering all of the
name/level combinations specified. The METplus wrappers will still run
these commands serially either way. To run the commands generated by the
wrappers in parallel, you could write a script to loop over the criteria
you want to run and call master_metplus.py once for each
var/level/ensemble/forecast.
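
As a rough illustration of the looping approach described above, the
sketch below launches one master_metplus.py run per config file and runs
several of them at once. It is only a sketch under stated assumptions:
the helper names build_commands and run_all are hypothetical, it assumes
you have already written one METplus config file per
var/level/ensemble/forecast combination, and it assumes each config can be
passed to master_metplus.py with -c.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_commands(conf_files, launcher="master_metplus.py"):
    """Return one command (as an argument list) per METplus config file.

    Assumption: each config file fully describes a single run and is
    passed to the launcher with "-c".
    """
    return [[launcher, "-c", str(conf)] for conf in conf_files]


def run_all(commands, max_workers=4):
    """Run the commands concurrently; return their exit codes in input order."""

    def run_one(cmd):
        # Each command runs in its own process; the thread only waits on it.
        return subprocess.run(cmd).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, commands))


# Example usage (hypothetical config file names):
#   confs = ["grid_stat_TMP_P1000.conf", "grid_stat_TMP_P850.conf"]
#   exit_codes = run_all(build_commands(confs), max_workers=2)
#   failed = [c for c, rc in zip(confs, exit_codes) if rc != 0]
```

If you try something like this, it is probably safest to give each run its
own output and log location so that concurrent runs do not overwrite each
other's files.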

Another option would be to try a new feature that became available in
METplus v4.0-beta2, which was released today (12/8). When you run a use
case, a file is created in the log directory named .all_commands with a
timestamp at the end of the filename. This file contains a list of the
commands that were run in the use case and a list of the environment
variables that were set for each command. There is also a config variable
called DO_NOT_RUN_EXE that will generate all of the commands but skip
running the actual MET executables. Using a combination of these, you
could parse the list of all commands generated and parallelize their
execution. Keep in mind that the wrappers cannot build a command if they
can't find the required files, so if you have more than 1 item in your
PROCESS_LIST, the intermediate files will not be found and some of the
commands will not be built. The functionality to generate the list of
commands was added to assist our internal testing and hasn't been used by
any users before, although I have thought about the potential. If you do
decide to go this route, your feedback could help improve this
functionality to make it more usable.

Forecast leads greater than 99 hours are supported. Are you unable to
process them?

- George

On Tue, Dec 8, 2020 at 3:16 PM Rabenhorst, Scott D. (GSFC-610.1)[SCIENCE
SYSTEMS AND APPLICATIONS INC] via RT <met_help at ucar.edu> wrote:

>
> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=97670 >
>
> Hi George,
>
> Thanks for answering my questions. You were right, the problem was the
> exit call at the end of my python script. Now it works as expected. Also it
> sounds like the built-in "missing data" values in METplus can either be
> np.nan or -9999. I will set accordingly. Your advice on setting my logging
> to DEBUG was also helpful. All great input - thanks for getting my basic
> test up and running.
>
> I was wondering about one more thing, the "GRID_STAT_ONCE_PER_FIELD"
> option, as I was reading through the METplus docs. I presume setting this
> to True would yield much more efficient compute times for many
> levels/variables. On the downside, there doesn't appear to be a way to do
> this with my g5_read_fcst.py script as the met_data variable can only be a
> 2D array. I guess the only way to enable this would be to scrap the python
> embedding and write a different python script to find all appropriate
> time/level netcdf indices "(T1,L1,*,*),(T1,L2,*,*),..." for
> FCST/OBS_GRID_STAT_VAR<n>_LEVELS in my *.conf file prior to running
> METplus? I ask this because my overall goal is to compute stats for 12+
> vars each with 20+ levels for 31-member ensemble of 10-day forecasts while
> trying to do this quickly in parallel on our HPC system. I guess I am
> trying to figure out the most efficient way to make a 2D tool work in 5D.
> That reminds me, is there a way to accommodate lead times greater than
> 99 hours?
>
> Thanks,
> Scott
>
>
> On 12/7/20, 4:05 PM, "George McCabe via RT" <met_help at ucar.edu> wrote:
>
>     Hi Scott,
>
>     To follow up about the missing/fill values, it looks like MET does handle
>     these values properly. I ran plot_data_plane using the example from the
>     GridStat python embedding use case like this:
>
>     /usr/local/met/bin/plot_data_plane PYTHON_NUMPY ~/out-test.ps
>     'name="/home/mccabe/read_ascii_numpy.py
>     /d1/projects/METplus/METplus_Data/met_test/data/python/obs.txt OBS";'
>
>     This generated the attached image plot_data_plane_zero.png. All of the
>     values outside of "obs" are 0.0 values.
>
>     I then reran the script but added the following line after the read into
>     met_data:
>
>     met_data[met_data==0] = np.nan
>
>     This run resulted in the attached image called plot_data_plane_nan.png. It
>     looks like your example sets the fill value to the correct value on read,
>     so that should be enough to interpret the missing values properly in MET.
>     You can use plot_data_plane to test this out to make sure. The grey values
>     are missing/fill values if you are using the default color table.
>
>     Let me know if you have any questions or run into any other issues.
>
>     Thanks,
>     George
>
>     On Mon, Dec 7, 2020 at 1:20 PM George McCabe <mccabe at ucar.edu> wrote:
>
>     > Hi Scott,
>     >
>     > It looks like the "exit(0)" line at the end of your python embedding
>     > script is causing execution to stop instead of reading the data into MET.
>     > Since it is exiting with 0 and not a non-zero value, as far as it can
>     > tell, everything went smoothly. I would remove that line and try again.
>     > Also, it is good practice to set LOG_LEVEL = DEBUG in your METplus config
>     > so that you see additional log output that is not shown with
>     > LOG_LEVEL=INFO.
>     >
>     > - George
>     >
>     > On Mon, Dec 7, 2020 at 1:01 PM Scott Rabenhorst via RT
>     > <met_help at ucar.edu> wrote:
>     >
>     >>
>     >> <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=97670 >
>     >>
>     >> Good afternoon,
>     >>
>     >> I have run a test case on my netcdf files using python embedding to
>     >> grid_stat as described below. I receive the message "METplus has
>     >> successfully finished running" at the end, however, the problem is that
>     >> no statistics are computed. The log file shows my python script was
>     >> invoked for one file, but it is unclear if it ran for the other file.
>     >> Since I did not observe any obvious errors, I tarred up my directory
>     >> and attached it to this email in hope that a second pair of eyes may see
>     >> what I am doing wrong. Statistics should have been generated between our
>     >> analysis file "JM-v10.16.2_C360_RPLY_E5.geosgcm_prog.20200105_0000z.nc4"
>     >> and forecast file "G5GMAO.geosgcm_fcst.20200101_0000z.nc4" at valid time
>     >> 2020-01-05_00:00:00 and 500 hPa pressure level. Since my netcdf files
>     >> are very large, I have ncdumped their contents with a *.dump extension.
>     >> The g5_read_fcst.py.*.out files are the result of running my python
>     >> script alone on each netcdf file the way they are called by the
>     >> grid_stat config. I would greatly appreciate any ideas on why this did
>     >> not generate results.
>     >>
>     >> Thanks,
>     >> Scott
>     >>
>     >>
>     >> On 12/4/20 1:55 PM, Rabenhorst, Scott D. (GSFC-610.1)[SCIENCE SYSTEMS
>     >> AND APPLICATIONS INC] wrote:
>     >> > Hi George,
>     >> >
>     >> > I have one more quick question, in addition to the one below regarding
>     >> > handling missing/invalid data with python embedding. I was looking over
>     >> > the link you sent
>     >> > (https://github.com/dtcenter/MET/blob/develop/met/docs/Users_Guide/appendixF.rst#python-embedding-for-2d-data)
>     >> > with more information about python embedding. It states lead and
>     >> > accumulation times must follow the format HH[MMSS]. However, we often
>     >> > run forecasts out 10 days or 240 hours. Is the wrapper code smart enough
>     >> > to expand to HHH[MMSS] to accommodate lead times more than 99 hours?
>     >> >
>     >> > Thanks,
>     >> > Scott
>     >> >
>     >> >
>     >> > On 12/4/20, 12:18 PM, "Rabenhorst, Scott D. (GSFC-610.1)[SCIENCE
>     >> > SYSTEMS AND APPLICATIONS INC]" <scott.d.rabenhorst at nasa.gov> wrote:
>     >> >
>     >> >      Sorry, I clicked send before finishing my email below - but I was
>     >> >      mostly done. Please let me know if you have any suggestions. I have
>     >> >      greatly appreciated your help!
>     >> >
>     >> >      Thanks,
>     >> >      Scott
>     >> >
>     >> >      On 12/4/20, 12:16 PM, "Rabenhorst, Scott D. (GSFC-610.1)[SCIENCE
>     >> >      SYSTEMS AND APPLICATIONS INC]" <scott.d.rabenhorst at nasa.gov> wrote:
>     >> >
>     >> >          Hi George,
>     >> >
>     >> >          Thank you so much for answering these questions and for
>     >> >          creating an issue. Yes, your example below was how I was
>     >> >          envisioning the behavior. That would be a very convenient
>     >> >          feature to have the script called x-times to loop through the
>     >> >          level list keying off of a variable {CURRENT_FCST_LEVEL}.
>     >> >
>     >> >          I am still working on testing some cases, but hopefully I will
>     >> >          have things working soon. While I was writing my read scripts,
>     >> >          another question occurred to me. I'm sure I missed this
>     >> >          somewhere in the documentation, but how can I pass
>     >> >          missing/masked/fill values into grid_stat, etc? Unlike some
>     >> >          operational centers, we normally set field gridpoints to a
>     >> >          fill_value where the current pressure level is above the
>     >> >          surface pressure. I presume grid_stat can handle this? When
>     >> >          using python embedding, is there an additional attribute I can
>     >> >          pass in "attrs" specifying a fill_value?
>     >> >
>     >> >
>     >> >
>     >> >          On 11/30/20, 5:10 PM, "George McCabe via RT" <met_help at ucar.edu> wrote:
>     >> >
>     >> >              Hi Scott,
>     >> >
>     >> >              For Question 1, the MET tools only accept a 2D slab of data
>     >> >              from python embedding. **Some** of the wrappers do set
>     >> >              {CURRENT_FCST_NAME}, {CURRENT_FCST_LEVEL}, {CURRENT_OBS_NAME},
>     >> >              and {CURRENT_OBS_LEVEL} that can be referenced by other
>     >> >              METplus config variables. However, I don't think it would
>     >> >              work in this case. This functionality has only really been
>     >> >              used to set the output prefix in the MET config for each
>     >> >              run. In your case, you would have to reference
>     >> >              CURRENT_FCST_LEVEL inside FCST_VAR<n>_NAME. I am fairly
>     >> >              positive that this would not work as the code is currently
>     >> >              written. I do see this as a useful enhancement that would
>     >> >              make configuring this situation easier.
>     >> >
>     >> >              I will create a GitHub issue regarding this enhancement.
>     >> >              To confirm what behavior you are expecting, the following
>     >> >              configuration:
>     >> >
>     >> >              FCST_VAR1_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} {CURRENT_FCST_LEVEL}
>     >> >              FCST_VAR1_LEVELS = P1000,P850,P700,P500,P250,P100
>     >> >
>     >> >              would result in 6 calls to your python script -- one for
>     >> >              each value in the levels list. Is that correct?
>     >> >
>     >> >              In the meantime, to obtain the results you require you
>     >> >              could configure the wrappers in this way:
>     >> >
>     >> >              FCST_VAR1_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P1000
>     >> >              FCST_VAR2_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P850
>     >> >              FCST_VAR3_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P700
>     >> >              FCST_VAR4_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P500
>     >> >              FCST_VAR5_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P250
>     >> >              FCST_VAR6_NAME = {INPUT_BASE}/myscripts/read_nc2xr.py {INPUT_BASE}/mydata/forecast_file.nc4 TMP {valid?fmt=%Y%m%d_%H%M} P100
>     >> >
>     >> >              Each VAR<n> item will obtain a 2D field of data that will
>     >> >              be processed. Please let me know if this does not work as
>     >> >              you expect.
>     >> >
>     >> >
>     >> >              For Question 2, we just recently added more information on
>     >> >              the grid specifications for the supported grids in python
>     >> >              embedding. The info is currently not in the latest
>     >> >              'develop' version of the docs on the web, but you can find
>     >> >              it here:
>     >> >
>     >> >              https://github.com/dtcenter/MET/blob/develop/met/docs/Users_Guide/appendixF.rst#python-embedding-for-2d-data
>     >> >
>     >> >              Please let me know if there is any information missing
>     >> >              from here that you would like to be described in more
>     >> >              detail. This is an evolving document!
>     >> >
>     >> >              Thanks,
>     >> >              George
>     >> >
>     >> >
>     >> >              On Mon, Nov 30, 2020 at 2:01 PM Rabenhorst, Scott D.
>     >> >              (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC] via RT
>     >> >              <met_help at ucar.edu> wrote:
>     >> >
>     >> >              >
>     >> >              > Mon Nov 30 14:00:42 2020: Request 97670 was acted upon.
>     >> >              > Transaction: Ticket created by scott.d.rabenhorst at nasa.gov
>     >> >              >        Queue: met_help
>     >> >              >      Subject: Questions about python embedding for a netcdf interface
>     >> >              >        Owner: Nobody
>     >> >              >   Requestors: scott.d.rabenhorst at nasa.gov
>     >> >              >       Status: new
>     >> >              >  Ticket <URL: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=97670 >
>     >> >
>     >> >              >
>     >> >              >
>     >> >              > Hi MET help,
>     >> >              >
>     >> >              > I am with NASA’s GMAO and we are interested in using
>     >> >              > METplus to augment the verification of the GEOS model.
>     >> >              > However, due to many constraints, our output files use
>     >> >              > the netcdf format with multiple valid times and levels -
>     >> >              > which is unlikely to change. Some of these 4D output
>     >> >              > collections are large, and therefore, splitting them into
>     >> >              > single time slices or a subset of levels is not ideal
>     >> >              > either just for verification. Therefore, I have been
>     >> >              > looking at python embedding examples
>     >> >              > (https://dtcenter.github.io/METplus/develop/generated/met_tool_wrapper/)
>     >> >              > and trying to brainstorm possible ways to mimic the grib
>     >> >              > interface for our use. Specifically, I’d like to
>     >> >              > understand the best approach to looping through multiple
>     >> >              > INIT_TIME/VALID_TIME and levels while allowing my python
>     >> >              > scripts to find the matching time and level and return
>     >> >              > the appropriate 2D “met_data” xarray object.
>     >> >              >
>     >> >              > 1) Question 1:
>     >> >              >
>     >> >              > I can envision a scenario writing a config
file
> similar
>     >> to:
>     >> >              > FCST_VAR1_NAME =
> {INPUT_BASE}/myscripts/read_nc2xr.py
>     >> >              > {INPUT_BASE}/mydata/forecast_file.nc4 TMP
>     >> {valid?fmt=%Y%m%d_%H%M} P500
>     >> >              > where it would be relatively easy to select
a
> single 2D
>     >> array out of this
>     >> >              > dataset given the provided filename,
variable,
> time, and
>     >> level. However,
>     >> >              > what is unclear to me is what to do when I
want to
>     >> verify on multiple
>     >> >              > levels?
>     >> >              >
>     >> >              > I’ve seen some examples similar to:
>     >> >              > FCST_VAR1_LEVEL = “(0,*,*),(2,*,*),(4,*,*)…”
>     >> >              > However, this seems to directly conflict with the
>     >> >              > required 2D xarray object returned by FCST_VAR<n>_NAME.
>     >> >              > How do these 2 config fields interplay using python
>     >> >              > embedding? Can FCST_VAR<n>_NAME return a 3D array for a
>     >> >              > given time slice while FCST_VAR<n>_LEVEL is used for
>     >> >              > level indices?
>     >> >              >
>     >> >              > Alternatively, if I had FCST_VAR1_LEVEL =
>     >> >              > P1000,P850,P700,P500,P250,P100, does this produce any
>     >> >              > internal keyword/variable such as {cur_level} that I
>     >> >              > could reference in the argument list for
>     >> >              > FCST_VAR<n>_NAME? In this case I’d prefer to use the more
>     >> >              > straightforward grib level/accumulation syntax which
>     >> >              > could be easily be used to lookup netcdf levels by a
>     >> >              > python script.
>     >> >              >
>     >> >              > 2) Question 2:
>     >> >              >
>     >> >              > For python embedding, is there a list of acceptable
>     >> >              > key-value attrs (the metadata dictionary) for different
>     >> >              > grids? Do these correspond to those supported by xarray?
>     >> >              >
>     >> >              > If you could provide any help or insight on these
>     >> >              > questions, it would be very helpful in thinking through a
>     >> >              > framework for our verification.
>     >> >              >
>     >> >              > Thanks,
>     >> >              > Scott
>     >> >              >
>     >> >              >
>     >> >              >
>     >> >              >
>     >> >              >
>     >> >
>     >> >              --
>     >> >              George McCabe - Software Engineer III
>     >> >              National Center for Atmospheric Research
>     >> >              Research Applications Laboratory
>     >> >              303-497-2768
>     >> >              ---
>     >> >              My working day may not be your working day. Please do not
>     >> >              feel obliged to reply to this email outside of your normal
>     >> >              working hours.
>     >> >
>     >> >
>     >> >
>     >> >
>     >>
>     >>
>     >
>     >
>
>
>
>
>
>
>

--
George McCabe - Software Engineer III
National Center for Atmospheric Research
Research Applications Laboratory
303-497-2768
---
My working day may not be your working day. Please do not feel obliged to
reply to this email outside of your normal working hours.

------------------------------------------------