[Go-essp-tech] Extending the DRS syntax to observations

Palanisamy, Giri palanisamyg at ornl.gov
Thu Jan 27 11:42:22 MST 2011


Hi Luca,

Thanks for the review. Please see my comments below:

>> Please see previous email about the very latest convention on the dataset version: instead of "v1" it needs to be something like "vYYYYMMDD"
Sure, we will use this convention for the dataset version
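
A publishing script could check the dated version label before accepting a dataset. The following is only a minimal sketch of such a check, assuming the label is the letter "v" followed by an eight-digit calendar date; the function name `is_dated_version` is hypothetical and is not part of any existing publisher.

```python
import re
from datetime import datetime

# Assumed pattern: literal "v" followed by exactly eight digits (YYYYMMDD).
VERSION_RE = re.compile(r"^v(\d{8})$")


def is_dated_version(label: str) -> bool:
    """Return True if label looks like 'vYYYYMMDD' and encodes a real date."""
    match = VERSION_RE.match(label)
    if match is None:
        return False
    try:
        # strptime rejects impossible dates such as month 13.
        datetime.strptime(match.group(1), "%Y%m%d")
    except ValueError:
        return False
    return True


print(is_dated_version("v20110127"))
print(is_dated_version("v1"))
```

Under these assumptions the old-style label "v1" is rejected and a dated label such as "v20110127" is accepted.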

>>cmip5/observations/doe/arm/cmbe/1hr/atmos/ps/c1p1/vYYYYMMDD/ps_cfSites_arm_cmbe_c1p1barrow_2001010100-2010010100.nc

Sure, I am OK with adding location information as part of the <processing level and product version> field, but we may need to consider adding a delimiter so that the publishing script can parse these values correctly (example: c1p1-barrow)
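
To illustrate why a delimiter helps, here is a small sketch of how a script might split the combined field. The function name `split_product_field` and the choice of "-" as the delimiter are assumptions taken from the example above, not an agreed convention.

```python
def split_product_field(field: str, delimiter: str = "-"):
    """Split '<level+version><delimiter><site>' into (level, site).

    Returns (field, None) when no delimiter is present, because the
    site name cannot then be separated reliably from the level.
    """
    level, separator, site = field.partition(delimiter)
    return level, (site if separator else None)


print(split_product_field("c1p1-barrow"))
print(split_product_field("c1p1barrow"))
```

Without the delimiter, the script has no general way to tell where "c1p1" ends and the site name begins, so the second call returns the whole string as the level.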

Thanks!

Giri

From: Cinquini, Luca (3880) [mailto:Luca.Cinquini at jpl.nasa.gov]
Sent: Thursday, January 27, 2011 11:53 AM
To: Palanisamy, Giri
Cc: Renata McCoy; climate-obs; go-essp-tech at ucar.edu; Bob Drach; Dean Williams; Karl Taylor; Christensen, Sigurd W.
Subject: Re: Extending the DRS syntax to observations

Hi Giri,
            thanks for taking a stab at this...

o Please see previous email about the very latest convention on the dataset version: instead of "v1" it needs to be something like "vYYYYMMDD"

o I am not aware of any convention about case in the DRS (does anybody know?), but I'd say we should try to establish one at least for observations: either always use lower case, or have everything in lower case except possibly for AGENCY, MISSION and INSTRUMENT (as you have done). Someone needs to make an authoritative decision (PCMDI?)

o I think the DRS spec mandates that the <mission or project> field be the same in the directory and in the filename, and I think we should follow that if we want to avoid future trouble. If you want to differentiate files from different sites, maybe "barrow" could be made part of the last field... In other words, the wiki now proposes that the model field <ensemble> in the file names is used as <processing level and product version> for observations, but maybe its meaning could be expanded to cover anything that the data provider wants to encode to differentiate files. Maybe something like <product type>, which is encoded according to project-specific nomenclature?

So your full example could be something like this:

cmip5/observations/doe/arm/cmbe/1hr/atmos/ps/c1p1/vYYYYMMDD/ps_cfSites_arm_cmbe_c1p1barrow_2001010100-2010010100.nc

(minus the case...)
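
The consistency requirement between directory and file name could be checked mechanically. This is a rough sketch only: it assumes the directory layout activity/product/agency/mission/instrument/... and the file name layout variable_table_mission_instrument_..., and it compares case-insensitively because the case convention is still undecided. The function name `mission_fields_match` is hypothetical.

```python
def mission_fields_match(path: str) -> bool:
    """Compare <mission or project> in the directory and in the file name."""
    *directories, filename = path.split("/")
    # Assumed directory layout: activity/product/agency/mission/instrument/...
    directory_mission = directories[3]
    # Assumed file name layout: variable_table_mission_instrument_...
    filename_mission = filename.split("_")[2]
    # Case-insensitive comparison is an assumption of this sketch.
    return directory_mission.lower() == filename_mission.lower()


proposed = (
    "cmip5/observations/doe/arm/cmbe/1hr/atmos/ps/c1p1/vYYYYMMDD/"
    "ps_cfSites_arm_cmbe_c1p1barrow_2001010100-2010010100.nc"
)
print(mission_fields_match(proposed))
```

With the example above the two fields agree ("arm" in both places); a file name carrying "arm-barrow" under an "ARM" directory would fail the same check.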

BTW, the files still contain one single variable per file, right?

I started to collect these proposed examples on the wiki, and I will modify them as consensus is reached...

thanks again,
Luca

On Jan 27, 2011, at 9:16 AM, Palanisamy, Giri wrote:


Hi Luca...

Some of us from ORNL with the help of Renata have defined the naming and file name structure for the ARM CMBE files, following the draft suggestions in Data and Metadata Requirements for CMIP5 Observational Datasets (http://oodt.jpl.nasa.gov/wiki/display/CLIMATE/Data+and+Metadata+Requirements+for+CMIP5+Observational+Datasets). Please see the details below.  We would like to have the group review this structure and provide us feedback.  We highlight two particular questions after the details are shown.

For the directory structure:
 cmip5/observations/DOE/ARM/CMBE/1hr/atmos/ps/c1p1/v1/
  We used this directory structure naming convention:
<activity>: cmip5
<product>: observations
<agency>: DOE
<mission or project>: ARM
<instrument>: CMBE
<frequency>:1hr
<observational realm>: atmos
<variable name>: ps
<processing level and product version>: c1p1
<dataset version number>: v1
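
The field list above can be turned into a directory path by joining the values in order. The sketch below assumes exactly that ordering; the names `OBS_DIRECTORY_FIELDS` and `build_obs_directory` are made up for illustration and do not come from the publisher.

```python
import posixpath

# Field order as listed above; the key spellings are assumptions of this sketch.
OBS_DIRECTORY_FIELDS = (
    "activity",
    "product",
    "agency",
    "mission_or_project",
    "instrument",
    "frequency",
    "observational_realm",
    "variable_name",
    "processing_level_and_product_version",
    "dataset_version_number",
)


def build_obs_directory(values: dict) -> str:
    """Join the field values, in DRS order, into a relative directory path."""
    missing = [name for name in OBS_DIRECTORY_FIELDS if name not in values]
    if missing:
        raise KeyError(f"missing DRS fields: {missing}")
    return posixpath.join(*(values[name] for name in OBS_DIRECTORY_FIELDS))


example = dict(zip(
    OBS_DIRECTORY_FIELDS,
    ("cmip5", "observations", "DOE", "ARM", "CMBE",
     "1hr", "atmos", "ps", "c1p1", "v1"),
))
print(build_obs_directory(example))
```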

for file naming:
 ps_cfSites_arm-barrow_cmbe_c1p1_2001010100-2010010100.nc

we used the following file naming convention:
<variable name>_<variable table>_<mission or project>_<instrument>_<processing level and product version>_<temporal subset>.nc
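
A script reading these file names could recover the fields by splitting on the underscore. This is a hedged sketch that assumes no field value itself contains an underscore; `FILENAME_FIELDS` and `parse_obs_filename` are hypothetical names.

```python
# Field order taken from the naming convention above.
FILENAME_FIELDS = (
    "variable_name",
    "variable_table",
    "mission_or_project",
    "instrument",
    "processing_level_and_product_version",
    "temporal_subset",
)


def parse_obs_filename(filename: str) -> dict:
    """Split an observational file name into its underscore-separated fields."""
    if not filename.endswith(".nc"):
        raise ValueError(f"not a NetCDF file name: {filename}")
    parts = filename[: -len(".nc")].split("_")
    if len(parts) != len(FILENAME_FIELDS):
        raise ValueError(
            f"expected {len(FILENAME_FIELDS)} fields, got {len(parts)}"
        )
    return dict(zip(FILENAME_FIELDS, parts))


parsed = parse_obs_filename(
    "ps_cfSites_arm-barrow_cmbe_c1p1_2001010100-2010010100.nc"
)
print(parsed["mission_or_project"])
```

Because the hyphen is not the field separator, "arm-barrow" survives as a single <mission or project> value, which is exactly the point raised in question (1) below.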

Two questions:

(1)     For the <mission or project> field, we are using <ARM> in the directory structure, but for this field in the file name structure, we are using <arm-barrow>.  This is mainly to consolidate the files from the various ARM locations in one final directory.  Do you think this will cause any issues in the publishing script, in the discovery process, or in CMIP5 tools using the data once they are found?
(2)     We sometimes use upper case where it seemed appropriate.  Is there a standard for case, especially as it applies to major projects, programs, etc.?


Thanks!

Giri

From: Renata McCoy [mailto:mccoy20 at llnl.gov]
Sent: Wednesday, January 26, 2011 6:51 PM
To: Cinquini, Luca (3880)
Cc: Renata McCoy; climate-obs; go-essp-tech at ucar.edu; Bob Drach; Dean Williams; Karl Taylor
Subject: Re: Extending the DRS syntax to observations

Hi Luca,

I was experimenting with setting up a publishing script for the ARM cmbe observational data following the proposal below. I see a few issues:
I understand (I am cc-ing Bob Drach to check on that) that the publisher needs at least these 3 fields for the 'CMIP5' project:
project (which is 'CMIP5'), experiment (which could be specified as 'none' in the command options, but maybe should be 'observations'?), and model.
A 'model' is an important field, and I was trying to set it up so that 'cmbe' would be my 'model', which seems to be equivalent to 'airs' for NASA AIRS data.

The other problem is that we need a specific table (CMOR table) to establish the controlled variable list and data frequency that is obs-specific.
We (with the ORNL team) are trying to create an obsCfSites table that would be mostly a copy of the cfSites table with obs-specific time definitions (a 1-hour average for cmbe, a 30-minute average for CDIAC data, and, for 3D data with a vertical axis, we would like to specify that any pressure level could be reported). I think it would be good if we could encompass all the observational data needs in one observational table.

I have not tried to test-publish the data yet; I am working on rewriting the data with the appropriate metadata first, so I am not sure what other problems may pop up.

Greetings,
Renata

------------------------------------------------------------
Renata B. McCoy, Ph.D
Program for Climate Model Diagnosis and Intercomparison(L-103)
Lawrence Livermore National Laboratory
P.O. Box 808
Livermore, CA 94551

(925) 424-5237 (voice)
(925) 422-7675 (fax)
mccoy20 at llnl.gov
------------------------------------------------------------


On Jan 26, 2011, at 2:50 PM, Cinquini, Luca (3880) wrote:



Hi all,
            apologies for cross-posting...

I'd like to start a discussion on how the DRS specification for CMIP5 model output (http://cmip-pcmdi.llnl.gov/cmip5/docs/cmip5_data_reference_syntax.pdf)
can be applied to observational datasets that will be made part of the same archive.  A proposal based on recent workshops with PCMDI is detailed on this wiki page:

http://oodt.jpl.nasa.gov/wiki/display/CLIMATE/Data+and+Metadata+Requirements+for+CMIP5+Observational+Datasets

As an example, the directory structure for the NASA AIRS dataset would look like this:

<root directory>/cmip5/observations/nasa/aqua/airs/mon/atmos/ta/l3/vYYYYMMDD/<files>

where:

"observations"=<product>, same as DRS "output1" or "output2"
"nasa"=<agency>, same as DRS "institute"
"aqua"=<mission>, replaces DRS "model"
"airs"=<instrument>, replaces DRS "experiment"
"l3"=<processing level>, replaces DRS "ensemble member"
vYYYYMMDD=the dated version, same as specified by the DRS

The values for the various fields <agency>, <mission>, <instrument> and <processing level> would need to be selected from a controlled vocabulary similar to the one established for models.
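
A controlled-vocabulary check of the kind described could look roughly like the sketch below. The vocabulary contents are purely hypothetical: only the AIRS values come from the example above, and no actual controlled vocabulary for observations has been agreed yet. The names `CONTROLLED_VOCABULARY` and `invalid_fields` are also made up for illustration.

```python
# Hypothetical vocabularies; only the AIRS values are taken from the example.
CONTROLLED_VOCABULARY = {
    "agency": {"nasa"},
    "mission": {"aqua"},
    "instrument": {"airs"},
    "processing_level": {"l3"},
}


def invalid_fields(fields: dict) -> list:
    """Return the names of fields whose values are not in the vocabulary."""
    return [
        name
        for name, allowed in CONTROLLED_VOCABULARY.items()
        if fields.get(name) not in allowed
    ]


airs = {
    "agency": "nasa",
    "mission": "aqua",
    "instrument": "airs",
    "processing_level": "l3",
}
print(invalid_fields(airs))
```

An empty list means every field value was found in its vocabulary; a missing or unknown value shows up by field name.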

Any comment or insight on the matter is appreciated.... The idea is to try to finalize the specification relatively quickly, let's say a couple of weeks, so that we can start preparing
and publishing these observations into the CMIP5 archive.

thanks in advance,

Luca


