[Go-essp-tech] Extending the DRS syntax to observations

Cinquini, Luca (3880) Luca.Cinquini at jpl.nasa.gov
Thu Jan 27 09:58:55 MST 2011


Hi Steve,
would it be too much to ask you to edit the wiki to reflect your suggestions ? You have been dealing with these problems for a long time and it would be great to capture your expertise there - I don't think I can do justice to your ideas... If so, I will ask our sys admin Paul Zimdars to provide you with an account.
thanks, Luca

On Jan 27, 2011, at 9:46 AM, Steve Hankin wrote:

A small editing suggestion: this DRS extension should be referred to as an extension for **gridded observations**. The term "observations", standing alone, would be misunderstood by some climate-relevant communities.

I just read through the wiki proposal and could not find an explicit discussion of spatial sampling concepts.  Those seem to be implicit in statements like "enforces a proper syntax for coordinate axes and grids".  (The statement "Datasets should be organized as time-series" seems potentially misleading, no?)    A few additional words to help reach a wider audience might be worthwhile.

    - Steve

=================================

On 1/27/2011 7:03 AM, Cinquini, Luca (3880) wrote:
Hi Sébastien,
thanks for your summary of the current status, it all seems correct to me. A few comments regarding the 3 points below:

1) Stephen Pascoe pointed out that the latest agreement is to derive the dataset version number from the date, for example v20100105 for data published on January 5th. I updated the wiki proposal to reflect this:

http://oodt.jpl.nasa.gov/wiki/display/CLIMATE/Data+and+Metadata+Requirements+for+CMIP5+Observational+Datasets

2) I agree we should focus on formalizing the DRS Controlled Vocabulary for obs via document sharing tools... when we have a first version, maybe PCMDI can encode it in a CMOR table ?
3) Ok, let's use [project:cmip5] in esg.ini. We might be able to share the same snippet of configuration across all agencies involved.
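
As an illustration of point 1 above, the date-derived version label could be produced and parsed as in the following minimal Python sketch. The function names drs_version and parse_drs_version are hypothetical; they are not part of the ESG publisher or CMOR.

```python
import re
from datetime import date, datetime


def drs_version(published: date) -> str:
    """Return a date-based version label, e.g. 'v20100105' (hypothetical helper)."""
    return "v" + published.strftime("%Y%m%d")


def parse_drs_version(label: str) -> date:
    """Recover the publication date from a label of the form vYYYYMMDD."""
    if not re.fullmatch(r"v\d{8}", label):
        raise ValueError(f"not a vYYYYMMDD version label: {label!r}")
    return datetime.strptime(label[1:], "%Y%m%d").date()


print(drs_version(date(2010, 1, 5)))  # v20100105
```

A side effect of this convention is that version labels sort chronologically when compared as plain strings.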

thanks, Luca

On Jan 27, 2011, at 7:41 AM, Sébastien Denvil wrote:

Hello Luca, all,

Within the CFMIP-obs context, IPSL plans to publish observational datasets that will be made part of the CMIP5 database. So far we have identified:
- CALIPSO-GOCCP (monthly & daily)
- PARASOL (monthly & daily)
- CloudSat diagnostics (consistent with COSP output), which are:
    - CFADs (radar reflectivity histograms as a function of altitude), prepared by Roj Marchand (CSU)
    - ISCCP diagnostics consistent with COSP outputs, prepared by Yuying Zhang (PCMDI) and Rob Pincus (CIRES)
- SIRTA-Cabauw sites database developed within the EU FP7 EUCLIPSE project, prepared by Chiriaco & Haeffelin (IPSL). It could be included in obsCfSites; in principle it is close to the ARM stations case.

To summarize what I understood from this thread, there are 3 documents to produce:
1) agree on a DRS suitable for observational datasets (Luca's proposal looks good to me)
2) agree on a set of new CMOR tables (possibly to be included in the cmip5 cmor table git repository)
3) agree on an esg.ini setup able to publish observational datasets the way we want to

Regarding esg.ini what Renata suggested seems to me like a good idea:
"to make it a CMIP5 project then your definitions should be in the section
[project:cmip5]
which then means extending the existing cmip5 definitions and options, that mostly seems to be not a problem."

Setting up the new tables may prove to be the "hard point". Working on a Google Docs page with the CMOR table spreadsheet could ease the process.

Regards.
Sébastien

On 27/01/2011 14:28, martin.juckes at stfc.ac.uk wrote:
Hello All,

This is an interesting discussion – I’ve copied Victoria in because she is involved in discussions with the ESA Climate Change Initiative about data formats etc for the climate datasets ESA is planning to produce. It would be really good to try to coordinate efforts, at least up to the level of agreeing a set of controlled vocabularies to identify datasets, and agreeing what file attributes will be used to hold that information in the file.  Luca’s wiki page lays out the issues clearly – so perhaps we should see what the ESA CCI group think.

Cheers,
Martin

From: go-essp-tech-bounces at ucar.edu [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of Renata McCoy
Sent: 27 January 2011 00:42
To: Cinquini, Luca (3880)
Cc: go-essp-tech at ucar.edu; climate-obs; Giri Palanisamy; Renata McCoy
Subject: Re: [Go-essp-tech] Extending the DRS syntax to observations

Hi Luca,


I see that you defined your project as an AIRS project. If we want to make it a CMIP5 project then your definitions should be in the section
[project:cmip5]
which then means extending the existing cmip5 definitions and options, that mostly seems to be not a problem.

Giri set up a Google Docs page with the CMOR table spreadsheet. If he doesn't mind, we could all work on it. I will work on it tomorrow to clean it up. Right now it is very ARM specific, but we discussed changing that and making it inclusive of all observations.

Renata


On Jan 26, 2011, at 4:19 PM, Cinquini, Luca (3880) wrote:


Hi Renata,
I'm glad you are experimenting with the same issues...

I also experimented with using the ESG publisher to publish AIRS data organized in the DRS-like fashion I mentioned:

<root directory>/cmip5/observations/nasa/aqua/airs/mon/atmos/ta/l3/vYYYYMMDD/<files>

I was able to succeed by using the following snippet of code in esg.ini:

[project:AIRS]
category_defaults =
        experiment | airs_exp
        model | airs_model
dataset_id = %(activity)s.%(product)s.%(agency)s.%(mission)s.%(instrument)s.%(time_frequency)s.%(realm)s.%(variable)s.%(level)s
dataset_name_format = AIRS Level 3 Monthly Data (NetCDF)
directory_format = /esg/data/%(activity)s/%(product)s/%(agency)s/%(mission)s/%(instrument)s/%(time_frequency)s/%(realm)s/%(variable)s/%(level)s/%(version)s
handler = esgcet.config.netcdf_handler:NetcdfHandler
parent_id = nasa.airs
variable_per_file = true
las_configure = true
maps = las_time_delta_map

In other words, I instructed the publisher to use fictitious values for experiment and model, but I also instructed it to read all the relevant DRS fields from the directory structure itself. That seemed to work.
Maybe Bob can comment if this seems reasonable to him.
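
A minimal Python sketch of the idea described above: the DRS fields are read back from a directory path by matching it against the directory_format template, and the dataset_id is then filled in from those fields. This is only an illustration and is not how the ESG publisher implements it internally; the helper names and the version value v20110126 are assumptions.

```python
import re

# Templates copied from the esg.ini snippet above.
DIRECTORY_FORMAT = (
    "/esg/data/%(activity)s/%(product)s/%(agency)s/%(mission)s/%(instrument)s/"
    "%(time_frequency)s/%(realm)s/%(variable)s/%(level)s/%(version)s"
)
DATASET_ID = (
    "%(activity)s.%(product)s.%(agency)s.%(mission)s.%(instrument)s."
    "%(time_frequency)s.%(realm)s.%(variable)s.%(level)s"
)


def template_to_regex(template: str) -> re.Pattern:
    """Turn each %(name)s placeholder into a named group matching one path component."""
    # With one capture group, re.split alternates literal text and placeholder names.
    parts = re.split(r"%\((\w+)\)s", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)        # literal text such as "/esg/data/"
        else:
            pattern += f"(?P<{part}>[^/]+)"   # one directory level per field
    return re.compile(pattern)


def fields_from_directory(path: str) -> dict:
    """Extract the DRS fields from a directory laid out as in DIRECTORY_FORMAT."""
    match = template_to_regex(DIRECTORY_FORMAT).fullmatch(path.rstrip("/"))
    if match is None:
        raise ValueError(f"path does not follow the directory format: {path}")
    return match.groupdict()


# Hypothetical directory; the version date is made up for this example.
fields = fields_from_directory(
    "/esg/data/cmip5/observations/nasa/aqua/airs/mon/atmos/ta/l3/v20110126"
)
print(DATASET_ID % fields)  # cmip5.observations.nasa.aqua.airs.mon.atmos.ta.l3
```

The extra "version" key is simply ignored when formatting the dataset_id, since %-formatting with a mapping only looks up the names the template asks for.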

And I agree that a CMOR table for observations needs to be developed. Could we open up that process so we can include all relevant values not just from ORNL, but also NASA, NOAA and any agency who will contribute observations ?

thanks a lot,
Luca

On Jan 26, 2011, at 4:50 PM, Renata McCoy wrote:


Hi Luca,

I was experimenting with setting up a publishing script for the ARM cmbe observational data following the proposal below. I see a few issues:
As I understand it (I am cc-ing Bob Drach to check on that), the publisher needs at least these 3 fields for the 'CMIP5' project:
project (which is 'CMIP5'), experiment (which could be specified as 'none' in the command options, but maybe should be 'observations'?), and model.
'model' is an important field, and I was trying to set it up so that 'cmbe' would be my 'model', which seems to be equivalent to 'airs' for NASA AIRS data.

The other problem is that we need a specific table (CMOR table) to establish the controlled variable list and data frequency that is obs specific.
We (with the ORNL team) are trying to create an obsCfSites table that would be mostly a copy of the cfSites table with an obs-specific time definition (1-hour averages for cmbe, 30-minute averages for CDIAC data; for 3D data with a vertical axis, we would like to specify that any pressure level could be reported). I think it would be good if we could encompass all the observational data needs in one observational table.

I have not tried a test publication of the data yet; I am working on rewriting the data with the appropriate metadata first, so I am not sure what other problems may pop up.

Greetings,
Renata

------------------------------------------------------------
Renata B. McCoy, Ph.D
Program for Climate Model Diagnosis and Intercomparison(L-103)
Lawrence Livermore National Laboratory
P.O. Box 808
Livermore, CA 94551

(925) 424-5237 (voice)
(925) 422-7675 (fax)
mccoy20 at llnl.gov
------------------------------------------------------------


On Jan 26, 2011, at 2:50 PM, Cinquini, Luca (3880) wrote:


Hi all,
apologies for cross-posting...

I'd like to start a discussion on how the DRS specification for CMIP5 model output (http://cmip-pcmdi.llnl.gov/cmip5/docs/cmip5_data_reference_syntax.pdf)
can be applied to observational datasets that will be made part of the same archive.  A proposal based on recent workshops with PCMDI is detailed on this wiki page:

http://oodt.jpl.nasa.gov/wiki/display/CLIMATE/Data+and+Metadata+Requirements+for+CMIP5+Observational+Datasets

As an example, the directory structure for the NASA AIRS dataset would look like this:

<root directory>/cmip5/observations/nasa/aqua/airs/mon/atmos/ta/l3/vYYYYMMDD/<files>

where:

"observations"=<product>, same as DRS "output1" or "output2"
"nasa"=<agency>, same as DRS "institute"
"aqua"=<mission>, replaces DRS "model"
"airs"=<instrument>, replaces DRS "experiment"
"l3"=<processing level>, replaces DRS "ensemble member"
vYYYYMMDD=the dated version, same as specified by the DRS

The values for the various fields <agency>, <mission>, <instrument> and <processing level> would need to be selected from a controlled vocabulary similar to the one established for models.
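
To make the mapping above concrete, here is a minimal Python sketch that assembles the proposed directory from its fields and rejects values outside a controlled vocabulary. The vocabulary below contains only the values from the AIRS example and is a placeholder, not an agreed list; the field names, the function name, the root "/data" and the version v20110126 are likewise assumptions.

```python
from pathlib import PurePosixPath

# Placeholder vocabulary: only the values used in the AIRS example above.
CONTROLLED_VOCABULARY = {
    "agency": {"nasa"},
    "mission": {"aqua"},
    "instrument": {"airs"},
    "processing_level": {"l3"},
}

# Order of the directory levels in the proposed layout.
FIELD_ORDER = (
    "activity", "product", "agency", "mission", "instrument",
    "time_frequency", "realm", "variable", "processing_level", "version",
)


def obs_drs_directory(root: str, **fields: str) -> PurePosixPath:
    """Build <root>/<activity>/<product>/.../<version> after checking the vocabulary."""
    missing = [name for name in FIELD_ORDER if name not in fields]
    if missing:
        raise ValueError(f"missing DRS fields: {missing}")
    for name, allowed in CONTROLLED_VOCABULARY.items():
        if fields[name] not in allowed:
            raise ValueError(f"{name}={fields[name]!r} is not in the controlled vocabulary")
    return PurePosixPath(root, *(fields[name] for name in FIELD_ORDER))


airs_dir = obs_drs_directory(
    "/data",
    activity="cmip5", product="observations", agency="nasa", mission="aqua",
    instrument="airs", time_frequency="mon", realm="atmos", variable="ta",
    processing_level="l3", version="v20110126",
)
print(airs_dir)  # /data/cmip5/observations/nasa/aqua/airs/mon/atmos/ta/l3/v20110126
```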

Any comment or insight on the matter is appreciated.... The idea is to try to finalize the specification relatively quickly, let's say a couple of weeks, so that we can start preparing
and publishing these observations into the CMIP5 archive.

thanks in advance,

Luca









_______________________________________________
GO-ESSP-TECH mailing list
GO-ESSP-TECH at ucar.edu
http://mailman.ucar.edu/mailman/listinfo/go-essp-tech




--
Sébastien Denvil
IPSL, Pôle de modélisation du climat
UPMC, Case 101, 4 place Jussieu,
75252 Paris Cedex 5

Tour 45-55 2ème étage Bureau 209
Tel: 33 1 44 27 21 10
Fax: 33 1 44 27 39 02



