[Go-essp-tech] Extending the DRS syntax to observations

Karl Taylor taylor13 at llnl.gov
Wed Feb 9 22:06:56 MST 2011


Hi all,

I'm at a meeting on the east coast and will be tied up from 11-12 this 
morning.  Here are some brief comments concerning the draft proposal 
that I hope will be of some use in my absence:

1.  The directory structure and filenames (and the underlying DRS) are 
all meant to make it easier for users to navigate to the data they want 
and to unambiguously identify the data, so that is much more important 
than making it look like CMIP5.  My sense is that folks looking for 
observational data will want to be able to easily see (through the DRS 
categories)
a) the variable
b) the sampling frequency
c) perhaps the "realm"
d) the time-period over which the variable was measured

Users will also want to be able to distinguish among the various 
observational products available for their purposes.  So,
a) something about how the measurement was made and processed  (maybe 
instrument is sufficient; perhaps institute, mission, agency are not all 
needed)
b) version of observational product
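
A minimal sketch of how the categories listed above might be carried as 
searchable facets.  The class name ObsFacets, the select() helper, and all 
sample values (including the instrument names) are hypothetical and are not 
part of any agreed DRS:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObsFacets:
    """Hypothetical set of facets for one observational dataset."""
    variable: str
    frequency: str
    realm: str
    time_period: str   # period over which the variable was measured
    instrument: str    # stands in for "how it was measured and processed"
    version: str       # version of the observational product


def select(datasets, **wanted):
    """Return the datasets whose facets match every keyword given."""
    return [d for d in datasets
            if all(getattr(d, key) == value for key, value in wanted.items())]


# Made-up catalog entries, for illustration only.
catalog = [
    ObsFacets("tas", "mon", "atmos", "200001-200912", "instrumentA", "v1"),
    ObsFacets("tas", "mon", "atmos", "200001-200912", "instrumentB", "v2"),
    ObsFacets("pr", "day", "atmos", "199801-200912", "instrumentA", "v1"),
]

# Two products share variable and frequency; instrument and version
# are what tell them apart.
for match in select(catalog, variable="tas", frequency="mon"):
    print(match.instrument, match.version)
```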

2.  Note that the default directory structure generated by CMOR2 differs 
from the ESGF directory structure, as described in sections 3.1 and 3.3 of

http://cmip-pcmdi.llnl.gov/cmip5/output_req.html?submenuheader=2#req_format

That document recommends that ESGF data nodes lay out datasets on disk 
by mapping DRS components to directories as follows:

<activity>/<product>/<institute>/<model>/<experiment>/<frequency>/<modeling realm>/<MIP table>/<ensemble member>/<version number>/<variable name>/<CMOR filename>.nc

Example:

/CMIP5/output1/UKMO/HadCM3/decadal1990/day/atmos/day/r3i2p1/v20100105/tas/tas_day_HADCM3_decadal1990_r3i2p1_199001-199012.nc

The observations don't need to follow this template (and probably 
shouldn't), but the current observations draft document incorrectly 
describes the CMIP5 structure.
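
For illustration, the CMIP5 directory template quoted above can be sketched 
as a small path builder.  The function name drs_directory and the dictionary 
keys are assumptions made for this example; only the component order and the 
sample values come from the quoted text:

```python
from pathlib import PurePosixPath

# Component order taken from the quoted ESGF template.
DRS_ORDER = [
    "activity", "product", "institute", "model", "experiment",
    "frequency", "modeling_realm", "mip_table", "ensemble_member",
    "version", "variable",
]


def drs_directory(components):
    """Join DRS components into a directory path, in template order."""
    missing = [key for key in DRS_ORDER if key not in components]
    if missing:
        raise KeyError(f"missing DRS components: {missing}")
    return PurePosixPath("/", *(components[key] for key in DRS_ORDER))


# Values from the example path in the quoted document.
example = {
    "activity": "CMIP5",
    "product": "output1",
    "institute": "UKMO",
    "model": "HadCM3",
    "experiment": "decadal1990",
    "frequency": "day",
    "modeling_realm": "atmos",
    "mip_table": "day",
    "ensemble_member": "r3i2p1",
    "version": "v20100105",
    "variable": "tas",
}

print(drs_directory(example))
```

An observational layout would presumably use a different component list; 
only DRS_ORDER would need to change in this sketch.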

3.  I would recommend that observational products be written using 
CMOR2.  I do not think it is a good use of resources to generalize and 
"harden" the CMOR checker to enforce anything.  It wasn't meant for this 
purpose and this would be a big job.

4.  I would advise that all variables that appear in a single CMOR table 
at least
a) share the same sampling frequency
b) share the same realm (although you might want to include 2 
closely-related realms)
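
A rough sketch of how such a constraint could be checked before variables 
are grouped into one table.  The function check_table, its input format, and 
the sample entries are hypothetical; this is not part of CMOR2, and the 
sketch only counts realms since it cannot judge whether two realms are 
"closely related":

```python
def check_table(variables, max_realms=2):
    """Check that all variables in one table share frequency and realm.

    `variables` maps a variable name to a (frequency, realm) pair.
    Returns a list of problem descriptions; an empty list means the
    table is consistent under this simple rule.
    """
    frequencies = {freq for freq, _ in variables.values()}
    realms = {realm for _, realm in variables.values()}

    problems = []
    if len(frequencies) > 1:
        problems.append(f"mixed sampling frequencies: {sorted(frequencies)}")
    if len(realms) > max_realms:
        problems.append(f"too many realms: {sorted(realms)}")
    return problems


# Made-up table contents, for illustration only.
consistent = {"tas": ("mon", "atmos"), "pr": ("mon", "atmos")}
mixed = {"tas": ("mon", "atmos"), "pr": ("day", "atmos")}

print(check_table(consistent))
print(check_table(mixed))
```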

5.  Recall Charles Doutriaux's note that we do not yet have program 
support for some of what will be needed.

6.  Also, a reminder:  observations should not be under the "CMIP5" 
activity.  I can ask the WGCM if something like "obs4CMIP5" would be 
o.k. (I rather like this.)

In preparing the above comments, I've mostly thought about gridded 
global datasets.

Best regards,
Karl

On 2/9/11 9:25 AM, Cinquini, Luca (3880) wrote:
> Hi all,
> 	Dean Williams has kindly made the following number available for tomorrow's conference call:
>
>   (925) 424-8105
>   access code 305757#
>
> The call is scheduled for 8am PST / 9am MST / 10am CST / 11am EST / 4pm GMT / 5pm France/Germany. We will discuss the adoption of community-wide metadata conventions for observational datasets that are going to be made part of the CMIP5 archive.
>
> Thanks in advance to all for participating,
>
> Luca
>
> On Feb 9, 2011, at 8:35 AM, Christensen, Sigurd W. wrote:
>
>> Luca,
>>    Several of us think that a call Thursday would be good, but we probably won't finish then.
>>
>>    We think that not only a different CMOR table, but also a different directory structure/filename structure may be appropriate for three or more of the categories mentioned in the link:
>>
>> -Decide on whether to have one single CMOR table for observations (currently "obsSites"), or more than one depending on types of observational data:
>>    *remote sensed (grids and swaths)
>>    *in-situ stations (time series and profiles)
>>    *trajectory-based observations
>>    *in-situ gridded products
>>
>>    The discussion thus far emphasizes fields and order for naming conventions for satellite-based data.  Perhaps those can be finalized Thursday.  But point-oriented surface and/or profile time-series data (such as ARM, AmeriFlux, etc.), and trajectory-based observations, will likely need more consideration.  Karl, on January 31, indicated that variable name, modeling realm, and frequency should be carried to the DRS (Data Reference System), but the rest could in essence be tailored to the needs of observational data.
>>
>>    Thanks,
>>    Giri and Sig
>>
>>
>> -----Original Message-----
>> From: Cinquini, Luca (3880) [mailto:Luca.Cinquini at jpl.nasa.gov]
>> Sent: Tuesday, February 08, 2011 15:58
>> To: Lynnes, Christopher S. (GSFC-6102)
>> Cc: Huffman, George J. (GSFC-613.1)[SCIENCE SYSTEMS APPLICATIONS]; Karl Taylor; Steve Hankin; Bryan Lawrence; go-essp-tech at ucar.edu; Sébastien Denvil; climate-obs; McCoy, Renata
>> Subject: Re: [Go-essp-tech] Extending the DRS syntax to observations
>>
>> Hi all,
>> 	I would like to propose a conference call to discuss and hopefully resolve any remaining issues concerning metadata conventions for CMIP5 observations. Would anybody object if we had this call in only two days, on Thursday, February 10, at 8am PST/11am EST, which I think is 4pm in the UK and 5pm in France and Germany? If this is too soon, we could postpone until next week.
>>
>> As a reminder, this is the URL of the current proposal:
>>
>> http://oodt.jpl.nasa.gov/wiki/display/CLIMATE/Data+and+Metadata+Requirements+for+CMIP5+Observational+Datasets
>>
>> which at the very beginning contains a summary of the issues still open. Please reply if you can't make the meeting and you really would like to attend, or if you think there are other issues to discuss.
>>
>> Best regards,
>> thanks, Luca
>>
>> P.S.: if the conference is a go, we'll set up a phone line....
>>
>>
>> On Feb 2, 2011, at 3:17 PM, Lynnes, Christopher S. (GSFC-6102) wrote:
>>
>>> On Feb 2, 2011, at 5:08 PM, Cinquini, Luca (3880) wrote:
>>>
>>>> Hi Chris and George,
>>>> 	thanks for your input... I guess the question is whether you would be opposed to re-arranging the fields according to an order that is commonly agreed upon (and that possibly resembles the DRS structure for models), provided that all the relevant information is included?
>>> Since my philosophy is to tailor for the expected user community, I defer to you and your colleagues regarding the order, since you know the community.  My main interest is just ensuring the inclusion of the relevant information.
>>>
>>>> I think at this point we might be able to make faster progress by organizing a conference call to discuss these issues...
>>>>
>>>> thanks, Luca
>>>>
>>>> On Feb 2, 2011, at 2:42 PM, Lynnes, Christopher S. (GSFC-6102) wrote:
>>>>
>>>>> On Feb 2, 2011, at 4:26 PM, George J. Huffman wrote:
>>>>>
>>>>>> There are other variables that could go in the last position since the
>>>>>> original datasets contain multiple variables as "fields".  I should say
>>>>>> that the Goddard DISC puts Level before Instrument, and you might want
>>>>>> to consider why they did that.  [This is mostly an issue if you're
>>>>>> trying to build a syntax that is generally useful, not just focused on
>>>>>> gridded data.]
>>>>> We (at Goddard DISC) put Level before Instrument because we anticipate that the user community for Level 3 gridded data is somewhat distinct from that for Level 2 or Level 1 swath data, which require considerably more sophisticated and customized tools to work with than Level 3.  I don't know if that is as relevant in the CMIP5 context as in our more generalized search interface (as George implies.)
>>>>> --
>>>>> Dr. Christopher Lynnes     NASA/GSFC, Code 610.2    phone: 301-614-5185
>>>>>
>>>>>
>>>>
>>> --
>>> Dr. Christopher Lynnes     NASA/GSFC, Code 610.2    phone: 301-614-5185
>>>
>>>
>>>
>>