[Go-essp-tech] Updating the DRS document *with attachments*

Karl Taylor taylor13 at llnl.gov
Wed Jan 13 10:07:44 MST 2010


Hello Stephen,

CMOR can't know what the version is, and the writer probably won't know 
either.  (ESG will know, however.)  For the user it will be better not 
to have a version in the filename because then he won't have to look up 
the latest version for each model before reading in their data.  In the 
directory structure, we might want to have one of the "version" 
directories be "vlatest" and alias it to the actual directory (e.g., 
"v1", "v2", or whichever is the latest version).
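
A minimal sketch of the "vlatest" alias idea, assuming a POSIX filesystem where the alias is a symbolic link and the version directories are named "v" followed by an integer. The names latest_version_dir and update_latest_alias are hypothetical; nothing here is part of CMOR, ESG, or the DRS document.

```python
import os
import re
from pathlib import Path
from typing import Optional

# Assumption: version directories are named "v1", "v2", ... (integer suffix).
VERSION_RE = re.compile(r"^v(\d+)$")


def latest_version_dir(parent: Path) -> Optional[Path]:
    """Return the real 'vN' subdirectory with the largest N, or None."""
    best, best_n = None, -1
    for child in parent.iterdir():
        match = VERSION_RE.match(child.name)
        # Skip symlinks so the alias itself is never picked as a version.
        if match and child.is_dir() and not child.is_symlink():
            n = int(match.group(1))
            if n > best_n:
                best, best_n = child, n
    return best


def update_latest_alias(parent: Path, alias: str = "vlatest") -> Optional[Path]:
    """Point parent/alias at the newest version directory.

    The link is created under a temporary name and then renamed over the
    old alias, so readers never see a missing 'vlatest' on POSIX systems.
    """
    target = latest_version_dir(parent)
    if target is None:
        return None
    link = parent / alias
    tmp = parent / (alias + ".tmp")
    if tmp.is_symlink() or tmp.exists():
        tmp.unlink()
    os.symlink(target.name, tmp)  # relative link, survives moving the tree
    os.replace(tmp, link)         # rename replaces the old link itself
    return target
```

With something like this run whenever a new version directory is published, a user could always read from ".../vlatest/..." without looking up the current version number for each model. Versions are compared numerically, so "v10" sorts after "v2".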

As for the URL, I have no preference.

Best regards,
Karl

stephen.pascoe at stfc.ac.uk wrote:
>  
> Thanks Karl, that very clearly addresses the product component.  
>
> Getting this fixed will be very timely as UKMO pointed out the issue of
> experiments spanning requested/non-requested data earlier this week.  I
> was able to reassure them we are on to it.
>
> There were a couple of other points I raised: 
>
>   a) whether filenames should have a version.  I guess this decision is
> constrained by what CMOR can do as that is the de facto standard tool.
>   b) whether the URL <hostname> should be <host-prefix>.  This is a
> small syntactic point that could easily be changed later. Some people
> might prefer to keep <hostname>.
>
> I don't think either of these points should hold up updating the
> document on the website.
>
> Stephen.
>
> ---
> Stephen Pascoe  +44 (0)1235 445980
> British Atmospheric Data Centre
> Rutherford Appleton Laboratory
>
> -----Original Message-----
> From: Karl Taylor [mailto:taylor13 at llnl.gov] 
> Sent: 13 January 2010 00:11
> To: Juckes, Martin (STFC,RAL,SSTD)
> Cc: Pascoe, Stephen (STFC,RAL,SSTD); go-essp-tech at ucar.edu
> Subject: Updating the DRS document *with attachments*
>
> Karl Taylor wrote:
>   
>> Dear Martin and Stephen,
>>
>> thanks very much for thinking about this and providing some specific 
>> suggestions. This has prompted me to get back to this. I thought I had
>> already placed on the web version 22 of the document (attached), which
>> defined a "Product" component for DRS. After reading your input, I 
>> have altered this slightly, and also completed the definition of the 
>> experiment names (see the last appendix) in version 23 (also attached).
>>
>> Please let me know if this meets all the requirements.
>>
>> Best regards,
>>
>> martin.juckes at stfc.ac.uk wrote:
>>     
>>> Hello Karl,
>>>
>>> I think the following changes are what is needed - I'm happy to put 
>>> them into the document and circulate the result for approval if you 
>>> send me the latest version in word. I don't think this affects what 
>>> modelling groups are doing, since, if they archive in the DRS, 
>>> everything they archive will be in the 'full' section of the 
>>> directory tree and it will be up to the data node managers to create 
>>> the 'requested' section.
>>>
>>> In section 2.1: Modify the atomic dataset definition:
>>>
>>> /The collection of data that is output from a single model run and
>>> characterized by sharing a single activity, _activity component_,
>>> institute, model, experiment/scenario, data frequency, modeling-realm,
>>> variable name, local ensemble member, and version./
>>>
>>> I'm not sure about the terminology "activity component", but it is 
>>> reasonably descriptive.
>>>
>>> In the following paragraph, change `first six' to `first seven' and 
>>> add 'activity component' to the list.
>>>
>>> In section 2.2: Insert an `activity component' definition after the 
>>> 'activity' definition:
>>>
>>> *Activity component *
>>>
>>> The DRS will distinguish between 'full' and 'requested' atomic
>>> datasets. The 'full' component will contain all the data archived,
>>> while the 'requested' component will contain only those sections of
>>> the data in the PCMDI CMIP5 data request. The atomic datasets within
>>> the 'requested' component will thus either be subsets of those in the
>>> 'full' component or identical to them when only the requested data is
>>> archived.
>>>
>>> In section 3.1: insert `/<category>' after '/<activity>/', and make
>>> similar changes to other URLs in this section.
>>>
>>> In section 3.2: as above.
>>>
>>> Add a new section 3.4:
>>>
>>> 3.4 Replication
>>>
>>> A subset of the data will be replicated at PCMDI, BADC and DKRZ. This
>>> subset will consist of a selection of complete atomic datasets from 
>>> the "requested" activity component.
>>>
>>> Cheers,
>>>
>>> Martin
>>>
>>>       
>>>> -----Original Message-----
>>>> From: Pascoe, Stephen (STFC,RAL,SSTD)
>>>> Sent: 07 January 2010 12:05
>>>> To: Juckes, Martin (STFC,RAL,SSTD); Karl Taylor
>>>> Cc: go-essp-tech at ucar.edu
>>>> Subject: Updating the DRS document
>>>>
>>>> Hi Martin and Karl,
>>>>
>>>> I am keen that the recent discussions on DRS aren't forgotten in the
>>>> rush to implement. Therefore I would like to see the DRS document
>>>> updated with the following:
>>>>
>>>> 1. Product/Category Component.
>>>>
>>>> As agreed in the versioning telco
>>>> (http://**proj.badc.rl.ac.uk/go-essp/wiki/CMIP5/Meetings/telco091208)
>>>> we need an extra component in the DRS to distinguish between requested
>>>> and unrequested atomic datasets. I think the proposal was for a
>>>> category between Activity and Institute with the value "full" or
>>>> "requested". I like the term "Product" for this component but I think
>>>> various names were discussed.
>>>>
>>>> 2. Hostname Component in URLs.
>>>>
>>>> I believe this needs to be replaced by a host prefix, thus allowing
>>>> some site-specific path elements to precede the activity component.
>>>> Knowing how the datanode software is designed I think it is too
>>>> restrictive to insist the first element after the hostname is Activity
>>>> because in reality multiple services will be running on a datanode.
>>>> One could keep the current scheme with virtual hosts and redirects but
>>>> it seems unnecessary.
>>>>
>>>> 3. Section 3.3 Filenames.
>>>>
>>>> There is no version number in the filename convention: Shouldn't there
>>>> be? A recent comment from Sylvia suggested the ESG metadata display
>>>> will depend on this naming scheme so I would like to see more detail
>>>> here. Is the temporal_subset element mandatory for all files in the
>>>> system?
>>>>
>>>> I am happy to help with preparing a new draft if you wish.
>>>>
>>>> Thanks,
>>>>
>>>> Stephen.


