[Go-essp-tech] DRS syntax into ESG

Luca Cinquini luca at ucar.edu
Thu Nov 5 11:57:35 MST 2009


Hi Bob,
	how you build the dataset hierarchy really boils down to how you want
users to browse. I was under the impression that you wanted users to
browse the catalogs in a way that reflects how the data is stored on
disk, but maybe I was wrong. Don't you think it would be too confusing
to have all datasets for a single
model/experiment/frequency/realm/variable/ensemble contained in the
very same HTML page?
I think for searching we all agree that what needs to be done is  
simply harvest all the fields in the database/triple store and then  
expose the corresponding facets.
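
As a rough sketch of that idea (not the actual gateway code): assuming
each published dataset is reduced to a flat record of DRS fields, a DRS
URL can be split into those fields and used as search constraints,
independently of any catalog hierarchy. The function names, the field
keys, the hostname and the example records are all hypothetical; only
the order of the fields follows the DRS URL template quoted further
down.

```python
from urllib.parse import urlparse

# DRS path fields, in the order given by the DRS URL template.
# The key names themselves are an assumption made for this sketch.
DRS_FIELDS = [
    "activity", "institute", "model", "experiment", "frequency",
    "realm", "variable", "ensemble", "version",
]


def parse_drs_url(url):
    """Split a DRS-style URL into a dict of field name -> value."""
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < len(DRS_FIELDS):
        raise ValueError("expected at least %d path fields, got %d"
                         % (len(DRS_FIELDS), len(parts)))
    fields = dict(zip(DRS_FIELDS, parts))
    fields["hostname"] = parsed.netloc
    # Anything after the version is treated as the optional endpoint.
    endpoint = "/".join(parts[len(DRS_FIELDS):])
    if endpoint:
        fields["endpoint"] = endpoint
    return fields


def facet_values(datasets, facet):
    """Return the sorted distinct values of one facet across all records."""
    return sorted({d[facet] for d in datasets if facet in d})


def search(datasets, **constraints):
    """Return the records whose fields match every given constraint."""
    return [d for d in datasets
            if all(d.get(k) == v for k, v in constraints.items())]


def resolve(datasets, url):
    """Parse a DRS URL and search on the resulting DRS fields."""
    fields = parse_drs_url(url)
    constraints = {name: fields[name] for name in DRS_FIELDS}
    return search(datasets, **constraints)


# Hypothetical harvested records; the values are placeholders, not real
# controlled-vocabulary terms.
catalog = [
    dict(zip(DRS_FIELDS, ["cmip5", "inst1", "modelA", "exp1", "mon",
                          "atmos", "tas", "r1", "v1"])),
    dict(zip(DRS_FIELDS, ["cmip5", "inst1", "modelA", "exp1", "mon",
                          "atmos", "pr", "r1", "v1"])),
]

matches = resolve(
    catalog,
    "http://example.org/cmip5/inst1/modelA/exp1/mon/atmos/tas/r1/v1/",
)
```

Here facet_values(catalog, "variable") would supply the choices shown
for a "variable" facet, and resolve() finds a dataset from its DRS URL
without walking any hierarchy.
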
thanks, Luca

On Nov 5, 2009, at 11:28 AM, Bob Drach wrote:

> Hi Luca,
>
> Thanks for raising the issue - I've been wondering about this too.
>
> The hierarchy of datasets as presented by the gateway - for users to  
> browse through - shouldn't necessarily be the same as the hierarchy  
> introduced by DRS. Users should be able to find datasets with as few  
> clicks as possible, which is why we just went through the exercise  
> of 'flattening' the THREDDS catalogs.
>
> The publisher already associates properties corresponding to the DRS  
> fields (model, experiment, etc.) into the catalogs, with the  
> exception of version numbers (which are coming in the next release).  
> So here's a way forward:
>
> - The publisher is configured such that the categories defined for
> the IPCC5/CMIP5 project (activity) include the DRS fields. As I
> said, this is already mostly true. The categories are mandatory:
> they must be resolved before publication.
> - Each catalog corresponding to a dataset has properties that define
> these values. On publication the gateway ingests these values in
> a searchable fashion.
> - When the portal receives a DRS request, it parses the URL,  
> searches on the resulting fields, and resolves to the corresponding  
> dataset.
>
> The main point is that this can be independent of the dataset  
> hierarchy as generated during publication.
>
> Bob
>
> On Nov 5, 2009, at 4:50 AM, Luca Cinquini wrote:
>
>> Hi,
>> 	the purpose of this email is to start a conversation, and a plan of
>> action, on how to incorporate the DRS syntax into the ESG system.
>> As a reminder, the current DRS specification states that a CMIP5
>> dataset will be uniquely identified by the following URL:
>>
>> http://*<hostname>/<activity>/<institute>/<model>/<experiment>/
>> <frequency>/<modeling realm>/<variable>/<ensemble member>/<version>/
>> [<endpoint>]
>>
>> where most of the <...> fields are drawn from controlled
>> vocabularies. For example:
>>
>> http://*badc.nerc.ac.uk/activity/institute/model/experiment/
>> frequency/realm/varname/r1/v1/
>>
>> The first question would be: what does it mean to capture the
>> semantics of the DRS syntax within ESG? I can see at least two
>> answers:
>>
>> a) The user is able to browse the CMIP5 datasets hierarchically
>> according to the DRS hierarchy of fields
>> b) The user is able to search for data based on facets that reflect
>> the DRS syntax: activity, institute, experiment, etc.
>>
>> So how do we get there? A straw-man workflow could be the following:
>>
>> o) The ESG Data Node publishing client, when building the THREDDS
>> catalogs, creates a hierarchy of datasets that reflects the syntax.
>> There is probably also a need to mark up these catalogs as "DRS" or
>> "CMIP5".
>> o) The ESG Gateway, when parsing these catalogs, invokes a specific
>> handler that creates the same dataset hierarchy (this is actually
>> automatic, I believe), and additionally associates corresponding
>> objects at each level of the hierarchy. For example, at the first
>> level the dataset will be associated with an activity, at the second
>> level with an institute, and so on. An alternative would be to
>> associate all the objects only with the leaf-level dataset.
>> o) When the metadata for the leaf-node datasets is harvested into
>> RDF triples for searching, the dataset - object associations must be
>> transferred to the triple store.
>> o) Specific CMIP5 facets can be configured to search by DRS fields
>> (perhaps only on the PCMDI Gateway, or perhaps on all gateways).
>>
>> As mentioned, this is just a start. I do believe though that this is
>> an extremely important issue that must be tackled as soon as
>> possible.
>>
>> thanks, Luca
>>
>> _______________________________________________
>> GO-ESSP-TECH mailing list
>> GO-ESSP-TECH at ucar.edu
>> http://*mailman.ucar.edu/mailman/listinfo/go-essp-tech
>>
>


