[Go-essp-tech] [is-enes-sa2-jra4] Example of configuring a datanode to serve CMIP3-DRS

Bob Drach drach1 at llnl.gov
Fri Jul 2 12:34:12 MDT 2010


Hi Estani,

It should be possible to do what you want without running multiple  
data nodes.

The purpose of the THREDDS dataset roots is to hide the directory  
structure from the end user, and to limit what the TDS can access. But  
THREDDS can certainly have multiple dataset roots.

In your example below, you should associate different paths with the  
locations, for example:

> <datasetRoot path="CMIP5_replicas" location="/replicated/CMIP5"/>
> <datasetRoot path="CMIP5_core" location="/core/CMIP5"/>
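
To see why distinct paths make the mapping reversible, here is a minimal Python sketch of the idea. It is a toy model only, not how the TDS resolves requests; the function names and the sample file names are hypothetical, and only the two path/location pairs are taken from the example above.

```python
from pathlib import PurePosixPath

# Dataset roots from the example above: URL path prefix -> directory on disk.
DATASET_ROOTS = {
    "CMIP5_replicas": "/replicated/CMIP5",
    "CMIP5_core": "/core/CMIP5",
}


def url_path_to_file(url_path):
    """Map a path below the file service to a location on disk (hypothetical helper)."""
    prefix, _, rest = url_path.strip("/").partition("/")
    location = DATASET_ROOTS[prefix]  # raises KeyError for an unknown root
    return str(PurePosixPath(location) / rest)


def file_to_url_path(file_path):
    """Inverse mapping; well defined because no two roots share a path or a location."""
    for prefix, location in DATASET_ROOTS.items():
        if file_path.startswith(location + "/"):
            return prefix + file_path[len(location):]
    raise ValueError("file is not below any dataset root: " + file_path)
```

With two roots that both use path="CMIP5", the dictionary above could hold only one of them, which is the same ambiguity Estani describes.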


Also be aware that in the publisher configuration:

- the directory_format can have multiple values, separated by vertical  
bars (|). The publisher will use the first format that matches the  
directory structure being scanned.

- a useful strategy is to create different project sections for  
various groups of directives. You could define a cmip5_replica  
project, a cmip5_core project, etc.
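
The "first format that matches" rule can be illustrated with a short Python sketch. This is not the publisher's code: the %(name)s placeholder syntax, the field names, the sample directory_format value, and both function names are assumptions made for illustration.

```python
import re

# Hypothetical directory_format value: two alternatives separated by a vertical bar.
# The %(name)s placeholder syntax and the field names are assumptions.
DIRECTORY_FORMAT = (
    "/replicated/CMIP5/%(model)s/%(experiment)s"
    " | /core/CMIP5/%(model)s/%(experiment)s"
)


def format_to_regex(fmt):
    """Compile one format into a regex with a named group per placeholder."""
    # With one capture group, re.split alternates literal text and field names.
    parts = re.split(r"%\((\w+)\)s", fmt.strip())
    pattern = ""
    for i, part in enumerate(parts):
        pattern += re.escape(part) if i % 2 == 0 else "(?P<%s>[^/]+)" % part
    # The format must end at a directory boundary or at the end of the path.
    return re.compile(pattern + r"(?:/|$)")


def first_match(directory, directory_format=DIRECTORY_FORMAT):
    """Return (format, fields) for the first alternative that matches, else None."""
    for fmt in directory_format.split("|"):
        match = format_to_regex(fmt).match(directory)
        if match:
            return fmt.strip(), match.groupdict()
    return None
```

Because the alternatives are tried in order, a more specific format should be listed before a more general one.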

Bob

On Jul 1, 2010, at 5:42 AM, Estanislao Gonzalez wrote:

> Hi Bryan,
>
> thanks for your answer!
> Running multiple ESG data nodes is always a possibility, but it seems
> overkill to us, as we may have several different "data repositories".
> We would like to separate: core-replicated, core-non-replicated,
> non-core, non-core-on-hpss, as well as other non-cmip5 data. Having 5+
> ESG data nodes is not viable in our scenario.
>
> The TDS allows the separation of access URL from the underlying file
> structure so that it might be possible. AFAIK the publisher does not
> provide a simple way of doing this.
>
> Setting thredds_dataset_roots to different values while publishing  
> doesn't appear to work as those are mapped to a map-entry at the  
> catalog root:
> <datasetRoot path="CMIP5" location="/replicated/CMIP5"/>
> <datasetRoot path="CMIP5" location="/core/CMIP5"/>
> ..
>
> which is clearly non-bijective and therefore can't be reversed to
> locate the file from a given URL.
>
> While publishing, all referenced data will be held at a known location.
> Is it possible to somehow use this information to set up a proper
> catalog configuration so that the URL can be properly mapped? At
> least at the dataset level?
>
> The whole HPSS staging procedure should be completely transparent to  
> the user, as well as the location of the files. I was just looking  
> at other options in case we cannot publish them the way we want...
>
> Cheers,
> Estani
>
>
>
>
> Bryan Lawrence wrote:
>> sorry.
>>
>> the first sentence should have read
>>
>> Just to note that *our* approach to the local versus replication  
>> issue
>> will be ...
>>
>> Cheers
>> Bryan
>>
>> On Thursday 01 Jul 2010 11:25:37 Bryan Lawrence wrote:
>>
>>> Hi Estani
>>>
>>> Just to note that your approach to the local versus replication will
>>> be to run two different ESG nodes ... which is in fact the desired
>>> outcome so as to get the right things in the catalogues at the right
>>> time (vis-à-vis QC etc.).
>>>
>>> The issue with respect to cache, I'm not so sure about, in what way
>>> do you want to expose that into ESG?
>>>
>>> Bryan
>>>
>>> On Wednesday 30 Jun 2010 17:05:57 Estanislao Gonzalez wrote:
>>>
>>>> Hi Stephen,
>>>>
>>>> the page contains really helpful information, thanks a lot!
>>>>
>>>> I'm also interested in some variables of the DEFAULT section from
>>>> the esg.ini configuration file. More specifically:
>>>> thredds_dataset_roots (and maybe thredds_aggregation_services or
>>>> any other which was changed or you think it might be important)
>>>>
>>>> The main question here is: how can different local directory
>>>> structures be published to the same DRS structure?
>>>> The example scenario in our case will be:
>>>> /replicated/<DRS structure> - for replicated data
>>>> /local/<DRS structure> - for non-replicated data held on disk
>>>> /cache/<DRS structure> - for data staged from an HPSS system
>>>>
>>>> The only solution I can think of is to extend the URL before the
>>>> DRS structure starts (the URL won't be 100% DRS-conformant anyway). So
>>>>    http://*server/thredds/fileserver/<DRS structure>
>>>> will turn into
>>>>    http://*server/thredds/fileserver/replicated/<DRS structure>
>>>>    http://*server/thredds/fileserver/local/<DRS structure>
>>>>    http://*server/thredds/fileserver/cache/<DRS structure>
>>>>
>>>> Is that viable? Are there any other options?
>>>>
>>>> Thanks,
>>>> Estani
>>>>
>>>> stephen.pascoe at stfc.ac.uk wrote:
>>>>
>>>>> To illustrate how the ESG datanode can be configured to serve
>>>>> data for CMIP5 we have deployed a datanode containing a subset of
>>>>> CMIP3 in the Data Reference Syntax. Some key features of this
>>>>> deployment are:
>>>>>
>>>>>    * The underlying directory structure is based on the Data
>>>>>      Reference Syntax.
>>>>>    * Datasets are published at the realm level.
>>>>>    * The token-based security filter is replaced by the
>>>>>      OpenidRelyingParty security filter.
>>>>>
>>>>> Further notes can be found at
>>>>> http://*proj.badc.rl.ac.uk/go-essp/wiki/CMIP3_Datanode
>>>>>
>>>>> This test deployment should be of interest to anyone wanting to
>>>>> know how DRS identifiers could be exposed in THREDDS catalogues
>>>>> and the TDS HTML interface.  You can also try downloading files
>>>>> with OpenID authentication or via wget with SSL-client
>>>>> certificate authentication.  See the link above for details.
>>>>>
>>>>> Cheers,
>>>>> Stephen.
>>>>>
>>>>>
>>>>> ---
>>>>> Stephen Pascoe  +44 (0)1235 445980
>>>>> British Atmospheric Data Centre
>>>>> Rutherford Appleton Laboratory
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------
>>>>>
>>>>> _______________________________________________
>>>>> GO-ESSP-TECH mailing list
>>>>> GO-ESSP-TECH at ucar.edu
>>>>> http://*mailman.ucar.edu/mailman/listinfo/go-essp-tech
>>>>>
>>
>>
>
>
> -- 
> Estanislao Gonzalez
>
> Max-Planck-Institut für Meteorologie (MPI-M)
> Deutsches Klimarechenzentrum (DKRZ) - German Climate Computing Centre
> Room 108 - Bundesstrasse 45a, D-20146 Hamburg, Germany
>
> Phone:   +49 (40) 46 00 94-126
> E-Mail:  estanislao.gonzalez at zmaw.de
>


