[Go-essp-tech] Replication: requested and output DRS products.

Bob Drach drach at llnl.gov
Thu Jul 8 13:28:47 MDT 2010


Hi Karl,

As I recall we discussed whether to build into the publisher (probably  
within the CMIP5 / IPCC5 handler) the definition of 'requested  
datasets'. The publisher could then associate a 'requested' value for  
some property (product?) to aid the identification. This hasn't been  
done yet - and it's not obvious how straightforward it would be - but
it could be done if 'requested' is sufficiently well defined at this point.
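
For illustration only, here is the kind of lookup such a handler could
do. This is a minimal sketch, not the actual esgcet handler API: the
names are hypothetical, and the real 'requested' definition would have
to come from the CMIP5 data request tables.

    # Hypothetical sketch -- not the actual publisher/handler code.
    # REQUESTED would be populated from the agreed CMIP5 request.
    REQUESTED = {
        ("Amon", "tas"),
        ("day", "pr"),
    }

    def product_for(mip_table, variable):
        """Return the 'product' value the publisher would assign."""
        return "requested" if (mip_table, variable) in REQUESTED else "output"

    print(product_for("Amon", "tas"))  # requested
    print(product_for("day", "clw"))   # output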

Bob


On Jul 7, 2010, at 5:59 PM, Karl Taylor wrote:

> Dear all,
>
> I'm not sure scientists care much about the distinction between the  
> "output", "requested", and "replicated" categories. Martin indicates  
> there might be mild interest in being able to search only over the
> "requested" category, since outside that category, there may be  
> little uniformity in what is available from the different models.  
> There will be little interest in being able to distinguish between  
> "requested" and "replicated" unless there is a difference in the  
> quality control tests that have been applied to these two categories  
> (and then only if a noticeable amount of data in the "requested"  
> category wouldn't pass the "replicated" tests). Will this likely be  
> the case?
>
> Clearly the ESG federation must be able to decide which files to  
> replicate, so unlike the scientists there is real interest to some  
> of you on this list that we be able to distinguish that subset. I'm  
> not sure this information has to be part of the DRS though. Couldn't  
> we just have some database that lists the criteria for selecting  
> data to be replicated? The database coupled with coding to access  
> that information could be used to decide whether each file in the  
> "output" category needs to be replicated or not. This is why that  
> although the current DRS document allows "product" to be either  
> "output" or "requested", an all caps note appears stating: "[WILL  
> POSSIBLY MODIFY THE ABOVE IF WE DON’T NEED TO KNOW ABOUT  
> “REQUESTED”]."
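>
> (Purely to illustrate: a minimal sketch of such a criteria database
> and check. The table layout, field names and the sample criterion are
> all hypothetical:)
>
>     import sqlite3
>
>     db = sqlite3.connect(":memory:")
>     db.execute("CREATE TABLE criteria (experiment TEXT, frequency TEXT, variable TEXT)")
>     db.execute("INSERT INTO criteria VALUES ('historical', 'mon', 'tas')")
>
>     def needs_replication(experiment, frequency, variable):
>         # a file in the "output" category is replicated iff it matches a criterion
>         row = db.execute(
>             "SELECT 1 FROM criteria WHERE experiment=? AND frequency=? AND variable=?",
>             (experiment, frequency, variable)).fetchone()
>         return row is not None
>
>     print(needs_replication('historical', 'mon', 'tas'))  # True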
>
> Bob Drach and I had some extended discussions about this some time  
> ago, but I can't recall if he decided to include some capability  
> along these lines in the publisher (i.e., enables the publisher to  
> determine whether files are in the "requested" category or not), or  
> if we've left that for completely independent external coding. Bob  
> returns from vacation later this week, so I suggest we wait for some  
> input from him.
>
> Best regards,
> Karl
>
> On 7/7/10 2:48 AM, martin.juckes at stfc.ac.uk wrote:
>> Hello Estani,
>>
>> The reference to CMIP5_archive_size.xls was not very useful,  
>> apologies for referencing a file that isn't publicly available --  
>> it is attached.
>>
>> According to the DRS document, everything should be found under the  
>> "output" branch, and the "requested" branch will be a subset of the  
>> "output".
>>
>> An end user may want a homogeneous dataset, and so may opt to  
>> restrict attention to the "requested" data where he is likely to  
>> find the same variables from a large range of models. He may, on  
>> the other hand, want all available data for a given set of  
>> experiments, in which case he should go to the "output" branch. He  
>> will then find additional (low priority) variables and extended  
>> time coverage from a small number of models.
>>
>> I'll see what can be done about a "DRS:requested" and  
>> "ESGF:replicated" document (or wiki page),
>>
>> cheers,
>> Martin
>>
>>
>>
>>
>> -----Original Message-----
>> From: Estanislao Gonzalez [mailto:estanislao.gonzalez at zmaw.de]
>> Sent: Wed 07/07/2010 08:54
>> To: Juckes, Martin (STFC,RAL,SSTD)
>> Cc: Pascoe, Stephen (STFC,RAL,SSTD); gavin at llnl.gov;  
>> drach1 at llnl.gov; go-essp-tech at ucar.edu; is-enes-sa2-jra4 at lists.enes.org 
>> ; doutriaux1 at llnl.gov
>> Subject: Re: Replication: requested and output DRS products.
>>
>> Hi Martin,
>>
>> I couldn't find the file you mentioned (CMIP5_archive_size.xls),  
>> could
>> you please provide a link to it?
>>
>> I'm aware now that output > requested > replicated. But the
>> distinction between the latter two is not clear to me. I totally agree that it
>> would
>> be great if someone could sum that up.
>>
>> And one question from the "monster" thread that still remains is:
>> It is clear that requested is a subset of output. Does this imply  
>> that
>> all data under .../requested/... should also be found under the
>> .../output/... DRS sub-structure?
>>
>> I think not... but then again, why would the end user need to know  
>> about
>> this separation?
>>
>> Thanks,
>> Estani
>>
>>
>> martin.juckes at stfc.ac.uk wrote:
>>
>>> Hello again,
>>>
>>> The decision as to what is to be replicated is, I think, embedded
>>> in "CMIP5_archive_size.xls", and its implementation through the  
>>> DRS is based on the separation between "requested" and "output"  
>>> products. It would be useful to have a brief document outlining  
>>> these decisions and some code to implement them. I'm not sure of  
>>> the latest status on these two points, perhaps Stephen can add  
>>> something.
>>>
>>> cheers,
>>> Martin
>>>
>>>
>>> -----Original Message-----
>>> From: Estanislao Gonzalez [mailto:estanislao.gonzalez at zmaw.de]
>>> Sent: Tue 06/07/2010 17:15
>>> To: V. Balaji
>>> Cc: Juckes, Martin (STFC,RAL,SSTD); Pascoe, Stephen  
>>> (STFC,RAL,SSTD); gavin at llnl.gov; drach1 at llnl.gov; go-essp-tech at ucar.edu 
>>> ; is-enes-sa2-jra4 at lists.enes.org; doutriaux1 at llnl.gov; taylor13 at llnl.gov
>>> Subject: Re: [Go-essp-tech] [is-enes-sa2-jra4] Example of      
>>> configuringadatanode to serve CMIP3-DRS
>>>
>>> Hi Balaji,
>>>
>>> To put things in context once more: (I think there's no such thing  
>>> as
>>> over-clarification :-)
>>>
>>> DRS file and directory structure will be assured. The problem is  
>>> if for
>>> some reason we have two different directories, e.g. A and B, and  
>>> we want
>>> to publish data in DRS from both directories. So we have A/<DRS
>>> structure>  and B/<DRS structure>.
>>> We'd like both of them to be mapped to a central URL, e.g.
>>> http://www.server.de/thredds/fileserver/<DRS structure> so that
>>> the user
>>> requires absolutely no knowledge about this separation.
>>>
>>> The remaining question is: why on earth would someone want to have  
>>> A and
>>> B?! :-)
>>> Well some reasons are:
>>> 1) simplified management. We don't have a mega-mix of millions of  
>>> files
>>> from which some have to be replicated, some are held only at our
>>> institution, some are "temporarily" held as being cached from tape.
>>> Telling all these apart might not be an easy task.
>>> 2) Safety. In such a context a simple error might be disastrous  
>>> (e.g.
>>> someone tries to remove the replicated files to re-deploy them  
>>> without
>>> being aware that they share the directory with other files...)
>>> 3) Backup. If we (ok, somebody else, we will have everything on  
>>> tape, I
>>> think...) want to back up a portion of the data, this won't be easily
>>> achieved (the replicated data is already redundant, but the other  
>>> isn't)
>>> 4) Storage. We might get more disks, but we certainly won't
>>> be able
>>> to "merge" all of them into a single storage (well, that's because  
>>> they
>>> will arrive way after we start publishing things, so the first disks
>>> will already have some data). In any case, for political (e.g.
>>> institutional), technical (e.g. disk speed) or philosophical (e.g
>>> ...uh....) reasons it might be desirable to keep different storages.
>>>
>>> And as I said we have to cope with that, somehow.
>>> The starter question was: can this be achieved with the publisher?  
>>> And
>>> the answer was "no".
>>>
>>> And I totally agree with you regarding AR5. I must have a very good
>>> reason for not adhering to a default, even a de facto one. But the
>>> decision behind the storage in AR5 is a political one that, AFAIK,
>>> hasn't
>>> been taken yet.
>>>
>>> Well, I hope this helped to clarify things a bit.
>>>
>>> Thanks,
>>> Estani
>>>
>>> V. Balaji wrote:
>>>
>>>
>>>> There are undoubtedly parts of this I'm not following too well,  
>>>> so I
>>>> apologize in advance for any misunderstandings. This is all from  
>>>> the
>>>> perspective of a modeling center.
>>>>
>>>> I do not understand the logic for _not_ wishing to lay data out in
>>>> DRS-compliant fashion on the public data server. I know you can  
>>>> do it,
>>>> but I don't understand why you'd want to. One thing I'd like to  
>>>> make
>>>> sure is captured as a requirement is that 'wget -r' should deliver
>>>> data laid out per DRS directory structure.
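>>>>
>>>> (For concreteness, that means a recursive fetch should leave the DRS
>>>> directory template on disk, with the components filled in exactly as
>>>> published:
>>>>
>>>>   <activity>/<product>/<institute>/<model>/<experiment>/<frequency>/
>>>>   <modeling realm>/<variable>/<ensemble member>/<version>/<file>.nc )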
>>>>
>>>> The second issue is that, again from the modeling centre  
>>>> perspective, I
>>>> fervently hope that whatever's done for CMIP5 becomes a de-facto
>>>> standard for other projects requiring coordinated model data  
>>>> output. We
>>>> (modeling centres) cannot build one-off solutions for each  
>>>> project. We
>>>> have with some success made CMOR1/AR4 a template which was forked  
>>>> off
>>>> for other projects (ENSEMBLES, CHFP, HTAP), because there's no  
>>>> way we
>>>> can repeatedly undertake the task of integrating multiple  
>>>> inconsistent
>>>> CMORs and DRSes into our data processing workflow. This is in ref  
>>>> to
>>>> Martin's question about "non-CMIP5 data".
>>>>
>>>> martin.juckes at stfc.ac.uk writes:
>>>>
>>>>
>>>>
>>>>> Hello Estanislao, Gavin,
>>>>>
>>>>> There is a key part of your problem I don't understand -- what  
>>>>> do you
>>>>> mean by "non CMIP5 data"?
>>>>>
>>>>> Before going into the ESGF CMIP5 archive, all files will be CMOR2
>>>>> compliant. This means that they fit in the "requested" or "output"
>>>>> values of the DRS "product" category. The data to be replicated will be a
>>>>> subset of
>>>>> the "ESG published units" (also known as realm level datasets)  
>>>>> in the
>>>>> "requested" category.
>>>>>
>>>>> There has been an agreement that the ESGF CMIP5 archive would be  
>>>>> run
>>>>> on disk, and so it is not surprising that the infrastructure  
>>>>> does not
>>>>> support tape storage. I can see that something along the lines  
>>>>> Gavin
>>>>> describes would resolve the problems with tape storage, but we  
>>>>> need
>>>>> to get the disk based system working as the first priority.
>>>>>
>>>>> Stephen raises the issue of replication and this is relevant,  
>>>>> since
>>>>> straight disk-to-disk copies (e.g. to an external hard drive which
>>>>> can be posted) are a vital aspect of the replication plan. For the
>>>>> time being, this requires people to stick to the DRS directory
>>>>> structure.
>>>>>
>>>>> Within CMIP5 the data from different institutions is clearly
>>>>> separated at the institution directory level, so I can't see why
>>>>> there
>>>>> should be any confusion here.
>>>>>
>>>>> For non-CMIP5 data -- why would you want to describe it with the
>>>>> CMIP5 DRS?
>>>>>
>>>>> cheers,
>>>>> Martin
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: is-enes-sa2-jra4-bounces at lists.enes.org on behalf of
>>>>> stephen.pascoe at stfc.ac.uk
>>>>> Sent: Tue 06/07/2010 12:13
>>>>> To: estanislao.gonzalez at zmaw.de; gavin at llnl.gov
>>>>> Cc: drach1 at llnl.gov; go-essp-tech at ucar.edu;
>>>>> is-enes-sa2-jra4 at lists.enes.org; doutriaux1 at llnl.gov
>>>>> Subject: Re: [is-enes-sa2-jra4] [Go-essp-tech] Example of
>>>>> configuringadatanode to serve CMIP3-DRS
>>>>>
>>>>>
>>>>>
>>>>> Hi Estanislao,
>>>>>
>>>>>
>>>>>
>>>>>> * The only true problem is to differentiate between core and
>>>>>> non-core data (which as far as I know is a file issue instead
>>>>>> of a
>>>>>> dataset one,
>>>>>> i.e. some datasets contain core and non-core data)
>>>>>>
>>>>>>
>>>>> I'm not sure you were involved then but we had lengthy discussions
>>>>> last year on how we would deal with the separation of requested  
>>>>> and
>>>>> non-requested data (Karl discourages the term "core").  There is a
>>>>> fundamental problem that the DRS vocabularies don't cleanly map  
>>>>> onto
>>>>> what is requested and not requested.  The outcome was to introduce
>>>>> the DRS component "product" to divide the two.  If you are  
>>>>> interested
>>>>> take a look at the following threads:
>>>>>
>>>>> http://mailman.ucar.edu/pipermail/go-essp-tech/2010-January/000335.html
>>>>> http://mailman.ucar.edu/pipermail/go-essp-tech/2009-December/000255.html
>>>>>
>>>>> There hasn't been much discussion of how we identify and manage
>>>>> requested data since then and the nitty-gritty details still  
>>>>> aren't
>>>>> fixed.  This is going to be a challenge when we come to replicate.
>>>>>
>>>>> S.
>>>>>
>>>>> ---
>>>>> Stephen Pascoe  +44 (0)1235 445980
>>>>> British Atmospheric Data Centre
>>>>> Rutherford Appleton Laboratory
>>>>>
>>>>> -----Original Message-----
>>>>> From: is-enes-sa2-jra4-bounces at lists.enes.org
>>>>> [mailto:is-enes-sa2-jra4-bounces at lists.enes.org] On Behalf Of
>>>>> Estanislao Gonzalez
>>>>> Sent: 06 July 2010 11:18
>>>>> To: Gavin M Bell
>>>>> Cc: drach1 at llnl.gov; go-essp-tech at ucar.edu;
>>>>> is-enes-sa2-jra4 at lists.enes.org; doutriaux1 at llnl.gov
>>>>> Subject: Re: [is-enes-sa2-jra4] [Go-essp-tech] Example of  
>>>>> configuring
>>>>> adatanode to serve CMIP3-DRS
>>>>>
>>>>> Hi people,
>>>>>
>>>>> well I think we do require something like this (at least at the  
>>>>> major
>>>>> data nodes where data will get replicated). Managing all data  
>>>>> mixed
>>>>> up under one single directory is not a very neat solution for the
>>>>> data administrator. In our particular case we will be publishing  
>>>>> many
>>>>> (much?
>>>>> :-) data from different institutions and even types (not only  
>>>>> CMIP5).
>>>>> And we shouldn't forget about the replicated data (is that ===
>>>>> core?), how can we tell which data requires being replicated? By
>>>>> maintaining a second "catalog" in a DB? I think by maintaining a
>>>>> separate filesystem a simple rsync will do the job (after the
>>>>> very
>>>>> first replication, of course).
>>>>> In any case the fact that we at DKRZ cannot hold all CMIP5 data on
>>>>> disk (yes, the core one we can :-) implies that we will have to
>>>>> maintain a cache somewhere, and mixing this cache with the core  
>>>>> data
>>>>> is something we should probably avoid.
>>>>>
>>>>> Gavin's solution, if I got it right, has a major problem. The
>>>>> catalogs will be created pointing to the real files (e.g.
>>>>> .../core/CMIP5), so that the filter can alter the request from the
>>>>> DRS query
>>>>> (../CMIP5/<core_data>) to the real one, and thus allow the TDS to
>>>>> work as usual. This leaves the catalogs unaltered, and thereby the
>>>>> harvested data, which will have no reference to the mapped DRS
>>>>> structure
>>>>> but only to the real one. Or did I miss something here?
>>>>>
>>>>> I have already tried several possible solutions without any  
>>>>> success
>>>>> at all:
>>>>> 1) Setting multiple datasetRoot entries is not allowed
>>>>> 2) Altering the TDS to accept multiple datasetRoot entries and  
>>>>> look
>>>>> in all of them one after the other until something matches is
>>>>> almost
>>>>> impossible (for the time we have ahead, the mere architecture of  
>>>>> the
>>>>> TDS is, in my opinion, a mess).
>>>>> 3) In general altering the TDS is not a "nice" solution.
>>>>> 4) Filtering the request breaks the coherence between the catalogs
>>>>> and the DRS "virtual" structure (the catalogs have no information
>>>>> whatsoever that a second link to the files exists).
>>>>>
>>>>> The only viable solution I can think of (and it is still to see if
>>>>> it's really viable) is to maintain the files somewhere else and  
>>>>> link
>>>>> them to the "central" DRS filesystem before being published.
>>>>>
>>>>> After discussing this with Stephan we came up with something I'd
>>>>> like
>>>>> to sum up here:
>>>>> * All non-CMIP5 data can be mapped to a DRS structure "not"
>>>>> starting
>>>>> with CMIP5 so it can be easily mapped to somewhere else (TDS  
>>>>> allows
>>>>> that)
>>>>> * The only true problem is to differentiate between core and non- 
>>>>> core
>>>>> data (which as far as I know is a file issue instead of a dataset
>>>>> one, i.e. some datasets contain core and non-core data)
>>>>> * The replication can rely on external sources for differentiating
>>>>> this, e.g. a DB.
>>>>> * The cached non-core data can co-exist, in the worst-case
>>>>> scenario,
>>>>> with the core data by removing the write permissions of the latter
>>>>> (besides
>>>>> the security that this implies, the permission will be used as a flag in
>>>>> case the
>>>>> server is restarted: all non-flagged (write-enabled) files will be
>>>>> treated as leftovers from the stopped cache and will continue to be
>>>>> served; see the sketch below)
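>>>>>
>>>>> A minimal sketch of that write-permission flag (paths and policy
>>>>> are hypothetical; core files are made read-only, so anything still
>>>>> writable after a restart must stem from the cache):
>>>>>
>>>>>     import os, stat
>>>>>
>>>>>     def flag_as_core(path):
>>>>>         # strip all write bits: read-only files are "flagged" as core
>>>>>         os.chmod(path, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
>>>>>
>>>>>     def is_cache_leftover(path):
>>>>>         # write bit still set => never flagged => a leftover from the cache
>>>>>         return bool(os.stat(path).st_mode & stat.S_IWUSR)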
>>>>>
>>>>> So we might get away with this without performing any major
>>>>> changes. But
>>>>> this is something we should definitely discuss before the next
>>>>> iteration :-)
>>>>>
>>>>> I hope this sheds some light on the matter... sorry for the
>>>>> lengthy mail...
>>>>>
>>>>> Regards,
>>>>> Estani
>>>>>
>>>>> Gavin M Bell wrote:
>>>>>
>>>>>
>>>>>> Martin,
>>>>>>
>>>>>> The saving is that the data provider / data-node admin doesn't
>>>>>> have
>>>>>> to do any additional work, whether it be providing a filesystem <->
>>>>>> DRS
>>>>>> mapping or (re)arranging their file system.  In the current
>>>>>> state of
>>>>>> things all the salient information is already in the database  
>>>>>> created
>>>>>> as a result of the publisher [software] scan.  I think it would  
>>>>>> be
>>>>>> prudent to use that information to the benefit of our end users
>>>>>> instead of imposing a DRS directory structure requirement for esg
>>>>>> participation.
>>>>>>
>>>>>> You said:
>>>>>> "Remember that not having to configure the file system is only  
>>>>>> a real
>>>>>> saving if the alternative (configuring the file system to URL  
>>>>>> mapping)
>>>>>> is actually easier than configuring the file system."
>>>>>>
>>>>>> I am saying:
>>>>>> The 'alternative' you describe does not exist.  Because there
>>>>>> is no
>>>>>> "configuring the file system to URL mapping" necessary...  
>>>>>> unless the
>>>>>> end-user wants there to be. In which case we, as dutiful  
>>>>>> programmers,
>>>>>> provide that opportunity.  This is what my code sketch was
>>>>>> illustrating with the property "drs.resolve.strategy", and the  
>>>>>> use of
>>>>>> a factory and strategy pattern - of which we will set a default  
>>>>>> that
>>>>>> requires them to do *no additional work*.  The data-node admin  
>>>>>> won't
>>>>>> have to do any actual setup outside of running an "esg-node
>>>>>> --update".
>>>>>> The upgrade/update process (determined by the esg-node install  
>>>>>> script)
>>>>>> will install the filter, without them having to do anything  
>>>>>> additional.
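>>>>>>
>>>>>> (Rendered in Python for brevity -- the real filter sketch is Java
>>>>>> servlet code -- the default-strategy idea is roughly the following;
>>>>>> every name except "drs.resolve.strategy" is made up:)
>>>>>>
>>>>>>     class IdentityResolver:
>>>>>>         # default strategy: pass URLs through untouched, i.e. no extra work
>>>>>>         def resolve(self, url):
>>>>>>             return url
>>>>>>
>>>>>>     class DatabaseResolver:
>>>>>>         # optional strategy: resolve DRS URLs against the publisher DB
>>>>>>         def resolve(self, url):
>>>>>>             raise NotImplementedError("would query the publisher database")
>>>>>>
>>>>>>     STRATEGIES = {"identity": IdentityResolver, "database": DatabaseResolver}
>>>>>>
>>>>>>     def make_resolver(props):
>>>>>>         return STRATEGIES[props.get("drs.resolve.strategy", "identity")]()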
>>>>>>
>>>>>> Indeed, the code I posted was a quick and dirty filter code  
>>>>>> sketch
>>>>>> demonstrating that putting a filter in place is easy. Yes, the
>>>>>> resolution work would be done in the code that I only alluded  
>>>>>> to, the
>>>>>> "DRSResolver". Current duties preclude me from actually  
>>>>>> implementing
>>>>>> this issue outright, today, for this email conversation.  
>>>>>> However, if
>>>>>> we all conclude that it is worthwhile then I or someone else  
>>>>>> could
>>>>>> make it happen.
>>>>>>
>>>>>> I hope I have done a better job of making my point clearer:
>>>>>> that we
>>>>>> can free our end-users of this DRS directory structure
>>>>>> requirement
>>>>>> while allowing the DRS itself to be more flexible with its
>>>>>> representation.
>>>>>> Also, the mechanism I described does not preclude anyone from
>>>>>> setting up their filesystem to follow the DRS structure; we get
>>>>>> that
>>>>>> for free! :-)
>>>>>>
>>>>>> I am glad that we do indeed agree that the effort to bring this  
>>>>>> to
>>>>>> fruition can and should be done in a way that does not impede or
>>>>>> distract from the current deliverable path.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>
>>>>>> martin.juckes at stfc.ac.uk wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Er... the attachment you sent didn't actually do any mapping.  
>>>>>>> But I'm
>>>>>>> sure it could be done. The extra work I'm talking about is the  
>>>>>>> same
>>>>>>> as the extra work you talk about at the end of your mail, so I'm
>>>>>>> going to ignore your suggestion at the start of your email  
>>>>>>> that there
>>>>>>> isn't any,
>>>>>>>
>>>>>>> cheers,
>>>>>>> Martin
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Gavin M Bell [mailto:gavin at llnl.gov]
>>>>>>> Sent: Mon 05/07/2010 21:37
>>>>>>> To: Juckes, Martin (STFC,RAL,SSTD)
>>>>>>> Cc: drach1 at llnl.gov; go-essp-tech at ucar.edu;
>>>>>>> is-enes-sa2-jra4 at lists.enes.org; doutriaux1 at llnl.gov
>>>>>>> Subject: Re: [Go-essp-tech] [is-enes-sa2-jra4] Example of  
>>>>>>> configuring
>>>>>>> adatanode to serve CMIP3-DRS
>>>>>>>
>>>>>>> Hi Martin,
>>>>>>>
>>>>>>> With regards to the savings... One, perhaps default, setup is  
>>>>>>> not
>>>>>>> having the data provider do anything additional at all with  
>>>>>>> respect
>>>>>>> to configuration or setup.  They simply use the publisher to  
>>>>>>> scan
>>>>>>> their files into the system, something that must be done in all
>>>>>>> cases... (so we can normalize that out). With that said, they  
>>>>>>> would
>>>>>>> not have to do
>>>>>>> *any* additional work.  No work is easier than some work,  
>>>>>>> regardless
>>>>>>> of how easy ;-).
>>>>>>>
>>>>>>> I have attached the filter code that would almost do it.  The  
>>>>>>> real
>>>>>>> intelligence would be in the "DRSResolver" object to do the
>>>>>>> resolution.
>>>>>>>  I would have sketched out that class as well but that would be
>>>>>>> tantamount to completing this task... and to finish it off I  
>>>>>>> would
>>>>>>> have to confer with Bob on the publisher database.  And have  
>>>>>>> us all
>>>>>>> settle on the DRS query syntax.
>>>>>>> With a DRS URL query scheme we could wrap this up quite  
>>>>>>> directly.
>>>>>>>
>>>>>>> The DRSResolver would:
>>>>>>> - parse the request URL (the query) and pull out the salient
>>>>>>> parts
>>>>>>> - fashion those parts into a SQL query against the publisher
>>>>>>> database
>>>>>>> - return the THREDDS root-based URL to the rest of the
>>>>>>> processing
>>>>>>> stream. If it cannot be resolved, punt and return the
>>>>>>> same
>>>>>>> input string as the output and let some other part of the
>>>>>>> process
>>>>>>> stream regurgitate an error.
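>>>>>>>
>>>>>>> (Sketched in Python rather than the Java it would actually be, and
>>>>>>> with a made-up query syntax and table schema, those three steps are
>>>>>>> roughly:)
>>>>>>>
>>>>>>>     import re
>>>>>>>
>>>>>>>     DRS_QUERY = re.compile(r"/drs/(?P<model>[^/]+)/(?P<variable>[^/]+)")
>>>>>>>
>>>>>>>     def resolve(url, db):
>>>>>>>         m = DRS_QUERY.search(url)   # 1. pull out the salient parts
>>>>>>>         if m is None:
>>>>>>>             return url              # punt: hand back the input unchanged
>>>>>>>         row = db.execute(           # 2. SQL query against the publisher DB
>>>>>>>             "SELECT thredds_url FROM files WHERE model=? AND variable=?",
>>>>>>>             (m.group("model"), m.group("variable"))).fetchone()
>>>>>>>         return row[0] if row else url  # 3. THREDDS root-based URL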
>>>>>>>
>>>>>>> Because all the metadata is pulled out in the publisher's  
>>>>>>> scan, file
>>>>>>> system placement of the scanned files is moot.
>>>>>>>
>>>>>>> In the code I attached, I leave room for the data-node user to  
>>>>>>> select
>>>>>>> their own implementation of the resolver following a factory/ 
>>>>>>> strategy
>>>>>>> pattern.  At that point indeed we allow end users to do 'work'
>>>>>>> by
>>>>>>> doing their own mappings.  Perhaps we integrate a few canned
>>>>>>> mapping
>>>>>>> schemes etc... We can be arbitrarily clever with these kinds of
>>>>>>> things of course. :-)
>>>>>>>
>>>>>>> P.S.
>>>>>>> The DRSResolver logic would/should be ported to all ingress  
>>>>>>> request
>>>>>>> streams.  Also the published catalogs would be published with  
>>>>>>> the DRS
>>>>>>> query syntax scheme as the canonical name of the resource -  
>>>>>>> something
>>>>>>> the search facility would use to identify the resource.
>>>>>>>
>>>>>>> done.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> martin.juckes at stfc.ac.uk wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Hi Gavin,
>>>>>>>>
>>>>>>>> I'm not convinced about the connection to Estanislao's email,  
>>>>>>>> but
>>>>>>>> the idea of thinking about the next step while implementing the
>>>>>>>> current system is certainly a good one. Remember that not  
>>>>>>>> having to
>>>>>>>> configure the file system is only a real saving if the  
>>>>>>>> alternative
>>>>>>>> (configuring the file system to URL mapping) is actually  
>>>>>>>> easier than
>>>>>>>> configuring the file system. Setting up the DRS is not  
>>>>>>>> difficult,
>>>>>>>>
>>>>>>>> cheers,
>>>>>>>> Martin
>>>>>>>>
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Gavin M Bell [mailto:gavin at llnl.gov]
>>>>>>>> Sent: Mon 05/07/2010 19:45
>>>>>>>> To: Juckes, Martin (STFC,RAL,SSTD)
>>>>>>>> Cc: drach1 at llnl.gov; go-essp-tech at ucar.edu;
>>>>>>>> is-enes-sa2-jra4 at lists.enes.org; doutriaux1 at llnl.gov
>>>>>>>> Subject: Re: [Go-essp-tech] [is-enes-sa2-jra4] Example of
>>>>>>>> configuring adatanode to serve CMIP3-DRS
>>>>>>>>
>>>>>>>> Martin and friends,
>>>>>>>>
>>>>>>>> This is false economy.  Two things.  First implementing this  
>>>>>>>> is not
>>>>>>>> hard.  Secondly implementing this will resolve the issues  
>>>>>>>> w.r.t. the
>>>>>>>> incongruence between DRS and the filesystem that Estanislao's  
>>>>>>>> email
>>>>>>>> illuminated.  So it seems to me that the alternative is to keep
>>>>>>>> fitting
>>>>>>>> this square DRS peg into the round file system hole.  That
>>>>>>>> would
>>>>>>>> mean having to do a whole other set of gymnastics to get the  
>>>>>>>> DRS<->
>>>>>>>> file system beast tamed.  There is work to be done either way
>>>>>>>> because things are not ready to go as it stands. I suggest we  
>>>>>>>> fix
>>>>>>>> the problem at the root, now, not "later".  Essentially the  
>>>>>>>> current
>>>>>>>> course requires the data providers to jump through file system
>>>>>>>> layout hoops.  I am of the opinion that we should "require" as
>>>>>>>> little as possible from our users, especially something like
>>>>>>>> this... it hurts adoption IMHO.
>>>>>>>>
>>>>>>>> Actually, let me frame this differently.  How about we fork  
>>>>>>>> efforts,
>>>>>>>> and have some folks think about what the *query* URL should  
>>>>>>>> be for
>>>>>>>> the functionality I suggested, while others continue the  
>>>>>>>> current
>>>>>>>> path.  When the former development is ripe I update the install
>>>>>>>> script and have it installed upon the clients' next install
>>>>>>>> automagically, no slowdown for anyone.  The null transform  
>>>>>>>> would be
>>>>>>>> equivalent to what we have now so we would be backward  
>>>>>>>> compatible
>>>>>>>> for folks who have done the task of making their file systems
>>>>>>>> congruent to DRS.  Fair enough?
>>>>>>>>
>>>>>>>> Sound good?
>>>>>>>>
>>>>>>>> martin.juckes at stfc.ac.uk wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Hello Gavin, Bob,
>>>>>>>>>
>>>>>>>>> I agree that this is a good idea in principle, but I think  
>>>>>>>>> it is a
>>>>>>>>> bad idea now. The thing about "now" is that we want to  
>>>>>>>>> deploy and
>>>>>>>>> test the system we have agreed on. We want to do it now  
>>>>>>>>> because
>>>>>>>>> modelling centres have supercomputers running and churning  
>>>>>>>>> out vast
>>>>>>>>> volumes of data, there are thousands of scientists waiting  
>>>>>>>>> to get
>>>>>>>>> at it and we have the job of installing a system to  
>>>>>>>>> distribute it.
>>>>>>>>> It is, I think, a bad time to start implementing changes in
>>>>>>>>> the
>>>>>>>>> system design. Sorry if this sounds a bit harsh, but impending
>>>>>>>>> deadlines make me nervous,
>>>>>>>>>
>>>>>>>>> cheers,
>>>>>>>>> Martin
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: go-essp-tech-bounces at ucar.edu on behalf of Bob Drach
>>>>>>>>> Sent: Mon 05/07/2010 19:18
>>>>>>>>> To: Gavin M Bell
>>>>>>>>> Cc: go-essp-tech at ucar.edu; is-enes-sa2-jra4 at lists.enes.org;  
>>>>>>>>> Charles
>>>>>>>>> Doutriaux
>>>>>>>>> Subject: Re: [Go-essp-tech] [is-enes-sa2-jra4] Example of
>>>>>>>>> configuring adatanode to serve CMIP3-DRS
>>>>>>>>>
>>>>>>>>> Hi Gavin,
>>>>>>>>>
>>>>>>>>> I agree completely. Having a regularized DRS syntax is a  
>>>>>>>>> very good
>>>>>>>>> idea, but to implement it we will need to introduce a level of
>>>>>>>>> indirection between the DRS URL (your 'query') and the  
>>>>>>>>> underlying
>>>>>>>>> filesystem. Separating these two concerns will have a very
>>>>>>>>> important
>>>>>>>>> benefit: it will allow the data node managers to organize  
>>>>>>>>> their
>>>>>>>>> filesystems as they see fit.
>>>>>>>>>
>>>>>>>>> Bob
>>>>>>>>>
>>>>>>>>> On Jul 5, 2010, at 11:10 AM, Gavin M Bell wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Hello gentle-people,
>>>>>>>>>>
>>>>>>>>>> Here is my two cents on this whole DRS business.  I think  
>>>>>>>>>> that the
>>>>>>>>>> fundamental issue to all of this is the ability to do  
>>>>>>>>>> resource
>>>>>>>>>> resolution (lookup).  The issue of having urls match a DRS
>>>>>>>>>> structure that matches the filesystem is a red herring  
>>>>>>>>>> (IMHO).
>>>>>>>>>> The basic issue is to be able to issue a query to the  
>>>>>>>>>> system such
>>>>>>>>>> that you find what you are looking for.  This query mechanism
>>>>>>>>>> should be a separate mechanism from filesystem
>>>>>>>>>> correspondence.  The
>>>>>>>>>> driving issue behind the file system correspondence push is  
>>>>>>>>>> so
>>>>>>>>>> that people and/or applications can infer the location of
>>>>>>>>>> resources in some regimented way.  The true heart of the  
>>>>>>>>>> issue is
>>>>>>>>>> not with the file system.  The heart of the issue is to  
>>>>>>>>>> perform a
>>>>>>>>>> query such that you provide resource resolution.  The file  
>>>>>>>>>> system
>>>>>>>>>> is a familiar mechanism but it isn't the only one.  The file
>>>>>>>>>> system takes a query (the file system path) and returns the
>>>>>>>>>> resource to us (the bits sitting at an inode location  
>>>>>>>>>> somewhere
>>>>>>>>>> that is memory mapped to some physical platter and spindle
>>>>>>>>>> location, that is mapped to the file system path).  We are
>>>>>>>>>> overloading the file system query mechanism when it is not
>>>>>>>>>> necessary.
>>>>>>>>>>
>>>>>>>>>> I propose the following:  We create a *filter* and a small
>>>>>>>>>> database (the latter we already have in the publisher).  We  
>>>>>>>>>> send a
>>>>>>>>>> *query* to the web server the web server *filter*  
>>>>>>>>>> intercepts that
>>>>>>>>>> *query* and resolves it, using the database to the actual  
>>>>>>>>>> resource
>>>>>>>>>> location and returns the resource you want.  Implementing  
>>>>>>>>>> this in
>>>>>>>>>> a filter divorces the query structure from the file system
>>>>>>>>>> structure.  The use of the database (that is generated by the
>>>>>>>>>> publisher when it scans) provides the resolution.
>>>>>>>>>> With this mechanism in place, WGET, as well as any other  
>>>>>>>>>> URL based
>>>>>>>>>> tool will be able to fetch the data as intended.
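>>>>>>>>>>
>>>>>>>>>> (A toy rendering of that mechanism, where the "database" is just
>>>>>>>>>> a dict standing in for what the publisher scan builds:)
>>>>>>>>>>
>>>>>>>>>>     LOCATION = {  # query -> actual location, filled in at scan time
>>>>>>>>>>         "/CMIP5/output/tas": "/core/CMIP5/output/tas.nc",
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>>     def filter_request(query):
>>>>>>>>>>         # intercept the query; rewrite it if known, else pass it on
>>>>>>>>>>         return LOCATION.get(query, query)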
>>>>>>>>>>
>>>>>>>>>> BTW: The "query" is whatever we make it up to be... (not a
>>>>>>>>>> reference to SQL query).
>>>>>>>>>>
>>>>>>>>>> This gives the data-node admin the ability to put their files
>>>>>>>>>> wherever they want.  If they move files around and so on,  
>>>>>>>>>> they
>>>>>>>>>> just have to rescan with the publisher.  The issues around  
>>>>>>>>>> design
>>>>>>>>>> and efficiency can be addressed with varying degrees of
>>>>>>>>>> cleverness.
>>>>>>>>>>
>>>>>>>>>> I welcome any thoughts on this issue... Please talk me  
>>>>>>>>>> down :-). I
>>>>>>>>>> think it is about time we put this DRS issue to bed.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Estanislao Gonzalez wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Hi Bob,
>>>>>>>>>>>
>>>>>>>>>>> I guess you must be on vacation now. Anyway, here's the
>>>>>>>>>>> question, maybe someone else can answer it:
>>>>>>>>>>>
>>>>>>>>>>> The very first idea I had was almost what you proposed. Your
>>>>>>>>>>> proposal though leaves URLs of the form:
>>>>>>>>>>> http://myserver/thredds/fileserver/CMIP5_replicas/output/...
>>>>>>>>>>>                                    <--- (almost) DRS Structure --->
>>>>>>>>>>>
>>>>>>>>>>> which is not a valid DRS structure (neither CMIP5_replicas nor
>>>>>>>>>>> CMIP5_core
>>>>>>>>>>> is in the DRS vocabulary).
>>>>>>>>>>>
>>>>>>>>>>> My proposal has a very similar flaw:
>>>>>>>>>>> http://myserver/thredds/fileserver/replicated/CMIP5/output/...
>>>>>>>>>>>                                     <--- full DRS Structure --->
>>>>>>>>>>>
>>>>>>>>>>> The DRS structure is
>>>>>>>>>>> preserved, but you cannot easily infer the correct URL  
>>>>>>>>>>> from any
>>>>>>>>>>> dataset. I think the idea is: if you know the prefix
>>>>>>>>>>> (http.../fileserver/) and the dataset DRS name you can
>>>>>>>>>>> always get
>>>>>>>>>>> the file without even browsing the TDS:
>>>>>>>>>>> prefix + DRS = URL to file
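>>>>>>>>>>>
>>>>>>>>>>> (spelled out with illustrative values:
>>>>>>>>>>>   prefix = http://myserver/thredds/fileserver/
>>>>>>>>>>>   DRS    = CMIP5/output/<institute>/<model>/.../<version>/file.nc
>>>>>>>>>>>   URL    = prefix + DRS)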
>>>>>>>>>>>
>>>>>>>>>>> AFAIK the URL structure used by the TDS will never be 100%
>>>>>>>>>>> DRS-conformant
>>>>>>>>>>> (according to DRS version 0.27). The DRS form is:
>>>>>>>>>>> http://<hostname>/<activity>/<product>/<institute>/<model>/
>>>>>>>>>>> <experiment>/<frequency>/<modeling realm>/
>>>>>>>>>>> <variable identifier>/<ensemble member>/<version>/[<endpoint>],
>>>>>>>>>>>
>>>>>>>>>>> where the TDS one has the endpoint moved to the front (the
>>>>>>>>>>> thredds/fileserver, thredds/dodsC, etc parts).
>>>>>>>>>>>
>>>>>>>>>>> To sum things up:
>>>>>>>>>>> Is it possible to publish files from different directory
>>>>>>>>>>> structures into a unified URL structure so that it is
>>>>>>>>>>> completely
>>>>>>>>>>> transparent to the user?
>>>>>>>>>>> Am I the only one addressing this problem? Are all other
>>>>>>>>>>> institutions planning  to publish all files from only one
>>>>>>>>>>> directory?
>>>>>>>>>>>
>>>>>>>>>>> The only viable solution I can think of is to rely on  
>>>>>>>>>>> Stephen's
>>>>>>>>>>> versioning concept and to maintain a single true DRS
>>>>>>>>>>> structure
>>>>>>>>>>> with links to files kept in other more manageable directory
>>>>>>>>>>> structures (This will probably involve adapting Stephen's  
>>>>>>>>>>> tool).
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Estani
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Bob Drach wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Hi Estani,
>>>>>>>>>>>>
>>>>>>>>>>>> It should be possible to do what you want without running
>>>>>>>>>>>> multiple data nodes.
>>>>>>>>>>>>
>>>>>>>>>>>> The purpose of the THREDDS dataset roots is to hide the
>>>>>>>>>>>> directory structure from the end user, and to limit what  
>>>>>>>>>>>> the
>>>>>>>>>>>> TDS can access.
>>>>>>>>>>>> But
>>>>>>>>>>>> THREDDS can certainly have multiple dataset roots.
>>>>>>>>>>>>
>>>>>>>>>>>> In your example below, you should associate different  
>>>>>>>>>>>> paths with
>>>>>>>>>>>> the locations, for example:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> <datasetRoot path="CMIP5_replicas"
>>>>>>>>>>>>> location="/replicated/CMIP5"/>  <datasetRoot  
>>>>>>>>>>>>> path="CMIP5_core"
>>>>>>>>>>>>> location="/core/CMIP5"/>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>> Also be aware that in the publisher configuration:
>>>>>>>>>>>>
>>>>>>>>>>>> - the directory_format can have multiple values,  
>>>>>>>>>>>> separated by
>>>>>>>>>>>> vertical bars (|). The publisher will use the first  
>>>>>>>>>>>> format that
>>>>>>>>>>>> matches the directory structure being scanned.
>>>>>>>>>>>>
>>>>>>>>>>>> - a useful strategy is to create different project  
>>>>>>>>>>>> sections for
>>>>>>>>>>>> various groups of directives. You could define a  
>>>>>>>>>>>> cmip5_replica
>>>>>>>>>>>> project, a cmip5_core project, etc.
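>>>>>>>>>>>>
>>>>>>>>>>>> For example, in esg.ini (the paths and DRS field names here are
>>>>>>>>>>>> only illustrative of the syntax; the first pattern that matches
>>>>>>>>>>>> the scanned directory wins):
>>>>>>>>>>>>
>>>>>>>>>>>>     directory_format = /replicated/CMIP5/%(product)s/%(institute)s/%(model)s | /core/CMIP5/%(product)s/%(institute)s/%(model)s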
>>>>>>>>>>>>
>>>>>>>>>>>> Bob
>>>>>>>>>>>>
>>>>>>>>>>>> On Jul 1, 2010, at 5:42 AM, Estanislao Gonzalez wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Bryan,
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks for your answer!
>>>>>>>>>>>>> Running multiple ESG data nodes is always a possibility,  
>>>>>>>>>>>>> but it
>>>>>>>>>>>>> seems overkill to us as we may have several different "data
>>>>>>>>>>>>> "data
>>>>>>>>>>>>> repositories".
>>>>>>>>>>>>> We would like to separate: core-replicated,
>>>>>>>>>>>>> core-non-replicated, non-core, non-core-on-hpss, as well  
>>>>>>>>>>>>> as
>>>>>>>>>>>>> other non-cmip5 data.
>>>>>>>>>>>>> Having 5+
>>>>>>>>>>>>> ESG data nodes is not viable in our scenario.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The TDS allows the separation of the access URL from the
>>>>>>>>>>>>> underlying
>>>>>>>>>>>>> file structure, so it might be possible. AFAIK the
>>>>>>>>>>>>> publisher does not provide a simple way of doing this.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Setting thredds_dataset_roots to different values while
>>>>>>>>>>>>> publishing doesn't appear to work as those are mapped to a
>>>>>>>>>>>>> map-entry at the catalog root:
>>>>>>>>>>>>> <datasetRoot path="CMIP5" location="/replicated/CMIP5"/>
>>>>>>>>>>>>> <datasetRoot path="CMIP5" location="/core/CMIP5"/>  ..
>>>>>>>>>>>>>
>>>>>>>>>>>>> which is clearly non-bijective and can't therefore be
>>>>>>>>>>>>> reversed
>>>>>>>>>>>>> to locate the file from a given URL.
>>>>>>>>>>>>>
>>>>>>>>>>>>> While publishing, all referenced data will be held in a known
>>>>>>>>>>>>> location.
>>>>>>>>>>>>> Is it possible to somehow use this information to set up a
>>>>>>>>>>>>> proper catalog configuration so that the URL can be  
>>>>>>>>>>>>> properly
>>>>>>>>>>>>> mapped? At least on a dataset level?
>>>>>>>>>>>>>
>>>>>>>>>>>>> The whole HPSS staging procedure should be completely
>>>>>>>>>>>>> transparent to the user, as well as the location of the  
>>>>>>>>>>>>> files.
>>>>>>>>>>>>> I was just looking at other options in case we cannot  
>>>>>>>>>>>>> publish
>>>>>>>>>>>>> them the way we want...
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Estani
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Bryan Lawrence wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> sorry.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> the first sentence should have read
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Just to note that *our* approach to the local versus
>>>>>>>>>>>>>> replication issue will be ...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers
>>>>>>>>>>>>>> Bryan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thursday 01 Jul 2010 11:25:37 Bryan Lawrence wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Estani
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Just to note that your approach to the local versus
>>>>>>>>>>>>>>> replication will be to run two different ESG nodes ...  
>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>> is in fact the desired outcome so as to get the right  
>>>>>>>>>>>>>>> things
>>>>>>>>>>>>>>> in the catalogues at the right time (vis-à-vis QC etc).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The issue with respect to the cache I'm not so sure
>>>>>>>>>>>>>>> about: in
>>>>>>>>>>>>>>> what way do you want to expose that in ESG?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Bryan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wednesday 30 Jun 2010 17:05:57 Estanislao Gonzalez  
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Stephen,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> the page contains really helpful information, thanks  
>>>>>>>>>>>>>>>> a lot!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm also interested in some variables of the DEFAULT  
>>>>>>>>>>>>>>>> section
>>>>>>>>>>>>>>>> from the esg.ini configuration file. More specifically:
>>>>>>>>>>>>>>>> thredds_dataset_roots (and maybe
>>>>>>>>>>>>>>>> thredds_aggregation_services or any other which was  
>>>>>>>>>>>>>>>> changed
>>>>>>>>>>>>>>>> or which you think might be important)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The main question here is: how can different local  
>>>>>>>>>>>>>>>> directory
>>>>>>>>>>>>>>>> structures be published to the same DRS structure?
>>>>>>>>>>>>>>>> The example scenario in our case will be:
>>>>>>>>>>>>>>>> /replicated/<DRS structure> - for replicated data
>>>>>>>>>>>>>>>> /local/<DRS structure> - for non-replicated data held on disk
>>>>>>>>>>>>>>>> /cache/<DRS structure> - for data staged from an HPSS system
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The only solution I can think of is to extend the URL
>>>>>>>>>>>>>>>> before
>>>>>>>>>>>>>>>> the DRS structure starts (the URL won't be 100% DRS-conformant
>>>>>>>>>>>>>>>> anyway). So
>>>>>>>>>>>>>>>>   http://server/thredds/fileserver/<DRS structure>  will
>>>>>>>>>>>>>>>> turn into
>>>>>>>>>>>>>>>>   http://server/thredds/fileserver/replicated/<DRS structure>
>>>>>>>>>>>>>>>>   http://server/thredds/fileserver/local/<DRS structure>
>>>>>>>>>>>>>>>>   http://server/thredds/fileserver/cache/<DRS structure>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Is that viable? Are there any other options?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Estani
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> stephen.pascoe at stfc.ac.uk wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> To illustrate how the ESG datanode can be configured  
>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> serve data for CMIP5 we have deployed a datanode  
>>>>>>>>>>>>>>>>> containing
>>>>>>>>>>>>>>>>> a subset of
>>>>>>>>>>>>>>>>> CMIP3 in the Data Reference Syntax. Some key  
>>>>>>>>>>>>>>>>> features of
>>>>>>>>>>>>>>>>> this deployment are:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>   * The underlying directory structure is based on  
>>>>>>>>>>>>>>>>> the Data
>>>>>>>>>>>>>>>>>     Reference Syntax.
>>>>>>>>>>>>>>>>>   * Datasets are published at the realm level.
>>>>>>>>>>>>>>>>>   * The token-based security filter is replaced by the
>>>>>>>>>>>>>>>>>     OpenidRelyingParty security filter.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Further notes can be found at
>>>>>>>>>>>>>>>>> http://proj.badc.rl.ac.uk/go-essp/wiki/CMIP3_Datanode
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This test deployment should be of interest to anyone
>>>>>>>>>>>>>>>>> wanting to know how DRS identifiers could be exposed  
>>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>> THREDDS catalogues and the TDS HTML interface.  You  
>>>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>> also try downloading files with OpenID  
>>>>>>>>>>>>>>>>> authentication or
>>>>>>>>>>>>>>>>> via wget with SSL-client certificate  
>>>>>>>>>>>>>>>>> authentication.  See
>>>>>>>>>>>>>>>>> the link above for details.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>> Stephen.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>> Stephen Pascoe  +44 (0)1235 445980
>>>>>>>>>>>>>>>>> British Atmospheric Data Centre
>>>>>>>>>>>>>>>>> Rutherford Appleton Laboratory
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Estanislao Gonzalez
>>>>>>>>>>>>>
>>>>>>>>>>>>> Max-Planck-Institut für Meteorologie (MPI-M)
>>>>>>>>>>>>> Deutsches Klimarechenzentrum (DKRZ) - German Climate Computing Centre
>>>>>>>>>>>>> Room 108 - Bundesstrasse 45a, D-20146 Hamburg, Germany
>>>>>>>>>>>>>
>>>>>>>>>>>>> Phone:   +49 (40) 46 00 94-126
>>>>>>>>>>>>> E-Mail:  estanislao.gonzalez at zmaw.de
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Gavin M. Bell
>>>>>>>>>> Lawrence Livermore National Labs
>>>>>>>>>> --
>>>>>>>>>>
>>>>>>>>>> "Never mistake a clear view for a short distance."
>>>>>>>>>>                  -Paul Saffo
>>>>>>>>>>
>>>>>>>>>> (GPG Key - http://rainbow.llnl.gov/dist/keys/gavin.asc)
>>>>>>>>>>
>>>>>>>>>> A796 CE39 9C31 68A4 52A7  1F6B 66B7 B250 21D5 6D3E
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> --
>>>>> Estanislao Gonzalez
>>>>>
>>>>> Max-Planck-Institut für Meteorologie (MPI-M)
>>>>> Deutsches Klimarechenzentrum (DKRZ) - German Climate Computing Centre
>>>>> Room 108 - Bundesstrasse 45a, D-20146 Hamburg, Germany
>>>>>
>>>>> Phone:   +49 (40) 46 00 94-126
>>>>> E-Mail:  estanislao.gonzalez at zmaw.de
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>>
>>>
>>
>> --
>> Estanislao Gonzalez
>>
>> Max-Planck-Institut für Meteorologie (MPI-M)
>> Deutsches Klimarechenzentrum (DKRZ) - German Climate Computing Centre
>> Room 108 - Bundesstrasse 45a, D-20146 Hamburg, Germany
>>
>> Phone:   +49 (40) 46 00 94-126
>> E-Mail:  estanislao.gonzalez at zmaw.de
>>
>>
>>
>>
>>


