[Go-essp-tech] [esg-gateway-dev] Status of the CMIP5 Archive

Eric Nienhouse ejn at ucar.edu
Thu Apr 28 07:14:44 MDT 2011


Hi All,

Nice suggestions and discussion on this topic.  The original request for 
notice of recent publications could be considered as part of the overall 
requirements for federation-wide notifications and alerts.  It is 
related to discussions we had earlier this month regarding Gateway and 
Datanode downtime notices and propagating them throughout the federation.

There likely is a need to support aggregating atomic publishing events 
to reduce granularity of the information as suggested.  The work noted 
re. distributed search, OpenSearch and downcasting frameworks is good 
information.  The Gateway is a natural aggregator of Datanode 
publication events and is already keenly aware of them.  Exploring the 
use case(s) will help us identify what is best here.
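
As a rough illustration of what aggregating atomic publishing events
could mean, here is a minimal Python sketch that rolls per-dataset
events up to one line per model and experiment.  The event fields,
dataset ids and dates are assumptions made up for the example; this
is not the Gateway's or the publisher's actual data model.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical atomic publication events, one per dataset.
# Field names, dataset ids and timestamps are illustrative only.
events = [
    {"model": "IPSL-CM5A-LR", "experiment": "piControl",
     "dataset": "example.ds.a", "published": datetime(2011, 4, 20, 9, 0)},
    {"model": "IPSL-CM5A-LR", "experiment": "piControl",
     "dataset": "example.ds.b", "published": datetime(2011, 4, 21, 14, 30)},
    {"model": "IPSL-CM5A-LR", "experiment": "historical",
     "dataset": "example.ds.c", "published": datetime(2011, 4, 20, 10, 15)},
]


def aggregate_events(events):
    """Group per-dataset events by (model, experiment).

    Returns a dict mapping each key to the number of datasets seen
    and the most recent publication time.
    """
    groups = defaultdict(lambda: {"count": 0, "latest": None})
    for ev in events:
        group = groups[(ev["model"], ev["experiment"])]
        group["count"] += 1
        if group["latest"] is None or ev["published"] > group["latest"]:
            group["latest"] = ev["published"]
    return dict(groups)


def summarize(events):
    """Return one human-readable notice line per (model, experiment)."""
    lines = []
    for (model, experiment), group in sorted(aggregate_events(events).items()):
        lines.append(
            f"{model} {experiment}: {group['count']} dataset(s), "
            f"latest {group['latest']:%Y-%m-%d}"
        )
    return lines


for line in summarize(events):
    print(line)
```

The grouping key is the design choice to discuss: (model, experiment)
matches the "which experiments have just been published" level asked
for in the thread, but other keys may suit other use cases.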

We should continue to discuss the best approach.  Many of us may be able 
to discuss this in person at the GO-ESSP workshop just over a week 
away.  I agree this would be a good topic to address.

Regards,

-Eric

Cinquini, Luca (3880) wrote:
> Hi,
> 	FYI we are indeed looking at a distributed search architecture for ESGF, and since each dataset that enters the system has a "last update" time stamp, it already comes with the possibility of querying for anything new... but again, at a level of granularity that is probably too detailed for what we want the user to have. Maybe a news cast capability triggered manually would be more like what is needed here. This would be a great topic of discussion at the workshop.
> thanks, Luca
>
> On Apr 28, 2011, at 5:50 AM, <philip.kershaw at stfc.ac.uk> wrote:
>
>   
>> Another option would be to proxy Tomcat through Apache, as JPL have done; the Pylons app could then run via mod_wsgi.
>>
>> Cheers,
>> Phil
>>
>> From: <stephen.pascoe at stfc.ac.uk>
>> Date: Thu, 28 Apr 2011 08:49:43 +0000
>> To: <momipsl at ipsl.jussieu.fr>, <drach1 at llnl.gov>
>> Cc: <esg-gateway-dev at earthsystemgrid.org>, <go-essp-tech at ucar.edu>, <sebastien.denvil at ipsl.jussieu.fr>
>> Subject: Re: [esg-gateway-dev] [Go-essp-tech] Status of the CMIP5 Archive
>>
>> Hi All,
>>
>> Similarly, we have a web app which exposes publishing information in the datanode database for use between us and the Met Office.  It uses the same SQLAlchemy model as esgcet with a few extra tables.  I don't think it would take me long to create a view that displays recently published datasets as an Atom feed.
>>
>> However, there are complications.  This is a Pylons app which we deploy on a separate machine to the datanode or gateway.  If we were to put it on the datanode it would have to listen on a separate port to tomcat.  Or we could implement something similar in Java and put it in the datanode's tomcat.
>>
>> Also, Luca and Mark are right.  To be useful we would need to aggregate the information from each datanode and provide less fine-grained info like "Which experiments have just been published".  It begins to look a lot more like a query service with output as Atom/RSS.
>>
>> S.
>>
>> ---
>> Stephen Pascoe  +44 (0)1235 445980
>> Centre of Environmental Data Archival
>> STFC Rutherford Appleton Laboratory, Harwell Oxford, Didcot OX11 0QX, UK
>>
>> From: go-essp-tech-bounces at ucar.edu [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of Mark Morgan
>> Sent: 28 April 2011 11:11
>> To: Drach, Bob
>> Cc: Sébastien Denvil; go-essp-tech at ucar.edu; esg-gateway-dev at earthsystemgrid.org
>> Subject: Re: [Go-essp-tech] Status of the CMIP5 Archive
>>
>> Bob / Bryan
>>
>> On top of the possibility of the DataNode exposing ESG-Publisher AtomPub HTTP endpoints, may I also add to the mix the possibility of ESG-Publisher search web services.  Such web services would in turn permit the development of an ESG distributed search broker, i.e. an aggregator of search results pulled from multiple data nodes.
>>
>> At IPSL we have developed a portal that launches an overnight batch job to harvest & aggregate meta-data derived from the THREDDS catalogs published at each of our data nodes.  We could certainly use the ESG-Publisher AtomPub feeds to optimise the synchronisation of the aggregated meta-data.
>>
>> However, ultimately all paths lead to distributed search, as in a few years we will have several million files/variables to search against and there are limits to what can be achieved with a solution based upon aggregation.  Hence the ESG-Publisher search web services and associated distributed search broker are of real interest.
>>
>> Regards
>>
>> Mark
>>
>>
>> On 27 Apr 2011, at 23:54, Drach, Bob wrote:
>>
>>
>> Hi Bryan,
>>
>> Excellent suggestion. Anyone familiar with setting up an RSS feed?
>>
>> --Bob
>>
>>
>> On 4/27/11 12:58 PM, "Bryan Lawrence" <bryan.lawrence at stfc.ac.uk> wrote:
>>
>>
>> hi Bob
>>
>> I wonder how hard it would be to produce a data node feed (or a TDS
>> feed) of datasets published/revised as part of the publication step?
>>
>> It'd then be relatively easy to parse that for a "new items" page ...
>>
>> Cheers
>> Bryan
>>
>>
>> I'm happy to post a list of publication events if that would be
>> useful. But like you I would see this as a temporary solution until
>> some sort of registry solution could be devised (famous last words
>> ...). Also I can't really commit to keeping such a list up-to-date
>> when vacation etc. intervenes.
>>
>> I'd be curious to hear if others think this is a good idea as well.
>>
>> --Bob
>>
>>
>> On 4/27/11 3:30 AM, "Estanislao Gonzalez" <estanislao.gonzalez at zmaw.de> wrote:
>>
>> Hi Sébastien,
>>
>> indeed this is a great idea, but changing this manually on every
>> gateway is not practical at all.
>>
>> I've already proposed moving this to a central registry of some
>> kind, but considering that the current registry, which is
>> essential, is not ready I'd suggest a quick and dirty procedure:
>>
>> * Bob (I know you love this :-), could you put this info in a file
>>   publicly accessible at pcmdi3? (plain txt file, no headers,
>>   nothing)
>> * Gateway team: could you give us a one line ajax command
>>   (or anything similar, preferably from the client side) depending
>>   on the current js libraries to insert this text where it should?
>>
>> I think that'll do.
>>
>> Thanks,
>> Estani
>>
>> Am 27.04.2011 11:53, schrieb Sébastien Denvil:
>> Hi Bob, Stephen, Estanislao
>>
>> I noticed that the PCMDI gateway has a notice on the homepage listing
>> newly available datasets.
>>
>> The latest notice is: "BCC datasets will be available at the end
>> of April."
>>
>> Bob, could you add this to your list:
>> "IPSL-CM5A-LR piControl and historical datasets available"
>> They have been open to the CMIP5-research role since the 20th of April.
>>
>> Stephen, Estanislao, I think it would be a good idea to duplicate
>> this notice on the other gateways to help people identify which
>> datasets are accessible.
>>
>> Cheers.
>> Sébastien
>>
>> _______________________________________________
>> GO-ESSP-TECH mailing list
>> GO-ESSP-TECH at ucar.edu
>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>>
>> --
>> Bryan Lawrence
>> Director of Environmental Archival and Associated Research
>> (NCAS/British Atmospheric Data Centre and NCEO/NERC NEODC)
>> STFC, Rutherford Appleton Laboratory
>> Phone +44 1235 445012; Fax ... 5848;
>> Web: home.badc.rl.ac.uk/lawrence
>>
>>
>> ---------------------------------------------------
>> Mark Morgan
>> Software Architect / Engineer
>> Institut Pierre Simon Laplace (IPSL),
>> Université Pierre Marie Curie,
>> 4 Place Jussieu,
>> Tour 45-55, Salle #207,
>> Paris 75005
>> France.
>> Tel : +33 (0) 1 44 27 49 10
>> Email: momipsl at ipsl.jussieu.fr
>> ---------------------------------------------------
>>
>> _______________________________________________
>> esg-gateway-dev mailing list
>> esg-gateway-dev at mailman.earthsystemgrid.org
>> http://mailman.earthsystemgrid.org/mailman/listinfo/esg-gateway-dev
>>     
>
>   
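
For reference, the Atom feed of recently published datasets that
Stephen mentions in the quoted thread could look roughly like the
Python sketch below.  It uses only the standard library; the function
name build_atom_feed, the urn:example ids and the feed title are
hypothetical, and nothing here reflects the actual esgcet schema or
any existing Pylons view.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM_NS = "http://www.w3.org/2005/Atom"


def _tag(name):
    """Return the namespace-qualified ElementTree tag for an Atom element."""
    return f"{{{ATOM_NS}}}{name}"


def _rfc3339(dt):
    """Format a UTC datetime as an RFC 3339 timestamp, as Atom expects."""
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")


def build_atom_feed(feed_id, title, author, datasets):
    """Build a minimal Atom feed string.

    datasets is a non-empty list of (dataset_id, title, updated) tuples,
    where updated is a datetime in UTC.  Entries are listed newest first.
    """
    ET.register_namespace("", ATOM_NS)
    datasets = sorted(datasets, key=lambda d: d[2], reverse=True)

    feed = ET.Element(_tag("feed"))
    ET.SubElement(feed, _tag("id")).text = feed_id
    ET.SubElement(feed, _tag("title")).text = title
    # The feed-level timestamp is that of the newest entry.
    ET.SubElement(feed, _tag("updated")).text = _rfc3339(datasets[0][2])
    author_el = ET.SubElement(feed, _tag("author"))
    ET.SubElement(author_el, _tag("name")).text = author

    for dataset_id, dataset_title, updated in datasets:
        entry = ET.SubElement(feed, _tag("entry"))
        ET.SubElement(entry, _tag("id")).text = dataset_id
        ET.SubElement(entry, _tag("title")).text = dataset_title
        ET.SubElement(entry, _tag("updated")).text = _rfc3339(updated)

    return ET.tostring(feed, encoding="unicode")


# Illustrative input only: ids and times are made up for the example.
print(build_atom_feed(
    "urn:example:datanode:recent",
    "Recently published datasets",
    "Example Datanode",
    [
        ("urn:example:ds:1", "IPSL-CM5A-LR piControl",
         datetime(2011, 4, 20, 9, 0, tzinfo=timezone.utc)),
        ("urn:example:ds:2", "IPSL-CM5A-LR historical",
         datetime(2011, 4, 21, 14, 30, tzinfo=timezone.utc)),
    ],
))
```

In a real deployment the dataset list would come from a query on the
publication timestamp rather than a hard-coded list.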


