[Go-essp-tech] [esg-gateway-dev] Status of the CMIP5 Archive

Gavin M. Bell gavin at llnl.gov
Thu Apr 28 11:10:18 MDT 2011


 Hi All,

I just got to work and caught up on this thread.  The turn of the
conversation has been interesting to see.  From registry to
notification.  Both of these facilities are in the node manager and are
being tested at this moment.  The registry, as well as email
notification, are to go on line shortly and be in the next release. 
There are some other features in the node manager that touch on many of
the issues discussed in this thread.

As far as RSS feeds and other modes of notification, I think that they
are worth taking a look at and seeing how we can integrate them into
future notification modes.  Currently we have email :-\.  True, the
question of granularity comes into play, but if we do this right we
could have a scalable system that still offers a useful level of
granularity, yes... I am hatching a plan.  :-).

As far as stop-gap registry solutions go... There will be little need for
them soon; however, in the mean (dt) time here is what you can do that will
provide a very smooth transition.  Every ESGF Node (i.e. one running a node
manager) generates a registration.xml document that represents
itself.  The xml content is based on the information in
the ESGF_HOME/config/esgf.properties file (at the moment; perhaps to be
augmented by the /etc/esg.env file).  This properties file is generated
by the install process.

If you have the node manager installed, poke at the url
http://<your node>/esgf-node-manager/node.  You should get the hessian
message "Hessian Requires Post"; this is exactly what you should see.
For the curious, if you "tail -f" your catalina.out you will see a bunch
of output running periodically as the node manager's component services
come up.  The main thing, though, is that it will generate the
registration.xml file at
http://<your host>/esgf-node-manager/registration.xml :-)

With that document generated, just send a quick email to me and Bob (and
the esg-node-dev list) saying that you have generated the file.  We'll go
to the url, pull it down and aggregate it on PCMDI3.  DONE.
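
For anyone wondering what the "pull it down and aggregate it" step might look like, here is a rough sketch under stated assumptions.  Only the /esgf-node-manager/registration.xml path comes from the description above; the <registry> and <source> wrapper elements, the function names and the example host names are hypothetical, and nothing here assumes anything about the schema inside registration.xml itself.  It is not the script we run on PCMDI3.

```python
from urllib.request import urlopen
from xml.etree import ElementTree as ET


def registration_url(host):
    """URL at which a node manager publishes its registration document."""
    return f"http://{host}/esgf-node-manager/registration.xml"


def aggregate(documents):
    """Combine (host, xml_bytes) pairs into one XML document.

    Each node's document is nested, unmodified, under a <source> element
    inside a <registry> root.  Both wrapper names are assumptions made
    for this sketch.  Malformed XML raises xml.etree's ParseError.
    """
    registry = ET.Element("registry")
    for host, xml_bytes in documents:
        node_root = ET.fromstring(xml_bytes)
        source = ET.SubElement(registry, "source", {"host": host})
        source.append(node_root)
    return ET.tostring(registry, encoding="unicode")


def pull_and_aggregate(hosts, timeout=10):
    """Fetch registration.xml from each host and aggregate the results.

    Hosts that cannot be reached are reported and skipped, so one node
    being down does not block the rest.
    """
    documents = []
    for host in hosts:
        try:
            with urlopen(registration_url(host), timeout=timeout) as response:
                documents.append((host, response.read()))
        except OSError as err:  # URLError and HTTPError are OSError subclasses
            print(f"skipping {host}: {err}")
    return aggregate(documents)
```

Usage would be something like pull_and_aggregate(["node1.example.org", "node2.example.org"]), with the list of hosts built from the emails people send in.  Keeping the fetch separate from the aggregation means the merging logic can be tested without touching the network.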

(caveat if you don't install with the installer... :-| hmmm...)

Again, this stop-gap will very shortly not even be needed, but I have to
test code before I release it... right?
As much as I like the picture (attached)... I try not to follow
his sentiments.... most of the time ;-).

No worries....

Now BACK TO THE HOUSE OF PAIN!!!!!  Mmmmwwaaahhh aaaaahhh ahhhhhh!!!
(eeeeviiil laugh)!!
:-)


On 4/28/11 6:14 AM, Mark Morgan wrote:
> Hi
>
> ... a news cast capability triggered manually ...
> How about ESG-Publisher posting to an ESG-F data twitter feed that can
> in turn be queried via the twitter.api
> (http://dev.twitter.com/doc, http://dev.twitter.com/doc/get/search)  
>
> Perhaps not a serious suggestion but a bit of fun ahead of GO-ESSP.
>
> Mark
>
> PS - Note the 16 character limit on hash tags
>
> PS - @Phil Kershaw : http://dev.twitter.com/pages/auth_overview
>
>
>
> On 28 Apr 2011, at 14:55, Cinquini, Luca (3880) wrote:
>
>> Hi,
>> FYI we are indeed looking at a distributed search architecture for
>> ESGF, and since each dataset that enters the system has a "last
>> update" time stamp, it already comes with the possibility of querying
>> for anything new... but again, at a level of granularity that is
>> probably too detailed for what we want the user to have. Maybe a news
>> cast capability triggered manually would be more like what is needed
>> here. This would be a great topic of discussion at the workshop.
>> thanks, Luca
>>
>> On Apr 28, 2011, at 5:50 AM, <philip.kershaw at stfc.ac.uk> wrote:
>>
>>> Another option would be to proxy tomcat through Apache, as JPL have
>>> done; then the Pylons app could run via mod_wsgi.
>>>
>>> Cheers,
>>> Phil
>>>
>>> From: <stephen.pascoe at stfc.ac.uk>
>>> Date: Thu, 28 Apr 2011 08:49:43 +0000
>>> To: <momipsl at ipsl.jussieu.fr>, <drach1 at llnl.gov>
>>> Cc: <esg-gateway-dev at earthsystemgrid.org>, <go-essp-tech at ucar.edu>,
>>> <sebastien.denvil at ipsl.jussieu.fr>
>>> Subject: Re: [esg-gateway-dev] [Go-essp-tech] Status of the CMIP5
>>> Archive
>>>
>>> Hi All,
>>>
>>> Similarly we have a web app which exposes publishing information in
>>> the datanode database for use between us and the MetOffice.  It uses
>>> the same SQLAlchemy model as esgcet with a few extra tables.  I
>>> don't think it would take me long to create a view that displays
>>> recently published datasets as an atom feed.
>>>
>>> However, there are complications.  This is a Pylons app which we
>>> deploy on a separate machine to the datanode or gateway.  If we were
>>> to put it on the datanode it would have to listen on a separate port
>>> to tomcat.  Or we could implement something similar in Java and put
>>> it in the datanode's tomcat.
>>>
>>> Also, Luca and Mark are right.  To be useful we would need to
>>> aggregate the information from each datanode and provide less
>>> fine-grained info like "Which experiments have just been published".
>>>  It begins to look a lot more like a query service with output as
>>> Atom/RSS.
>>>
>>> S.
>>>
>>> ---
>>> Stephen Pascoe  +44 (0)1235 445980
>>> Centre of Environmental Data Archival
>>> STFC Rutherford Appleton Laboratory, Harwell Oxford, Didcot OX11 0QX, UK
>>>
>>> From: go-essp-tech-bounces at ucar.edu
>>> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of Mark Morgan
>>> Sent: 28 April 2011 11:11
>>> To: Drach, Bob
>>> Cc: Sébastien Denvil; go-essp-tech at ucar.edu;
>>> esg-gateway-dev at earthsystemgrid.org
>>> Subject: Re: [Go-essp-tech] Status of the CMIP5 Archive
>>>
>>> Bob / Bryan
>>>
>>> On top of the possibility of the DataNode exposing ESG-Publisher
>>> AtomPub HTTP endpoints, may I also add to the mix the possibility of
>>> ESG-Publisher search web services.  Such web services would in turn
>>> permit the development of an ESG distributed search broker, i.e. an
>>> aggregator of search results pulled from multiple data nodes.
>>>
>>> At IPSL we have developed a portal that launches an overnight batch
>>> job to harvest & aggregate meta-data derived from the THREDDS
>>> catalogs published at each of our data nodes.  We could certainly
>>> use the ESG-Publisher AtomPub feeds to optimise the synchronisation
>>> of the aggregated meta-data.
>>>
>>> However, ultimately all paths lead to distributed search, as in a few
>>> years we will have several million files/variables to search
>>> against and there are limits to what can be achieved with a solution
>>> based upon aggregation.  Hence the ESG-Publisher search web services
>>> and associated distributed search broker are of real interest.
>>>
>>> Regards
>>>
>>> Mark
>>>
>>>
>>> On 27 Apr 2011, at 23:54, Drach, Bob wrote:
>>>
>>>
>>> Hi Bryan,
>>>
>>> Excellent suggestion. Anyone familiar with setting up an RSS feed?
>>>
>>> --Bob
>>>
>>>
>>> On 4/27/11 12:58 PM, "Bryan Lawrence"
>>> <bryan.lawrence at stfc.ac.uk> wrote:
>>>
>>>
>>> hi Bob
>>>
>>> I wonder how hard it would be to produce a data node feed (or a TDS
>>> feed) of datasets published/revised as part of the publication step?
>>>
>>> It'd then be relatively easy to parse that for a "new items" page ...
>>>
>>> Cheers
>>> Bryan
>>>
>>>
>>> I'm happy to post a list of publication events if that would be
>>> useful. But like you I would see this as a temporary solution until
>>> some sort of registry solution could be devised (famous last words
>>> ...). Also I can't really commit to keeping such a list up-to-date
>>> when vacation etc. intervenes.
>>>
>>> I'd be curious to hear if others think this is a good idea as well.
>>>
>>> --Bob
>>>
>>>
>>> On 4/27/11 3:30 AM, "Estanislao Gonzalez"
>>> <estanislao.gonzalez at zmaw.de> wrote:
>>>
>>> Hi Sébastien,
>>>
>>> indeed this is a great idea, but changing this manually on every
>>> gateway is not practical at all.
>>>
>>> I've already proposed moving this to a central registry of some
>>> kind, but considering that the current registry, which is
>>> essential, is not ready I'd suggest a quick and dirty procedure:
>>>
>>> * Bob (I know you love this :-), could you put this info in a file
>>>   publicly accessible at pcmdi3? (plain txt file, no headers,
>>>   nothing)
>>> * Gateway team: could you give us a one line ajax command
>>>   (or anything similar, preferably from the client side) depending
>>>   on the current js libraries to insert this text where it should go?
>>>
>>> I think that'll do.
>>>
>>> Thanks,
>>> Estani
>>>
>>> On 27.04.2011 11:53, Sébastien Denvil wrote:
>>> Hi Bob, Stephen, Estanislao
>>>
>>> I noticed that the PCMDI gateway has a notice on the homepage listing
>>> newly available datasets.
>>>
>>> The latest notice is: "BCC datasets will be available at the end
>>> of April."
>>>
>>> Bob, could you add this to your list:
>>> "IPSL-CM5A-LR piControl and historical datasets available"
>>> They have been open to the CMIP5-research role since 20 April.
>>>
>>> Stephen, Estanislao, I think it would be a good idea to duplicate
>>> this notice on the other gateways to help people identify which
>>> datasets are accessible.
>>>
>>> Cheers.
>>> Sébastien
>>>
>>> _______________________________________________
>>> GO-ESSP-TECH mailing list
>>> GO-ESSP-TECH at ucar.edu<mailto:GO-ESSP-TECH at ucar.edu>
>>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>>>
>>> --
>>> Bryan Lawrence
>>> Director of Environmental Archival and Associated Research
>>> (NCAS/British Atmospheric Data Centre and NCEO/NERC NEODC)
>>> STFC, Rutherford Appleton Laboratory
>>> Phone +44 1235 445012; Fax ... 5848;
>>> Web: home.badc.rl.ac.uk/lawrence
>>>
>>> ---------------------------------------------------
>>> Mark Morgan
>>> Software Architect / Engineer
>>> Institut Pierre Simon Laplace (IPSL),
>>> Université Pierre Marie Curie,
>>> 4 Place Jussieu,
>>> Tour 45-55, Salle #207,
>>> Paris 75005
>>> France.
>>> Tel : +33 (0) 1 44 27 49 10
>>> Email: momipsl at ipsl.jussieu.fr<mailto:momipsl at ipsl.jussieu.fr>
>>> ---------------------------------------------------
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Scanned by iCritical.
>>>
>>> _______________________________________________ esg-gateway-dev
>>> mailing list
>>> esg-gateway-dev at mailman.earthsystemgrid.org<mailto:esg-gateway-dev at mailman.earthsystemgrid.org>
>>> http://mailman.earthsystemgrid.org/mailman/listinfo/esg-gateway-dev
>>
>>
>
> ---------------------------------------------------
> Mark Morgan
> Software Architect / Engineer
> Institut Pierre Simon Laplace (IPSL),
> Université Pierre Marie Curie,
> 4 Place Jussieu,
> Tour 45-55, Salle #207,
> Paris 75005
> France.
> Tel : +33 (0) 1 44 27 49 10
> Email: momipsl at ipsl.jussieu.fr <mailto:momipsl at ipsl.jussieu.fr>
> ---------------------------------------------------
>
>
>

-- 
Gavin M. Bell
Lawrence Livermore National Labs
--

 "Never mistake a clear view for a short distance."
       	       -Paul Saffo

(GPG Key - http://rainbow.llnl.gov/dist/keys/gavin.asc)

 A796 CE39 9C31 68A4 52A7  1F6B 66B7 B250 21D5 6D3E

-------------- next part --------------
A non-text attachment was scrubbed...
Name: photo.JPG
Type: image/jpeg
Size: 107199 bytes
Desc: not available
Url : http://mailman.ucar.edu/pipermail/go-essp-tech/attachments/20110428/8f174b4e/attachment-0001.jpe 

