[Go-essp-tech] CMIP5 publishing to Gateway and P2P nodes

Cinquini, Luca (3880) Luca.Cinquini at jpl.nasa.gov
Thu Feb 16 09:38:50 MST 2012


Hi Sergei,
	to echo what Estani said, I have been using a single data node for over a year to publish data to both the JPL gateway and the JPL P2P node. Executing the steps Estani outlined works just fine. The only thing to make sure of is that your data node is up to date, in particular:

	- you need to upgrade your certificates to trust both the gateway certs and the p2p certs
		("esg-node --force --rebuild-trustore")
	- you need to configure the authorization filter in thredds/WEB-INF/web.xml to use multiple authorization service endpoints, at the very least:

      <param-value>
        https://pcmdi9.llnl.gov/esgf-security/saml/soap/secure/authorizationService.htm,
        https://pcmdi3.llnl.gov/esgcet/saml/soap/secure/authorizationService.htm
      </param-value>

The first one is the p2p node, the second one is the gateway.
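A quick way to sanity-check that both service URLs made it into the deployed descriptor is to grep for them; this is a hedged sketch, and the web.xml path below is an assumption, so adjust it to your Tomcat layout:

```shell
# Path below is a common Tomcat layout -- an assumption, adjust to your install.
WEBXML=${WEBXML:-/usr/local/tomcat/webapps/thredds/WEB-INF/web.xml}
# List every authorization-service URL found in the descriptor; you should
# see both the pcmdi9 (p2p) and pcmdi3 (gateway) endpoints.
# (|| true keeps the check from aborting scripts if the file is absent)
grep -o 'https://[^<,[:space:]]*authorizationService\.htm' "$WEBXML" || true
```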

thanks, Luca

P.S.: and always, *always* back up your working installation before any changes :)


On Feb 16, 2012, at 8:39 AM, Estanislao Gonzalez wrote:

> Hi Sergei,
> 
> you have no replicas, so don't worry, it was just to give you context as 
> to why we were doing that.
> You can actually publish to both without any issue, even from the same 
> machine. You'll need two esg.ini files, so that they point at the 
> respective hessian endpoints (the pcmdi3 and pcmdi9 publishing URLs).
> 
> So you would publish everything as usual to pcmdi3, and then publish 
> with "--noscan --publish" using the second esg.ini (same as the other 
> one, but with the hessian_service_url property now pointing to the 
> pcmdi9 index). That's it.
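The two-step publication described above can be sketched as shell commands. This is a hedged sketch only: the mapfile name, config-directory paths, and the -i option are illustrative assumptions; only the --thredds, --publish, and --noscan flags come from the thread itself, so check what your publisher version accepts.

```shell
# 1) publish as usual to the gateway (pcmdi3), using the stock esg.ini
esgpublish -i /esg/config/esgcet --project cmip5 \
           --map cmip5.mymodel.map --thredds --publish

# 2) publish the already-scanned datasets to the pcmdi9 (P2P) index,
#    using a second config dir whose esg.ini is identical except that
#    hessian_service_url points at pcmdi9
esgpublish -i /esg/config/esgcet-p2p --project cmip5 \
           --map cmip5.mymodel.map --noscan --publish
```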
> 
> But if you have a second machine, it would be better to use it for 
> testing the installation. It might not be perfect if you are upgrading 
> from a very old version of the data node...
> And in any case, if you have the possibility, try installing a second 
> one; you could leave it with the BDM GridFTP installed, so we can speed 
> up replication without creating extra traffic to your standard node :-)
> 
> Thanks,
> Estani
> 
> On 16.02.2012 15:35, Serguei Nikonov wrote:
>> Hi Estani,
>> 
>> thanks for your suggestions. Can you give more details about the 
>> necessity of unpublishing all datasets? I am not sure I clearly 
>> understand your statement
>> "Since the Gateway cannot publish replicas, we had to "unpublish them" 
>> from it".
>> Does it concern us - not a gateway, just a data node? To ask it 
>> plainly: do you think it will be possible to leave the current 
>> publications on pcmdi3 while installing a P2P data node bound to pcmdi9?
>> 
>> Thanks,
>> Sergey
>> 
>> On 02/11/2012 01:06 PM, Estanislao Gonzalez wrote:
>>> Hi Serguei,
>>> 
>>> we are doing a very similar move, so perhaps it's worth sharing it.
>>> 
>>> Since the Gateway cannot publish replicas, we had to "unpublish them" 
>>> from it, though we still have the catalogs in place (meaning --thredds 
>>> but no --publish for the esgpublish tool).
>>> So I installed a data node on another machine, cmip3 (I'm using devel: 
>>> though there are a few issues, it's stable and closer to what will end 
>>> up in "master" soon enough).
>>> I dumped the complete esgcet DB (data included), dropped the one 
>>> installed at cmip3, and ingested the original one.
>>> Then I copied all catalogs to cmip3 and updated the hostname in all of 
>>> them via sed (that's because we have a gridFTP endpoint with the 
>>> hostname encoded in each catalog).
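The duplication described above can be sketched roughly as follows. This is a hedged sketch: the database name (esgcet) appears in the thread, but the database user, file paths, and hostnames are illustrative assumptions.

```shell
# on the original node: dump the publisher database
pg_dump -U dbsuper esgcet > esgcet_dump.sql

# on cmip3: replace the freshly installed DB with the original contents
dropdb   -U dbsuper esgcet
createdb -U dbsuper esgcet
psql     -U dbsuper esgcet < esgcet_dump.sql

# after copying the THREDDS catalogs over, rewrite the gridFTP hostname
# that is encoded in each of them (hostnames are placeholders)
sed -i 's/old-node\.example\.org/cmip3.example.org/g' \
    /esg/content/thredds/esgcet/*/*.xml
```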
>>> I removed our own catalogs, but that's not something you'll be doing, 
>>> I guess. If you need to remove multiple datasets, tell me, because I 
>>> can give you a hint on how to do that faster.
>>> 
>>> So now we have a duplicate of the original node, from which I will 
>>> remove all replicas afterwards to regain some disk space. cmip3 has 
>>> all the metrics from the original node, and I might have to replace 
>>> some tables that have the hostname encoded in them (not sure, haven't 
>>> found any, but I guess the metrics do have the complete URL... have 
>>> to check).
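One quick way to check for hostname-bearing tables is to grep a plain-SQL dump of the database; a hedged one-liner, where the dump filename and the hostname are illustrative assumptions:

```shell
# Count occurrences of the old hostname anywhere in the dump; a non-zero
# count means some table (metrics URLs, etc.) still embeds it.
# Filename and hostname are placeholders -- substitute your own.
# (|| true keeps a zero count from aborting scripts run with set -e)
grep -c 'old-node\.example\.org' esgcet_dump.sql || true
```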
>>> 
>>> The bad news is that, as it stands, the node is not working for 
>>> people who don't already have the CMIP5 membership... but it wasn't 
>>> working before either (previously you would have seen a 403; now you 
>>> get a 500 because of some issues we are working on).
>>> 
>>> After that you can publish to whatever you want, but if you have a 
>>> second node, I'd suggest you publish to pcmdi9. You don't need any 
>>> idp or index, though as Luca said, the complete installation is the 
>>> most tested one, so you might want to install the complete stack; 
>>> still, I recommend you care about the data part only.
>>> 
>>> I'd definitely suggest you keep two nodes running until everything is 
>>> set up. It will lower the pressure of having it online if anything 
>>> goes wrong (though it shouldn't).
>>> 
>>> Well, hope that's useful to you.
>>> 
>>> Thanks,
>>> Estani
>>> 
>>> 
>>> 
>>> 
>>> On 10.02.2012 19:07, Serguei Nikonov wrote:
>>>> Hi Bob, Gavin.
>>>> 
>>>> We are going to upgrade the data node to the P2P version here at 
>>>> GFDL, and before that we need to clear up some questions:
>>>> - first of all, what is the robust, working version, and when will 
>>>> it be released if it hasn't been yet?
>>>> - after installation, should our publications be indexed on our own 
>>>> node, or do we still "publish to PCMDI"?
>>>> - is it possible to install it without republishing the datasets we 
>>>> have?
>>>> - to avoid downtime, we would like to run the two systems (the old 
>>>> Data Node and the P2P one) in parallel for some period, until all 
>>>> tweaks and tuning are done and the new system is stable. Is it 
>>>> possible to have two data nodes providing the same data in the 
>>>> Federation?
>>>> 
>>>> 
>>>> The other question is about the status of pcmdi3 - was it down on 
>>>> Feb 8? I could not access any dataset that day - neither GFDL's nor 
>>>> other centers'. Also, since that day, none of the GFDL datasets are 
>>>> downloadable from the thredds catalog
>>>> (e.g.https://esgdata.gfdl.noaa.gov/thredds/esgcet/1/cmip5.output1.NOAA-GFDL.GFDL-CM3.historical.day.atmos.day.r1i1p1.v20110601.html?dataset=cmip5.output1.NOAA-GFDL.GFDL-CM3.historical.day.atmos.day.r1i1p1.v20110601.pr_day_GFDL-CM3_historical_r1i1p1_18650101-18691231.nc). 
>>>> 
>>>> 
>>>> But at the same time they are healthy on pcmdi - searchable and 
>>>> downloadable. I recall we had a similar situation 2 months ago, 
>>>> which you fixed by tweaking something on the pcmdi side.
>>>> 
>>>> Thanks,
>>>> Sergey,
>>>> _______________________________________________
>>>> GO-ESSP-TECH mailing list
>>>> GO-ESSP-TECH at ucar.edu
>>>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>>> 
>>> 
>> 
> 
> 
> -- 
> Estanislao Gonzalez
> 
> Max-Planck-Institut für Meteorologie (MPI-M)
> Deutsches Klimarechenzentrum (DKRZ) - German Climate Computing Centre
> Room 108 - Bundesstrasse 45a, D-20146 Hamburg, Germany
> 
> Phone:   +49 (40) 46 00 94-126
> E-Mail:  gonzalez at dkrz.de
> 
> 


