[Go-essp-tech] CMIP5 publishing to Gateway and P2P nodes

Estanislao Gonzalez gonzalez at dkrz.de
Thu Feb 16 08:39:20 MST 2012


Hi Sergei,

you have no replicas, so don't worry; that was just to give you context as 
to why we were doing that.
You can actually publish to both without any issue, even from the same 
machine. You'll need two esg.ini files so they point at the respective 
hessian endpoints (the pcmdi3 and pcmdi9 publishing URLs).

So you would publish everything as usual to pcmdi3, and then publish with 
"--noscan --publish" using the second esg.ini (the same as the first, 
except that the hessian_service_url property now points at the pcmdi9 
index). That's it.
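As a sketch, the two-step publish could look like this; the ini paths, project name, and mapfile here are hypothetical placeholders, and flags other than --noscan/--publish/--thredds should be checked against your publisher version:

```
# Sketch only -- paths, project, and mapfile are hypothetical.
# /esg/config/esg.ini         : hessian_service_url -> pcmdi3 publishing URL
# /esg/config-pcmdi9/esg.ini  : identical, but hessian_service_url -> pcmdi9

# 1) Scan, write THREDDS catalogs, and publish to pcmdi3 as usual
esgpublish -i /esg/config --project cmip5 --map mymap.txt --thredds --publish

# 2) Re-publish the same datasets to the pcmdi9 index without rescanning
esgpublish -i /esg/config-pcmdi9 --project cmip5 --map mymap.txt --noscan --publish
```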

But if you have a second machine, it would be better to use it for 
testing the installation. The upgrade might not go smoothly if you are 
coming from a very old version of the data node...
And in any case, if you have the possibility, try installing a second 
one; you could leave it with the BDM GridFTP installed, so we can speed 
up replication without creating extra traffic on your standard node :-)

Thanks,
Estani

On 16.02.2012 15:35, Serguei Nikonov wrote:
> Hi Estani,
>
> thanks for your suggestions. Can you give more details about the 
> necessity of unpublishing all datasets? I am not sure I clearly 
> understand your statement
> "Since the Gateway cannot publish replicas, we had to "unpublish them" 
> from it".
> Does it concern us, who are not a gateway, just a data node? To ask it 
> plainly: do you think it will be possible to leave the current 
> publications on pcmdi3 while installing a P2P data node bound to pcmdi9?
>
> Thanks,
> Sergey
>
> On 02/11/2012 01:06 PM, Estanislao Gonzalez wrote:
>> Hi Serguei,
>>
>> we are doing a very similar move, so perhaps it's worth sharing it.
>>
>> Since the Gateway cannot publish replicas, we had to "unpublish" them 
>> from it, though we still have the catalogs in place (meaning --thredds 
>> but no --publish for the esgpublish tool).
>> So I installed a data node on another machine, cmip3 (I'm using devel; 
>> though there are a few issues, it's stable and closer to what will end 
>> up in "master" soon enough).
>> I dumped the complete esgcet DB (database included), dropped the one 
>> installed on cmip3, and ingested the original one.
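The dump-and-restore step could be done along these lines, assuming the esgcet database lives in PostgreSQL; the database name, user, and host below are placeholders, not the actual setup:

```
# Sketch only -- database name, user, and host are hypothetical.
# 1) Dump the full esgcet database on the original node
pg_dump -U esgcet -h original-node.example.org esgcet > esgcet_full.sql

# 2) On cmip3: drop the freshly installed DB and recreate it empty
dropdb -U esgcet esgcet
createdb -U esgcet esgcet

# 3) Ingest the original dump
psql -U esgcet -d esgcet -f esgcet_full.sql
```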
>> Then I copied all catalogs to cmip3 and updated the host name in all 
>> of them via sed (that's because we have a GridFTP endpoint with the 
>> hostname encoded in each catalog).
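A minimal sketch of that hostname rewrite over the catalog directory; the hostnames, directory, and catalog fragment are made up for illustration, and this assumes GNU sed (`-i` behaves differently on BSD/macOS):

```shell
# Hypothetical hostnames and paths -- substitute your own.
OLD_HOST="old-node.example.org"
NEW_HOST="cmip3.example.org"
CATALOG_DIR="catalogs"

# (illustration) create a catalog fragment with the GridFTP endpoint baked in
mkdir -p "$CATALOG_DIR"
cat > "$CATALOG_DIR/sample.xml" <<EOF
<service base="gsiftp://$OLD_HOST:2811/" name="GridFTP" serviceType="GridFTP"/>
EOF

# Rewrite the hostname in every catalog, in place
find "$CATALOG_DIR" -name '*.xml' -exec sed -i "s/$OLD_HOST/$NEW_HOST/g" {} +
```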
>> I removed our own catalogs, but that's not something you'll be doing, 
>> I guess. If you need to remove multiple datasets, tell me, because I 
>> can give you a hint on how to do that faster.
>>
>> So now we have a duplicate of the original node, from which I will 
>> remove all replicas afterwards to regain some space. cmip3 has all the 
>> metrics from the original node, and I might have to replace some tables 
>> that have the hostname encoded in them (not sure, I haven't found any, 
>> but I guess the metrics do store the complete URL... I have to check).
>>
>> The bad part is that, as it stands, the node is not working for people 
>> who don't already have the CMIP5 membership... but it wasn't working 
>> before either (previously you would have seen a 403; now you get a 500 
>> because of some issues we are working on).
>>
>> After that you can publish to whatever you want, but if you have a 
>> second node, I'd suggest you publish to pcmdi9. You don't need any IdP 
>> or index, though as Luca said, the complete installation is the most 
>> tested one, so you might want to install the complete stack; even so, 
>> I recommend you take care of the data part only.
>>
>> I'd definitely suggest you keep both nodes running until everything is 
>> set up. It will lower the pressure of keeping it online if anything 
>> goes wrong (though it shouldn't).
>>
>> Well, hope that's useful to you.
>>
>> Thanks,
>> Estani
>>
>>
>>
>>
>> On 10.02.2012 19:07, Serguei Nikonov wrote:
>>> Hi Bob, Gavin.
>>>
>>> We are going to upgrade the data node to the P2P version here at GFDL, 
>>> and before that we need to clear up some questions:
>>> - First of all, what is the robust and working version, and when will 
>>> it be released if it isn't yet?
>>> - After installation, should our publications be indexed on our own 
>>> node, or do we still "publish to PCMDI"?
>>> - Is it possible to install it without republishing the datasets we have?
>>> - To avoid downtime, we would like to run the two systems (the old 
>>> data node and P2P) in parallel for some period, until all tweaks and 
>>> tuning are done and the new system is stable. Is it possible to have 
>>> two data nodes providing the same data in the Federation?
>>>
>>>
>>> Another question is about the status of pcmdi3: was it down on Feb 8? 
>>> I could not access any dataset that day, neither GFDL's nor those of 
>>> other centers. Also, since that day all GFDL datasets have not been 
>>> downloadable from the THREDDS catalog 
>>> (e.g. https://esgdata.gfdl.noaa.gov/thredds/esgcet/1/cmip5.output1.NOAA-GFDL.GFDL-CM3.historical.day.atmos.day.r1i1p1.v20110601.html?dataset=cmip5.output1.NOAA-GFDL.GFDL-CM3.historical.day.atmos.day.r1i1p1.v20110601.pr_day_GFDL-CM3_historical_r1i1p1_18650101-18691231.nc). 
>>>
>>>
>>> But at the same time they are healthy on pcmdi: searchable and 
>>> downloadable. I recall we had a similar situation two months ago, 
>>> which you fixed by tweaking something on the pcmdi side.
>>>
>>> Thanks,
>>> Sergey,
>>> _______________________________________________
>>> GO-ESSP-TECH mailing list
>>> GO-ESSP-TECH at ucar.edu
>>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>>
>>
>


-- 
Estanislao Gonzalez

Max-Planck-Institut für Meteorologie (MPI-M)
Deutsches Klimarechenzentrum (DKRZ) - German Climate Computing Centre
Room 108 - Bundesstrasse 45a, D-20146 Hamburg, Germany

Phone:   +49 (40) 46 00 94-126
E-Mail:  gonzalez at dkrz.de
