[Go-essp-tech] verification of datanode status

Estanislao Gonzalez gonzalez at dkrz.de
Wed Dec 21 07:52:12 MST 2011


Bob has something, but in any case we don't want "others" to unpublish 
datasets. The user publishing should be able to do this... it's really a 
very special case (the data node crashes and the decision is not to bring it 
back up).
I don't think we need an automated implementation for this... but Bob 
might provide the guidelines for doing what we need (probably querying the 
GW for the datasets, filtering the names and issuing the calls to unpublish them).
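
The filter-then-unpublish step could be sketched roughly as below. This is
only an illustration under assumptions: the dataset ids are made up, the
function names are hypothetical, and the `unpublish` callable stands in for
whatever mechanism actually talks to the Gateway, which is not specified
here. The default is a dry run, so nothing is unpublished by accident.

```python
from fnmatch import fnmatchcase
from typing import Callable, Iterable, List


def select_datasets(dataset_ids: Iterable[str], pattern: str) -> List[str]:
    """Return, sorted, the dataset ids matching a shell-style pattern."""
    return sorted(d for d in dataset_ids if fnmatchcase(d, pattern))


def unpublish_all(
    dataset_ids: Iterable[str],
    unpublish: Callable[[str], None],  # hypothetical hook, not a real API
    dry_run: bool = True,
) -> List[str]:
    """Call unpublish(id) for each id; in a dry run only collect the ids."""
    handled = []
    for dataset_id in dataset_ids:
        if not dry_run:
            unpublish(dataset_id)
        handled.append(dataset_id)
    return handled


if __name__ == "__main__":
    # Made-up ids standing in for the result of a Gateway query.
    all_ids = [
        "cmip5.output1.EXAMPLE-INST.modelA.historical",
        "cmip5.output1.OTHER-INST.modelB.historical",
    ]
    for dataset_id in select_datasets(all_ids, "cmip5.output1.EXAMPLE-INST.*"):
        print("would unpublish:", dataset_id)
```

Keeping the selection separate from the unpublish call lets the publishing
user review the list before anything is removed.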

Thanks,
Estani

On 20.12.2011 06:21, stephen.pascoe at stfc.ac.uk wrote:
> Hi all,
>
> I have now retracted this dataset.  However, we should really be able
> to delete it without access to the datanode.  There must be a Hessian
> API to delete datasets from a Gateway because esgpublish does it, but I
> don't know of a way without having an esgpublisher installation.
>
> Estani, is there anything in your toolset that could do this, or be a
> starting point for an implementation?
>
> Cheers,
> Stephen.
>
> ---
> Stephen Pascoe  +44 (0)1235 445980
> Centre of Environmental Data Archival
> STFC Rutherford Appleton Laboratory, Harwell Oxford, Didcot OX11 0QX, UK
>
>
> -----Original Message-----
> From: go-essp-tech-bounces at ucar.edu
> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of Eric Nienhouse
> Sent: 19 December 2011 22:03
> To: Estanislao Gonzalez
> Cc: go-essp-tech at ucar.edu
> Subject: Re: [Go-essp-tech] verification of datanode status
>
> Hi All,
>
> A long-standing Gateway requirement is to provide search and discovery
> of datasets (and other metadata) regardless of the state of remote
> services.  In the event that a data node service is unavailable, users
> should still be able to identify datasets, determine what has been
> published and generate download scripts.
>
> This guiding requirement was discussed at great length by the ESG-CET
> project group and was accepted as a key element in support of the
> community's best interest for data discovery.  Identifying "what has
> been published" was a key use case driving this need.  The advantage of
> this approach is that it allows users to find out "what exists" during
> periods of unexpected downtime or other service unavailability.
>
> Henrik noted that the DMI data node is no longer serving datasets
> published into ESG.  In this case these datasets can be discovered at
> the Gateway; however, they are inaccessible and out of sync.  If these
> data are no longer meant to be accessed, I'd suggest they be "retracted"
> from the gateway so that they no longer appear in the search results.
>
> Thanks,
>
> -Eric
>
> Estanislao Gonzalez wrote:
>> (sorry the message got cut)
>> ...it's up to the publisher to define when data shouldn't be accessible
>> anymore.
>>
>> There are some improvements that can be done, but most I can think of
>> will make the system more complex for the end user to understand.
>>
>> Datanode admins should rely on tools that help them keep their nodes up
>> for as long as possible (nagios & Co).
>>
>> My 2c,
>> Estani
>> On 19.12.2011 08:04, Estanislao Gonzalez wrote:
>>
>>> Hi Luca,
>>>
>>> That's not what Henrik meant. Nor does the architecture retain a
>>> living link to a data node (not an index one, as you've pointed out).
>>>
>>> This is a feature IMO, as the search engine is detached from the
>>> datanode. The index might indeed "prune" the data nodes that are down,
>>> but unless this is done synchronously it would complicate the
>>> federation interaction.
>>> Or to say it differently: it is up to the data node to ensure data is
>>> available, and if that's not desired anymore,
>>>
>>> On 19.12.2011 07:15, Cinquini, Luca (3880) wrote:
>>>
>>>> Hi Henrik,
>>>> not sure about the gateway, but this is a feature the P2P system
>>>> does have: as soon as a datanode is inaccessible, the search
>>>> automatically prunes that node away, so the search results never
>>>> contain dead links.
>>>> thanks, Luca
>>>>
>>>> On Dec 19, 2011, at 3:58 AM, Henrik Wiberg wrote:
>>>>
>>>>
>>>>> Does the gateway somehow 'ping' its registered datanodes to verify
>>>>> that they are accessible? The datanode at dmi has not been running
>>>>> for 5 months, yet the datanode's published datasets are still
>>>>> searchable and displayed at the gateway cmip-gw-badc. Shouldn't
>>>>> inaccessible datasets be removed from the search results?
>>>>>
>>>>> _______________________________________________
>>>>> GO-ESSP-TECH mailing list
>>>>> GO-ESSP-TECH at ucar.edu
>>>>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>>>>>
>>>>
>>
>>
>

-- 
Estanislao Gonzalez

Max-Planck-Institut für Meteorologie (MPI-M)
Deutsches Klimarechenzentrum (DKRZ) - German Climate Computing Centre
Room 108 - Bundesstrasse 45a, D-20146 Hamburg, Germany

Phone:   +49 (40) 46 00 94-126
E-Mail:  gonzalez at dkrz.de

