[Go-essp-tech] Thoughts Relating to your phone conference on replication....

Luca Cinquini luca at ucar.edu
Wed Nov 4 08:24:04 MST 2009


Hi Gavin,
	the bottom line is that THREDDS catalogs are not the currency of
the system... they are simply one of many envisioned ways in which a
data node can organize its data and send it to the gateway. The way the
gateway is architected, the currency of the system is the metadata
stored in the relational database. There are other ways to ingest data
into the system: one is to parse a local directory; another (future)
one is to parse an FTP server or a GridFTP server. You can also imagine
parsing XML catalogs that are in a different schema, and in fact this
is likely what will happen in order to interoperate with other systems
that already have their own data management. The idea was to make the
data node flexible enough that it does not absolutely require a TDS
installation (although all CMIP5 data nodes will have one).
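
To illustrate the point, the ingestion side can be thought of as a set
of pluggable metadata sources all feeding the same relational store.
A very rough sketch (Python; every class and method name here is
hypothetical, not actual gateway code):

    # Hypothetical sketch: the relational metadata store is the
    # currency; a THREDDS catalog is just one of several feeds into it.
    from abc import ABC, abstractmethod

    class MetadataSource(ABC):
        @abstractmethod
        def harvest(self):
            """Return dataset metadata records for the relational DB."""

    class ThreddsCatalogSource(MetadataSource):
        def __init__(self, catalog_url):
            self.catalog_url = catalog_url

        def harvest(self):
            ...  # parse the TDS catalog XML

    class LocalDirectorySource(MetadataSource):
        def __init__(self, root):
            self.root = root

        def harvest(self):
            ...  # walk the local directory tree

    # A (future) FtpSource or GridFtpSource, or a parser for some
    # other XML schema, would plug in the same way.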

thanks, luca

On Nov 3, 2009, at 12:41 PM, Gavin M Bell wrote:

> Hello Gentle-people,
>
> I just wanted to say a few words.  I should have said something at the
> end of our phone conference but I thought it wiser to collect my
> thoughts a bit for a more cogent and durable presentation.
>
> Okay, please be patient and read through the entire email, the
> complete thought.  Please send feedback so we can hash this out....
> Bear with me.
>
> The main idea I want to get across is for us to have a
> *catalog-centric* view of the system.  It is the catalog that is the
> primary currency of the system.
>
> - The catalog gets generated and published from the data via the
> data-node/publisher to the gateway.
>
> - The gateway is simply, in the context of this model, a searchable
> index over a collection of catalogs.
>
> - Changes to catalogs are what is versioned.
>
> - Changes to catalogs are what trigger notifications.
>
> - Replication should be about replicating catalogs, where file
> transfers are the necessary side-effect of proper catalog replication
> (see the sketch of per-catalog state just below).
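>
> As a concrete illustration - purely hypothetical, the field names are
> made up - the per-catalog state a gateway would need to keep in this
> model could be as small as:
>
>     # Hypothetical sketch (Python) of gateway-side state per catalog;
>     # none of these names come from the actual system.
>     from dataclasses import dataclass
>
>     @dataclass
>     class CatalogRecord:
>         dataset_name: str   # authoritative dataset identifier
>         gateway: str        # authoritative gateway for the dataset
>         version: int        # bumped when a new catalog usurps the old
>         body_checksum: str  # checksum of the immutable 'body' portion
>         xml: str            # the catalog document itself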
>
> It is the catalog that is the central 'document' that we are
> interested in.  It is the single entity that contains the necessary
> information used in all levels of this system.
>
> The very good point that was brought up on the call was: what is the
> interface between the parts of the system?  It has become clear to me
> that if each part of the system understood the catalog then it could
> operate quite well, gleaning the information it needs out of catalogs.
>
> The topic today was replication:
> So... In a catalog-centric model, the question of replication becomes
> simply: what datasets have changed?  This is equivalent to asking:
> what catalogs have changed?  This can be answered by the gateway
> which, in this context, is essentially a catalog store.  The gateway
> knows when a catalog changes because a new catalog usurps an older one
> - this can be detected at publication time, versioned appropriately,
> and announced (via notification).  The replication agent is interested
> in these notifications and thus should be de facto subscribed to such
> notification messages.
> When the replication agent is notified, it would look on its system
> and see if the notification refers to something in its list of things
> to hold a replica of, its "replication list".  If so, it can pull down
> the catalog, or some subset (diff) of that catalog, or simply the
> necessary tuple to find the location(s) of holders of the newest
> catalog.  The catalog will always carry within it its authoritative
> source (dataset name and gateway).  This can be resolved to the actual
> data node that has the new version of that catalog (and any other
> replicas that are up-to-date).
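>
> In sketch form (Python; every name here - on_catalog_changed,
> fetch_catalog, resolve_holders, sync_catalogs - is hypothetical, not
> an actual API), the agent's side of that exchange might look like:
>
>     # Hypothetical sketch of the replication agent reacting to a
>     # gateway notification; names are illustrative only.
>     def on_catalog_changed(notification, replication_list, gateway):
>         dataset = notification["dataset_name"]
>         if dataset not in replication_list:
>             return  # not something this node mirrors
>         # Pull the new catalog (or a diff, or just the locator tuple)
>         # and resolve the authoritative source plus any up-to-date
>         # replica holders.
>         catalog = gateway.fetch_catalog(dataset, notification["version"])
>         holders = gateway.resolve_holders(dataset, notification["version"])
>         sync_catalogs(catalog, holders)  # see the diff sketch below
>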
> Then it is the job of the replication agent that wants to be updated
> to contact the authoritative data-node, or any up-to-date replica
> holder, and basically sync catalogs.  Syncing catalogs means grabbing
> the latest catalog, from the authoritative source or an updated
> data-node replica, and diffing it with the stale catalog it currently
> has... the result of the diff is the set of files (and such) that need
> to be transferred in order to make the state of the stale node
> equivalent to the state of the latest catalog.  It is the catalog that
> contains the 'inventory' and all other necessary information.
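>
> A minimal sketch of that diff step, assuming each catalog can be
> reduced to a mapping of file entries to checksums (the reduction
> itself is elided; the names are hypothetical):
>
>     # Hypothetical: each catalog reduced to {file_path: checksum}.
>     def files_to_transfer(stale, latest):
>         """Files that are new or changed in the latest catalog."""
>         return {path for path, csum in latest.items()
>                 if stale.get(path) != csum}
>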
> Once files are transferred, integrity checking can be accomplished at
> a few levels.  The first is to have the stale node generate its own
> catalog and then check it against the reference (up-to-date) catalog
> it got from the source.  If replication has been done successfully
> they should be identical!  The catalog should have a 'header' portion
> that contains the checksum of the immutable 'body' portion of the
> catalog.  The first-level integrity check would be to see if the
> generated and reference checksums are the same; if not, a second-level
> check is required that walks the catalog's (XML) tree and compares the
> two trees.  It is in the latter check that individual file entries are
> compared, to detect which files may need to be fetched again.
> what files may need to be fetched again.  Also if the connection goes
> down or fails in some way, generating a catalog over the partial set  
> of
> files that have already been downloaded, and comparing it with the
> source catalog will tell the replication agent where to puck up from.
> The source catalog could be cached on the replication catalog and then
> purged after replication is done.  Or to be more up-to-date, can  
> refetch
> a catalog from any in the list of already up-to-date replica holders.
>
> The model is consistent.  Perhaps what needs to happen is for every
> part of this system to be able to parse and glean information from the
> catalog.
>
> There are system tweaks and optimizations that can be made (e.g.
> subscribing to be notified about specific entities vs. doing a general
> subscription blast; refetching the latest catalog from the source or
> up-to-date replicas vs. holding on to the copy you already have - a
> question of freshness; etc.).  But the model of being catalog-centric
> is consistent and complete.  I think this is the direction we should
> go in if we want this system to be scalable and to provide clean
> interfacing between the different parts.  Furthermore, testing becomes
> easier because essentially all you would need is a bag-of-catalogs to
> ingest into your piece.
>
>
> Thanks for putting up with my stream of consciousness. :-)
> I hope I was cogent.  Feel free to contact me if any additional
> clarification is required.
>
> Gavin.
>
> -- 
> Gavin M. Bell
> Lawrence Livermore National Labs
> --
>
> "Never mistake a clear view for a short distance."
>       	       -Paul Saffo
>
