[Go-essp-tech] publishing by realm

Gavin M Bell gavin at llnl.gov
Thu Feb 25 14:54:04 MST 2010


Hey Bryan,

Pardon if I am speaking out of turn here but... Hmmm...
From what I understand, the dataset is captured as a logical file, the
catalog, and that is what gets published.  When the dataset is
replicated, the catalog is 'diffed' and only the files it lists that
have changed are copied.  Files are checksummed, so replication at the
file level really is all that is required.  In the mechanics of it all,
catalogs are generated and the immutable portions must match
(checksums), so a replica is true to its name.  If I were doing it I'd
build a Merkle tree over the constituent files and immutable data
(sorted by checksum), but I digress.
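
To make the idea concrete, here is a minimal sketch, not the actual
publisher code.  It assumes a catalog can be modelled as a plain mapping
from file name to hex checksum.  The names merkle_root and changed_files,
the choice of SHA-256, and the leaf/node prefix bytes are all
hypothetical choices made for illustration.

```python
import hashlib


def _hash(prefix: bytes, data: bytes) -> bytes:
    # Distinct prefixes keep leaf hashes and inner-node hashes apart
    # (an assumption of this sketch, not something from the thread).
    return hashlib.sha256(prefix + data).digest()


def merkle_root(checksums):
    """Return a hex Merkle root over an iterable of hex checksum strings."""
    # Sort first so the root does not depend on catalog listing order.
    level = [_hash(b"\x00", c.encode("ascii")) for c in sorted(checksums)]
    if not level:
        return _hash(b"\x00", b"").hex()
    while len(level) > 1:
        # Hash adjacent pairs; an unpaired last node is carried up as-is.
        nxt = [_hash(b"\x01", level[i] + level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0].hex()


def changed_files(local, remote):
    """Names in `remote` that are missing from `local` or differ in checksum."""
    return sorted(name for name, checksum in remote.items()
                  if local.get(name) != checksum)


# Hypothetical catalogs: file name -> checksum.
origin = {"tas.nc": "aa11", "pr.nc": "bb22", "ua.nc": "cc33"}
replica = {"tas.nc": "aa11", "pr.nc": "old0"}

# Comparing one root answers "is my realm-level dataset the same as yours?"
same = merkle_root(origin.values()) == merkle_root(replica.values())

# Only the files that differ need to move.
to_copy = changed_files(replica, origin)  # ['pr.nc', 'ua.nc']
```

With something like this, two sites compare a single root hash to decide
whether their copies match, and fall back to the per-file diff only when
the roots differ.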

Again, I don't want to speak out of turn since I am not coding this up,
but this has been my understanding.

Suffice it to say, IMHO, I think this is doable without any disconnect. :-)

http://en.wikipedia.org/wiki/Merkle_signature_scheme

Bryan Lawrence wrote:
> Hi Bob
> That means, if we leave things the way they are, there is a logical disconnect, and the risk of either vastly more data movement than is necessary or a complex resolution problem (is my replicated "realm" level dataset the same as yours, if we've done replication at the file level?).
> 
> cheers
> Bryan
> 

-- 
Gavin M. Bell
Lawrence Livermore National Labs
--

 "Never mistake a clear view for a short distance."
       	       -Paul Saffo

(GPG Key - http://rainbow.llnl.gov/dist/keys/gavin.asc)

 A796 CE39 9C31 68A4 52A7  1F6B 66B7 B250 21D5 6D3E

