[Go-essp-tech] Reasoning for the use of symbolic links in drslib
Mark Morgan
momipsl at ipsl.jussieu.fr
Tue Sep 20 09:23:19 MDT 2011
Hi
esgNode.mandatory(checksums + PKI) = a better night's sleep.
Mark
On 20 Sep 2011, at 17:14, Kettleborough, Jamie wrote:
> Hello Balaji,
>
> I agree - getting all nodes to make the checksums available would be a
> good thing. It gives you both the data integrity check on download, and
> the ability to see what files really have changed from one publication
> version to the next.
>
> I don't know how hard it is to do this, particularly for data that is
> already published.
>
> Jamie
>
>> -----Original Message-----
>> From: V. Balaji [mailto:V.Balaji at noaa.gov]
>> Sent: 20 September 2011 16:01
>> To: Kettleborough, Jamie
>> Cc: Karl Taylor; go-essp-tech at ucar.edu; esg-node-dev at lists.llnl.gov
>> Subject: Re: [Go-essp-tech] Reasoning for the use of symbolic
>> links in drslib
>>
>> If nodes can currently choose to record checksums or not, I'd
>> strongly recommend this be a non-optional requirement. How
>> could anyone download any data with confidence without being
>> able to checksum?
>>
>> You can of course check timestamps and filesizes and so on,
>> but you have to consider those optimizations... a fast option
>> for the less paranoid to avoid the sum computation, which has
>> to be the gold standard.
>>
>> "Trust but checksum".
>>
>> Kettleborough, Jamie writes:
>>
>>> Hello Karl, everyone,
>>>
>>>
>>> For replicating the latest version, I agree that your alternate
>>> structure poses difficulties (but it seems like there must be a way
>>> to smartly determine whether you already have a file and simply
>>> need to move it, rather than bring it over again).
>>>
>>>
>>> Doesn't every user (not just the replication system) have this
>>> problem: they want to know what files have changed (or not changed)
>>> at a new publication version? No one wants to be using bandwidth or
>>> storage space to fetch and store files they already have. How is a
>>> user expected to know what has really changed? Estani mentions
>>> checksums - OK, but I don't think all nodes expose them (is this
>>> right?). You may try to infer from modification dates (not sure, I
>>> haven't looked at them that closely). You may try to infer from the
>>> TRACKING_ID - but I'm not sure how reliable this is (I can imagine
>>> scenarios where different files share the same TRACKING_ID - e.g. if
>>> they have been modified with an nco tool).
>>>
>>> Is there a recommended method for users to understand what *files*
>>> have actually changed when a new publication version appears?
>>>
>>> Thanks,
>>>
>>> Jamie
>>>
>>
>> --
>>
>> V. Balaji Office: +1-609-452-6516
>> Head, Modeling Systems Group, GFDL Home: +1-212-253-6662
>> Princeton University Email: v.balaji at noaa.gov
>>
> _______________________________________________
> GO-ESSP-TECH mailing list
> GO-ESSP-TECH at ucar.edu
> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
---------------------------------------------------
Mark Morgan
Software Architect / Engineer
Institut Pierre Simon Laplace (IPSL),
Université Pierre Marie Curie,
4 Place Jussieu,
Tour 45-55, Salle #207,
Paris 75005
France.
Tel : +33 (0) 1 44 27 49 10
Email: momipsl at ipsl.jussieu.fr
---------------------------------------------------
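
The checksum comparison discussed in this thread can be sketched as
follows. This is a minimal illustration only, not an ESG or drslib API:
it assumes both publication versions are available as local
directories, and the function names (file_checksum, checksum_manifest,
compare_versions) and the choice of SHA-256 are assumptions made for
the example.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_manifest(version_dir, algorithm="sha256"):
    """Map each file's path (relative to version_dir) to its checksum.

    Path.is_file() follows symbolic links, so a symlinked file is hashed
    by the content of its target.
    """
    root = Path(version_dir)
    return {
        path.relative_to(root).as_posix(): file_checksum(path, algorithm)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_versions(old_manifest, new_manifest):
    """Classify files in the new version relative to the old one."""
    old_names, new_names = set(old_manifest), set(new_manifest)
    common = old_names & new_names
    return {
        "unchanged": sorted(n for n in common if old_manifest[n] == new_manifest[n]),
        "changed": sorted(n for n in common if old_manifest[n] != new_manifest[n]),
        "added": sorted(new_names - old_names),
        "removed": sorted(old_names - new_names),
    }
```

With manifests like these published per version, a user would only need
to fetch the files listed under "changed" and "added". Comparing
timestamps or sizes first, as suggested above, would be an optimization
on top of this rather than a replacement for it.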