[Go-essp-tech] Reasoning for the use of symbolic links in drslib
Kettleborough, Jamie
jamie.kettleborough at metoffice.gov.uk
Tue Sep 20 09:14:23 MDT 2011
Hello Balaji,
I agree - getting all nodes to make the checksums available would be a
good thing. It gives you both the data integrity check on download, and
the ability to see what files really have changed from one publication
version to the next.
I don't know how hard it is to do this, particularly for data that is
already published.
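
As a rough sketch of the comparison that exposed checksums would allow (this is not something drslib or the nodes are known to provide), the Python below assumes each publication version comes with a manifest mapping file names to checksums. The manifest format, the function names, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_versions(old_manifest, new_manifest):
    """Classify files by comparing two {filename: checksum} manifests.

    The manifests are hypothetical: plain dicts, one per publication version.
    """
    old_names, new_names = set(old_manifest), set(new_manifest)
    common = old_names & new_names
    return {
        "unchanged": sorted(n for n in common if old_manifest[n] == new_manifest[n]),
        "changed": sorted(n for n in common if old_manifest[n] != new_manifest[n]),
        "added": sorted(new_names - old_names),
        "removed": sorted(old_names - new_names),
    }
```

Only the files listed under "changed" and "added" would need to be fetched, and file_checksum could then be run on each download to confirm it matches the published value.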
Jamie
> -----Original Message-----
> From: V. Balaji [mailto:V.Balaji at noaa.gov]
> Sent: 20 September 2011 16:01
> To: Kettleborough, Jamie
> Cc: Karl Taylor; go-essp-tech at ucar.edu; esg-node-dev at lists.llnl.gov
> Subject: Re: [Go-essp-tech] Reasoning for the use of symbolic
> links in drslib
>
> If nodes can currently choose to record checksums or not, I'd
> strongly recommend this be a non-optional requirement. How
> could anyone download any data with confidence without being
> able to checksum?
>
> You can of course check timestamps and filesizes and so on,
> but you have to consider those optimizations: a fast option
> for the less paranoid to avoid computing the checksum, which
> has to remain the gold standard.
>
> "Trust but checksum".
>
> Kettleborough, Jamie writes:
>
> > Hello Karl, everyone,
> >
> >
> > For replicating the latest version, I agree that your alternate
> > structure poses difficulties (but it seems like there must be a
> > way to smartly determine whether you already have a file and
> > simply need to move it, rather than bring it over again).
> >
> >
> > Doesn't every user (not just the replication system) have this
> > problem? They want to know what files have changed (or not changed)
> > at a new publication version. No one wants to be using bandwidth or
> > storage space to fetch and store files they already have. How is a
> > user expected to know what has really changed? Estani mentions
> > checksums - OK, but I don't think all nodes expose them (is this
> > right?). You may try to infer from modification dates (not sure, I
> > haven't looked at them that closely). You may try to infer from the
> > TRACKING_ID - but I'm not sure how reliable this is (I can imagine
> > scenarios where different files share the same TRACKING_ID - e.g. if
> > they have been modified with an nco tool).
> >
> > Is there a recommended method for users to understand what *files*
> > have actually changed when a new publication version appears?
> >
> > Thanks,
> >
> > Jamie
> >
>
> --
>
> V. Balaji Office: +1-609-452-6516
> Head, Modeling Systems Group, GFDL Home: +1-212-253-6662
> Princeton University Email: v.balaji at noaa.gov
>