[Go-essp-tech] On use of wget over http
Estanislao Gonzalez
gonzalez at dkrz.de
Thu Oct 27 05:20:38 MDT 2011
Hello Martin,
I totally agree, as you may already know :-)
I'd like to add another good point in favour of this: bandwidth.
The archive is too large to expect people to download everything
all over again just to find out it hasn't changed.
Tools are built around this too. With a checksum provided, there is no
need to download something you "know" hasn't changed.
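To illustrate the idea, here is a minimal sketch of how a download tool
could decide whether a file needs to be fetched again. It is not taken
from any existing tool: the function names, the choice of SHA-256 and the
idea of a "published checksum" string handed to the client are all
assumptions made for the example.

```python
import hashlib
import os


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in 1 MiB chunks.

    The algorithm is an assumption; use whatever the archive publishes.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(path, published_checksum, algorithm="sha256"):
    """True if the local copy is missing or does not match the checksum."""
    if not os.path.exists(path):
        return True
    # Hex digests are compared case-insensitively.
    return file_checksum(path, algorithm) != published_checksum.lower()
```

The same file_checksum() call can be run right after a transfer, so a
file with the correct size but corrupted content is caught and fetched
again instead of being kept silently.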
For us, as archive sites, I don't think we will really archive anything
without a checksum; I know I won't.
Furthermore, checksums allow us to perform a version diff even
without having any other information, e.g. check the history tab at WDCC:
Version with added and renamed files:
http://ipcc-ar5.dkrz.de/dataset/cmip5.output1.NCC.NorESM1-M.sstClim.mon.land.Lmon.r1i1p1.html
and some deleted files:
http://ipcc-ar5.dkrz.de/dataset/cmip5.output2.MPI-M.MPI-ESM-LR.rcp26.mon.ocean.Omon.r1i1p1.html
This is only possible with checksums, and I think this info would be
quite helpful for users to know.
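As a rough sketch of how such a version diff can work with nothing but
checksums: given one {filename: checksum} mapping per dataset version, a
file whose checksum survives under a new name is a rename, and the rest
fall out as added, deleted or modified. This is only an illustration, not
the code behind the WDCC history tab; the function name and the input
format are assumptions.

```python
def diff_versions(old, new):
    """Compare two {filename: checksum} mappings of a dataset version."""
    old_names, new_names = set(old), set(new)

    # Same name, different checksum: the content changed.
    modified = sorted(n for n in old_names & new_names if old[n] != new[n])

    # Index files that vanished by checksum, so that a checksum showing
    # up again under a new name can be reported as a rename.
    vanished = {}
    for name in sorted(old_names - new_names):
        vanished.setdefault(old[name], []).append(name)

    renamed, added = [], []
    for name in sorted(new_names - old_names):
        candidates = vanished.get(new[name])
        if candidates:
            renamed.append((candidates.pop(0), name))
        else:
            added.append(name)

    # Whatever was not matched to a new name is really gone.
    deleted = sorted(n for names in vanished.values() for n in names)
    return {
        "added": added,
        "deleted": deleted,
        "renamed": renamed,
        "modified": modified,
    }
```

For example, if "b.nc" disappears and "b_v2.nc" appears with the same
checksum, the pair is listed under "renamed" and not as one deletion plus
one addition.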
Thanks,
Estani
On 27.10.2011 12:13, martin.juckes at stfc.ac.uk wrote:
>
> Hello,
>
> Yesterday I ran a few tests transferring a 2Gb file from CSIRO to a
> server at Reading in the UK using wget over http. I ran the wget
> command 4 times, and each time got a file of the correct size and
> incorrect checksum. Wget was using multiple automatic retries. I then
> throttled back the transfer rate to 400Kbytes/s and got the file
> transferred in one go, and with the correct checksum. It just took a
> little longer.
>
> My tentative conclusions are that users cannot access the data
> reliably if we do not provide checksums, and that download scripts
> which do not verify checksums are not good enough for an archive of
> this size.
>
> Cheers,
>
> Martin
--
Estanislao Gonzalez
Max-Planck-Institut für Meteorologie (MPI-M)
Deutsches Klimarechenzentrum (DKRZ) - German Climate Computing Centre
Room 108 - Bundesstrasse 45a, D-20146 Hamburg, Germany
Phone: +49 (40) 46 00 94-126
E-Mail: gonzalez at dkrz.de