[Go-essp-tech] repeated md5 chksum failures

Jennifer Adams jma at cola.iges.org
Mon Apr 16 05:53:15 MDT 2012


It looks like all my checksum failures were related to a disk error on our local system. I have relocated my wget downloads to a different disk and have not seen a single failure since. --Jennifer


On Apr 16, 2012, at 4:22 AM, <martin.juckes at stfc.ac.uk> wrote:

> Hi Jennifer,
>  
> I found a similar problem last year when transferring data from CSIRO and Japan to BADC, though I stopped after 4 attempts to get one file, all of which resulted in different checksums. In each case I was using wget and it was doing multiple automatic restarts. I suppressed the problem by running with “--limit-rate=500k” on the wget command line. This gave transfers that ran with no interruptions or restarts, and consistent checksums. It is also, of course, slower, but I haven’t found a way around that. I found some discussion on various web fora that may be related, suggesting that a delay between automated restarts helps prevent corruption, but all the discussions I found were ambiguous, so I didn’t keep a record. For the data I’m fetching from European and North American nodes I think this problem is too rare to merit running with a limited transfer rate; the few transfers that fail simply have to be repeated.
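> For illustration, a rate-limited fetch of one of the files discussed below would look roughly like this (the output name and URL are taken from the wget script entry quoted further down; apart from --limit-rate the command line is only a sketch):
> 
> wget --limit-rate=500k -O 'rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' 'http://bmbf-ipcc-ar5.dkrz.de/thredds/fileServer/cmip5/output2/MPI-M/MPI-ESM-P/piControl/mon/ocean/Omon/r1i1p1/v20111028/rhopoto/rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc'
> 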
> Cheers,
> Martin
>  
> From: go-essp-tech-bounces at ucar.edu [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of Jennifer Adams
> Sent: 11 April 2012 19:52
> To: go-essp-tech at ucar.edu
> Subject: [Go-essp-tech] repeated md5 chksum failures
>  
> Hi, Everyone -- 
> I'm trying to download some fairly large files (~1 GB) from the piControl run (monthly ocean variables) and find that the checksum fails to match several times before eventually succeeding. In some cases it takes 10 or more retries before the checksum matches. 
>  
> The problem is not with a specific data node. Here are some of the dataset IDs for the troublesome downloads: 
> cmip5.output1.CCCma.CanESM2.piControl.mon.ocean.Omon.r1i1p1.v20111028
> cmip5.output1.INM.inmcm4.piControl.mon.ocean.Omon.r1i1p1.v20110323
> cmip5.output1.MIROC.MIROC-ESM.piControl.mon.ocean.Omon.r1i1p1.v20110929
> cmip5.output1.MRI.MRI-CGCM3.piControl.mon.ocean.Omon.r1i1p1.v20110831
> cmip5.output1.NCAR.CCSM4.piControl.mon.ocean.Omon.r1i1p1.v20120220
> cmip5.output1.NCC.NorESM1-M.piControl.mon.ocean.Omon.r1i1p1.v20110901
> cmip5.output2.MRI.MRI-CGCM3.piControl.mon.ocean.Omon.r1i1p1.v20110831
> cmip5.output2.NCC.NorESM1-M.piControl.mon.ocean.Omon.r1i1p1.v20110901
> cmip5.output1.MPI-M.MPI-ESM-LR.piControl.mon.ocean.Omon.r1i1p1.v20120315
> cmip5.output1.MPI-M.MPI-ESM-P.piControl.mon.ocean.Omon.r1i1p1.v20120315
> cmip5.output2.MPI-M.MPI-ESM-P.piControl.mon.ocean.Omon.r1i1p1.v20111028
>  
> For example, here is a wget script entry from one of the final two datasets in the list:
> 'rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' 'http://bmbf-ipcc-ar5.dkrz.de/thredds/fileServer/cmip5/output2/MPI-M/MPI-ESM-P/piControl/mon/ocean/Omon/r1i1p1/v20111028/rhopoto/rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' 'MD5' '036aabfc10caa76a8943f967bc10ad4d'
>  
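> Roughly speaking, each entry drives a download-then-verify cycle like the sketch below; this only illustrates the logic, and the variable names and loop are not taken from the actual ESGF wget script:
> 
> url='http://bmbf-ipcc-ar5.dkrz.de/thredds/fileServer/cmip5/output2/MPI-M/MPI-ESM-P/piControl/mon/ocean/Omon/r1i1p1/v20111028/rhopoto/rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc'
> file='rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc'
> expected='036aabfc10caa76a8943f967bc10ad4d'
> # keep re-downloading until the local MD5 matches the published checksum
> until [ "$(md5sum "$file" 2>/dev/null | cut -d' ' -f1)" = "$expected" ]; do
>     wget -O "$file" "$url"
> done
> 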
> Here are the 21 download attempts so far today, spanning 5 hours; the "md5 failed!" message appears in the log file after each one: 
> 2012-04-11 09:19:18 (2.19 MB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 09:35:05 (1.13 MB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 09:53:26 (1009 KB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 10:05:52 (1.49 MB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 10:17:03 (1.61 MB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 10:31:14 (1.30 MB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 10:48:50 (1.04 MB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 11:01:09 (1.46 MB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 11:14:01 (1.40 MB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 11:29:46 (1.15 MB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 11:42:39 (1.40 MB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 12:01:05 (1011 KB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 12:18:25 (1.03 MB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 12:35:30 (1.04 MB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 12:49:44 (1.35 MB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 13:08:38 ( 960 KB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 13:26:11 (1.01 MB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 13:36:21 (1.78 MB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 13:50:53 (1.25 MB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 14:06:26 (1.15 MB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
> 2012-04-11 14:19:43 (1.39 MB/s) - `rhopoto_Omon_MPI-ESM-P_piControl_r1i1p1_185001-185912.nc' saved [1083611268/1083611268]
>  
> Another entry, from the other of those two datasets, failed 14 times before finally getting the "md5 ok" message; it took 3 hours 45 minutes to retrieve this file:
> 'so_Omon_MPI-ESM-P_piControl_r1i1p1_189001-189912.nc' 'http://bmbf-ipcc-ar5.dkrz.de/thredds/fileServer/cmip5/output1/MPI-M/MPI-ESM-P/piControl/mon/ocean/Omon/r1i1p1/v20120315/so/so_Omon_MPI-ESM-P_piControl_r1i1p1_189001-189912.nc' 'MD5' '175d6c9dd3ffea30186e6bc9c7e3dee1'
>  
> This problem is sucking up my bandwidth and my time, which are not unlimited. Is there any remedy?  
> --Jennifer
>  

--
Jennifer M. Adams
IGES/COLA
4041 Powder Mill Road, Suite 302
Calverton, MD 20705
jma at cola.iges.org


