[Go-essp-tech] Data node authorization

martin.juckes at stfc.ac.uk
Fri Jul 1 05:17:24 MDT 2011


Hi Estani,

For transfers from IPSL and CNRM I'd agree that 60 threads is too many -- I found performance tailing off with more than 15-20 threads. The optimum also depends on file size -- with 20 threads I got higher overall instantaneous transfer rates but a lower per-thread transfer rate, and hence more interruptions within each transfer. I need to clean up the script so that it monitors instantaneous transfer rates and successful completion rates separately. For transfers from CCCMA I'm getting much slower per-thread transfer rates and the optimal number of threads appears to be higher. I haven't really done enough to be sure of the exact value -- there are a lot of independent fluctuations going on.
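
A minimal sketch of how the two quantities could be recorded separately, assuming each transfer attempt reports the bytes received, the elapsed seconds and whether it completed. TransferLog and its method names are hypothetical, not taken from the script mentioned above, and the "useful" rate assumes an interrupted transfer is restarted from scratch rather than resumed.

```python
from dataclasses import dataclass, field


@dataclass
class TransferLog:
    """Per-attempt records for one source node (hypothetical helper)."""
    records: list = field(default_factory=list)

    def add(self, nbytes, seconds, completed):
        # nbytes: bytes received; seconds: wall-clock duration of the attempt;
        # completed: False if the transfer was interrupted.
        self.records.append((nbytes, seconds, completed))

    def mean_thread_rate(self):
        """Bytes/s per thread over all attempts, interrupted ones included."""
        total_secs = sum(r[1] for r in self.records)
        total_bytes = sum(r[0] for r in self.records)
        return total_bytes / total_secs if total_secs else 0.0

    def completion_fraction(self):
        """Fraction of attempts that finished without interruption."""
        if not self.records:
            return 0.0
        return sum(1 for r in self.records if r[2]) / len(self.records)

    def useful_rate(self):
        """Bytes/s per thread, counting only completed transfers as useful.

        Assumption: bytes from an interrupted attempt are discarded.
        """
        total_secs = sum(r[1] for r in self.records)
        good_bytes = sum(r[0] for r in self.records if r[2])
        return good_bytes / total_secs if total_secs else 0.0
```

Keeping one log per source node and per thread count would let the raw rate and the completion fraction be compared side by side, instead of being folded into a single figure.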

My impression is that for the link to CCCMA, where I have slow transfers and small files, 60 threads works well in the day, but a smaller number may be better at night when I am finding more network interruptions (all threads failing at more or less the same time).
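
One way to tell a network interruption (all threads failing together) from ordinary per-file failures would be to look for bursts in the failure timestamps. The following is only a sketch under assumed inputs: find_failure_bursts, the 30 second window and the 0.8 threshold are illustrative choices, not measured values.

```python
def find_failure_bursts(failure_times, n_threads, window=30.0, fraction=0.8):
    """Return the start times of bursts of near-simultaneous failures.

    failure_times: one timestamp (in seconds) per failed transfer.
    A burst is a run of failures within `window` seconds of its first
    failure that involves at least `fraction` of the `n_threads` threads.
    """
    times = sorted(failure_times)
    bursts = []
    i = 0
    while i < len(times):
        # Find the end of the run of failures within `window` of times[i].
        j = i
        while j < len(times) and times[j] - times[i] <= window:
            j += 1
        if (j - i) / n_threads >= fraction:
            bursts.append(times[i])
            i = j  # skip past the failures already counted in this burst
        else:
            i += 1
    return bursts
```

Comparing the number of bursts found by day and by night would give a rough check on whether a smaller thread count is really needed at night.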

I've reduced the problem I had with checksums by running them in a "nice" mode, and also by not running them if the file size is obviously wrong.
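
As an illustration of the two measures described above, here is a minimal sketch, assuming the expected file size is known in advance (for example from the catalogue). The function names are hypothetical; os.nice is Unix-only and plays the role of running the checksum under "nice".

```python
import hashlib
import os


def lower_priority(increment=10):
    """Raise this process's niceness (Unix only), similar to `nice`."""
    if hasattr(os, "nice"):
        os.nice(increment)


def md5_if_size_matches(path, expected_size, chunk_size=1 << 20):
    """Return the hex MD5 of `path`, or None if its size is wrong.

    A size mismatch means the file is incomplete or corrupt, so the
    expensive read of the whole file is skipped.
    """
    if os.path.getsize(path) != expected_size:
        return None
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files are never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling lower_priority() once at the start of a dedicated checksum process, and limiting how many such processes run at the same time, should avoid a batch of downloads that finish together swamping the machine.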

I'll tidy up my transfer rate records and try to make something useable,

Cheers,
Martin 

> >-----Original Message-----
> >From: go-essp-tech-bounces at ucar.edu [mailto:go-essp-tech-
> >bounces at ucar.edu] On Behalf Of Estanislao Gonzalez
> >Sent: 01 July 2011 11:56
> >To: go-essp-tech at ucar.edu
> >Subject: Re: [Go-essp-tech] Data node authorization
> >
> >Hi Martin,
> >
> >I'm not sure how it is in your case, but I think 60 threads are too many;
> >are they truly bringing any performance increase at all? If the
> >performance gain is minimal you might be using CPU that could instead be
> >used for md5sums, which are IO and CPU intensive.
> >
> >What I do is to separate the download from the checksums so they run
> >separately. I haven't tried CCCMA, and for IPSL I'm limited to a couple of
> >sockets: when I tried to open more than 3 or 4, they were immediately
> >closed. So far I've had good performance with ~9 threads and 1 or 1.5
> >threads per CPU core for checksumming.
> >But I guess that's very architecture dependent.
> >
> >I'd kindly ask you to store at least some benchmarks of the connections
> >so that we could later draw a map of the measured speeds between data
> >nodes and gateways and hopefully improve the total bandwidth.
> >
> >Thanks,
> >Estani
> >
> >Am 01.07.2011 11:13, schrieb martin.juckes at stfc.ac.uk:
> >> Hi Sebastien,
> >>
> >> How fast are your fast rates from CCCMA? I'm getting a few tens of
> >> GB/hour using up to 60 wget threads -- perhaps I should be using more
> >> (I got cautious about the numbers because at one point I had 40
> >> threads finish in a short time and the machine was then frozen by 40
> >> md5sum threads trying to run in parallel).
> >>
> >> Are you planning to do transfers from BCC using the tokenised
> >> scripts? I would prefer to use certificates, but they don't appear to
> >> support this at present.
> >>
> >> Cheers,
> >> Martin
> >>
> >>>> -----Original Message-----
> >>>> From: Sébastien Denvil [mailto:sebastien.denvil at ipsl.jussieu.fr]
> >>>> Sent: 01 July 2011 09:33
> >>>> To: Juckes, Martin (STFC,RAL,RALSP)
> >>>> Cc: Luca.Cinquini at jpl.nasa.gov;
> >jamie.kettleborough at metoffice.gov.uk;
> >>>> go-essp-tech at ucar.edu
> >>>> Subject: Re: [Go-essp-tech] Data node authorization
> >>>>
> >>>>   Hi all,
> >>>>
> >>>> We are in the process of downloading a subset of the already
> >>>> published data to sustain analysis activity (using multiple wget
> >>>> threads).
> >>>>
> >>>> We did that already for BADC, CNRM and CCCMA. Unlike Martin, we had
> >>>> fast transfer rates from CCCMA (network mysteries).
> >>>>
> >>>> I will let you know any interesting findings.
> >>>>
> >>>> The list we plan to download a subset from:
> >>>>
> >>>> bcc-csm1-1
> >>>> CanCM4
> >>>> CanESM2
> >>>> CNRM-CM5
> >>>> GISS-E2-H
> >>>> GISS-E2-R
> >>>> HadGEM2-A
> >>>> HadGEM2-ES
> >>>> inmcm4
> >>>> bcc-csm1-1
> >>>> NorESM1-M
> >>>>
> >>>> Regards.
> >>>> Sébastien
> >>>>
> >>>> On 01/07/2011 09:55, martin.juckes at stfc.ac.uk wrote:
> >>>>> Hi Jamie,
> >>>>>
> >>>>> As Luca says, the plan is to move to the myproxy system. Like the
> >>>>> wheels of justice, the wheels of ESGF grind exceedingly slow, but
> >>>>> you can't take the analogy much further. Like you, I've started to
> >>>>> look at data from other nodes. I've found that the myproxy system
> >>>>> works for BADC, IPSL, CNRM and CCCMA nodes. For the last two, you
> >>>>> will get a tokenised wget script if you go through the gateway
> >>>>> (because they are published through the PCMDI gateway) but if you
> >>>>> build your own wget scripts, you can use myproxy certificates. If
> >>>>> you want data from CCCMA, I find that the transatlantic transfer is
> >>>>> very slow and you may want to copy what I already have at BADC --
> >>>>> let me know if you do, as this copy won't be available through the
> >>>>> BADC gateway until QC L2 has been completed, which may be some way
> >>>>> off.
> >>>>> Another issue is the publication of checksums, because there is a
> >>>>> significant risk of data corruption when moving large volumes. BADC
> >>>>> and IPSL have the checksums in the THREDDS catalogues, CNRM is in
> >>>>> the process of re-publishing to achieve this. I'm going to get in
> >>>>> touch with CCCMA today to encourage them to do the same,
> >>>>> Regards,
> >>>>> Martin
> >>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: go-essp-tech-bounces at ucar.edu [mailto:go-essp-tech-
> >>>>>>> bounces at ucar.edu] On Behalf Of Cinquini, Luca (3880)
> >>>>>>> Sent: 30 June 2011 20:10
> >>>>>>> To: Kettleborough, Jamie
> >>>>>>> Cc: go-essp-tech at ucar.edu
> >>>>>>> Subject: Re: [Go-essp-tech] Data node authorization
> >>>>>>>
> >>>>>>> Hi Jamie,
> >>>>>>> the established plan is to move all sites to the MyProxy
> >>>>>>> (SAML-based) authentication and authorization system, and to
> >>>>>>> gradually phase out the token-based system. I know JPL and BADC
> >>>>>>> have already moved, and that PCMDI is still on the token-based
> >>>>>>> system. As for the timeline at each site, the corresponding
> >>>>>>> administrators will have to chime in.
> >>>>>>> thanks, Luca
> >>>>>>>
> >>>>>>> On Jun 30, 2011, at 7:22 AM, Kettleborough, Jamie wrote:
> >>>>>>>
> >>>>>>>> Hello,
> >>>>>>>>
> >>>>>>>> Earlier this week I was trying to get data from different data
> >>>>>>>> nodes. There seemed to be two authorization methods in place -
> >>>>>>>> one based on MyProxy, the other based on a token in the HTTP
> >>>>>>>> query string.
> >>>>>>>>
> >>>>>>>> Is this the long term plan?
> >>>>>>>> If not then how soon will just one method be supported across
> >>>>>>>> all nodes?
> >>>>>>>> If it is then I guess there will be follow up questions about
> >>>>>>>> how to handle both...
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>> Jamie
> >>>>>>>> _______________________________________________
> >>>>>>>> GO-ESSP-TECH mailing list
> >>>>>>>> GO-ESSP-TECH at ucar.edu
> >>>>>>>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
> >>>>
> >>>> --
> >>>> Sébastien Denvil
> >>>> IPSL, Pôle de modélisation du climat
> >>>> UPMC, Case 101, 4 place Jussieu,
> >>>> 75252 Paris Cedex 5
> >>>>
> >>>> Tour 45-55 2ème étage Bureau 209
> >>>> Tel: 33 1 44 27 21 10
> >>>> Fax: 33 1 44 27 39 02
> >>>>
> >
> >
> >--
> >Estanislao Gonzalez
> >
> >Max-Planck-Institut für Meteorologie (MPI-M)
> >Deutsches Klimarechenzentrum (DKRZ) - German Climate Computing Centre
> >Room 108 - Bundesstrasse 45a, D-20146 Hamburg, Germany
> >
> >Phone:   +49 (40) 46 00 94-126
> >E-Mail:  gonzalez at dkrz.de
> >