[Go-essp-tech] Improving on token-based authorization for filedownload

philip.kershaw at stfc.ac.uk
Wed Dec 2 08:00:36 MST 2009


Hi Luca,

Just wanted to check up on this: is there now a schedule for implementing a replacement for the token-based authorization?  I'd like to raise a ticket for this on the GO-ESSP Trac so that we can keep a record.

I'm concerned that interoperability between our site and yours is going to break.  Thinking through some possible scenarios:

 1) We will have a data node here running the ESG software stack, for example a TDS protected by the token-based system.  If our Gateway has no token-issuing functionality, does that render the token system on our data node useless, since no one could obtain tokens to access our TDS?  Or could a user apply for a token at some other ESG Gateway and use it to access the Data Node here?  In other words, are tokens transferable between Gateways and Data Nodes?

 2) We'll also have our own PyDAP and COWS services as part of our Data Node.  They're protected with OpenID and the certificate-based wget access I outlined at GO-ESSP.  Can you see problems with other ESG Gateways referencing these services?  If they just reference the endpoints directly from a browser then I'm guessing it's OK: the OpenID sign-in process would be initiated here when the given endpoint is accessed.

Cheers,
Phil

> -----Original Message-----
> From: Luca Cinquini [mailto:luca at ucar.edu] 
> Sent: 17 November 2009 23:13
> To: Rachana Ananthakrishnan
> Cc: Alex Sim; Neill Miller; Kershaw, Philip (STFC,RAL,SSTD)
> Subject: Re: Improving on token-based authorization for filedownload
> 
> 
> 
> On Nov 17, 2009, at 2:21 PM, Rachana Ananthakrishnan wrote:
> 
> >>>>
> >>>
> >>> There has been explicit request to improve the token based
> >>> authorization. When you say "all other parts of the system", you  
> >>> mean functionality not in this alpha release? What would be a  
> >>> timeline that would work for the Gateway team, as in what release
> >>> would you like to target this?
> >> There's many other things that I think would probably be a higher
> >> priority, for example today we talked a lot about supporting the  
> >> DRS syntax, and there's of course versioning...
> >> It would probably be a PI level decision.
> >
> > I'll start a separate thread on this.
> >
> >>>
> >>>> And I think the output from the Gateway should still be a SAML
> >>>> assertion containing the individual URLs, not the dataset,
> >>>> because the GridFTP server does not know what dataset the single
> >>>> files belong to.
> >>>
> >>> Actually I was hoping we can do some wildcard tricks here. If the
> >>> Gateway returned an assertion about dataset/* then GridFTP will  
> >>> simply do a wild card match. So if the request to Gateway had
> >>> http://foo.bar:12345/dataset1/file1, then if the assertion can have
> >>> http://foo.bar:12345/dataset1/*, then we could do some caching and
> >>> save round trips to the authorization service.
> >> The problem though is that there is no relation between the URL and
> >> a dataset identifier... Theoretically, files from different  
> >> datasets can be contained in the same directory.
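
The wildcard idea quoted above can be sketched roughly as follows. This is only an illustration, not the actual GridFTP or Gateway code: the class name `ConnectionAuthzCache`, the `ask_gateway` callable, and the `fake_gateway` policy are all hypothetical stand-ins, and the sketch assumes the URL prefix really does identify the dataset, which (as noted above) is not guaranteed.

```python
from fnmatch import fnmatchcase


class ConnectionAuthzCache:
    """Per-connection cache of URL patterns already authorized (hypothetical)."""

    def __init__(self, ask_gateway):
        # ask_gateway(url) returns a wildcard pattern covering the URL,
        # or None if access is denied. It stands in for the remote call.
        self._ask_gateway = ask_gateway
        self._patterns = []
        self.round_trips = 0

    def is_authorized(self, url):
        # Serve from the cache when an earlier assertion covers this URL.
        # Note: fnmatch's "*" also matches "/", so subdirectories match too.
        if any(fnmatchcase(url, p) for p in self._patterns):
            return True
        # Cache miss: one remote round trip to the authorization service.
        self.round_trips += 1
        pattern = self._ask_gateway(url)
        if pattern is None:
            return False
        self._patterns.append(pattern)
        return fnmatchcase(url, pattern)


def fake_gateway(url):
    # Assumed policy for the example: only dataset1 is readable.
    prefix = "http://foo.bar:12345/dataset1/"
    return prefix + "*" if url.startswith(prefix) else None


cache = ConnectionAuthzCache(fake_gateway)
for name in ("file1", "file2", "file3"):
    cache.is_authorized("http://foo.bar:12345/dataset1/" + name)
print(cache.round_trips)
```

With this arrangement only the first file of a dataset costs a round trip; denials are not cached here, so each denied URL is asked about again.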
> >>
> >
> > Okay. I did not get that from your response on the previous thread  
> > on this. The second proposal will require a remote round trip per  
> > file and is not going to help performance in any way. I wonder if we
> > can embed the dataset information in the URL and use that for
> > caching purposes. I understood from our discussion that it is  
> > typical to download many files from a given dataset and trying to  
> > optimize for that is useful - is that correct characterization?
> Yes, correct. The GridFTP server could still make only one request to
> the Gateway, asking for authorization to all files at once, and
> receive a single SAML statement. I'm not sure if this would be too
> much data to transfer though, especially for a very large number of files.
> BTW, are we sure that requesting authorization one file at a time
> really creates a large overhead, considering that the files to
> transfer are really pretty big?
> Luca
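
To make the two options in this reply concrete, here is a minimal sketch comparing one authorization request per file with a single request covering all files. The function names and the `fake_gateway` policy are assumptions made for the example; the real exchange would carry SAML statements rather than Python dictionaries.

```python
def authorize_per_file(urls, ask_gateway):
    """One Gateway round trip per file; returns (decisions, round_trips)."""
    decisions = {url: ask_gateway([url])[url] for url in urls}
    return decisions, len(urls)


def authorize_batch(urls, ask_gateway):
    """One Gateway round trip for the whole transfer; returns (decisions, round_trips)."""
    return ask_gateway(list(urls)), 1


def fake_gateway(urls):
    # Hypothetical policy stand-in: allow anything under /dataset1/.
    return {url: "/dataset1/" in url for url in urls}


urls = ["http://foo.bar:12345/dataset1/file%d" % i for i in range(3)]
urls.append("http://foo.bar:12345/dataset2/file0")

per_file, n_per_file = authorize_per_file(urls, fake_gateway)
batch, n_batch = authorize_batch(urls, fake_gateway)
print(n_per_file, n_batch, per_file == batch)
```

Both paths yield the same decisions; the trade-off is the number of round trips against the size of the single response, which grows with the number of files.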
> >
> > Thanks,
> > Rachana
> >
> >> Luca
> >>>
> >>>> The same argument applies to Proposal 3. So maybe there should  
> >>>> probably be only two proposals, #1 and #2=#3
> >>>> (with #4 being a combination of the previous two).
> >>>>
> >>>
> >>> I'll collapse proposal 3 and 4. The attributes caching is not  
> >>> useful given we are talking about caching only per connection and
> >>> not across connections.
> >>>
> >>> Rachana
> >>>
> >>>> thanks, Luca
> >>>>
> >>>>
> >>>> On Nov 17, 2009, at 10:24 AM, Rachana Ananthakrishnan wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> Here is a write up with proposed solutions for authorization of
> >>>>> end user download of files from GridFTP server.
> >>>>>
> >>>>> http://www.ci.uchicago.edu/wiki/bin/view/ESGProject/EnhancedAuthorization
> >>>>>
> >>>>> I would appreciate a review and feedback from each of you on the
> >>>>> proposal.
> >>>>>
> >>>>> Like mentioned there, reworking the wget based download (from
> >>>>> TDS) is also in the works, but this document deals exclusively
> >>>>> with the GridFTP support. This has been deemed critical and we
> >>>>> are being asked for a solution on this in short order - so
> >>>>> appreciate a quick turn around with comments.
> >>>>>
> >>>>> Thanks!
> >>>>> Rachana
> >>>>
> >>>
> >>
> >


