[Go-essp-tech] PKI access control on data nodes

Estanislao Gonzalez gonzalez at dkrz.de
Thu Jul 7 08:26:05 MDT 2011


Hi,

There are several problems:

1) Tokens are short-lived, so URLs that carry them are disposable. The
tokenless method separates the security credential (which is
short-lived) from the URL, so you can pass the wget script around and
it will never expire. It will only work together with a valid
certificate, though.
2) Tokens cannot be inferred; that's the point. So it is impossible to
create one yourself (for example, to download the same data for
another model). Only the Gateway can do that.
3) As for how to get a token: only the Gateway can provide it, and
there is no API for that (it would make no sense, because you would
have to authenticate for it anyway). This is what Phil was saying.

So, if you only plan to download a couple of files immediately from a
Gateway, token security may suit you well.
In all other cases you need the tokenless method.
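
To illustrate the tokenless case, here is a minimal sketch in Python of
a download where the URL stays fixed and the short-lived credential is
supplied separately as a certificate. The helper name
build_wget_command, the file paths and the URL are placeholders made up
for illustration, not something a Gateway generates. It also assumes
that the certificate and its private key sit in the same PEM file. The
options --certificate, --private-key and --ca-directory are standard
GNU wget options.

```python
import subprocess


def build_wget_command(url, cert_file, ca_dir, output=None):
    """Return the argv list for fetching `url` with a client certificate.

    Assumption: `cert_file` is a PEM file holding both the certificate
    and its private key, so the same path is passed for both options.
    """
    cmd = [
        "wget",
        "--certificate=" + cert_file,
        "--private-key=" + cert_file,
        "--ca-directory=" + ca_dir,
    ]
    if output is not None:
        # Write the download to an explicit file name.
        cmd += ["-O", output]
    cmd.append(url)
    return cmd


def download(url, cert_file, ca_dir, output=None):
    """Run wget; raises CalledProcessError if wget exits non-zero."""
    subprocess.run(build_wget_command(url, cert_file, ca_dir, output),
                   check=True)
```

Calling download() again later with the same URL should keep working
as long as cert_file points at a certificate that is still valid; only
the certificate needs renewing, never the URL.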

Thanks,
Estani

On 07.07.2011 15:46, Kettleborough, Jamie wrote:
> Hello Martin,
>
> I agree it's better to have one method of authorization rather than many
> - it's less to code, debug, maintain, support.  But at the moment we have
> 2, and I don't really have any information on when the http token method
> will be retired.  I know there is an intent to retire it, but no
> indication of the timescales.  So I was going for plan B: which was to
> look into the option of supporting both.
>
> A programmatic interface to the http token authorization is possible
> surely - how else do the gateways generate those wget scripts?  (Or am I
> missing something obvious?)  What did you mean by 'adequate data
> access'?
>
> Jamie
>
>> -----Original Message-----
>> From: martin.juckes at stfc.ac.uk [mailto:martin.juckes at stfc.ac.uk]
>> Sent: 06 July 2011 22:43
>> To: Kettleborough, Jamie; gavin at llnl.gov
>> Cc: go-essp-tech at ucar.edu
>> Subject: Checksums and PKI access control on data nodes
>>
>> Hi Jamie,
>>
>> just picking up something on one of your data node
>> authorization threads.
>>
>> I think programmatic access to data requires PKI security --
>> I don't see any prospect of adequate data access with the
>> http token approach.
>>
>> I think that checksums are also necessary to guarantee data
>> integrity -- these are given in the THREDDS catalogues of
>> BADC, IPSL, and CNRM -- and CCCMA is in the process of adding them.
>>
>> I aim to continue contacting data nodes over the coming weeks
>> and hope that there will be steady progress in levelling the
>> quality of service upwards,
>>
>> cheers,
>> Martin
>>
>> ________________________________________
>> From: go-essp-tech-bounces at ucar.edu
>> [go-essp-tech-bounces at ucar.edu] on behalf of Kettleborough,
>> Jamie [jamie.kettleborough at metoffice.gov.uk]
>> Sent: 05 July 2011 14:48
>> To: Gavin M. Bell
>> Cc: go-essp-tech at ucar.edu
>> Subject: Re: [Go-essp-tech] Data node authorization
>>
>> Hello Gavin,
>>
>> thanks for this.  This looks useful.  Any ideas when any
>> live/production data nodes will have this version of the
>> service on them? - I couldn't find any (but that's part of
>> the problem of course). When available how up to date will
>> the registry be e.g. are there constraints on it like it will
>> only know about data nodes running the same releases?
>>
>> I know you were just answering my tangent.  But I think the
>> original question is still only half answered.  As I
>> understand it there are two ways this might go:
>>
>> 1. all data nodes upgrade to the PKI infrastructure
>>
>> 2. the ESGF continues to support (for some time) both PKI and
>> the HTTP query string token (I don't know the right name for
>> this, sorry).
>>
>> (there is a 3rd option of everyone moving to just the HTTP
>> query string token - but I don't think that is really under
>> discussion).
>>
>> My guess is that 2. is the most likely outcome and data users
>> will have to cope with both.  So...
>>
>> 1. How do you programmatically get data using the HTTP query
>> string token (I think Martin is following this up with Bob -
>> can we have a summary posted to the list?)
>>
>> 2. How does a user know which method to use for which nodes?
>> (This may be in the data-node registry, when available, but
>> it wasn't obvious to me from the sample Luca sent round -
>> again I may be missing something though).
>>
>> Apologies if I'm coming across as over demanding here - I
>> realise I'm coming to this discussion relatively late in the
>> day.  Just I'm aware that we have scientists who want to get
>> data so they can start the analysis and writing of multi
>> model papers in time for the 1st draft of the AR5. At the
>> moment I'm really uncertain how they can get the data while
>> minimising the effort they have to put into finding and fetching it.
>>
>> Thanks,
>>
>> Jamie
>>
>>
>> ________________________________
>>
>>          From: Gavin M. Bell [mailto:gavin at llnl.gov]
>>          Sent: 01 July 2011 20:35
>>          To: Kettleborough, Jamie
>>          Cc: Cinquini, Luca (3880); go-essp-tech at ucar.edu
>>          Subject: Re: [Go-essp-tech] Data node authorization
>>
>>
>>          Hello Jamie,
>>
>>          Allow me to solely indulge your tangent for a moment... :-)
>>
>>          The issue of knowing who is where etc. is solved by
>>          using a sufficiently recent version of the ESGF "data"
>>          Node (v0.5.1+).
>>          The node-manager's registry component will
>>          automatically generate a continuously updating
>>          descriptive (xml) document of nodes currently present in
>>          the federation at a given time.  This would have
>>          ameliorated your task considerably.
>>
>>          For each of the sites you have collected, go to the
>>          esgf-node-manager page and look at the bottom left
>>          corner for the version.
>>          They are all earlier than v0.5.1 and hence do not
>>          have the automatic federation feature in place.
>>
>>          Ex:
>>          http://esgnode1.nci.org.au/esgf-node-manager/  (v0.5.0)
>>          http://vesg.ipsl.fr/esgf-node-manager/  (v0.4.0)
>>          http://esg.cnrm-game-meteo.fr/esgf-node-manager/  (v0.4.0)
>>          http://dap.cccma.uvic.ca/esgf-node-manager/  (v0.5.0)
>>          http://cmip-dn.badc.rl.ac.uk/esgf-node-manager/  (v0.4.0)
>>
>>          (NASA-GISS are not running a node manager at all)
>>
>>          If you look at more recent node installations
>>          (version 0.5.1+) you will see that there is a
>>          registration.xml document that is served under
>>          esgf-node-manager.  It is an active document that is
>>          automatically updated by the node manager's registry
>>          service to always reflect the current state of the
>>          federation.
>>          This is a feature of the new ESGF Node.  Gateways are
>>          not running node managers so they are not present in the
>>          registration.xml document.  However, you can find out
>>          about gateways indirectly by looking at the ESGF Node's
>>          registration entry and at its attribute "adminPeer":
>>          this indicates that node's target IDP service, which in
>>          older ESG parlance indicates a "gateway".  The new ESGF
>>          Nodes are built on a modular component architecture in
>>          which sets of components embody functionality; these
>>          sets are what we call ESGF Node "types".  There are 4
>>          node types.  The node type that is currently being
>>          installed is the well known "data" type, a.k.a. the
>>          "data node"; the other types are not mutually exclusive
>>          and extend the ESGF Node's functionality to include
>>          familiar features such as:
>>          - User credential management and single sign on support
>>          - Attribute management
>>          - Enhanced Federation-wide searching (with new search
>>            front-end)
>>
>>          As well as recent features since v0.5.1 and pending
>>          features coming on line such as:
>>          - Automatic fail-over and fault tolerance
>>          - New administrative front ends
>>          - Computation / Visualization tools
>>          - and more...
>>
>>          I would suggest upgrading :-).
>>
>>          The installation/upgrading process has been
>>          streamlined to make things more straightforward - and
>>          the team and I are always glad to help if needed.  There
>>          are further enhancements in the queue that will
>>          streamline the process further to make
>>          installation/upgrading as turn-key as possible.  There
>>          are also enhancements to the federation protocol, and
>>          new features as well, that will soon be available in an
>>          upcoming v0.5.3 release that is currently in test.
>>
>>          FYI:
>>          The current installer installs the ESGF Node at v0.5.1.
>>          In staging is v0.5.2
>>          In test is v0.5.3.
>>
>>          Note: The versions listed above are those of the node
>>          manager component.  As it is a component of the ESGF
>>          Node, the node itself has its own version, currently ESG
>>          Node v1.0.4+ (Stuyvesant release).
>>
>>          The new ESGF Node augments the data node and is a
>>          complete solution in and of itself while being
>>          compatible with the current Gateway.  It should be
>>          considered a useful tool to help the climate community
>>          and an addition to the ESG ecosystem of utilities :-).
>>
>>          Whew... (that was a long email)
>>          I hope this was somewhat useful information in the
>>          context of your tangent. :-)
>>
>>
>>          On 7/1/11 6:49 AM, Kettleborough, Jamie wrote:
>>
>>                  I created this table by: looking at each gateway,
>>                  figuring out which modelling institutes contributed
>>                  to the CMIP5 project, selecting a sample data-set,
>>                  creating a wget script, and then inspecting the url
>>                  in the script.  (I couldn't get to any NCC data as I
>>                  didn't have access).  I only sampled one dataset.
>>
>>                  This feels a bit long winded - what is the expected
>>                  way to do this?  Although today I was just gathering
>>                  information on what data nodes are out there I can
>>                  imagine this as a part of a real life use case (a
>>                  very common use case).  If I want to gather a
>>                  diagnostic, such as monthly mean surface temperature
>>                  from as many models as I can, I think I'd have to do
>>                  this sort of trawling.  OK I maybe only have to do
>>                  the initial mapping of institute to data node once,
>>                  but I think there is still a trawl needed between
>>                  gateways to get the data.  I may be missing
>>                  something - and I took some unnecessary steps.
>>                  Please let me know if this is the case.  Estani,
>>                  Martin, Sebastien - sounds like you have already
>>                  started to do this sort of thing?
>>
>>                  I also note that not all gateways know about all
>>                  institutes - I think this is a known problem.  For
>>                  instance PCMDI doesn't know about IPSL, and only NCI
>>                  seems to know about CSIRO.  Any ideas when this
>>                  might be resolved?
>>
>>
>>
>>
>>          --
>>          Gavin M. Bell
>>          Lawrence Livermore National Labs
>>          --
>>
>>           "Never mistake a clear view for a short distance."
>>                         -Paul Saffo
>>
>>          (GPG Key - http://rainbow.llnl.gov/dist/keys/gavin.asc)
>>
>>           A796 CE39 9C31 68A4 52A7  1F6B 66B7 B250 21D5 6D3E
>>
>>
>> _______________________________________________
>> GO-ESSP-TECH mailing list
>> GO-ESSP-TECH at ucar.edu
>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech


-- 
Estanislao Gonzalez

Max-Planck-Institut für Meteorologie (MPI-M)
Deutsches Klimarechenzentrum (DKRZ) - German Climate Computing Centre
Room 108 - Bundesstrasse 45a, D-20146 Hamburg, Germany

Phone:   +49 (40) 46 00 94-126
E-Mail:  gonzalez at dkrz.de


