[Go-essp-tech] ESG Federation Priorities - Was: NCI OpenIDs not working at PCMDI Gateway

Estanislao Gonzalez gonzalez at dkrz.de
Tue Oct 18 05:30:09 MDT 2011


Hi,

I'm not understanding this clearly. Is this the scenario?

The user contacts LAS 1, which cascades and contacts LAS 2 for some data, File1.

LAS 1 can redirect to an ORP running locally (or at least in the same
domain) and store the login session in a local (domain-wide) cookie.
LAS 1 is now acting as a delegate of the user for accessing LAS 2.
Normally LAS 2 will reside outside the domain, so the cookie will never
get sent "automatically", but LAS 1 could send some SAML assertion to LAS 2.
Note: To me, knowing very little in this respect, the SAML assertion looks
like a delegated certificate, a "real" proxy, where the IdP vouches, for
some given time, that the user wants to be "known" as logged in. Anyone
holding it can act on behalf of the user. (I think I'm missing something:
that's not very secure if we store the SAML assertion in an insecure
domain-wide cookie, as anyone in the same domain can intercept it. How
does it really work?)
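
A minimal sketch of that mental model, assuming a bearer token that the
IdP signs for a limited time and for one audience. This is NOT SAML: real
SAML assertions are XML documents signed with the IdP's private key, and
the HMAC key, function names and host names below are all hypothetical.
It only illustrates why a short lifetime and an audience restriction
limit what an intercepted token can be used for.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical shared secret between the IdP and the relying services.
# Real SAML uses XML signatures with the IdP's private key, not an HMAC.
IDP_KEY = b"demo-key-not-for-production"


def issue_assertion(openid, audience, lifetime_s, now=None):
    """IdP side: vouch that `openid` is logged in, for `audience` only."""
    now = time.time() if now is None else now
    claims = {"sub": openid, "aud": audience, "exp": now + lifetime_s}
    payload = base64.urlsafe_b64encode(
        json.dumps(claims, sort_keys=True).encode()
    )
    sig = hmac.new(IDP_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + sig


def verify_assertion(token, audience, now=None):
    """Relying side: return the OpenID if the token is valid, else None."""
    now = time.time() if now is None else now
    try:
        payload, sig = token.rsplit(".", 1)
    except ValueError:
        return None  # not in "payload.signature" form
    expected = hmac.new(IDP_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None  # tampered with, or not issued by the IdP
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["aud"] != audience or claims["exp"] < now:
        return None  # issued for another service, or expired
    return claims["sub"]
```

Whoever holds a valid token can present it, which is exactly the concern
above; the expiry and audience checks only narrow the window in which a
stolen token is useful.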

So the decision for LAS 2 would be to either trust LAS 1 (so that LAS 1
is actually acting as an IdP proxy) or trust the IdP directly (and maybe
query it again to make sure the assertion is still valid [key: single
sign-out]).

Is this right?
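
The two options for LAS 2 can be sketched roughly as follows. The IdP
class, the las2_accepts function and the host names are hypothetical
stand-ins, not part of LAS or the ORP; the sketch only shows how the two
trust models differ once the user has signed out.

```python
from dataclasses import dataclass, field


@dataclass
class IdP:
    """Stand-in for an identity provider that tracks live login sessions."""
    active: set = field(default_factory=set)

    def login(self, openid):
        self.active.add(openid)

    def logout(self, openid):
        self.active.discard(openid)

    def is_logged_in(self, openid):
        return openid in self.active


def las2_accepts(openid, forwarded_by, trusted_proxies=frozenset(), idp=None):
    """Decide whether LAS 2 serves a request made on a user's behalf.

    Option A (idp is None): trust the forwarding LAS as an IdP proxy.
    Option B (idp given): ignore who forwarded the request and ask the
    IdP whether the user's session is still live.
    """
    if idp is not None:
        return idp.is_logged_in(openid)
    return forwarded_by in trusted_proxies
```

For example, after idp.logout("alice") option A still accepts a request
forwarded by a trusted LAS 1, while option B rejects it. That difference
is what single sign-out hinges on.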

Roland mentioned a "service certificate", which to me sounded like LAS 1
gets a certificate from some IdP trusted by LAS 2 and uses it any time
LAS 2 has to be accessed; the security control then happens at LAS 1.
This means that if at any point in time any LAS gets hacked, all data is
compromised, doesn't it?
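
A toy model of that concern, assuming the data server checks only the
service identity and leaves the per-user decision to LAS. The certificate
subject, the ACL and both function names are invented for illustration;
the enforce_acl flag simulates a compromised LAS that skips its own check.

```python
# Hypothetical service certificate subject trusted by the data server.
TRUSTED_SERVICES = {"CN=las1.example.org"}
# Hypothetical per-user access policy, known only to LAS 1.
USER_ACL = {"File1": {"alice"}}


def opendap_serve(client_subject, dataset):
    """Data server: checks only the service identity, not the end user."""
    if client_subject not in TRUSTED_SERVICES:
        raise PermissionError("unknown client certificate")
    return f"<bytes of {dataset}>"


def las_fetch(user, dataset, enforce_acl=True):
    """LAS 1: the only place where the per-user check happens."""
    if enforce_acl and user not in USER_ACL.get(dataset, set()):
        raise PermissionError(f"{user} may not read {dataset}")
    return opendap_serve("CN=las1.example.org", dataset)
```

With enforce_acl=False the data server cannot tell the difference: it
sees the same trusted service certificate and returns the data for any
user.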

Now regarding caching, as this is the same problem I have: suppose LAS 1
caches part of File1 (to give the OPeNDAP endpoint a name) that came
from LAS 2. When can LAS 1 use the cached data, and how does it get the
security constraints attached to it? If it stores "how" to get that data
(which LAS server, etc.), it could trigger a modified request using the
selected security method (e.g. proxy, delegate, etc.) to test whether
the user can access the cached data, right?
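
That idea could look roughly like the sketch below: each cache entry
remembers where the data came from, and every read first asks a
pluggable authorization hook about that origin. The class names and the
authorize callback are assumptions for illustration; this is not how the
LAS cache is actually implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CacheEntry:
    data: bytes
    origin: str    # which LAS/OPeNDAP server the data came from
    resource: str  # what was requested there, e.g. the OPeNDAP URL


class AuthCheckedCache:
    """Cache that re-checks access at the origin before serving a hit."""

    def __init__(self, authorize: Callable[[str, str, str], bool]):
        # authorize(user, origin, resource) -> bool is a hypothetical hook;
        # it could use a proxy credential, a delegated one, and so on.
        self._authorize = authorize
        self._entries: Dict[str, CacheEntry] = {}

    def put(self, key: str, data: bytes, origin: str, resource: str) -> None:
        self._entries[key] = CacheEntry(data, origin, resource)

    def get(self, user: str, key: str) -> Optional[bytes]:
        entry = self._entries.get(key)
        if entry is None:
            return None  # cache miss: the caller fetches from the origin
        if not self._authorize(user, entry.origin, entry.resource):
            raise PermissionError(f"{user} may not read {entry.resource}")
        return entry.data
```

The authorization call still costs a round trip to the origin, but it is
a small request compared with transferring the data again.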

Thanks,
Estani

On 17.10.2011 19:23, Rachana Ananthakrishnan wrote:
> On Oct 17, 2011, at 11:29 AM, Roland Schweitzer wrote:
>
>> On 10/17/2011 07:07 AM, Cinquini, Luca (3880) wrote:
>>> Hi Phil,
>>>          we are not using static server certs at the moment, only a list of IPs from the registry.
>>>
>>> I remember your talk at the last go-essp meeting, and it would certainly be good to move in that direction, it's just a question of priorities.
>>> In any case, the very first step should be to secure the LAS UI, i.e. to redirect to the ORP, so that the user is authenticated, am I right ?
>> There is code in the UI client which will attempt to redirect a user
>> through the ORP if the URL used to access the LAS UI client contains an
>> openid parameter on the query string.  It worked in our tests, but there
>> are some issues around the fact that one LAS might want to act as a
>> proxy for making a request from another LAS.  The request will ask the
>> second LAS to return the results directly to the browser which requires
>> the client to also be authenticated at the second LAS node.
> If we used a model where all LAS deployments use the ORP, even if the second LAS redirects, it will not require any user input provided the user's login at the OpenID IdP is still valid. In this case, a cookie should have been created for the user's login session. I am assuming the fact that one LAS is a proxy to the other is a dynamic configuration, rather than a fixed set of LAS servers that play a particular role - is that correct?
>>> Then, once the user is authenticated, obtain a delegated credential to access opendap services.
>> I thought the proposal on the table was to use a "service certificate"
>> for Ferret to access remote OPeNDAP services.  It's possible to imagine
>> a credential tied to the user, but it seems like it would complicate the
>> LAS caching and we'd need to be careful about making sure we can
>> implement a solution that does not break the LAS cache.
> I recall the discussions on this where we said we will use a service certificate, and that OPeNDAP servers will treat LAS with privileges - that is, it can access data on behalf of other users. It does seem like the simpler approach, which we should try first. The static list of IPs was a preliminary step, even before the service certificates, given that at that time we did not have a way to use SSL to the OPeNDAP servers.
>
> Rachana
>
>> Roland
>>> thanks, Luca
>>>
>>> On Oct 17, 2011, at 1:09 AM,<philip.kershaw at stfc.ac.uk>   wrote:
>>>
>>>> Hi Luca,
>>>>
>>>> I think you're saying then that the LAS - OPeNDAP connections are secured
>>>> with IP restrictions.  I recall an initial solution was to use static
>>>> server certificates.  Did this get deployed or are there any plans to
>>>> develop your current system further?
>>>>
>>>> For the MashMyData project here, we extended ESGF security to enable user
>>>> delegation for secured workflows: portal to WPS to OPeNDAP service.  You
>>>> could do it in the above to get a LAS instance to use a delegated
>>>> credential to access a secured OPeNDAP service.  We are using this
>>>> approach on a couple of projects here.
>>>>
>>>> Cheers,
>>>> Phil
>>>>
>>>> On 16/10/2011 14:38, "Cinquini, Luca (3880)"<Luca.Cinquini at jpl.nasa.gov>
>>>> wrote:
>>>>
>>>>> Hi Eric:
>>>>>
>>>>> On Oct 14, 2011, at 9:32 AM, Eric Nienhouse wrote:
>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> Our NCI OpenID thread was getting rather off topic, so I've started a
>>>>>> new one.
>>>>>>
>>>>>> Good to hear the NCI OpenID issue has been resolved and that the NCI
>>>>>> node has received a number of accolades for quality service  :-)
>>>>>>
>>>>>> I'd like to continue discussing federation priorities, development
>>>>>> efforts and  replication.  Thanks Stephen and Gavin for summarizing a
>>>>>> number of efforts in support of data access across the federation,
>>>>>> including securing OpenDAP, LAS Product Services and replication.
>>>>>>
>>>>>> It is most important we all stay focused on interoperability, system
>>>>>> interfaces and specifications as we move forward.  I believe this is
>>>>>> especially critical now as many federation efforts are at high activity
>>>>>> level.
>>>>>>
>>>>>> It's obvious the success and stability of the production ESGF system
>>>>>> serving a large user base is critical as many users are preparing for
>>>>>> near term scientific reporting deadlines.  Note, fed wide, we have ~25K
>>>>>> users, many of whom are active CMIP5 researchers.  Published dataset
>>>>>> volume and user downloads are rapidly increasing.
>>>>>>
>>>>>> To this end I have the questions/comments below.
>>>>>>
>>>>>> Regards to all,
>>>>>>
>>>>>> -Eric
>>>>>>>> It's about people being able to download from multiple sites at
>>>>>>>> the
>>>>>>>> same time, and especially from a local one.
>>>>>>>> That's pretty much what is happening at IPSL, AFAIK you are indeed
>>>>>>>> replicating data internally so scientists can get to them much faster.
>>>>>>>> That's the whole idea of replication.
>>>>>>>>
>>>>>>>>
>>>>>>> There is a replication mechanism in the works - are you volunteering
>>>>>>> to get this bit of work completed?
>>>>>> Gavin: A couple of questions about this replication mechanism in the
>>>>>> works regarding interoperability:
>>>>>>
>>>>>> Will this work have impact on the Thredds catalog representation of
>>>>>> replica datasets?  Are you anticipating any changes to the replica
>>>>>> publication workflow?  I ask as we're working on search scalability,
>>>>>> metadata transfer and replicas.
>>>>>>
>>>>>>> LAS is fully installed and integrated into the ESGF P2P Node.
>>>>>>> As Sebastien noted with the LAS URLs this task has been done.
>>>>>>>
>>>>>>> If you install your ESGF P2P Node with --type compute you will get
>>>>>>> this configured and installed and you too can provide LAS
>>>>>>> functionality. :-)  Try it out :-)
>>>>>> Indeed LAS Product Service integration is getting uptake, which is good
>>>>>> to see.  We're publishing NCAR CMIP5 datasets with LAS endpoints into
>>>>>> the Gateway 1.3.3 snapshot for pre-release testing.  LAS is a great
>>>>>> service for visualization and data subset and download.
>>>>>>
>>>>>> One concern here at NCAR relates to securing LAS access to CMIP5
>>>>>> datasets in our production data node.  My understanding is that LAS
>>>>>> services are not yet under access control in the (compute) node.
>>>>>>
>>>>>> Is this correct?  If so, what are the plans for securing this service?
>>>>>> Is the intention to utilize the OpenDAP security mechanism for doing so?
>>>>> correct - right now, LAS is granted access to the opendap endpoints via
>>>>> the IP filter. At some point,
>>>>> we started working with PMEL to enable the LAS UI to be able to redirect
>>>>> to the ORP, in case the user is not authenticated already,
>>>>> but that work was never completed. We can talk about picking it up at one
>>>>> of our upcoming conferences.
>>>>>
>>>>> thanks, Luca
>>>>>
>>>>>> Thanks for any details you can provide.
>>>>>>
>>>>>> On 13/10/2011 13:02, stephen.pascoe at stfc.ac.uk wrote:
>>>>>>> Sébastien and all,
>>>>>>>
>>>>>>> I agree getting all those services in place at one time is the target.
>>>>>>> It is challenging that different parts of the federation have
>>>>>>> priorities and it's hard work to keep all the different parts in sync.
>>>>>>> Some of us need OPeNDAP straight away, some need CIM metadata, some
>>>>>>> need GridFTP and checksums (for replication), some want visualisation
>>>>>>> (LAS).  All I can do now is mention a few areas where we are making
>>>>>>> progress.
>>>>>>>
>>>>>>> OPeNDAP.  I know our OPeNDAP security is broken at present but we've
>>>>>>> just spent some contractor time figuring out the problem which we have
>>>>>>> just pushed to esg-orp.git's devel branch.  This turns several hacks
>>>>>>> that make OPeNDAP work into configurable options.
>>>>>>>
>>>>>>> We have also contributed the TDS security testing tool in
>>>>>>> esg-contrib.git.  Some initial tests show that JPL is the one place
>>>>>>> where OPeNDAP is working and correctly secured.  At NCI the OPeNDAP
>>>>>>> aggregations weren't accessible for datasets where the NetCDF was.
>>>>>>> Unless you are using the latest esg-orp filters it is likely the
>>>>>>> OPeNDAP URLs are not correctly secured.  There is also a loophole where
>>>>>>> if NetCDF files are in a thredds_dataset_root but not explicitly
>>>>>>> restricted in a THREDDS catalog they can be downloaded.  We hope that
>>>>>>> the work in esg-orp.git will allow us to close this.
>>>>>>>
>>>>>>> A major bottleneck for us is the time it takes to make
>>>>>>> AttributeService requests to PCMDI.  We are putting in place a caching
>>>>>>> AuthorizationService that will reduce AttributeService callouts and
>>>>>>> should make downloads quicker for both MOHC and IPSL data.  We are also
>>>>>>> getting end-user configured GridFTP ready for production so that users
>>>>>>> with large data requirements can start using that.
>>>>>>>
>>>>>>> So lots is happening and I embrace a competitive spirit amongst
>>>>>>> datanodes and gateways to get this right.
>>>>>>>
>>>>>>> And a quick query to Sébastien
>>>>>>>
>>>>>>>
>>>>>>>>> replication is the gateways priority. My priority is to have happy
>>>>>>>>> users. And I know they want OpenDap. CORDEX simulations are running
>>>>>>>>> now
>>>>>>>>> and they need OpenDap to subset their download.
>>>>>>> Are your users happy with their access to data from the USA?  Are USA
>>>>>>> scientists happy with their access to IPSL data?  To be honest we know
>>>>>>> BADC has a particular problem with bandwidth but I'd be surprised if
>>>>>>> replication wasn't going to help these users.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Stephen.
>>>>>>>
>>>>>>> ---
>>>>>>> Stephen Pascoe  +44 (0)1235 445980
>>>>>>> Centre of Environmental Data Archival
>>>>>>> STFC Rutherford Appleton Laboratory, Harwell Oxford, Didcot OX11 0QX,
>>>>>>> UK
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Sébastien Denvil [mailto:sebastien.denvil at ipsl.jussieu.fr]
>>>>>>> Sent: 13 October 2011 10:07
>>>>>>> To: Estanislao Gonzalez
>>>>>>> Cc: muhammad.atif at anu.edu.au; Eric Nienhouse; Cinquini, Luca (3880);
>>>>>>> Pascoe, Stephen (STFC,RAL,RALSP); Neill Miller;
>>>>>>> esg-gateway-dev at earthsystemgrid.org; esg-node-dev at lists.llnl.gov
>>>>>>> Subject: Re: [esg-node-dev] RE: [esg-gateway-dev] NCI OpenIDs not
>>>>>>> working at PCMDI Gateway
>>>>>>>
>>>>>>> Hi all, Estani,
>>>>>>>
>>>>>>> just a small comment below:
>>>>>>>
>>>>>>> On 13/10/2011 10:12, Estanislao Gonzalez wrote:
>>>>>>>
>>>>>>>>> Hi Muhammad,
>>>>>>>>>
>>>>>>>>> It looks great!
>>>>>>>>>
>>>>>>>>> And commenting on Sébastien's remarks: I do agree on OpeNDAP... but
>>>>>>>>> since the gateways are incapable of mimicking the p2p way of securing
>>>>>>>>> the aggregations, it is not something the data node admins should
>>>>>>>>> really prioritize at the moment (at least not until it works). This is
>>>>>>>>> how I see it:
>>>>>>>>>
>>>>>>>>> Basic:
>>>>>>>>> -DRS structure in both id and urls (this includes: versioning and
>>>>>>>>> maintaining url/catalog version coherency, more to that later)
>>>>>>>>> -PKI
>>>>>>>>> -Both HTTP and GridFTP server access (BDM gives bonus points, but you
>>>>>>>>> don't need to publish those endpoints in the catalog anyways  :-)
>>>>>>>>> -checksums
>>>>>>>>>
>>>>>>>>> extra:
>>>>>>>>> -OpeNDAP Access (which can be broken for aggregations, since there's
>>>>>>>>> no solution to that at the moment)
>>>>>>>>> -LAS (I have never seen an installation besides the "demo" one with
>>>>>>>>> this, so it can't be a requirement really, not at the moment)
>>>>>>> is that a demo?
>>>>>>>
>>>>>>> http://esg-datanode.jpl.nasa.gov/thredds/esgcet/1/obs4MIPs.NASA-JPL.AIRS
>>>>>>> .mon.v1.html?dataset=obs4MIPs.NASA-JPL.AIRS.mon.husNobs.1.aggregation.1
>>>>>>>
>>>>>>> http://esg-datanode.jpl.nasa.gov/las/getUI.do?catid=893EB2D5C79AD40EE243
>>>>>>> 6A3F118649CE_ns_obs4MIPs.NASA-JPL.AIRS.mon.husNobs.1.aggregation.1
>>>>>>>
>>>>>>> It looks pretty mature.
>>>>>>>
>>>>>>>
>>>>>>>>> why OpeNDAP as an extra? Because at this time, replication is a
>>>>>>>>> priority. You don't want the whole world to get to your OpenDAP
>>>>>>>>> server, it would be advisable to get some replicas in place before
>>>>>>>>> that.
>>>>>>>>>
>>>>>>> replication is the gateways priority. My priority is to have happy
>>>>>>> users. And I know they want OpenDap. CORDEX simulations are running now
>>>>>>> and they need OpenDap to subset their download.
>>>>>>>
>>>>>>> I don't mind the whole world getting to my OpenDAP. We will boost the VM
>>>>>>> as needed to sustain what it takes but OpenDap doesn't consume that
>>>>>>> many
>>>>>>> resources and it saves network bandwidth so it's not a bad deal.
>>>>>>>
>>>>>>> cheers.
>>>>>>> Sébastien
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>> Anyway, I cast my 5 star vote and will use NCI node as an example.
>>>>>>>>> :-)
>>>>>>>>> Well done Muhammad, really.
>>>>>>>>>
>>>>>>>>> Just to show how another node might look, and I won't do this again
>>>>>>>>> any time soon, but I think it's required in order to value a pristine
>>>>>>>>> node more, let's take noaa-gfdl (a middle class one :-):
>>>>>>>>>
>>>>>>>>> esgdata.gfdl.noaa.gov
>>>>>>>>> - No entry in the wiki page, so no admin to contact.
>>>>>>>>> - datasets with mixed cases:
>>>>>>>>>
>>>>>>>>> cmip5.output1.NOAA-GFDL.GFDL-HIRAM-C180.sst2090.mon.atmos.Amon.r3i1p2.
>>>>>>>>> v1/
>>>>>>>>>      -
>>>>>>>>>
>>>>>>>>> cmip5.output1.noaa-gfdl.gfdl-hiram-c180.amip.mon.atmos.Amon.r1i1p1.v1/
>>>>>>>>>
>>>>>>>>> - dataset version and directory version mismatch and half-DRS
>>>>>>>>> structure (this has version 1 in the catalogs):
>>>>>>>>>
>>>>>>>>> thredds/fileServer/gfdl_dataroot/NOAA-GFDL/GFDL-HIRAM-C180/amip/fx/atm
>>>>>>>>> os/fx/r0i0p0/v20110601/areacella/areacella_fx_GFDL-HIRAM-C180_amip_r0i
>>>>>>>>> 0p0.nc
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> - Only HTTPServer access points
>>>>>>>>> - self-signed certificate containing "Globus-Test"
>>>>>>>>> - ORP redirecting to a different machine name (probably same machine,
>>>>>>>>> but still misconfigured)
>>>>>>>>> - White-list is wrong or incomplete
>>>>>>>>> - because of the above PKI is not working
>>>>>>>>> - They do have checksums and that is really good.
>>>>>>>>>
>>>>>>>>> So that's a pretty standard data node which makes replication much
>>>>>>>>> more difficult, if not impossible.
>>>>>>>>>
>>>>>>>>> My 2c anyway,
>>>>>>>>> Estani
>>>>>>>>>
>>>>>>>>> On 13.10.2011 02:40, Muhammad Atif wrote:
>>>>>>>>>>> On 13/10/11 02:50, Estanislao Gonzalez wrote:
>>>>>>>>>>>>> By the way Muhammad, could you clean the datanode? There are a
>>>>>>>>>>>>> lot
>>>>>>>>>>>>> of "unlinked" catalogs:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://esgnode1.nci.org.au/thredds/esgcet/3/cmip5.output1.CSIRO-QC
>>>>>>>>>>>>> CCE.CSIRO-Mk3-6-0.historicalGHG.day.ocean.day.r4i1p1.v20110802.htm
>>>>>>>>>>>>> l
>>>>>>>>>>>>>
>>>>>>>>>>>>> That are returning just 404... I think there's an option for
>>>>>>>>>>>>> this in
>>>>>>>>>>>>> the publisher (delete-orphans, or something) or was that intended
>>>>>>>>>>>>> for something else Bob?
>>>>>>>>>>>>>
>>>>>>>>>>>>> But besides that, your data node looks pristine... version,
>>>>>>>>>>>>> checksum, DRS conform directory structures... even a working
>>>>>>>>>>>>> GridFTP!!
>>>>>>>>>>>>> We should start a 5 star data node "quality meter" for data nodes
>>>>>>>>>>>>> installations... you'll get a 4,5 (clean the 404 up and I'll
>>>>>>>>>>>>> cast my
>>>>>>>>>>>>> 5 star vote ;-)... I think the rest of us starts from 4 and goes
>>>>>>>>>>>>> downwards.... But I might be wrong, apologies for any other
>>>>>>>>>>>>> pristine
>>>>>>>>>>>>> data node out there... if any.
>>>>>>>>>>> Anything to get 5 stars from you Estani. All done.  :)
>>>>>>>>>>>
>>>>>>>>>>> I manually removed the entries from catalog.xml in thredds.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>>
>>>>>>>>>
>>>>>>> -- Sébastien Denvil IPSL, Pôle de modélisation du climat UPMC, Case
>>>>>>> 101, 4 place Jussieu, 75252 Paris Cedex 5 Tour 45-55 2ème étage Bureau
>>>>>>> 209 Tel: 33 1 44 27 21 10 Fax: 33 1 44 27 39 02
>>>>>>> -- Scanned by iCritical.
>>>>> _______________________________________________
>>>>> GO-ESSP-TECH mailing list
>>>>> GO-ESSP-TECH at ucar.edu
>>>>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech


-- 
Estanislao Gonzalez

Max-Planck-Institut für Meteorologie (MPI-M)
Deutsches Klimarechenzentrum (DKRZ) - German Climate Computing Centre
Room 108 - Bundesstrasse 45a, D-20146 Hamburg, Germany

Phone:   +49 (40) 46 00 94-126
E-Mail:  gonzalez at dkrz.de


