[Go-essp-tech] ESG Federation Priorities - Was: NCI OpenIDs not working at PCMDI Gateway

Cinquini, Luca (3880) Luca.Cinquini at jpl.nasa.gov
Mon Oct 17 06:07:03 MDT 2011


Hi Phil,
        we are not using static server certs at the moment, only a list of IPs from the registry.

I remember your talk at the last go-essp meeting, and it would certainly be good to move in that direction; it's just a question of priorities.
In any case, the very first step should be to secure the LAS UI, i.e. to redirect to the ORP so that the user is authenticated, am I right?
Then, once the user is authenticated, obtain a delegated credential to access opendap services.
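
As a rough illustration only, the current IP filter and the two proposed steps could be sketched as below. This is a minimal sketch, not the actual LAS or ORP code: the registry IP list, the ORP URL, the "redirect" query parameter, the session keys and the function names are all hypothetical.

```python
import ipaddress
from urllib.parse import urlencode

# Hypothetical values, for illustration only (not real ESGF configuration).
REGISTRY_IPS = ["192.0.2.10", "198.51.100.0/24"]
ORP_URL = "https://orp.example.org/esg-orp/home.htm"


def ip_allowed(client_ip, allowed=REGISTRY_IPS):
    """Current approach: accept a caller only if its IP is in the registry list."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(net) for net in allowed)


def handle_las_request(session, requested_url):
    """Proposed approach: authenticate first, then use a delegated credential.

    Returns an (action, target) pair describing what the LAS UI would do next.
    """
    # Step 1: an unauthenticated user is sent to the ORP.
    if not session.get("openid"):
        query = urlencode({"redirect": requested_url})  # parameter name is assumed
        return ("redirect", ORP_URL + "?" + query)
    # Step 2: an authenticated user without a delegated credential must get one.
    if session.get("delegated_credential") is None:
        return ("delegate", requested_url)
    # Otherwise the OPeNDAP request can be made on the user's behalf.
    return ("forward", requested_url)
```

The point of the second function is that the access decision follows the user rather than the calling server's address, which is what the IP filter cannot express.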

thanks, Luca

On Oct 17, 2011, at 1:09 AM, <philip.kershaw at stfc.ac.uk> wrote:

> Hi Luca,
>
> I think you're saying then that the LAS - OPeNDAP connections are secured
> with IP restrictions.  I recall an initial solution was to use static
> server certificates.  Did this get deployed or are there any plans to
> develop your current system further?
>
> For the MashMyData project here, we extended ESGF security to enable user
> delegation for secured workflows: portal to WPS to OPeNDAP service.  You
> could apply the same approach here to get a LAS instance to use a delegated
> credential to access a secured OPeNDAP service.  We are using this
> approach on a couple of projects here.
>
> Cheers,
> Phil
>
> On 16/10/2011 14:38, "Cinquini, Luca (3880)" <Luca.Cinquini at jpl.nasa.gov>
> wrote:
>
>> Hi Eric:
>>
>> On Oct 14, 2011, at 9:32 AM, Eric Nienhouse wrote:
>>
>>> Hi All,
>>>
>>> Our NCI OpenID thread was getting rather off topic, so I've started a
>>> new one.
>>>
>>> Good to hear the NCI OpenID issue has been resolved and that the NCI
>>> node has received a number of accolades for quality service  :-)
>>>
>>> I'd like to continue discussing federation priorities, development
>>> efforts and  replication.  Thanks Stephen and Gavin for summarizing a
>>> number of efforts in support of data access across the federation,
>>> including securing OpenDAP, LAS Product Services and replication.
>>>
>>> It is most important we all stay focused on interoperability, system
>>> interfaces and specifications as we move forward.  I believe this is
>>> especially critical now as many federation efforts are at a high level
>>> of activity.
>>>
>>> It's obvious the success and stability of the production ESGF system
>>> serving a large user base is critical as many users are preparing for
>>> near term scientific reporting deadlines.  Note, fed wide, we have ~25K
>>> users, many of whom are active CMIP5 researchers.  Published dataset
>>> volume and user downloads are rapidly increasing.
>>>
>>> To this end I have the questions/comments below.
>>>
>>> Regards to all,
>>>
>>> -Eric
>>>>
>>>>> It's about people being able to download from multiple sites at the
>>>>> same time, and especially from a local one.
>>>>> That's pretty much what is happening at IPSL; AFAIK you are indeed
>>>>> replicating data internally so scientists can get to them much faster.
>>>>> That's the whole idea of replication.
>>>>>
>>>>>
>>>> There is a replication mechanism in the works - are you volunteering
>>>> to get this bit of work completed?
>>> Gavin: A couple of questions about this replication mechanism in the
>>> works regarding interoperability:
>>>
>>> Will this work have impact on the Thredds catalog representation of
>>> replica datasets?  Are you anticipating any changes to the replica
>>> publication workflow?  I ask as we're working on search scalability,
>>> metadata transfer and replicas.
>>>
>>>> LAS is fully installed and integrated into the ESGF P2P Node.
>>>> As Sebastien noted with the LAS URLs this task has been done.
>>>>
>>>> If you install your ESGF P2P Node with --type compute you will get
>>>> this configured and installed and you too can provide LAS
>>>> functionality. :-)  Try it out :-)
>>>
>>> Indeed LAS Product Service integration is getting uptake, which is good
>>> to see.  We're publishing NCAR CMIP5 datasets with LAS endpoints into
>>> the Gateway 1.3.3 snapshot for pre-release testing.  LAS is a great
>>> service for visualization and data subset and download.
>>>
>>> One concern here at NCAR relates to securing LAS access to CMIP5
>>> datasets in our production data node.  My understanding is that LAS
>>> services are not yet under access control in the (compute) node.
>>>
>>> Is this correct?  If so, what are the plans for securing this service?
>>> Is the intention to utilize the OpenDAP security mechanism for doing so?
>>
>> correct - right now, LAS is granted access to the opendap endpoints via
>> the IP filter. At some point,
>> we started working with PMEL to enable the LAS UI to be able to redirect
>> to the ORP, in case the user is not authenticated already,
>> but that work was never completed. We can talk about picking it up at one
>> of our upcoming conferences.
>>
>> thanks, Luca
>>
>>>
>>> Thanks for any details you can provide.
>>>
>>> On 13/10/2011 13:02, stephen.pascoe at stfc.ac.uk wrote:
>>>> Sébastien and all,
>>>>
>>>> I agree getting all those services in place at one time is the target.
>>>> It is challenging that different parts of the federation have different
>>>> priorities and it's hard work to keep all the different parts in sync.
>>>> Some of us need OPeNDAP straight away, some need CIM metadata, some
>>>> need GridFTP and checksums (for replication), some want visualisation
>>>> (LAS).  All I can do now is mention a few areas where we are making
>>>> progress.
>>>>
>>>> OPeNDAP.  I know our OPeNDAP security is broken at present but we've
>>>> just spent some contractor time figuring out the problem, and we have
>>>> just pushed the fix to esg-orp.git's devel branch.  This turns several
>>>> hacks that make OPeNDAP work into configurable options.
>>>>
>>>> We have also contributed the TDS security testing tool in
>>>> esg-contrib.git.  Some initial tests show that JPL is the one place
>>>> where OPeNDAP is working and correctly secured.  At NCI the OPeNDAP
>>>> aggregations weren't accessible for datasets where the NetCDF was.
>>>> Unless you are using the latest esg-orp filters it is likely the
>>>> OPeNDAP URLs are not correctly secured.  There is also a loophole where
>>>> if NetCDF files are in a thredds_dataset_root but not explicitly
>>>> restricted in a THREDDS catalog they can be downloaded.  We hope that
>>>> the work in esg-orp.git will allow us to close this.
>>>>
>>>> A major bottleneck for us is the time it takes to make
>>>> AttributeService requests to PCMDI.  We are putting in place a caching
>>>> AuthorizationService that will reduce AttributeService callouts and
>>>> should make downloads quicker for both MOHC and IPSL data.  We are also
>>>> getting end-user configured GridFTP ready for production so that users
>>>> with large data requirements can start using that.
>>>>
>>>> So lots is happening and I embrace a competitive spirit amongst
>>>> datanodes and gateways to get this right.
>>>>
>>>> And a quick query to Sébastien
>>>>
>>>>
>>>>>> replication is the gateways priority. My priority is to have happy
>>>>>> users. And I know they want OpenDap. CORDEX simulations are running
>>>>>> now
>>>>>> and they need OpenDap to subset their download.
>>>>>
>>>>
>>>> Are your users happy with their access to data from the USA?  Are USA
>>>> scientists happy with their access to IPSL data?  To be honest we know
>>>> BADC has a particular problem with bandwidth but I'd be surprised if
>>>> replication wasn't going to help these users.
>>>>
>>>> Cheers,
>>>> Stephen.
>>>>
>>>> ---
>>>> Stephen Pascoe  +44 (0)1235 445980
>>>> Centre of Environmental Data Archival
>>>> STFC Rutherford Appleton Laboratory, Harwell Oxford, Didcot OX11 0QX,
>>>> UK
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Sébastien Denvil [mailto:sebastien.denvil at ipsl.jussieu.fr]
>>>> Sent: 13 October 2011 10:07
>>>> To: Estanislao Gonzalez
>>>> Cc: muhammad.atif at anu.edu.au; Eric Nienhouse; Cinquini, Luca (3880);
>>>> Pascoe, Stephen (STFC,RAL,RALSP); Neill Miller;
>>>> esg-gateway-dev at earthsystemgrid.org; esg-node-dev at lists.llnl.gov
>>>> Subject: Re: [esg-node-dev] RE: [esg-gateway-dev] NCI OpenIDs not
>>>> working at PCMDI Gateway
>>>>
>>>> Hi all, Estani,
>>>>
>>>> just a small comment below:
>>>>
>>>> On 13/10/2011 10:12, Estanislao Gonzalez wrote:
>>>>
>>>>>> Hi Muhammad,
>>>>>>
>>>>>> It looks great!
>>>>>>
>>>>>> Commenting on Sébastien's remarks: I do agree on OpeNDAP... but since
>>>>>> the gateways are incapable of mimicking the p2p way of securing the
>>>>>> aggregations, it is not something the data node admins should really
>>>>>> prioritize at the moment (at least not until it works). This is how I
>>>>>> see it:
>>>>>>
>>>>>> Basic:
>>>>>> -DRS structure in both id and urls (this includes: versioning and
>>>>>> maintaining url/catalog version coherency, more on that later)
>>>>>> -PKI
>>>>>> -Both HTTP and GridFTP server access (BDM gives bonus points, but you
>>>>>> don't need to publish those endpoints in the catalog anyways  :-)
>>>>>> -checksums
>>>>>>
>>>>>> extra:
>>>>>> -OpeNDAP Access (which can be broken for aggregations, since there's
>>>>>> no solution to that at the moment)
>>>>>> -LAS (I have never seen an installation besides the "demo" one with
>>>>>> this, so it can't be a requirement really, not at the moment)
>>>>>
>>>>
>>>> is that a demo?
>>>>
>>>> http://esg-datanode.jpl.nasa.gov/thredds/esgcet/1/obs4MIPs.NASA-JPL.AIRS.mon.v1.html?dataset=obs4MIPs.NASA-JPL.AIRS.mon.husNobs.1.aggregation.1
>>>>
>>>> http://esg-datanode.jpl.nasa.gov/las/getUI.do?catid=893EB2D5C79AD40EE2436A3F118649CE_ns_obs4MIPs.NASA-JPL.AIRS.mon.husNobs.1.aggregation.1
>>>>
>>>> It looks pretty mature.
>>>>
>>>>
>>>>>>
>>>>>> why OpeNDAP as an extra? Because at this time, replication is a
>>>>>> priority. You don't want the whole world to get to your OpenDAP
>>>>>> server, it would be advisable to get some replicas in place before
>>>>>> that.
>>>>>>
>>>>>
>>>>
>>>>
>>>> replication is the gateways priority. My priority is to have happy
>>>> users. And I know they want OpenDap. CORDEX simulations are running now
>>>> and they need OpenDap to subset their download.
>>>>
>>>> I don't mind the whole world getting to my OpenDAP. We will boost the VM
>>>> as needed to sustain what it takes, but OpenDap doesn't consume that
>>>> many resources and it saves network bandwidth, so it's not a bad deal.
>>>>
>>>> cheers.
>>>> Sébastien
>>>>
>>>>
>>>>
>>>>>> Anyway, I cast my 5 star vote and will use NCI node as an example.
>>>>>> :-)
>>>>>> Well done Muhammad, really.
>>>>>>
>>>>>> Just to show how another node might look (and I won't do this again
>>>>>> any time soon, but I think it's required in order to value a pristine
>>>>>> node more), let's take noaa-gfdl (a middle class one :-):
>>>>>>
>>>>>> esgdata.gfdl.noaa.gov
>>>>>> - No entry in the wiki page, so no admin to contact.
>>>>>> - datasets with mixed cases:
>>>>>>     - cmip5.output1.NOAA-GFDL.GFDL-HIRAM-C180.sst2090.mon.atmos.Amon.r3i1p2.v1/
>>>>>>     - cmip5.output1.noaa-gfdl.gfdl-hiram-c180.amip.mon.atmos.Amon.r1i1p1.v1/
>>>>>>
>>>>>> - dataset version and directory version mismatch and half-DRS
>>>>>> structure (this has version 1 in the catalogs):
>>>>>>     thredds/fileServer/gfdl_dataroot/NOAA-GFDL/GFDL-HIRAM-C180/amip/fx/atmos/fx/r0i0p0/v20110601/areacella/areacella_fx_GFDL-HIRAM-C180_amip_r0i0p0.nc
>>>>>>
>>>>>>
>>>>>> - Only HTTPServer access points
>>>>>> - self-signed certificate containing "Globus-Test"
>>>>>> - ORP redirecting to a different machine name (probably same machine,
>>>>>> but still misconfigured)
>>>>>> - White-list is wrong or incomplete
>>>>>> - because of the above PKI is not working
>>>>>> - They do have checksums and that is really good.
>>>>>>
>>>>>> So that's a pretty standard data node which makes replication much
>>>>>> more difficult, if not impossible.
>>>>>>
>>>>>> My 2c anyway,
>>>>>> Estani
>>>>>>
>>>>>> Am 13.10.2011 02:40, schrieb Muhammad Atif:
>>>>>
>>>>>>>> On 13/10/11 02:50, Estanislao Gonzalez wrote:
>>>>>>
>>>>>>>>>> By the way Muhammad, could you clean the datanode? There are a lot
>>>>>>>>>> of "unlinked" catalogs:
>>>>>>>>>>
>>>>>>>>>> http://esgnode1.nci.org.au/thredds/esgcet/3/cmip5.output1.CSIRO-QCCCE.CSIRO-Mk3-6-0.historicalGHG.day.ocean.day.r4i1p1.v20110802.html
>>>>>>>>>>
>>>>>>>>>> These are just returning 404... I think there's an option for this in
>>>>>>>>>> the publisher (delete-orphans, or something) or was that intended
>>>>>>>>>> for something else, Bob?
>>>>>>>>>>
>>>>>>>>>> But besides that, your data node looks pristine... version,
>>>>>>>>>> checksum, DRS-conformant directory structures... even a working
>>>>>>>>>> GridFTP!!
>>>>>>>>>> We should start a 5 star data node "quality meter" for data node
>>>>>>>>>> installations... you'll get a 4.5 (clean the 404s up and I'll cast my
>>>>>>>>>> 5 star vote ;-)... I think the rest of us start from 4 and go
>>>>>>>>>> downwards.... But I might be wrong, apologies to any other pristine
>>>>>>>>>> data node out there... if any.
>>>>>>>
>>>>>>>>
>>>>>>>> Anything to get 5 stars from you Estani. All done.  :)
>>>>>>>>
>>>>>>>> I manually removed the entries from catalog.xml in thredds.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> -- Sébastien Denvil IPSL, Pôle de modélisation du climat UPMC, Case
>>>> 101, 4 place Jussieu, 75252 Paris Cedex 5 Tour 45-55 2ème étage Bureau
>>>> 209 Tel: 33 1 44 27 21 10 Fax: 33 1 44 27 39 02
>>>> -- Scanned by iCritical.
>>>
>>
>> _______________________________________________
>> GO-ESSP-TECH mailing list
>> GO-ESSP-TECH at ucar.edu
>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>


