[Go-essp-tech] ESG Federation Priorities - Was: NCI OpenIDs not working at PCMDI Gateway

philip.kershaw at stfc.ac.uk philip.kershaw at stfc.ac.uk
Mon Oct 17 07:19:38 MDT 2011


Hi Luca,

Yes, that sounds a good first step: front the LAS with an ORP filter.  The
delegated credential generation could be implemented as an additional
filter.

Cheers,
Phil 
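
A rough illustration of the two-filter arrangement described above: one
filter redirects unauthenticated users to the ORP, and a second filter
attaches a delegated credential before the LAS runs. This is only a sketch
in Python, not the esg-orp implementation (which is a Java servlet filter).
The ORP URL, the "redirect" query parameter, the Request/Response classes
and the credential issuer are all hypothetical names made up for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional
from urllib.parse import urlencode

# Placeholder address; a real deployment would use its own ORP endpoint.
ORP_URL = "https://orp.example.org/esg-orp/home.htm"


@dataclass
class Request:
    path: str
    session: Dict[str, str] = field(default_factory=dict)
    delegated_credential: Optional[str] = None


@dataclass
class Response:
    status: int
    body: str = ""
    location: Optional[str] = None


Handler = Callable[[Request], Response]


def authentication_filter(next_handler: Handler) -> Handler:
    """Redirect to the ORP unless the session already carries an OpenID."""
    def handle(request: Request) -> Response:
        if "openid" not in request.session:
            # "redirect" is an assumed parameter name for the return path.
            query = urlencode({"redirect": request.path})
            return Response(status=302, location=f"{ORP_URL}?{query}")
        return next_handler(request)
    return handle


def delegation_filter(issue: Callable[[str], str]) -> Callable[[Handler], Handler]:
    """Attach a delegated credential for the authenticated user."""
    def wrap(next_handler: Handler) -> Handler:
        def handle(request: Request) -> Response:
            request.delegated_credential = issue(request.session["openid"])
            return next_handler(request)
        return handle
    return wrap


def las_app(request: Request) -> Response:
    """Stand-in for the LAS; it would use the credential to call OPeNDAP."""
    return Response(200, body=f"OPeNDAP call as {request.delegated_credential}")


def fake_issuer(openid: str) -> str:
    """Hypothetical issuer; a real one would contact a delegation service."""
    return f"delegated-cred-for:{openid}"


# Authentication runs first, then delegation, then the application.
chain = authentication_filter(delegation_filter(fake_issuer)(las_app))
```

Keeping delegation as a separate filter means the authentication step can be
deployed first and the credential step added later without touching the LAS.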

On 17/10/2011 13:07, "Cinquini, Luca (3880)" <Luca.Cinquini at jpl.nasa.gov>
wrote:

>Hi Phil,
>        we are not using static server certs at the moment, only a list
>of IPs from the registry.
>
>I remember your talk at the last go-essp meeting, and it would certainly
>be good to move in that direction; it's just a question of priorities.
>In any case, the very first step should be to secure the LAS UI, i.e. to
>redirect to the ORP so that the user is authenticated, am I right?
>Then, once the user is authenticated, obtain a delegated credential to
>access opendap services.
>
>thanks, Luca
>
>On Oct 17, 2011, at 1:09 AM, <philip.kershaw at stfc.ac.uk> wrote:
>
>> Hi Luca,
>>
>> I think you're saying then that the LAS - OPeNDAP connections are
>> secured with IP restrictions.  I recall an initial solution was to use
>> static server certificates.  Did this get deployed or are there any
>> plans to develop your current system further?
>>
>> For the MashMyData project here, we extended ESGF security to enable
>> user delegation for secured workflows: portal to WPS to OPeNDAP
>> service.  You could do the same in the case above to get a LAS instance
>> to use a delegated credential to access a secured OPeNDAP service.  We
>> are using this approach on a couple of projects here.
>>
>> Cheers,
>> Phil
>>
>> On 16/10/2011 14:38, "Cinquini, Luca (3880)"
>> <Luca.Cinquini at jpl.nasa.gov> wrote:
>>
>>> Hi Eric:
>>>
>>> On Oct 14, 2011, at 9:32 AM, Eric Nienhouse wrote:
>>>
>>>> Hi All,
>>>>
>>>> Our NCI OpenID thread was getting rather off topic, so I've started a
>>>> new one.
>>>>
>>>> Good to hear the NCI OpenID issue has been resolved and that the NCI
>>>> node has received a number of accolades for quality service  :-)
>>>>
>>>> I'd like to continue discussing federation priorities, development
>>>> efforts and  replication.  Thanks Stephen and Gavin for summarizing a
>>>> number of efforts in support of data access across the federation,
>>>> including securing OpenDAP, LAS Product Services and replication.
>>>>
>>>> It is most important we all stay focused on interoperability, system
>>>> interfaces and specifications as we move forward.  I believe this is
>>>> especially critical now as many federation efforts are at a high
>>>> activity level.
>>>>
>>>> It's obvious the success and stability of the production ESGF system
>>>> serving a large user base is critical as many users are preparing for
>>>> near term scientific reporting deadlines.  Note, fed wide, we have
>>>> ~25K users, many of whom are active CMIP5 researchers.  Published
>>>> dataset volume and user downloads are rapidly increasing.
>>>>
>>>> To this end I have the questions/comments below.
>>>>
>>>> Regards to all,
>>>>
>>>> -Eric
>>>>>
>>>>>> It's about people being able to download from multiple sites at
>>>>>> the same time, and especially from a local one.
>>>>>> That's pretty much what is happening at IPSL; AFAIK you are indeed
>>>>>> replicating data internally so scientists can get to them much
>>>>>> faster.  That's the whole idea of replication.
>>>>>>
>>>>>>
>>>>> There is a replication mechanism in the works - are you volunteering
>>>>> to get this bit of work completed?
>>>> Gavin: A couple of questions about this replication mechanism in the
>>>> works regarding interoperability:
>>>>
>>>> Will this work have an impact on the THREDDS catalog representation of
>>>> replica datasets?  Are you anticipating any changes to the replica
>>>> publication workflow?  I ask as we're working on search scalability,
>>>> metadata transfer and replicas.
>>>>
>>>>> LAS is fully installed and integrated into the ESGF P2P Node.
>>>>> As Sebastien noted with the LAS URLs this task has been done.
>>>>>
>>>>> If you install your ESGF P2P Node with --type compute you will get
>>>>> this configured and installed and you too can provide LAS
>>>>> functionality. :-)  Try it out :-)
>>>>
>>>> Indeed LAS Product Service integration is getting uptake, which is
>>>> good to see.  We're publishing NCAR CMIP5 datasets with LAS endpoints
>>>> into the Gateway 1.3.3 snapshot for pre-release testing.  LAS is a
>>>> great service for visualization and data subset and download.
>>>>
>>>> One concern here at NCAR relates to securing LAS access to CMIP5
>>>> datasets in our production data node.  My understanding is that LAS
>>>> services are not yet under access control in the (compute) node.
>>>>
>>>> Is this correct?  If so, what are the plans for securing this service?
>>>> Is the intention to utilize the OpenDAP security mechanism for doing
>>>> so?
>>>
>>> Correct - right now, LAS is granted access to the OPeNDAP endpoints via
>>> the IP filter. At some point, we started working with PMEL to enable
>>> the LAS UI to redirect to the ORP in case the user is not already
>>> authenticated, but that work was never completed. We can talk about
>>> picking it up at one of our upcoming conferences.
>>>
>>> thanks, Luca
>>>
>>>>
>>>> Thanks for any details you can provide.
>>>>
>>>> On 13/10/2011 13:02, stephen.pascoe at stfc.ac.uk wrote:
>>>>> Sébastien and all,
>>>>>
>>>>> I agree getting all those services in place at one time is the
>>>>> target.  It is challenging that different parts of the federation
>>>>> have different priorities, and it's hard work to keep all the
>>>>> different parts in sync.  Some of us need OPeNDAP straight away, some
>>>>> need CIM metadata, some need GridFTP and checksums (for replication),
>>>>> some want visualisation (LAS).  All I can do now is mention a few
>>>>> areas where we are making progress.
>>>>>
>>>>> OPeNDAP.  I know our OPeNDAP security is broken at present but we've
>>>>> just spent some contractor time figuring out the problem, and we have
>>>>> just pushed the result to esg-orp.git's devel branch.  This turns
>>>>> several hacks that make OPeNDAP work into configurable options.
>>>>>
>>>>> We have also contributed the TDS security testing tool in
>>>>> esg-contrib.git.  Some initial tests show that JPL is the one place
>>>>> where OPeNDAP is working and correctly secured.  At NCI the OPeNDAP
>>>>> aggregations weren't accessible for datasets where the NetCDF was.
>>>>> Unless you are using the latest esg-orp filters it is likely the
>>>>> OPeNDAP URLs are not correctly secured.  There is also a loophole
>>>>> where, if NetCDF files are in a thredds_dataset_root but not
>>>>> explicitly restricted in a THREDDS catalog, they can be downloaded.
>>>>> We hope that the work in esg-orp.git will allow us to close this.
>>>>>
>>>>> A major bottleneck for us is the time it takes to make
>>>>> AttributeService requests to PCMDI.  We are putting in place a
>>>>> caching AuthorizationService that will reduce AttributeService
>>>>> callouts and should make downloads quicker for both MOHC and IPSL
>>>>> data.  We are also getting end-user configured GridFTP ready for
>>>>> production so that users with large data requirements can start using
>>>>> that.
>>>>>
>>>>> So lots is happening and I embrace a competitive spirit amongst
>>>>> datanodes and gateways to get this right.
>>>>>
>>>>> And a quick query to Sébastien
>>>>>
>>>>>
>>>>>>> Replication is the gateways' priority. My priority is to have happy
>>>>>>> users. And I know they want OPeNDAP. CORDEX simulations are running
>>>>>>> now and they need OPeNDAP to subset their download.
>>>>>>
>>>>>
>>>>> Are your users happy with their access to data from the USA?  Are USA
>>>>> scientists happy with their access to IPSL data?  To be honest we
>>>>> know BADC has a particular problem with bandwidth but I'd be surprised
>>>>> if replication wasn't going to help these users.
>>>>>
>>>>> Cheers,
>>>>> Stephen.
>>>>>
>>>>> ---
>>>>> Stephen Pascoe  +44 (0)1235 445980
>>>>> Centre of Environmental Data Archival
>>>>> STFC Rutherford Appleton Laboratory, Harwell Oxford, Didcot OX11 0QX,
>>>>> UK
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Sébastien Denvil [mailto:sebastien.denvil at ipsl.jussieu.fr]
>>>>> Sent: 13 October 2011 10:07
>>>>> To: Estanislao Gonzalez
>>>>> Cc: muhammad.atif at anu.edu.au; Eric Nienhouse; Cinquini, Luca (3880);
>>>>> Pascoe, Stephen (STFC,RAL,RALSP); Neill Miller;
>>>>> esg-gateway-dev at earthsystemgrid.org; esg-node-dev at lists.llnl.gov
>>>>> Subject: Re: [esg-node-dev] RE: [esg-gateway-dev] NCI OpenIDs not
>>>>> working at PCMDI Gateway
>>>>>
>>>>> Hi all, Estani,
>>>>>
>>>>> just a small comment below:
>>>>>
>>>>> On 13/10/2011 10:12, Estanislao Gonzalez wrote:
>>>>>
>>>>>>> Hi Muhammad,
>>>>>>>
>>>>>>> It looks great!
>>>>>>>
>>>>>>> And commenting on Sébastien's remarks: I do agree on OPeNDAP... but
>>>>>>> since the gateways are incapable of mimicking the p2p way of
>>>>>>> securing the aggregations, it is not something the data node admins
>>>>>>> should really prioritize at the moment (at least not until it
>>>>>>> works). This is how I see it:
>>>>>>>
>>>>>>> Basic:
>>>>>>> -DRS structure in both id and urls (this includes: versioning and
>>>>>>> maintaining url/catalog version coherency, more on that later)
>>>>>>> -PKI
>>>>>>> -Both HTTP and GridFTP server access (BDM gives bonus points, but
>>>>>>> you don't need to publish those endpoints in the catalog anyways  :-)
>>>>>>> -checksums
>>>>>>>
>>>>>>> Extra:
>>>>>>> -OPeNDAP access (which can be broken for aggregations, since
>>>>>>> there's no solution to that at the moment)
>>>>>>> -LAS (I have never seen an installation besides the "demo" one with
>>>>>>> this, so it can't be a requirement really, not at the moment)
>>>>>>
>>>>>
>>>>> Is that a demo?
>>>>>
>>>>> http://esg-datanode.jpl.nasa.gov/thredds/esgcet/1/obs4MIPs.NASA-JPL.AIRS.mon.v1.html?dataset=obs4MIPs.NASA-JPL.AIRS.mon.husNobs.1.aggregation.1
>>>>>
>>>>> http://esg-datanode.jpl.nasa.gov/las/getUI.do?catid=893EB2D5C79AD40EE2436A3F118649CE_ns_obs4MIPs.NASA-JPL.AIRS.mon.husNobs.1.aggregation.1
>>>>>
>>>>> It looks pretty mature.
>>>>>
>>>>>
>>>>>>>
>>>>>>> Why OPeNDAP as an extra? Because at this time, replication is a
>>>>>>> priority. You don't want the whole world to get to your OPeNDAP
>>>>>>> server; it would be advisable to get some replicas in place before
>>>>>>> that.
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> Replication is the gateways' priority. My priority is to have happy
>>>>> users. And I know they want OPeNDAP. CORDEX simulations are running
>>>>> now and they need OPeNDAP to subset their download.
>>>>>
>>>>> I don't mind the whole world getting to my OPeNDAP. We will boost the
>>>>> VM as needed to sustain what it takes, but OPeNDAP doesn't consume
>>>>> that many resources and it saves network bandwidth, so it's not a bad
>>>>> deal.
>>>>>
>>>>> cheers.
>>>>> Sébastien
>>>>>
>>>>>
>>>>>
>>>>>>> Anyway, I cast my 5 star vote and will use NCI node as an example.
>>>>>>> :-)
>>>>>>> Well done Muhammad, really.
>>>>>>>
>>>>>>> Just to show how another node might look (and I won't do this again
>>>>>>> any time soon, but I think it's required in order to value a
>>>>>>> pristine node more), let's take noaa-gfdl (a middle class one :-):
>>>>>>>
>>>>>>> esgdata.gfdl.noaa.gov
>>>>>>> - No entry in the wiki page, so no admin to contact.
>>>>>>> - datasets with mixed cases:
>>>>>>>     - cmip5.output1.NOAA-GFDL.GFDL-HIRAM-C180.sst2090.mon.atmos.Amon.r3i1p2.v1/
>>>>>>>     - cmip5.output1.noaa-gfdl.gfdl-hiram-c180.amip.mon.atmos.Amon.r1i1p1.v1/
>>>>>>>
>>>>>>> - dataset version and directory version mismatch and half-DRS
>>>>>>> structure (this has version 1 in the catalogs):
>>>>>>>
>>>>>>> thredds/fileServer/gfdl_dataroot/NOAA-GFDL/GFDL-HIRAM-C180/amip/fx/atmos/fx/r0i0p0/v20110601/areacella/areacella_fx_GFDL-HIRAM-C180_amip_r0i0p0.nc
>>>>>>>
>>>>>>>
>>>>>>> - Only HTTPServer access points
>>>>>>> - self-signed certificate containing "Globus-Test"
>>>>>>> - ORP redirecting to a different machine name (probably same
>>>>>>> machine, but still misconfigured)
>>>>>>> - White-list is wrong or incomplete
>>>>>>> - because of the above PKI is not working
>>>>>>> - They do have checksums and that is really good.
>>>>>>>
>>>>>>> So that's a pretty standard data node which makes replication much
>>>>>>> more difficult, if not impossible.
>>>>>>>
>>>>>>> My 2c anyway,
>>>>>>> Estani
>>>>>>>
>>>>>>> Am 13.10.2011 02:40, schrieb Muhammad Atif:
>>>>>>
>>>>>>>>> On 13/10/11 02:50, Estanislao Gonzalez wrote:
>>>>>>>
>>>>>>>>>>> By the way Muhammad, could you clean the datanode? There are a
>>>>>>>>>>> lot of "unlinked" catalogs:
>>>>>>>>>>>
>>>>>>>>>>> http://esgnode1.nci.org.au/thredds/esgcet/3/cmip5.output1.CSIRO-QCCCE.CSIRO-Mk3-6-0.historicalGHG.day.ocean.day.r4i1p1.v20110802.html
>>>>>>>>>>>
>>>>>>>>>>> These are returning just 404... I think there's an option for
>>>>>>>>>>> this in the publisher (delete-orphans, or something), or was
>>>>>>>>>>> that intended for something else, Bob?
>>>>>>>>>>>
>>>>>>>>>>> But besides that, your data node looks pristine... version,
>>>>>>>>>>> checksum, DRS-conformant directory structures... even a working
>>>>>>>>>>> GridFTP!!
>>>>>>>>>>> We should start a 5 star data node "quality meter" for data
>>>>>>>>>>> node installations... you'll get a 4.5 (clean the 404s up and
>>>>>>>>>>> I'll cast my 5 star vote ;-)... I think the rest of us start
>>>>>>>>>>> from 4 and go downwards.... But I might be wrong, apologies to
>>>>>>>>>>> any other pristine data node out there... if any.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Anything to get 5 stars from you Estani. All done.  :)
>>>>>>>>>
>>>>>>>>> I manually removed the entries from catalog.xml in thredds.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> -- Sébastien Denvil IPSL, Pôle de modélisation du climat UPMC, Case
>>>>> 101, 4 place Jussieu, 75252 Paris Cedex 5 Tour 45-55 2ème étage
>>>>> Bureau 209 Tel: 33 1 44 27 21 10 Fax: 33 1 44 27 39 02
>>>>> -- Scanned by iCritical.
>>>>
>>>
>>> _______________________________________________
>>> GO-ESSP-TECH mailing list
>>> GO-ESSP-TECH at ucar.edu
>>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>>
>
