[Go-essp-tech] GO-ESSP call Jan 10, 2012

Eric Nienhouse ejn at ucar.edu
Wed Jan 18 06:20:53 MST 2012


Hi Estani,

Attached is a listing of unresolved catalogs from recent federation 
testing.  The affected MPI datasets in particular are:

cmip5.output1.MPI-M.MPI-ESM-LR.rcp85.6hr.atmos.6hrPl
cmip5.output1.MPI-M.MPI-ESM-LR.rcp85.day.atmos.day.r3i1p1
cmip5.output1.MPI-M.MPI-ESM-LR.rcp85.day.land.day.r3i1p1
cmip5.output1.MPI-M.MPI-ESM-LR.rcp85.day.landIce.day.r3i1p1
cmip5.output1.MPI-M.MPI-ESM-LR.rcp85.day.ocean.day.r3i1p1
cmip5.output1.MPI-M.MPI-ESM-LR.rcp85.day.seaIce.day.r3i1p1
cmip5.output1.MPI-M.MPI-ESM-LR.rcp85.mon.landIce.LImon.r3i1p1
cmip5.output1.MPI-M.MPI-ESM-LR.rcp85.mon.ocnBgchem.Omon.r3i1p1
cmip5.output1.MPI-M.MPI-ESM-LR.rcp85.mon.seaIce.OImon.r3i1p1
cmip5.output1.MPI-M.MPI-ESM-LR.rcp85.yr.ocnBgchem.Oyr.r3i1p1
cmip5.output2.MPI-M.MPI-ESM-LR.rcp85.mon.ocean.Omon.r3i1p1
cmip5.output2.MPI-M.MPI-ESM-LR.rcp85.yr.ocnBgchem.Oyr.r3i1p1

In the cases above it looks like the Thredds catalogs have moved from a 
"/6" directory to "/1".  They likely just need to be re-retrieved at the 
gateway via esgpublish --publish.  Thanks!
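
As a rough illustration of how a listing like the attached could be
produced, here is a minimal Python sketch.  It assumes only that each
catalog is reachable at a plain HTTP URL; the function names
(catalog_resolves, find_unresolved) are hypothetical and are not part of
esgpublish or the Gateway.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def catalog_resolves(url, timeout=10):
    """Return True if an HTTP HEAD request for the catalog URL succeeds.

    Assumption: a 2xx response means the catalog still exists at that path.
    """
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (HTTPError, URLError, OSError, ValueError):
        # Missing catalogs, unreachable hosts and malformed URLs are all
        # treated as "unresolved" in this sketch.
        return False


def find_unresolved(catalog_urls, probe=catalog_resolves):
    """Return the URLs for which the probe reports failure, in input order.

    The probe is injectable so the check can be exercised without network
    access.
    """
    return [url for url in catalog_urls if not probe(url)]
```

Passing the catalog URLs currently referenced by a gateway to
find_unresolved would return the ones that need to be republished.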

Note that there are similar cases at CCCMA and CNRM-CERFACS.

It is a good idea to publish this information as you suggest.  For now, 
keeping an up-to-date list like the attached (ideally with annotations) 
on the ESGF Wiki may be the best approach.  I'll look for a suitable 
spot to place this and post it.  (Though we're piling a lot on the CMIP5 
status page as it is.)

The Gateway 2.0 logs messages about these types of inconsistencies when 
harvesting the federation.  We've also been comparing search results 
across the federation by capturing the OAI feeds and search UI output.  
This work was primarily done to test and validate 2.0 search and 
involves manual intervention at this point.
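
The comparison step itself is simple once the dataset identifiers from
two sources have been captured.  A minimal Python sketch follows; it
assumes the identifiers are already available as plain strings (how they
are extracted from the OAI feeds or search UI output is not shown), and
the function name compare_listings is hypothetical.

```python
def compare_listings(reference_ids, candidate_ids):
    """Compare two collections of dataset identifiers.

    Returns a dict with:
      "missing": ids present in the reference but absent from the candidate
      "extra":   ids present in the candidate but absent from the reference
    Both lists are sorted so the report is stable between runs.
    """
    reference = set(reference_ids)
    candidate = set(candidate_ids)
    return {
        "missing": sorted(reference - candidate),
        "extra": sorted(candidate - reference),
    }
```

With the 1.3.4 results as the reference and the 2.0 results as the
candidate, the "missing" list would hold the datasets that 2.0 does not
show.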

Regards,

-Eric

Estanislao Gonzalez wrote:
> Hi,
>
> this is great information... and if I could know which datasets had a 
> problem and I have to republish, that would be awesome :-)
>
> I guess someone has the URLs of those catalogs; having them would save
> time compared with looking blindly for them...
> I'm also wondering if this test procedure could be automated and the 
> results published somewhere, so that we might have a better archive.
>
> Thanks,
> Estani
>
> On 17.01.2012 19:31, Eric Nienhouse wrote:
>   
>> Hi Karl,
>>
>> Thank you for this summary.  I agree a key requirement of the Gateway is
>> to make available all output released by climate modeling groups.
>> You've noted certain discrepancies between the 1.3 and 2.0 Gateway
>> versions which are a concern.
>>
>> I believe the Gateway 2.0 is more accurately representing the ESG
>> federated system state than the 1.3.4 version.  The Gateway 2.0 provides
>> additional validation over 1.3.4 primarily to reduce the number of
>> inconsistencies experienced by users and to accurately support access
>> control.  This leads to some differences when comparing Gateway 1.3.4
>> and 2.0 (we've identified about 75 such cases.)
>>
>> The discrepancies you note are a reflection of inconsistencies between
>> the Thredds dataset catalogs and the publication state stored at the
>> 1.3.4 Gateway.  Below is a brief summary.  Much of this can be (and
>> should be) corrected by re-publishing the current dataset catalog to the
>> corresponding gateway to fix up these errors.
>>
>> * In about 40 cases, the original catalog has "moved" and the Gateway is
>> referencing the old catalog path, which no longer exists.  In general,
>> files are still downloadable from 1.3.X.
>>
>> * In 13 cases, the original catalog no longer exists.  The files are
>> still downloadable from 1.3.X without access control due to a TDS
>> loophole.
>>
>> * In at least 2 cases, a new dataset version catalog is available,
>> however it has not been published to the Gateway (and the old one removed.)
>>
>> Again, much of this can be corrected by re-publishing the current
>> dataset catalogs to the gateway.  This should resolve the particular
>> issue Karl noted below in which "MPI-ESM-LR, rcp85, r3i1p1" datasets are
>> missing from the 2.0 Gateway.  In this case, these (12) dataset catalogs
>> have been moved to a new path (from /6/ to /1/) and should be republished.
>>
>> Kind regards,
>>
>> -Eric
>>
>> Karl Taylor wrote:
>>     
>>> Dear all,
>>>
>>> Regarding:
>>>       
>>>> ... Karl contacted us and asked us to remove all of the federated CMIP5 datasets from our gateway.
>>>>
>>>>         
>>> Could somebody explain why Karl asked NCAR to do this?
>>>
>>>
>>> Here is part of the explanation:
>>>
>>> Before changes are made to the way the CMIP5 model output is served,
>>> several requirements must be met.  PCMDI has the responsibility for
>>> this, and we have not given permission for any new gateways to become
>>> operational in the service of CMIP5 data.  In particular, the "Gateway
>>> 2.0" has not yet completed a sufficient testing phase.  We therefore
>>> asked NCAR to remove the federated CMIP5 datasets from the operational
>>> version of its gateway.
>>>
>>> One requirement of the gateway is that it make available *all*  output
>>> released by the climate modeling groups.   The modeling groups have
>>> worked hard to finish their simulations and produce output in a
>>> user-friendly form.  Our responsibility is to make sure users can get
>>> to it.   The release candidate for Gateway 2.0 fails to list some
>>> datasets that are in fact being served through Gateway 1.3.4 (the
>>> operational gateway at the main CMIP5 data centers at PCMDI, BADC, and
>>> WDCC). [see below for an example.]  Users coming to Gateway 2.0 would
>>> therefore incorrectly conclude that some models hadn't provided the
>>> output they were seeking, when in fact the Gateway 2.0 was simply not
>>> showing it.  This is unacceptable and is one reason that Gateway 2.0
>>> cannot operationally serve CMIP5 model output.
>>>
>>> This serious problem with Gateway 2.0 may not be the only issue that
>>> could delay or rule out its use in serving CMIP5.  Rest assured that
>>> more than one option is being considered for improving access to CMIP5
>>> model output.  We hope to have a better system in place soon, but only
>>> after a careful and thorough testing stage has been completed.  During
>>> the testing phase, we will at some point (within weeks, I think)
>>> invite users to try out the new system (as beta testers), but the
>>> operational system will also remain available, and resource limitations
>>> will likely mean that, during the testing phase, our CMIP5 help desk will
>>> only be able to respond to questions concerning the operational system.
>>>
>>> We hope, therefore, that before too long the users will see
>>> substantial improvements in the system giving them access to CMIP5
>>> model output. I welcome your continued input and help.
>>>
>>> Best regards,
>>> Karl
>>>
>>> Example of problem with Gateway 2.0:  Compare search results on
>>> MPI-ESM-LR, rcp85, r3i1p1
>>>
>>> Gateway 1.3.4 (at BADC, for example): returns 15 datasets
>>> the prototype Gateway 2.0 (at NCAR): returns only 3 datasets.
>>>
>>> thus, Gateway 2.0 fails to show the user datasets which, in fact, are
>>> available from the MPI model.
>>>
>>>
>>> ------------------------------------------------------------------------
>>>
>>> _______________________________________________
>>> GO-ESSP-TECH mailing list
>>> GO-ESSP-TECH at ucar.edu
>>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>>>
>>>       
>
>
>   

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: esgf_unresolved_catalogs_20120104.txt
Url: http://mailman.ucar.edu/pipermail/go-essp-tech/attachments/20120118/1fd7c531/attachment.txt 

