[Go-essp-tech] Fwd: Re: esgf datanode publication, version numbers

Cecelia DeLuca cecelia.deluca at noaa.gov
Tue Jan 17 17:42:44 MST 2012


Hi Serguei,

Sylvia is out this week, but I just spoke to her about your questions and am
relaying her responses below.  Please look here:
http://q.cmip5.ceda.ac.uk/feeds/cmip5/simulation/

There you can see the simulations entered through the CMIP5 questionnaire and
available for ingestion into the metadata trackback.  If you don't see the ones
you expect from GFDL, it means that the process of submitting these simulations
through the questionnaire isn't completed.

The only one I see in the trackback is an early synthetic example from GFDL
that we used for testing (Control-1860).

Our team doesn't handle the questionnaire input; the METAFOR group does.  Ron
and Charlotte Pascoe from METAFOR have a dialogue going about how to complete
the GFDL submissions.  When that works, more simulations should show up at the
URL above, and will then be ingested and shown in the metadata trackback.

It would be good to test in a current (1.3.4) version of the gateway.  PCMDI is
still at an earlier version (1.3.2) and may not show the same results as
others.  I was testing there earlier today and saw erratic errors, either
hanging while loading or returning the internal server error, in all kinds of
search situations.  I didn't see these issues when testing at BADC and could
not reproduce the internal server error you got when going to the second page
of your search results.  Does that reliably fail for you?

There are a number of other issues with metadata that are being tracked as
known bugs.  Sylvia recently sent out a summary.  Her mail is included below my
response.  I added some updates.

In summary, to begin to understand and fix the issues you see:
- ensure GFDL simulations are submitted, and only expect to see simulations
  that are available at the URL above
- test on gateways that have 1.3.4 installed
- analyze any remaining errors and check them against the known bugs below
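
The first check in the summary above (is a given simulation actually present in
the questionnaire feed?) can be sketched in Python.  This is only a rough
illustration: it assumes the feed at the URL above is a standard Atom document
and that the entry titles mention the institute or model name, neither of which
is confirmed here.  The function names are made up for this example.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Standard Atom namespace; assuming the questionnaire feed uses it.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def fetch_feed(url):
    """Download the feed and return its body as text (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def entry_titles(feed_xml):
    """Return the title of every <entry> in an Atom feed document."""
    root = ET.fromstring(feed_xml)
    titles = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        titles.append((title or "").strip())
    return titles


def titles_containing(feed_xml, keyword):
    """Case-insensitive filter of entry titles, e.g. keyword='GFDL'."""
    needle = keyword.lower()
    return [t for t in entry_titles(feed_xml) if needle in t.lower()]
```

Calling titles_containing(fetch_feed(url), "GFDL") with the feed URL above would
then list the published entries whose titles mention GFDL, if the assumptions
about the feed hold.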

Please write if you have further questions.

Best,
Cecelia


-------- Original Message --------
Subject: 	[Curator] Status report on metadata display issues
Date: 	Mon, 9 Jan 2012 09:10:06 -0700
From: 	Sylvia Murphy <sylvia.murphy at noaa.gov>
To: 	go-essp-tech at ucar.edu
CC: 	Metafor List <metafor at lists.enes.org>, curator at list.woc.noaa.gov



Hi Everyone,

Since we have not had a go-essp-tech call in quite a while, I thought I 
would send out an email summarizing the metadata-display operational 
issues going across the federation:


STATUS OF BUGS IN THE SOFTWARE:  These issues affect every installation 
and all users...

1) CIM instances (that contain unicode characters) are not displaying 
properly.  This includes instances that contain % signs, "(", or 
umlauts.  This has been permanently fixed in 2.0 RC1, but will still be 
an issue for users until 2.0 is finalized and installed across the 
federation.  [CD - I don't know about across the federation, but if 
anybody does end up installing 2.0, it should be fixed there.  Goes for 
the next item too.]

2) Users are presented with a blurb on the simulation search page 
indicating that "Much of the 5th Coupled Model Intercomparison Project 
(CMIP5) metadata that will be accessible from this gateway is not yet 
available". This has been fixed in 2.0 RC1, but will still be an issue 
for users until 2.0 is finalized and installed across the federation.

CENTER SPECIFIC ISSUES:

There are some gateways with technical issues that also affect users:

1) PCMDI:  Users get an internal server error when they click on the 
results of any metadata search, and see no model metadata at all.  This 
fix requires that PCMDI update their postgres database. [CD:  Looks like 
the postgres database was updated by PCMDI in the last week, so metadata 
is working in the PCMDI gateway, for the first time since September.  Yay!!  
However, PCMDI is still at gateway version 1.3.2, and behavior may not 
be the same as behavior of gateways with the latest production version 
installed.]

2) NCI:  Users see most (37/61) but not all metadata records because NCI 
needs to reharvest their triple store as part of their upgrade to 
1.3.4.  We are in contact with NCI and walking them through this.  Note 
that triple store reharvests will be significantly less painful with ESG 
2.0.

3) DKRZ: Users will not see data links in the trackback for local DKRZ 
data.  This problem has been identified as a mismatch between the names 
in the data and what is coming out of the CIM.  Such mismatches were 
anticipated and a mapping file created to translate between the two.  
This file just needs to be updated at DKRZ.  This is being worked on.

As always, you may check the status of metadata across the federation 
at: http://esgf.org/wiki/Cmip5Status
I last checked all the sites on 6 January and recheck every Friday.




On 1/17/2012 12:51 PM, Serguei Nikonov wrote:
> Hi Sylvia,
>
> Ron Stouffer found a serious issue with GFDL simulations: not all of them are
> visible through the gateway search interface. I tried it on different gateways
> (PCMDI, BADC, DKRZ), and the results are not good:
>
> 1. The "Simulations" search by keywords does not work normally. On the PCMDI and
> DKRZ gateways the "GFDL" keyword in a search generates the message "An internal
> server error has occurred. The problem has been logged and the administrator
> notified". BADC is a little better, but the search returns 23 results (on 3
> pages), of which only 6 are GFDL; further, going to the 2nd page of returned
> results gives the same internal server error.
>
> 2. Searching without keywords returns only 6 GFDL results (PCMDI, BADC, DKRZ).
>
> Another issue is that the "Outputs" tab of a chosen simulation does not contain
> any datasets. I am not sure whether this functionality currently works, because I
> could find only one working example, on the BADC gateway for the HadGEM2-ES model.
>
> Estani recommended that I ask you for help. Can you advise where to start in
> order to fix it? Is it a GFDL problem or a general issue?
>
> Thanks,
> Sergey Nikonov
> GFDL Data Portal
>
> On 01/16/2012 12:05 PM, Estanislao Gonzalez wrote:
>> Hi Sergei,
>>
>> that's a whole different issue... I think it's best to start a new thread for
>> that and ask Sylvia Murphy <Sylvia.Murphy at noaa.gov> directly.
>> There are many issues regarding Simulations right now... I don't think this will
>> affect that at all, since simulations have no dataset version concept.
>>
>> If simulations are not appearing as they should, or the link in trackback to the
>> datasets is not being displayed, this normally has to do with a problem in the
>> DB of the Gateway and has no relation to the data nodes whatsoever. So if you
>> find any problems related to Simulations, you should contact Bob and Sylvia
>> directly.
>> As of this time there are still some unresolved issues though.
>>
>> thanks,
>> Estani
>> On 16.01.2012 17:39, Serguei Nikonov wrote:
>>> Hi Estani,
>>>
>>> thanks for your help. I realize that I need to republish all datasets. My main
>>> question right now is what else should be done to make simulations visible
>>> (datasets are OK) in the "Simulation" search. This is the main issue currently. Or
>>> will "harmonizing" the dataset versions on the gateway with the physical version
>>> in the DRS be enough, so that simulations appear in the "Simulations" search
>>> right after that?
>>>
>>> Thanks,
>>> Sergey
>>>
>>> On 01/16/2012 11:28 AM, Estanislao Gonzalez wrote:
>>>> Hi Sergey,
>>>>
>>>> sorry if this got confusing, but there's a small problem with what Hans said.
>>>> Setting "version_by_date=true" won't help you here, since AFAIK it will generate
>>>> a version from the current date, which won't match what you already have in your
>>>> directory path.
>>>>
>>>> To publish a version that's not the current date, you should pass it to the
>>>> esgpublisher "somehow". One option is to use the --new-version flag as I said,
>>>> another would be to provide a list of datasets and versions via the
>>>> --version-list flag (check the --help)
>>>>
>>>> I'm CCing Bob, maybe he has a better approach. But I'd say you have to
>>>> re-publish everything with the proper version.
>>>>
>>>> If you use mapfiles, the simplest way to do it I can think of is to extract the
>>>> version and dataset from the map file using an sed command:
>>>> sed 's#^\([^|].*\)|.*/v\(20[0-9]*\)/.*#\1|\2#' <_your_map_file> | sort -u
>>>>
>>>> If you use bash you could do (after unpublishing):
>>>> map=<path_to_map_file>
>>>> esgpublish --map "$map" --version-list <(sed 's#^\([^|].*\)|.*/v\(20[0-9]*\)/.*#\1|\2#' "$map" | sort -u) ....[and the rest as usual]
>>>>
>>>> This way you could publish mapfiles which contain datasets from different
>>>> versions. If all files in the mapfile are from the same version (they might be
>>>> from different datasets, no problem) then it's easier to use --new-version 2011xxxx
>>>> It depends on what your environment looks like.
>>>>
>>>> Hope this helps,
>>>> Estani
>>>>
>>>>
>>>> On 16.01.2012 16:58, Hans Ramthun wrote:
>>>>> Hello Sergey,
>>>>>
>>>>> What I meant is described here, at the page Estani pointed to in one of his mails:
>>>>> http://esg-pcmdi.llnl.gov/internal/esg-data-node-documentation/cmip5-best-practices
>>>>>
>>>>>
>>>>> If you publish the datasets with the option 'version_by_date = true' in the
>>>>> project section of the esg.ini file then the discrepancy should be reversed
>>>>> and the view in the tds should be correct.
>>>>>
>>>>> The gateway search will always show the most recent version of the data and in
>>>>> the history tab other versions of the found data.
>>>>>
>>>>> So the only thing you have to do is to insert the above option in the esg.ini
>>>>> file of your data node before publishing the data.
>>>>>
>>>>> Hope that clarified this.
>>>>>
>>>>> Regards
>>>>> Hans
>>>>>
>>>>>
>>>>>
>>>>> On 16.01.2012 16:36, Serguei Nikonov wrote:
>>>>>> Hi Hans,
>>>>>>
>>>>>> so, if I change the version of the dataset on the gateway to match what we
>>>>>> have in the physical path of the files, then the "Simulations" search in the
>>>>>> gateway (e.g. http://cmip-gw.badc.rl.ac.uk/query/advanced.htm?product=ConfiguredModel)
>>>>>> will show all published experiments visible in the "Datasets" search? And the
>>>>>> "Output" tab of a simulation will also contain links to datasets? Do I
>>>>>> understand correctly what the final result should be?
>>>>>>
>>>>>> Thanks,
>>>>>> Sergey
>>>>>>
>>>>>> On 01/16/2012 02:56 AM, Hans Ramthun wrote:
>>>>>>> Hello Estani,
>>>>>>>
>>>>>>> Correct, that discrepancy is what I meant.
>>>>>>>
>>>>>>> Thanks for investigating and clarifying this
>>>>>>> Hans
>>>>>>>
>>>>>>> On 14.01.2012 20:28, Estanislao Gonzalez wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I've CCed Hans, for some reason he wasn't in the list.
>>>>>>>>
>>>>>>>> Anyway, the problem was the version number of the datasets, e.g.:
>>>>>>>> This URL of the metadata for the file:
>>>>>>>> http://esgdata.gfdl.noaa.gov/thredds/esgcet/1/cmip5.output1.NOAA-GFDL.GFDL-CM3.historical.mon.atmos.Amon.r1i1p1.v1.html?dataset=cmip5.output1.NOAA-GFDL.GFDL-CM3.historical.mon.atmos.Amon.r1i1p1.v1.ps_Amon_GFDL-CM3_historical_r1i1p1_186001-186412.nc
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> says the dataset is v1, as you can see from the URL. Now the download path is:
>>>>>>>> http://esgdata.gfdl.noaa.gov/thredds/fileServer/gfdl_dataroot/NOAA-GFDL/GFDL-CM3/historical/mon/atmos/Amon/r1i1p1/v20110601/ps/ps_Amon_GFDL-CM3_historical_r1i1p1_186001-186412.nc
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> where you see that the version in the DRS path says the version is 20110601.
>>>>>>>>
>>>>>>>> The problem is the discrepancy between the two versions. The required step was
>>>>>>>> to publish the dataset using --new-version 20110601 or any of the other
>>>>>>>> possibilities the publisher offers.
>>>>>>>>
>>>>>>>> Hans, that's what you meant, right?
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Estani
>>>>>>>>
>>>>>>>> On 13.01.2012 21:18, Serguei Nikonov wrote:
>>>>>>>>> Hi Hans,
>>>>>>>>>
>>>>>>>>> Ron asked me to follow up on this issue. As I understand it, the primary
>>>>>>>>> problem is with the visibility of GFDL metafor metadata from different
>>>>>>>>> gateways, isn't it?
>>>>>>>>>
>>>>>>>>> As far as I checked, GFDL data is accessible from European gateways. I am
>>>>>>>>> wondering how the type of versioning (numbers or dates) in datasets can
>>>>>>>>> affect this primary problem? Maybe I don't clearly understand where the
>>>>>>>>> problem is; can you explain it to me, please?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Sergey Nikonov,
>>>>>>>>> GFDL Data Portal
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On 13.01.2012 09:05, Hans Ramthun wrote:
>>>>>>>>>> Hello Estani,
>>>>>>>>>>
>>>>>>>>>> How can Ron fix this problem with the version numbers like v1, v2, ... on the
>>>>>>>>>> GFDL thredds server
>>>>>>>>>> (http://esgdata.gfdl.noaa.gov/thredds/esgcet/catalog.html)?
>>>>>>>>>>
>>>>>>>>>> Could you please guide him to get the correct ones like v20120113,...?
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Hans
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -------- Original Message --------
>>>>>>>>>> Subject: Re: [metafor] metafor questionnaire
>>>>>>>>>> Date: Thu, 12 Jan 2012 12:56:39 -0500
>>>>>>>>>> From: Ron Stouffer <ronald.stouffer at noaa.gov>
>>>>>>>>>> Organization: Geophysical Fluid Dynamics Laboratory
>>>>>>>>>> To: Hans Ramthun <ramthun at dkrz.de>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi Hans,
>>>>>>>>>>
>>>>>>>>>> I am not sure if the problem you point out is my problem (something that
>>>>>>>>>> GFDLers need to fix) or something that somebody involved in the CMIP
>>>>>>>>>> data serving software needs to fix.
>>>>>>>>>>
>>>>>>>>>> If it is our (GFDL) problem, how do we go about fixing it?
>>>>>>>>>>
>>>>>>>>>> Thanks for the comment.
>>>>>>>>>> -Ron
>>>>>>>>>>
>>>>>>>>>> On 1/12/2012 9:04 AM, Hans Ramthun wrote:
>>>>>>>>>>> Hello Ron,
>>>>>>>>>>>
>>>>>>>>>>> When I go to the GFDL thredds server
>>>>>>>>>>> (http://esgdata.gfdl.noaa.gov/thredds/esgcet/catalog.html) I find only
>>>>>>>>>>> datasets with version numbers like v1, v2, ...
>>>>>>>>>>> Normally I would expect something like a date here: v20120112 or so.
>>>>>>>>>>>
>>>>>>>>>>> Cheers
>>>>>>>>>>> Hans
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 12.01.2012 14:08, charlotte.pascoe at stfc.ac.uk wrote:
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> There are no GFDL simulations on the list of published CIM documents
>>>>>>>>>>>> on the questionnaire atom feed.
>>>>>>>>>>>> However, there are 26 simulations documented in the GFDL pages of the
>>>>>>>>>>>> questionnaire.
>>>>>>>>>>>> Ron, you'll have just received an email from the help desk asking you
>>>>>>>>>>>> to hit the publish button on your simulation documents and offering an
>>>>>>>>>>>> online demo session to help iron out any niggles.
>>>>>>>>>>>>
>>>>>>>>>>>> best,
>>>>>>>>>>>> Charlotte
>>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: metafor-bounces at lists.enes.org On Behalf Of Bryan Lawrence
>>>>>>>>>>>> Sent: 11 January 2012 13:33
>>>>>>>>>>>> To: Ron Stouffer
>>>>>>>>>>>> Cc: Metafor List; sylvia murphy
>>>>>>>>>>>> Subject: Re: [metafor] metafor questionnaire
>>>>>>>>>>>>
>>>>>>>>>>>>> Bryan,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is the ESM2M (GFDL) metafor questionnaire done and public?
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Ron
>>>>>>>>>>>> Hi Ron
>>>>>>>>>>>>
>>>>>>>>>>>> The short answer is yes.
>>>>>>>>>>>>
>>>>>>>>>>>> The slightly longer answer is, that for reasons I don't understand
>>>>>>>>>>>> one sees a different amount of simulation metadata for GFDL in the
>>>>>>>>>>>> various gateways ... There are 23 simulation descriptions for GFDL
>>>>>>>>>>>> (including ESM2M ones) at BADC, 21 at PCMDI and 10 at NCAR in their
>>>>>>>>>>>> gateway 2 ...
>>>>>>>>>>>>
>>>>>>>>>>>> (You can see these records by choosing simulation or 'simulation
>>>>>>>>>>>> metadata' as the target in your search on the portals, rather than data)
>>>>>>>>>>>> I don't know that any of these numbers correspond to the number in
>>>>>>>>>>>> the questionnaire feed. It's an interesting quality control issue,
>>>>>>>>>>>> and I think we'll knock something up to compare what's public from
>>>>>>>>>>>> the questionnaire, and what appears in the portals.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers
>>>>>>>>>>>> Bryan
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Bryan Lawrence
>>>>>>>>>>>> University of Reading: Professor of Weather and Climate Computing.
>>>>>>>>>>>> National Centre for Atmospheric Science: Director of Models and Data.
>>>>>>>>>>>> STFC: Director of the Centre for Environmental Data Archival.
>>>>>>>>>>>> Ph: +44 118 3786507 or 1235 445012; Web: home.badc.rl.ac.uk/lawrence
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> metafor mailing list
>>>>>>>>>>>> metafor at lists.enes.org
>>>>>>>>>>>> https://lists.enes.org/mailman/listinfo/metafor
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Estanislao Gonzalez
>>>>>>>>>
>>>>>>>>> Max-Planck-Institut für Meteorologie (MPI-M)
>>>>>>>>> Deutsches Klimarechenzentrum (DKRZ) - German Climate Computing Centre
>>>>>>>>> Room 108 - Bundesstrasse 45a, D-20146 Hamburg, Germany
>>>>>>>>>
>>>>>>>>> Phone: +49 (40) 46 00 94-126
>>>>>>>>> E-Mail: gonzalez at dkrz.de
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>
> _______________________________________________
> GO-ESSP-TECH mailing list
> GO-ESSP-TECH at ucar.edu
> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech

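For reference, the mapfile step Estani describes in the quoted thread above
(pulling unique dataset and version pairs out of a mapfile with sed) can also
be sketched in Python.  This is a rough equivalent, not the publisher's own
logic: it assumes each mapfile line has the form "dataset_id | file_path | ..."
and that the version appears in the path as a directory named v20xxxxxx, as in
the GFDL download path shown in the thread.  The function names are invented
for this example.

```python
import re

# A version directory such as /v20110601/ inside the file path
# (mirrors the "/v\(20[0-9]*\)/" part of the sed expression in the thread).
VERSION_DIR = re.compile(r"/v(20\d*)/")


def dataset_versions(mapfile_lines):
    """Return sorted, unique (dataset_id, version) pairs found in mapfile lines.

    Lines without a '|' separator or without a version directory are skipped.
    """
    pairs = set()
    for line in mapfile_lines:
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 2 or not fields[0]:
            continue
        match = VERSION_DIR.search(fields[1])
        if match:
            pairs.add((fields[0], match.group(1)))
    return sorted(pairs)


def format_version_list(pairs):
    """Render pairs one per line as 'dataset|version', like the sed output."""
    return "\n".join(f"{dataset}|{version}" for dataset, version in pairs)
```

Unlike the sed one-liner, this picks the first v20... directory in the path
rather than the last and trims whitespace around the dataset id; for paths with
a single version directory the two should agree.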