[Go-essp-tech] Fwd: Re: esgf datanode publication, version numbers

Serguei Nikonov serguei.nikonov at noaa.gov
Tue Jan 17 12:51:47 MST 2012


Hi Sylvia,

Ron Stouffer found a serious issue with GFDL simulations: not all of them are
visible through the gateway search interface. I tried several gateways (PCMDI,
BADC, DKRZ), and the results are not good:

1. The "Simulations" keyword search does not work properly. On the PCMDI and DKRZ
gateways, searching for the keyword "GFDL" produces the message "An internal server
error has occurred. The problem has been logged and the administrator notified". BADC
is a little better, but the search returns 23 results (on 3 pages), of which
only 6 are GFDL; furthermore, going to the 2nd page of results gives the
same internal server error.

2. Searching without keywords returns only 6 GFDL results (PCMDI, BADC, DKRZ).

Another issue is that the "Outputs" tab of a chosen simulation does not contain any
datasets. I am not sure whether this functionality currently works, because I
could find only one working example, on the BADC gateway for the HadGEM2-ES model.

Estani recommended that I ask you for help. Can you advise where to start in order
to fix it? Is it a GFDL problem or a general issue?

Thanks,
Sergey Nikonov
GFDL Data Portal

On 01/16/2012 12:05 PM, Estanislao Gonzalez wrote:
> Hi Sergei,
>
> that's a whole different issue... I think it's best to start a new thread for
> that and ask Sylvia Murphy <Sylvia.Murphy at noaa.gov> directly.
> There are many issues regarding Simulations right now... I don't think this will
> affect that at all, since simulations have no dataset version concept.
>
> If simulations are not appearing as they should, or the link in trackback to the
> datasets is not being displayed, this normally has to do with a problem in the
> DB of the Gateway and has no relation to the data nodes whatsoever. So if you
> find any problems related to Simulations, you should contact Bob and Sylvia
> directly.
> As of this time there are still some unresolved issues though.
>
> thanks,
> Estani
> On 16.01.2012 17:39, Serguei Nikonov wrote:
>> Hi Estani,
>>
>> thanks for your help. I realize that I need to republish all datasets. My main
>> question right now is what else should be done to make simulations visible
>> (datasets are OK) in the "Simulations" search. This is the main issue currently. Or
>> will "harmonizing" the dataset versions on the gateway with the physical version
>> in the DRS be enough, so that simulations appear in the "Simulations" search
>> right away?
>>
>> Thanks,
>> Sergey
>>
>> On 01/16/2012 11:28 AM, Estanislao Gonzalez wrote:
>>> Hi Sergey,
>>>
>>> sorry if this got confusing, but there's a small problem with what Hans said.
>>> Setting "version_by_date=true" won't help you here, since AFAIK it will generate
>>> a version from the current date, which won't match what you already have in your
>>> directory path.
>>>
>>> To publish a version that's not the current date, you should pass it to the
>>> esgpublisher "somehow". One option is to use the --new-version flag as I said,
>>> another would be to provide a list of datasets and versions via the
>>> --version-list flag (check the --help)
>>>
>>> I'm CCing Bob, maybe he has a better approach. But I'd say, you have to
>>> re-publish everything with the proper version.
>>>
>>> If you use mapfiles, the simplest way to do it I can think of is to extract the
>>> version and dataset from the map file using an sed command:
>>> sed 's#^\([^|].*\)|.*/v\(20[0-9]*\)/.*#\1|\2#' <_your_map_file> |sort -u
>>>
>>> If you use bash you could do (after unpublishing):
>>> map=<path_to_map_file>
>>> esgpublish --map $map --version-list <(sed 's#^\([^|].*\)|.*/v\(20[0-9]*\)/.*#\1|\2#' $map |sort -u) ....[and the rest as usual]
>>>
>>> This way you can publish mapfiles which contain datasets from different
>>> versions. If all files in the mapfile are from the same version (they might be from
>>> different datasets, no problem) then it's easier to use --new-version 2011xxxx.
>>> It depends on what your environment looks like.
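
As a rough illustration of what the sed expression above produces, here is a minimal sketch run against a made-up mapfile. The `dataset | path | size` line layout, the `/data` root, and the file sizes are assumptions for the example; only the sed expression itself is taken from the message.

```shell
#!/usr/bin/env bash
# Hypothetical sample mapfile: "dataset_id | file_path | size" per line (assumed layout).
map=$(mktemp)
cat > "$map" <<'EOF'
cmip5.output1.NOAA-GFDL.GFDL-CM3.historical.mon.atmos.Amon.r1i1p1 | /data/NOAA-GFDL/GFDL-CM3/historical/mon/atmos/Amon/r1i1p1/v20110601/ps/ps_Amon_GFDL-CM3_historical_r1i1p1_186001-186412.nc | 1024
cmip5.output1.NOAA-GFDL.GFDL-CM3.historical.mon.atmos.Amon.r1i1p1 | /data/NOAA-GFDL/GFDL-CM3/historical/mon/atmos/Amon/r1i1p1/v20110601/ps/ps_Amon_GFDL-CM3_historical_r1i1p1_186501-186912.nc | 1024
EOF

# Same sed expression as in the message: keep the dataset id (first field)
# and the digits of the /v20.../ path component, then drop duplicate lines.
version_list=$(sed 's#^\([^|].*\)|.*/v\(20[0-9]*\)/.*#\1|\2#' "$map" | sort -u)
printf '%s\n' "$version_list"

rm -f "$map"
```

Both sample lines belong to the same dataset and version, so the result collapses to a single `dataset |version` line. The space before the `|` is carried over from the mapfile.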
>>>
>>> Hope this helps,
>>> Estani
>>>
>>>
>>> On 16.01.2012 16:58, Hans Ramthun wrote:
>>>> Hello Sergey,
>>>>
>>>> What I meant is described here, at the page Estani pointed to in one of his mails:
>>>> http://esg-pcmdi.llnl.gov/internal/esg-data-node-documentation/cmip5-best-practices
>>>>
>>>>
>>>> If you publish the datasets with the option 'version_by_date = true' in the
>>>> project section of the esg.ini file then the discrepancy should be resolved
>>>> and the view in the TDS should be correct.
>>>>
>>>> The gateway search will always show the most recent version of the data and in
>>>> the history tab other versions of the found data.
>>>>
>>>> So the only thing you have to do is to insert the above option in the esg.ini
>>>> file of your data node before publishing the data.
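
For orientation, the option Hans mentions would sit in esg.ini roughly as in the fragment below. The section name `[project:cmip5]` is an assumption about the local configuration; only the `version_by_date = true` line comes from the message.

```ini
# esg.ini (fragment); the section name is assumed, adjust it to your project
[project:cmip5]
version_by_date = true
```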
>>>>
>>>> Hope that clarified this.
>>>>
>>>> Regards
>>>> Hans
>>>>
>>>>
>>>>
>>>> On 16.01.2012 16:36, Serguei Nikonov wrote:
>>>>> Hi Hans,
>>>>>
>>>>> so, if I change the version of a dataset on the gateway to match what we
>>>>> have in the physical path of the files, then the "Simulations" search in the
>>>>> gateway (e.g.
>>>>> http://cmip-gw.badc.rl.ac.uk/query/advanced.htm?product=ConfiguredModel)
>>>>> will show all published experiments visible in the "Datasets" search? And the
>>>>> "Output" tab of a simulation will contain links to datasets? Do I understand
>>>>> correctly what the final result should be?
>>>>>
>>>>> Thanks,
>>>>> Sergey
>>>>>
>>>>> On 01/16/2012 02:56 AM, Hans Ramthun wrote:
>>>>>> Hello Estani,
>>>>>>
>>>>>> Correct, that discrepancy is what I meant.
>>>>>>
>>>>>> Thanks for investigating and clarifying this
>>>>>> Hans
>>>>>>
>>>>>> On 14.01.2012 20:28, Estanislao Gonzalez wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I've CCed Hans; for some reason he wasn't on the list.
>>>>>>>
>>>>>>> Anyway, the problem was the version number of the datasets. For example,
>>>>>>> this is the URL of the metadata for a file:
>>>>>>> http://esgdata.gfdl.noaa.gov/thredds/esgcet/1/cmip5.output1.NOAA-GFDL.GFDL-CM3.historical.mon.atmos.Amon.r1i1p1.v1.html?dataset=cmip5.output1.NOAA-GFDL.GFDL-CM3.historical.mon.atmos.Amon.r1i1p1.v1.ps_Amon_GFDL-CM3_historical_r1i1p1_186001-186412.nc
>>>>>>>
>>>>>>> It says the dataset is v1, as you can see from the URL. Now the download path is:
>>>>>>> http://esgdata.gfdl.noaa.gov/thredds/fileServer/gfdl_dataroot/NOAA-GFDL/GFDL-CM3/historical/mon/atmos/Amon/r1i1p1/v20110601/ps/ps_Amon_GFDL-CM3_historical_r1i1p1_186001-186412.nc
>>>>>>>
>>>>>>> where you see that the DRS path says the version is 20110601.
>>>>>>>
>>>>>>> The problem is the discrepancy between the two versions. The required step was
>>>>>>> to publish the dataset using --new-version 20110601 or any of the other
>>>>>>> possibilities the publisher offers.
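
The discrepancy can be checked mechanically. The following is a minimal sketch, not part of the publisher, that takes the dataset id and the DRS path from the two URLs above and compares their versions. The variable names are illustrative.

```shell
#!/usr/bin/env bash
# Values copied from the two URLs in the message.
dataset_id='cmip5.output1.NOAA-GFDL.GFDL-CM3.historical.mon.atmos.Amon.r1i1p1.v1'
drs_path='gfdl_dataroot/NOAA-GFDL/GFDL-CM3/historical/mon/atmos/Amon/r1i1p1/v20110601/ps/ps_Amon_GFDL-CM3_historical_r1i1p1_186001-186412.nc'

# Version shown in the catalog: whatever follows the final ".v" of the dataset id.
catalog_version=${dataset_id##*.v}

# Version on disk: the digits of the /v<digits>/ directory in the DRS path.
path_version=$(printf '%s\n' "$drs_path" | sed -n 's#.*/v\([0-9][0-9]*\)/.*#\1#p')

if [ "$catalog_version" = "$path_version" ]; then
    status="match"
else
    status="mismatch"
fi
echo "catalog=v$catalog_version path=v$path_version -> $status"
```

For the example dataset this reports v1 against v20110601, i.e. a mismatch.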
>>>>>>>
>>>>>>> Hans, that's what you meant, right?
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Estani
>>>>>>>
>>>>>>> On 13.01.2012 21:18, Serguei Nikonov wrote:
>>>>>>>> Hi Hans,
>>>>>>>>
>>>>>>>> Ron asked me to follow up on this issue. As I understand it, the primary
>>>>>>>> problem is the visibility of GFDL metafor metadata from different gateways,
>>>>>>>> isn't it?
>>>>>>>>
>>>>>>>> As far as I checked, GFDL data is accessible from European gateways. I am
>>>>>>>> wondering how the type of versioning (numbers or dates) in datasets can
>>>>>>>> affect this primary problem. Maybe I do not clearly understand where the
>>>>>>>> problem is; can you explain it to me, please?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Sergey Nikonov,
>>>>>>>> GFDL Data Portal
>>>>>>>>
>>>>>>>>
>>>>>>>> > On 13.01.2012 09:05, Hans Ramthun wrote:
>>>>>>>> > Hello Estani,
>>>>>>>> >
>>>>>>>> > How can Ron fix this problem with the version numbers like v1, v2, ... on the
>>>>>>>> > GFDL thredds server
>>>>>>>> > (http://esgdata.gfdl.noaa.gov/thredds/esgcet/catalog.html)?
>>>>>>>> >
>>>>>>>> > Could you please guide him to get the correct ones like v20120113,...?
>>>>>>>> >
>>>>>>>> > Thanks
>>>>>>>> > Hans
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > -------- Original Message --------
>>>>>>>> > Subject: Re: [metafor] metafor questionnaire
>>>>>>>> > Date: Thu, 12 Jan 2012 12:56:39 -0500
>>>>>>>> > From: Ron Stouffer <ronald.stouffer at noaa.gov>
>>>>>>>> > Organization: Geophysical Fluid Dynamics Laboratory
>>>>>>>> > To: Hans Ramthun <ramthun at dkrz.de>
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > Hi Hans,
>>>>>>>> >
>>>>>>>> > I am not sure if the problem you point out is my problem (something that
>>>>>>>> > GFDLers need to fix) or something that somebody involved in the CMIP
>>>>>>>> > data serving software needs to fix.
>>>>>>>> >
>>>>>>>> > If it is our (GFDL) problem, how do we go about fixing it?
>>>>>>>> >
>>>>>>>> > Thanks for the comment.
>>>>>>>> > -Ron
>>>>>>>> >
>>>>>>>> > On 1/12/2012 9:04 AM, Hans Ramthun wrote:
>>>>>>>> > > Hello Ron,
>>>>>>>> > >
>>>>>>>> > > When I go to the GFDL thredds server
>>>>>>>> > > (http://esgdata.gfdl.noaa.gov/thredds/esgcet/catalog.html) I find only
>>>>>>>> > > datasets with version numbers like v1, v2, ...
>>>>>>>> > > Normally I would expect something like a date here: v20120112 or so.
>>>>>>>> > >
>>>>>>>> > > Cheers
>>>>>>>> > > Hans
>>>>>>>> > >
>>>>>>>> > >
>>>>>>>> > > On 12.01.2012 14:08, charlotte.pascoe at stfc.ac.uk wrote:
>>>>>>>> > >> Hi all,
>>>>>>>> > >>
>>>>>>>> > >> There are no GFDL simulations on the list of published CIM documents
>>>>>>>> > >> on the questionnaire atom feed.
>>>>>>>> > >> However, there are 26 simulations documented in the GFDL pages of the
>>>>>>>> > >> questionnaire.
>>>>>>>> > >> Ron, you'll have just received an email from the help desk asking you
>>>>>>>> > >> to hit the publish button on your simulation documents and offering an
>>>>>>>> > >> online demo session to help iron out any niggles.
>>>>>>>> > >>
>>>>>>>> > >> best,
>>>>>>>> > >> Charlotte
>>>>>>>> > >>
>>>>>>>> > >> -----Original Message-----
>>>>>>>> > >> From: metafor-bounces at lists.enes.org
>>>>>>>> > >> [mailto:metafor-bounces at lists.enes.org] On Behalf Of Bryan Lawrence
>>>>>>>> > >> Sent: 11 January 2012 13:33
>>>>>>>> > >> To: Ron Stouffer
>>>>>>>> > >> Cc: Metafor List; sylvia murphy
>>>>>>>> > >> Subject: Re: [metafor] metafor questionnaire
>>>>>>>> > >>
>>>>>>>> > >>> Bryan,
>>>>>>>> > >>>
>>>>>>>> > >>> Is the ESM2M (GFDL) metafor questionnaire done and public?
>>>>>>>> > >>>
>>>>>>>> > >>> -Ron
>>>>>>>> > >> Hi Ron
>>>>>>>> > >>
>>>>>>>> > >> The short answer is yes.
>>>>>>>> > >>
>>>>>>>> > >> The slightly longer answer is, that for reasons I don't understand
>>>>>>>> > >> one sees a different amount of simulation metadata for GFDL in the
>>>>>>>> > >> various gateways ... There are 23 simulation descriptions for GFDL
>>>>>>>> > >> (including ESM2M ones) at BADC, 21 at PCMDI and 10 at NCAR in their
>>>>>>>> > >> gateway 2 ...
>>>>>>>> > >>
>>>>>>>> > >> (You can see these records by choosing simulation or 'simulation
>>>>>>>> > >> metadata' as the target in your search on the portals, rather than data)
>>>>>>>> > >>
>>>>>>>> > >> I don't know that any of these numbers correspond to the number in
>>>>>>>> > >> the questionnaire feed. It's an interesting quality control issue,
>>>>>>>> > >> and I think we'll knock something up to compare what's public from
>>>>>>>> > >> the questionnaire, and what appears in the portals.
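
The comparison Bryan describes amounts to a set difference between two lists of simulation names. Below is a minimal sketch, assuming each source has already been exported to a plain text list. The names `sim-a`, `sim-b` and `sim-c` are placeholders, not real GFDL simulations, and nothing here queries the questionnaire feed or a portal.

```shell
#!/usr/bin/env bash
# Placeholder lists; in practice these would be exported from the
# questionnaire feed and from a portal search (not shown here).
questionnaire=$(mktemp)
portal=$(mktemp)
printf '%s\n' sim-a sim-b sim-c | sort > "$questionnaire"
printf '%s\n' sim-a sim-c | sort > "$portal"

# comm needs sorted input; -23 keeps only lines unique to the first file,
# i.e. simulations public in the questionnaire but absent from the portal.
missing=$(comm -23 "$questionnaire" "$portal")
printf 'missing from portal: %s\n' "$missing"

rm -f "$questionnaire" "$portal"
```

With the placeholder lists, the only name reported as missing is `sim-b`.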
>>>>>>>> > >>
>>>>>>>> > >> Cheers
>>>>>>>> > >> Bryan
>>>>>>>> > >>
>>>>>>>> > >> --
>>>>>>>> > >> Bryan Lawrence
>>>>>>>> > >> University of Reading: Professor of Weather and Climate Computing.
>>>>>>>> > >> National Centre for Atmospheric Science: Director of Models and Data.
>>>>>>>> > >> STFC: Director of the Centre for Environmental Data Archival.
>>>>>>>> > >> Ph: +44 118 3786507 or 1235 445012; Web: home.badc.rl.ac.uk/lawrence
>>>>>>>> > >> _______________________________________________
>>>>>>>> > >> metafor mailing list
>>>>>>>> > >> metafor at lists.enes.org
>>>>>>>> > >> https://lists.enes.org/mailman/listinfo/metafor
>>>>>>>> > >
>>>>>>>> > >
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Estanislao Gonzalez
>>>>>>>>
>>>>>>>> Max-Planck-Institut für Meteorologie (MPI-M)
>>>>>>>> Deutsches Klimarechenzentrum (DKRZ) - German Climate Computing Centre
>>>>>>>> Room 108 - Bundesstrasse 45a, D-20146 Hamburg, Germany
>>>>>>>>
>>>>>>>> Phone: +49 (40) 46 00 94-126
>>>>>>>> E-Mail: gonzalez at dkrz.de
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>
>


