[Go-essp-tech] Publishing dataset with option --update

Serguei Nikonov serguei.nikonov at noaa.gov
Thu Dec 29 09:02:05 MST 2011


Hi Bob,

I tried the first approach you suggested and it worked only partially: the
dataset was created on the data node as version 2, but it did not show up on the
gateway. To make sure this was not a one-off, I repeated it with another dataset
and got the same result.
Now I have two datasets on the data node (visible in the THREDDS server) that
are absent from the gateway:
cmip5.output1.NOAA-GFDL.GFDL-CM3.historical.mon.atmos.Amon.r1i1p1.v2
cmip5.output1.NOAA-GFDL.GFDL-CM3.historical.mon.atmos.Amon.r2i1p1.v2
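
When comparing what the data node and the gateway each report, it can help to split such IDs into their facets and version number. The following is only a minimal sketch: the facet names are my assumption based on the usual CMIP5 DRS ordering, and parse_dataset_id is a hypothetical helper, not part of the ESG publisher.

```python
from typing import Dict, NamedTuple, Optional

# Assumed facet order for a CMIP5 dataset ID (not taken from the publisher code).
FACETS = ("activity", "product", "institute", "model", "experiment",
          "frequency", "realm", "mip_table", "ensemble")


class DatasetId(NamedTuple):
    facets: Dict[str, str]
    version: Optional[int]  # None when the ID carries no ".vN" suffix


def parse_dataset_id(dataset_id: str) -> DatasetId:
    """Split a dotted dataset ID into named facets and an optional version."""
    # Tolerate a trailing period picked up from running text.
    parts = dataset_id.strip().rstrip(".").split(".")
    version = None
    last = parts[-1]
    if last.startswith("v") and last[1:].isdigit():
        version = int(parts.pop()[1:])
    if len(parts) != len(FACETS):
        raise ValueError(f"expected {len(FACETS)} facets, got {len(parts)}")
    return DatasetId(dict(zip(FACETS, parts)), version)
```

With this, the two IDs above differ only in the ensemble facet and both carry version 2.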

Would it make sense to rerun esgpublish with the --publish option?

Thanks and Happy New Year,
Sergey

On 12/21/2011 08:41 PM, Drach, Bob wrote:
> Hi Sergey,
>
> The way I would recommend adding new files to an existing dataset is as
> follows:
>
> - Unpublish the previous dataset from the gateway and thredds
>
> % esgunpublish cmip5.output1.NOAA-GFDL.GFDL-CM3.historical.mon.atmos.Amon.r1i1p1
>
> - Add the new files to the existing mapfile for the dataset they are being
> added to.
>
> - Republish with the expanded mapfile:
>
> % esgpublish --read-files --map newmap.txt --project cmip5 --thredds --publish
>
> The publisher will:
> - not rescan existing files, only the new files
> - create a new version to reflect the additional files
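
The "add the new files to the existing mapfile" step in the list above can be sketched as follows. This is an illustration under assumptions, not the publisher's own tooling: it assumes each mapfile line has the form "dataset_id | file_path | size | ..." (check this against what esgscan_directory actually writes on your node), and merge_mapfile_lines is a hypothetical helper name.

```python
from typing import Iterable, List


def _path_of(line: str) -> str:
    """Return the file path, assumed to be the second '|'-separated field."""
    fields = [f.strip() for f in line.split("|")]
    if len(fields) < 2:
        raise ValueError(f"not a mapfile line: {line!r}")
    return fields[1]


def merge_mapfile_lines(existing: Iterable[str], new: Iterable[str]) -> List[str]:
    """Append new entries to the existing ones, skipping blank lines and
    any file path that is already listed (the first occurrence wins)."""
    merged: List[str] = []
    seen = set()
    for line in list(existing) + list(new):
        line = line.strip()
        if not line:
            continue
        path = _path_of(line)
        if path in seen:
            continue
        seen.add(path)
        merged.append(line)
    return merged
```

The merged lines would then be written out as the expanded mapfile (newmap.txt in the command above) before republishing.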
>
>
> Alternatively you can create a mapfile with *only* the new files (using
> esgscan_directory), then republish using the --update option.
>
> --Bob
>
>
> On 12/21/11 8:40 AM, "Serguei Nikonov"<serguei.nikonov at noaa.gov>  wrote:
>
>> Hi Nate,
>>
>> Unfortunately this is not the only dataset I have a problem with; there are
>> at least 5 more. Should I unpublish them locally (db, thredds) and then create
>> a new version containing the full set of files? What is the official way to
>> update a dataset?
>>
>> Thanks,
>> Sergey
>>
>>
>> On 12/20/2011 07:06 PM, Nathan Wilhelmi wrote:
>>> Hi Bob/Mike,
>>>
>>> I believe the problem is that when files were added, the timestamp on the
>>> dataset wasn't updated.
>>>
>>> The triple store will only harvest datasets that have files and an updated
>>> timestamp after the last harvest.
>>>
>>> So what likely happened is that the dataset was created without files, so it
>>> wasn't initially harvested. Files were subsequently added, but the timestamp
>>> wasn't updated, so it was still not a candidate for harvesting.
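
The harvesting rule described above can be written as a small predicate. This is only a sketch of the rule as stated in this message, not the gateway's actual implementation; the function and parameter names (apart from date_updated) are hypothetical.

```python
from datetime import datetime
from typing import Optional


def is_harvest_candidate(file_count: int,
                         date_updated: datetime,
                         last_harvest: Optional[datetime]) -> bool:
    """A dataset is harvested only if it has files AND its timestamp is
    newer than the last harvest (or no harvest has run yet)."""
    if file_count == 0:
        return False
    if last_harvest is None:
        return True
    return date_updated > last_harvest
```

Under this rule, a dataset created empty, skipped by one harvest, and then given files without a timestamp change stays invisible until date_updated is moved past the last harvest time.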
>>>
>>> Can you update the date_updated timestamp for the dataset in question and
>>> then trigger the RDF harvesting? I believe the dataset will show up then.
>>>
>>> Thanks!
>>> -Nate
>>>
>>> On 12/20/2011 11:49 AM, Serguei Nikonov wrote:
>>>> Hi Mike,
>>>>
>>>> I am a member of the data publishers group. I have been publishing a
>>>> considerable amount of data without this kind of trouble; it occurred only
>>>> when I tried to add some files to an existing dataset. Publishing from
>>>> scratch works fine for me.
>>>>
>>>> Thanks,
>>>> Sergey
>>>>
>>>> On 12/20/2011 01:29 PM, Ganzberger, Michael wrote:
>>>>> Hi Serguei,
>>>>>
>>>>> That task is on a scheduler and will re-run every 10 minutes. If your data
>>>>> does not appear after that time then perhaps there is another issue. One
>>>>> issue could be that publishing to the gateway requires that you have the
>>>>> role of "Data Publisher":
>>>>>
>>>>> "check that the account is member of the proper group and has the special
>>>>> role of Data Publisher"
>>>>>
>>>>> http://esgf.org/wiki/ESGFNode/FAQ
>>>>>
>>>>> Mike
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Serguei Nikonov [mailto:serguei.nikonov at noaa.gov]
>>>>> Sent: Tuesday, December 20, 2011 10:12 AM
>>>>> To: Ganzberger, Michael
>>>>> Cc: Stéphane Senesi; Drach, Bob; go-essp-tech at ucar.edu
>>>>> Subject: Re: [Go-essp-tech] Publishing dataset with option --update
>>>>>
>>>>> Hi Mike,
>>>>>
>>>>> thanks for the suggestion, but I don't have any privileges to do anything on
>>>>> the gateway. I am just publishing data on the GFDL data node.
>>>>>
>>>>> Regards,
>>>>> Sergey
>>>>>
>>>>> On 12/20/2011 01:05 PM, Ganzberger, Michael wrote:
>>>>>>
>>>>>>
>>>>>> Hi Serguei,
>>>>>>
>>>>>> The following entry from the gateway FAQ may help you:
>>>>>> http://esgf.org/wiki/Cmip5Gateway/FAQ
>>>>>>
>>>>>>
>>>>>>
>>>>>> "The search does not reflect the latest DB changes I've made
>>>>>>
>>>>>> You have to manually trigger the 3store harvesting. Logging as root and go
>>>>>> to Admin->"Gateway Scheduled Tasks"->"Run tasks" and restart the job named
>>>>>> RDFSynchronizationJobDetail"
>>>>>>
>>>>>> Mike Ganzberger
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: go-essp-tech-bounces at ucar.edu [mailto:go-essp-tech-bounces at ucar.edu]
>>>>>> On Behalf Of Stéphane Senesi
>>>>>> Sent: Tuesday, December 20, 2011 9:42 AM
>>>>>> To: Serguei Nikonov
>>>>>> Cc: Drach, Bob; go-essp-tech at ucar.edu
>>>>>> Subject: Re: [Go-essp-tech] Publishing dataset with option --update
>>>>>>
>>>>>> Serguei
>>>>>>
>>>>>> We have for some time now experienced similar problems when publishing
>>>>>> to the PCMDI gateway, i.e. not getting a "SUCCESS" message when
>>>>>> publishing. Sometimes files are actually published (or at least
>>>>>> accessible through the gateway, although esg_list_datasets reports
>>>>>> their status as "START_PUBLISHING"), sometimes not. One
>>>>>> hypothesis is that the load on the PCMDI gateway causes the problem. We
>>>>>> haven't yet got confirmation from Bob.
>>>>>>
>>>>>> In contrast to your case, this happens when publishing a dataset from
>>>>>> scratch (I mean, not an update)
>>>>>>
>>>>>> Best regards (do not expect any feedback from me before early January, though)
>>>>>>
>>>>>> S
>>>>>>
>>>>>>
>>>>>> Serguei Nikonov wrote, On 20/12/2011 18:11:
>>>>>>> Hi Bob,
>>>>>>>
>>>>>>> I needed to add some missing variables to an existing dataset, and I found
>>>>>>> the --update option of the esgpublish command. When I tried it, I got
>>>>>>> normal messages like
>>>>>>> INFO 2011-12-20 11:21:00,893 Publishing:
>>>>>>> cmip5.output1.NOAA-GFDL.GFDL-CM3.historical.mon.atmos.Amon.r1i1p1, parent = pcmdi.GFDL
>>>>>>> INFO 2011-12-20 11:21:07,564 Result: PROCESSING
>>>>>>> INFO 2011-12-20 11:21:11,209 Result: PROCESSING
>>>>>>> ....
>>>>>>>
>>>>>>> but nothing happened on the gateway: the new variables are not there. The
>>>>>>> files corresponding to these variables are in the database and in the
>>>>>>> THREDDS catalog but apparently were not published on the gateway.
>>>>>>>
>>>>>>> I used the command line
>>>>>>> esgpublish --update --keep-version --map <map_file> --project cmip5 --noscan --publish
>>>>>>>
>>>>>>> Does the map file need to be in some specific format to make it work in the
>>>>>>> mode I need?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Sergey Nikonov
>>>>>>> GFDL
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> GO-ESSP-TECH mailing list
>>>>>>> GO-ESSP-TECH at ucar.edu
>>>>>>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>