[Go-essp-tech] Publishing dataset with option --update

Drach, Bob drach1 at llnl.gov
Tue Jan 3 12:40:50 MST 2012


Hi Jamie, Stephen,

I agree with Stephen. The understanding has been that adding new files to a
dataset also triggers a new dataset version. It is possible to override this
behavior, but the default is to generate a new version number when files
have been added to, modified in, or deleted from an existing dataset.

Regards,

--Bob
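
The default rule described above (any added, modified, or deleted file
means a new dataset version) can be sketched in Python. This is only an
illustration and not the publisher's actual logic: the function names and
the path-to-checksum dictionaries are assumptions made for the example, and
the version string follows the vYYYYMMDD form mentioned later in the thread.

```python
from datetime import date


def needs_new_version(published, candidate):
    """Return True if the candidate file set differs from the published one.

    Both arguments are hypothetical dicts mapping file path -> checksum.
    """
    added = candidate.keys() - published.keys()
    deleted = published.keys() - candidate.keys()
    # Files present in both sets whose checksum has changed.
    modified = {
        path
        for path in published.keys() & candidate.keys()
        if published[path] != candidate[path]
    }
    return bool(added or deleted or modified)


def drs_version(day):
    """Format a date as a DRS-style version label, e.g. v20120103."""
    return day.strftime("v%Y%m%d")
```

For example, needs_new_version({"a.nc": "1"}, {"a.nc": "1", "b.nc": "2"})
is True because a file was added, while two identical file sets give False.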


On 12/30/11 11:12 AM, "stephen.pascoe at stfc.ac.uk"
<stephen.pascoe at stfc.ac.uk> wrote:

> Hi Jamie,
> 
> My understanding was that we had agreed that once a dataset version had been
> published (i.e. is available at a Gateway) no files would be
> added/deleted/changed in that version; any changes to the dataset would
> trigger a new version.  This is the only sensible way of having versions at
> all and breaking this rule means users can't be confident that their version
> is consistent with someone else's at the same version number.
> 
> I'm sure you already knew this was BADC's position and want to clarify what
> other centres understand by the versioning rules.  No-one has ever
> contradicted my understanding in emails or telcos but in the end it is up to
> individual datanode administrators to keep to these rules.
> 
> Cheers,
> Stephen.
> 
> ---
> Stephen Pascoe  +44 (0)1235 445980
> Centre of Environmental Data Archival
> STFC Rutherford Appleton Laboratory, Harwell Oxford, Didcot OX11 0QX, UK
> 
> -----Original Message-----
> From: go-essp-tech-bounces at ucar.edu [mailto:go-essp-tech-bounces at ucar.edu] On
> Behalf Of Kettleborough, Jamie
> Sent: 22 December 2011 09:24
> To: Drach, Bob; Serguei Nikonov; Nathan Wilhelmi
> Cc: go-essp-tech at ucar.edu
> Subject: Re: [Go-essp-tech] Publishing dataset with option --update
> 
> Hello Karl, Bob,
> 
> Sorry to labour this, but can you clarify (I don't know enough about map files
> and esgpublish to know the answer).  Do you expect addition of new files to a
> currently published data set to trigger a new DRS publication version (so the
> vYYYYMMDD bit changes in the DRS)?
> 
> If not, can you clarify under what circumstances you expect data publishers to
> generate new DRS publication versions, and when it's OK for them to update a
> current version.  [I think you - users, data providers, data node admins etc -
> get a better view of the history of the dataset if you always update the
> version - which I think ties in with what Ashish said.  Even if this isn't ESG
> policy I think it is a good policy for CMIP5.]
> 
> I *suspect* there may be a communication issue here and not everyone has the
> same understanding of what should happen.
> 
> Thanks,
> 
> Jamie 
> 
>  
> 
>> -----Original Message-----
>> From: go-essp-tech-bounces at ucar.edu
>> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of Drach, Bob
>> Sent: 22 December 2011 01:42
>> To: Serguei Nikonov; Nathan Wilhelmi
>> Cc: go-essp-tech at ucar.edu
>> Subject: Re: [Go-essp-tech] Publishing dataset with option --update
>> 
>> Hi Sergey,
>> 
>> The way I would recommend adding new files to an existing dataset is as
>> follows:
>> 
>> - Unpublish the previous dataset from the gateway and thredds
>> 
>> % esgunpublish cmip5.output1.NOAA-GFDL.GFDL-CM3.historical.mon.atmos.Amon.r1i1p1
>> 
>> - Add the new files to the existing mapfile for the dataset they are
>> being added to.
>> 
>> - Republish with the expanded mapfile:
>> 
>> % esgpublish --read-files --map newmap.txt --project cmip5 --thredds --publish
>> 
>> The publisher will:
>> - not rescan existing files, only the new files
>> - create a new version to reflect the additional files
>> 
>> 
>> Alternatively, you can create a mapfile with *only* the new files (using
>> esgscan_directory), then republish using the --update option.
>> 
>> --Bob
>> 
>> 
>> On 12/21/11 8:40 AM, "Serguei Nikonov"
>> <serguei.nikonov at noaa.gov> wrote:
>> 
>>> Hi Nate,
>>> 
>>> unfortunately this is not the only dataset I have a problem with - there
>>> are at least 5 more. Should I unpublish them locally (db, thredds) and
>>> then create a new version containing the full set of files? What is the
>>> official way to update a dataset?
>>> 
>>> Thanks,
>>> Sergey
>>> 
>>> 
>>> On 12/20/2011 07:06 PM, Nathan Wilhelmi wrote:
>>>> Hi Bob/Mike,
>>>> 
>>>> I believe the problem is that when files were added the timestamp on
>>>> the dataset wasn't updated.
>>>> 
>>>> The triple store will only harvest datasets that have files and an
>>>> updated timestamp after the last harvest.
>>>> 
>>>> So what likely happened is the dataset was created without files, so
>>>> it wasn't initially harvested. Files were subsequently added, but the
>>>> timestamp wasn't updated, so it was still not a candidate for
>>>> harvesting.
>>>> 
>>>> Can you update the date_updated timestamp for the dataset in question
>>>> and then trigger the RDF harvesting? I believe the dataset will show
>>>> up then.
>>>> 
>>>> Thanks!
>>>> -Nate
>>>> 
>>>> On 12/20/2011 11:49 AM, Serguei Nikonov wrote:
>>>>> Hi Mike,
>>>>> 
>>>>> I am a member of the data publishers group. I have been publishing a
>>>>> considerable amount of data without this kind of trouble; this
>>>>> problem occurred only when I tried to add some files to an existing
>>>>> dataset. Publishing from scratch works fine for me.
>>>>> 
>>>>> Thanks,
>>>>> Sergey
>>>>> 
>>>>> On 12/20/2011 01:29 PM, Ganzberger, Michael wrote:
>>>>>> Hi Serguei,
>>>>>> 
>>>>>> That task is on a scheduler and will re-run every 10 minutes. If
>>>>>> your data does not appear after that time then perhaps there is
>>>>>> another issue. One issue could be that publishing to the gateway
>>>>>> requires that you have the role of "Data Publisher":
>>>>>> 
>>>>>> "check that the account is member of the proper group and has the
>>>>>> special role of Data Publisher"
>>>>>> 
>>>>>> http://esgf.org/wiki/ESGFNode/FAQ
>>>>>> 
>>>>>> Mike
>>>>>> 
>>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Serguei Nikonov [mailto:serguei.nikonov at noaa.gov]
>>>>>> Sent: Tuesday, December 20, 2011 10:12 AM
>>>>>> To: Ganzberger, Michael
>>>>>> Cc: Stéphane Senesi; Drach, Bob; go-essp-tech at ucar.edu
>>>>>> Subject: Re: [Go-essp-tech] Publishing dataset with option --update
>>>>>> 
>>>>>> Hi Mike,
>>>>>> 
>>>>>> thanks for the suggestion, but I don't have any privileges to do
>>>>>> anything on the gateway.
>>>>>> I am just publishing data on the GFDL data node.
>>>>>> 
>>>>>> Regards,
>>>>>> Sergey
>>>>>> 
>>>>>> On 12/20/2011 01:05 PM, Ganzberger, Michael wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> Hi Serguei,
>>>>>>> 
>>>>>>> I'd like to suggest the following, which may help you, from
>>>>>>> http://esgf.org/wiki/Cmip5Gateway/FAQ
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> "The search does not reflect the latest DB changes I've made
>>>>>>> 
>>>>>>> You have to manually trigger the 3store harvesting. Logging as
>>>>>>> root and go to Admin->"Gateway Scheduled Tasks"->"Run tasks" and
>>>>>>> restart the job named RDFSynchronizationJobDetail"
>>>>>>> 
>>>>>>> Mike Ganzberger
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: go-essp-tech-bounces at ucar.edu
>>>>>>> [mailto:go-essp-tech-bounces at ucar.edu]
>>>>>>> On Behalf Of Stéphane Senesi
>>>>>>> Sent: Tuesday, December 20, 2011 9:42 AM
>>>>>>> To: Serguei Nikonov
>>>>>>> Cc: Drach, Bob; go-essp-tech at ucar.edu
>>>>>>> Subject: Re: [Go-essp-tech] Publishing dataset with option --update
>>>>>>> 
>>>>>>> Serguei
>>>>>>> 
>>>>>>> We have for some time now experienced similar problems when
>>>>>>> publishing to the PCMDI gateway, i.e. not getting a "SUCCESS"
>>>>>>> message when publishing. Sometimes files are actually published
>>>>>>> (or at least accessible through the gateway, their status being
>>>>>>> "START_PUBLISHING" according to the esg_list_datasets report),
>>>>>>> sometimes not. One hypothesis is that the load on the PCMDI Gateway
>>>>>>> causes the problem. We haven't yet had confirmation from Bob.
>>>>>>> 
>>>>>>> In contrast to your case, this happens when publishing a dataset
>>>>>>> from scratch (I mean, not an update).
>>>>>>> 
>>>>>>> Best regards (do not expect any feedback from me until early
>>>>>>> January, though)
>>>>>>> 
>>>>>>> S
>>>>>>> 
>>>>>>> 
>>>>>>> Serguei Nikonov wrote, On 20/12/2011 18:11:
>>>>>>>> Hi Bob,
>>>>>>>> 
>>>>>>>> I needed to add some missing variables to an existing dataset and I
>>>>>>>> found an option --update in the esgpublish command. When I tried it
>>>>>>>> I got normal messages like
>>>>>>>> 
>>>>>>>> INFO 2011-12-20 11:21:00,893 Publishing: cmip5.output1.NOAA-GFDL.GFDL-CM3.historical.mon.atmos.Amon.r1i1p1, parent = pcmdi.GFDL
>>>>>>>> INFO 2011-12-20 11:21:07,564 Result: PROCESSING
>>>>>>>> INFO 2011-12-20 11:21:11,209 Result: PROCESSING
>>>>>>>> ....
>>>>>>>> 
>>>>>>>> but nothing happened on the gateway - the new variables are not there.
>>>>>>>> The files corresponding to these variables are in the database and in
>>>>>>>> the THREDDS catalog but apparently were not published on the gateway.
>>>>>>>> 
>>>>>>>> I used the command line
>>>>>>>> esgpublish --update --keep-version --map <map_file> --project cmip5 --noscan --publish
>>>>>>>> 
>>>>>>>> Should the map file be in some specific format to make it work in
>>>>>>>> the mode I need?
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Sergey Nikonov
>>>>>>>> GFDL
>>>>>>>> 
>>>>>>>> 
>>>>>>>> _______________________________________________
>>>>>>>> GO-ESSP-TECH mailing list
>>>>>>>> GO-ESSP-TECH at ucar.edu
>>>>>>>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>> 
>> 


