[Go-essp-tech] Versioning in CMIP5 including QC procedure

Drach, Bob drach1 at llnl.gov
Mon Apr 18 13:44:04 MDT 2011


I'm also in agreement on the salient points:

- QC is an essential part of CMIP5
- CMIP5 is using date-style versioning. (Yes, I'll remind the data node
publishers).

I don't think the implementation issues are (or should be) showstoppers.
We'll find a way to make the QC tool work in our environment.

Regards,

Bob


On 4/18/11 1:21 AM, "stephen.pascoe at stfc.ac.uk" <stephen.pascoe at stfc.ac.uk>
wrote:

> I completely agree with Martin.  For us the key event that dictates the
> dataset version is when the data is placed in our archive -- the date of that
> event is what becomes the version number.
> 
> I also notice that none of the CCCMA, NASA-GISS or BCC datanodes is using
> date-versioning!
> 
> Stephen.
> 
> ---
> Stephen Pascoe  +44 (0)1235 445980
> Centre of Environmental Data Archival
> STFC Rutherford Appleton Laboratory, Harwell Oxford, Didcot OX11 0QX, UK
> 
> 
> -----Original Message-----
> From: go-essp-tech-bounces at ucar.edu [mailto:go-essp-tech-bounces at ucar.edu] On
> Behalf Of martin.juckes at stfc.ac.uk
> Sent: 18 April 2011 08:48
> To: martina.stockhause at zmaw.de; drach1 at llnl.gov; taylor13 at llnl.gov; Lawrence,
> Bryan (STFC,RAL,RALSP)
> Cc: go-essp-tech at ucar.edu; michael.lautenschlager at zmaw.de; painter1 at llnl.gov
> Subject: Re: [Go-essp-tech] Versioning in CMIP5 including QC procedure
> 
> Hello All,
> 
> I'd like to back up Martina on this. We do have an agreed version control
> system defined in the DRS document (or, more precisely, the DRS document
> defines an aspect of the version control which should be implemented -- namely
> a subdirectory level which indicates the version).
> 
> We agreed that the version of a publication dataset should be reflected in the
> directory structure, and it is clear that this requires that the version of
> the publication dataset be determined before running the ESG publisher, not by
> the ESG publisher. The QC software runs on the file system rather than
> accessing data through the data node software, so consistency between the
> file-system layout and the publication units is essential. While it is
> possible to imagine archives in which all access to data is through the data
> node and the file-system layout is of no interest to anyone but the data node
> development team, that is clearly not the situation here -- no matter how
> desirable it might be.
> 
> QC is a very important part of ensuring consistency of quality in the archive;
> we really need to make it work.
> 
> At BADC, we started doing the layout on the same machine as the publishing,
> but are now moving it to a different machine -- there is no fundamental
> problem with this (the machine on which the layout is done can, of course,
> communicate with the machine on which the publishing is done).
> 
> regards,
> Martin
> 
> 
> From: go-essp-tech-bounces at ucar.edu [go-essp-tech-bounces at ucar.edu] on behalf
> of Martina Stockhause [martina.stockhause at zmaw.de]
> Sent: 18 April 2011 06:57
> To: Drach, Bob
> Cc: GO-ESSP; Painter, Jeff; michael.lautenschlager
> Subject: Re: [Go-essp-tech] Versioning in CMIP5 including QC procedure
> 
> Hi, Bob,
> 
> since not every published dataset is part of the DOI (on the level of
> experiment), I have to keep track of versions as well, on the dataset and on
> the experiment (DOI) level. The inhomogeneity of the dataset version syntax is
> more a problem of version control within the QC than one of the QC L2 checker,
> the QC L2 analyzer, or the QC L2 result export for QC L3.
> 
> I do not care whether the homogeneous version syntax is yours or that of BADC
> and DKRZ (though the latter would save me adaptation effort), only *that* it
> is homogeneous. Maybe you could talk to Stephen to find an agreement on the
> version syntax / version handling.
> 
> I am sorry that I have to insist.
> 
> Best wishes,
> Martina
> 
> 
> On 04/15/2011 11:18 PM, Drach, Bob wrote:
> Hi Martina,
> 
> There are a lot of things I like about the layout tool, but one aspect I'm not
> happy with is that it chooses a dataset version. IMO that logic should reside
> in the publisher, which has access to the history of dataset publication and
> dataset definitions. In our environment the layout is done on a different
> machine than publication, and does not have access to that history.
> Consequently we support the DRS file layout with the exception of dataset
> version numbers, which are defined later in the processing stream.
> 
> Would it be difficult to provide an option for the QC tool to ignore
> extraneous directories (not defined by DRS)?
> 
> Best regards,
> 
> Bob
> 
> 
> On 4/15/11 4:52 AM, "Martina Stockhause"
> <martina.stockhause at zmaw.de> wrote:
> 
> 
>  Hi, Dean, Karl, and Bob,
> 
>  there was a discussion started about different types of versioning inside
> ESGF for CMIP5 data on the QC request tracker (see:
> http://redmine.dkrz.de/collaboration/issues/321). Jeff wrote:
> 
> "Bob Drach corrected me on one issue: our PCMDI version numbers are not DRS
> version numbers, they are just a tool for keeping track of the data received
> at PCMDI. Thus these version numbers are generated at PCMDI, while DRS version
> numbers are generated by the data producer. PCMDI does not use Stephen's
> versioning tool, or the DRS-style version numbers."
> 
> 
> 
> Is that right? I thought that we agreed on a versioning procedure using
> Stephen's tool.
> 
> 
> 
> And I do have a problem with different ESG publication procedures (QC level 1
> checks), i.e. different QC procedures at the three partners. Additionally, the
> inconsistent naming conventions between WDCC / BADC on one side and PCMDI on
> the other side cannot be handled by the QC Workflow. Since we do a federated
> QC in three locations, we need not only to use the same tools with the same
> configurations so that QC results are comparable, but also to use the same
> naming conventions to ensure that the overall QC process can continue through
> QC L3 / DOI publication.
> 
> 
> 
> Thus the question:
>  Could PCMDI use Stephen's tool for CMIP5 data versioning as well?
> 
> 
> 
> Best wishes,
>  Martina
> 
> 
> 
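
To make the date-style convention discussed in this thread concrete, here is a
minimal Python sketch. It is not taken from the layout tool, the ESG publisher,
or the QC software; the function names, the "vYYYYMMDD" spelling of a date
version, and the example path are all assumptions made for illustration. It
derives a version label from the date the data was placed in the archive, and
picks out the path components that look like a date-style version directory,
which is roughly what a file-system-based checker would need in order to tell
date versions apart from other directory names.

```python
import re
from datetime import date
from pathlib import PurePosixPath

# Assumed spelling of a date-style version directory: "v" + YYYYMMDD.
DATE_VERSION = re.compile(r"^v(\d{4})(\d{2})(\d{2})$")


def date_version(ingest_date):
    """Build a version label from the date the data entered the archive."""
    return "v" + ingest_date.strftime("%Y%m%d")


def is_date_version(component):
    """Return True if a path component is a valid date-style version."""
    match = DATE_VERSION.match(component)
    if match is None:
        return False
    year, month, day = (int(group) for group in match.groups())
    try:
        date(year, month, day)  # rejects impossible dates such as month 13
    except ValueError:
        return False
    return True


def version_components(path):
    """List the components of a POSIX-style path that are date versions."""
    return [part for part in PurePosixPath(path).parts if is_date_version(part)]


if __name__ == "__main__":
    # Hypothetical path; the directory names are placeholders, not real data.
    example = "cmip5/output1/SOME-INSTITUTE/some-model/v20110418/tas"
    print(date_version(date(2011, 4, 18)))
    print(version_components(example))
```

A dataset directory with no matching component (or with a counter-style name
such as "v1") would yield an empty list here, which is one simple way to flag
layouts that do not follow date-style versioning.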


