[Go-essp-tech] Proposed version directory structure document

stephen.pascoe at stfc.ac.uk
Fri Apr 16 02:13:18 MDT 2010


Hi Bob,
 
Thanks for promptly commenting on the document.  Clarifying that the
publisher has these features is great news and I'm sorry that, in trying
to give everyone time to digest the document by Tuesday, I didn't have
time to confirm the facts with you.  I'm hoping this way any errors will
come out in the wash.
 
The main thing I missed was the ability to create multiple THREDDS
catalogues for a dataset (or 1 catalogue per dataset version).  Omitting
this feature felt like a fundamental difference in model to the DRS.
I need to work out how to do this now and I'll revise the version
directory structure document too.  Phil Bentley has recommended a
different structure that has some advantages so the document will
probably look very different next time.
 
Incidentally, I'm increasingly impressed with the ESG publisher and I'm
really enjoying working with it.  The stuff you've done with project
handler plugins in the latest release strengthens my impression that it
is a tool we will be using for a long time.
 
Cheers,
Stephen.
 
---
Stephen Pascoe  +44 (0)1235 445980
British Atmospheric Data Centre
Rutherford Appleton Laboratory
 

________________________________

From: Bob Drach [mailto:drach1 at llnl.gov] 
Sent: 16 April 2010 00:14
To: Pascoe, Stephen (STFC,RAL,SSTD)
Cc: go-essp-tech at ucar.edu
Subject: Re: [Go-essp-tech] Proposed version directory structure
document


Hi Stephen, 


Let me clarify a few points in the description of ESG Publisher:



The document states: "ESG Publisher version system is built around
mutable datasets.  It does not attempt to maintain references to
previous data and the dataset version number is not part of the dataset
id unless the publisher is configured to include it from the dataset
metadata.  This means that it is not straight forward at this time to
publish multiple versions of an atomic dataset unless each version is
published as a separate dataset.  This approach would effectively ignore
ESG Publisher's version system and manage all versions independently."

- As of Version 2 the unit of publication is in fact a 'dataset
version', terminology that came out of the December meeting in Boulder.
A dataset version is an immutable object which can represent a 'DRS
dataset including version number'. The published 'dataset version'
itself has an identifier which typically consists of dataset_id+version
number; this appears in the THREDDS catalog. As you stated in the
document, whether or not the published dataset corresponds to a DRS
dataset is a matter of publisher configuration, not an inherent property
of the publisher.

- The node database does in fact maintain references to the composition
of previous dataset versions. It is possible to have multiple versions
published simultaneously, to list all published versions of a dataset,
and for any given dataset version the files contained in that version
can be listed.

- The intention of the publisher design is to automate versioning as
much as possible. A 'dataset' is considered to be a collection of
dataset versions. Consequently, 'publishing a dataset' really means
'publishing a dataset version where the version number is incremented
relative to the previous version.' Similarly, 'unpublishing' a dataset
by default unpublishes all versions of a dataset. The terminology
dataset_id#n can be used to refer to a specific version.
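
The model described in the three points above can be sketched in a few
lines of Python.  This is only an illustration of the concepts, not the
publisher's actual code or database schema: the class names, method
names, the example dataset id, and the choice to start version numbers
at 1 are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetVersion:
    """Immutable snapshot: a dataset id, a version number and its files."""
    dataset_id: str
    version: int
    files: tuple

    @property
    def identifier(self):
        # Hypothetical rendering of the dataset_id#n notation.
        return f"{self.dataset_id}#{self.version}"


@dataclass
class Node:
    """Toy stand-in for the node database (assumed, not the real schema)."""
    # Maps dataset_id -> {version number: DatasetVersion}
    versions: dict = field(default_factory=dict)

    def publish(self, dataset_id, files):
        # Publishing always creates a new version, incremented relative
        # to the highest existing one (starting at 1 is an assumption).
        existing = self.versions.setdefault(dataset_id, {})
        next_version = max(existing, default=0) + 1
        dv = DatasetVersion(dataset_id, next_version, tuple(files))
        existing[next_version] = dv
        return dv

    def list_versions(self, dataset_id):
        # All versions currently published for a dataset.
        return sorted(self.versions.get(dataset_id, {}))

    def resolve(self, ref):
        # Look up one specific version from a "dataset_id#n" reference.
        dataset_id, _, n = ref.partition("#")
        return self.versions[dataset_id][int(n)]

    def unpublish(self, ref):
        # "dataset_id#n" removes one version; a bare dataset_id removes
        # every version, which mirrors the default described above.
        dataset_id, sep, n = ref.partition("#")
        if sep:
            self.versions[dataset_id].pop(int(n))
        else:
            self.versions.pop(dataset_id, None)
```

Publishing the same (hypothetical) dataset id twice through this sketch
yields versions 1 and 2, both listed at once, and the file list of
either one can still be retrieved through its dataset_id#n reference.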




In short, there is no fundamental mismatch between the DRS model and the
ESG publisher.




Best regards,




Bob








On Apr 15, 2010, at 3:24 AM, <stephen.pascoe at stfc.ac.uk> wrote:



	Hi everyone,
	 
	Attached is my view on how we should structure the archive to
support multiple versions.  It divides into 2 main sections, the first
is a fairly lengthy summary of why this problem isn't solved yet in
terms of the differences between the ESG datanode software and the DRS
document.  The second section lays out the proposed structure and how we
would manage symbolic links and moving from one version to another.  I
restrict myself to directories below the atomic dataset level.  
	 
	Lots of issues are left to resolve, in particular how the ESG
publisher can make use of this structure.  I'll try and draw attention
to these points in the agenda for Tuesday's telco which will follow
later today.
	 
	Cheers,
	Stephen.
	 
	---
	Stephen Pascoe  +44 (0)1235 445980
	British Atmospheric Data Centre
	Rutherford Appleton Laboratory
	 
	
	


	
	
	
	<ESGF_version_structure.odt>
	



