[Go-essp-tech] [esg-node-dev] Use of <metadata> element in THREDDS catalogs

stephen.pascoe at stfc.ac.uk
Thu Jun 2 01:58:11 MDT 2011


Gavin,

>> In my opinion OpenDAP (or something similar) is the only way
>
> actually... that may not be impossible.
> (pondering...)
> :-)

I would postulate it is nearly impossible, at least within the constraints of NetCDF3-API-compatible OPeNDAP.  OPeNDAP is a protocol for accessing a single "dataset"; the NetCDF metadata of that dataset is delivered in a single HTTP GET request.  Do you want to deliver the metadata for the entire archive in a single request?  If you mean "or something similar", then yes, of course we should be aiming at a system that allows users to access data seamlessly.  That's what we are trying to build, right?

On Estani's wider point, I think it's not black and white.  Many users don't want to bother with files when the file divisions appear arbitrary (e.g. splitting a field into time chunks), provided the tools are good enough to support more sophisticated access (Sebastien's point).  The concept of the DRS atomic dataset is meant to define sensible units of data that the user will understand; remember atomic datasets? :-)

However, all users understand the need to divide information into units to make it manageable.  Everyone gets used to using files at some level -- PDFs, Excel spreadsheets, source files, etc.

And there is one specific use case that requires users to be aware of files: verifying data integrity and provenance.  A user should be able to compute the checksum of the data they downloaded and verify it was downloaded correctly.  If they find a random NetCDF file on their hard drive, they should be able to ask ESGF where it came from by looking up the checksum or tracking_id.  An interesting future area of research could be an algorithm for hashing virtual NetCDF datasets, but we aren't there yet.
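
As a rough illustration of the file-level check described above, here is a minimal Python sketch.  It assumes the archive publishes a hex checksum per file; the choice of MD5 as the default, the function names, and the checksum_index mapping (checksum to dataset id) are all hypothetical and not part of any ESGF interface.

```python
import hashlib


def file_checksum(path, algorithm="md5", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large NetCDF files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, published_checksum, algorithm="md5"):
    """True if the local file matches the checksum published for it."""
    return file_checksum(path, algorithm) == published_checksum.strip().lower()


def lookup_origin(checksum_index, path, algorithm="md5"):
    """Look up a stray file in a hypothetical {checksum: dataset id} mapping."""
    return checksum_index.get(file_checksum(path, algorithm))
```

Both checks only make sense per file, which is the point: a virtual aggregation has no single byte stream to hash in this way.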

My instinct is that we should accept that datasets are collections of files and not try to completely hide this idea.  However, most of the system should focus on datasets because they are more flexible.

Stephen.

P.S. This is exactly what I want to cover in the data model interface group so I'm glad there is interest out there.

---
Stephen Pascoe  +44 (0)1235 445980
Centre of Environmental Data Archival
STFC Rutherford Appleton Laboratory, Harwell Oxford, Didcot OX11 0QX, UK

From: Gavin M. Bell [mailto:gavin at llnl.gov]
Sent: 01 June 2011 22:16
To: Estanislao Gonzalez
Cc: Cinquini, Luca (3880); Pascoe, Stephen (STFC,RAL,RALSP); go-essp-tech at ucar.edu; esg-node-dev at lists.llnl.gov
Subject: Re: [Go-essp-tech] [esg-node-dev] Use of <metadata> element in THREDDS catalogs

actually... that may not be impossible.
(pondering...)
:-)

On 6/1/11 9:24 AM, Estanislao Gonzalez wrote:
In my opinion OpenDAP (or something similar) is the only way to go... it would be great if it could act on an archive wide aggregation without any performance hit. Maybe some day...



--
Gavin M. Bell
Lawrence Livermore National Labs
--

 "Never mistake a clear view for a short distance."
               -Paul Saffo

(GPG Key - http://rainbow.llnl.gov/dist/keys/gavin.asc)
 A796 CE39 9C31 68A4 52A7  1F6B 66B7 B250 21D5 6D3E
