[GO-ESSP] gridded data management systems

Steve Hankin Steven.C.Hankin at noaa.gov
Wed Nov 24 09:32:26 MST 2004


Hi Jon,

When you refer to "standard flat files" are you including formats like netCDF
and HDF5 under that title?  This is often a source of terminology confusion as
"flat files" sometimes refers to "anything but a database".  Others regard
n-dimensional, multi-variate data standards like netCDF and HDF to be
alternatives to "flat" IEEE files.

The question that you pose is essentially to weigh the pros and cons of managing
your data with a commercial database that has been enhanced to handle grids, or
to handle your data with netCDF (which in the next version will merge with HDF5
to handle compression, tiles, etc.) and the free netCDF utilities.  (Presumably
from-scratch development with IEEE binary files is not the way to go.)  You
mentioned some down-sides to the commercial software route (cost,
"proprietariness" of software, dependence on a single supplier,...).   Are the
advantages of the database approach sufficient to outweigh these costs?   You
have also not mentioned network access to the data.  Is it a requirement for
the data to be OPeNDAP accessible?  Or alternatively, is access from enterprise
GIS systems at the center of your bullseye?

   * Items 1-2 are trivial for either system.  Comparative performance ... do
     you have any data?  The Barrodale product is new and one-of-a-kind.  It
     would be interesting to see some benchmarks comparing it to netCDF and
     HDF5.
   * Items 3-4 can be handled with the new FDS (Ferret Data Server) and probably
     the GDS server, as well.  Custom code may be required depending upon the
     list of projections that is desired, but these are open environments, where
     this can be added.   Item 3-4 capabilities are also available and
     presumably well supported if your database is embedded in an enterprise GIS
     framework.
   * Item 5 is probably better handled in a database environment, though it can
     also be handled (with some effort -- in various ways) in a Web service
     environment based on OPeNDAP.
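
As a rough illustration of what items 2 and 5 amount to outside a database, here
is a minimal Python sketch over a toy in-memory grid.  Everything in it is
hypothetical: the variable names, the tiny 3x3x3 temperature field, and the
latitude/longitude box standing in for "the North Atlantic" are assumptions made
for the example, not anything taken from netCDF, OPeNDAP, or the Grid DataBlade.
Plain lists are used only to keep the sketch dependency-free; a real system
would read the array from a file or a server.

```python
from datetime import date

# Hypothetical toy grid: sst[t][j][i] in degC, with its coordinate axes.
times = [date(2003, 5, 15), date(2003, 6, 15), date(2003, 7, 15)]
lats = [30.0, 40.0, 50.0]      # degrees north
lons = [-60.0, -40.0, -20.0]   # degrees east

sst = [
    [[19.0, 18.5, 18.0], [15.0, 14.5, 14.0], [10.0, 9.5, 9.0]],
    [[22.0, 21.0, 20.5], [18.0, 17.0, 16.5], [12.0, 11.5, 11.0]],
    [[25.0, 24.0, 23.0], [21.0, 19.5, 19.0], [14.0, 13.5, 13.0]],
]


def time_series(field, j, i):
    """Item 2: extract a 1-D time series at one grid point."""
    return [field[t][j][i] for t in range(len(field))]


def find_exceedances(field, times, lats, lons, threshold,
                     year, month, lat_range, lon_range):
    """Item 5: list (time, lat, lon) for every cell above `threshold`
    that falls in the given month and inside the lat/lon box."""
    hits = []
    for t, when in enumerate(times):
        if (when.year, when.month) != (year, month):
            continue
        for j, lat in enumerate(lats):
            if not lat_range[0] <= lat <= lat_range[1]:
                continue
            for i, lon in enumerate(lons):
                inside = lon_range[0] <= lon <= lon_range[1]
                if inside and field[t][j][i] > threshold:
                    hits.append((when, lat, lon))
    return hits


# Time series at the south-west corner of the toy grid.
series = time_series(sst, 0, 0)

# "SST greater than 20 degC in June 2003" inside an assumed bounding box.
hits = find_exceedances(sst, times, lats, lons, 20.0,
                        2003, 6, (0.0, 60.0), (-80.0, 0.0))
```

The slicing in `time_series` is simple index arithmetic, which is why items 1-2
are cheap in either approach.  The query in `find_exceedances` has to scan every
value in the selected month and region, which is the kind of work a database
with tiling and indexing might be able to shortcut.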

Just bouncing around the ideas.  This community will be interested to hear what
further you learn.

    - steve

====================================

Jon Blower wrote:

> Hi all,
>
> As some of you may know, we at the Reading e-Science Centre have been
> investigating some new ways to store and manage data from models of the
> oceans and atmosphere.  We have been looking at storing data in databases,
> rather than standard flat-file systems, and have over the last few months
> been evaluating IBM's Informix database with Barrodale Computing Services'
> Grid DataBlade plug-in (see http://www.resc.rdg.ac.uk/projects.php for more
> details).  Eventually this might form the back-end to our own data portal
> page (http://www.nerc-essc.ac.uk/godiva).
>
> We have found good and bad points about this system and are now wondering
> how to take things forward.  I have been considering the feasibility of
> writing (essentially from scratch) an intelligent storage/management
> application for gridded geospatial data.  The key features of this system
> would include:
>
> 1) Data would be stored in a single format but can be extracted in a variety
> of formats
> 2) Data could be sliced and subsetted in all possible ways (e.g. extraction
> of 1-D timeseries, 2-D areas, 3-D volumes/animations, 4-D data blocks) and
> extracted at different spatial and temporal resolutions
> 3) Data could be stored on the original grid (including rotated grids) but
> extracted on the grid of the user's choice
> 4) The necessary projection and interpolation would happen on the fly
> 5) The system would allow complex queries to be made (e.g. "Give me all the
> times and locations at which the sea surface temperature was greater than 20
> degC in the North Atlantic in June 2003")
>
> The systems we have looked at so far get us part, but not all, of the way
> there.  Furthermore, the system currently under evaluation (Informix/Grid
> DataBlade) is closed-source, commercial software so we can't modify it
> ourselves.  However, such database-based systems have some key advantages
> over standard flat files, notably intelligent tiling and caching, giving
> very fast retrieval of data.
>
> I was wondering whether this community would welcome an effort to create an
> open-source data management/storage system for geospatial data, perhaps as a
> plug-in to an open-source DBMS such as PostgreSQL.  I haven't found an
> existing project that answers our requirements, but please let me know if
> you know of anything (some packages seem to deal with geospatial data, but
> are not designed for _gridded_ data).  It seems that this could be of
> benefit to the GO-ESSP community, considering that any Earth System
> Portal must be backed by some kind of data store! ;-)
>
> This has been rather a long post, sorry!  Any suggestions or feedback would
> be very much appreciated.
>
> Best wishes,
> Jon
>
> --------------------------------------------------------------
> Dr Jon Blower              Tel: +44 118 378 5213 (direct line)
> Technical Director         Tel: +44 118 378 8741 (ESSC)
> Reading e-Science Centre   Fax: +44 118 378 6413
> ESSC                       Email: jdb at mail.nerc-essc.ac.uk
> University of Reading
> 3 Earley Gate
> Reading RG6 6AL, UK
> --------------------------------------------------------------
>
> _______________________________________________
> GO-ESSP mailing list
> GO-ESSP at ucar.edu
> http://mailman.ucar.edu/mailman/listinfo/go-essp

--

Steve Hankin, NOAA/PMEL -- Steven.C.Hankin at noaa.gov
7600 Sand Point Way NE, Seattle, WA 98115-0070
ph. (206) 526-6080, FAX (206) 526-6744
