[ESP] Idea for better automatic metadata harvesting...
Roland Schweitzer
rhs@cdc.noaa.gov
Tue, 28 Jan 2003 10:25:46 -0600
Folks,
I enjoyed the meeting. I've been to a string of productive meetings
recently. I don't know what I'm doing right, but I hope it continues. :)
Anyway, during the discussion we talked about metadata tools. Bryan
indicated that his group is interested in building a tool that will help
researchers create and edit metadata. During our conversation everybody
was enthusiastic about the idea that as much metadata as possible should
be automatically harvested from the existing netCDF files. I too am an
enthusiastic proponent of this idea, but I'm also a realist. In a bare
naked COARDS file there just ain't that much metadata which can be
harvested. However, many data providers add a rich set of metadata to
their netCDF files that is not part of the COARDS standard.
The idea is that any tool we construct should give the user of the tool
an easy way to configure it to harvest that additional metadata. I
worked on a project with Ted Habermann's group at NGDC to build such a
capability for their BlueAngel archive.
Look at http://www.cdc.noaa.gov/~rhs/netcdftofgdc/ for some details.
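To make the idea concrete, here is a minimal sketch of what a configurable harvest might look like. It is not the netcdftofgdc implementation. The configuration layout, the target field names, and the attribute names are all assumptions made up for illustration, and a plain dictionary stands in for the global attributes that a real tool would read from the netCDF file.

```python
# Hypothetical harvesting configuration: each target metadata field lists
# the netCDF global attribute names a data provider might use for it, in
# order of preference. All names here are illustrative assumptions.
HARVEST_CONFIG = {
    "title": ["title", "dataset_title"],
    "originator": ["institution", "source"],
    "abstract": ["description", "comment"],
}


def harvest(global_attrs, config):
    """Fill target fields from the first matching, non-empty attribute.

    Returns a (harvested, missing) pair: a dict of the fields that were
    found, and a list of the fields the user still has to enter by hand.
    """
    harvested = {}
    missing = []
    for field, candidates in config.items():
        for name in candidates:
            value = global_attrs.get(name)
            if value is not None and value != "":
                harvested[field] = value
                break
        else:
            # No candidate attribute was present in the file.
            missing.append(field)
    return harvested, missing


# Stand-in for the global attributes of a provider's file (assumed values).
example_attrs = {
    "Conventions": "COARDS",
    "dataset_title": "Monthly mean air temperature",
    "institution": "Example Data Center",
}

harvested, missing = harvest(example_attrs, HARVEST_CONFIG)
```

The point of the sketch is that supporting a new provider only means editing the configuration, not the harvesting code, and that whatever lands in the missing list is what an editing tool would then ask the researcher to supply.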
I'm NOT suggesting the implementation I created is the be-all and end-all,
but I am convinced that the idea is a powerful one. In the right hands
this would change a metadata harvesting tool from a minor added
convenience to a configurable and nearly fully automatic tool. Travis
Stevens of NGDC has taken over my code and made some additions. If that
would be useful, we can contact him for the latest and greatest.
A CF file has more metadata (like the standard names) than a bare naked
COARDS file, but even so this capability in a tool would be useful. And
there are gobs and gobs of netCDF files in the COARDS convention that
will never have the added features of CF files.
Ethan and I talked about this for his automatic THREDDS catalog code as
well, but we never got past the discussion stage.
Roland