[GO-ESSP] RE: [NERC-DataGrid] Units and NERC DataGrid

Peter Murray-Rust pm286 at cam.ac.uk
Mon Dec 6 13:04:29 MST 2004


At 11:04 06/12/2004 +0000, K. A. Bouton wrote:

[Laura - this isn't posted to scimarkup - perhaps you/ChrisC can capture it 
on the Wiki]

Thanks very much - the good thing about this is that we are bringing 
together those who have got something going already.  I think there will be 
a variety of organic approaches and probably a mix of discussion forums. 
One outcome is that there could be 2-3 major efforts with a larger number 
of extensions for various domains.

>Not that we can save you a job, but in building the structure for the model
>metadata we are coming to similar conclusions, and thinking we have to do it
>sooner than later.

I think many of us agree on the "sooner". It is important that we get 
experience so that we can perhaps make design changes later.

>We are taking the PRISM/CF approach where we start with a
>list compiled by a few of those in the know

What is PRISM?

>and work from there - Lois calls
>it the benevolent dictator approach :)

I agree with the benevolent dictator. That is how SAX was developed 
(http://www.saxproject.org/?selected=history1). It relies on the dictator 
having the respect of the community and also doing most of the hard work. 
If the dictator is some symbiote of several humans, it sometimes works but 
one of them normally has to take the lead. In SAX we progressed by David 
Megginson asking detailed questions at weekly intervals and collating the 
replies. I think that can be very effective.

I also like the IETF dictum "rough consensus and running code". I have a 
horror of building standards on "paper" before the implementation. I like 
to see tools emerging as we progress.

...more...


> > -----Original Message-----
> > From: nerc-datagrid-bounces at ncas.ac.uk
> > [mailto:nerc-datagrid-bounces at ncas.ac.uk] On Behalf Of Roy Lowry
> > Sent: Monday, December 06, 2004 10:41 AM
> > To: pm286 at cam.ac.uk; b.n.lawrence at rl.ac.uk
> > Cc: NERC-datagrid at ncas.ac.uk; rkl at bodc.ac.uk;
> > sgxml at biwebs.nerc-liv.ac.uk; mmi-tech at mbari.org;
> > lbartolo at kent.edu; GO-Essp at ucar.edu;
> > chris.little at metoffice.gov.uk; rlake at galdosinc.com;
> > caron at unidata.ucar.edu; RobertM at dessci.com
> > Subject: [NERC-DataGrid] Units and NERC DataGrid
> >
> >
> > Bryan/Peter,
> >
> > I have been giving some thought about what to do about units
> > in NERC DataGrid as we get to the stage where vocabularies
> > become today's problem rather than tomorrow's.
> >
> > The first issue to address is what to do about
> > standardisation.  The oceanographic community has been trying
> > for years to standardise the units people use for specific
> > parameters.  The degree of success, with good reason, has
> > declined dramatically as one goes from physical to chemical
> > to biological domains.

Probably true for units. However the biologists are very keen on building 
ontologies and units will be part of that. Though I think they have tougher 
problems to solve - "what is a gene"?

> >  It is therefore clear that in NERC
> > DataGrid we are going to have to deal with the situation
> > where different data hosts store the same parameter in
> > different units.

Yes

> >  In other words, whilst we can recommend
> > standardisation I cannot see how we can enforce it and so we
> > need to engineer for what's really out there (remember I deal
> > in measurements, not model outputs).

I don't think that we should try to standardise the actual representation, 
but we *should* try to insist that the representation is exposed, that its 
semantics and ontology are explicit, and that there are tools for accessing 
the units in the hosts. I hope we shall move towards mappings so that it is 
less important what each host actually uses. We need to remember that most 
of the users of a data host are used to a particular procedure and won't 
want to change just for standardisation. The conversion should be as 
transparent as possible. One good aspect of RDF is that it allows 
arbitrarily many annotations on data instances.
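
To make the point about exposed units and multiple annotations concrete, here is a minimal sketch in Python. It models RDF-style statements as plain (subject, predicate, object) tuples rather than using a real RDF library, and every identifier in it (the `hostA:`/`hostB:` records, the `ex:` predicates) is a made-up placeholder, not part of any existing vocabulary.

```python
# Each statement is a (subject, predicate, object) triple.
# All identifiers below are hypothetical and for illustration only.
TRIPLES = {
    # Host A stores sea temperature in degrees Celsius.
    ("hostA:temp_0042", "ex:parameter", "ex:sea_water_temperature"),
    ("hostA:temp_0042", "ex:hasValue", "12.4"),
    ("hostA:temp_0042", "ex:hasUnit", "ex:degC"),
    # A further annotation can be attached without touching the data itself.
    ("hostA:temp_0042", "ex:preferredDisplayUnit", "ex:K"),
    # Host B stores the same parameter in kelvin.
    ("hostB:t17", "ex:parameter", "ex:sea_water_temperature"),
    ("hostB:t17", "ex:hasValue", "285.55"),
    ("hostB:t17", "ex:hasUnit", "ex:K"),
}


def annotations(subject):
    """Return every (predicate, object) pair attached to one data instance."""
    return sorted((p, o) for s, p, o in TRIPLES if s == subject)


def units_in_use(parameter):
    """Return the set of units that hosts actually use for a parameter."""
    subjects = {s for s, p, o in TRIPLES
                if p == "ex:parameter" and o == parameter}
    return {o for s, p, o in TRIPLES
            if s in subjects and p == "ex:hasUnit"}


if __name__ == "__main__":
    print(annotations("hostA:temp_0042"))
    print(units_in_use("ex:sea_water_temperature"))
```

Neither host changes how it stores its data here. Each host only states which unit it uses, and a client can discover that the same parameter appears in two units and decide what mapping to apply.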

> >
> > The solution has to lie in the philosophical approach taken
> > by UDUNITS, but this will require significant extension to do
> > the job we need.  We also need to take a more pragmatic
> > approach than dimensionality/canonical units to allow
> > interoperability between data sets where scale not
> > dimensionality is the issue.

This sounds interesting - could you expand?

> >
> > I also feel that what we need is more along the lines of a
> > semantic web resource rather than a downloadable executable
> > approach: in other words a units ontology that provides a
> > units vocabulary plus the knowledge about which units may be
> > interconverted and how.  Such an ontology could be
> > interlinked with parameter vocabularies indicating the subset
> > of units that may sensibly be associated with particular parameters.

I think we are very keen to take this route. I believe that many of the 
tricky problems of units (user preferences, dimensionless, preservation of 
context) can be handled more effectively with RDF. Kieron Taylor (Southampton) 
has been exploring this in chemistry.
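
As a rough illustration of what such a resource would need to hold, the Python sketch below keeps three things together: a units vocabulary, the knowledge of which units interconvert and how, and a link from parameters to the units that sensibly apply to them. The unit names, the parameter names and the scale/offset representation are all assumptions made for the example; they are not taken from UDUNITS or from any existing NERC DataGrid vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    name: str
    quantity: str        # family of mutually convertible units
    scale: float         # multiply by this to reach the family's reference unit
    offset: float = 0.0  # then add this (needed for e.g. degC -> K)


# Hypothetical units vocabulary, with conversion knowledge attached.
UNITS = {u.name: u for u in [
    Unit("K", "temperature", 1.0),
    Unit("degC", "temperature", 1.0, 273.15),
    Unit("m", "length", 1.0),
    Unit("cm", "length", 0.01),
    Unit("Pa", "pressure", 1.0),
    Unit("dbar", "pressure", 1.0e4),
]}

# Hypothetical link from a parameter vocabulary to the sensible units.
PARAMETER_UNITS = {
    "sea_water_temperature": {"K", "degC"},
    "depth": {"m", "cm"},
}


def convertible(src, dst):
    """Two units interconvert only if they belong to the same family."""
    return UNITS[src].quantity == UNITS[dst].quantity


def convert(value, src, dst):
    """Convert by going through the family's reference unit."""
    if not convertible(src, dst):
        raise ValueError(f"cannot convert {src} to {dst}")
    s, d = UNITS[src], UNITS[dst]
    reference = value * s.scale + s.offset
    return (reference - d.offset) / d.scale


def allowed_for(parameter, unit):
    """Check whether a unit may sensibly be used with a parameter."""
    return unit in PARAMETER_UNITS.get(parameter, set())


if __name__ == "__main__":
    print(convert(10.0, "degC", "K"))
    print(convert(250.0, "cm", "m"))
    print(allowed_for("depth", "degC"))
```

Published as a shared resource rather than as code, the same three tables would let each host keep its own units while clients look up how to convert between them.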


> >
> > Before embarking on building such an ontology for NERC
> > DataGrid, I am circulating this e-mail far and wide
> > (apologies for  multiple postings) to obtain answers to two
> > simple questions.
> >
> > What do you think of this approach?
> >
> > More importantly:
> >
> > Does anybody know of anybody who is building/has built such
> > an ontology and can save me a job?

I think if Kieron can say what he has done it could help (Kieron, make sure 
it doesn't preempt your thesis content).

Best

P.


> >
> > Cheers, Roy
> >
> >
> >
> >
> >
> > Please note my new address:
> >
> > BODC
> > Joseph Proudman Building
> > 6 Brownlow Street
> > Liverpool L3 5DA
> >
> > Direct dial phone - +44 151 795 4895
> > E-mail: rkl at bodc.ac.uk
> >
> >
> > _______________________________________________
> > NERC-DataGrid mailing list
> > NERC-DataGrid at ncas.ac.uk
> > http://www.ncas.ac.uk/mailman/listinfo/nerc-datagrid
> >

Peter Murray-Rust
Unilever Centre for Molecular Informatics
Chemistry Department, Cambridge University
Lensfield Road, CAMBRIDGE, CB2 1EW, UK
Tel: +44-1223-763069
