[Go-essp-tech] Output Variables & XML instances

Bryan Lawrence bryan.lawrence at stfc.ac.uk
Fri Jan 8 02:06:24 MST 2010


Hi Sylvia

Thanks for gathering these up ...

> 1) OUTPUT VARIABLES:
> 
> Original Question (Sylvia):  Should the model metadata side of the
> ESG display include a list of output variables when this information
> is not being collected by the questionnaire? Sylvia suggested we do NOT
> try to do this for the March ESG release.  For the March release,
> she is proposing that the metadata display would have a link to the
> data collections portion of ESG, which shows specific variables.
> 
> 
> The discussion thus far:  
> Bryan would like to see the output variables that actually appear
> in the output datasets tied to the model components in some way,
> even though we are not collecting this information in the questionnaire.
> He indicated that just having a link to the data collections from the
> model metadata display is insufficient.  He suggested the community
> write code to harvest the thredds catalog to get a list of output
> variables and then write code to associate those variables back to the
> model they came from.  He thought that Hans or Stephan might be the
> one to do this.

A key phrase you use is "data collection", and a definition of that is 
pretty important ...

We discussed this rather a lot at the information flow meeting in October,
and it's likely to come up as a key topic on Monday. I think the 
diagrams in the doc at http://proj.badc.rl.ac.uk/go-essp/wiki/CMIP5/Meetings/telco091027
are key. They need updating though ...

What I think we (metafor) expect to exist at some time (ideally by
March, but we are where we are), is to be able to navigate from a
range of places in the "model description pages" to a range of 
places in the "data description pages" that themselves map onto
places in the DRS hierarchy ...
 
> Open Questions:
> 
> a) Does the community feel that writing a thredds variable harvesting
> software is needed for the March ESG release?

Yes, but that's not the same as requiring you to use it in the display by then :-)
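
As a rough illustration of the harvesting step discussed above, here is a minimal standalone sketch in Python. It assumes the catalog follows the THREDDS InvCatalog 1.0 schema and that variables are listed in <variables>/<variable> elements, either directly under a dataset or inside its <metadata> block. The function name harvest_variables and the sample catalog (dataset ID, variable names) are made up for illustration; tying the harvested variables back to a model is left out, since it depends on naming conventions that are still open.

```python
import xml.etree.ElementTree as ET

# Namespace of the THREDDS InvCatalog 1.0 schema.
NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"

# Hypothetical sample catalog, for illustration only.
SAMPLE = """<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" name="sample">
  <dataset name="example atomic dataset" ID="example.atmos">
    <variables vocabulary="CF-1.0">
      <variable name="pr" vocabulary_name="precipitation_flux" units="kg m-2 s-1"/>
      <variable name="tas" vocabulary_name="air_temperature" units="K"/>
    </variables>
  </dataset>
</catalog>"""


def harvest_variables(catalog_xml):
    """Return {dataset ID (or name): sorted variable names} for a catalog string."""
    root = ET.fromstring(catalog_xml)
    harvested = {}
    for dataset in root.iter("{%s}dataset" % NS):
        names = set()
        # <variables> may sit directly under the dataset or inside <metadata>.
        for path in ("t:variables/t:variable",
                     "t:metadata/t:variables/t:variable"):
            for var in dataset.findall(path, {"t": NS}):
                name = var.get("name")
                if name:
                    names.add(name)
        if names:
            key = dataset.get("ID") or dataset.get("name")
            harvested[key] = sorted(names)
    return harvested


if __name__ == "__main__":
    for dataset_id, variables in harvest_variables(SAMPLE).items():
        print(dataset_id, variables)
```

Keeping this free of any display code is deliberate: the logic could then be reused by both ESG and metafor gateways even if the code itself is not.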

> b) If the community does decide to proceed with this task,  what form
> would people like to see this list of variables in the ESG model
> metadata display?  

I think this is a problem for the data side, not the metadata side, as
it essentially comes down to (for ESG gateways) links from
component outputs to the appropriate place in the data collection
pages ...

... for metafor gateways we want to land on (virtual) CIM document views.

> 2) XML INSTANCES:
> 
> Original Question (Bryan):  Is ESG expecting/wanting one XML instance
> per realm level component?
> 
> The discussion so far:
> 
> Based upon previous conversations on the METAFOR telcons, ESG has been
> expecting one XML document/instance for each of the components exported
> by the questionnaire.  If this creates many small components in the display
> that are difficult for users to navigate, it may be better to bundle some
> of the subcomponents so that ESG receives one XML document/instance per
> realm.  Ideally each XML instance generated by the questionnaire will
> show up as one component (organized under a top-level simulation) in
> the ESG display.

But Rupert said:

> As things stand the questionnaire is going to output the coupled model
> and all its descendants within a single XML document. The examples I
> sent you, although sparse and just relating to model descriptions,
> should hopefully have shown that. Does this conflict with what you are
> expecting?

.... it shouldn't be hard to unpick that at your end if that's what you want.
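
To give a feel for what "unpicking" might involve, here is a minimal sketch in Python. It assumes, purely for illustration, that the single document nests components as <component name="..."> elements with the coupled model at the root and the realm-level components as its direct children. These element and attribute names are hypothetical and are not taken from the questionnaire output or the CIM schema; the real names would need substituting.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical single-document export: coupled model with nested descendants.
SAMPLE = """<component name="CoupledModel">
  <component name="Atmosphere">
    <component name="AtmosphericChemistry"/>
  </component>
  <component name="Ocean"/>
</component>"""


def split_by_realm(document_xml, component_tag="component"):
    """Return {realm name: standalone XML string}, one per realm-level component.

    Each realm keeps its own subcomponents, so chemistry stays bundled
    under the atmosphere rather than becoming a separate instance.
    """
    root = ET.fromstring(document_xml)
    instances = {}
    for realm in root.findall(component_tag):  # direct children only
        clone = copy.deepcopy(realm)
        clone.tail = None  # drop whitespace that followed the element in the parent
        instances[realm.get("name")] = ET.tostring(clone, encoding="unicode")
    return instances


if __name__ == "__main__":
    for name, xml_text in split_by_realm(SAMPLE).items():
        print(name)
        print(xml_text)
```

Splitting at the first level below the coupled model gives one instance per realm; splitting at every level instead would give the many small instances that the earlier discussion was worried about.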

> HOOKING DATA COLLECTIONS TO THE MODEL THEY CAME FROM:
> 
> Another germane discussion that came out of this thread deals with how
> ESG connects data collections to simulation instances.  Sylvia mentioned
> that for CMIP3 ESG achieved this by ensuring that both the names of the
> simulation instances and the data collections followed a similar naming
> convention.  Stephen brought up a very good point in that not much attention
> has been paid to model/simulation naming conventions so far.  This is an important
> topic and it should be discussed separately.  Thanks Stephan for bringing
> this up.  

There are outstanding tickets for this issue on the go-essp wiki. It clearly needs
revisiting ... (I said I'd own the tickets when I created them, but other folk have
to own them for real).

Bryan

> 
> Sylvia
> 
> 
> On Jan 6, 2010, at 2:17 PM, Sylvia Murphy wrote:
> 
> > Bryan,
> > 
> > I tried to simplify the discussion and ended up obfuscating things...how typical.  
> > 
> > * The data links on the model metadata pages are to ESG data collections not to individual files
> > * Currently, the list of output variables and the links to the data collections are all under the Outputs tab but are separated by accordions.  
> > 
> > One comment below...
> > 
> > 
> > On Jan 6, 2010, at 12:21 PM, Bryan Lawrence wrote:
> > 
> >> On Wednesday 06 January 2010 18:55:58 Sylvia Murphy wrote:
> >>> Hi Bryan,
> >>> 
> >>> I have added one comment deep down...
> >> and I've cut out the material in between :-) :-)
> >> 
> >>> Just to refresh everyone's minds, the ESG display (e.g. one instance with a full set of tabs) will come from the XML output from the questionnaire.  If we get an XML file that is a subcomponent (e.g. Atmospheric chemistry), it will exist as an instance.  If we get an XML that corresponds to an upper level component (e.g. Atmosphere) it will exist as an instance.  If we get an XML that combines the chemistry portion under the atmospheric component, then that combined XML will exist as an instance. 
> >> 
> >> I think this suggests we should send you one xml instance per realm level component. Is that what you are expecting/want?
> >> 
> >> 
> >>> Outputs Tab:
> >>> 
> >>> Output Variables:
> >>> 
> >>>    Precipitation
> >>>    Humidity
> >>>    Surface Temperature  
> >>>    Etc
> >>> 
> >>> Data Collections:
> >>> 
> >>>    precipitation.nc <<<< this is a link to the data collection, which may be more than one file
> >>>    humidity.nc
> >>>    temperature.nc
> >> 
> >> Things are mildly complicated here by ESG support for non-CMIP5 datasets, but as you know I only care about CMIP5, so ...
> >> 
> >> The data collection links should not be to netcdf files but to esg catalog entries for the appropriate atomic dataset ... (complete with links to replicates etc) ...
> >> 
> >> ... and it's arguable that you need two columns/tabs for variables and collections. For CMIP5, they will be completely isomorphic, and it would be much simpler (given some of these will have more than a hundred entries) if you could click through straight from the variable to both a description and the data pages ...
> >> 
> >> precipitation (defn, datapage)
> > 
> > Let's bring this up when I demo the datahook next week.  We can examine what is currently there and see what options we have to change it.
> > 
> > 
> > 
> > 
> >> 
> >> You also need data links from the aggregations (experiment etc), but more of that anon.
> >> 
> >>> Right now, we would have to parse the thredds catalog as you say to get the list of variables.  No code exists to convert the data collection names into a list of variable names.  Code does exist to create the list of data collections.
> >> 
> >> I think we all need a bit of code for that. We should write it standalone so we can all use the logic if not the code ...  hopefully this is on the agenda for Hans and/or Stephen.
> >> 
> >> Cheers
> >> Bryan
> >> 
> >>> 
> >>> Sylvia
> >>> 
> >>> 
> >>> 
> >>> 
> >>>> 
> >>>> Does that help explain?
> >>>> 
> >>>> So the important thing to remember is that Gerry's spreadsheet is informative, but the actual data is authoritative as to this matching (which I call *tie*ing) ... which we then build as rdf instances in the esg/owl world.
> >>>> 
> >>>> Bryan
> >>>> -- 
> >>>> Bryan Lawrence
> >>>> Director of Environmental Archival and Associated Research
> >>>> (NCAS/British Atmospheric Data Centre and NCEO/NERC NEODC)
> >>>> STFC, Rutherford Appleton Laboratory
> >>>> Phone +44 1235 445012; Fax ... 5848; 
> >>>> Web: home.badc.rl.ac.uk/lawrence
> >>>> _______________________________________________
> >>>> metafor mailing list
> >>>> metafor at lists.enes.org
> >>>> https://lists.enes.org/mailman/listinfo/metafor
> >>> 
> >>> ***********************************
> >>> Sylvia Murphy
> >>> sylvia.murphy at noaa.gov
> >>> 303-497-7753
> >>> 
> >>> 
> >>> 
> >>> 
> >> 
> >> 
> >> 
> > 
> 
> 
> 
> 
> 



-- 
Bryan Lawrence
Director of Environmental Archival and Associated Research
(NCAS/British Atmospheric Data Centre and NCEO/NERC NEODC)
STFC, Rutherford Appleton Laboratory
Phone +44 1235 445012; Fax ... 5848; 
Web: home.badc.rl.ac.uk/lawrence

