[Go-essp-tech] [metafor] Minutes: 1/6/10 Output Variable Telecon

Wed Jan 6 11:55:58 MST 2010

Hi Bryan,

I have added one comment deep down...

On Jan 6, 2010, at 11:36 AM, Bryan Lawrence wrote:

> Hi Sylvia, Karl
> 
> On Wednesday 06 January 2010 17:40:55 Sylvia Murphy wrote:
>> The spreadsheet that Gerry created lists the variables by mindmap scientific virtual component. So while your document probably lists something like Humidity under the realm Atmosphere, the METAFOR spreadsheet lists the variable under a subcomponent like the Dynamical Core.
>> 
>> The ESG will be displaying all these subcomponents, not just the realms, because that is what will be coming in via the questionnaire, and that is where the output variables would be listed if we used METAFOR's method.
> 
> This isn't quite right I think ... but close ... more below. (It's what we might have wanted to do though ...)
> 
> On Wednesday 06 January 2010 17:23:29 Sylvia Murphy wrote:
>> Hi Bryan,
>> 
>> You said "i think the output variables that actually appear in the dataset need to get TIED [my emphasis] to the model components".
>> 
>> The way ESG should work is that there will be a link (on the outputs tab) in the model metadata display pointing to the datasets. I believe the variable name will be listed in those names.
> 
> The question is how and where is the information gathered that makes these tabs work ...
> 
>> If a user clicks on one of those links he will get a lot of metadata about the files including the CF information about the variables.   
>> 
>> When you say "tied" is this what you were thinking or do you want to see output variables listed (as a list) on the model metadata display?  If so we are going to have to find a way to have the two sides of ESG, the dataset side and the model metadata side communicate with each other.
> 
> So what I mean is that in order to know what variables have actually been output we need to do something like:
> 
> - parse the thredds catalogs to find a list of atomic drs names or tuple equivalents which include the model name, the experiment, an ensemble number (if necessary), the realm and a variable name (plus other stuff including version number, but to keep it simple for this email we'll only worry about the above)
> 
> The key point is that we will only know a (realm, variable name) pair, we won't know any more detail about subcomponents *and that's appropriate* ... because the subcomponent descriptions may not map onto physical software that writes output ...
> 
> So, to get the information to populate those tabs, we need to *tie* the variable names (and their datasets) that we get from the data parsing to the appropriate realm (plus simulation etc) in the model metadata world (metafor+curator/esg).
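
To make the parsing and tying step above concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions: the catalog fragment, the dot-separated `ID` layout (model.experiment.ensemble.realm.variable) and the names `DrsName`, `parse_drs_names` and `tie_variables_to_realms` are all hypothetical, not the actual ESG or DRS syntax.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import NamedTuple


class DrsName(NamedTuple):
    """Simplified atomic dataset name (hypothetical field order)."""
    model: str
    experiment: str
    ensemble: str
    realm: str
    variable: str


# Hypothetical catalog fragment; a real thredds catalog carries far more metadata.
SAMPLE_CATALOG = """\
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <dataset name="humidity" ID="modelA.exp1.r1.atmos.hus"/>
  <dataset name="precipitation" ID="modelA.exp1.r1.atmos.pr"/>
  <dataset name="sea surface temperature" ID="modelA.exp1.r1.ocean.tos"/>
</catalog>
"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def parse_drs_names(catalog_xml):
    """Collect a DrsName for every <dataset> whose ID has exactly five parts."""
    root = ET.fromstring(catalog_xml)
    names = []
    for element in root.iter():
        if local_name(element.tag) != "dataset":
            continue
        parts = element.get("ID", "").split(".")
        if len(parts) == 5:
            names.append(DrsName(*parts))
    return names


def tie_variables_to_realms(names):
    """Group variable names by (model, experiment, realm); no subcomponents."""
    tied = defaultdict(set)
    for name in names:
        tied[(name.model, name.experiment, name.realm)].add(name.variable)
    return {key: sorted(variables) for key, variables in tied.items()}


if __name__ == "__main__":
    for key, variables in tie_variables_to_realms(parse_drs_names(SAMPLE_CATALOG)).items():
        print(key, variables)
```

The grouping key stops at the realm on purpose: as noted above, the data only gives a (realm, variable name) pair, so nothing finer than the realm is inferred.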

Just to refresh everyone's minds, the ESG display (e.g. one instance with a full set of tabs) will come from the XML output from the questionnaire.  If we get an XML file that is a subcomponent (e.g. Atmospheric chemistry), it will exist as an instance.  If we get an XML that corresponds to an upper level component (e.g. Atmosphere) it will exist as an instance.  If we get an XML that combines the chemistry portion under the atmospheric component, then that combined XML will exist as an instance. 

What you say is correct in that we have to connect the output data to the model it came from. Currently, this is done automatically by matching the files with the model names using a set syntax. If the modelling groups conform to the naming convention for both their files and their model instances, this should work just fine.

This is how the data hook will work. I am going to try to get an example set up quickly so I can show everyone on next Tuesday's METAFOR call what this looks like.

This is separate from putting a list of output variable names on the model metadata display.  The whole thing COULD look like:

Outputs Tab:

Output Variables:

     Precipitation
     Humidity
     Surface Temperature  
     Etc

Data Collections:

     precipitation.nc <<<< this is a link to the data collection, which may be more than one file
     humidity.nc
     temperature.nc

Right now, we would have to parse the thredds catalog as you say to get the list of variables.  No code exists to convert the data collection names into a list of variable names.  Code does exist to create the list of data collections.
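
As a rough idea of what the missing conversion could look like, here is a minimal Python sketch. It assumes, purely for illustration, that each data collection is named `<variable>.nc` as in the mock-up above; the function name `variable_names` is hypothetical and real collection names may well follow a different syntax.

```python
from pathlib import PurePosixPath


def variable_names(collection_names):
    """Derive variable names from collection names, keeping first-seen order.

    Assumes each collection is named '<variable>.nc' (illustrative only).
    """
    seen = []
    for name in collection_names:
        stem = PurePosixPath(name).stem  # 'humidity.nc' -> 'humidity'
        if stem and stem not in seen:
            seen.append(stem)
    return seen


if __name__ == "__main__":
    collections = ["precipitation.nc", "humidity.nc", "temperature.nc"]
    for variable in variable_names(collections):
        print(variable)
```

Mapping these short names onto display names such as "Surface Temperature" would need a separate lookup, which is not attempted here.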

Sylvia

> 
> Does that help explain?
> 
> So the important thing to remember is that Gerry's spreadsheet is informative, but the actual data is authoritative as to this matching (which I call *tie*ing) ... which we then build as rdf instances in the esg/owl world.
> 
> Bryan
> -- 
> Bryan Lawrence
> Director of Environmental Archival and Associated Research
> (NCAS/British Atmospheric Data Centre and NCEO/NERC NEODC)
> STFC, Rutherford Appleton Laboratory
> Phone +44 1235 445012; Fax ... 5848; 
> Web: home.badc.rl.ac.uk/lawrence
> _______________________________________________
> metafor mailing list
> metafor at lists.enes.org
> https://lists.enes.org/mailman/listinfo/metafor

***********************************
Sylvia Murphy
sylvia.murphy at noaa.gov
303-497-7753