[Go-essp-tech] Updated Metadata Pipeline

Sylvia Murphy Sylvia.Murphy at noaa.gov
Thu May 20 08:52:08 MDT 2010


Hi Everyone,

It is that time again to update everyone on the metadata pipeline as I understand it. I am writing this in preparation for the go-essp-tech metadata pipeline call that we discussed holding on 25 May. There are three primary paths: the questionnaire path, the gridspec path, and the netCDF path. Tasks to be completed are listed under the specific metadata pathway step they refer to. This summary includes portions of the ESG/CMIP5 effort that are outside the Curator project scope, and it may have gaps and errors.

Please note that Curator's timelines are synced with those of ESG. Below is ESG's current release schedule, which can also be viewed at https://wiki.ucar.edu/display/esgcet/ESG+Gateway+Release+Roadmap

1.2 June 15th
1.3 July 15th
1.4 August 16th

At the bottom of the email there is a draft outline of capabilities by release.

Curator also syncs its efforts to those of METAFOR.  Below I am quoting Charlotte's latest questionnaire release schedule (dated 13 May).

"*End of May: Finalize the changes we are making to Grids, Conformance and Ensembles in the questionnaire.
*First half of June:  Beta testing
*Second half of June: Make changes to the Questionnaire and possibly CIM in response to beta testing.
*End of June: Release the Questionnaire."
*End of July: First CMIP5 files received from METAFOR (Target based upon conversations with METAFOR)

Demonstration Schedule:

a) XML Harvest demonstration.  XML Harvest demonstrations have been occurring on a regular basis.  They are primarily for the METAFOR and go-essp-tech communities.  They show a sample questionnaire output (given to us by METAFOR).  Their main purpose is to QC the harvesting process, but they also give folks an idea of what the questionnaire output looks like so that needed changes in the questionnaire and/or the ESG display can be identified.  The next demonstration is targeted for the week of 31 May.  This date depends upon receipt of an updated XML file from Gerry and upon Julien, who will need to add code to harvest the improved inputs and conformances.  See 4) below.

b) Scientist demonstration.  To my knowledge, the ESG has been demonstrated primarily to technical personnel.  The entire gateway (data browse, data search, model search, metadata content) needs to be demonstrated to scientists who will be using the gateway for CMIP5 research.  Curator/ESG would like to conduct this joint demonstration the week of 21 June so that any recommended changes can be made in time for the ESG 1.3 release, which is scheduled for 15 July.


Questionnaire Path:  

1) Modelling centers fill out online questionnaire (being developed by METAFOR for 30 June release).

2) METAFOR converts output from the questionnaire into a CIM compliant XML file.

3) METAFOR sends the XML output from the modelling centers to ESG via Atom.
   *Needed: ESG to write software to periodically query METAFOR's Atom server for new files and to download those files (Target: 30 June)
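
The polling task in step 3 could be sketched roughly as follows. This is only an illustration in Python, not ESG's actual software: the function name new_entries and the set of already-seen ids are hypothetical, and it assumes METAFOR's feed is standard Atom with one link element per entry.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace, in ElementTree's {uri}tag form.
ATOM = "{http://www.w3.org/2005/Atom}"


def new_entries(feed_xml, seen_ids):
    """Return (entry id, link href) pairs for entries not yet harvested.

    feed_xml is the text of an Atom feed; seen_ids is a set of entry ids
    that were already downloaded (hypothetical bookkeeping, kept by the caller).
    """
    root = ET.fromstring(feed_xml)
    found = []
    for entry in root.findall(f"{ATOM}entry"):
        entry_id = entry.findtext(f"{ATOM}id")
        if entry_id is None or entry_id in seen_ids:
            continue  # malformed entry, or one we already have
        link = entry.find(f"{ATOM}link")
        href = link.get("href") if link is not None else None
        found.append((entry_id, href))
    return found
```

A scheduled job could fetch the feed text (for example with urllib.request.urlopen), pass it to new_entries, download each returned href, and then add the ids to the seen set.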

4) ESG uses software (to be developed by ESG) to convert the XML into an OWL file for ingestion into the Sesame Triple Store.
    * Needed:                  
         a) Better sample XML file from METAFOR: Required from METAFOR (Target: 19 May).  METAFOR has been sending us periodic XML updates.  We need a sample that uses the current version of the mindmaps in order to finish the ingestion of the scientific properties  
         b) Conformance, ensembles, numerical requirements, genealogy, and inputs finalized in both the CIM and XML output (Target: 1 June) so ESG can write harvesting code for those pieces.  Conformance and inputs are partially complete in the output and we have code to harvest those pieces
         c) Complete sample XML from METAFOR (Target: 4 June)            
         d) Curator/ESG to complete the XML to OWL software (Target: 25 June)   
         e) ESG to create a means of automatically running Julien's harvesting code in an operational manner (Target: 25 June)
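
As a rough illustration of the kind of conversion step 4 describes, the sketch below turns a tiny XML record into RDF triples in N-Triples form, which a triple store can load. It is a simplification and not Julien's harvesting code: the element names, the id attribute, and the example.org namespace are all made up for illustration, the real CIM XML is far richer, and a real OWL file would also carry class and property declarations.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace used only for this illustration.
BASE = "http://example.org/cim/"


def xml_to_ntriples(xml_text):
    """Flatten the direct children of the root element into N-Triples lines.

    The root element becomes the subject (identified by its 'id' attribute),
    each child tag becomes a predicate, and the child's text becomes a
    plain string literal.
    """
    root = ET.fromstring(xml_text)
    subject = f"<{BASE}{root.tag}/{root.get('id')}>"
    lines = []
    for child in root:
        value = (child.text or "").strip()
        if not value:
            continue  # skip empty or purely structural elements
        # Escape the characters that N-Triples string literals require.
        escaped = (
            value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
        )
        lines.append(f'{subject} <{BASE}{child.tag}> "{escaped}" .')
    return lines
```

Running such a function on each newly downloaded file is one way the harvesting could be made operational, although the actual mechanism is for ESG to decide.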

5) ESG continuity of operations
    * Don identified the need to develop a plan to reinitialize the sesame triple store with the CMIP5 metadata given a catastrophic failure (Target: 30 June)
 
6) The XML instance is displayed on the web. The XML instances will contain all the information about each CMIP5 model and simulation.  This will include information about the platform the simulation was run on, the descriptive scientific properties of each component, etc. 
   * Needed: Next XML Harvesting demo (Target: week of 31 May)
   * Needed: Julien to fix the 4 identified bugs in the display (Target: 4 June)
   * Needed: Julien to complete 4 key display improvements (Target: 4 June)
   * Needed: All of the CMIP5 experiment information needs to be represented in the system.  This can be done by modifying Luca's program that harvested the information from Karl's document, or by writing a program to harvest METAFOR's XML version of the information.  This task also includes a reconciliation between the RDF and OWL files that currently are being used by ESG for this purpose (Target: 11 June)
   * Needed: Formal demonstration to scientists of a complete questionnaire XML file to determine needed changes to the display (Target: week of 21 June)

7) Users can use the ESG search page to find model metadata.
    * Needed: Community-wide consensus on the facets displayed for the search (Target: 3 weeks prior to 1.4 release)
    * Needed: ESG modify the search pages to conform to community desires (Target: ESG plans to finalize this by the 1.4 release)
    * Needed: ESG finalize the look and feel of the search page and fix any identified bugs (Target: ESG plans to finalize this by the 1.4 release)


Gridspec Path:

*Needed: Status from GFDL on gridspec command line program
*Needed: Status on grid file metadata harvesting program


netCDF Path: 

*Needed: Status on CMOR2
*Needed: Status on the finality of DRS
*Needed: Status on the netCDF harvesting software

 
    
SUMMARY OF METADATA-RELATED CAPABILITIES BY ESG RELEASE

Note: Below is the current list of baseline model metadata capabilities. Only future capabilities are listed under the releases. Please assume that the baseline carries forward.

Baseline capabilities: 
* Component navigation
* Technical properties displayed
* Basic properties (e.g. institution, contacts, etc.) displayed
* Pop-up definitions of attributes
* Associated grids displayed
* Datahook
* Initial conditions/boundary conditions displayed
* Conformance displayed
* Scientific properties displayed
* Experiment information displayed

ESG Release 1.2 (15 June)
* User's changes to the component navigation retained in the session
* Citation formatting improved
* Genealogy displayed
* Simulation to data connection and publishing process made more robust
* Loading of the component tree made more efficient
* Link behavior throughout the site made more consistent
* Trackback page display adjusted for users coming via the data browse
* Ensemble information displayed
* Experiment information harvested and host files reconciled

ESG Release 1.3 (15 July)
* XML Harvest software complete and made operational

ESG Release 1.4 (16 August)
* Finalize search interface

***********************************
Sylvia Murphy
NESII/CIRES/NOAA Earth System Research Laboratory
325 Broadway, Boulder CO 80305
Email: sylvia.murphy at noaa.gov
Phone: 303-497-7753




