[Go-essp-tech] Directory structure proposal and a doodle poll

Laura Carriere Laura.E.Carriere at nasa.gov
Tue Mar 12 20:08:00 MDT 2013


Galia,
   Thanks for putting this together.  As you know we've talked about how 
the proposed structure can be applied to the NEX downscaled project and 
what you've presented here will work, although we won't need the 
sub-project for NEX.

There's one recent idea that we had that I want to put out there for 
consideration.  Regarding the replacement of Model_id by PredictorModel, 
we were thinking that another option would be to keep Model_id and, 
following the obs4MIPs convention of using "obs-MODIS" or "obs-TRMM", 
use "predict_BNU-ESM" or even "parent_BNU-ESM".  The value in doing 
this would be to continue to use the "Model" facet in the search 
categories while still differentiating downscaled datasets from the 
original CMIP5 datasets.  I do like the idea of the PredictorModel too, 
in that it brings clarity to the issue of what model was used to create 
the original dataset, but I'm a little concerned that it will confuse 
the user who is not looking for downscaled data.  Another option would 
be to use both the Model_id, with the "predict_" or "parent_" prefix, 
and the PredictorModel.

I'd be interested in feedback on this idea, both pro and con, and I'm 
looking forward to the telecon you're organizing.  We're hoping to start 
publishing this data later this month.
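
To make the prefix idea above concrete, here is a minimal sketch of how a Model facet value could be built and recognized. The function names are hypothetical; only the "predict_" and "parent_" prefixes and the BNU-ESM example come from this message, and nothing here reflects an agreed ESGF convention.

```python
# Hypothetical helpers illustrating the prefixed Model_id idea.
# Only the two prefixes and the BNU-ESM example come from the message.
PREFIXES = ("predict_", "parent_")


def downscaled_model_id(predictor_model, prefix="predict_"):
    """Build a Model facet value that marks a dataset as downscaled."""
    if prefix not in PREFIXES:
        raise ValueError("unknown prefix: %r" % prefix)
    return prefix + predictor_model


def split_model_id(model_id):
    """Return (is_downscaled, underlying_model) for a Model facet value."""
    for prefix in PREFIXES:
        if model_id.startswith(prefix):
            return True, model_id[len(prefix):]
    return False, model_id
```

A search client could then keep a single "Model" facet and still separate downscaled entries from original CMIP5 ones by calling split_model_id on each value.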

   Laura.

On 3/12/2013 1:14 PM, Galia Guentchev wrote:
> Hi everybody,
>
> Several groups have expressed interest in publishing downscaled climate 
> datasets on ESGF. A standardized solution to publishing (directory 
> structure elements) would contribute to the prompt identification of 
> datasets. To discuss needs and options for directory structure 
> elements we had an initial teleconference about a month ago. With this 
> email we are expanding our reach to other groups, such as the go-essp 
> group, in order to have a wider discussion of these elements.
>
> As agreed during our first teleconference, Aparna and Galia worked on 
> a proposal for a Directory Structure for publishing downscaled 
> datasets on ESGF. We would like to focus our next teleconference on 
> discussing this proposal. Below please find a doodle poll for a 
> potential next teleconference.
>
> http://doodle.com/hrwthqs2g5pgsyv6
>
> **********************************************************************
> Details of each element of the proposed directory structure:
>
> Proposed elements -
> /projectID/sub-project/product/institution/*predictorModel/experimentID/frequency/realm/MIPtable/Predictor_experiment_rip/predictorversion*/downscalingMethod/predictand (variableName)/region/DownscaledDataversion/file_name.nc
>
> Example:
>
> /ncpp2013/perfectModel/downscaled/NOAA-GFDL/*GFDL-HIRAM-C360-coarsened/amip/day/atmos/day/r1i1p1/v20121024*/GFDL-ARRMv1/tasmax/US48/v20120227/tasmax_day_amip_r1i1p1_downscaled_US48_GFDLARRMv1_19790101-19831231.nc
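
As an illustration of how an automated service might read such a path, the sketch below splits the example into the proposed elements. The element names follow the proposal text; the function name and the choice to ignore empty segments and emphasis asterisks are assumptions made for this example only.

```python
# Element names in the order given in the proposal (predictor details
# first, then downscaling details, then the file name).
ELEMENTS = [
    "projectID", "sub-project", "product", "institution",
    "predictorModel", "experimentID", "frequency", "realm", "MIPtable",
    "Predictor_experiment_rip", "predictorversion",
    "downscalingMethod", "predictand", "region", "DownscaledDataversion",
    "file_name",
]


def parse_path(path):
    """Map each path segment to its proposed element name."""
    # Drop emphasis asterisks and empty segments (e.g. from doubled slashes).
    parts = [p for p in path.replace("*", "").split("/") if p]
    if len(parts) != len(ELEMENTS):
        raise ValueError(
            "expected %d elements, found %d" % (len(ELEMENTS), len(parts))
        )
    return dict(zip(ELEMENTS, parts))


example = (
    "/ncpp2013/perfectModel/downscaled/NOAA-GFDL/"
    "GFDL-HIRAM-C360-coarsened/amip/day/atmos/day/r1i1p1/v20121024/"
    "GFDL-ARRMv1/tasmax/US48/v20120227/"
    "tasmax_day_amip_r1i1p1_downscaled_US48_GFDLARRMv1_19790101-19831231.nc"
)
fields = parse_path(example)
```

With the example above, fields["predictorModel"] is "GFDL-HIRAM-C360-coarsened" and fields["downscalingMethod"] is "GFDL-ARRMv1".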
>
> The new element sub-project (in blue above) gives the opportunity to 
> indicate to users that in one case the method was trained on 
> observations (standard setting), and in the other on a model that was 
> considered to be the truth (perfect model setting).
> The options there could be: PerfectModel or Standard - where possibly 
> there could be a different name instead of 'standard' for the standard 
> downscaling setting.
>
> For NASA datasets some of the directories could be:
>
> project = NEX
> product = downscaled
> institution = NASA-Ames
> predictorModel - original model value
> experimentID = historical
> frequency = mon
> realm = atmos
> Predictor_experiment_rip - original model value
> variable = precipitation or temperature
> region = CONUS
>
> DownscalingMethod will also be included as a directory to allow for 
> search on method.
>
> **********************
> There is a set of sub-directories that refer to the _PredictorModel_ 
> - presented in bold - 
> */predictorModel/experimentID/frequency/realm/MIPtable/Predictor_experiment_rip/predictorversion*
>
> Where:
>
>   * predictor model - is the specific GCM which is the source of the
>     predictor data set - GFDL-HIRAM-C360-coarsened - in the above example
>   * experimentID - the specific experiment - amip in this case
>   * frequency - refers to the temporal scale of the predictor fields -
>     daily
>   * realm - the realm of the predictors - in this case atmos(phere)
>   * MIPtable - name of the model intercomparison table - day in this
>     example; could be Amon for atmospheric monthly data;
>   * Predictor_experiment_rip - follows the standard notation from CMIP5
>   * predictorversion - the version date of the global model that
>     provided the predictor dataset
>
> The elements above follow quite closely the structure for CMIP5 model 
> output directory elements.
>
> There is a set of sub-directories that refer to the Downscaling method 
> - presented in italics -
> /downscalingMethod/predictand (variableName)/region/DownscaledDataversion/
>
> Where:
>
>   * downscalingMethod - is the downscaling method abbreviation - in
>     this case GFDL-ARRMv1 - the GFDL in the name indicates that this
>     is a setting applied by GFDL where there were two sets of
>     predictors, based on the ARRM method of K.Hayhoe; also v.1
>     indicates which version of the ARRM method was used (the original
>     version) - more details about the method are given in the global
>     attributes of the file;
>   * Predictand (variableName) - the specific predictand variable that
>     was downscaled; tasmax in this case;
>   * region - indicates that the method was applied to the US48
>   * DownscaledDataversion - the version of the downscaled dataset
>
> *For the purposes of standardization there are two directions to 
> consider:*
>
> 1) One is to have *one standard directory* structure that will be used 
> by all - for example, following the example of GFDL to have the 
> details of the predictor model first and then the downscaling method 
> details:
>
>   * ProjectID - sub-project - product - Institution - Predictor
>     dataset details - Downscaling method details - Filename
>
> Having a standardized approach would help any automated service/web 
> service to detect the directory path for a particular dataset.
>
> 2) During our last teleconference there was a proposal to follow the 
> downscaling practice and describe the downscaling method first and 
> then the predictor model. This leads to *two paths*:
>
>   * ProjectID - _Standard or Perfect Model sub-project facet_ -
>     product - Institution - then:
>       - (if Perfect model setting) Predictor dataset details -
>         Downscaling method details
>       - (if Standard setting) Downscaling method details -
>         Predictor dataset details
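
The two-path option can be sketched as a function that picks the element order from the sub-project value. This is only an illustration: the grouping into common, predictor, and method elements follows the proposal, while the function names and the case-insensitive matching of "PerfectModel" and "Standard" are assumptions.

```python
# Element groups as described in the proposal.
COMMON = ["projectID", "sub-project", "product", "institution"]
PREDICTOR = [
    "predictorModel", "experimentID", "frequency", "realm", "MIPtable",
    "Predictor_experiment_rip", "predictorversion",
]
METHOD = ["downscalingMethod", "predictand", "region", "DownscaledDataversion"]


def element_order(sub_project):
    """Return the directory element order for a given sub-project."""
    key = sub_project.lower()
    if key == "perfectmodel":
        # Perfect model setting: predictor details first, then method.
        return COMMON + PREDICTOR + METHOD
    if key == "standard":
        # Standard setting: method details first, then predictor.
        return COMMON + METHOD + PREDICTOR
    raise ValueError("unknown sub-project: %r" % sub_project)


def build_directory(facets):
    """Join facet values into a directory path in the selected order."""
    order = element_order(facets["sub-project"])
    return "/" + "/".join(facets[name] for name in order)


facets = {
    "projectID": "ncpp2013", "sub-project": "perfectModel",
    "product": "downscaled", "institution": "NOAA-GFDL",
    "predictorModel": "GFDL-HIRAM-C360-coarsened", "experimentID": "amip",
    "frequency": "day", "realm": "atmos", "MIPtable": "day",
    "Predictor_experiment_rip": "r1i1p1", "predictorversion": "v20121024",
    "downscalingMethod": "GFDL-ARRMv1", "predictand": "tasmax",
    "region": "US48", "DownscaledDataversion": "v20120227",
}
perfect_model_dir = build_directory(facets)
standard_dir = build_directory(dict(facets, **{"sub-project": "Standard"}))
```

Because the order depends only on the sub-project value, a web service could still locate any dataset deterministically even with two standardized sequences.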
>
>
> The NCPP Core team accepts that it may be reasonable to have two 
> directory structures: one where the method description comes first, 
> and another where the predictor description comes first, followed by 
> the methods applied. *NCPP will support either approach* (one overall 
> directory structure, or two separate pathways). If the second approach 
> is chosen (with two different sub-directory sequences), we would like 
> to promote and support the standardization of both pathways; that is, 
> we will support two standardized directory structures to accommodate 
> two common practices.
>
>
> ******************
> Additional details:
>
> *Variable level attributes-*
> The published dataset should also conform to CF standards, e.g.:
>
>                 tasmax:long_name = "Downscaled Daily Maximum Near-Surface Air Temperature" ;
>                 tasmax:units = "K" ;
>                 tasmax:missing_value = 1.e+20f ;
>                 tasmax:_FillValue = 1.e+20f ;
>                 tasmax:standard_name = "air_temperature" ;
>                 tasmax:original_units = "K" ;
>                 tasmax:downscaling_method = "GFDL-ARRMv1" ;
>
> *Global attributes* - listing a few here; several CMIP-style attributes 
> will be inherited.
>
> "predictorModel" will replace "model_id".
> For the 'downscaling model', as agreed with Luca on the call, it 
> would be 'downscalingMethod'.
>
>                 :Conventions = "CF-1.4" ;
>                 :references = "info about model, training datasets etc. will be provided here" ;
>                 :info = "additional info about the downscaling method" ;
>                 :creation_date = "2011-08-19T21:57:06Z" ;
>                 :institution = "NOAA GFDL(201 Forrestal Rd, Princeton, NJ, 08540)" ;
>                 :history = "info on file processing, e.g. processed by toolX." ;
>                 :projectID = "ncpp2013" ;
>                 :subprojectID = "perfectModel" ;
>                 :product = "downscaled" ;
>                 :institution = "NOAA-GFDL" ;
>                 :predictorModel = "GFDL-HIRAM-C360-coarsened" ;
>                 :experimentID = "amip" ;
>                 :frequency = "day" ;
>                 :modeling_realm = "atmos" ;
>                 :Predictor_experiment_rip = "r1i1p1" ;
>                 :region = "US48" ;
>                 :table_id = "day" ;
>                 :version = "v20120227" ;
>                 :downscalingMethod = "GFDL-ARRMv1" ;
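
A publisher could check a file's global attributes against this list before publication. The sketch below uses a plain dictionary rather than reading a NetCDF file; the attribute names come from the listing above, while treating every one of them as required, and the function name, are assumptions made for illustration.

```python
# Attribute names taken from the listing above. Treating all of them as
# required is an assumption made for this sketch.
REQUIRED_GLOBAL_ATTRS = (
    "projectID", "subprojectID", "product", "institution",
    "predictorModel", "experimentID", "frequency", "modeling_realm",
    "Predictor_experiment_rip", "region", "table_id", "version",
    "downscalingMethod",
)


def missing_attributes(attrs):
    """Return the required attribute names that are absent or empty."""
    return [name for name in REQUIRED_GLOBAL_ATTRS if not attrs.get(name)]


example_attrs = {
    "projectID": "ncpp2013",
    "subprojectID": "perfectModel",
    "product": "downscaled",
    "institution": "NOAA-GFDL",
    "predictorModel": "GFDL-HIRAM-C360-coarsened",
    "experimentID": "amip",
    "frequency": "day",
    "modeling_realm": "atmos",
    "Predictor_experiment_rip": "r1i1p1",
    "region": "US48",
    "table_id": "day",
    "version": "v20120227",
    "downscalingMethod": "GFDL-ARRMv1",
}
problems = missing_attributes(example_attrs)
```

An empty result means every listed attribute is present; any names returned would need to be filled in before the dataset is published.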
> **************************************************
>
> Best regards,
> Galia and Aparna
>
> -- 
> Galia Guentchev, PhD
> Project Scientist
> National Climate
> Predictions and
> Projections
> Platform (NCPP)
> NCAR RAL CSAP
> FL2 3103
> 3450 Mitchell Lane
> Boulder, CO, 80301
> phone: 303 497 2743


-- 

   Laura Carriere, CSC                  laura.carriere at nasa.gov
   NCCS, Code 606.2		       301 614-5064
