[Go-essp-tech] Global attributes and DRS extensions for downscaled datasets

Karl Taylor taylor13 at llnl.gov
Wed Mar 27 15:37:22 MDT 2013


Hello Laura,

I'll insert some responses below:

On 3/27/13 1:28 PM, Laura Carriere wrote:
>
> Karl,
>   I have gone through your document and mapped the variables to both 
> the ones proposed in last week's telecon and to the NEX variables.  
> Your proposed attributes work for NEX without any trouble, with one 
> possible exception.
>
> I do have the following feedback/questions:
>
> 1.  Last week's proposal had included a "predictor-version" attribute 
> in order to be able to track which version of the original model data 
> was used.  Your proposal includes a "driving_data_tracking_ids" 
> variable.  Would I be correct in assuming that this serves the same 
> purpose?  Also, we've just realized that the data provider has not 
> provided this information to us and it may not be retrievable (we'll 
> ask).  If it isn't, do you have a default or backup recommendation for 
> this attribute?
In CMIP5, version numbers are assigned by the person publishing the data, 
and it seems that these numbers have not always been assigned 
consistently.  A tracking_id, by contrast, is assigned automatically to 
each CMIP5 file and is stored as a global attribute.  As a first choice, I 
was suggesting collecting the tracking_ids from all the files used in 
"driving" the downscaling.  As a second choice, you could store some 
other version identifier there (presumably the CMIP5 dataset version 
number).  So, yes, this was meant to replace your "predictor-version" 
attribute.
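
The first-choice/second-choice rule above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the list-of-dicts input, the "version" key, the space separator, and the sample identifier are all hypothetical. Only the names tracking_id and driving_data_tracking_ids come from the discussion.

```python
def driving_data_tracking_ids(driving_files):
    """Build a value for the driving_data_tracking_ids global attribute.

    driving_files: list of dicts, each holding the global attributes of one
    driving file (hypothetical layout, e.g. read beforehand from netCDF).
    """
    ids = []
    for attrs in driving_files:
        # First choice: the tracking_id that CMIP5 assigns to every file.
        ident = attrs.get("tracking_id")
        if not ident:
            # Second choice: some other version identifier, such as the
            # dataset version number (the key name "version" is an assumption).
            ident = attrs.get("version")
        if ident and ident not in ids:
            ids.append(ident)
    # Joining with spaces is an assumption; no separator is fixed in the thread.
    return " ".join(ids)


files = [
    {"tracking_id": "a1b2c3d4-0000-4000-8000-000000000001"},  # made-up id
    {"version": "v20121024"},  # file whose tracking_id could not be recovered
]
print(driving_data_tracking_ids(files))
```

Files that carry neither identifier are simply skipped here; a real workflow might prefer to flag them.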
>
> 2.  You recommended that the resolution_id should contain the 
> resolution rounded off to the nearest km.  The NEX resolution is 800 
> meters.  While we could round up to 1, could we also not use .8?  
> Without arguing the current value of data downscaled to 800 meters, 
> resolutions are moving in that direction.
I had assumed no downscaling method for climate data would be accurate 
enough to warrant providing information on scales smaller than a km.  I 
guess some folks disagree.  To allow for the case of 800 meters, I 
wouldn't recommend using a decimal point (because that character, when it 
appears in filenames, for example, usually means something special).  
For other purposes, we've substituted a "p" for ".", so perhaps that 
would be acceptable here.  For 800 meters you would write 
horizontal_id=hnp8  (or hip8, if this is an interpolated product).  
Another option would be to place an "m" after the number (e.g., hn800m).  
If this were done, we might consider adding "km" to resolutions 
expressed in kilometers, but I'm not sure about that.  Comments welcome.
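
A minimal Python sketch of the "p for ." option, for illustration only. The function name, the one-decimal precision below 1 km, and the form used for whole-kilometer values (prefix followed by the rounded number) are assumptions. Only the strings hnp8 and hip8 for 800 meters are taken from the text above.

```python
def horizontal_id(resolution_km, interpolated=False):
    """Encode a horizontal resolution given in km without using a '.'."""
    if resolution_km < 0.05:
        raise ValueError("below the 0.1 km precision of this sketch")
    # "hi" marks an interpolated product, "hn" the other case (from the text).
    prefix = "hi" if interpolated else "hn"
    if round(resolution_km, 1) >= 1:
        # 1 km or coarser: round to the nearest km.
        return f"{prefix}{round(resolution_km)}"
    # Finer than 1 km: keep one decimal, drop the leading zero,
    # and write "p" in place of the decimal point.
    text = f"{resolution_km:.1f}"
    return prefix + text.lstrip("0").replace(".", "p")


print(horizontal_id(0.8))                     # 800 m
print(horizontal_id(0.8, interpolated=True))  # 800 m, interpolated product
print(horizontal_id(25))                      # 25 km
```

Note that Python's round() resolves exact .5 ties to the nearest even integer, so a different tie rule would need its own rounding step.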
>
> 3.  I'm still looking for a place to include the doi for the dataset.  
> The optional references attribute seems like the best place for it but 
> it could also go in the contact attribute?  Any thoughts?
Yes, I would certainly include it in the "source" attribute, rather than 
the contact attribute, but it would make sense to me to also include a 
new global attribute called "doi" and store in it the full doi string.  
The source attribute can be understood by humans, but the "doi" 
attribute could more readily be interpreted by software.

best regards,
Karl
>
>
> Thanks.
>
>   Laura.
>
> On 3/25/2013 5:49 PM, Karl Taylor wrote:
>>
>> Dear all,
>>
>> I have spent considerable time reviewing the following four documents:
>>
>> A. The email (copied below) sent by Galia and Aparna, which proposed 
>> attributes, filenames, and directory structures for downscaled data.
>>
>> B. 
>> http://cmip-pcmdi.llnl.gov/cmip5/docs/cmip5_data_reference_syntax.pdf 
>> which describes the corresponding CMIP5 metadata.
>>
>> C. 
>> http://cordex.dmi.dk/joomla/images/CORDEX/cordex_archive_specifications.pdf 
>> <http://cordex.dmi.dk/joomla/images/CORDEX/cordex_archive_specifications_121022.pdf> 
>> which describes the corresponding CORDEX metadata.
>>
>> D. http://cmip-pcmdi.llnl.gov/cmip5/docs/CMIP5_output_metadata_requirements.pdf 
>> which specifies all the CMIP5 metadata requirements.
>>
>> I hope that document A above could be made compatible with the others 
>> and in general could provide a sound basis for establishing more 
>> uniformity moving forward.  Toward that end, I have prepared the 
>> attached document describing for downscaled data a minimal set of 
>> global attributes needed to augment those used in CMIP5 and also the 
>> extensions needed to the DRS document to accommodate downscaled data.
>>
>> I hope at least a few of you will take the time to study this 
>> document and provide feedback.
>>
>> Best regards,
>> Karl
>>
>>
>> Mail sent by Galia Guentchev 3/12/13
>>
>> **********************************************************************
>> Details of each element of the proposed directory structure:
>>
>> Proposed elements -
>> /projectID/sub-project/product/institution/predictorModel/experimentID/frequency/realm/MIPtable/Predictor_experiment_rip/predictorversion/downscalingMethod/predictand (variableName)/region/DownscaledDataversion/file_name.nc
>>
>> Example:
>>
>> /ncpp2013/perfectModel/downscaled/NOAA-GFDL/GFDL-HIRAM-C360-coarsened/amip/day/atmos/day/r1i1p1/v20121024/GFDL-ARRMv1/tasmax/US48/v20120227/tasmax_day_amip_r1i1p1_downscaled_US48_GFDLARRMv1_19790101-19831231.nc
>>
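
As an illustration, the proposed element order can be turned into a path with a few lines of Python. The facet names and values are copied from the proposal and its example (ignoring any emphasis markup); the function name and the dictionary layout are hypothetical.

```python
import posixpath

# Element order as listed in the proposed directory structure.
FACET_ORDER = [
    "projectID", "sub-project", "product", "institution",
    "predictorModel", "experimentID", "frequency", "realm", "MIPtable",
    "Predictor_experiment_rip", "predictorversion",
    "downscalingMethod", "predictand", "region", "DownscaledDataversion",
]


def build_path(facets, file_name):
    """Join the facet values in the proposed order, then append the file."""
    parts = [facets[name] for name in FACET_ORDER]
    return "/" + posixpath.join(*parts, file_name)


example = {
    "projectID": "ncpp2013",
    "sub-project": "perfectModel",
    "product": "downscaled",
    "institution": "NOAA-GFDL",
    "predictorModel": "GFDL-HIRAM-C360-coarsened",
    "experimentID": "amip",
    "frequency": "day",
    "realm": "atmos",
    "MIPtable": "day",
    "Predictor_experiment_rip": "r1i1p1",
    "predictorversion": "v20121024",
    "downscalingMethod": "GFDL-ARRMv1",
    "predictand": "tasmax",
    "region": "US48",
    "DownscaledDataversion": "v20120227",
}
file_name = "tasmax_day_amip_r1i1p1_downscaled_US48_GFDLARRMv1_19790101-19831231.nc"
print(build_path(example, file_name))
```

A missing facet raises KeyError, which is probably the desired behavior for a fixed directory layout.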
>> The new element sub-project gives the opportunity to 
>> indicate to users whether the method was trained on 
>> observations (standard setting) or on a model that was 
>> considered to be the truth (perfect model setting).
>> The options there could be PerfectModel or Standard, where possibly 
>> a different name could be used instead of 'standard' for the 
>> standard downscaling setting.
>>
>> For NASA datasets some of the directories could be:
>>
>> project = NEX
>> product = downscaled
>> institution = NASA-Ames
>> predictorModel - original model value
>> experimentID = historical
>> frequency = mon
>> realm = atmos
>> Predictor_experiment_rip - original model value
>> variable = precipitation or temperature
>> region = CONUS
>>
>> DownscalingMethod will also be included as a directory to allow for 
>> search on method.
>>
>> **********************
>> There is a set of sub-directories that refer to the PredictorModel:
>> /predictorModel/experimentID/frequency/realm/MIPtable/Predictor_experiment_rip/predictorversion
>>
>> Where:
>>
>>   * predictor model - is the specific GCM which is the source of the
>>     predictor data set - GFDL-HIRAM-C360-coarsened - in the above example
>>   * experimentID - the specific experiment - amip in this case
>>   * frequency - refers to the temporal scale of the predictor fields
>>     - daily
>>   * realm - the realm of the predictors - in this case atmos(phere)
>>   * MIPtable - name of the model intercomparison (MIP) table - day in
>>     this example; could be Amon for atmospheric monthly data;
>>   * Predictor-Experiment-rip - follows the standard notation from CMIP5
>>   * version - the version date of the global model that provided the
>>     predictor dataset
>>
>> The elements above follow quite closely the structure for CMIP5 model 
>> output directory elements.
>>
>> There is a set of sub-directories that refer to the downscaling 
>> method:
>> /downscalingMethod/predictand (variableName)/region/DownscaledDataversion
>>
>> Where:
>>
>>   * downscalingMethod - is the downscaling method abbreviation - in
>>     this case GFDL-ARRMv1 - the GFDL in the name indicates that this
>>     is a setting applied by GFDL where there were two sets of
>>     predictors, based on the ARRM method of K.Hayhoe; also v.1
>>     indicates which version of the ARRM method was used (the original
>>     version) - more details about the method are given in the global
>>     attributes of the file;
>>   * Predictand (variableName) - the specific predictand variable that
>>     was downscaled; tasmax in this case;
>>   * region - indicates that the method was applied to the US48
>>   * DownscaledDataversion - the version of the downscaled dataset
>>
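
The two groups of sub-directories described above can be recovered from a path as sketched below. This assumes the single, predictor-first ordering and a fixed number of elements; the function and constant names are hypothetical, and the path is the example from the proposal.

```python
LEADING = ["projectID", "sub-project", "product", "institution"]
PREDICTOR = ["predictorModel", "experimentID", "frequency", "realm",
             "MIPtable", "Predictor_experiment_rip", "predictorversion"]
DOWNSCALING = ["downscalingMethod", "predictand", "region",
               "DownscaledDataversion"]


def parse_path(path):
    """Split a dataset path into predictor and downscaling facet groups."""
    names = LEADING + PREDICTOR + DOWNSCALING
    parts = path.strip("/").split("/")
    if len(parts) != len(names) + 1:  # directories plus the file name
        raise ValueError(f"expected {len(names) + 1} elements, got {len(parts)}")
    *dirs, file_name = parts
    facets = dict(zip(names, dirs))
    return {
        "predictor": {k: facets[k] for k in PREDICTOR},
        "downscaling": {k: facets[k] for k in DOWNSCALING},
        "file_name": file_name,
    }


path = ("/ncpp2013/perfectModel/downscaled/NOAA-GFDL/"
        "GFDL-HIRAM-C360-coarsened/amip/day/atmos/day/r1i1p1/v20121024/"
        "GFDL-ARRMv1/tasmax/US48/v20120227/"
        "tasmax_day_amip_r1i1p1_downscaled_US48_GFDLARRMv1_19790101-19831231.nc")
result = parse_path(path)
print(result["predictor"]["predictorModel"])
print(result["downscaling"]["downscalingMethod"])
```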
>> *For the purposes of standardization there are two directions to 
>> consider:*
>>
>> 1) One is to have *one standard directory* structure that will be used 
>> by all - for example, following the example of GFDL to have the 
>> details of the predictor model first and then the downscaling method 
>> details:
>>
>>   * ProjectID - sub-project - product - Institution - Predictor
>>     dataset details - Downscaling method details - Filename
>>
>> Having a standardized approach would help any automated service/web 
>> service to detect the directory path for a particular dataset.
>>
>> 2) During our last teleconference there was a proposal to follow the 
>> downscaling practice and describe the downscaling method first and 
>> then the predictor model. This leads to *two paths*:
>>
>>         • ProjectID - _Standard or Perfect Model sub-project facet_- 
>> product - Institution -  then see below:
>>                -  (if Perfect model setting) Predictor dataset 
>> details - Downscaling method details,
>>                -  (if Standard setting) - Downscaling method details 
>> - Predictor dataset details
>>
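
For the two-path option, the choice of ordering reduces to a small rule keyed on the sub-project facet. The Python sketch below is illustrative only: the function name is hypothetical, and matching the facet value case-insensitively is an assumption.

```python
def facet_group_order(sub_project):
    """Return the order of path groups for a given sub-project facet."""
    leading = ["ProjectID", "sub-project", "product", "Institution"]
    predictor = "Predictor dataset details"
    method = "Downscaling method details"
    key = sub_project.lower()
    if key == "perfectmodel":
        # Perfect model setting: predictor first, then the method.
        return leading + [predictor, method]
    if key == "standard":
        # Standard setting: method first, then the predictor.
        return leading + [method, predictor]
    raise ValueError(f"unknown sub-project: {sub_project!r}")


print(facet_group_order("perfectModel"))
print(facet_group_order("Standard"))
```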
>>
>> The NCPP Core team accepts that it may be reasonable to have one 
>> directory structure in which the method description comes first and 
>> another in which the predictor description comes first, followed by 
>> the methods that are applied. *NCPP will support either approach* 
>> (one overall directory structure, or two separate pathways). If the 
>> second approach is chosen (with two different sub-directory 
>> sequences), we would like to promote and support the standardization 
>> of these directory pathways; that is, we will support two 
>> standardized directory structures to accommodate two common 
>> practices.
>>
>>
>> ******************
>> Additional details:
>>
>> *Variable-level attributes* -
>> The published dataset should also conform to the CF conventions, 
>> e.g.:
>>
>>                 tasmax:long_name = "Downscaled Daily Maximum Near-Surface Air Temperature" ;
>>                 tasmax:units = "K" ;
>>                 tasmax:missing_value = 1.e+20f ;
>>                 tasmax:_FillValue = 1.e+20f ;
>>                 tasmax:standard_name = "air_temperature" ;
>>                 tasmax:original_units = "K" ;
>>                 tasmax:downscaling_method = "GFDL-ARRMv1" ;
>>
>> *Global attributes* - listing a few here; several CMIP-style 
>> attributes will be inherited.
>>
>> "predictorModel" will replace "model_id"
>>   For the 'downscaling model', as agreed with Luca on the call it 
>> would be 'downscalingMethod'
>>
>>                 :Conventions = "CF-1.4" ;
>>                 :references = "info about model, training datasets 
>> etc will be provided here"
>>                 :info = "additional info about the downscaling method"
>>                 :creation_date = "2011-08-19T21:57:06Z" ;
>>                 :institution = "NOAA GFDL(201 Forrestal Rd, 
>> Princeton, NJ, 08540)" ;
>>                 :history = "info on file processing, e.g. processed by 
>> toolX." ;
>>                 :projectID = ncpp2013
>>                 :subprojectID = perfectModel
>>                 :product = downscaled
>>                 :institution = NOAA-GFDL
>>                 :predictorModel = GFDL-HIRAM-C360-coarsened
>>                 :experimentID = amip
>>                 :frequency = day
>>                 :modeling_realm = atmos
>>                 :Predictor_experiment_rip = r1i1p1
>>                 :region = US48
>>                 :table_id = day
>>                 :version = v20120227
>>                 :downscalingMethod = GFDL-ARRMv1
>> **************************************************
>>
>> Best regards,
>> Galia and Aparna
>>
>> -- 
>> Galia Guentchev, PhD
>> Project Scientist
>> National Climate
>> Predictions and
>> Projections
>> Platform (NCPP)
>> NCAR RAL CSAP
>> FL2 3103
>> 3450 Mitchell Lane
>> Boulder, CO, 80301
>> phone: 303 497 2743
>>
>
>
> -- 
>
>    Laura Carriere, CSC		       laura.carriere at nasa.gov
>    NCCS, Code 606.2		       301 614-5064
