[Go-essp-tech] Global attributes and DRS extensions for downscaled datasets

Chris Jack cjack at csag.uct.ac.za
Sat Mar 30 01:23:50 MDT 2013


Hi Martin and others

I just want to echo the point about statistical downscaling in CORDEX.  
For a whole lot of reasons (including scientific and impacts 
application) this is an area that needs to be developed. Of course SD 
raises lots of issues in terms of standardization but one of the key 
issues is the predictor/training observed datasets used.  These 
predictor sets range from basic station scale observation time series 
through to more complex merged satellite and station products.  It may 
be that the parallel discussion on an obs/validation dataset for CORDEX 
has some valid cross over here for defining the parameters for CORDEX SD 
runs.

Relevant to this thread however is the likely need to include 
information about the observed predictor set used in SD within the 
dataset attributes.

Chris


On 03/27/2013 11:23 AM, martin.juckes at stfc.ac.uk wrote:
>
> Hello Karl,
>
> Thanks for the clear response – I probably should have been able to work 
> that out if I had followed the email thread carefully.
>
> There is a separate document for CORDEX which describes the intended 
> mapping of attributes onto facets in the ESGF User Interface – I’ll 
> try to send that to you later.
>
> I understand your reservations about some aspects of the CORDEX data 
> requirements, but the aim was to use terms which are in usage in the 
> community (particularly for the “region” attribute which actually 
> combines region and resolution) and thus, hopefully, improve 
> compliance and acceptance of the standard.
>
> There is an aspiration to include statistically downscaled data in the 
> CORDEX archive (and it is another shortcoming of the CORDEX document 
> you refer to, I think, that it only deals with dynamically downscaled 
> data and does not leave a hook to allow extension to statistically 
> downscaled). The system you’ve described could presumably be used for 
> statistically downscaled data in CORDEX. We have a new European Union 
> project starting next week which funds Bruce Hewitson’s group in Cape 
> Town to do some coordination and networking on data standards for 
> CORDEX downscaling, so I’ve copied in Chris Jack who is, I think, 
> leading their effort.
>
> Regards,
>
> Martin
>
> *From:*Karl Taylor [mailto:taylor13 at llnl.gov]
> *Sent:* 27 March 2013 00:51
> *To:* Juckes, Martin (STFC,RAL,RALSP)
> *Cc:* galina at ucar.edu; obc at dmi.dk; colin.jones at smhi.se; 
> ncpp_core at list.woc.noaa.gov; ncpp_tech at list.woc.noaa.gov; 
> go-essp-tech at ucar.edu; laura.e.carriere at nasa.gov; 
> gerald.potter at nasa.gov; williams13 at llnl.gov; denis.nadeau at nasa.gov; 
> Pascoe, Stephen (STFC,RAL,RALSP)
> *Subject:* Re: Global attributes and DRS extensions for downscaled 
> datasets
>
> Dear Martin,
>
> I'm not advocating changing the CORDEX requirements; it's probably 
> much too late for that.  There are limitations to the generality 
> of the CORDEX specifications, which means they might not be applicable 
> to downscaling efforts outside of CORDEX.  The document I prepared was 
> to try to address the more general issue of what descriptors are 
> needed for downscaled datasets.
>
> I have proposed that a single additional "descriptor" be added to the 
> already defined components of the DRS:
>
> Source of predictor data ⇒ driving_model_id - driving_model_rip (e.g. 
> “GFDL-CM3-r1i1p1”)   In some cases the driving_model_rip might be 
> omitted (e.g., when using reanalysis output to drive the downscaling).
>
> In CORDEX this descriptor could be formed by joining with a hyphen 
> your GCMModelName and CMIP5EnsembleMember.
>
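The "source of predictor data" descriptor above can be sketched in a few lines of Python. This is only an illustration of the joining rule as described in the message; the function name `driving_descriptor` and the reanalysis label used in the second call are assumptions made for the example, not names taken from the DRS or CORDEX documents.

```python
def driving_descriptor(driving_model_id, driving_model_rip=None):
    """Join the driving model id and its rip with a hyphen.

    The rip part is left out when it is not given, as suggested for
    reanalysis-driven downscaling.
    """
    if driving_model_rip:
        return f"{driving_model_id}-{driving_model_rip}"
    return driving_model_id


# Example from the message: GCM name plus ensemble member.
print(driving_descriptor("GFDL-CM3", "r1i1p1"))

# Hypothetical reanalysis label, with no rip segment.
print(driving_descriptor("SOME-REANALYSIS"))
```

For CORDEX the two arguments would presumably be the GCMModelName and CMIP5EnsembleMember values.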
> I have also proposed expanding the "ensemble member" descriptor to 
> include an indication of the "nominal resolution".  The idea here is 
> that output might need to be regridded or be made available at various 
> resolutions, so we would like to be able to distinguish among these 
> closely related datasets.  Here is the description of the 'riph' 
> designator:
>
>  Ensemble member⇒  ‘riph’ designator, where the “rip” form is defined 
> as in CMIP5 (which for downscaled data would usually be “r1i1p1”), and 
> the “h” is followed by nominal resolution expressed in kilometers.   
> (For backward compatibility the DRS would consider the “h” segment as 
> optional, but it is required for downscaled datasets.)  The last part 
> of the 'riph' designator is of the form “hnXXXX” or “hiXXXX” where 
> XXXX is the nominal horizontal resolution of the downscaled data, 
> expressed in kilometers (rounded to the nearest km with leading zeros 
> dropped).  “hn” indicates that the data is stored on the model’s 
> “native” grid, while “hi” indicates that the data has been 
> interpolated from a model’s native grid to a different grid.  
> (Statistically downscaled data would normally be recorded on a 
> so-called “native” grid.)  Data on a native grid at a nominal 
> resolution of 5 km, for example, would be identified as “hn5”, while 
> regridded data at 11 km resolution would be identified as “hi11”. The 
> XXXX should be calculated as follows:  XXXX = sqrt(domain area / 
> (number of grid cells)), expressed in km/grid cell and rounded off to 
> the nearest km.
>
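A minimal sketch of the 'riph' rule described above, assuming the "h" segment is appended directly to the rip string (the message does not spell out a separator) and that exact .5 ties round up (the message only says "nearest km"). The function names and the domain areas and cell counts in the usage lines are made up for illustration.

```python
import math


def nominal_resolution_km(domain_area_km2, n_grid_cells):
    """XXXX = sqrt(domain area / number of grid cells), rounded to whole km."""
    if domain_area_km2 <= 0 or n_grid_cells <= 0:
        raise ValueError("area and cell count must be positive")
    side_km = math.sqrt(domain_area_km2 / n_grid_cells)
    # Assumption: exact .5 ties round up.
    return int(math.floor(side_km + 0.5))


def riph(rip, domain_area_km2, n_grid_cells, native=True):
    """Build e.g. 'r1i1p1hn5' (native grid) or 'r1i1p1hi11' (interpolated)."""
    grid_flag = "hn" if native else "hi"
    resolution = nominal_resolution_km(domain_area_km2, n_grid_cells)
    # Formatting an int drops any leading zeros automatically.
    return f"{rip}{grid_flag}{resolution}"


# Illustrative numbers: 2,500,000 km^2 over 100,000 cells is 5 km per cell.
print(riph("r1i1p1", 2_500_000, 100_000))               # r1i1p1hn5
# 1,210,000 km^2 regridded onto 10,000 cells is 11 km per cell.
print(riph("r1i1p1", 1_210_000, 10_000, native=False))  # r1i1p1hi11
```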
> CORDEX has chosen to include resolution information as part of a 
> domain name (e.g., CAM-44 or SAM-44i), but the resolution doesn't seem 
> to me to belong as part of the region identification.
>
> I should note also that CORDEX specifies a directory structure and/or 
> filenames where in the CORDEX document some of the DRS categories are 
> renamed.  I've attached a table that shows the DRS elements and 
> corresponding CORDEX identifiers, along with global attributes.  (I'm 
> going to try to get NASA / NOAA to be consistent with the DRS.)  I 
> also provide a table of additional global attributes.  CORDEX is 
> mostly consistent with this table, except for using "CORDEX_domain" and 
> omitting driving_model_tracking_ids.
>
> Finally, I note that in the example found in the CORDEX document for 
> global attributes:
>
> 1)  experiment_id = "evaluation",  but in the directory structure and 
> filename templates, this is presumably used as "CMIP5ExperimentName",  
> but of course "evaluation" is not a CMIP5 experiment.  I think a 
> better term for "CMIP5ExperimentName" is simply "experiment", which in 
> the case of CORDEX is usually the same as the CMIP5 experiment_id.
>
> 2)  CORDEX requires "contact", but this was left out of the example.
>
> Please let me know what you think.
>
> Best regards,
> Karl
>
> On 3/26/13 4:58 AM, martin.juckes at stfc.ac.uk 
> <mailto:martin.juckes at stfc.ac.uk> wrote:
>
>     Hello Karl,
>
>     I’m puzzled about how this fits in with CORDEX. We went through
>     this discussion some time ago, and agreed on some data
>     requirements in the document you cite below which we believed to
>     be appropriately consistent with the CMIP5 requirements. This
>     document was then discussed at a WCRP meeting and has been
>     circulated as the requirements for groups submitting CORDEX data
>     to ESGF. Since then, modelling groups have been preparing data and
>     we are expecting to start publication soon.  Do you think there
>     are problems with uniformity in the way the CORDEX requirements
>     are specified?
>
>     Regards,
>
>     Martin
>
>     *From:*Karl Taylor [mailto:taylor13 at llnl.gov]
>     *Sent:* 25 March 2013 21:50
>     *To:* Galia Guentchev
>     *Cc:* ncpp_core at list.woc.noaa.gov
>     <mailto:ncpp_core at list.woc.noaa.gov>; NCPP TECHNICAL TEAM;
>     go-essp-tech at ucar.edu <mailto:go-essp-tech at ucar.edu>;
>     laura.e.carriere at nasa.gov <mailto:laura.e.carriere at nasa.gov>;
>     Potter, Gerald Lee. (GSFC-606.2)[UNIVERSITY OF MARYLAND]; Dean
>     Williams; Nadeau, Denis (GSFC-610.1)[R S INFORMATION SYSTEMS, ];
>     Juckes, Martin (STFC,RAL,RALSP); Pascoe, Stephen (STFC,RAL,RALSP)
>     *Subject:* Global attributes and DRS extensions for downscaled
>     datasets
>
>     Dear all,
>
>     I have spent considerable time reviewing the following four documents:
>
>     A. The email (copied below) sent by Galia and Aparna, which
>     proposed attributes, filenames, and directory structures for
>     downscaled data.
>
>     B.
>     http://cmip-pcmdi.llnl.gov/cmip5/docs/cmip5_data_reference_syntax.pdf
>     which describes the corresponding CMIP5 metadata.
>
>     C.
>     http://cordex.dmi.dk/joomla/images/CORDEX/cordex_archive_specifications.pdf
>     <http://cordex.dmi.dk/joomla/images/CORDEX/cordex_archive_specifications_121022.pdf>
>     which describes the corresponding CORDEX metadata.
>
>     D.
>     http://cmip-pcmdi.llnl.gov/cmip5/docs/CMIP5_output_metadata_requirements.pdf
>     which specifies all the CMIP5 metadata requirements.
>
>     I hope that document A above could be made compatible with the
>     others and in general could provide a sound basis for establishing
>     more uniformity moving forward.  Toward that end, I have prepared
>     the attached document describing for downscaled data a minimal set
>     of  global attributes needed to augment those used in CMIP5 and
>     also the extensions needed to the DRS document to accommodate
>     downscaled data.
>
>     I hope at least a few of you will take the time to study this
>     document and provide feedback.
>
>     Best regards,
>     Karl
>
>
>     Mail sent by Galia Guentchev 3/12/13
>
>     **********************************************************************
>     Details of each element of the proposed directory structure:
>
>     Proposed elements -
>     /projectID/sub-project/product/institution/*predictorModel/experimentID/frequency/realm/MIPtable/Predictor_experiment_rip/predictorversion*//downscalingMethod/predictand
>     (variableName)/region///DownscaledDataversion//file_name.nc
>
>     Example:
>
>     /ncpp2013/perfectModel/downscaled/NOAA-GFDL/*GFDL-HIRAM-C360-coarsened/amip/day/atmos/day/r1i1p1/v20121024*//GFDL-ARRMv1/tasmax/US48/v20120227//tasmax_day_amip_r1i1p1_downscaled_US48_GFDLARRMv1_19790101-19831231.nc
>
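The proposed layout can be sketched as a small path builder. This assumes that the asterisks and doubled slashes in the mail are leftover bold/italic markup rather than part of the path, and it uses `sub_project` as a Python-friendly key for the "sub-project" element. The names `FACET_ORDER` and `build_path` are illustrative only.

```python
# Directory elements in the order proposed in the mail.
FACET_ORDER = [
    "projectID", "sub_project", "product", "institution",
    # predictor dataset details
    "predictorModel", "experimentID", "frequency", "realm", "MIPtable",
    "Predictor_experiment_rip", "predictorversion",
    # downscaling method details
    "downscalingMethod", "predictand", "region", "DownscaledDataversion",
]


def build_path(facets, file_name):
    """Join the facet values, in FACET_ORDER, into one absolute path."""
    parts = [facets[name] for name in FACET_ORDER]  # KeyError if one is missing
    return "/" + "/".join(parts + [file_name])


example = {
    "projectID": "ncpp2013",
    "sub_project": "perfectModel",
    "product": "downscaled",
    "institution": "NOAA-GFDL",
    "predictorModel": "GFDL-HIRAM-C360-coarsened",
    "experimentID": "amip",
    "frequency": "day",
    "realm": "atmos",
    "MIPtable": "day",
    "Predictor_experiment_rip": "r1i1p1",
    "predictorversion": "v20121024",
    "downscalingMethod": "GFDL-ARRMv1",
    "predictand": "tasmax",
    "region": "US48",
    "DownscaledDataversion": "v20120227",
}
file_name = ("tasmax_day_amip_r1i1p1_downscaled_US48_GFDLARRMv1_"
             "19790101-19831231.nc")
print(build_path(example, file_name))
```

With the values from the example above, this prints the example path with the emphasis markers removed.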
>     The new element sub-project (in blue above) gives the opportunity
>     to indicate to users that in the one case the method was trained
>     on observations (standard setting), and in the other on a model that
>     was considered to be the truth (perfect model setting);
>     The options there could be: PerfectModel or Standard - where
>     possibly there could be a different name instead of 'standard' for
>     the standard downscaling setting.
>
>     For NASA datasets some of the directories could be:
>
>     project = NEX
>     product = downscaled
>     institution = NASA-Ames
>     predictorModel - original model value
>     experimentID = historical
>     frequency = mon
>     realm = atmos
>     Predictor_experiment_rip - original model value
>     variable = precipitation or temperature
>     region = CONUS
>
>     DownscalingMethod will also be included as a directory to allow
>     for search on method.
>
>     **********************
>     There is a set of sub-directories that refer to the
>     _PredictorModel_ - presented in bold -
>     */predictorModel/experimentID/frequency/realm/MIPtable/Predictor_experiment_rip/predictorversion*
>
>     Where:
>
>     ·predictor model - is the specific GCM which is the source of the
>     predictor data set - GFDL-HIRAM-C360-coarsened - in the above example
>
>     ·experimentID - the specific experiment - amip in this case
>
>     ·frequency - refers to the temporal scale of the predictor fields
>     - daily
>
>     ·realm - the realm of the predictors - in this case atmos(phere)
>
>     ·MIPtable - name of the model intercomparison table - daily in
>     this example, could be amon - for atm monthly data;
>
>     ·Predictor-Experiment-rip - follows the standard notation from CMIP5
>
>     ·version - the version date of the global model that provided the
>     predictor dataset
>
>     The elements above follow quite closely the structure for CMIP5
>     model output directory elements.
>
>     There is a set of sub-directories that refer to the Downscaling
>     method - presented in italics -
>     /downscalingMethod/predictand
>     (variableName)/region/DownscaledDataversion
>
>     Where:
>
>     ·downscalingMethod - is the downscaling method abbreviation - in
>     this case GFDL-ARRMv1 - the GFDL in the name indicates that this
>     is a setting applied by GFDL where there were two sets of
>     predictors, based on the ARRM method of K.Hayhoe; also v.1
>     indicates which version of the ARRM method was used (the original
>     version) - more details about the method are given in the global
>     attributes of the file;
>
>     ·Predictand (variableName) - the specific predictand variable that
>     was downscaled; tasmax in this case;
>
>     ·region - indicates that the method was applied to the US48
>
>     ·DownscaledDataversion - the version of the downscaled dataset
>
>     *For the purposes of standardization there are two directions to
>     consider:*
>
>     1) One is to have *one standard directory* structure that will be
>     used by all - for example, following the example of GFDL to have
>     the details of the predictor model first and then the downscaling
>     method details:
>
>     ·ProjectID - sub-project - product - Institution - Predictor
>     dataset details - Downscaling method details - Filename
>
>     Having a standardized approach would help any automated
>     service/web service to detect the directory path for a particular
>     dataset.
>
>     2) During our last teleconference there was a proposal to follow
>     the downscaling practice and describe the downscaling method first
>     and then the predictor model. This leads to *two paths*:
>
>             • ProjectID - _Standard or Perfect Model sub-project facet_ - product - Institution - then see below:
>                    -  (if Perfect model setting) Predictor dataset details - Downscaling method details,
>                    -  (if Standard setting) Downscaling method details - Predictor dataset details
>
>
>     The NCPP Core team accepts that it may be reasonable to have a
>     directory structure - where the method description is first; and
>     another directory structure - where the predictor description is
>     first and then the methods that are applied are described; *NCPP
>     will support either approach* (one overall directory structure, or
>     two separate pathways) and if the second approach is chosen (with
>     two different sub-directory sequences) - we would like to promote
>     and to support the standardization of these different directory
>     pathways - meaning - we will support two standardized directory
>     structures to accommodate two common practices.
>
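The two-pathway option could be expressed as a lookup on the sub-project facet. This is a sketch under the assumption that the only two sub-project values are "PerfectModel" and "Standard" (the mail notes that the name for the standard setting is not settled), compared case-insensitively. The list and function names are illustrative.

```python
COMMON = ["projectID", "sub_project", "product", "institution"]
PREDICTOR_DETAILS = [
    "predictorModel", "experimentID", "frequency", "realm", "MIPtable",
    "Predictor_experiment_rip", "predictorversion",
]
METHOD_DETAILS = [
    "downscalingMethod", "predictand", "region", "DownscaledDataversion",
]


def facet_order(sub_project):
    """Return the directory element order for the given sub-project."""
    key = sub_project.lower()
    if key == "perfectmodel":
        # Perfect model setting: predictor details first, then the method.
        return COMMON + PREDICTOR_DETAILS + METHOD_DETAILS
    if key == "standard":
        # Standard setting: method first, then predictor details.
        return COMMON + METHOD_DETAILS + PREDICTOR_DETAILS
    raise ValueError(f"unknown sub-project: {sub_project!r}")


print(facet_order("perfectModel")[4])  # predictorModel
print(facet_order("Standard")[4])      # downscalingMethod
```

An automated service would then only need the sub-project value to know which of the two standardized sequences to expect.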
>     ******************
>     Additional details:
>
>     *Variable level attributes-*
>     The published dataset should also conform to CF-standards.
>     eg-
>
>                     tasmax:long_name = "Downscaled Daily Maximum
>     Near-Surface Air Temperature" ;
>                     tasmax:units = "K" ;
>                     tasmax:missing_value = 1.e+20f ;
>                     tasmax:_FillValue = 1.e+20f ;
>                     tasmax:standard_name = "air_temperature" ;
>                     tasmax:original_units = "K" ;
>     *                tasmax:downscaling_method: GFDL-ARRMv1*
>
>     *Global attributes- *listing a few here, several CMIP-style
>     attributes will be inherited.
>
>     "predictorModel" will replace "model_id"
>       For the 'downscaling model', as agreed with Luca on the call it
>     would be 'downscalingMethod'
>
>                     :Conventions = "CF-1.4" ;
>                     :references = "info about model, training datasets
>     etc will be provided here"
>                     :info = "additional info about the downscaling
>     method"
>                     :creation_date = "2011-08-19T21:57:06Z" ;
>                     :institution = "NOAA GFDL(201 Forrestal Rd,
>     Princeton, NJ, 08540)" ;
>                     :history = "info on file processing, e.g. processed
>     by toolX." ;
>                     :projectID = ncpp2013
>                     :subprojectID = perfectModel
>                     :product = downscaled
>                     :institution = NOAA-GFDL
>                     :predictorModel = GFDL-HIRAM-C360-coarsened
>                     :experimentID = amip
>                     :frequency = day
>                     :modeling_realm = atmos
>                     :Predictor_experiment_rip = r1i1p1
>                     :region = US48
>                     :table_id = day
>                     :version = v20120227
>                     :downscalingMethod = GFDL-ARRMv1
>     **************************************************
>
>     Best regards,
>     Galia and Aparna
>
>
>
>     -- 
>
>     Galia Guentchev, PhD
>
>     Project Scientist
>
>     National Climate
>
>     Predictions and
>
>     Projections
>
>     Platform (NCPP)
>
>     NCAR RAL CSAP
>
>     FL2 3103
>
>     3450 Mitchell Lane
>
>     Boulder, CO, 80301
>
>     phone: 303 497 2743
>
>


-- 
Chris Jack PhD
e: cjack at csag.uct.ac.za
p: +27 21 650 2684