<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<font face="Times New Roman">Hi again,<br>
<br>
Yes, this is true. For CMIP5 there was a case where the "rip"
numbers (identifying ensemble members) weren't originally
coordinated between </font>two centers running the same model,
and metadata and file names had to be corrected. I should note
that if we get away from the idea that the institution can somehow be
used to identify data (as opposed to using, for this purpose, the
model that actually produced the data), then folks running models
will more likely realize that it is important to coordinate the
labeling of their simulation output. <br>
<br>
One requirement for filenames for CMIP5 was that they should be
unique across the entire archive. If two institutions ran the same
model and didn't coordinate "rip" identifying numbers, then we could
have duplicate names which users would find confusing at the very
least.<br>
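<br>
To illustrate the problem, here is a minimal sketch, assuming (hypothetically) that each center can supply a plain list of the file names it publishes. The function name, the center labels and the sample file names are made up for illustration and are not part of any ESGF tool:<br>
<pre wrap="">
```python
from collections import defaultdict


def find_duplicate_names(listings):
    """Return {file_name: sorted institutions} for names published by more than one institution.

    `listings` maps an institution label to the file names it publishes
    (hypothetical input, not an ESGF API).
    """
    owners = defaultdict(set)
    for institution, names in listings.items():
        for name in names:
            owners[name].add(institution)
    return {name: sorted(insts) for name, insts in owners.items() if len(insts) > 1}


# Two centers running the same (made-up) model without coordinating "rip" numbers.
listings = {
    "CenterA": ["tas_Amon_ModelX_historical_r1i1p1_185001-200512.nc"],
    "CenterB": [
        "tas_Amon_ModelX_historical_r1i1p1_185001-200512.nc",  # same rip reused
        "tas_Amon_ModelX_historical_r2i1p1_185001-200512.nc",
    ],
}
duplicates = find_duplicate_names(listings)
print(duplicates)
```
</pre>
Any name reported here would violate the "unique across the entire archive" requirement.<br>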
<br>
Karl<br>
<br>
<div class="moz-cite-prefix">On 3/14/13 9:31 AM, Jeffrey F. Painter
wrote:<br>
</div>
<blockquote cite="mid:5141FB4F.9090103@llnl.gov" type="cite">
If the institution were not in the directory structure, and the
same model were run by several institutions, then they would have
to be careful to coordinate realization numbers and the rest of
the ensemble facet. These numbers aren't (and probably can't be)
standardized - they describe a wide range of choices made by
whoever runs the model.<br>
<br>
- Jeff<br>
<br>
On 3/13/13 5:52 PM, Karl Taylor wrote:
<blockquote cite="mid:51411F42.3090208@llnl.gov" type="cite"> <font
face="Times New Roman">Dear Galia and all,<br>
<br>
I'm glad thought is being given to unify directory structures
across projects. <br>
<br>
That being said it may not be known by everyone that:<br>
<br>
1) From the user's perspective this isn't really that
necessary, since the directory structure is hidden by ESGF
(but see below for further discussion).<br>
2) The ESGF publishing software doesn't care much at all about
the directory structure.<br>
3) In particular the ESGF search categories (facets) are not
tied to the directory structure.<br>
<br>
So there is quite a bit of flexibility allowed by ESGF, and
users can normally access data blissfully unaware of how it's
been organized on various data nodes.<br>
<br>
Nevertheless, I agree that at least for some "projects"
organizing data in unified directory structures is useful.
Node managers, who often want to access data outside of the
ESGF API, can more easily find what they're looking for with a
standardized directory structure. If the directory structure
is rigorously enforced, users could take a wget script created
to download one variable and easily modify it to download a
different variable (although it's hard to be sure of success).<br>
<br>
Concerning your proposed directory structure, I have two
comments:<br>
<br>
1)&nbsp;&nbsp; I would recommend omitting the "institution"
subdirectory. For CMIP5 I think it was a mistake to include
this (and also, I wouldn't include it as a search category).
The institution can be recorded in the metadata of each file.
If the same model was run at more than one institution, then I'd
like to see all the simulations under a single "model"
subdirectory, not split across two directories under different
institutions.<br>
<br>
2) When possible, I would stick with the terminology
established in the "CMIP5 Data Reference Syntax (DRS) and
Controlled Vocabularies" document available at: <a
moz-do-not-send="true" class="moz-txt-link-freetext"
href="http://cmip-pcmdi.llnl.gov/cmip5/docs/cmip5_data_reference_syntax.pdf">http://cmip-pcmdi.llnl.gov/cmip5/docs/cmip5_data_reference_syntax.pdf</a><br>
<br>
Best regards,<br>
Karl<br>
<br>
<br>
<br>
<br>
</font>
<div class="moz-cite-prefix">On 3/12/13 10:14 AM, Galia
Guentchev wrote:<br>
</div>
<blockquote cite="mid:513F6287.6050708@ucar.edu" type="cite"> Hi
everybody,<br>
<br>
Several groups have expressed interest in publishing downscaled
climate datasets on ESGF. A standardized solution to
publishing (directory structure elements) would contribute to
the prompt identification of datasets. To discuss needs and
options for directory structure elements we had an initial
teleconference about a month ago. With this email we are
expanding our reach to other groups, such as the go-essp
group, in order to have a wider discussion of these elements.<br>
<br>
As agreed during our first teleconference, Aparna and Galia
worked on a proposal for a Directory Structure for publishing
downscaled datasets on ESGF. We would like to focus our next
teleconference on discussing this proposal. Below please find
a doodle poll for a potential next teleconference.<br>
<br>
<pre wrap=""><a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://doodle.com/hrwthqs2g5pgsyv6">http://doodle.com/hrwthqs2g5pgsyv6</a></pre>
<br>
**********************************************************************<br>
Details of each element of the proposed directory structure:<br>
<br>
Proposed elements -<br>
/projectID/<font color="#3333ff">sub-project</font>/product/institution/<b>predictorModel/experimentID/frequency/realm/MIPtable/Predictor_experiment_rip/predictorversion</b>/<i>downscalingMethod/predictand
(variableName)/region</i>/<i>DownscaledDataversion</i>/file_name.nc<br>
<br>
Example:<br>
<br>
/ncpp2013/<font color="#3333ff">perfectModel</font>/downscaled/NOAA-GFDL/<b>GFDL-HIRAM-C360-coarsened/amip/day/atmos/day/r1i1p1/v20121024</b>/<i>GFDL-ARRMv1/tasmax/US48/v20120227</i>/tasmax_day_amip_r1i1p1_downscaled_US48_GFDLARRMv1_19790101-19831231.nc<br>
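<br>
As a minimal sketch of how such a path could be assembled, assuming the element order proposed above (the helper name build_path and the dictionary layout are hypothetical and only for illustration):<br>
<pre wrap="">
```python
# Order of the directory elements, copied from the proposed structure above.
ELEMENT_ORDER = [
    "projectID", "sub-project", "product", "institution",
    "predictorModel", "experimentID", "frequency", "realm", "MIPtable",
    "Predictor_experiment_rip", "predictorversion",
    "downscalingMethod", "predictand", "region", "DownscaledDataversion",
]


def build_path(elements, file_name):
    """Join the element values in the proposed order (hypothetical helper)."""
    missing = [name for name in ELEMENT_ORDER if name not in elements]
    if missing:
        raise KeyError("missing directory elements: " + ", ".join(missing))
    return "/" + "/".join([elements[name] for name in ELEMENT_ORDER] + [file_name])


# Values taken from the GFDL example above.
example = {
    "projectID": "ncpp2013",
    "sub-project": "perfectModel",
    "product": "downscaled",
    "institution": "NOAA-GFDL",
    "predictorModel": "GFDL-HIRAM-C360-coarsened",
    "experimentID": "amip",
    "frequency": "day",
    "realm": "atmos",
    "MIPtable": "day",
    "Predictor_experiment_rip": "r1i1p1",
    "predictorversion": "v20121024",
    "downscalingMethod": "GFDL-ARRMv1",
    "predictand": "tasmax",
    "region": "US48",
    "DownscaledDataversion": "v20120227",
}
path = build_path(
    example,
    "tasmax_day_amip_r1i1p1_downscaled_US48_GFDLARRMv1_19790101-19831231.nc",
)
print(path)
```
</pre>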
<br>
The new element <font color="#3333ff">sub-project </font>(in
blue above) makes it possible to tell users whether the method
was trained on observations (standard setting) or on a model
that was considered to be the truth (perfect model setting);<br>
The options there could be: PerfectModel or Standard - where
possibly there could be a different name instead of 'standard'
for the standard downscaling setting.<br>
<br>
<p>For NASA datasets some of the directories could be: <br>
</p>
project = NEX<br>
product = downscaled<br>
institution = NASA-Ames<br>
predictorModel - original model value<br>
experimentID = historical<br>
frequency = mon<br>
realm = atmos<br>
Predictor_experiment_rip - original model value<br>
variable = precipitation or temperature<br>
region = CONUS<br>
<br>
DownscalingMethod will also be included as a directory to
allow for search on method.<br>
<br>
**********************<br>
There is a set of sub-directories that refer to the <u>PredictorModel</u>
- presented in bold - <b>/predictorModel/experimentID/frequency/realm/MIPtable/Predictor_experiment_rip/predictorversion</b><br>
<br>
Where: <br>
<ul>
<li>predictor model - is the specific GCM which is the
source of the predictor data set -
GFDL-HIRAM-C360-coarsened - in the above example</li>
<li>experimentID - the specific experiment - amip in this
case</li>
<li>frequency - refers to the temporal scale of the
predictor fields - daily</li>
<li>realm - the realm of the predictors - in this case
atmos(phere)</li>
<li>MIPtable - name of the model intercomparison table -
daily in this example, could be amon - for atm monthly
data;</li>
<li>Predictor-Experiment-rip - follows the standard notation
from CMIP5</li>
<li>version - the version date of the global model that
provided the predictor dataset</li>
</ul>
<p>The elements above follow quite closely the structure for
CMIP5 model output directory elements.<br>
</p>
There is a set of sub-directories that refer to the
Downscaling method - presented in italics - <br>
<i><i>downscalingMethod/predictand (variableName)/region</i>/<i>DownscaledDataversion<br>
</i> <br>
</i>Where:<br>
<ul>
<li>downscalingMethod - is the downscaling method
abbreviation - in this case GFDL-ARRMv1 - the GFDL in the
name indicates that this is a setting applied by GFDL
where there were two sets of predictors, based on the ARRM
method of K.Hayhoe; also v.1 indicates which version of
the ARRM method was used (the original version) - more
details about the method are given in the global
attributes of the file;</li>
<li>Predictand (variableName) - the specific predictand
variable that was downscaled; tasmax in this case;<br>
</li>
<li>region - indicates that the method was applied to the
US48</li>
<li>DownscaledDataversion - the version of the downscaled
dataset<br>
</li>
</ul>
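<p>The two groups of sub-directories described above can be recovered from a path with a small sketch like the following. It assumes the single proposed order (predictor details first, then downscaling details); the function name split_path and the group labels are hypothetical:</p>
<pre wrap="">
```python
PREFIX = ["projectID", "sub-project", "product", "institution"]
PREDICTOR = [
    "predictorModel", "experimentID", "frequency", "realm", "MIPtable",
    "Predictor_experiment_rip", "predictorversion",
]
DOWNSCALING = ["downscalingMethod", "predictand", "region", "DownscaledDataversion"]


def split_path(path):
    """Split a path into prefix, predictor and downscaling groups (hypothetical helper)."""
    parts = path.strip("/").split("/")
    names = PREFIX + PREDICTOR + DOWNSCALING
    # One directory per element, plus the file name at the end.
    if len(parts) != len(names) + 1:
        raise ValueError("expected %d path components, got %d" % (len(names) + 1, len(parts)))
    values = dict(zip(names, parts))
    return {
        "prefix": {name: values[name] for name in PREFIX},
        "predictor": {name: values[name] for name in PREDICTOR},
        "downscaling": {name: values[name] for name in DOWNSCALING},
        "file_name": parts[-1],
    }


groups = split_path(
    "/ncpp2013/perfectModel/downscaled/NOAA-GFDL/"
    "GFDL-HIRAM-C360-coarsened/amip/day/atmos/day/r1i1p1/v20121024/"
    "GFDL-ARRMv1/tasmax/US48/v20120227/"
    "tasmax_day_amip_r1i1p1_downscaled_US48_GFDLARRMv1_19790101-19831231.nc"
)
print(groups["predictor"])
print(groups["downscaling"])
```
</pre>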
<p><b>For the purposes of standardization there are two
directions to consider:</b> <br>
</p>
<p>1) One is to have<b> one standard directory</b> structure
that will be used by all - for example, following the
example of GFDL to have the details of the predictor model
first and then the downscaling method details:</p>
<ul>
<li>ProjectID - sub-project - product - Institution -
Predictor dataset details - Downscaling method details -
Filename</li>
</ul>
<p> Having a standardized approach would help any automated
service/web service to detect the directory path for a
particular dataset. </p>
<p>2) During our last teleconference there was a proposal to
follow the downscaling practice and describe the downscaling
method first and then the predictor model. This leads to <b>two
paths</b>:</p>
<p>&nbsp;&bull; ProjectID - <u><font color="#3333ff">Standard or
Perfect Model sub-project facet</font></u><font
color="#3333ff"> </font>- product - Institution - then
see below:<br>
- (if Perfect model setting) Predictor
dataset details - Downscaling method details, <br>
- (if Standard setting) - Downscaling method
details - Predictor dataset details<br>
</p>
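<p>A minimal sketch of option 2, assuming the sub-project value alone decides which group comes first. The function name group_order is hypothetical, and the comparison ignores case because both "PerfectModel" and "perfectModel" appear in this proposal:</p>
<pre wrap="">
```python
PREDICTOR_GROUP = "Predictor dataset details"
METHOD_GROUP = "Downscaling method details"


def group_order(sub_project):
    """Return the sequence of path groups for a given sub-project (hypothetical helper)."""
    head = ["ProjectID", "sub-project", "product", "Institution"]
    setting = sub_project.lower()
    if setting == "perfectmodel":
        # Perfect model setting: predictor details first, then the method.
        return head + [PREDICTOR_GROUP, METHOD_GROUP]
    if setting == "standard":
        # Standard setting: method first, then the predictor details.
        return head + [METHOD_GROUP, PREDICTOR_GROUP]
    raise ValueError("unknown sub-project: " + sub_project)


print(group_order("perfectModel"))
print(group_order("Standard"))
```
</pre>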
<p><br>
The NCPP Core team accepts that it may be reasonable to have
a directory structure - where the method description is
first; and another directory structure - where the predictor
description is first and then the methods that are applied
are described; <b>NCPP will support either approach</b>
(one overall directory structure, or two separate pathways)
and if the second approach is chosen (with two different
sub-directory sequences) - we would like to promote and to
support the standardization of these different directory
pathways - meaning - we will support two standardized
directory structures to accommodate two common practices.<br>
</p>
<p><br>
</p>
******************<br>
<font color="#009900">Additional details<font color="#000000">:
</font></font><br>
<br>
<b>Variable level attributes-</b><br>
The published dataset should also conform to CF-standards. <br>
eg-<br>
<br>
&nbsp;&nbsp;&nbsp; tasmax:long_name = "Downscaled Daily Maximum Near-Surface Air Temperature" ;<br>
&nbsp;&nbsp;&nbsp; tasmax:units = "K" ;<br>
&nbsp;&nbsp;&nbsp; tasmax:missing_value = 1.e+20f ;<br>
&nbsp;&nbsp;&nbsp; tasmax:_FillValue = 1.e+20f ;<br>
&nbsp;&nbsp;&nbsp; tasmax:standard_name = "air_temperature" ;<br>
&nbsp;&nbsp;&nbsp; tasmax:original_units = "K" ;<br>
<b>&nbsp;&nbsp;&nbsp; tasmax:downscaling_method = "GFDL-ARRMv1" ;</b><br>
<br>
<b>Global attributes- </b>listing a few here, several
CMIP-style attributes will be inherited. <br>
<br>
"predictorModel" will replace "model_id"<br>
For the 'downscaling model', as agreed with Luca on the call
it would be 'downscalingMethod' <br>
<br>
:Conventions = "CF-1.4" ;<br>
:references = "info about model, training
datasets etc will be provided here"<br>
:info = "additional info about the downscaling
method" <br>
:creation_date = "2011-08-19T21:57:06Z" ;<br>
:institution = "NOAA GFDL(201 Forrestal Rd,
Princeton, NJ, 08540)" ;<br>
&nbsp;&nbsp;&nbsp; :history = "info on file processing, e.g. processed by toolX." ;<br>
:projectID = ncpp2013<br>
:subprojectID = perfectModel<br>
:product = downscaled<br>
:institution = NOAA-GFDL<br>
:predictorModel = GFDL-HIRAM-C360-coarsened<br>
:experimentID = amip<br>
:frequency = day<br>
:modeling_realm = atmos<br>
:Predictor_experiment_rip = r1i1p1<br>
:region = US48<br>
:table_id = day<br>
:version = v20120227<br>
:downscalingMethod = GFDL-ARRMv1<br>
**************************************************<br>
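<br>
Since these global attributes repeat most of the directory elements, the two can be cross-checked. Below is a minimal sketch, assuming the single proposed directory order and a plain dictionary of attributes (reading them from the netCDF file is left out). The mapping from attribute name to path position and the function name find_mismatches are assumptions made for illustration, and the short form of the institution attribute is used:<br>
<pre wrap="">
```python
# Global attribute name mapped to the index of the matching directory in the path.
# This mapping is an assumption based on the proposed directory order.
ATTRIBUTE_POSITION = {
    "projectID": 0, "subprojectID": 1, "product": 2, "institution": 3,
    "predictorModel": 4, "experimentID": 5, "frequency": 6,
    "modeling_realm": 7, "table_id": 8, "Predictor_experiment_rip": 9,
    "downscalingMethod": 11, "region": 13, "version": 14,
}


def find_mismatches(path, attributes):
    """List (attribute, value_in_path, value_in_file) where the two disagree."""
    parts = path.strip("/").split("/")
    if len(parts) != 16:  # 15 directories plus the file name
        raise ValueError("unexpected number of path components: %d" % len(parts))
    problems = []
    for name, position in ATTRIBUTE_POSITION.items():
        if attributes.get(name) != parts[position]:
            problems.append((name, parts[position], attributes.get(name)))
    return problems


path = (
    "/ncpp2013/perfectModel/downscaled/NOAA-GFDL/"
    "GFDL-HIRAM-C360-coarsened/amip/day/atmos/day/r1i1p1/v20121024/"
    "GFDL-ARRMv1/tasmax/US48/v20120227/"
    "tasmax_day_amip_r1i1p1_downscaled_US48_GFDLARRMv1_19790101-19831231.nc"
)
attributes = {
    "projectID": "ncpp2013", "subprojectID": "perfectModel",
    "product": "downscaled", "institution": "NOAA-GFDL",
    "predictorModel": "GFDL-HIRAM-C360-coarsened", "experimentID": "amip",
    "frequency": "day", "modeling_realm": "atmos",
    "Predictor_experiment_rip": "r1i1p1", "region": "US48",
    "table_id": "day", "version": "v20120227",
    "downscalingMethod": "GFDL-ARRMv1",
}
problems = find_mismatches(path, attributes)
print(problems)
```
</pre>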
<br>
Best regards,<br>
Galia and Aparna<br>
<br>
<pre class="moz-signature" cols="72">--
Galia Guentchev, PhD
Project Scientist
National Climate
Predictions and
Projections
Platform (NCPP)
NCAR RAL CSAP
FL2 3103
3450 Mitchell Lane
Boulder, CO, 80301
phone: 303 497 2743 </pre>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
GO-ESSP-TECH mailing list
<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:GO-ESSP-TECH@ucar.edu">GO-ESSP-TECH@ucar.edu</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://mailman.ucar.edu/mailman/listinfo/go-essp-tech">http://mailman.ucar.edu/mailman/listinfo/go-essp-tech</a>
</pre>
</blockquote>
<br>
<br>
</blockquote>
</blockquote>
<br>
</body>
</html>