[Go-essp-tech] Global attributes and DRS extensions for downscaled datasets

Martina Stockhause stockhause at dkrz.de
Tue Apr 2 04:05:21 MDT 2013


Hello Karl, hello Laura,

A short remark on Laura's item 3 regarding a "doi" attribute:
>>
>> 3.  I'm still looking for a place to include the doi for the 
>> dataset.  The optional references attribute seems like the best place 
>> for it but it could also go in the contact attribute?  Any thoughts?
> Yes, I would certainly include it in the "source" attribute, rather 
> than the contact attribute, but it would make sense to me to also 
> include a new global attribute called "doi" and store in it the full 
> doi string.  The source attribute can be understood by humans, but the 
> "doi" attribute could more readily be interpreted by software.

At WDCC we won't be able to use that global attribute "doi" because of 
our workflow:

- Our DOI process is carried out on stable, i.e. long-term archived 
data. We only store the final data versions, to provide the persistent 
data access required for DOI assignment.
- We are the data managers of the long-term archive but not the data 
owners. Thus we do not change the data after long-term archiving, and 
that includes the data headers, so a DOI assigned at that stage cannot 
be written into the files.

Another reason is granularity:
We assign DOIs to aggregations of data that belong together, such as the 
output of a model run. At the level of individual files we plan to 
introduce "simple" persistent identifiers (PIDs) for data 
identification and data management. Unlike DOIs, these simple PIDs 
can be assigned without quality control and without a persistence 
requirement for the data object; only the handle itself must persist. 
Thus the object might be deleted while the PID remains valid. If the 
data object is deleted, the PID resolves to an information page instead 
of to the data object itself.
Such PIDs could be assigned during data production and stored as a 
global attribute in the netCDF data header.
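
The resolution behaviour described above can be sketched roughly as 
follows. This is only an illustration, not our implementation: the 
PidRecord class, its field names, and all identifiers and URLs are 
hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PidRecord:
    """Hypothetical handle record: the PID outlives the data object."""
    pid: str
    data_url: Optional[str]  # None once the data object has been deleted
    info_url: str            # information page, kept as long as the PID exists


def resolve(record: PidRecord) -> str:
    """Return the data location, or the information page if the object is gone."""
    if record.data_url is not None:
        return record.data_url
    return record.info_url


# Placeholder values, for illustration only.
record = PidRecord(
    pid="hdl:00000/example-file-pid",
    data_url="https://example.org/data/file.nc",
    info_url="https://example.org/info/example-file-pid",
)
print(resolve(record))   # data object still exists
record.data_url = None   # data object deleted, PID record kept
print(resolve(record))   # now resolves to the information page
```

The point of the sketch is that deleting the data object changes only 
what the PID resolves to, never whether it resolves.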

To cover both use cases, I suggest a more general pair of global 
attributes for data identifiers, e.g. similar to DataCite:
"identifier"
"identifier_type" (e.g. doi, urn, purl, handle,...)
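
As a minimal sketch of how such a pair of attributes might be built 
before writing a file: the helper name, the validation rules, and the 
example identifier are assumptions for illustration, and the set of 
allowed types simply repeats the examples listed above rather than any 
agreed vocabulary.

```python
# Hypothetical controlled vocabulary, copied from the examples in this mail.
IDENTIFIER_TYPES = {"doi", "urn", "purl", "handle"}


def identifier_attributes(identifier: str, identifier_type: str) -> dict:
    """Build the two proposed global attributes as a plain dictionary."""
    kind = identifier_type.strip().lower()
    if kind not in IDENTIFIER_TYPES:
        raise ValueError(f"unknown identifier_type: {identifier_type!r}")
    value = identifier.strip()
    if not value:
        raise ValueError("identifier must not be empty")
    return {"identifier": value, "identifier_type": kind}


# Placeholder PID, for illustration only.
attrs = identifier_attributes("hdl:00000/example-file-pid", "Handle")

# Print the attributes in a CDL-like form, as they might appear in a header dump.
for name, value in attrs.items():
    print(f':{name} = "{value}" ;')
```

With the netCDF4 Python package, a dictionary like this could be passed 
to Dataset.setncatts() when the file is produced; the same two 
attributes would then serve for a file-level PID as well as for a DOI.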

Cheers,
Martina


