[Go-essp-tech] DOI's in ESGF

Martina Stockhause stockhause at dkrz.de
Wed Apr 3 00:39:11 MDT 2013


Hi Laura,

the DOI is buried in the "explanation" part, which includes the citation 
rules. And yes, there is a place in the XML for the DOI: the 
attributes of gmd:result, e.g.:
<gmd:result xlink:href="http://dx.doi.org/10.1594/WDCC/CMIP5.NCCNMa2" 
xlink:title="doi:10.1594/WDCC/CMIP5.NCCNMa2">
So far, this field has not been indexed after harvesting. Mark promised 
me several times to do that, but unfortunately has not had the time to 
implement it. I hope for the CIM viewer version 0.8.6.3...
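
To make the point about the gmd:result attributes concrete, here is a
minimal Python sketch of how a harvester could pick the DOI out of them.
It is not the CIM viewer's actual indexing code. The wrapper element, the
function name extract_dois and the namespace declarations are assumptions:
the fragment above does not show them, so the standard ISO 19139 (gmd) and
XLink namespace URIs are used here.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs (standard ISO 19139 gmd and W3C XLink).
GMD = "http://www.isotc211.org/2005/gmd"
XLINK = "http://www.w3.org/1999/xlink"

# The gmd:result fragment from this mail, wrapped in a hypothetical root
# element so that it is a well-formed document.
SAMPLE = f"""<root xmlns:gmd="{GMD}" xmlns:xlink="{XLINK}">
  <gmd:result xlink:href="http://dx.doi.org/10.1594/WDCC/CMIP5.NCCNMa2"
              xlink:title="doi:10.1594/WDCC/CMIP5.NCCNMa2"/>
</root>"""


def extract_dois(xml_text):
    """Return (doi, resolver_url) pairs found on gmd:result elements."""
    root = ET.fromstring(xml_text)
    found = []
    for elem in root.iter(f"{{{GMD}}}result"):
        title = elem.get(f"{{{XLINK}}}title", "")
        href = elem.get(f"{{{XLINK}}}href", "")
        # Only titles of the form "doi:<identifier>" are treated as DOIs.
        if title.startswith("doi:"):
            found.append((title[len("doi:"):], href))
    return found


for doi, url in extract_dois(SAMPLE):
    print(doi, url)
```

An index built this way would only need the two xlink attributes, so it
would not depend on the rest of the quality document.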

The other question is that of displaying a DOI without CIM questionnaire 
information. That gets us back to the discussion at the start of CMIP5. 
Not every dataset in the ESGF with a DOI will have a CIM questionnaire. I 
see different possibilities for that:

1. CIM viewer: The CIM quality documents include DRS information and 
thus can be mapped directly to the ESGF data entities. In that case, the 
CIM quality documents of the ESGF datasets have to be used instead of 
those at the DRS experiment/DOI level.

2. netCDF header: An additional global attribute in the netCDF header 
could be made visible through the TDS. However, as discussed in a 
separate e-mail, files are not always the level at which a DOI is assigned.

3. ESGF datanode: As a DOI is an additional attribute of ESGF datasets 
that might be assigned after ESGF data publication and/or data 
archiving, it needs to be assignable without changing the data or having 
to re-publish the data on the data node.

We have this separation between data and metadata, between ESGF and 
CIM/ES-DOC. Quality information and dois have become part of the 
metadata, though they are closely related to the data.

The weakness of the current metadata solution is that the relations to 
the datasets and to the specific dataset versions are lost. 
CIM quality documents are connected to CIM simulations before all 
metadata content is displayed at the ESGF dataset level in the ESGF 
gateway using the CIM viewer. Therefore the DOI is displayed for all 
dataset versions, even for those that are not part of the DOI.
In the case of NCC amip4xCO2, the DOI is displayed in PCMDI's ESGF gateway 
for all mon.atmos data versions except the one that is actually included 
in the DOI: cmip5.output1.NCC.NorESM1-M.amip4xCO2.mon.atmos.Amon.r1i1p1.v20130103
How that can happen, I have no idea.
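
A small Python sketch of what a version-aware lookup could look like, to
illustrate the mismatch described above. Nothing like this exists in the
gateway as far as this thread says. The registry DOI_MEMBERS and the
function names are hypothetical, the pairing of the example DOI with the
amip4xCO2 dataset ID is assumed for illustration only, and the second
version string is invented.

```python
# Hypothetical registry: DOI -> exact, versioned ESGF dataset IDs it covers.
# The pairing of this DOI with this dataset ID is illustrative only.
DOI_MEMBERS = {
    "10.1594/WDCC/CMIP5.NCCNMa2": {
        "cmip5.output1.NCC.NorESM1-M.amip4xCO2.mon.atmos.Amon.r1i1p1.v20130103",
    },
}


def split_version(dataset_id):
    """Split 'a.b.c.v20130103' into ('a.b.c', 'v20130103')."""
    base, sep, version = dataset_id.rpartition(".v")
    if not sep or not version.isdigit():
        raise ValueError(f"no version suffix in {dataset_id!r}")
    return base, "v" + version


def doi_for(dataset_id, registry=DOI_MEMBERS):
    """Return the DOI that covers exactly this dataset version, else None."""
    split_version(dataset_id)  # reject IDs that carry no version suffix
    for doi, members in registry.items():
        if dataset_id in members:
            return doi
    return None


BASE = "cmip5.output1.NCC.NorESM1-M.amip4xCO2.mon.atmos.Amon.r1i1p1"
print(doi_for(BASE + ".v20130103"))  # the version named in this mail
print(doi_for(BASE + ".v20120412"))  # invented other version, not covered
```

Because the key is the full versioned dataset ID, a DOI is never shown
for a version that is not part of it. Such a registry also lives outside
the data files, so an entry could be added after publication without
re-publishing anything on the data node.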

It was a great achievement by ES-DOC after the end of the CIM project, 
especially of Mark, to make quality and doi visible in the ESGF at all. 
However, for the future we need to re-think how DOIs can be 
displayed in the ESGF independently of a CIM questionnaire.

Cheers,
Martina


On 27.03.2013 19:17, Laura Carriere wrote:
>
> Martina,
>   Thank you so much for your detailed email.  I apologize for taking 
> so long to get back to you, it's been a busy week or two.
>
> I was able to see how this all works together and the CIM Viewer is 
> very nice but the data provider for this dataset is unlikely to 
> provide any CIM documents at this point in time.  Since the data is 
> not part of the official CMIP5 project but is downscaled data derived 
> from CMIP5 data, I think we can still move forward.
>
> I do have a question: I checked the metadata of a NorESM1-M dataset 
> that has been assigned a DOI and I don't see the DOI in the metadata.  
> Is there a reason it wasn't included as an optional attribute, or 
> perhaps even included in the contact field?  Since the DOI resolves to 
> a Landing Page which includes contact information, this would seem 
> like a possible option.
>
>   Laura.
>
> On 3/22/2013 3:20 AM, Martina Stockhause wrote:
>> Dear Laura,
>>
>> for CMIP5 data DOIs we connect the DOI with the ESGF via CIM metadata.
>> Thus it becomes visible in the metadata pop-up (CIM viewer) as separate
>> tab, e.g. at the ESGF gateways of PCMDI and IPSL. I favoured a more
>> direct way to make DOIs visible in the ESGF but in the short amount of
>> time this was the only practicable way to do it.
>>
>> A bit more detail on that:
>> 1. We export CIM quality documents on CMIP5 experiment level and on the
>> ESGF dataset level out of a database.
>> 2. We publish the CIM quality documents via an Atom feed.
>> 3. The documents are harvested and the content to be displayed is
>> indexed. This is part of the functionality of the CIM viewer. The
>> responsible person for its development is Mark Morgan from IPSL, whom I
>> cc in this email. The CIM viewer is part of the developments of the
>> ES-DOC initiative: http://earthsystemcog.org/projects/es-doc-models/
>>
>> The address of our Atom feed with example CIM quality documents is:
>> http://cera-www.dkrz.de/WDCC/CMIP5/feed/
>> We assign CIM quality documents for QC levels 2 and 3, where level 3 is
>> connected with a DOI assignment to the data. Examples are:
>> QC 2: http://cera-www.dkrz.de/WDCC/CMIP5/downloadAtomXml?id=6254
>> QC 3/DOI: http://cera-www.dkrz.de/WDCC/CMIP5/downloadAtomXml?id=30252
>>
>> Whether you can use this approach directly for your data will depend on
>> the level at which you assign DOIs and, of course, on your profile of the
>> CIM quality document. A couple of adaptations might be needed.
>>
>> If you need further information, just ask.
>>
>> Cheers,
>> Martina
>>
>>
>> On 21.03.2013 09:03, Michael Lautenschlager wrote:
>>> Dear Laura,
>>> the basic idea is to have DataCite DOIs, including a data citation
>>> reference, for the quality-proven CMIP5 data that goes as reference data
>>> into the IPCC DDC. These data will be frozen and long-term archived at
>>> DKRZ/WDCC. I think that your Goddard NCCS data are well in line with this
>>> CMIP5 IPCC-DDC transition concept. DataCite data publications already
>>> available at WDCC, including CMIP5, can be obtained from
>>> http://cera-www.dkrz.de/WDCC/ui/FindDoiPublications.jsp?query=&and=false
>>>
>>> I copied my co-workers responsible for the DataCite data publication
>>> process on this email and would like to ask them to contact you to define
>>> the DataCite publication process for Goddard NCCS.
>>> Best wishes, Michael
>>>
>>>
>>> ---------------
>>> Dr. Michael Lautenschlager
>>> Head of DKRZ Department Data Management
>>> Director World Data Center Climate
>>>
>>> German Climate Computing Centre (DKRZ)
>>> ADDRESS: Bundesstrasse 45a, D-20146 Hamburg, Germany
>>> PHONE:   +4940-460094-118
>>> E-Mail:  lautenschlager at dkrz.de
>>>
>>> URL:    http://www.dkrz.de/
>>>                http://www.wdc-climate.de/
>>>
>>> Managing Director: Prof. Dr. Thomas Ludwig
>>> Registered office: Hamburg
>>> Hamburg District Court, HRB 39784
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Williams, Dean N. [mailto:williams13 at llnl.gov]
>>> Sent: Wednesday, 20 March 2013 23:53
>>> To: laura.carriere at nasa.gov
>>> Cc: Michael Lautenschlager
>>> Subject: Re: [Go-essp-tech] DOI's in ESGF
>>>
>>> Hi Laura,
>>>
>>>     Our DKRZ partners have experience with this. I cc'ed Michael at
>>> DKRZ...
>>>
>>> Best regards,
>>>     Dean
>>>
>>> On 3/20/13 3:09 PM, "Laura Carriere" <Laura.E.Carriere at nasa.gov> wrote:
>>>
>>>> We will be publishing some data on the Goddard NCCS ESGF data node that
>>>> will have a DOI assigned to it.  The data author has requested that the
>>>> DOI be included in the metadata of each data file.   We don't need the
>>>> DOI in the DRS or to be searchable, just in the metadata. The data
>>>> author has the ability to mint the DOI and we will be maintaining the
>>>> Landing Page for the DOI which will point to ESGF as the source of the
>>>> data.  However, before we do this, I was wondering if anyone on the
>>>> list has any experience, guidelines, or advice on publishing data in
>>>> ESGF with DOI's.  Thanks.
>>>>
>>>>     Laura.
>>>>
>>>> -- 
>>>>
>>>>     Laura Carriere, CSC laura.carriere at nasa.gov
>>>>     NCCS, Code 606.2               301 614-5064
>>>>
>>>> _______________________________________________
>>>> GO-ESSP-TECH mailing list
>>>> GO-ESSP-TECH at ucar.edu
>>>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>>>
>>>
>>
>
>


-- 
------------------ DKRZ / Data Management ------------------
Martina Stockhause
Deutsches Klimarechenzentrum
Bundesstr. 45a
D-20146 Hamburg, Germany

phone:	+49-40-460094-122; FAX:	+49-40-460094-106
------------------------------------------------------------


