[Go-essp-tech] AtomFeed for simulationRun documents and CIM qctool

Martina Stockhause martina.stockhause at zmaw.de
Tue Oct 19 08:37:57 MDT 2010


  Hi Bryan,

so the CIM simulation level is the same as my quality level, including 
multiple ensemble members and multiple realms. You write of a URL or 
URI: a URL pointing to where? To a CIM document to which the quality 
information belongs?

And by "offline version" I meant that I need a GUI-less qctool for 
ingesting the quality information, together with an example call.

Let me try to describe a workflow for the ingest of quality information 
in CIM:

1. We two register the measurement descriptions for QC L2 and QC L3 checks.

2. The people who are responsible for QC L2 checks register with their 
name and email addresses.

Both could be done within the qc questionnaire.

3. Add quality check result for QC L2 and assign QC level 2 using a 
python tool instead of the questionnaire:

qctool.py --level=2 --simulation=<drs experiment> --contact=<email 
address or name> --report=<report file location> --uploadlog=<logfile 
location> [--uploadpdf=<pdf location>] [--uploadbinary=<result location>]

example:
qctool.py --level=2 --simulation=cmip5.output.MPI-M.ECHAM6-MPIOM-TR.amip 
--contact=martina.stockhause at zmaw.de 
--report=QCL2_cimresult_cmip5_output_MPI-M_ECHAM6-MPIOM-TR_amip.xml 
--uploadlog=QCL2_cimlogfile_cmip5_output_MPI-M_ECHAM6-MPIOM-TR_amip.log 
--uploadpdf=QCL2_cimpdf_cmip5_output_MPI-M_ECHAM6-MPIOM-TR_amip.pdf 
--uploadbinary=QCL2_cimresults_cmip5_output_MPI-M_ECHAM6-MPIOM-TR_amip.tar
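
The file names in the example follow a visible pattern: the dots of the DRS experiment name become underscores, and a QC level prefix and a kind tag are put in front. A minimal sketch of that naming, assuming the pattern holds for other experiments too (the function name qc_artifact_names is hypothetical and not part of any existing tool):

```python
def qc_artifact_names(drs_experiment, level=2):
    """Derive the QC file names used in the example call.

    Assumption: file names are built by replacing the dots of the DRS
    experiment name with underscores, as in the example above.
    """
    stem = drs_experiment.replace(".", "_")
    prefix = "QCL%d" % level
    return {
        "report": "%s_cimresult_%s.xml" % (prefix, stem),
        "uploadlog": "%s_cimlogfile_%s.log" % (prefix, stem),
        "uploadpdf": "%s_cimpdf_%s.pdf" % (prefix, stem),
        "uploadbinary": "%s_cimresults_%s.tar" % (prefix, stem),
    }


names = qc_artifact_names("cmip5.output.MPI-M.ECHAM6-MPIOM-TR.amip")
for option, filename in sorted(names.items()):
    print("--%s=%s" % (option, filename))
```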

It could be different or shorter. I would use the information as follows:

- level: Link the measurement description for QC L2 to the uploaded 
result section
- simulation: Link QC L2 quality information to a CIM simulation
- contact: Link the qc contact to the uploaded result section
- uploadlog: Give the location of the qc logfile for upload to CIM. 
uploadpdf and uploadbinary work alike.
- report: XML to be specified.
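
To make the proposal concrete, the options could be parsed roughly as follows. This is only a sketch of the command line, not an existing tool: the option names follow the list above (uploadpdf, uploadbinary), while restricting --level to 2 and 3 and treating the first five options as required are my assumptions. What happens after parsing (the actual upload to CIM) is left open.

```python
import argparse


def build_parser():
    """Sketch of the proposed qctool.py command line (assumed, not final)."""
    parser = argparse.ArgumentParser(
        prog="qctool.py",
        description="Ingest QC results for a CIM simulation (sketch).")
    # Assumption: only QC level 2 and QC level 3 are assigned by this tool.
    parser.add_argument("--level", type=int, choices=(2, 3), required=True,
                        help="QC level whose measure description is linked")
    parser.add_argument("--simulation", required=True,
                        help="DRS experiment name of the CIM simulation")
    parser.add_argument("--contact", required=True,
                        help="email address or name of the QC contact")
    parser.add_argument("--report", required=True,
                        help="location of the XML report (format to be specified)")
    parser.add_argument("--uploadlog", required=True,
                        help="location of the QC logfile to upload")
    # Optional uploads, shown in square brackets in the call above.
    parser.add_argument("--uploadpdf", help="location of a PDF to upload")
    parser.add_argument("--uploadbinary", help="location of a result archive")
    return parser


# Usage example with an explicit argument list instead of sys.argv.
args = build_parser().parse_args([
    "--level=2",
    "--simulation=cmip5.output.MPI-M.ECHAM6-MPIOM-TR.amip",
    "--contact=QC contact name",
    "--report=report.xml",
    "--uploadlog=qc.log",
])
print(args.level, args.simulation, args.uploadpdf)
```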

Alternatively, I could send a full quality document together with the 
DRS experiment name it belongs to. Or any solution between these extremes.

You can remove the second measure describing QC L2 I added. It was just 
for testing.
I still encounter an error when I add a report and fill in the form's 
explanation field in the qc questionnaire.
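
The KeyError quoted at the end of this message is raised in the clean method of forms.py. That code is not shown here, so the following is only a guess at the failure mode, written with plain dictionaries instead of Django. In Django, a field that fails its own validation is left out of cleaned_data, so indexing it directly inside clean() raises a KeyError. The function names clean_strict and clean_tolerant are hypothetical.

```python
def clean_strict(cleaned_data):
    """Mimics a clean() that indexes the field directly."""
    # Raises KeyError('explanation') if the key is absent.
    explanation = cleaned_data["explanation"]
    return explanation.strip()


def clean_tolerant(cleaned_data):
    """Mimics a clean() that tolerates a missing field."""
    explanation = cleaned_data.get("explanation")
    if explanation is None:
        # Report a normal validation problem instead of crashing.
        raise ValueError("explanation is missing or did not validate")
    return explanation.strip()


# A dictionary without the key stands in for cleaned_data after the
# 'explanation' field failed validation (an assumed scenario).
try:
    clean_strict({})
except KeyError as err:
    print("KeyError:", err)

try:
    clean_tolerant({})
except ValueError as err:
    print("ValueError:", err)
```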

For QC L3 there is additional quality information, there are added 
citations for URN and DOI, and there might be changes in contacts and 
citations.

Best wishes
Martina



On 10/18/2010 04:52 PM, Bryan Lawrence wrote:
>
>> Hi Martina
>>
>>> which is the AtomFeed address for access of simulationRun documents
>>> of CIM? This is needed for QC L3. It would be necessary to have one
>>> or two examples in the AtomFeed for the tool development.
>> Gerry is the right person to answer this one now!
>>
>>> And at last I tested your qc questionnaire. Moreover I seem to
>>> understand most of what it does.
>> The granularity of quality entries is not clear to me: I have summed
>> results for a DRS experiment (metafor simulationRun), which I can
>> send to you, or duplicate if a finer granularity is needed, e.g.
>> realm.
> The tool is agnostic (I hope). Currently it allows you to provide both a
> URL and URI, so you can make qc assertions about any target URL and any
> identifier within it.
>
> We should probably decide on  best practice. My suspicion is that it
> would be easier to raise them on URLs at the realm dataset level for
> data, and at the simulation level for metadata. How we ensure that both
> are closed to pass up a qc level would then be an issue. We can talk
> about this to decide what is best.
>
>> By the way, how does a metafor simulationRun correspond with the new
>> DRS syntax in the TDS? In the TDS we have realm+ensemble+version as
>> a dataset. Is it realm+version with all ensembles in a
>> simulationRun entry?
> A metafor simulation can include both multiple ensemble members and
> multiple realms. So, a simulation metadata record will describe multiple
> datasets.  We probably need to sort that out in the CIM data record.
> Ideally the output data record associated with a simulation would then
> include a nested set of records corresponding to the datasets as the
> publisher describes them.
>
>> Remarks to the qc questionnaire and the CIM qctool:
>>
>> - We need the offline version of the CIM qctool.
> By which I presume you mean, you need the ability to upload XML to the
> qctool? (We could add a simple python tool to post this stuff as well,
> would that help ... see also below where I have a **)
>
>> - 'issue's: My idea was to send the complete quality metadata after
>> the QC checks for assignment of QC L2. This would include only
>> 'report's.
> I think that's appropriate. Issues are there for other ... issues ...
>
>> - For new authors the email should be set required for contact if
>> questions arise during QC L3 regarding QC L2 results.
> Not sure what you mean here. Can you pls explain further?
>
>> - The 'report/measureDescription' part describing the QC checks
>> itself (not their results) has two values: one for QC L2 and one for
>> QCL3. Therefore these two need to be entered only once and then
>> referenced when adding 'report's.
> There are two measures: QC level 2 and QC level 3. I would expect one
> report as to each. Is that not what you expect? (Clearly a qc level 3
> has already passed qc level 2.). I'm not sure I understand what you are
> suggesting/asking.
>
>> - The 'report/explanation' part is the QC result. Unfortunately, I
>> get an error before I could view the metadata (see at the end of the
>> message). But it would be good to add all additional information at
>> once (logfile and pdf). The logfile should be mandatory in order to
>> have at least this piece of information about the qc results
>> available.
> You can add one logfile/plot at a time currently. Are you suggesting you
> would like the facility (through the interactive tool) to add multiple
> ones? (I suspect this would be better supported via a script, we could
> add that to the tool discussed above ** if that's what you wanted).
>
>> For the CIM qctool the effort would be minimal if I can simply add a
>> report/explanation with the option 'QCL2' or 'QCL3' and the email of
>> the user as reference to the contact. I expect that another option
>> has to be the DRS name of the experiment.
> I think the effort should be that minimal. The measures will all be
> predefined, so you only need to enter a resource description (once) and
> then reports against the (pre-defined) qc measures (QCL2 and QCL3).
> (I see you added a measure that was a copy of the one that was there!)
> I am not sure why one would need to put either a user or a drs name in
> since you will have a URL pointing to these things - and a portal should
> harvest the feed and bind them to the target ....
>
>> What are your ideas? Could we specify the tool options in advance and
>> soon, please? With a concrete example?
>> The report/explanation will be an xml?
> It is now (xml!) ... as you can see ... concrete examples? Well this is
> a live tool, we need some data to put some live reports against.
>
>> For QCL3 this is not sufficient since citations (at least DOI and URN
>> of the data) and contacts for the DRS experiment (simulationRun)
>> will or may be changed by the data author.
> Sure, but the qc report shouldn't change?
>
>> Good to get a step further!
> :-)
>
> Cheers
> Bryan
>
>> Best wishes,
>> Martina
>>
>> On 10/06/2010 04:07 PM, Martina Stockhause wrote:
>>>   Hallo Bryan,
>>>
>>> at the moment we use the AtomFeed for experiments at
>>> http://q.cmip5.ceda.ac.uk/feeds/cmip5/experiment/
>>>
>>> But we would need the AtomFeed for simulations as well for qc level
>>> 3 cross-checks. Is the below address the right one or is there
>>> another which Sylvia and the ESG portal people use?
>>> http://q.cmip5.ceda.ac.uk/feeds/cmip5/simulation/
>>>
>>> Thanks a lot and best wishes,
>>> Martina
>>    KeyError at /report/
>>
>> 'explanation'
>>
>> Request Method: 	POST
>> Request URL: 	http://qc.cmip5.ceda.ac.uk/report/
>> Django Version: 	1.2.3
>> Exception Type: 	KeyError
>> Exception Value:
>>
>> 'explanation'
>>
>> Exception Location:
>> /usr/local/cmip5qc/develop/QCTool/qcproj/qcapp/forms.py in clean,
>> line 71
> Bryan Lawrence
> Director of Environmental Archival and Associated Research
> (NCAS/British Atmospheric Data Centre and NCEO/NERC NEODC)
> STFC, Rutherford Appleton Laboratory
> Phone +44 1235 445012; Fax ... 5848;
> Web: home.badc.rl.ac.uk/lawrence

-- 
----------- DKRZ / Data Management -----------

Martina Stockhause
Deutsches Klimarechenzentrum
Bundesstr. 45a
D-20146 Hamburg
Germany

phone:	+49-40-460094-122
FAX:	+49-40-460094-106
e-mail:	martina.stockhause at zmaw.de

----------------------------------------------


