[Go-essp-tech] Access control for data with different QC Level

martin.juckes at stfc.ac.uk martin.juckes at stfc.ac.uk
Tue Jul 20 02:47:17 MDT 2010


Hello All,

I'm afraid I have to throw in a little more confusion. The last time I
spoke to Michael L. I'm reasonably sure that the plan was only to apply
the QC L2 analysis to replicated data -- we don't have any system for
applying it to the wider collection of ESGF data nodes. In both the
options Martina lists below, this would mean that most of the data,
including half the "requested" data, will only ever be available to
modelling groups. 

The access "terms of use" which Karl supplied me with a few weeks ago
(below) make no mention of any data being available only for modelling
centres and are much more along the lines of "here's the data - use it
at your own risk". The PCMDI site is currently advising the world (and a
lot of people are interested) that the data will be available from July,
again with no mention of restrictions to the modelling community.

My impression is that the only way in which we can provide timely
distribution of the requested data is by releasing it after QC L1.

Regards,
Martin

 
"Unrestricted use" (access to a subset of models)
I understand that the subset of CMIP5 model output that will be
 made accessible to me has been designated for "unrestricted" use.  I
agree in good faith to attempt to understand the limitations of the
models used in producing this data.  I understand that although the
model output has been subjected to a quality control procedure,
unrecognized errors remain.  I will hold no one responsible for any
errors in the models or in their output.

"Non-commercial research and educational purposes" (access to all
models) 
 I agree to use the CMIP5 model output only for non-commercial research
and educational purposes. I agree in good faith to attempt to understand
the limitations of the models used in producing this data.  I understand
that although the model output has been subjected to a quality control
procedure, unrecognized errors remain. I will hold no one responsible
for any errors in the models or in their output."
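
To make the difference concrete, the two access schemes Martina lists in the quoted message below could be written as role checks along the following lines. This is only a sketch under stated assumptions: the role names, the `QCLevel` enum and both functions are hypothetical and do not reflect the actual ESG access control code. It is also assumed that the caller has already been authenticated as a registered user, and that modelling centres keep their access once data moves beyond QC L1.

```python
from enum import IntEnum


class QCLevel(IntEnum):
    """Quality-control level attached to a dataset."""
    L1 = 1
    L2 = 2
    L3 = 3


# Hypothetical role names; a real deployment would use its own group attributes.
MODELLING_CENTRE = "cmip5_modelling_centre"
NON_COMMERCIAL = "cmip5_non_commercial_research"


def allowed_option_1(roles, qc):
    """Option 1: access widens with each QC level."""
    if qc == QCLevel.L1:
        return MODELLING_CENTRE in roles
    if qc == QCLevel.L2:
        # Assumption: modelling centres keep the access they had at L1.
        return MODELLING_CENTRE in roles or NON_COMMERCIAL in roles
    return True  # QC L3: every registered user


def allowed_option_2(roles, qc, is_core):
    """Option 2: L1 as in option 1; from L2 on, core data is open to all."""
    if qc == QCLevel.L1:
        return MODELLING_CENTRE in roles
    if is_core:
        return True  # core data: every registered user
    # Non-core data: "maybe" non-commercial researchers only.
    # Assumption: modelling centres keep the access they had at L1.
    return MODELLING_CENTRE in roles or NON_COMMERCIAL in roles
```

Under either scheme, a dataset that never gets beyond QC L1 stays closed to everyone outside the modelling centres, whatever other roles the user holds.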



> -----Original Message-----
> From: go-essp-tech-bounces at ucar.edu [mailto:go-essp-tech-
> bounces at ucar.edu] On Behalf Of Martina Stockhause
> Sent: 20 July 2010 08:07
> To: V. Balaji
> Cc: Cinquini, Luca (3880); go-essp-tech at ucar.edu
> Subject: Re: [Go-essp-tech] Access control for data with different QC
> Level
> 
> Hi all,
> 
> it seems that we have different ideas of access constraints:
> 
> 1. My understanding was that at QC L1 the CMIP5 modeling centers, at QC
> L2 non-commercial researchers and at QC L3 every registered user can
> access the data.
> 2. Bryan, please correct me: there is QC L1 as in 1., and after QC L2 and
> QC L3 all registered users have access to the core data. Maybe only
> non-commercial researchers are granted access to the non-core data.
> 
> This is more a political issue.
> 
> In either case the QC Level has to be communicated to the ESG.
> Luca suggests that the portal uses the AtomFeed of the questionnaire to
> harvest the QC Flag, and after QC L3 the DOI link as well. QC and DOI
> are information about the data, so the right place in the metafor CIM
> would be the dataObject on the hierarchy level "DRS experiment".
> Which parts of CIM do you harvest?
> 
> My biggest question at the moment is how to deliver the QC information
> to CIM. For the DOI target page a few additional pieces of information
> on citation and contact are needed. Stephen suggested typing them
> into the questionnaire. This would slow the publication process down
> and is error-prone. We need an automated CIM update there. The metafor
> people were against that solution as well, because the questionnaire is
> meant for an initial metadata ingest by the modeling centers.
> Bryan, how do we get the information into the questionnaire, so that it
> can be harvested by the ESG?
> What would be the alternatives to the AtomFeed/questionnaire as
> harvesting source for the quality level and DOI information?
> 
> My second biggest question is where to put the information in the CIM.
> I sent my interpretation / suggestion to the metafor list, but it didn't
> start a discussion. Examples of a simulationRun object, of how the
> dataObjects are referenced and of how the dataObject hierarchies are
> built, would be of great help. Alternatively, metafor could just define
> how I should send the quality information to them.
> 
> I have moved away from the technical issues, but solving these things
> is the precondition for the technical solution in the ESG.
> 
> Thanks a lot,
> Martina
> 
> 
> 
> 
> V. Balaji wrote:
> > I know we discussed this at the Princeton workshop. I didn't register
> > some of the implications then.
> >
> > I agree that in a technical sense, yes, a dataset is "available" to
> > registered users as soon as it is passed by the publisher (QCL1-D).
> > At that point, however, it's incompletely documented, so I'm not sure
> > it can be declared fully compliant.
> >
> > My understanding is that while users are free to begin working with
> > the data, they can publish results from the data only when the dataset
> > is citable, which means it has undergone more rigorous QC. What they
> > downloaded before QC-L2 is certainly use-at-your-own-risk, because L2 is
> > when the "semantic QC" kicks in. And without QC-L3 it isn't citable.
> >
> > I think there is a pretty strong feeling that the modeling centers'
> > data were used too often without citation or acknowledgment last
> > time, which is what some of the more formal QC levels this time,
> > e.g. DOIs tied to data publication, are trying to avoid. Assuming
> > the QC document is adopted by the WGCM, it will be a requirement for
> > downstream users to cite datasets.
> >
> > So, QC-L1D data are "available" in the sense that the 1s and 0s may be
> > downloaded, but they're not licensed yet for "do whatever you like with
> > them"... perhaps?
> >
> > It's pretty important that we come up with language that is clear
> > about what one can and cannot do with data at various levels of QC. I've
> > talked with Karl and Ron and others about making WGCM the authority
> > for this, so whatever words we use have to be run by them.
> >
> > Thanks,
> >
> > Cinquini, Luca (3880) writes:
> >
> >
> >> Hi Estani,
> >> I concur with what Eric said, and to reiterate, my understanding is
> >> that as soon as the data is published with QCL1, it will be available
> >> to registered users. Maybe Bob, Dean or Karl can comment on whether my
> >> understanding is correct or not.
> >> thanks, Luca
> >>
> >> On Jul 19, 2010, at 2:52 PM, Eric Nienhouse wrote:
> >>
> >>
> >>> Hi All,
> >>>
> >>> We've had a number of discussions on the topic of QC level and data
> >>> access.  However, I feel we don't yet have a formal definition of the
> >>> requirements relating to this area.
> >>>
> >>> I believe it is important to clarify and define the following two QC
> >>> related areas:
> >>>
> >>> 1)  Who is the authoritative source of the QC level and how is this
> >>> information propagated through the system?
> >>>
> >>> 2)  How does QC level apply to data access policy (e.g. access control)?
> >>>
> >>> I would propose discussing this as a future GO-ESSP telco agenda topic,
> >>> with the intention that we document the outcome.
> >>>
> >>> Perhaps we can discuss this further via email and work towards capturing
> >>> the system requirements and related policies in the meanwhile.
> >>>
> >>> Please note that there are plans to expose the QC Level within the
> >>> Gateway UI once the data flow is identified.  However, data access
> >>> control is based upon the group (e.g. role) auth-z attribute (such as
> >>> "CMIP5 Research") and does not currently rely on the QC Level explicitly.
> >>>
> >>> Thanks,
> >>>
> >>> -Eric
> >>>
> >>>
> >>> Estanislao Gonzalez wrote:
> >>>
> >>>> Hi Luca,
> >>>>
> >>>> to sum things up (and correct me Martina/Bryan if I'm wrong):
> >>>>
> >>>> 1) Published data have QC L1-Data "per se", and will be available to a
> >>>> very select group only (which doesn't seem to be the group you
> >>>> mention, but I might be wrong).
> >>>> 2) On acquiring QC L2 the data should be accessible to a broader,
> >>>> although still confined, group. This check will be performed by DKRZ and
> >>>> BADC and the information stored somewhere (not sure where though).
> >>>> Neither BADC nor DKRZ has access to all data nodes, so the information
> >>>> will definitely be stored on some "neutral ground" (CIM DB?).
> >>>> 3) QC L3 == DOI acquired == publication. At this stage data will be
> >>>> available to any registered user.
> >>>>
> >>>> If I'm correct, then the security service must check "somehow" the QC
> >>>> level of the file in order to proceed with the authorization as it is
> >>>> currently implemented (thus comparing roles).
> >>>>
> >>>> Any comments anyone?
> >>>>
> >>>> Thanks,
> >>>> Estani
> >>>>
> >>>> Cinquini, Luca (3880) wrote:
> >>>>
> >>>>
> >>>>> Hi Bryan, Martina,
> >>>>> I agree that these issues need to be discussed better, but here are
> >>>>> some considerations, which may in some cases only reflect my
> >>>>> understanding:
> >>>>>
> >>>>> 1) we talked about the QC flag for Levels 2 and 3 being set in the
> >>>>> metafor questionnaire, and being propagated through the atom feed to
> >>>>> the gateways
> >>>>>
> >>>>> 2) I thought that in order not to delay data distribution, as soon as
> >>>>> the data has QC level 1 (i.e. it has been processed by the publisher),
> >>>>> it will be available to registered users of the CMIP5 research and
> >>>>> commercial groups
> >>>>>
> >>>>> 3) At this time there is nothing in the ESG access control model that
> >>>>> ties the access attributes to the QC flags.
> >>>>>
> >>>>> Thanks, luca
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Jul 19, 2010, at 7:39 AM, Bryan Lawrence <bryan.lawrence at stfc.ac.uk> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>> Hi Martina
> >>>>>>
> >>>>>> We definitely need to formalise some of this, so thanks for bringing it
> >>>>>> up.
> >>>>>>
> >>>>>> What I had thought we were proposing was that L2 and L3 data have
> >>>>>> effectively the same restrictions ...
> >>>>>>
> >>>>>> ... but your fundamental point (I think) is how do we assign the QC, and
> >>>>>> how does the security software get that information? I.e. what is the
> >>>>>> workflow that needs to exist? We do need to bottom that out.
> >>>>>>
> >>>>>> Thanks
> >>>>>> Bryan
> >>>>>>
> >>>>>> On Monday 19 July 2010 13:43:59 Martina Stockhause wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> I had a little discussion with Estani about how the different and
> >>>>>>> changing access constraints on the data depending on their QC levels
> >>>>>>> are realized. It turned out that we don't really know.
> >>>>>>>
> >>>>>>> We have on the one hand the user with a special role, e.g.
> >>>>>>> "scientific, non-commercial user", who has access to data at QC L3
> >>>>>>> like every registered user, and at QC L2 because of his role. On the
> >>>>>>> other hand, the data has a quality attribute (QC Level or QC Flag),
> >>>>>>> which defines the access restriction of the data. For data access a
> >>>>>>> mechanism has to check user role and data attribute before access
> >>>>>>> is granted or denied.
> >>>>>>>
> >>>>>>> How does the data get this quality attribute?
> >>>>>>> How is the user role checked against this quality attribute?
> >>>>>>>
> >>>>>>> For QC L3 we don't need that mechanism, because every registered user
> >>>>>>> has access to all CMIP5 data, but for QC L1 and L2 such access
> >>>>>>> restrictions exist.
> >>>>>>>
> >>>>>>> Thanks a lot,
> >>>>>>> Martina
> >>>>>>>
> >>>>>>>
> >>>>>>> _______________________________________________
> >>>>>>> GO-ESSP-TECH mailing list
> >>>>>>> GO-ESSP-TECH at ucar.edu
> >>>>>>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>> --
> >>>>>> Bryan Lawrence
> >>>>>> Director of Environmental Archival and Associated Research
> >>>>>> (NCAS/British Atmospheric Data Centre and NCEO/NERC NEODC)
> >>>>>> STFC, Rutherford Appleton Laboratory
> >>>>>> Phone +44 1235 445012; Fax ... 5848;
> >>>>>> Web: home.badc.rl.ac.uk/lawrence
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>
> >>
> >
> >
> 
> --
> ----------- DKRZ / Data Management -----------
> 
> Martina Stockhause
> Deutsches Klimarechenzentrum
> Bundesstr. 45a
> D-20146 Hamburg
> Germany
> 
> phone:	+49-40-460094-122
> FAX:	+49-40-460094-106
> e-mail:	martina.stockhause at zmaw.de
> 
> ----------------------------------------------
> 