[Go-essp-tech] Quality Control (A Perspective)++

Bryan Lawrence bryan.lawrence at stfc.ac.uk
Thu Mar 18 07:04:29 MDT 2010


Hi Gavin

I'm with you in principle! I suspect we all believe in making data fully 
open and visible, but none of us believe in wasting people's time with 
rubbish :-)

I think we have to be pretty careful about understanding how CMIP5 is 
going to differ from CMIP3, and how modelling groups really handle their 
data.

In practice a lot of the data to be produced by a modelling group is not 
going to get looked at by that modelling group before it hits QC1, not 
because they don't care, but because they won't have time (and in some 
cases, not much expertise for some variables). However, these folk are 
making what they hope is good data available as a public good ... but 
they don't yet know whether it is good data; that's not whimsy or 
laissez-faire :-)

For CMIP3, that didn't matter (so much) because the good folks at PCMDI 
did a sanity check, and then the rest of the WG1 community did a sanity 
check, then the entire science community jumped in. I'm sure some 
rubbish got through, but there were sane steps along the way, and a lot 
of wasted time was avoided, and crap science not done.

I don't think we do anyone any favours by making QC1 data available to 
everyone lock, stock and barrel, immediately. Of course, if the modelling 
group want to, that's fine: but that's not your choice and it's not mine 
(nor even Karl's). What we get to choose is "the default".

I think we should recommend that folk think of QC1 data as like 
preprints. If a group makes them fully publicly visible on their site, 
well and good, but we recommend they only make them available to their 
mates *initially*. They choose the definition of mates. If they don't, 
then we recommend the definition be "those who are at institutions who 
themselves have contributed data to CMIP5" ... for a while ...

They choose the definition of "a while" (it's their data on a system 
running in their institution). (Gateways complicate things a bit, and 
gateways managing access control complicate things too, but the broad 
principle is clear). If they don't choose, we recommend N months. (3?)
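
To make the defaults above concrete, here is a minimal sketch of the 
access rule, not anything that exists in ESG. Every name in it 
(Qc1Policy, can_access, DEFAULT_EMBARGO_DAYS) is hypothetical, "3 
months" is approximated as 90 days, and it assumes the data becomes 
open to everyone once "a while" has passed.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional, Set

# Assumption: the suggested "N months (3?)" default, approximated as 90 days.
DEFAULT_EMBARGO_DAYS = 90


@dataclass
class Qc1Policy:
    """Hypothetical per-dataset visibility settings chosen by the modelling group."""
    published: date                      # date the QC1 version was published
    open_to_all: bool = False            # group opted for full public visibility
    mates: Optional[Set[str]] = None     # None means the group made no choice
    embargo_days: Optional[int] = None   # None means the group made no choice


def can_access(policy: Qc1Policy, institution: str,
               contributors: Set[str], today: date) -> bool:
    """Return True if `institution` may see the QC1 data on `today`.

    `contributors` is the set of institutions that have themselves
    contributed data to CMIP5 (the recommended default for "mates").
    """
    if policy.open_to_all:
        return True
    days = (policy.embargo_days
            if policy.embargo_days is not None
            else DEFAULT_EMBARGO_DAYS)
    # Assumption: once "a while" has passed, the data is open to everyone.
    if today >= policy.published + timedelta(days=days):
        return True
    allowed = policy.mates if policy.mates is not None else contributors
    return institution in allowed
```

With no choices made by the group, a contributing institution gets in 
straight away and everyone else waits out the default period; a group 
that sets open_to_all skips the whole thing.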

It gives folks time to retract versions before other folk (who may be 
less expert in the detail) irrevocably waste time.

Personally, I think this approach is legal in terms of UK FOI 
regulations, and probably US.  We have a well defined procedure to ensure 
we're not delivering rubbish ... and there is no restriction of actual 
results (but we're simply saying we're not sure they are results, until 
some time for examination has passed: no one advocates that all data 
should be streamed hot off the supercomputer to public web sites, so this 
is just a variation on that theme). By the way, the situation with 
observations (as opposed to simulations) *is* different. There, raw data 
will have benefit, for sure, provided the appropriate metadata 
accompanies it ...

For data being replicated into the core, it's different. Now, PCMDI, DKRZ 
and BADC are spending public money in pursuit of defined goals (CMIP5, 
IPCC). We collectively have to follow the goals of the sponsoring 
organisations (WGCM, IPCC) ... within the budgets available ...

... but we get to recommend to those organisations what the policy 
should be.  And I agree, there we have a much higher standard of duty on 
what it means to be Published. This email is long enough already ... we 
can cover this angle more anon.

Cheers
Bryan

On Tuesday 16 Mar 2010 19:32:58 Gavin M Bell wrote:
> In addition to my last email...
> 
> I also believe it to be important that anything published by the
> publisher should be immediately available to the entire community to
> download, investigate, look at, etc....  We can leverage the entire
> community to help with quality control in addition to the few that
>  are directly charged with the review responsibility.  One thing that
>  would have to be worked out, as I mentioned before would be the
>  mechanism for communication - OUTSIDE OF THE ESG SYSTEM.
> 
> I do not believe in making data not fully open and visible.
> 
> I feel that the openness of the system will have the effect of
>  forcing people to be more judicious when publishing their work. 
>  Also, if someone attempts to reference a bit of work that does not
>  have a proper DOI associated with it, it will speak to the rigor and
>  thus validity of that work, which the community should judge
>  appropriately.
> 
> If you have a visibility cloak, I feel it does a disservice, as it
> fosters a certain whimsy or laissez-faire posture w.r.t publishing
>  that should be discouraged.
> 
> Okay... I think that is all I wanted to say.
> 
> :-)
> 

-- 
Bryan Lawrence
Director of Environmental Archival and Associated Research
(NCAS/British Atmospheric Data Centre and NCEO/NERC NEODC)
STFC, Rutherford Appleton Laboratory
Phone +44 1235 445012; Fax ... 5848; 
Web: home.badc.rl.ac.uk/lawrence

