[Go-essp-tech] What is the risk that science is done using 'deprecated' data?

Kettleborough, Jamie jamie.kettleborough at metoffice.gov.uk
Thu Mar 8 05:06:11 MST 2012


Thanks for the replies on this - any other replies are still very welcome.

Stephen - being selfish - we aren't too worried about 2 as it's less of an issue for us (we do a daily trawl of THREDDS catalogues for new datasets), but I agree it is a problem more generally.  I don't have a feel for which of problems 1-3 would reduce the risk most if it were solved.  I think making sure new data has a new version is a foundation though.

Part of me wonders though whether it's already too late to really do anything with versioning in its current form.  *But* I may be overestimating the size of the problem of new datasets appearing without versions being updated.

Jamie


> -----Original Message-----
> From: go-essp-tech-bounces at ucar.edu 
> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of Sébastien Denvil
> Sent: 08 March 2012 10:41
> To: go-essp-tech at ucar.edu
> Subject: Re: [Go-essp-tech] What is the risk that science is 
> done using 'deprecated' data?
> 
> Hi Stephen, let me add a third point:
> 
> 3. Users are aware of new versions but can't download the files
> needed to have a coherent set of files.
> 
> With respect to that point the p2p transition (especially the
> attribute caching on the node) will be a major step forward.
> GFDL has just upgraded and we have an amazing success rate of 98%.
> 
> And I agree with Ashish.
> 
> Regards.
> Sébastien
> 
> On 08/03/2012 11:34, stephen.pascoe at stfc.ac.uk wrote:
> > Hi Jamie,
> >
> > I can imagine there is a risk of papers being written on
> > deprecated data in two scenarios:
> >
> >   1. Data is being updated at datanodes without creating a
> >      new version
> >   2. Users are unaware of new versions available and
> >      therefore using deprecated data
> >
> > Are you concerned about both of these scenarios?  Your
> > email seems to mainly address #1.
> >
> > Thanks,
> > Stephen.
> >
> > On 8 Mar 2012, at 10:21, Kettleborough, Jamie wrote:
> >
> >> Hello,
> >>
> >> Does anyone have a feel for the current level of risk that analysts
> >> are doing work (with the intention to publish) on data that has been
> >> found to be wrong by the data providers and so deprecated (in some
> >> sense)?
> >>
> >> My feeling is that versioning isn't working (that may be putting it a
> >> bit strongly).  It is too easy for data providers - in their
> >> understandable drive to get their data out - to have updated files on
> >> disk without publishing a new version.  How big a deal does anyone
> >> think this is?
> >>
> >> If the risk that papers are being written based on deprecated data is
> >> sufficiently large then is there an agreed strategy for coping with
> >> this?  Does it have implications for the requirements of the data
> >> publishing/delivery system?
> >>
> >> Thanks,
> >>
> >> Jamie
> >> _______________________________________________
> >> GO-ESSP-TECH mailing list
> >> GO-ESSP-TECH at ucar.edu
> >> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
> 
> 
> --
> Sébastien Denvil
> IPSL, Pôle de modélisation du climat
> UPMC, Case 101, 4 place Jussieu,
> 75252 Paris Cedex 5
> 
> Tour 45-55 2ème étage Bureau 209
> Tel: 33 1 44 27 21 10
> Fax: 33 1 44 27 39 02
> 
> 
> 

