[Go-essp-tech] ESG CMIP5 notification and inquiry requirements
Bryan Lawrence
bryan.lawrence at stfc.ac.uk
Tue Nov 9 11:59:48 MST 2010
Hi Karl
A quick response to 1). This is exactly why I want a tracking id
service. (Which I'm not yet sure we've built. Hans, what's the status of
that?)
If you have a file with a tracking id, you want to be able to cut and
paste that id into a "find my data" page and go straight to the parent
metadata, which should show whether it has been withdrawn and what the
current version is.
Cheers
Bryan
> Dear all,
>
> I am concerned that we will be unable to help users learn when CMIP5
> data they have downloaded has been withdrawn (presumably because it
> is flawed). Here are some common "use cases" that ESG should be
> able to handle (but I don't think it currently does).
>
> 1. A user downloads some files on December 12, 2010. Three months
> later he wants to know
> a) if any files he downloaded were withdrawn (i.e., found to be
> flawed).
> b) if similar data from other models (or replacement data from
> models he has already downloaded) has become available.
> c) the reasons for data being withdrawn or replaced. (For
> example, was the data in the file flawed? Was the data mislabeled?
> Were some of the attributes incorrect? If so, which ones?)
>
> 2. A user wishes to be informed by email when files he has
> downloaded have been withdrawn, and he wants to know the reasons for
> their withdrawal.
>
> 3. A user wishes to be informed by email when new files in his area
> of interest become available. (The user would define "area of
> interest" in terms of a set of DRS identifiers, e.g., experiment,
> variable name, MIP table).
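[An inline note on 3: the "area of interest" matching could be as simple as comparing DRS components. A sketch, assuming dot-separated DRS dataset identifiers; the component ordering below is illustrative, not the definitive DRS specification:

```python
# Sketch: match a new dataset against a user's "area of interest",
# expressed as constraints on DRS components. The component ordering
# here is an illustrative assumption about the DRS layout.
DRS_COMPONENTS = ["activity", "product", "institute", "model",
                  "experiment", "frequency", "realm", "table", "variable"]

def parse_drs(drs_id):
    """Split a dot-separated DRS identifier into named components."""
    return dict(zip(DRS_COMPONENTS, drs_id.split(".")))

def matches_interest(drs_id, interest):
    """True if every constrained component matches the dataset id."""
    parts = parse_drs(drs_id)
    return all(parts.get(k) == v for k, v in interest.items())

# A user interested in 'tas' from the 'historical' experiment:
interest = {"experiment": "historical", "variable": "tas"}
drs_id = "cmip5.output1.MOHC.HadGEM2-ES.historical.mon.atmos.Amon.tas"
print(matches_interest(drs_id, interest))  # True
```

Newly published dataset ids would be run through each stored subscription, and a match triggers the notification email. -- Bryan]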
>
> 4. A reader of a journal article wants to know whether any of the
> data used in a study has been withdrawn and the reasons for its
> withdrawal. The DOIs for the dataset(s) are included in the
> article, and the user knows what variables are used from that
> dataset. How does he learn whether the data were subsequently
> withdrawn, and the reasons?
>
> It is my understanding that the assignment of versions in the present
> system is based on "dataset", whereas most users will only be
> interested in a tiny portion of the dataset (e.g., a single
> variable, rather than the perhaps 100 variables that might be
> included in the dataset). It would be very little help if the user
> could learn only about changes at the dataset level (which might
> occur because a single variable was added, withdrawn, or replaced).
> Also, the *reason* for any changes to a dataset should always be made
> clear. So, the challenges would seem to be:
>
> 1. Making sure data providers recorded information about why changes
> were made to their datasets.
>
> 2. Being able to report changes applied only to the subset of files
> in a dataset that are of interest to any particular user. This is
> especially important: if a user is interested in only 1 out of
> 100 variables, he doesn't want to be bothered with messages about
> changes to the dataset that didn't affect the variable he is
> interested in.
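[Inline note on 2: filtering the change log per user is straightforward once the changes are recorded per variable. A sketch, with hypothetical change-record and subscription structures:

```python
# Sketch: notify a user only about dataset changes that touch
# variables he actually downloaded. The change-record structure
# is hypothetical.

def relevant_changes(changes, downloaded_variables):
    """Filter a dataset's change log down to the user's variables.

    changes: list of dicts with 'variable', 'action', 'reason'
    downloaded_variables: set of variable names the user holds
    """
    return [c for c in changes if c["variable"] in downloaded_variables]

changes = [
    {"variable": "tas", "action": "withdrawn", "reason": "flawed values"},
    {"variable": "pr", "action": "replaced", "reason": "mislabeled attribute"},
]

# A user who downloaded only 'pr' sees one message, not two.
for c in relevant_changes(changes, {"pr"}):
    print(c["action"], c["variable"], "-", c["reason"])
```

Note this only works if challenge 1 is met: the per-variable reason has to be captured at publication time, or there is nothing to filter. -- Bryan]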
>
> Another thing we should plan on doing is making it easy for users to
> report suspected errors in the data they are analyzing directly to
> the responsible modeling group(s). How are we going to handle all
> the emails from users who think they've discovered problems?
>
> If we don't have some way of doing the above by the first month or
> two of 2011, I think we're going to be in for lots of complaints. I
> therefore hope we can make this a very high priority. Are there any
> higher priorities? (I'm sure there are, just wondering what they
> are.)
>
> Best regards,
> Karl
>
> p.s. feel free to post or forward to whomever you think might be able
> to help.
--
Bryan Lawrence
Director of Environmental Archival and Associated Research
(NCAS/British Atmospheric Data Centre and NCEO/NERC NEODC)
STFC, Rutherford Appleton Laboratory
Phone +44 1235 445012; Fax ... 5848;
Web: home.badc.rl.ac.uk/lawrence