[Go-essp-tech] Data file gaps

martin.juckes at stfc.ac.uk martin.juckes at stfc.ac.uk
Wed Mar 21 11:25:48 MDT 2012


Hi Jeff, Karl,

Specifying the branch time as a year rather than as requested does sound like a serious error - it is one attribute which users should be using. I can see a few options:


(1)    We create a list somewhere of modelling centres which have used it incorrectly, and advertise the correct values somehow;

(2)    Ask them to resubmit corrected data;


Option (1) is messy; (2) would potentially involve massive additional data flows. So perhaps (1) is best at this stage: what do you think?

Cheers,
Martin

From: go-essp-tech-bounces at ucar.edu [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of Jeffrey F. Painter
Sent: 21 March 2012 16:47
To: go-essp-tech at ucar.edu
Subject: Re: [Go-essp-tech] Data file gaps

Martin,

I got the impression of two control runs because different experiments or different physics versions of one experiment "are meant to be compared with" different disjoint segments of the piControl files.  Indeed other interpretations are possible; I could be wrong.  Moreover, I would not expect to see a substantive difference between two control runs with the same physics and two widely separated segments of one control run.  But there are two bottom lines: (1) Jennifer's original issue was really whether the time gap meant there was missing data.  Not in this case.  (2) In practice what matters is, as you say, that one should use the :branch_time attribute (of the branching experiment, not the control run).

There is a related issue.  Several modeling centers provide only a year for the :branch_time.  The specification which Jamie refers to says that the :branch_time should be a time in the units of the parent experiment, not a year.  I have been helping a scientist with calculations that involve three experiments which branch from one another.  The compliant data, where the branch time is more precisely specified, has been easier to deal with.
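
As an illustration of the difference, here is a minimal sketch that converts a bare year into a time value in a parent experiment's units. It assumes a 365-day (no-leap) calendar and units of "days since <ref_year>-01-01". Both are assumptions made for the example only; the real calendar and units have to be read from the parent experiment's time axis. The function name year_to_branch_time is hypothetical.

```python
# Assumption: the parent experiment uses a 365-day (no-leap) calendar.
DAYS_PER_YEAR = 365


def year_to_branch_time(year, ref_year=1):
    """Return the days elapsed from 1 January of ref_year to 1 January of year.

    Assumes time units of "days since <ref_year>-01-01" in a no-leap calendar.
    """
    return float((year - ref_year) * DAYS_PER_YEAR)


if __name__ == "__main__":
    # A branch point at the start of year 3981 of the parent run,
    # expressed as days since 0001-01-01 under the assumptions above.
    print(year_to_branch_time(3981))
```

A file that stores only "3981" leaves the user to guess the calendar and reference date, which is why the more precisely specified values are easier to work with.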

- Jeff


On 3/21/12 8:48 AM, martin.juckes at stfc.ac.uk wrote:
Hi Jeff,

Where do you get the information that there are two control runs, as opposed to a single run for which two segments are archived? Ken appears to suggest that it is a single run for which two segments have been saved.

As Jamie suggests, it is the "branch_time" attribute in the branching experiment that you need to look at - it is only relevant here because it enters into the specification of the data requested. I.e. people are asked to provide data for periods matching those in the branched historical experiments. In the present instance that request appears to result in disjoint segments of data being archived.

Cheers,
Martin

From: go-essp-tech-bounces at ucar.edu [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of Kettleborough, Jamie
Sent: 21 March 2012 14:51
To: Jennifer Adams; go-essp-tech at ucar.edu
Subject: Re: [Go-essp-tech] Data file gaps

Hello Jennifer,

Did you check the branch_time of the piControl sections, or of the historicalMisc (p1), historical etc. runs that Ken mentions? I think it's the historical runs that Ken mentions that should have the correct branch times, with the parent_experiment_id pointing to piControl.  The branch time is described in one of Karl's documents... http://cmip-pcmdi.llnl.gov/cmip5/docs/CMIP5_output_metadata_requirements.pdf - though the most obvious route to this document I know is through the 'data providers' item on the CMIP5 web page.  Karl - is it worth making this more obviously relevant to data users too?

I don't know whether it's worth trying to capture an expanded version of Ken's explanation somewhere for all data users to see.  If it is, then where?

For what it's worth, branch_time is one of those things that we found fairly hard to get right when we were producing data.

Jamie

________________________________
From: go-essp-tech-bounces at ucar.edu [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of Jennifer Adams
Sent: 21 March 2012 14:04
To: go-essp-tech at ucar.edu
Subject: Re: [Go-essp-tech] Data file gaps
Ken Lo's explanation makes me want to laugh and cry at the same time. I see a global attribute called "branch_time", but for the handful of random files I checked from both periods it is set to "0."
--Jennifer


On Mar 20, 2012, at 7:13 PM, Jeffrey F. Painter wrote:



I can shed some light on one of the gaps: the GISS-E2-R piControl files actually contain TWO control runs!  One side of the gap is one control run, and the other side of the gap is the other control run.  I cut-and-pasted a message from Ken Lo about this, at the bottom of this message.

- Jeff

On 3/20/12 11:55 AM, Karl Taylor wrote:
database problem??

-------- Original Message --------
Subject: [Go-essp-tech] Data file gaps
Date: Tue, 20 Mar 2012 11:50:07 -0700
From: Jennifer Adams <jma at cola.iges.org>
To: go-essp-tech at ucar.edu


Dear All,
I have a GrADS script that creates descriptor files for CMIP5 data, and it has uncovered some gaps in the date strings of files that belong to an atomic data set. I have checked the ESGF gateways for all the datasets listed below and I can confirm that these files are missing for some variables. I haven't given the variable names, because I have only checked the subset of variables that are of interest to me and that may not be a complete list. If the specific var names that I checked would be useful, I can provide them. While checking on these gaps, I noticed that not all variables in an atomic dataset span the same time range ... an inconsistency that doesn't quite make sense to me. I also noticed that in some cases, data files that I had grabbed are no longer listed on the gateway, even though the version number is the same. An example is cmip5.output1.NCC.NorESM1-M.piControl.day.land.day.r1i1p1.v20110901, mrsos, date range 11000101-12001231. What does that mean?

The gaps put these data into the "not quite usable" category. I wasn't sure whether to send this to the helpdesk or the forum or both; in the end I am just posting here.
--Jennifer

 11880 times missing in cmip5.output1.NASA-GISS.GISS-E2-H.piControl.mon.atmos.Amon.r1i1p1 between 141912 and 241001
 11880 times missing in cmip5.output1.NASA-GISS.GISS-E2-H.piControl.mon.land.Lmon.r1i1p1  between 141912 and 241001
  4200 times missing in cmip5.output1.NASA-GISS.GISS-E2-R.piControl.mon.land.Lmon.r1i1p1  between 363012 and 398101
 11880 times missing in cmip5.output1.NASA-GISS.GISS-E2-H.piControl.mon.ocean.Omon.r1i1p1 between 141912 and 241001
  4200 times missing in cmip5.output1.NASA-GISS.GISS-E2-R.piControl.mon.ocean.Omon.r1i1p1 between 363012 and 398101
     1 times missing in cmip5.output1.MOHC.HadGEM2-CC.historical.mon.ocean.Omon.r3i1p1    between 200110 and 200112
    60 times missing in cmip5.output1.NOAA-GFDL.GFDL-ESM2M.piControl.mon.land.Lmon.r1i1p1 between 017012 and 017601
    60 times missing in cmip5.output1.NOAA-GFDL.GFDL-ESM2M.piControl.mon.land.Lmon.r1i1p1 between 003012 and 003601
  3650 times missing in cmip5.output1.NOAA-GFDL.GFDL-CM3.piControl.day.atmos.day.r1i1p1   between 00101231 and 00210101
  1825 times missing in cmip5.output1.NOAA-GFDL.GFDL-CM3.piControl.day.atmos.day.r1i1p1   between 00401231 and 00460101
  7300 times missing in cmip5.output1.NOAA-GFDL.GFDL-CM3.piControl.day.atmos.day.r1i1p1   between 00751231 and 00960101
142350 times missing in cmip5.output1.NOAA-GFDL.GFDL-CM3.piControl.day.atmos.day.r1i1p1   between 01001231 and 04910101
 73000 times missing in cmip5.output1.NCC.NorESM1-M.piControl.day.land.day.r1i1p1         between 08991231 and 11000101
 29220 times missing in cmip5.output1.MPI-M.MPI-ESM-LR.rcp85.day.atmos.day.r1i1p1         between 22001231 and 22810101
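
The counts in the list above can be reproduced with simple date arithmetic. The following is a minimal sketch, not the GrADS script itself: it takes the last time stamp before a gap and the first one after it and counts the steps in between. The function names are hypothetical, and the daily version assumes a 365-day (no-leap) calendar, which does not hold for every model in the list.

```python
# Month lengths for an assumed 365-day (no-leap) calendar.
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def month_index(stamp):
    """Map a YYYYMM (or YYYMM) string to a running month count."""
    year, month = int(stamp[:-2]), int(stamp[-2:])
    return year * 12 + (month - 1)


def missing_months(last_before, first_after):
    """Count the months strictly between two monthly time stamps."""
    return month_index(first_after) - month_index(last_before) - 1


def noleap_day_index(stamp):
    """Map a YYYYMMDD string to a running day count in a no-leap calendar."""
    year, month, day = int(stamp[:-4]), int(stamp[-4:-2]), int(stamp[-2:])
    return year * 365 + sum(DAYS_IN_MONTH[:month - 1]) + (day - 1)


def missing_days_noleap(last_before, first_after):
    """Count the days strictly between two daily stamps (no-leap calendar)."""
    return noleap_day_index(first_after) - noleap_day_index(last_before) - 1


if __name__ == "__main__":
    print(missing_months("141912", "241001"))           # GISS-E2-H Amon gap
    print(missing_days_noleap("00101231", "00210101"))  # GFDL-CM3 day gap
```

For a model on a calendar with leap years (the MPI-ESM-LR count of 29220 over 80 years suggests one), the no-leap function would undercount, so the calendar attribute of the time axis should be checked first.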

--
Jennifer M. Adams
IGES/COLA
4041 Powder Mill Road, Suite 302
Calverton, MD 20705
jma at cola.iges.org




On 11/4/11 6:07 AM, Ken Lo wrote:
Dear Jeff Painter,

Years 3331 to 3630 of the control run are meant to be compared with
historicalMisc (p1), while years 3981 to 4530 are meant to be compared
with historical, historicalMisc (p109), historicalGHG, historicalNat
and rcp runs (all p1 except indicated otherwise).  The appropriate years
for comparison are written in the branch time of the metadata of each
file.
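
The pairing described above can be written down as a small lookup table. This is only a sketch of the message's content: the names SEGMENTS and control_years_for are hypothetical, and matching any experiment name that starts with "rcp" is an assumption about what "rcp runs" covers.

```python
# Control-run year ranges and the experiments they are meant to be
# compared with, as (experiment, physics version) pairs.
SEGMENTS = {
    (3331, 3630): [("historicalMisc", "p1")],
    (3981, 4530): [
        ("historical", "p1"),
        ("historicalMisc", "p109"),
        ("historicalGHG", "p1"),
        ("historicalNat", "p1"),
        ("rcp", "p1"),  # assumption: stands for every rcp* experiment
    ],
}


def control_years_for(experiment, physics):
    """Return the (first_year, last_year) control segment, or None."""
    for years, members in SEGMENTS.items():
        for name, phys in members:
            if phys != physics:
                continue
            if experiment == name:
                return years
            if name == "rcp" and experiment.startswith("rcp"):
                return years
    return None


if __name__ == "__main__":
    print(control_years_for("historicalMisc", "p1"))
    print(control_years_for("rcp85", "p1"))
```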
