[Go-essp-tech] CMIP5 data archive size estimate

bryan.lawrence at stfc.ac.uk
Wed Dec 9 01:43:20 MST 2009


Hi Karl
Don't worry, I can cycle the .xlsx through someone else and get .xls back ... you've got better things to do. I hadn't appreciated the volume problem.
Bryan


-----Original Message-----
From: Karl Taylor [mailto:taylor13 at llnl.gov]
Sent: Wed 09/12/2009 08:05
To: Lawrence, Bryan (STFC,RAL,SSTD)
Cc: luca at ucar.edu; Pascoe, Stephen (STFC,RAL,SSTD); go-essp-tech at ucar.edu
Subject: CMIP5 data archive size estimate
 
Dear all,

I promised to send these spreadsheets to you today, but I don't have 
time to explain them.  Here are some quick notes:

0.  I've only attached the .xlsx version.  The .xls version is 40 
megabytes, so I can't send it by email.  I'll try to find another way to 
get it to you tomorrow.

1.  Estimates are based on input from modeling groups collected more 
than a year ago.

2.  I think only about 2/3 of the models are included in the estimate.

3.  The estimate assumes that every experiment a group designated as at 
least 66% likely to be performed will actually be run.  
(This perhaps approximately offsets the fact that not all groups have 
provided input yet.)

4.  You can't rely on any single piece of information in the spreadsheet 
(it's all completely unofficial), but the estimate of archive size under 
the stated assumptions is probably correct.

5.  There are no estimates of the number of "atomic datasets" or the 
number of files per atomic dataset.

6.  I think in at least one place gigabytes should have read bytes, but 
that should be obvious.

7.  There are estimates for size at the end of 2010 and at the end of 
2014, but I didn't ask groups for their timelines, so these estimates 
are identical.

8.  There are estimates for "requested output" volume and "replicated" 
output volume.  

9.  The tables of variables that are referred to in the spreadsheets can 
be found at: 
http://cmip-pcmdi.llnl.gov/cmip5/data_description.html?submenuheader=1

10.  Bottom line:  of the almost 2 petabytes requested, about 1 petabyte 
of data will be replicated.

Best regards,
Karl

