[Go-essp-tech] CMIP5 data archive size estimate
Karl Taylor
taylor13 at llnl.gov
Wed Dec 9 09:08:33 MST 2009
Hi Stephen,
Nice trick! (or perhaps that's a forbidden term now). Amazing how MS
can expand by nearly an order of magnitude the storage needed for the
same information.
By the way, I just noticed the "requested" data would occupy 2.2
petabytes, not "almost 2 petabytes" as stated in point 10 of my previous
email.
cheers,
Karl
stephen.pascoe at stfc.ac.uk wrote:
>
> Thanks Karl,
>
> Converting it to *.xls then to *.ods with OpenOffice calc makes it much
> smaller (attached).
>
> S.
>
> ---
> Stephen Pascoe +44 (0)1235 445980
> British Atmospheric Data Centre
> Rutherford Appleton Laboratory
>
> -----Original Message-----
> From: Karl Taylor [mailto:taylor13 at llnl.gov]
> Sent: 09 December 2009 08:06
> To: Lawrence, Bryan (STFC,RAL,SSTD)
> Cc: luca at ucar.edu; Pascoe, Stephen (STFC,RAL,SSTD);
> go-essp-tech at ucar.edu
> Subject: CMIP5 data archive size estimate
>
> Dear all,
>
> I promised to send these spreadsheets to you today, but I don't have
> time to explain them. Here are some quick notes:
>
> 0. I've only attached the .xlsx version. The .xls version is 40
> megabytes, so I can't send it by email. I'll try to find another way to
> get it to you tomorrow.
>
> 1. Estimates are based on input from modeling groups collected more
> than a year ago.
>
> 2. I think only about 2/3 of the models are included in the estimate.
>
> 3. Estimate is based on assuming that all experiments designated by the
> group as 66% likely to be performed or better will actually be run.
> (This perhaps approximately offsets the fact that not all groups have
> provided input yet.)
>
> 4. You can't rely on any single piece of information in the spreadsheet
> (it's all completely unofficial), but the estimate of archive size under
> the stated assumptions is probably correct.
>
> 5. There are no estimates of the number of "atomic datasets" or the
> number of files per atomic dataset.
>
> 6. I think in at least one place "gigabytes" should have read "bytes",
> but that should be obvious.
>
> 7. There are estimates for size at the end of 2010 and at the end of
> 2014, but I didn't ask groups for their timelines, so these estimates
> are identical.
>
> 8. There are estimates for "requested output" volume and "replicated"
> output volume.
>
> 9. The tables of variables that are referred to in the spreadsheets can
> be found at:
> http://cmip-pcmdi.llnl.gov/cmip5/data_description.html?submenuheader=1
>
> 10. Bottom line: about 1 petabyte of data will be replicated of the
> almost 2 petabytes requested.
>
> Best regards,
> Karl
>
>
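For readers who want to reproduce the kind of tally described in point 3, here is a minimal sketch of the selection rule: count an experiment only if the group rated it at least 66% likely to be run, then sum the requested and replicated volumes. The record layout, field names, and every number below are hypothetical placeholders. None of them are taken from the spreadsheet, which may organise the data quite differently.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    name: str             # hypothetical label, not a real CMIP5 experiment name
    likelihood: float     # group's stated chance of running it, 0.0 to 1.0
    requested_tb: float   # "requested output" volume in terabytes (made up)
    replicated_tb: float  # "replicated" output volume in terabytes (made up)


def estimate_archive(experiments, threshold=0.66):
    """Sum volumes over experiments rated at least `threshold` likely."""
    selected = [e for e in experiments if e.likelihood >= threshold]
    requested = sum(e.requested_tb for e in selected)
    replicated = sum(e.replicated_tb for e in selected)
    return requested, replicated


# Illustrative survey responses only; the values are invented.
survey = [
    Experiment("exp_a", 0.90, 400.0, 200.0),
    Experiment("exp_b", 0.66, 100.0, 50.0),
    Experiment("exp_c", 0.50, 300.0, 100.0),  # below threshold, excluded
]

requested, replicated = estimate_archive(survey)
print(f"requested: {requested} TB, replicated: {replicated} TB")
```

Changing `threshold` shows how sensitive the total is to the "66% likely or better" assumption.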