[Go-essp-tech] +2Gb CMIP5 files

Karl Taylor taylor13 at llnl.gov
Mon May 17 11:44:54 MDT 2010


Dear all,

CMOR has code already in place for checking whether a file exceeds 2 GB, 
but it is currently turned off (it was turned on for CMIP3).  We thought 
it was now unnecessary.  If the feeling is that there will be users 
downloading CMIP5 files to Windows machines using older operating 
systems, I suppose that limiting CMIP5 files to whatever the limit is (2 
GB or 4 GB -- does anyone know which it is?) might be wise.
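
For reference, the two candidate limits differ by a factor of two. A signed 32-bit file offset tops out at 2^31 - 1 bytes (just under 2 GiB), while FAT32 records file sizes in an unsigned 32-bit field, giving 2^32 - 1 bytes (just under 4 GiB). Which one applies depends on the file system and on the software reading the file. A minimal sketch of a size check along those lines, in Python; the function names and returned labels are illustrative and are not taken from CMOR:

```python
import os

# Largest size addressable with a signed 32-bit offset (just under 2 GiB).
LIMIT_SIGNED_32 = 2**31 - 1
# Largest size FAT32 can record in its unsigned 32-bit size field (just under 4 GiB).
LIMIT_FAT32 = 2**32 - 1


def classify_size(nbytes):
    """Return a label saying which of the two limits a byte count exceeds."""
    if nbytes < 0:
        raise ValueError("size must be non-negative")
    if nbytes <= LIMIT_SIGNED_32:
        return "ok"
    if nbytes <= LIMIT_FAT32:
        return "exceeds-2GiB"
    return "exceeds-4GiB"


def classify_file(path):
    """Classify an existing file by its size on disk."""
    return classify_size(os.path.getsize(path))
```

A file labelled "exceeds-2GiB" would still fit on FAT32 but could trip software that uses signed 32-bit offsets; "exceeds-4GiB" fails both checks.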

On the other hand, will anyone use a Windows machine to look at netCDF 
files?  If not, maybe this is a non-issue.

Karl

On 5/16/10 12:08 PM, stephen.pascoe at stfc.ac.uk wrote:
> I think I raised undue alarm here when suggesting we might be dealing with +2GB files.  Thanks Phil for clarifying that UKMO is still planning to limit itself to <2GB files.
>
> I am wondering what the policy should be here?  My first thought is that modeling centres will mainly make the same decision as UKMO since it is in their interest for their model output to be widely used.  However, enforcement could be difficult.  The logical place to enforce the limit is in the level 1 QC but CMOR doesn't do this so it will be a problem for people running datanodes.
>
> I suggest we make a strong recommendation to supply data in <2GB files and enforce it during level-2 QC before replicating.
>
> S.
>
> -----Original Message-----
> From: go-essp-tech-bounces at ucar.edu on behalf of Michael Lautenschlager
> Sent: Sun 5/16/2010 1:35 PM
> To: V. Balaji
> Cc: go-essp-tech at ucar.edu
> Subject: Re: [Go-essp-tech] +2Gb CMIP5 files
>
> Hello *,
>
> we strongly support Phil's decision to keep data files below 2 GB. We
> made the same decision in Hamburg for the same reasons, because we cannot
> expect all users to have 64-bit systems. Most Windows environments are
> still running 32-bit.
>
> Best wishes, Michael
>
> ---------------
> Dr. Michael Lautenschlager
>
> German Climate Computing Centre (DKRZ)
> World Data Center Climate (WDCC)
> ADDRESS: Bundesstrasse 45a, D-20146 Hamburg, Germany
> PHONE:   +4940-460094-118
> E-Mail:  lautenschlager at dkrz.de
>
> URL:    http://www.dkrz.de/
>           http://www.wdc-climate.de/
>
> V. Balaji wrote:
>    
>> If I understood correctly the most serious 2Gb problem is with Apache!
>>
>> Bentley, Philip writes:
>>
>>      
>>> Hi Stephen,
>>>
>>> Yes, that's true, we did create a small number of test netCDF files in
>>> that size range. But this was because the CMOR library we used at the
>>> time didn't include functionality for chunking the output into smaller
>>> files. Plus we wanted to stress-test our pipeline!
>>>
>>> Two things have happened since then:
>>>
>>> 1. Jamie has been working with Charles at PCMDI to implement and test a
>>> solution whereby we can limit the size of the output netCDF files
>>> produced by CMOR.
>>>
>>> 2. We have made the local decision to limit our netCDF file sizes to 2
>>> GB (or thereabouts) as, logistically, that will cause us less headache
>>> moving these files around, and it should maximise the number of client
>>> applications in which the files can be read.
>>>
>>> IIRC, I think Balaji mentioned that the 64-bit offset format was
>>> required for output from the gridspec toolset. I could be wrong.
>>>
>>> Regards,
>>> Phil
>>>
>>>        
>>>> -----Original Message-----
>>>> From: go-essp-tech-bounces at ucar.edu
>>>> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of
>>>> stephen.pascoe at stfc.ac.uk
>>>> Sent: 14 May 2010 10:52
>>>> To: go-essp-tech at ucar.edu
>>>> Subject: [Go-essp-tech] +2Gb CMIP5 files
>>>>
>>>> The latest UKMO extraction for CMIP5 has produced some files
>>>> in the 30Gb range.  We had discussed previously the
>>>> assumption that all files would be <2Gb.  Do we feel it is
>>>> important to enforce a <2Gb limit or should this just be a
>>>> recommendation on modelling centres?
>>>>
>>>> To my knowledge there are two issues with +2Gb files:
>>>>
>>>>   1. +2GB NetCDF files will be in 64-bit offset format.
>>>> Therefore NetCDF libraries prior to v3.6 will not be able to
>>>> read them.
>>>>   2. Older file systems may have a 2Gb file limit. This will
>>>> mainly affect 32-bit systems that are a few years old. FAT32
>>>> has a 4Gb limit.
>>>>
>>>> These are end-user issues; is there any reason why the ESG
>>>> software might have problems with files over 2Gb?  If we do
>>>> want to ensure files are <2Gb, do we want to mandate that the
>>>> modelling centres deliver that or will the data centres need
>>>> to split files?
>>>>
>>>> Stephen.
>>>>
>>>> ---
>>>> Stephen Pascoe  +44 (0)1235 445980
>>>> British Atmospheric Data Centre
>>>> Rutherford Appleton Laboratory
>>>> --
>>>> Scanned by iCritical.
>>>> _______________________________________________
>>>> GO-ESSP-TECH mailing list
>>>> GO-ESSP-TECH at ucar.edu
>>>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>>>>
>>>>          
>>>
>>>        
>>      
>
>    
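
On Stephen's first point in the original message, the on-disk variant of a netCDF file can usually be read off its first four bytes: the classic format starts with `CDF\x01`, the 64-bit offset format with `CDF\x02`, and netCDF-4 files are HDF5 files that normally start with `\x89HDF`. A minimal sketch of such a check, in Python; the function names and labels are illustrative, and it assumes the HDF5 signature sits at offset 0, which the HDF5 format does not strictly guarantee:

```python
def variant_from_magic(magic):
    """Map the first four bytes of a file to a netCDF variant label."""
    if magic == b"CDF\x01":
        return "classic"
    if magic == b"CDF\x02":
        return "64-bit offset"
    if magic == b"\x89HDF":
        return "netCDF-4/HDF5"
    return "unknown"


def netcdf_variant(path):
    """Guess the netCDF on-disk variant of the file at `path`."""
    with open(path, "rb") as f:
        return variant_from_magic(f.read(4))
```

A file reported as "64-bit offset" is the case that netCDF libraries prior to v3.6 will not be able to read.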


