[Go-essp-tech] +2Gb CMIP5 files

Don Middleton don at ucar.edu
Tue May 18 11:49:56 MDT 2010


Interesting question, so I inquired a bit. Our NCL/Python distribution
is used by a community of thousands, and binary downloads for Windows
are at about 10%, with 11% for MacOS and 75% for Linux. I think if you
get into the impacts community, Windows would come out a lot higher
than this. Nate, do you recall our numbers for ESG users?

don


On May 18, 2010, at 10:57 AM, Gavin M Bell wrote:

> What self-respecting scientist would run windows....
>
> Ooops did I say that out loud :-).
>
> (just kidding... sort of) :-D
>
> Nathan Wilhelmi wrote:
>> Hi All,
>>
>>    Here is a nice table summarizing the various Windows file system
>> limits. http://www.ntfs.com/ntfs_vs_fat.htm
>>
>> -Nate
>>
>> stephen.pascoe at stfc.ac.uk wrote:
>>> I've done some testing of these file limits this afternoon and I don't
>>> think the filesystems will be a problem.
>>>
>>> From Wikipedia it appears the FAT32 file system has a 4Gb limit
>>> (http://en.wikipedia.org/wiki/File_Allocation_Table).  That covers
>>> Windows 95 onwards but my Windows XP box is NTFS and has no problem with
>>> +4Gb files.  Similarly my 32-bit linux laptop (recent ubuntu) can handle
>>> +4Gb files.
>>>
>>> Looks like anyone with a reasonably modern system will be able to handle
>>> +4Gb files.  We may have more problems with old NetCDF library versions.
>>>
>>> S.
>>>
>>> ---
>>> Stephen Pascoe  +44 (0)1235 445980
>>> British Atmospheric Data Centre
>>> Rutherford Appleton Laboratory
>>>
>>> -----Original Message-----
>>> From: go-essp-tech-bounces at ucar.edu
>>> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of
>>> ag.stephens at stfc.ac.uk
>>> Sent: 18 May 2010 09:31
>>> To: taylor13 at llnl.gov; go-essp-tech at ucar.edu
>>> Cc: doutriaux1 at llnl.gov
>>> Subject: Re: [Go-essp-tech] +2Gb CMIP5 files
>>>
>>> Dear Karl,
>>>
>>> Whether we think it's advisable or not, I'm sure that some of the wider
>>> CMIP5 user community will be looking at the outputs on Windows. I think
>>> it is sensible to set a 2GB file size limit.
>>>
>>> Regards,
>>>
>>> Ag
>>>
>>> -----Original Message-----
>>> From: go-essp-tech-bounces at ucar.edu
>>> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of Karl Taylor
>>> Sent: 17 May 2010 18:45
>>> To: go-essp-tech at ucar.edu
>>> Cc: Doutriaux, Charles
>>> Subject: Re: [Go-essp-tech] +2Gb CMIP5 files
>>>
>>> Dear all,
>>>
>>> CMOR has code already in place for checking whether a file exceeds 2 GB,
>>> but it is currently turned off (it was turned on for CMIP3).  We thought
>>> it was now unnecessary.  If the feeling is that there will be users
>>> downloading CMIP5 files to Windows machines using older operating
>>> systems, I suppose that limiting CMIP5 files to whatever the limit is (2
>>> GB or 4 GB -- does anyone know which it is?) might be wise.
>>>
>>> On the other hand, will anyone use a Windows machine to look at netCDF
>>> files?  If not, maybe this is a non-issue.
>>>
>>> Karl
>>>
>>> On 5/16/10 12:08 PM, stephen.pascoe at stfc.ac.uk wrote:
>>>
>>>> I think I raised undue alarm here when suggesting we might be dealing
>>>> with +2GB files.  Thanks Phil for clarifying that UKMO is still planning
>>>> to limit itself to <2GB files.
>>>>
>>>> I am wondering what the policy should be here?  My first thought is
>>>> that modeling centres will mainly make the same decision as UKMO since
>>>> it is in their interest for their model output to be widely used.
>>>> However, enforcement could be difficult.  The logical place to enforce
>>>> the limit is in the level 1 QC but CMOR doesn't do this so it will be a
>>>> problem for people running datanodes.
>>>>
>>>> I suggest we make a strong recommendation to supply data in <2GB files
>>>> and enforce it during level-2 QC before replicating.
>>>
>>>> S.
>>>>
>>>> -----Original Message-----
>>>> From: go-essp-tech-bounces at ucar.edu on behalf of Michael
>>>> Lautenschlager
>>>> Sent: Sun 5/16/2010 1:35 PM
>>>> To: V. Balaji
>>>> Cc: go-essp-tech at ucar.edu
>>>> Subject: Re: [Go-essp-tech] +2Gb CMIP5 files
>>>>
>>>> Hello *,
>>>>
>>>> we strongly support Phil's decision for data files less than 2 GB. We
>>>> made the decision in Hamburg for the same reasons, because we cannot
>>>> expect that all users use 64-bit systems. Most Windows environments are
>>>> still running with 32 bits.
>>>>
>>>> Best wishes, Michael
>>>>
>>>> ---------------
>>>> Dr. Michael Lautenschlager
>>>>
>>>> German Climate Computing Centre (DKRZ) World Data Center Climate
>>>> (WDCC)
>>>> ADDRESS: Bundesstrasse 45a, D-20146 Hamburg, Germany
>>>> PHONE:   +4940-460094-118
>>>> E-Mail:  lautenschlager at dkrz.de
>>>>
>>>> URL:    http://www.dkrz.de/
>>>>          http://www.wdc-climate.de/
>>>>
>>>> V. Balaji wrote:
>>>>
>>>>> If I understood correctly the most serious 2Gb problem is with
>>>>> apache!
>>>>>
>>>>> Bentley, Philip writes:
>>>>>
>>>>>> Hi Stephen,
>>>>>>
>>>>>> Yes, that's true, we did create a small number of test netCDF files
>>>>>> in that size range. But this was because the CMOR library we used at
>>>>>> the time didn't include functionality for chunking the output into
>>>>>> smaller files. Plus we wanted to stress-test our pipeline!
>>>>>>
>>>>>> Two things have happened since then:
>>>>>>
>>>>>> 1. Jamie has been working with Charles at PCMDI to implement and
>>>>>> test a solution whereby we can limit the size of the output netCDF
>>>>>> files produced by CMOR.
>>>>>>
>>>>>> 2. We have made the local decision to limit our netCDF file sizes to
>>>>>> 2 GB (or thereabouts) as, logistically, that will cause us less
>>>>>> headache moving these files around, and it should maximise the
>>>>>> number of client applications in which the files can be read.
>>>>>>
>>>>>> IIRC, I think Balaji mentioned that the 64-bit offset format was
>>>>>> required for output from the gridspec toolset. I could be wrong.
>>>>>>
>>>>>> Regards,
>>>>>> Phil
>>>>>>
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: go-essp-tech-bounces at ucar.edu
>>>>>>> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of
>>>>>>> stephen.pascoe at stfc.ac.uk
>>>>>>> Sent: 14 May 2010 10:52
>>>>>>> To: go-essp-tech at ucar.edu
>>>>>>> Subject: [Go-essp-tech] +2Gb CMIP5 files
>>>>>>>
>>>>>>> The latest UKMO extraction for CMIP5 has produced some files in the
>>>>>>> 30Gb range.  We had discussed previously the assumption that all
>>>>>>> files would be <2Gb.  Do we feel it is important to enforce a <2Gb
>>>>>>> limit or should this just be a recommendation on modelling centres?
>>>>>>>
>>>>>>> To my knowledge there are two issues with +2Gb files:
>>>>>>>
>>>>>>>  1. +2GB NetCDF files will be in 64-bit offset format.
>>>>>>> Therefore NetCDF libraries prior to v3.6 will not be able to read
>>>>>>> them.
>>>>>>>  2. Older file systems may have a 2Gb file limit. This will mainly
>>>>>>> affect 32-bit systems that are a few years old. FAT32 has a 4Gb
>>>>>>> limit.
>>>>>>>
>>>>>>> These are end-user issues. Is there any reason why the ESG software
>>>>>>> might have problems with files over 2Gb?  If we do want to ensure
>>>>>>> files are <2Gb do we want to mandate the modelling centres deliver
>>>>>>> that or will the data centres need to split files?
>>>>>>>
>>>>>>> Stephen.
>>>>>>>
>>>>>>> ---
>>>>>>> Stephen Pascoe  +44 (0)1235 445980
>>>>>>> British Atmospheric Data Centre
>>>>>>> Rutherford Appleton Laboratory
>>>>>>> --
>>>>>>> Scanned by iCritical.
>>>>>>> _______________________________________________
>>>>>>> GO-ESSP-TECH mailing list
>>>>>>> GO-ESSP-TECH at ucar.edu
>>>>>>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>
> -- 
> Gavin M. Bell
> Lawrence Livermore National Labs
> --
>
> "Never mistake a clear view for a short distance."
>       	       -Paul Saffo
>
> (GPG Key - http://rainbow.llnl.gov/dist/keys/gavin.asc)
>
> A796 CE39 9C31 68A4 52A7  1F6B 66B7 B250 21D5 6D3E
> _______________________________________________
> GO-ESSP-TECH mailing list
> GO-ESSP-TECH at ucar.edu
> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
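
For readers who want to check their own files against the two end-user
issues raised in this thread (the 64-bit offset format and the 2 GB / 4 GB
size thresholds), here is a minimal Python sketch. It is not from the
thread and is not how CMOR performs its check: the names netcdf_format and
check_file are hypothetical, and it assumes the format can be identified
from the first four bytes of the file (with the HDF5 signature at offset 0
for netCDF-4 files).

```python
import os

# Largest offset a signed 32-bit integer can hold (the "2 GB" limit).
MAX_SIGNED_32 = 2**31 - 1
# Largest file size FAT32 allows (the "4 GB" limit).
MAX_FAT32 = 2**32 - 1

# Leading bytes that identify the netCDF on-disk formats.
_MAGIC = {
    b"CDF\x01": "classic",
    b"CDF\x02": "64-bit offset",
    b"\x89HDF": "netCDF-4/HDF5",  # assumes the HDF5 signature sits at offset 0
}


def netcdf_format(path):
    """Return a format name based on the first four bytes, or None if unknown."""
    with open(path, "rb") as f:
        return _MAGIC.get(f.read(4))


def check_file(path):
    """Report the size and format of a file and whether it crosses either limit."""
    size = os.path.getsize(path)
    return {
        "size": size,
        "format": netcdf_format(path),
        "over_2gb": size > MAX_SIGNED_32,
        "over_fat32": size > MAX_FAT32,
    }
```

Calling check_file("some_output.nc") (a placeholder path) returns a small
dictionary; a file reported as "64-bit offset" is one that, per the
original message, NetCDF libraries prior to v3.6 will not be able to read.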


