[Go-essp-tech] +2Gb CMIP5 files

Don Middleton don at ucar.edu
Tue May 18 12:18:28 MDT 2010


On May 18, 2010, at 12:07 PM, Gavin M Bell wrote:

> Interesting...
>
> I wonder what the other 4% are running? :-).

AIX, SunOS, that sort of thing. As I recall, Windows users were a fair  
bit higher than this on ESG.

don


>
> This is good information to have. I am actually glad that someone has
> numbers on these things. :-).  Thanks.
>
> Don Middleton wrote:
>> Interesting question, so I inquired a bit. Our NCL/Python distribution
>> is used by a community of thousands, and binary downloads for Windows are
>> at about 10%, with 11% for MacOS, and 75% Linux. I think if you get into
>> the impacts community, Windows would come out a lot higher than this.
>> Nate, do you recall our numbers for ESG users?
>>
>> don
>>
>>
>> On May 18, 2010, at 10:57 AM, Gavin M Bell wrote:
>>
>>> What self-respecting scientist would run Windows....
>>>
>>> Ooops did I say that out loud :-).
>>>
>>> (just kidding... sort of) :-D
>>>
>>> Nathan Wilhelmi wrote:
>>>> Hi All,
>>>>
>>>>   Here is a nice table summarizing the various Windows file system
>>>> limits. http://www.ntfs.com/ntfs_vs_fat.htm
>>>>
>>>> -Nate
>>>>
>>>> stephen.pascoe at stfc.ac.uk wrote:
>>>>> I've done some testing of these file limits this afternoon and I don't
>>>>> think the filesystems will be a problem.
>>>>>
>>>>> From Wikipedia it appears the FAT32 file system has a 4GB limit
>>>>> (http://en.wikipedia.org/wiki/File_Allocation_Table).  That covers
>>>>> Windows 95 onwards but my Windows XP box is NTFS and has no problem
>>>>> with +4GB files.  Similarly my 32-bit Linux laptop (recent Ubuntu) can
>>>>> handle +4GB files.
>>>>>
>>>>> Looks like anyone with a reasonably modern system will be able to
>>>>> handle +4GB files.  We may have more problems with old NetCDF library
>>>>> versions.
>>>>>
>>>>> S.
>>>>>
>>>>> ---
>>>>> Stephen Pascoe  +44 (0)1235 445980
>>>>> British Atmospheric Data Centre
>>>>> Rutherford Appleton Laboratory
>>>>>
>>>>> -----Original Message-----
>>>>> From: go-essp-tech-bounces at ucar.edu
>>>>> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of
>>>>> ag.stephens at stfc.ac.uk
>>>>> Sent: 18 May 2010 09:31
>>>>> To: taylor13 at llnl.gov; go-essp-tech at ucar.edu
>>>>> Cc: doutriaux1 at llnl.gov
>>>>> Subject: Re: [Go-essp-tech] +2Gb CMIP5 files
>>>>>
>>>>> Dear Karl,
>>>>>
>>>>> Whether we think it's advisable or not, I'm sure that some of the wider
>>>>> CMIP5 user community will be looking at the outputs on Windows. I think
>>>>> it is sensible to set a 2GB file size limit.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Ag
>>>>>
>>>>> -----Original Message-----
>>>>> From: go-essp-tech-bounces at ucar.edu
>>>>> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of Karl Taylor
>>>>> Sent: 17 May 2010 18:45
>>>>> To: go-essp-tech at ucar.edu
>>>>> Cc: Doutriaux, Charles
>>>>> Subject: Re: [Go-essp-tech] +2Gb CMIP5 files
>>>>>
>>>>> Dear all,
>>>>>
>>>>> CMOR has code already in place for checking whether a file exceeds 2 GB,
>>>>> but it is currently turned off (it was turned on for CMIP3).  We thought
>>>>> it was now unnecessary.  If the feeling is that there will be users
>>>>> downloading CMIP5 files to Windows machines using older operating
>>>>> systems, I suppose that limiting CMIP5 files to whatever the limit
>>>>> is (2 GB or 4 GB -- does anyone know which it is?) might be wise.
>>>>>
>>>>> On the other hand, will anyone use a Windows machine to look at netCDF
>>>>> files?  If not, maybe this is a non-issue.
>>>>>
>>>>> Karl
>>>>>
>>>>> On 5/16/10 12:08 PM, stephen.pascoe at stfc.ac.uk wrote:
>>>>>
>>>>>> I think I raised undue alarm here when suggesting we might be dealing
>>>>>> with +2GB files.  Thanks Phil for clarifying that UKMO is still planning
>>>>>> to limit itself to <2GB files.
>>>>>>
>>>>>> I am wondering what the policy should be here?  My first thought is
>>>>>> that modeling centres will mainly make the same decision as UKMO since
>>>>>> it is in their interest for their model output to be widely used.
>>>>>> However, enforcement could be difficult.  The logical place to enforce
>>>>>> the limit is in the level 1 QC but CMOR doesn't do this so it will be a
>>>>>> problem for people running datanodes.
>>>>>>
>>>>>> I suggest we make a strong recommendation to supply data in <2GB files
>>>>>> and enforce it during level-2 QC before replicating.
>>>>>>
>>>>>> S.
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: go-essp-tech-bounces at ucar.edu on behalf of Michael
>>>>>> Lautenschlager
>>>>>> Sent: Sun 5/16/2010 1:35 PM
>>>>>> To: V. Balaji
>>>>>> Cc: go-essp-tech at ucar.edu
>>>>>> Subject: Re: [Go-essp-tech] +2Gb CMIP5 files
>>>>>>
>>>>>> Hello *,
>>>>>>
>>>>>> we strongly support Phil's decision for data files less than 2 GB. We
>>>>>> made the same decision in Hamburg, for the same reasons: we cannot
>>>>>> expect that all users use 64-bit systems. Most Windows environments
>>>>>> are still running with 32 bits.
>>>>>>
>>>>>> Best wishes, Michael
>>>>>>
>>>>>> ---------------
>>>>>> Dr. Michael Lautenschlager
>>>>>>
>>>>>> German Climate Computing Centre (DKRZ) World Data Center Climate
>>>>>> (WDCC)
>>>>>> ADDRESS: Bundesstrasse 45a, D-20146 Hamburg, Germany
>>>>>> PHONE:   +4940-460094-118
>>>>>> E-Mail:  lautenschlager at dkrz.de
>>>>>>
>>>>>> URL:    http://www.dkrz.de/
>>>>>>         http://www.wdc-climate.de/
>>>>>>
>>>>>> V. Balaji wrote:
>>>>>>
>>>>>>> If I understood correctly the most serious 2GB problem is with
>>>>>>> Apache!
>>>>>>>
>>>>>>> Bentley, Philip writes:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Hi Stephen,
>>>>>>>>
>>>>>>>> Yes, that's true, we did create a small number of test netCDF files
>>>>>>>> in that size range. But this was because the CMOR library we used at
>>>>>>>> the time didn't include functionality for chunking the output into
>>>>>>>> smaller files. Plus we wanted to stress-test our pipeline!
>>>>>>>>
>>>>>>>> Two things have happened since then:
>>>>>>>>
>>>>>>>> 1. Jamie has been working with Charles at PCMDI to implement and
>>>>>>>> test a solution whereby we can limit the size of the output netCDF
>>>>>>>> files produced by CMOR.
>>>>>>>>
>>>>>>>> 2. We have made the local decision to limit our netCDF file sizes to
>>>>>>>> 2 GB (or thereabouts) as, logistically, that will cause us less
>>>>>>>> headache moving these files around, and it should maximise the
>>>>>>>> number of client applications in which the files can be read.
>>>>>>>>
>>>>>>>> IIRC, I think Balaji mentioned that the 64-bit offset format was
>>>>>>>> required for output from the gridspec toolset. I could be wrong.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Phil
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: go-essp-tech-bounces at ucar.edu
>>>>>>>>> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of
>>>>>>>>> stephen.pascoe at stfc.ac.uk
>>>>>>>>> Sent: 14 May 2010 10:52
>>>>>>>>> To: go-essp-tech at ucar.edu
>>>>>>>>> Subject: [Go-essp-tech] +2Gb CMIP5 files
>>>>>>>>>
>>>>>>>>> The latest UKMO extraction for CMIP5 has produced some files in the
>>>>>>>>> 30GB range.  We had discussed previously the assumption that all
>>>>>>>>> files would be <2GB.  Do we feel it is important to enforce a <2GB
>>>>>>>>> limit or should this just be a recommendation on modelling centres?
>>>>>>>>>
>>>>>>>>> To my knowledge there are two issues with +2GB files:
>>>>>>>>>
>>>>>>>>> 1. +2GB NetCDF files will be in 64-bit offset format.
>>>>>>>>> Therefore NetCDF libraries prior to v3.6 will not be able to read
>>>>>>>>> them.
>>>>>>>>> 2. Older file systems may have a 2GB file limit. This will mainly
>>>>>>>>> affect 32-bit systems that are a few years old. FAT32 has a 4GB
>>>>>>>>> limit.
>>>>>>>>>
>>>>>>>>> These are end-user issues; is there any reason why the ESG software
>>>>>>>>> might have problems with files over 2GB?  If we do want to ensure
>>>>>>>>> files are <2GB do we want to mandate the modelling centres deliver
>>>>>>>>> that or will the data centres need to split files?
>>>>>>>>>
>>>>>>>>> Stephen.
>>>>>>>>>
>>>>>>>>> ---
>>>>>>>>> Stephen Pascoe  +44 (0)1235 445980
>>>>>>>>> British Atmospheric Data Centre
>>>>>>>>> Rutherford Appleton Laboratory
>>>>>>>>> -- 
>>>>>>>>> Scanned by iCritical.
>>>>>>>>> _______________________________________________
>>>>>>>>> GO-ESSP-TECH mailing list
>>>>>>>>> GO-ESSP-TECH at ucar.edu
>>>>>>>>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>
>>>>
>>>
>>> -- 
>>> Gavin M. Bell
>>> Lawrence Livermore National Labs
>>> -- 
>>>
>>> "Never mistake a clear view for a short distance."
>>>                 -Paul Saffo
>>>
>>> (GPG Key - http://rainbow.llnl.gov/dist/keys/gavin.asc)
>>>
>>> A796 CE39 9C31 68A4 52A7  1F6B 66B7 B250 21D5 6D3E
>>
>>
>>
>
> -- 
> Gavin M. Bell
> Lawrence Livermore National Labs
> --
>
> "Never mistake a clear view for a short distance."
>       	       -Paul Saffo
>
> (GPG Key - http://rainbow.llnl.gov/dist/keys/gavin.asc)
>
> A796 CE39 9C31 68A4 52A7  1F6B 66B7 B250 21D5 6D3E
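
The two end-user concerns raised at the start of this thread (+2GB NetCDF files being in 64-bit offset format, and file systems with 2GB or 4GB limits) could be checked per file with something like the following. This is only a minimal sketch and not anything CMOR or ESG provides: the function names, the report layout, and the choice of 2 GiB and 4 GiB as thresholds are assumptions made for illustration. The format is guessed from the first four bytes of the file only.

```python
import os

# Assumed thresholds (not taken from CMOR or ESG):
# 2 GiB is where a signed 32-bit file offset overflows,
# 4 GiB is the first size that FAT32 cannot store.
LIMIT_2GIB = 2**31
LIMIT_4GIB = 2**32

# Leading "magic" bytes of the netCDF on-disk formats.
NETCDF_MAGIC = {
    b"CDF\x01": "classic",
    b"CDF\x02": "64-bit offset",
    b"CDF\x05": "64-bit data (CDF-5)",
    b"\x89HDF": "netCDF-4/HDF5",
}


def netcdf_variant(path):
    """Guess the netCDF variant from the first four bytes of the file."""
    with open(path, "rb") as handle:
        magic = handle.read(4)
    return NETCDF_MAGIC.get(magic, "unknown")


def size_warnings(size_bytes):
    """Return the portability concerns from the thread that apply to a size."""
    warnings = []
    if size_bytes >= LIMIT_2GIB:
        warnings.append("2 GiB or larger: may fail with 32-bit file offsets")
    if size_bytes >= LIMIT_4GIB:
        warnings.append("4 GiB or larger: cannot be stored on FAT32")
    return warnings


def check_file(path):
    """Hypothetical QC helper: size, format variant and warnings for one file."""
    size = os.path.getsize(path)
    return {
        "path": path,
        "size_bytes": size,
        "variant": netcdf_variant(path),
        "warnings": size_warnings(size),
    }
```

Calling check_file() on each file before publication would flag anything at or above the 2 GiB mark, and a variant of "64-bit offset" marks files that NetCDF libraries prior to v3.6 will not read.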


