[Go-essp-tech] +2Gb CMIP5 files

Gavin M Bell gavin at llnl.gov
Tue May 18 12:07:37 MDT 2010


Interesting...

I wonder what the other 4% are running? :-).

This is good information to have. I am actually glad that someone has
numbers on these things. :-).  Thanks.

Don Middleton wrote:
> Interesting question, so I inquired a bit. Our NCL/Python distribution
> is used by a community of thousands, and binary downloads for Windows are
> at about 10%, with 11% for MacOS and 75% for Linux. I think if you get into
> the impacts community, Windows would come out a lot higher than this.
> Nate, do you recall our numbers for ESG users?
> 
> don
> 
> 
> On May 18, 2010, at 10:57 AM, Gavin M Bell wrote:
> 
>> What self-respecting scientist would run windows....
>>
>> Ooops did I say that out loud :-).
>>
>> (just kidding... sort of) :-D
>>
>> Nathan Wilhelmi wrote:
>>> Hi All,
>>>
>>>    Here is a nice table summarizing the various Windows file system
>>> limits. http://www.ntfs.com/ntfs_vs_fat.htm
>>>
>>> -Nate
>>>
>>> stephen.pascoe at stfc.ac.uk wrote:
>>>> I've done some testing of these file limits this afternoon and I don't
>>>> think the filesystems will be a problem.
>>>>
>>>> From Wikipedia it appears the FAT32 file system has a 4Gb limit
>>>> (http://en.wikipedia.org/wiki/File_Allocation_Table).  That covers
>>>> Windows 95 onwards, but my Windows XP box is NTFS and has no problem
>>>> with +4Gb files.  Similarly my 32-bit Linux laptop (recent Ubuntu) can
>>>> handle +4Gb files.
>>>>
>>>> Looks like anyone with a reasonably modern system will be able to
>>>> handle +4Gb files.  We may have more problems with old NetCDF library
>>>> versions.
>>>>
>>>> S.
>>>>
>>>> ---
>>>> Stephen Pascoe  +44 (0)1235 445980
>>>> British Atmospheric Data Centre
>>>> Rutherford Appleton Laboratory
>>>>
>>>> -----Original Message-----
>>>> From: go-essp-tech-bounces at ucar.edu
>>>> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of
>>>> ag.stephens at stfc.ac.uk
>>>> Sent: 18 May 2010 09:31
>>>> To: taylor13 at llnl.gov; go-essp-tech at ucar.edu
>>>> Cc: doutriaux1 at llnl.gov
>>>> Subject: Re: [Go-essp-tech] +2Gb CMIP5 files
>>>>
>>>> Dear Karl,
>>>>
>>>> Whether we think it's advisable or not, I'm sure that some of the wider
>>>> CMIP5 user community will be looking at the outputs on Windows. I think
>>>> it is sensible to set a 2GB file size limit.
>>>>
>>>> Regards,
>>>>
>>>> Ag
>>>>
>>>> -----Original Message-----
>>>> From: go-essp-tech-bounces at ucar.edu
>>>> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of Karl Taylor
>>>> Sent: 17 May 2010 18:45
>>>> To: go-essp-tech at ucar.edu
>>>> Cc: Doutriaux, Charles
>>>> Subject: Re: [Go-essp-tech] +2Gb CMIP5 files
>>>>
>>>> Dear all,
>>>>
>>>> CMOR has code already in place for checking whether a file exceeds
>>>> 2 GB, but it is currently turned off (it was turned on for CMIP3).  We
>>>> thought it was now unnecessary.  If the feeling is that there will be
>>>> users downloading CMIP5 files to Windows machines using older operating
>>>> systems, I suppose that limiting CMIP5 files to whatever the limit is
>>>> (2 GB or 4 GB -- does anyone know which it is?) might be wise.
>>>>
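On the "2 GB or 4 GB" question above: both figures fall out of 32-bit
byte counts. A signed 32-bit offset (as used by the classic netCDF format
and by 32-bit file APIs built without large-file support) tops out just
under 2 GiB, while FAT32 records file sizes in an unsigned 32-bit field,
which gives just under 4 GiB. A small arithmetic sketch; the constant
names are made up for illustration:

```python
# Largest file size, in bytes, that a 32-bit byte count can describe.
SIGNED_32BIT_MAX = 2**31 - 1    # signed offset: classic netCDF, non-LFS 32-bit APIs
UNSIGNED_32BIT_MAX = 2**32 - 1  # unsigned size field: FAT32 per-file limit

GIB = 2**30  # bytes in one GiB

for label, limit in (("signed 32-bit", SIGNED_32BIT_MAX),
                     ("unsigned 32-bit", UNSIGNED_32BIT_MAX)):
    # Print the limit in bytes and in GiB for comparison.
    print(f"{label}: {limit} bytes = {limit / GIB:.6f} GiB")
```

So a 2 GB cap is the conservative choice: it covers both cases, whereas
a 4 GB cap would only cover the FAT32 one.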
>>>> On the other hand, will anyone use a Windows machine to look at netCDF
>>>> files?  If not, maybe this is a non-issue.
>>>>
>>>> Karl
>>>>
>>>> On 5/16/10 12:08 PM, stephen.pascoe at stfc.ac.uk wrote:
>>>>
>>>>> I think I raised undue alarm here when suggesting we might be dealing
>>>>> with +2GB files.  Thanks Phil for clarifying that UKMO is still
>>>>> planning to limit itself to <2GB files.
>>>>>
>>>>> I am wondering what the policy should be here?  My first thought is
>>>>> that modeling centres will mainly make the same decision as UKMO since
>>>>> it is in their interest for their model output to be widely used.
>>>>> However, enforcement could be difficult.  The logical place to enforce
>>>>> the limit is in the level 1 QC but CMOR doesn't do this so it will be a
>>>>> problem for people running datanodes.
>>>>>
>>>>> I suggest we make a strong recommendation to supply data in <2GB files
>>>>> and enforce it during level-2 QC before replicating.
>>>>
>>>>> S.
>>>>>
>>>>> -----Original Message-----
>>>>> From: go-essp-tech-bounces at ucar.edu on behalf of Michael
>>>>> Lautenschlager
>>>>> Sent: Sun 5/16/2010 1:35 PM
>>>>> To: V. Balaji
>>>>> Cc: go-essp-tech at ucar.edu
>>>>> Subject: Re: [Go-essp-tech] +2Gb CMIP5 files
>>>>>
>>>>> Hello *,
>>>>>
>>>>> we strongly support Phil's decision for data files less than 2 GB. We
>>>>> made that decision in Hamburg for the same reasons, because we cannot
>>>>> expect that all users use 64-bit systems. Most Windows environments
>>>>> are still running with 32 bits.
>>>>>
>>>>> Best wishes, Michael
>>>>>
>>>>> ---------------
>>>>> Dr. Michael Lautenschlager
>>>>>
>>>>> German Climate Computing Centre (DKRZ) World Data Center Climate
>>>>> (WDCC)
>>>>> ADDRESS: Bundesstrasse 45a, D-20146 Hamburg, Germany
>>>>> PHONE:   +4940-460094-118
>>>>> E-Mail:  lautenschlager at dkrz.de
>>>>>
>>>>> URL:    http://www.dkrz.de/
>>>>>          http://www.wdc-climate.de/
>>>>>
>>>>> V. Balaji wrote:
>>>>>
>>>>>
>>>>>> If I understood correctly, the most serious 2Gb problem is with
>>>>>> Apache!
>>>>
>>>>>> Bentley, Philip writes:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Hi Stephen,
>>>>>>>
>>>>>>> Yes, that's true, we did create a small number of test netCDF files
>>>>>>> in that size range. But this was because the CMOR library we used at
>>>>>>> the time didn't include functionality for chunking the output into
>>>>>>> smaller files. Plus we wanted to stress-test our pipeline!
>>>>>>>
>>>>>>> Two things have happened since then:
>>>>>>>
>>>>>>> 1. Jamie has been working with Charles at PCMDI to implement and
>>>>>>> test a solution whereby we can limit the size of the output netCDF
>>>>>>> files produced by CMOR.
>>>>>>>
>>>>>>> 2. We have made the local decision to limit our netCDF file sizes to
>>>>>>> 2 GB (or thereabouts) as, logistically, that will cause us less
>>>>>>> headache moving these files around, and it should maximise the
>>>>>>> number of client applications in which the files can be read.
>>>>>>>
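The chunking idea in the two points above can be sketched as a planning
step that decides how many time steps go into each output file. This is
only an illustration under stated assumptions, not how CMOR does it:
plan_time_chunks and its parameters are hypothetical names, the split is
assumed to be along the time axis, and header and metadata overhead is
ignored.

```python
def plan_time_chunks(n_steps, bytes_per_step, max_bytes=2**31 - 1):
    """Return consecutive (start, stop) time-step ranges, one per output file.

    Each range holds as many steps as fit within max_bytes of data.
    Hypothetical helper: header/metadata overhead is not accounted for.
    """
    if bytes_per_step <= 0:
        raise ValueError("bytes_per_step must be positive")
    steps_per_file = max_bytes // bytes_per_step
    if steps_per_file == 0:
        raise ValueError("a single time step exceeds the size limit")
    # Walk the time axis in strides of steps_per_file; the last chunk may be short.
    return [(start, min(start + steps_per_file, n_steps))
            for start in range(0, n_steps, steps_per_file)]


# Toy numbers: 10 steps of 3 bytes each with a 10-byte budget per file.
print(plan_time_chunks(10, 3, max_bytes=10))
```

In practice one would leave some headroom below the limit for the file
header and coordinate variables.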
>>>>>>> IIRC, I think Balaji mentioned that the 64-bit offset format was
>>>>>>> required for output from the gridspec toolset. I could be wrong.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Phil
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: go-essp-tech-bounces at ucar.edu
>>>>>>>> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of
>>>>>>>> stephen.pascoe at stfc.ac.uk
>>>>>>>> Sent: 14 May 2010 10:52
>>>>>>>> To: go-essp-tech at ucar.edu
>>>>>>>> Subject: [Go-essp-tech] +2Gb CMIP5 files
>>>>>>>>
>>>>>>>> The latest UKMO extraction for CMIP5 has produced some files in the
>>>>>>>> 30Gb range.  We had discussed previously the assumption that all
>>>>>>>> files would be <2Gb.  Do we feel it is important to enforce a <2Gb
>>>>>>>> limit or should this just be a recommendation to modelling centres?
>>>>>>>>
>>>>>>>> To my knowledge there are two issues with +2Gb files:
>>>>>>>>
>>>>>>>>  1. +2GB NetCDF files will be in 64-bit offset format.
>>>>>>>> Therefore NetCDF libraries prior to v3.6 will not be able to read
>>>>>>>> them.
>>>>>>>>  2. Older file systems may have a 2Gb file limit. This will mainly
>>>>>>>> affect 32-bit systems that are a few years old. FAT32 has a 4Gb
>>>>>>>> limit.
>>>>>>>>
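Both issues in the list above can be checked on a given file without any
netCDF library. The sketch below reads the first four bytes, which
distinguish the classic format (CDF\x01), the 64-bit offset format
(CDF\x02) and netCDF-4/HDF5 (\x89HDF), and compares the size on disk with
a 2 GiB threshold. The function names and the returned dictionary are
illustrative only, and it assumes the signature sits at byte 0 of the
file.

```python
import os

TWO_GIB = 2**31  # threshold in bytes; files at or above this count as "+2Gb"


def netcdf_variant(path):
    """Label the on-disk netCDF format from the file's first four bytes."""
    with open(path, "rb") as f:
        magic = f.read(4)
    if magic == b"CDF\x01":
        return "classic"
    if magic == b"CDF\x02":
        return "64-bit offset"  # needs netCDF library 3.6 or later to read
    if magic == b"\x89HDF":
        return "netCDF-4/HDF5"
    return "unknown"


def check_file(path, limit=TWO_GIB):
    """Report size, whether it reaches the limit, and the format variant."""
    size = os.path.getsize(path)
    return {"size": size,
            "over_limit": size >= limit,
            "variant": netcdf_variant(path)}
```

Calling check_file("some_output.nc") on a data node would flag files that
either reach the size limit or are already in 64-bit offset format; the
file name here is a placeholder.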
>>>>>>>> These are end-user issues; is there any reason why the ESG software
>>>>>>>> might have problems with files over 2Gb?  If we do want to ensure
>>>>>>>> files are <2Gb, do we want to mandate that the modelling centres
>>>>>>>> deliver that, or will the data centres need to split files?
>>>>>>>>
>>>>>>>> Stephen.
>>>>>>>>
>>>>>>>> ---
>>>>>>>> Stephen Pascoe  +44 (0)1235 445980
>>>>>>>> British Atmospheric Data Centre
>>>>>>>> Rutherford Appleton Laboratory
>>>>>>>> -- 
>>>>>>>> Scanned by iCritical.
>>>>>>>> _______________________________________________
>>>>>>>> GO-ESSP-TECH mailing list
>>>>>>>> GO-ESSP-TECH at ucar.edu
>>>>>>>> http://***mailman.ucar.edu/mailman/listinfo/go-essp-tech
>>
>> -- 
>> Gavin M. Bell
>> Lawrence Livermore National Labs
>> -- 
>>
>> "Never mistake a clear view for a short distance."
>>                  -Paul Saffo
>>
>> (GPG Key - http://rainbow.llnl.gov/dist/keys/gavin.asc)
>>
>> A796 CE39 9C31 68A4 52A7  1F6B 66B7 B250 21D5 6D3E
> 
> 
> 

-- 
Gavin M. Bell
Lawrence Livermore National Labs
--

 "Never mistake a clear view for a short distance."
       	       -Paul Saffo

(GPG Key - http://rainbow.llnl.gov/dist/keys/gavin.asc)

 A796 CE39 9C31 68A4 52A7  1F6B 66B7 B250 21D5 6D3E

