[Go-essp-tech] +2Gb CMIP5 files
Don Middleton
don at ucar.edu
Tue May 18 12:18:28 MDT 2010
On May 18, 2010, at 12:07 PM, Gavin M Bell wrote:
> Interesting...
>
> I wonder what the other 4% are running? :-).
AIX, SunOS, that sort of thing. As I recall, Windows users were a fair
bit higher than this on ESG.
don
>
> This is good information to have. I am actually glad that someone has
> numbers on these things. :-). Thanks.
>
> Don Middleton wrote:
>> Interesting question, so I inquired a bit. Our NCL/Python distribution
>> is used by a community of thousands, and binary downloads for Windows
>> are at about 10%, with 11% for MacOS, and 75% Linux. I think if you get
>> into the impacts community, Windows would come out a lot higher than this.
>> Nate, do you recall our numbers for ESG users?
>>
>> don
>>
>>
>> On May 18, 2010, at 10:57 AM, Gavin M Bell wrote:
>>
>>> What self-respecting scientist would run Windows....
>>>
>>> Ooops did I say that out loud :-).
>>>
>>> (just kidding... sort of) :-D
>>>
>>> Nathan Wilhelmi wrote:
>>>> Hi All,
>>>>
>>>> Here is a nice table summarizing the various Windows file system
>>>> limits. http://www.ntfs.com/ntfs_vs_fat.htm
>>>>
>>>> -Nate
>>>>
>>>> stephen.pascoe at stfc.ac.uk wrote:
>>>>> I've done some testing of these file limits this afternoon and I
>>>>> don't think the filesystems will be a problem.
>>>>>
>>>>> From Wikipedia it appears the FAT32 file system has a 4Gb limit
>>>>> (http://en.wikipedia.org/wiki/File_Allocation_Table). That covers
>>>>> Windows 95 onwards but my Windows XP box is NTFS and has no problem
>>>>> with +4Gb files. Similarly my 32-bit Linux laptop (recent Ubuntu)
>>>>> can handle +4Gb files.
>>>>>
>>>>> Looks like anyone with a reasonably modern system will be able to
>>>>> handle +4Gb files. We may have more problems with old NetCDF library
>>>>> versions.
>>>>>
>>>>> S.
>>>>>
>>>>> ---
>>>>> Stephen Pascoe +44 (0)1235 445980
>>>>> British Atmospheric Data Centre
>>>>> Rutherford Appleton Laboratory
>>>>>
>>>>> -----Original Message-----
>>>>> From: go-essp-tech-bounces at ucar.edu
>>>>> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of
>>>>> ag.stephens at stfc.ac.uk
>>>>> Sent: 18 May 2010 09:31
>>>>> To: taylor13 at llnl.gov; go-essp-tech at ucar.edu
>>>>> Cc: doutriaux1 at llnl.gov
>>>>> Subject: Re: [Go-essp-tech] +2Gb CMIP5 files
>>>>>
>>>>> Dear Karl,
>>>>>
>>>>> Whether we think it's advisable or not, I'm sure that some of the
>>>>> wider CMIP5 user community will be looking at the outputs on Windows.
>>>>> I think it is sensible to set a 2GB file size limit.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Ag
>>>>>
>>>>> -----Original Message-----
>>>>> From: go-essp-tech-bounces at ucar.edu
>>>>> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of Karl Taylor
>>>>> Sent: 17 May 2010 18:45
>>>>> To: go-essp-tech at ucar.edu
>>>>> Cc: Doutriaux, Charles
>>>>> Subject: Re: [Go-essp-tech] +2Gb CMIP5 files
>>>>>
>>>>> Dear all,
>>>>>
>>>>> CMOR has code already in place for checking whether a file exceeds
>>>>> 2 GB, but it is currently turned off (it was turned on for CMIP3). We
>>>>> thought it was now unnecessary. If the feeling is that there will be
>>>>> users downloading CMIP5 files to Windows machines using older
>>>>> operating systems, I suppose that limiting CMIP5 files to whatever
>>>>> the limit is (2 GB or 4 GB -- does anyone know which it is?) might be
>>>>> wise.
>>>>>
>>>>> On the other hand, will anyone use a Windows machine to look at
>>>>> netCDF files? If not, maybe this is a non-issue.
>>>>>
>>>>> Karl
>>>>>
>>>>> On 5/16/10 12:08 PM, stephen.pascoe at stfc.ac.uk wrote:
>>>>>
>>>>>> I think I raised undue alarm here when suggesting we might be
>>>>>> dealing with +2GB files. Thanks Phil for clarifying that UKMO is
>>>>>> still planning to limit itself to <2GB files.
>>>>>>
>>>>>> I am wondering what the policy should be here? My first thought is
>>>>>> that modeling centres will mainly make the same decision as UKMO
>>>>>> since it is in their interest for their model output to be widely
>>>>>> used. However, enforcement could be difficult. The logical place to
>>>>>> enforce the limit is in the level 1 QC but CMOR doesn't do this so
>>>>>> it will be a problem for people running datanodes.
>>>>>>
>>>>>> I suggest we make a strong recommendation to supply data in <2GB
>>>>>> files and enforce it during level-2 QC before replicating.
>>>>>>
>>>>>> S.
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: go-essp-tech-bounces at ucar.edu on behalf of Michael
>>>>>> Lautenschlager
>>>>>> Sent: Sun 5/16/2010 1:35 PM
>>>>>> To: V. Balaji
>>>>>> Cc: go-essp-tech at ucar.edu
>>>>>> Subject: Re: [Go-essp-tech] +2Gb CMIP5 files
>>>>>>
>>>>>> Hello *,
>>>>>>
>>>>>> we strongly support Phil's decision for data files less than 2 GB.
>>>>>> We made the same decision in Hamburg, for the same reasons, because
>>>>>> we cannot expect all users to have 64-bit systems. Most Windows
>>>>>> environments are still running with 32 bits.
>>>>>>
>>>>>> Best wishes, Michael
>>>>>>
>>>>>> ---------------
>>>>>> Dr. Michael Lautenschlager
>>>>>>
>>>>>> German Climate Computing Centre (DKRZ) World Data Center Climate
>>>>>> (WDCC)
>>>>>> ADDRESS: Bundesstrasse 45a, D-20146 Hamburg, Germany
>>>>>> PHONE: +4940-460094-118
>>>>>> E-Mail: lautenschlager at dkrz.de
>>>>>>
>>>>>> URL: http://www.dkrz.de/
>>>>>> http://www.wdc-climate.de/
>>>>>>
>>>>>> V. Balaji wrote:
>>>>>>
>>>>>>
>>>>>>> If I understood correctly the most serious 2Gb problem is with
>>>>>>> Apache!
>>>>>>>
>>>>>>> Bentley, Philip writes:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Hi Stephen,
>>>>>>>>
>>>>>>>> Yes, that's true, we did create a small number of test netCDF
>>>>>>>> files in that size range. But this was because the CMOR library
>>>>>>>> we used at the time didn't include functionality for chunking the
>>>>>>>> output into smaller files. Plus we wanted to stress-test our
>>>>>>>> pipeline!
>>>>>>>>
>>>>>>>> Two things have happened since then:
>>>>>>>>
>>>>>>>> 1. Jamie has been working with Charles at PCMDI to implement and
>>>>>>>> test a solution whereby we can limit the size of the output
>>>>>>>> netCDF files produced by CMOR.
>>>>>>>>
>>>>>>>> 2. We have made the local decision to limit our netCDF file sizes
>>>>>>>> to 2 GB (or thereabouts) as, logistically, that will cause us
>>>>>>>> less headache moving these files around, and it should maximise
>>>>>>>> the number of client applications in which the files can be read.
>>>>>>>>
>>>>>>>> IIRC, I think Balaji mentioned that the 64-bit offset format was
>>>>>>>> required for output from the gridspec toolset. I could be wrong.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Phil
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: go-essp-tech-bounces at ucar.edu
>>>>>>>>> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of
>>>>>>>>> stephen.pascoe at stfc.ac.uk
>>>>>>>>> Sent: 14 May 2010 10:52
>>>>>>>>> To: go-essp-tech at ucar.edu
>>>>>>>>> Subject: [Go-essp-tech] +2Gb CMIP5 files
>>>>>>>>>
>>>>>>>>> The latest UKMO extraction for CMIP5 has produced some files in
>>>>>>>>> the 30Gb range. We had discussed previously the assumption that
>>>>>>>>> all files would be <2Gb. Do we feel it is important to enforce
>>>>>>>>> a <2Gb limit or should this just be a recommendation on
>>>>>>>>> modelling centres?
>>>>>>>>>
>>>>>>>>> To my knowledge there are two issues with +2Gb files:
>>>>>>>>>
>>>>>>>>> 1. +2GB NetCDF files will be in 64-bit offset format.
>>>>>>>>> Therefore NetCDF libraries prior to v3.6 will not be able to
>>>>>>>>> read them.
>>>>>>>>> 2. Older file systems may have a 2Gb file limit. This will
>>>>>>>>> mainly affect 32-bit systems that are a few years old. FAT32 has
>>>>>>>>> a 4Gb limit.
>>>>>>>>>
>>>>>>>>> These are end-user issues. Is there any reason why the ESG
>>>>>>>>> software might have problems with files over 2Gb? If we do want
>>>>>>>>> to ensure files are <2Gb, do we want to mandate that the
>>>>>>>>> modelling centres deliver that, or will the data centres need to
>>>>>>>>> split files?
>>>>>>>>>
>>>>>>>>> Stephen.
>>>>>>>>>
>>>>>>>>> ---
>>>>>>>>> Stephen Pascoe +44 (0)1235 445980
>>>>>>>>> British Atmospheric Data Centre
>>>>>>>>> Rutherford Appleton Laboratory
>>>>>>>>> --
>>>>>>>>> Scanned by iCritical.
>>>>>>>>> _______________________________________________
>>>>>>>>> GO-ESSP-TECH mailing list
>>>>>>>>> GO-ESSP-TECH at ucar.edu
>>>>>>>>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>> --
>>> Gavin M. Bell
>>> Lawrence Livermore National Labs
>>> --
>>>
>>> "Never mistake a clear view for a short distance."
>>> -Paul Saffo
>>>
>>> (GPG Key - http://rainbow.llnl.gov/dist/keys/gavin.asc)
>>>
>>> A796 CE39 9C31 68A4 52A7 1F6B 66B7 B250 21D5 6D3E
>>
>>
>>
>
> --
> Gavin M. Bell
> Lawrence Livermore National Labs
> --
>
> "Never mistake a clear view for a short distance."
> -Paul Saffo
>
> (GPG Key - http://rainbow.llnl.gov/dist/keys/gavin.asc)
>
> A796 CE39 9C31 68A4 52A7 1F6B 66B7 B250 21D5 6D3E
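
For anyone wanting to check their own files against the two issues
raised at the start of this thread (64-bit offset format, and the 2 GB
and 4 GB size limits), here is a minimal Python sketch. It is not CMOR's
check or any ESG tool: the function names are made up for illustration.
It assumes the format can be told from the first four bytes of the file
(the classic and 64-bit offset magic numbers, and the HDF5 signature at
offset 0 for netCDF-4), that "2 GB" means the 2**31-byte boundary of a
signed 32-bit offset, and that "4 GB" means FAT32's 2**32 - 1 byte cap.

```python
import os

# Size thresholds discussed in the thread, in bytes (assumed meanings).
LIMIT_2GB = 2**31  # a signed 32-bit integer holds at most 2**31 - 1
LIMIT_4GB = 2**32  # FAT32 caps a single file at 2**32 - 1 bytes

# Leading four bytes of each on-disk variant.
MAGIC = {
    b"CDF\x01": "classic",
    b"CDF\x02": "64-bit offset",
    b"\x89HDF": "netCDF-4/HDF5",  # any HDF5 file starts this way
}


def netcdf_variant(path):
    """Return the variant named by the first four bytes of the file."""
    with open(path, "rb") as f:
        return MAGIC.get(f.read(4), "unknown")


def size_warnings(size_bytes):
    """List the size concerns from the thread that apply to this size."""
    warnings = []
    if size_bytes >= LIMIT_2GB:
        warnings.append("2 GB or more: tools with 32-bit offsets may fail")
    if size_bytes >= LIMIT_4GB:
        warnings.append("4 GB or more: cannot be stored on FAT32")
    return warnings


def check_file(path):
    """Combine the format check and the size check for one file."""
    return netcdf_variant(path), size_warnings(os.path.getsize(path))
```

Calling check_file() on a downloaded file returns a pair such as
("64-bit offset", [...]) that a QC script could log or reject on. A
"64-bit offset" result is the case that libraries older than v3.6 cannot
read, whatever the file size.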