[Go-essp-tech] +2Gb CMIP5 files

Nathan Wilhelmi wilhelmi at ucar.edu
Tue May 18 13:26:21 MDT 2010


Hi - Last time we dug into this, >50% of the users on the old ESG system 
were on IE. We didn't look at how much of that was IE on Mac, though. What 
we did find is that our browser statistics pretty closely tracked many of 
the typical public market share listings. 
http://marketshare.hitslink.com/browser-market-share.aspx?qprid=0

Unfortunately wget doesn't list the OS in the user-agent header, so we 
really don't know what OS it's being run from, nor do we really know 
what OS the files were used on.

-Nate

Don Middleton wrote:
>
> On May 18, 2010, at 12:07 PM, Gavin M Bell wrote:
>
>> Interesting...
>>
>> I wonder what the other 4% are running? :-).
>
> AIX, SunOS, that sort of thing. As I recall, Windows users were a fair 
> bit higher than this on ESG.
>
> don
>
>
>>
>> This is good information to have. I am actually glad that someone has
>> numbers on these things. :-).  Thanks.
>>
>> Don Middleton wrote:
>>> Interesting question, so I inquired a bit. Our NCL/Python distribution
>>> is used by a community of thousands, and binary downloads for Windows
>>> are at about 10%, with 11% for MacOS and 75% for Linux. I think if you
>>> get into the impacts community, Windows would come out a lot higher
>>> than this. Nate, do you recall our numbers for ESG users?
>>>
>>> don
>>>
>>>
>>> On May 18, 2010, at 10:57 AM, Gavin M Bell wrote:
>>>
>>>> What self-respecting scientist would run windows....
>>>>
>>>> Ooops did I say that out loud :-).
>>>>
>>>> (just kidding... sort of) :-D
>>>>
>>>> Nathan Wilhelmi wrote:
>>>>> Hi All,
>>>>>
>>>>>   Here is a nice table summarizing the various Windows file system
>>>>> limits. http://www.ntfs.com/ntfs_vs_fat.htm
>>>>>
>>>>> -Nate
>>>>>
>>>>> stephen.pascoe at stfc.ac.uk wrote:
>>>>>> I've done some testing of these file limits this afternoon and I don't
>>>>>> think the filesystems will be a problem.
>>>>>>
>>>>>> From Wikipedia it appears the FAT32 file system has a 4GB limit
>>>>>> (http://en.wikipedia.org/wiki/File_Allocation_Table).  That covers
>>>>>> Windows 95 onwards, but my Windows XP box is NTFS and has no problem
>>>>>> with +4GB files.  Similarly, my 32-bit Linux laptop (recent Ubuntu)
>>>>>> can handle +4GB files.
>>>>>>
>>>>>> Looks like anyone with a reasonably modern system will be able to
>>>>>> handle +4GB files.  We may have more problems with old NetCDF library
>>>>>> versions.
>>>>>>
>>>>>> S.
>>>>>>
>>>>>> ---
>>>>>> Stephen Pascoe  +44 (0)1235 445980
>>>>>> British Atmospheric Data Centre
>>>>>> Rutherford Appleton Laboratory
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: go-essp-tech-bounces at ucar.edu
>>>>>> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of
>>>>>> ag.stephens at stfc.ac.uk
>>>>>> Sent: 18 May 2010 09:31
>>>>>> To: taylor13 at llnl.gov; go-essp-tech at ucar.edu
>>>>>> Cc: doutriaux1 at llnl.gov
>>>>>> Subject: Re: [Go-essp-tech] +2Gb CMIP5 files
>>>>>>
>>>>>> Dear Karl,
>>>>>>
>>>>>> Whether we think it's advisable or not, I'm sure that some of the wider
>>>>>> CMIP5 user community will be looking at the outputs on Windows. I think
>>>>>> it is sensible to set a 2GB file size limit.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Ag
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: go-essp-tech-bounces at ucar.edu
>>>>>> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of Karl Taylor
>>>>>> Sent: 17 May 2010 18:45
>>>>>> To: go-essp-tech at ucar.edu
>>>>>> Cc: Doutriaux, Charles
>>>>>> Subject: Re: [Go-essp-tech] +2Gb CMIP5 files
>>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> CMOR has code already in place for checking whether a file exceeds 2 GB,
>>>>>> but it is currently turned off (it was turned on for CMIP3).  We thought
>>>>>> it was now unnecessary.  If the feeling is that there will be users
>>>>>> downloading CMIP5 files to Windows machines using older operating
>>>>>> systems, I suppose that limiting CMIP5 files to whatever the limit is
>>>>>> (2 GB or 4 GB -- does anyone know which it is?) might be wise.
>>>>>>
>>>>>> On the other hand, will anyone use a Windows machine to look at netCDF
>>>>>> files?  If not, maybe this is a non-issue.
>>>>>>
>>>>>> Karl
>>>>>>
>>>>>> On 5/16/10 12:08 PM, stephen.pascoe at stfc.ac.uk wrote:
>>>>>>
>>>>>>> I think I raised undue alarm here when suggesting we might be dealing
>>>>>>> with +2GB files.  Thanks Phil for clarifying that UKMO is still
>>>>>>> planning to limit itself to <2GB files.
>>>>>>>
>>>>>>> I am wondering what the policy should be here?  My first thought is
>>>>>>> that modeling centres will mainly make the same decision as UKMO since
>>>>>>> it is in their interest for their model output to be widely used.
>>>>>>> However, enforcement could be difficult.  The logical place to enforce
>>>>>>> the limit is in the level 1 QC but CMOR doesn't do this so it will be
>>>>>>> a problem for people running datanodes.
>>>>>>>
>>>>>>> I suggest we make a strong recommendation to supply data in <2GB files
>>>>>>> and enforce it during level-2 QC before replicating.
>>>>>>>
>>>>>>> S.
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: go-essp-tech-bounces at ucar.edu on behalf of Michael
>>>>>>> Lautenschlager
>>>>>>> Sent: Sun 5/16/2010 1:35 PM
>>>>>>> To: V. Balaji
>>>>>>> Cc: go-essp-tech at ucar.edu
>>>>>>> Subject: Re: [Go-essp-tech] +2Gb CMIP5 files
>>>>>>>
>>>>>>> Hello *,
>>>>>>>
>>>>>>> we strongly support Phil's decision for data files less than 2 GB. We
>>>>>>> made that decision in Hamburg for the same reasons, because we cannot
>>>>>>> expect that all users use 64-bit systems. Most Windows environments
>>>>>>> are still running with 32 bits.
>>>>>>>
>>>>>>> Best wishes, Michael
>>>>>>>
>>>>>>> ---------------
>>>>>>> Dr. Michael Lautenschlager
>>>>>>>
>>>>>>> German Climate Computing Centre (DKRZ) World Data Center Climate
>>>>>>> (WDCC)
>>>>>>> ADDRESS: Bundesstrasse 45a, D-20146 Hamburg, Germany
>>>>>>> PHONE:   +4940-460094-118
>>>>>>> E-Mail:  lautenschlager at dkrz.de
>>>>>>>
>>>>>>> URL:    http://www.dkrz.de/
>>>>>>>         http://www.wdc-climate.de/
>>>>>>>
>>>>>>> V. Balaji wrote:
>>>>>>>
>>>>>>>> If I understood correctly, the most serious 2GB problem is with
>>>>>>>> Apache!
>>>>>>>>
>>>>>>>> Bentley, Philip writes:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Hi Stephen,
>>>>>>>>>
>>>>>>>>> Yes, that's true, we did create a small number of test netCDF files
>>>>>>>>> in that size range. But this was because the CMOR library we used at
>>>>>>>>> the time didn't include functionality for chunking the output into
>>>>>>>>> smaller files. Plus we wanted to stress-test our pipeline!
>>>>>>>>>
>>>>>>>>> Two things have happened since then:
>>>>>>>>>
>>>>>>>>> 1. Jamie has been working with Charles at PCMDI to implement and
>>>>>>>>> test a solution whereby we can limit the size of the output netCDF
>>>>>>>>> files produced by CMOR.
>>>>>>>>>
>>>>>>>>> 2. We have made the local decision to limit our netCDF file sizes to
>>>>>>>>> 2 GB (or thereabouts) as, logistically, that will cause us less
>>>>>>>>> headache moving these files around, and it should maximise the
>>>>>>>>> number of client applications in which the files can be read.
>>>>>>>>>
>>>>>>>>> IIRC, I think Balaji mentioned that the 64-bit offset format was
>>>>>>>>> required for output from the gridspec toolset. I could be wrong.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Phil
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: go-essp-tech-bounces at ucar.edu
>>>>>>>>>> [mailto:go-essp-tech-bounces at ucar.edu] On Behalf Of
>>>>>>>>>> stephen.pascoe at stfc.ac.uk
>>>>>>>>>> Sent: 14 May 2010 10:52
>>>>>>>>>> To: go-essp-tech at ucar.edu
>>>>>>>>>> Subject: [Go-essp-tech] +2Gb CMIP5 files
>>>>>>>>>>
>>>>>>>>>> The latest UKMO extraction for CMIP5 has produced some files in the
>>>>>>>>>> 30GB range.  We had discussed previously the assumption that all
>>>>>>>>>> files would be <2GB.  Do we feel it is important to enforce a <2GB
>>>>>>>>>> limit or should this just be a recommendation to modelling centres?
>>>>>>>>>>
>>>>>>>>>> To my knowledge there are two issues with +2GB files:
>>>>>>>>>>
>>>>>>>>>> 1. +2GB NetCDF files will be in 64-bit offset format.
>>>>>>>>>> Therefore NetCDF libraries prior to v3.6 will not be able to read
>>>>>>>>>> them.
>>>>>>>>>> 2. Older file systems may have a 2GB file limit. This will mainly
>>>>>>>>>> affect 32-bit systems that are a few years old. FAT32 has a 4GB
>>>>>>>>>> limit.
>>>>>>>>>>
>>>>>>>>>> These are end-user issues; is there any reason why the ESG software
>>>>>>>>>> might have problems with files over 2GB?  If we do want to ensure
>>>>>>>>>> files are <2GB, do we want to mandate that the modelling centres
>>>>>>>>>> deliver that, or will the data centres need to split files?
>>>>>>>>>>
>>>>>>>>>> Stephen.
>>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>> Stephen Pascoe  +44 (0)1235 445980
>>>>>>>>>> British Atmospheric Data Centre
>>>>>>>>>> Rutherford Appleton Laboratory
>>>>>>>>>> -- 
>>>>>>>>>> Scanned by iCritical.
>>>>>>>>>> _______________________________________________
>>>>>>>>>> GO-ESSP-TECH mailing list
>>>>>>>>>> GO-ESSP-TECH at ucar.edu
>>>>>>>>>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>>>>
>>>> -- 
>>>> Gavin M. Bell
>>>> Lawrence Livermore National Labs
>>>> -- 
>>>>
>>>> "Never mistake a clear view for a short distance."
>>>>                 -Paul Saffo
>>>>
>>>> (GPG Key - http://rainbow.llnl.gov/dist/keys/gavin.asc)
>>>>
>>>> A796 CE39 9C31 68A4 52A7  1F6B 66B7 B250 21D5 6D3E
>>>
>>>
>>>
>>
>> -- 
>> Gavin M. Bell
>> Lawrence Livermore National Labs
>> -- 
>>
>> "Never mistake a clear view for a short distance."
>>                  -Paul Saffo
>>
>> (GPG Key - http://rainbow.llnl.gov/dist/keys/gavin.asc)
>>
>> A796 CE39 9C31 68A4 52A7  1F6B 66B7 B250 21D5 6D3E
>
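
For anyone wanting to check their own files against the two issues raised
in this thread (the 2 GB size limit and the 64-bit offset format), here is
a minimal sketch in Python. It is not CMOR's check and not part of any ESG
tool; the function names are made up for illustration. The only facts it
relies on are that a netCDF-3 file starts with the bytes "CDF" followed by
a version byte (1 for classic, 2 for 64-bit offset), and that a signed
32-bit file offset cannot address 2**31 bytes or more.

```python
import os

# 2 GiB in bytes. A signed 32-bit offset tops out at 2**31 - 1, so any
# file of this size or larger is out of reach for 32-bit-offset software.
TWO_GIB = 2**31

# The first four bytes of a netCDF-3 file identify its on-disk format.
NETCDF3_MAGIC = {
    b"CDF\x01": "classic",
    b"CDF\x02": "64-bit offset",
}


def netcdf3_format(path):
    """Return 'classic' or '64-bit offset', or None if the magic is not netCDF-3."""
    with open(path, "rb") as f:
        return NETCDF3_MAGIC.get(f.read(4))


def exceeds_limit(path, limit=TWO_GIB):
    """Return True if the file is at or above the given size limit in bytes."""
    return os.path.getsize(path) >= limit


def needs_attention(path):
    """Flag files that older clients may not read: too big, or 64-bit offset."""
    return exceeds_limit(path) or netcdf3_format(path) == "64-bit offset"
```

Running needs_attention() over a directory of output files before
publication would flag anything that a pre-3.6 NetCDF library or a file
system with a 2 GB limit is likely to choke on.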



