[ncl-talk] Reading a large netcdf file

Dave Allured - NOAA Affiliate dave.allured at noaa.gov
Mon Dec 18 16:05:05 MST 2017


Tomoko,

Please add the "s" flag to Dennis's request.  This will show chunk size
parameters that may be relevant:

%> ncdump -hs tas_day_IPSL-CM5A-LR_rcp85_r1i1p1_20060101-22051231.rgrd.nc

There are known problems with netCDF-4 chunk sizes that can dramatically
slow down reading, in particular a chunk size that exceeds the chunk cache
size, or one that exceeds available memory.  If we know the structure
details, we may be able to suggest a solution.

I recommend chunk sizes in the approximate range of 100 Kbytes to 4 Mbytes
for large netcdf-4 files.
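
If the dump shows very large chunks, one common remedy is to rewrite the
file with smaller chunks before reading it in NCL.  A minimal sketch using
the standard netCDF "nccopy" utility (the dimension names time/lat/lon and
the chunk sizes below are assumptions; take the real names and sizes from
the ncdump -hs output):

%> nccopy -c time/1,lat/180,lon/360 \
     tas_day_IPSL-CM5A-LR_rcp85_r1i1p1_20060101-22051231.rgrd.nc \
     rechunked.nc

nccopy also accepts -h to enlarge the chunk cache used during the copy
(e.g. -h 512M), which helps when the input chunks are oversized.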

--Dave


On Mon, Dec 18, 2017 at 8:05 AM, Dennis Shea <shea at ucar.edu> wrote:

> When you have a file issue, you should include some information:
>
> (a) What does either of the following show?
>
> %> ncl_filedump tas_day_IPSL-CM5A-LR_rcp85_r1i1p1_20060101-22051231.rgrd.nc
>
> or
>
> %> ncdump -h tas_day_IPSL-CM5A-LR_rcp85_r1i1p1_20060101-22051231.rgrd.nc
>
> (b) What version of NCL are you using?
>
> %> ncl -V
>
> (c) your system info
>
> %> uname -a
>
> -----------
>
>  fdir="/root/dir4ncl/"
>  fili="tas_day_IPSL-CM5A-LR_rcp85_r1i1p1_20060101-22051231.rgrd.nc"
>  fn=addfile(fdir+fili,"r")
>  buff=fn->tas
>  print("Data is stored")
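>
> If the minimal read above also stalls, timing it will show whether the
> read itself is the bottleneck (a sketch using NCL's built-in
> get_cpu_time function):
>
>  t0 = get_cpu_time()
>  buff = fn->tas
>  print("read took " + (get_cpu_time() - t0) + " seconds")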
>
> On Mon, Dec 18, 2017 at 3:35 AM, Guido Cioni <guidocioni at gmail.com> wrote:
>
>> Tomoko,
>> 9 GB is anything but "large", although the concept of "large" is highly
>> subjective :P
>>
>> I've successfully read SINGLE netCDF files in NCL whose size was ~500 GB,
>> so size alone shouldn't be the problem. My impression, though, is that
>> for some reason a netCDF file of a given size, say 400 GB, with many
>> timesteps is read more slowly than a file of the same size with fewer
>> timesteps.
>>
>> You are setting a lot of options that I think are not needed. Did you
>> try just reading the file with this line?
>>
>> fn=addfile(fdir+fili,"r")
>>
>>
>> If it still takes a long time, it could be system-dependent. When
>> creating the variable, NCL stores it in RAM. If the system does not have
>> enough RAM, virtual memory will be created on your hard drive, which can
>> slow everything down. But honestly, I don't think you're even close to
>> saturating your system's RAM. The problem may lie somewhere else...
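>>
>> If memory ever does become the limit, one workaround (a sketch; the
>> dimension order time x lat x lon is an assumption, so check
>> ncl_filedump first) is to read the variable in slices instead of all
>> at once:
>>
>> ntim = dimsizes(fn->time)
>> do t = 0, ntim - 1, 1000
>>   te   = min((/t + 999, ntim - 1/))
>>   buff = fn->tas(t:te,:,:)
>>   ; ... process this slice before reading the next ...
>> end do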
>>
>> Let us know.
>>
>> On 18. Dec 2017, at 07:17, Tomoko Koyama <Tomoko.Koyama at Colorado.EDU>
>> wrote:
>>
>> I’d like to extract some data from a large netcdf file, whose size is
>> about 9 GB.
>>
>> The following shows the partial script, but the submitted job was killed
>> before the "Data is stored" message appeared.
>> (I attempted several times with a 3-hour max walltime.)
>>
>> setfileoption("nc", "FileStructure", "Advanced")
>> fdir="/root/dir4ncl/"
>> fili="tas_day_IPSL-CM5A-LR_rcp85_r1i1p1_20060101-22051231.rgrd.nc"
>> fn=addfile(fdir+fili,"r")
>> setfileoption("nc","Format","NetCDF4Classic")
>> buff=fn->tas
>> print("Data is stored")
>>
>> Does it simply take a long time to read?
>> Is there any way to speed up reading a large netcdf file?
>>
>>
>> Thank you,
>> Tomoko
>>
>>