[ncl-talk] Reading a large netcdf file
Tomoko Koyama
Tomoko.Koyama at Colorado.EDU
Mon Dec 18 17:50:43 MST 2017
Thank you very much, Dave.
I hope the following is showing enough information to nail down the cause.
[koyama at login03 IPSL-CM5A-LR]$ ncdump -hs tas_day_IPSL-CM5A-LR_rcp85_r1i1p1_20060101-22051231.rgrd.nc
netcdf tas_day_IPSL-CM5A-LR_rcp85_r1i1p1_20060101-22051231.rgrd {
dimensions:
time = 73000 ;
i = 180 ;
j = 180 ;
variables:
int time(time) ;
string time:long_name = "time" ;
string time:units = "month" ;
time:_Storage = "contiguous" ;
time:_Endianness = "little" ;
float i(i) ;
string i:units = "none" ;
string i:long_name = "i index" ;
i:_Storage = "contiguous" ;
float j(j) ;
string j:units = "none" ;
string j:long_name = "j index" ;
j:_Storage = "contiguous" ;
float tas(time, i, j) ;
string tas:associated_files = "baseURL: http://cmip-pcmdi.llnl.gov/CMIP5/dataLocation gridspecFile: gridspec_atmos_fx_IPSL-CM5A-LR_rcp85_r0i0p0.nc areacella: areacella_fx_IPSL-CM5A-LR_rcp85_r0i0p0.nc" ;
string tas:coordinates = "height" ;
string tas:history = "2011-08-16T22:13:26Z altered by CMOR: Treated scalar dimension: \'height\'. 2011-08-16T22:13:26Z altered by CMOR: replaced missing value flag (9.96921e+36) with standard missing value (1e+20). 2011-08-16T22:13:45Z altered by CMOR: Inverted axis: lat." ;
string tas:cell_measures = "area: areacella" ;
string tas:cell_methods = "time: mean (interval: 30 minutes)" ;
string tas:original_name = "t2m" ;
string tas:units = "K" ;
string tas:long_name = "Near-Surface Air Temperature" ;
string tas:standard_name = "air_temperature" ;
string tas:remap = "remapped via ESMF_regrid_with_weights: Bilinear" ;
tas:missing_value = 1.e+20f ;
tas:_FillValue = 1.e+20f ;
tas:_Storage = "contiguous" ;
float lat(j, i) ;
string lat:long_name = "latitude" ;
string lat:units = "degrees_north" ;
lat:_FillValue = -999.f ;
lat:_Storage = "contiguous" ;
float lon(j, i) ;
string lon:long_name = "longitude" ;
string lon:units = "degrees_east" ;
lon:_FillValue = -999.f ;
lon:_Storage = "contiguous" ;
// global attributes:
:_Format = "netCDF-4" ;
}
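A rough size estimate from the dimensions above may help when judging memory needs. The sketch below is only arithmetic on the header values: it assumes 4-byte floats for tas, and the 1 GiB memory budget and the helper names are hypothetical, chosen for illustration. The header reports contiguous (unchunked) storage for tas.

```python
import math

# Dimension sizes copied from the ncdump header above.
N_TIME, N_I, N_J = 73000, 180, 180
BYTES_PER_FLOAT = 4  # netCDF "float" is a 32-bit value


def variable_bytes(n_time, n_i, n_j, itemsize=BYTES_PER_FLOAT):
    """Bytes needed to hold the whole (time, i, j) variable in memory."""
    return n_time * n_i * n_j * itemsize


def steps_per_slab(budget_bytes, n_i, n_j, itemsize=BYTES_PER_FLOAT):
    """Number of whole time steps that fit in a given memory budget."""
    return max(1, budget_bytes // (n_i * n_j * itemsize))


total = variable_bytes(N_TIME, N_I, N_J)
budget = 1024 ** 3  # hypothetical 1 GiB budget, not taken from the thread
steps = steps_per_slab(budget, N_I, N_J)
slabs = math.ceil(N_TIME / steps)

print(f"tas in memory: {total} bytes ({total / 1024 ** 3:.2f} GiB)")
print(f"time steps per 1 GiB slab: {steps}")
print(f"slabs needed to cover all time steps: {slabs}")
```

Under these assumptions, reading tas in one piece needs roughly 9.5 GB of memory, which matches the file size, whereas reading it in subsets along the time dimension keeps each read within the chosen budget.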
> On Dec 18, 2017, at 4:05 PM, Dave Allured - NOAA Affiliate <dave.allured at noaa.gov> wrote:
>
> Tomoko,
>
> Please add the "s" flag to Dennis's request. This will show chunk size parameters that may be relevant:
>
> %> ncdump -hs tas_day_IPSL-CM5A-LR_rcp85_r1i1p1_20060101-22051231.rgrd.nc
>
> There are known problems with netCDF-4 chunk sizes that can dramatically slow down reading, in particular a chunk size that exceeds the chunk cache size or the available memory. If we know the structure details, we may be able to suggest a solution.
>
> I recommend chunk sizes in the approximate range of 100 Kbytes to 4 Mbytes for large netcdf-4 files.
>
> --Dave
>
>
> On Mon, Dec 18, 2017 at 8:05 AM, Dennis Shea <shea at ucar.edu <mailto:shea at ucar.edu>> wrote:
> When you have a file issue, you should include some information:
>
> (a) What does either of the following show?
>
> %> ncl_filedump tas_day_IPSL-CM5A-LR_rcp85_r1i1p1_20060101-22051231.rgrd.nc
>
> or
>
> %> ncdump -h tas_day_IPSL-CM5A-LR_rcp85_r1i1p1_20060101-22051231.rgrd.nc
>
> (b) What version of NCL are you using?
>
> %> ncl -V
>
> (c) your system info
>
> %> uname -a
>
> -----------
>
> fdir="/root/dir4ncl/"
> fili="tas_day_IPSL-CM5A-LR_rcp85_r1i1p1_20060101-22051231.rgrd.nc"
> fn=addfile(fdir+fili,"r")
> buff=fn->tas
> print("Data is stored")
>
> On Mon, Dec 18, 2017 at 3:35 AM, Guido Cioni <guidocioni at gmail.com <mailto:guidocioni at gmail.com>> wrote:
> Tomoko,
> 9 GB is anything but "large", although the concept of "large" is highly subjective :P
>
> I've successfully read SINGLE netcdf files in NCL whose size was ~500GB, so that shouldn't be the problem. My impression was that, for some reason, a netcdf file of a given size, say 400 GB, with many timesteps is read more slowly than a file of the same size with fewer timesteps.
>
> You are setting a lot of options which I think are not needed. Did you just try to read the file with this line?
>
>> fn=addfile(fdir+fili,"r")
>
> If it still takes a lot of time, it could be system-dependent. When creating the variable, NCL stores it in RAM. If the system does not have enough RAM, some virtual memory will be created on your hard drive, which can slow everything down. But honestly I don't think you're even close to saturating your system's RAM. The problem may lie somewhere else...
>
> Let us know.
>
>> On 18. Dec 2017, at 07:17, Tomoko Koyama <Tomoko.Koyama at Colorado.EDU <mailto:Tomoko.Koyama at Colorado.EDU>> wrote:
>>
>> I’d like to extract some data from a large netcdf file, whose size is about 9 GB.
>>
>> The following shows part of the script, but the submitted job was killed before the "Data is stored" message appeared.
>> (I tried several times with a 3-hour maximum walltime.)
>>
>> setfileoption("nc", "FileStructure", "Advanced")
>> fdir="/root/dir4ncl/"
>> fili="tas_day_IPSL-CM5A-LR_rcp85_r1i1p1_20060101-22051231.rgrd.nc"
>> fn=addfile(fdir+fili,"r")
>> setfileoption("nc","Format","NetCDF4Classic")
>> buff=fn->tas
>> print("Data is stored")
>>
>> Does it simply take a long time to read?
>> Is there any way to speed up reading a large netcdf file?
>>
>>
>> Thank you,
>> Tomoko