[ncl-talk] How to combine many nc files from different folders

Setareh Rahimi setareh.rahimi at gmail.com
Thu Oct 10 23:18:21 MDT 2019


Dear Dave,
Thanks for your help, and sorry for such a slow reply (I have been very
busy). I ran the code you suggested, but I am getting an error. Please have
a look at the attached file.
Best wishes,

On Mon, Sep 23, 2019 at 11:57 PM Dave Allured - NOAA Affiliate <
dave.allured at noaa.gov> wrote:

> What do you mean by combine each year together?  If you can condense each
> year by computing intermediate results for each year in smaller arrays,
> then yes this is an excellent approach to a large file problem.  If you
> mean read all daily grids in chunks, this is more difficult.
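
The first option above, condensing each year into a smaller intermediate
result, could be driven from the shell roughly as follows. This is only a
dry-run sketch: it prints one NCL command per year folder and runs nothing.
It assumes one sub-folder per year under a top-level data folder; the script
name yearly_reduce.ncl and the variable names indir and year are
hypothetical and do not come from this thread.

```shell
#!/bin/sh
# Dry run: print one NCL command per year folder (nothing is executed).
# ASSUMPTIONS: one sub-folder per year under the top-level folder;
# "yearly_reduce.ncl", "indir" and "year" are hypothetical names.

per_year_commands() {
    root=$1      # top-level data folder
    script=$2    # NCL script that reduces one year to a small array
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue          # skip if the glob matched nothing
        year=$(basename "$dir")
        # NCL accepts variable assignments on the command line before the script
        printf "ncl 'indir=\"%s\"' 'year=\"%s\"' %s\n" "${dir%/}" "$year" "$script"
    done
}

# Usage: sh per_year.sh /path/to/data yearly_reduce.ncl
if [ "$#" -ge 2 ]; then
    per_year_commands "$1" "$2"
fi
```

Once the printed commands look right, they can be pasted into a terminal or
piped to sh. Each run then only needs enough memory for one year.
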
>
> The code sample that I sent before should work as well as reading all
> daily grids in chunks.  It will give helpful information about this
> problem, whether it works or fails.  Can you please let us know what
> problem it generates, if any?  Running too slow is a valid complaint in
> this case.
>
> Please make two changes before running this again:
>
> (1)  There is a mistake on line 5 below.  If you did not already fix this,
> please change to:
>
>     dims3(0) = nfiles
>
> (2)  For better diagnostics, change the print statement inside the loop to
> this:
>
>     print (systemfunc ("date") + "    " + i)
>
>
> On Mon, Sep 23, 2019 at 8:57 AM Setareh Rahimi <setareh.rahimi at gmail.com>
> wrote:
>
>> Dear all,
>>
>> Is it possible to combine the files for each year separately, and then
>> combine all of those yearly results? Does that make sense? Please kindly
>> advise me in this regard.
>> Best wishes,
>>
>>
>> On Fri, Sep 20, 2019 at 8:44 PM Dave Allured - NOAA Affiliate <
>> dave.allured at noaa.gov> wrote:
>>
>>> You are trying to create an array of about 35 Gbytes in memory.  This is
>>> large and may be straining some limit inside the I/O system.  This may be
>>> difficult to debug directly.
>>>
>>> Instead try an alternate strategy, not using addfiles.  Pre-allocate the
>>> large array in NCL.  Then read files one at a time in a loop, and insert
>>> them into the large array.  Something like this (not tested):
>>>
>>>    fils = systemfunc ("ls /*/*.nc")
>>>    f = addfile (fils(0), "r")
>>>    dims3 = getfilevardimsizes (f, "p")
>>>    nfiles = dimsizes (fils)
>>>    dims(0) = nfiles      ; typo: should be dims3(0), as corrected above
>>>    print (dims3)
>>>
>>>    x = new (dims3, "float")
>>>    printVarSummary (x)
>>>
>>>    do i = 0, nfiles-1
>>>       print (i+"")
>>>       f = addfile (fils(i), "r")
>>>       x(i,:,:) = f->p
>>>    end do
>>>
>>>    printVarSummary (x)
>>>    printMinMax (x,0)
>>>
>>> This will probably avoid memory problems in the I/O system, but you
>>> might run into an out of memory problem in the NCL core.  This test program
>>> will help to localize the problem.
>>>
>>> It is also time to check more version information.  How was NCL on your
>>> system built and installed?  Can you get the version numbers of the netcdf
>>> and HDF5 libraries that NCL was built with?  There were hints in previous
>>> diagnostics that you may be using older library versions.  Caution, the
>>> library versions linked with NCL may not be the same versions used in
>>> command line tools h5dump and ncdump.
>>>
>>> Also what is the physical memory size (RAM) in your Mac?
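
The array size and the RAM question above can both be checked from a
terminal. This is a minimal sketch under stated assumptions: the file count
12780 comes from this thread, but the 480 x 1440 grid is an assumption (a
0.25-degree grid), so substitute the sizes that print (dims3) reports. The
helper name array_bytes is made up for illustration.

```shell
#!/bin/sh
# Estimate the size of a float array of nfiles x nlat x nlon.

array_bytes() {
    # $1 = number of files, $2 = nlat, $3 = nlon; 4 bytes per float
    echo $(( $1 * $2 * $3 * 4 ))
}

# 12780 files is from the thread; 480 x 1440 is an ASSUMED grid size.
need=$(array_bytes 12780 480 1440)
echo "array size:   $need bytes"

# Physical RAM in bytes. hw.memsize is a macOS sysctl key; on other
# systems the call fails and this part is silently skipped.
if ram=$(sysctl -n hw.memsize 2>/dev/null); then
    echo "physical RAM: $ram bytes"
fi
```

With the assumed grid this gives 35334144000 bytes, consistent with the
estimate of about 35 Gbytes.
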
>>>
>>>
>>> On Fri, Sep 20, 2019 at 8:00 AM Setareh Rahimi <setareh.rahimi at gmail.com>
>>> wrote:
>>>
>>>> Dear Dave,
>>>> Thank you for your suggestion. I think the problem arises when I
>>>> combine so many NetCDF files together. However, I still could not
>>>> remove the warnings.
>>>> Best wishes,
>>>>
>>>> On Thu, Sep 19, 2019 at 11:35 PM Dave Allured - NOAA Affiliate <
>>>> dave.allured at noaa.gov> wrote:
>>>>
>>>>> That is good information from the diagnostics.  All file structure
>>>>> details look fine to me.  However I notice that you are using addfiles to
>>>>> open about 12780 files simultaneously.  I missed this the first time.  You
>>>>> may be running into a system limit of number of simultaneous open files.
>>>>>
>>>>> This is discussed in a paragraph about half way down the documentation
>>>>> page for the addfiles function.  From that, I suggest adding this line
>>>>> before your addfiles command:
>>>>>
>>>>>      setfileoption ("nc", "SuppressClose", False)
>>>>>
>>>>> I am not familiar with this option.  It is possible that you need to
>>>>> place this command after addfiles, and change the first argument from "nc"
>>>>> to just "f" without quotes.  Try it both ways if necessary.
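
To see whether the open-file limit is a plausible cause, the number of input
files can be compared with the limit of the current shell. This is a rough
sketch: the helper names are hypothetical, and the folder layout (files one
level below a top-level folder) is assumed from the */*.nc pattern used
elsewhere in this thread.

```shell
#!/bin/sh
# Compare the number of .nc files with the per-process open-file limit.
# ASSUMPTION: files sit exactly one folder below the top-level folder.

count_nc_files() {
    # $1 = top-level folder
    n=0
    for f in "$1"/*/*.nc; do
        if [ -e "$f" ]; then n=$((n + 1)); fi
    done
    echo "$n"
}

check_open_file_limit() {
    # $1 = number of files, $2 = limit (defaults to this shell's soft limit)
    nfiles=$1
    limit=${2:-$(ulimit -n)}
    if [ "$limit" = "unlimited" ]; then
        echo "ok: $nfiles files, limit unlimited"
    elif [ "$nfiles" -ge "$limit" ]; then
        echo "over: $nfiles files, limit $limit"
    else
        echo "ok: $nfiles files, limit $limit"
    fi
}

# Usage: sh check_limit.sh /path/to/data
if [ "$#" -ge 1 ]; then
    check_open_file_limit "$(count_nc_files "$1")"
fi
```

An "over" result does not prove that this limit causes the warnings, but it
makes the limit worth addressing first.
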
>>>>>
>>>>>
>>>>> On Thu, Sep 19, 2019 at 10:35 AM Setareh Rahimi <
>>>>> setareh.rahimi at gmail.com> wrote:
>>>>>
>>>>>> Dear all,
>>>>>> Thank you for your suggestions. However, the problem still exists. I
>>>>>> checked the files individually and found nothing wrong with them. I ran
>>>>>> the script for each year separately and did not get the warnings, but
>>>>>> once I run the script for all the years together, the warnings appear.
>>>>>> I attached the output from the tests that Dave suggested.
>>>>>> NCL version: 6.6.2
>>>>>> Computer system: macOS Mojave, version 10.14
>>>>>> Best wishes,
>>>>>>
>>>>>>
>>>>>> On Wed, Sep 18, 2019 at 3:16 AM Dave Allured - NOAA Affiliate <
>>>>>> dave.allured at noaa.gov> wrote:
>>>>>>
>>>>>>> Recently there are some known conditions that can cause unknown
>>>>>>> format and corrupted file messages for valid netcdf files.  I recommend
>>>>>>> diagnosing individual files, not using NCL, before dismissing an entire
>>>>>>> file set as corrupted.  Try this black magic first and see if NCL can then
>>>>>>> read the files:
>>>>>>>
>>>>>>>     Bash:       export HDF5_USE_FILE_LOCKING=FALSE
>>>>>>>     C-shell:    setenv HDF5_USE_FILE_LOCKING FALSE
>>>>>>>
>>>>>>> If that does not work, then try these tests.  The first two are
>>>>>>> guaranteed to work on all file types.
>>>>>>>
>>>>>>>     file data.nc
>>>>>>>     od -c -N16 data.nc
>>>>>>>     h5dump -BH data.nc
>>>>>>>     ncdump -k data.nc
>>>>>>>     ncdump -sh data.nc
>>>>>>>
>>>>>>> If the problem has not become obvious, then post output from these
>>>>>>> tests to this mailing list.  If more than 40 lines long, put all output
>>>>>>> into a text file with name ending in .txt, and send as a file attachment to
>>>>>>> your message.  Please do not send any screen shots.
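
The five tests above can be run on several files at once, with all output
collected in one text file ready to attach. This is a minimal sketch: the
helper name diagnose and the output file name are illustrative only, and any
tool that is not installed is reported as skipped rather than treated as an
error.

```shell
#!/bin/sh
# Run the five diagnostic commands from this thread on each file given.

diagnose() {
    f=$1
    echo "===== $f ====="
    for cmd in "file" "od -c -N16" "h5dump -BH" "ncdump -k" "ncdump -sh"; do
        tool=${cmd%% *}                     # first word = program name
        echo "--- $cmd $f"
        if command -v "$tool" >/dev/null 2>&1; then
            # $cmd is left unquoted on purpose so its options are split
            $cmd "$f" 2>&1 || echo "(exit status $?)"
        else
            echo "(skipped: $tool not found on PATH)"
        fi
    done
}

# Usage: sh diagnose.sh data.nc other.nc > diagnostics.txt
for f in "$@"; do
    diagnose "$f"
done
```

Error messages are captured along with normal output, so a failing test
still shows up in the text file.
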
>>>>>>>
>>>>>>> Also send your NCL version number and type of computer system.
>>>>>>>
>>>>>>> --Dave
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Sep 17, 2019 at 3:27 PM Dennis Shea via ncl-talk <
>>>>>>> ncl-talk at ucar.edu> wrote:
>>>>>>>
>>>>>>>> As the message states, the file is "*corrupted*".  This is not an
>>>>>>>> NCL issue.
>>>>>>>>
>>>>>>>> [1] delete the file[s] and try reacquiring them
>>>>>>>> [2] Possibly, the source files are corrupted.
>>>>>>>> ---
>>>>>>>> FYI: There are some Persiann examples.
>>>>>>>> https://www.ncl.ucar.edu/Applications/HiResPrc.shtml
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Sep 17, 2019 at 12:15 PM Setareh Rahimi <
>>>>>>>> setareh.rahimi at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Dear Adam, and Dennis,
>>>>>>>>> Thanks for your help. I was hoping to combine those files
>>>>>>>>> together, but NCL reports that something is wrong with some of them
>>>>>>>>> (see the attached image). To check what could be wrong, I downloaded
>>>>>>>>> the 1983 files again and got many warnings a second time. Any
>>>>>>>>> suggestions for removing those warnings, please?
>>>>>>>>> Best wishes,
>>>>>>>>>
>>>>>>>>

-- 
S.Rahimi
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screenshot 2019-10-11 at 08.42.49.png
Type: image/png
Size: 335851 bytes
Desc: not available
URL: <http://mailman.ucar.edu/pipermail/ncl-talk/attachments/20191011/a45921c2/attachment-0001.png>

