[ncl-talk] NCL memory leak issue

Dennis Shea shea at ucar.edu
Thu Oct 8 20:40:11 MDT 2020


I modified your test code (slightly). I added
   day_count = day_count + 1
and a few minor print statements. See below.
===
I ran the script.
I used 'top' and watched the memory increasing

----

domain  = "d01"
dir  = "/glade/scratch/srahimi/ucla/downscale/test/WRF-4.1.3/test/"
ndays = 365
iyear = 2019
 dir_x =  dir + "MPI_r8_" + iyear + "/" + domain + "/"

 print (dir_x+"")
 file_str = "wrfout*"
 filei = systemfunc("cd " + dir_x + " ; ls " + file_str)

 day_count = 1
 do iday = 0, ndays - 1
  iStrt = 4*(day_count-1)
  iLast = 4*day_count-1
  print("iday="+iday+":   day_count="+day_count+": iStrt="+iStrt+" : iLast="+iLast)

  files = filei(4*(day_count-1):4*day_count-1)

  ff = addfiles(dir_x+files,"r")

  delete([/ff,files/])
  day_count = day_count + 1     ; DJS added
 end do         ;end day loop
========================

In separate runs, I commented out

   ;delete(ff)                 ; djs deleted

and added

   delete([/files, ff/])       ; djs added

After the delete, I added

   print(files)

and

   print(ff)

As expected NCL responded with an error message stating that these
variables were undefined.

So the delete worked, but something deep in the NCL bowels is not freeing
the memory. The NCL core code is frozen, so I am not sure this issue can be
addressed.
====
Not sure what to suggest. Perhaps:

[1] Use addfiles but process (say) one or two years at a time.
[2] Process one file at a time via addfile.
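
A minimal sketch of option [2], not tested in this thread, so whether it
actually avoids the memory growth is unknown. It is meant to replace the day
loop of the original script and reuses its variables (filei, dir_x, ndays, ny,
nx, s2d, var_wrf). The names nf and acc are hypothetical helpers introduced
here, and the code assumes each wrfout file holds a single time step.

```ncl
nf = 4                                     ; files per day (6-hourly output)
do iday = 0, ndays - 1
  files = filei(nf*iday : nf*iday + nf - 1)
  acc   = new((/ny, nx/), "float")
  acc   = 0.0
  do n = 0, nf - 1
    f   = addfile(dir_x + files(n), "r")   ; one file handle at a time
    acc = acc + f->ETRAN(0,:,:)            ; assumes one time step per file
    delete(f)
  end do
  var_wrf(iday,:,:) = (acc / nf) * s2d     ; daily mean, per-second -> per-day
  delete([/files, acc/])
end do
```

Indexing filei directly from iday removes the need for the separate day_count
counter.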

D








On Thu, Oct 8, 2020 at 12:16 PM STEFAN RAHIMI-ESFARJANI <s.rahimi at ucla.edu>
wrote:

> Sorry, I didn't realize that I didn't reply to NCL talk....
>
> See below a much simplified version of the code. Watching top, I see VIRT and
> RES increase substantially with each iteration of the day loop. If I
> comment out the addfiles line and subsequent delete calls, there is no
> problem memory-wise. Why is there iteratively more memory consumption upon
> the addfiles call?
>
> Thanks,
> -Stefan
>
> begin
>
>
> domain  = "d01"
>
>
> dir  = "/glade/scratch/srahimi/ucla/downscale/test/WRF-4.1.3/test/"
>
>
> ndays = 365
>
>
> iyear = 2019
>
>
>  dir_x =  dir + "MPI_r8_" + iyear + "/" + domain + "/"
>
>
>  print (dir_x+"")
>
>  file_str = "wrfout*"
>
>  filei = systemfunc("cd " + dir_x + " ; ls " + file_str)
>
>
>  day_count = 1
>
>
>  do iday = 0, ndays - 1
>
>
>   files = filei(4*(day_count-1):4*day_count-1)
>
>
>   ff = addfiles(dir_x+files,"r")
>
>   delete(ff)
>
>
>  end do         ;end day loop
>
>
>
> end
>
> On Thu, Oct 8, 2020 at 11:56 AM Dennis Shea <shea at ucar.edu> wrote:
>
>> Hello,
>>
>> It is always best to respond to ncl-talk. I am sure there are people who
>> know more than me about these types of issues.
>> ---
>> FYI: to my knowledge,
>>
>> ff = addfiles(dir_x+files,"r")
>>
>> The 'ff' is a variable of type 'list'. Each element of 'ff' is a pointer
>> [one 'location'] to each variable. So, it is not consuming memory.
>> I am not aware of any memory leaks associated with netCDF files.
>>
>> Again, I will look this evening.
>>
>> Cheers
>> D
>>
>>
>>
>>
>>
>>
>>
>> On Thu, Oct 8, 2020 at 11:27 AM STEFAN RAHIMI-ESFARJANI <
>> s.rahimi at ucla.edu> wrote:
>>
>>> Hi Dennis,
>>>
>>> As you can see in this much simplified version of the code, the problem
>>> just described persists....
>>>
>>> /glade/scratch/srahimi/test.ncl
>>>
>>> Thanks so much,
>>> -Stefan
>>>
>>> On Thu, Oct 8, 2020 at 11:23 AM STEFAN RAHIMI-ESFARJANI <
>>> s.rahimi at ucla.edu> wrote:
>>>
>>>> Thanks for the reply,
>>>>
>>>> The program is trying to write the created etrans_sfc...nc file to the
>>>> files/ directory. Since this directory doesn't exist in your work
>>>> directory, that is where the error you see occurs.
>>>>
>>>> My issue is that, if you run the code for 365 days of a given year and
>>>> watch top -u shea, you will see a steady increase in VIRT and RES
>>>> memory associated with the line ff = addfiles(stuff). Even when ff
>>>> is deleted, the increase persists. This problem is independent of the .nc
>>>> write following the completion of the day loop, and the problem persists
>>>> even if all var_wrf variables are removed from the code...
>>>>
>>>> I have seen references to this type of leak before with GRB2 files (
>>>> https://mailman.ucar.edu/pipermail/ncl-talk/2017-June/009365.html),
>>>> but it seems that iterations of NCL versions solved this issue with
>>>> addfile....
>>>>
>>>> Is there a different way to release memory for ff?
>>>>
>>>> Thanks,
>>>> -Stefan
>>>>
>>>> On Thu, Oct 8, 2020 at 11:02 AM Dennis Shea <shea at ucar.edu> wrote:
>>>>
>>>>> I copied your script and ran it with a few print statements added:
>>>>>
>>>>> /glade/work/shea/test.mem_leak.ncl
>>>>>
>>>>> %> ncl test.mem_leak.ncl
>>>>>
>>>>> ==================
>>>>> /glade/work/shea>ncl test.mem_leak.ncl
>>>>>  Copyright (C) 1995-2019 - All Rights Reserved
>>>>>  University Corporation for Atmospheric Research
>>>>>  NCAR Command Language Version 6.6.2
>>>>>  The use of this software is governed by a License Agreement.
>>>>>  See http://www.ncl.ucar.edu/ for more details.
>>>>> (0) /glade/scratch/srahimi/interp/meta2/wrfinput_d01
>>>>>
>>>>> Variable: var_wrf
>>>>> Type: float
>>>>> Total Size: 16550560 bytes
>>>>>             4137640 values
>>>>> Number of Dimensions: 3
>>>>> Dimensions and sizes: [365] x [104] x [109]
>>>>> Coordinates:
>>>>> Number Of Attributes: 1
>>>>>   _FillValue : 9.96921e+36
>>>>> (0) ------------------
>>>>> (0) /glade/scratch/srahimi/ucla/downscale/test/WRF-4.1.3/test/MPI_r8_1980/d01/
>>>>> (0) iyear=1980: iday=0
>>>>> (0) iyear=1980: iday=1
>>>>> (0) iyear=1980: iday=2
>>>>> [SNIP]
>>>>> (0) iyear=1980: iday=362
>>>>> (0) iyear=1980: iday=363
>>>>> (0) iyear=1980: iday=364
>>>>> (0) ====================
>>>>> (0) ===EXIT IDAY LOOP===
>>>>> (0) ====================
>>>>>
>>>>> Variable: var_wrf
>>>>> Type: float
>>>>> Total Size: 16550560 bytes
>>>>>             4137640 values
>>>>> Number of Dimensions: 3
>>>>> Dimensions and sizes: [day | 365] x [lat2d | 104] x [lon2d | 109]
>>>>> Coordinates:
>>>>>             day: [19800901..19810831]
>>>>> Number Of Attributes: 2
>>>>>   units : mm/d
>>>>>   _FillValue : 9.96921e+36
>>>>> (0)
>>>>> (0) min=0   max=4.89127
>>>>> (0) ====================
>>>>>
>>>>> /bin/rm: cannot remove 'files/etrans_sfc.daily.mpi-esm1-2-lr_ssp370_r8i1p1f1_d01_1980.nc': No such file or directory
>>>>> fatal:Could not create (files/etrans_sfc.daily.mpi-esm1-2-lr_ssp370_r8i1p1f1_d01_1980.nc)
>>>>> fatal:["Execute.c":8637]:Execute: Error occurred at or near line 113 in file test.mem_leak.ncl
>>>>>
>>>>> +++++++++++++++++++++++++++++
>>>>> The 'iday'  loop has
>>>>>
>>>>>  63  do iday = 0, ndays - 1
>>>>>  64
>>>>>  65   files = filei(4*(day_count-1):4*day_count-1)
>>>>>  66 ;;print (files) ;each is a string
>>>>>
>>>>> The following is after the loop. 'files' is a 1D array of strings. If
>>>>> so,  the following looks incorrect.
>>>>>
>>>>> /bin/rm: cannot remove 'files/etrans_sfc.daily.mpi-esm1-2-lr_ssp370_r8i1p1f1_d01_1980.nc': No such file or directory
>>>>>
>>>>> On Thu, Oct 8, 2020 at 9:24 AM STEFAN RAHIMI-ESFARJANI via ncl-talk <
>>>>> ncl-talk at mailman.ucar.edu> wrote:
>>>>>
>>>>>> Greetings,
>>>>>>
>>>>>>
>>>>>> I spent a lot of time developing post-processing software for my
>>>>>> research, which is written in NCL. The code here creates single-year files
>>>>>> of daily evapotranspiration values on a lat/lon grid f(time,lat,lon). I am
>>>>>> experiencing a memory leak that is affecting my ability to run these jobs
>>>>>> using Cheyenne compute nodes.
>>>>>>
>>>>>> In short, I have traced the memory leak to the following line (70) in
>>>>>> /glade/scratch/srahimi/interp/mpi_r8/test.ncl on Cheyenne or see
>>>>>> below:
>>>>>>
>>>>>> ff = addfiles(dir_x+files,"r")
>>>>>>
>>>>>> The files are wrfout .nc files. Variable "ff" should be deleted 2
>>>>>> lines later, and that space should be re-used throughout each iteration of
>>>>>> the loop. What seems to be happening however is that new memory is being
>>>>>> allocated for ff in each iteration of the loop, eventually overpowering the
>>>>>> core and the node.
>>>>>>
>>>>>> I know ways around this in python, but I really do not want to
>>>>>> rewrite these scripts at present. Any suggestions?
>>>>>>
>>>>>> Thanks,
>>>>>> -Stefan Rahimi, UCLA
>>>>>>
>>>>>> ;Created by S. Rahimi on 15 Jan. 2020
>>>>>>
>>>>>> ;to interpolate WRF fields to a common
>>>>>>
>>>>>> ;0.03 rectilinear grid
>>>>>>
>>>>>>
>>>>>> ;Daily averages from 6-h data
>>>>>>
>>>>>>
>>>>>> ;For a 2-D variable
>>>>>>
>>>>>>
>>>>>> ;f = f(year,day,ny,nx)
>>>>>>
>>>>>>
>>>>>> begin
>>>>>>
>>>>>>
>>>>>> var = "etrans_sfc"
>>>>>>
>>>>>> model  = "mpi-esm1-2-lr"
>>>>>>
>>>>>> variant  = "r8i1p1f1"
>>>>>>
>>>>>> ssp  = "ssp370"
>>>>>>
>>>>>> freq = "daily"
>>>>>>
>>>>>> domain  = "d01"
>>>>>>
>>>>>>
>>>>>> startyear  = 1980
>>>>>>
>>>>>> endyear  = 1989
>>>>>>
>>>>>> year0 = 1980
>>>>>>
>>>>>>
>>>>>> dir  = "/glade/scratch/srahimi/ucla/downscale/test/WRF-4.1.3/test/"
>>>>>>
>>>>>> dir_inp  = "/glade/scratch/srahimi/interp/meta2/"
>>>>>>
>>>>>> file_wrfinput  = "/glade/scratch/srahimi/interp/meta2/wrfinput_d01"
>>>>>>
>>>>>>
>>>>>> s2d = 86400.
>>>>>>
>>>>>>
>>>>>> ndays = 365
>>>>>>
>>>>>> nyears = endyear - startyear + 1
>>>>>>
>>>>>>
>>>>>> nx  = 109
>>>>>>
>>>>>> ny  = 104
>>>>>>
>>>>>>
>>>>>> ;lat/lon from wrfinput
>>>>>>
>>>>>> print (file_wrfinput+"")
>>>>>>
>>>>>> f = addfile(file_wrfinput,"r")
>>>>>>
>>>>>> lat_wrf = f->XLAT(0,:,:)
>>>>>>
>>>>>> lon_wrf = f->XLONG(0,:,:)
>>>>>>
>>>>>> delete(f)
>>>>>>
>>>>>>
>>>>>> years = ispan(startyear,endyear,1)
>>>>>>
>>>>>> days = new((/ndays/),"integer")
>>>>>>
>>>>>> var_wrf = new((/ndays,ny,nx/),"float")
>>>>>>
>>>>>>
>>>>>> do iyear = startyear, endyear
>>>>>>
>>>>>>
>>>>>> if (iyear .eq. 2014 ) then
>>>>>>
>>>>>>         continue
>>>>>>
>>>>>> end if
>>>>>>
>>>>>>
>>>>>>  dir_x =  dir + "MPI_r8_" + iyear + "/" + domain + "/"
>>>>>>
>>>>>>
>>>>>>  print (dir_x+"")
>>>>>>
>>>>>>  file_str = "wrfout*"
>>>>>>
>>>>>>  filei = systemfunc("cd " + dir_x + " ; ls " + file_str)
>>>>>>
>>>>>>
>>>>>>  day_count = 1
>>>>>>
>>>>>>
>>>>>>  do iday = 0, ndays - 1
>>>>>>
>>>>>>
>>>>>>   files = filei(4*(day_count-1):4*day_count-1)
>>>>>>
>>>>>>   print (files) ;each is a string
>>>>>>
>>>>>>
>>>>>>   date_str = stringtochar(files(0))
>>>>>>
>>>>>>   date = tostring(date_str(11:14)) + tostring(date_str(16:17)) + tostring(date_str(19:20))
>>>>>>
>>>>>>   days(iday) = stringtointeger(date)
>>>>>>
>>>>>>
>>>>>>   ff = addfiles(dir_x+files,"r")
>>>>>>
>>>>>>   var_wrf(iday,:,:) = dim_avg_n_Wrap(ff[0:dimsizes(files)-1]->ETRAN,0) * s2d
>>>>>>
>>>>>>   delete(ff)
>>>>>>
>>>>>>   var_wrf(iday,:,:) = where(abs(var_wrf(iday,:,:)).gt.1e20,0.,var_wrf(iday,:,:))
>>>>>>
>>>>>>
>>>>>>   day_count = day_count + 1
>>>>>>
>>>>>>   delete(files)
>>>>>>
>>>>>>
>>>>>>  end do         ;end day loop
>>>>>>
>>>>>>
>>>>>>  var_wrf!0 = "day"
>>>>>>
>>>>>>  var_wrf!1 = "lat2d"
>>>>>>
>>>>>>  var_wrf!2 = "lon2d"
>>>>>>
>>>>>>  var_wrf&day = days
>>>>>>
>>>>>>  var_wrf@units = "mm/d"
>>>>>>
>>>>>>
>>>>>>  ;Write to .nc files
>>>>>>
>>>>>>  dirout = "files/"
>>>>>>
>>>>>>  fil_out = var + "." + freq + "." + \
>>>>>>   model + "_" + ssp + "_" + variant + "_" + \
>>>>>>   domain + "_" + iyear + ".nc"
>>>>>>
>>>>>>  outfile = dirout+fil_out
>>>>>>
>>>>>>  system("/bin/rm " +outfile)
>>>>>>
>>>>>>  ncdf = addfile(outfile ,"c")  ; open output netCDF file
>>>>>>
>>>>>>  ;setfileoption(outfile,"DefineMode",True)
>>>>>>
>>>>>>  fAtt               = True            ; assign file attributes
>>>>>>
>>>>>>  fAtt@title         = var+ " for  "+iyear
>>>>>>
>>>>>>  fAtt@Conventions   = "None"
>>>>>>
>>>>>>  fAtt@creation_date = systemfunc("date")
>>>>>>
>>>>>>  ;fileattdef(ncdf,f[0])            ; copy file attributes
>>>>>>
>>>>>>  filedimdef(ncdf,"region",-1,True)
>>>>>>
>>>>>>
>>>>>>  ncdf->$var$ = var_wrf
>>>>>>
>>>>>>
>>>>>>  delete(ncdf)
>>>>>>
>>>>>>
>>>>>> end do          ;end year loop
>>>>>>
>>>>>>
>>>>>> end
>>>>>>
>>>>>> _______________________________________________
>>>>>> ncl-talk mailing list
>>>>>> ncl-talk at mailman.ucar.edu
>>>>>> List instructions, subscriber options, unsubscribe:
>>>>>> https://mailman.ucar.edu/mailman/listinfo/ncl-talk
>>>>>
>>>>>