[ncl-talk] A Fatal Error Question: is this a bug in NCL?

Mary Haley haley at ucar.edu
Mon Jul 17 15:26:48 MDT 2017


Hi Barry,

Rick is out this afternoon.

Part of what Rick is suggesting is that if you are done with a variable,
then you should delete it in order to free up memory for later
calculations.

If I may provide some honest comments about your script overall: if you use
some simple "clean coding practices", this will make it *much* easier for
you to debug your own code, and it will certainly make it easier for other
people to debug it. It will also help you see where you might have some
memory issues.


Here are some suggestions for improving your script:

*[1] Use "delete" to remove variables you don't need any more:*

t_ave=t_ave-273.16  ; convert to C

        z_ave=z_ave/10.  ; convert to dam

        printMinMax(z_ave,False)
        u = u * 3.6
        v = v * 3.6
        copy_VarCoords(u,u_ave)                         ; copy coord vars
to speed
        copy_VarCoords(v,v_ave)                         ; copy coord vars
to speed
        copy_VarCoords(z,z_ave)                         ; copy coord vars
to speed
        copy_VarCoords(t,t_ave)                         ; copy coord vars
to speed
        copy_VarCoords(rh,rh_ave)                         ; copy coord vars
to speed

If you no longer need the "u", "v", "z", etc variables, then delete them:

t_ave=t_ave-273.16  ; convert to C

        z_ave=z_ave/10.  ; convert to dam

        printMinMax(z_ave,False)
        u = u * 3.6
        v = v * 3.6
        copy_VarCoords(u,u_ave)                         ; copy coord vars
to speed
        copy_VarCoords(v,v_ave)                         ; copy coord vars
to speed
        copy_VarCoords(z,z_ave)                         ; copy coord vars
to speed
        copy_VarCoords(t,t_ave)
        copy_VarCoords(rh,rh_ave)
        delete([/u,v,z,t,rh/])                          ;  no longer needed

*[2] You are reading several variables and not doing anything with them but
printing them out. Reading them in uses memory. *

If all you need to do is print them, then print them directly. An example:

      lv0 = fin->lv_ISBL0
      print("lv0 = " + lv0)
      lv8 = fin->lv_ISBL8
      print("lv8 = " + lv8)
      lv10 = fin->lv_ISBL10
      print("lv10 = " + lv10)

Do this instead, to save memory:

      print("lv0 = " + fin->lv_ISBL0)
      print("lv8 = " + fin->lv_ISBL8)
      print("lv10 = " + fin->lv_ISBL10)


*[3] Avoid putting unchanged code inside do loops.*

Your script has two do loops with a bunch of code inside of them that
doesn't change.

Do loops can be slow, so it's important to keep as much code *outside* of a
do loop as possible.

As an example, these are your two outside do loops:

  do n=0,n_files -1,1
    do i_dir = 0,n_dirWRF-1

Inside these two loops, you have code that looks like this:

       res1                     = True
        res1 at mpDataBaseVersion = "Ncarg4_1"           ; choose more recent
database
        . . .
        res1 at mpNationalLineThicknessF    = 3.0
        res1 at mpGeophysicalLineThicknessF = 2.0
      . . .

        res1 at tiMainString         = time_var
        res1 at gsnDraw             = False           ; don't draw
                                                   res1 at cnLevelSelectionMode
= "ExplicitLevels"    ; set explicit contour levels
        . . .
        res1 at cnInfoLabelOn = False                  ; turn off contour info
label                                             res1 at lbAutoManage
 = False          ; control label bar
                res1 at lbOrientation         = "Vertical"
        . . .
        res1 at lbTopMarginF = 0.001
        res2                     = True
        res2                     = True
        .  . .
        res2 at mpNationalLineThicknessF    = 3.0
        res2 at mpGeophysicalLineThicknessF = 2.0
        res2 at tiMainString         = time_var
        res2 at gsnDraw             = False           ; don't draw

        . . .
        res2 at lbOrientation         = "Vertical"
        res2 at pmLabelBarSide        = "Right"
        res2 at lbLabelAutoStride =   True
        res2 at pmLabelBarWidthF      = 0.20
        res2 at pmLabelBarHeightF     = 0.7
        . . .

None of this code changes inside the do loops, but yet it is getting set
again and again for every iteration of the two do loops. In order to speed
things up, move as much code as you can to outside the do loop, and only
keep things that actually change inside the do loop.

I recommend creating functions to set the various resource lists:
set_res1_list, set_res2_list, etc. This may not necessarily save memory,
but it makes your code much cleaner.

*[4] Use functions for cleaner code.*

You have code that is repeated in multiple locations.  This can cause extra
memory usage.  If you turn these repeated code segments into NCL functions,
then this can save some memory.

An example, you have:

  i_loc = 1
  j_loc = 1
  GET_IJ::get_ij(lat2d,lon2d,lat_beg,lon_beg,i_loc,j_loc,nj,ni)
  x_start = i_loc
  y_start = j_loc

Granted, this doesn't use a lot of memory, but every time you copy these 5
lines, it makes your script harder to read, and you will slowly use more
memory if you do this kind of thing frequently.

Turn the above five lines into a function:

function get_ij_ncl(lat2d,lon2d,lat_beg,lon_beg)
local nj, ni
begin
  i_loc = 1
  j_loc = 1
  nj = dimsizes(lat2d(:,0))
  ni = dimsizes(lat2d(0,:))
  GET_IJ::get_ij(lat2d,lon2d,lat_beg,lon_beg,i_loc,j_loc,nj,ni)
 return([/i_loc,j_loc/])
end

and then call the function with:

xy_start = get_ij_ncl(lat2d,lon2d,lat_beg,lon_beg)

You will then need to change "x_start" to xy_start(0) and "y_start" to
xy_start(1)

*[5] Another place where a function would be helpful:*

  u_ave_new = u_ave(xy_start(1):xy_end(1),xy_start(0):xy_end(0))
  v_ave_new = v_ave(xy_start(1):xy_end(1),xy_start(0):xy_end(0))
  lat2d_new = lat2d(xy_start(1):xy_end(1),xy_start(0):xy_end(0))
  lon2d_new = lon2d(xy_start(1):xy_end(1),xy_start(0):xy_end(0))
; plotB   = gsn_csm_vector(wks,u_ave_new,v_ave_new,vecres)

;;;opy_VarCoords(dumb_ave,u_ave_new)

; copy_VarCoords(dumb_ave,v_ave_new)

   u_ave_new!0 = "lat"
   u_ave_new!1 = "lon"
   u_ave_new&lat= lat(xy_start(1):xy_end(1))
   u_ave_new&lon= lon(xy_start(0):xy_end(0))
   v_ave_new!0 = "lat"
   v_ave_new!1 = "lon"
   v_ave_new&lat= lat(xy_start(1):xy_end(1))
   v_ave_new&lon= lon(xy_start(0):xy_end(0))

Create a function to do the subsetting and attaching of the metadata:

function average_subset(x,lat,lon,xy_start,xy_end)
begin
  x_ave_new = x(xy_start(1):xy_end(1),xy_start(0):xy_end(0))
  x_ave_new!0 = "lat"
  x_ave_new!1 = "lon"
  x_ave_new&lat= lat(xy_start(1):xy_end(1))
  x_ave_new&lon= lon(xy_start(0):xy_end(0))
  return(x_ave_new)
end

so now you can replace the above code with two lines:

  u_ave_new = average_subset(u_ave,lat,lon,xy_start,xy_end)
  v_ave_new = average_subset(v_ave,lat,lon,xy_start,xy_end)

*[6] You created "lat2d_new" and "lon2d_new" but never use them. Again,
this is using up memory for no purpose. Remove those two lines.*

*[7] If you are propagating a smaller array to a larger array, as is the
case here:*

  lat2d = z
  lon2d = z
  do j = 0,nj-1
  do i = 0,ni-1
   lat2d(j,i)=lat(j)
   lon2d(j,i)=lon(i)
  end do
  end do

then use conform_dims instead of a do loop:

  lat2d = conform_dims(dims2d,lat,0)
  lon2d = conform_dims(dims2d,lon,1)

Note that I no longer need the "lat2d = z" or "lon2d = z" code.

[8] You have this code, and I'm not sure why:

      a = addfile(filename,"r")
      fin = a

This seems like an unnecessary copy of "a".  Why not simply do:

  fin = addfile(filename,"r")

*[9] The "gsn_define_colormap" call should not be inside the do loop. Call
this right after you call gsn_open_wks.*

*[10] There's some random code I don't understand. For example:*

  copy_VarCoords(z,lat2d)
  copy_VarCoords(z,lon2d)

lat2d and lon2d don't need to have z's coordinate arrays attached to them
for any reason that I can see, so I think you can remove those two lines.

*[11] Every time you have a do loop or an if statement, indent all the code
inside *consistently* so you can more easily see what part of the code
you're in.*

*[12] Don't make extra copies of variables unless you really need to.*

For example, you have:

      filename = all_files(n)

There's really no need to create "filename" since you can just use
"all_files(n)".
I've tried to clean up your code to show you what I'm talking about. The
attached script likely won't run because I don't have your data to test
it.  But, hopefully you can see what I'm doing and mimic this in your code.

--Mary



On Mon, Jul 17, 2017 at 11:40 AM, Barry Lynn <barry.h.lynn at gmail.com> wrote:

> Hi Rick:
>
> Thank you for your suggestion.
>
> Could you please give me some guidelines on how to delete resources.
>
> I have included the script.  If you can provide an example, I can take it
> from there.
>
> Thank you,
>
> Barry
>
> On Mon, Jul 17, 2017 at 5:22 PM, Rick Brownrigg <brownrig at ucar.edu> wrote:
>
>> Hi Barry,
>>
>> On Linux, an errno=12 is an "out of memory" error.  I doubt that its
>> related directly to the systemfunc() call, but rather due to memory being
>> consumed by your script in previous iterations of the loop.  You might
>> check if there are variables that can be freed (i.e., deleted()) after each
>> iteration.
>>
>> HTH...
>> Rick
>>
>> On Sun, Jul 16, 2017 at 8:56 PM, Barry Lynn <barry.h.lynn at gmail.com>
>> wrote:
>>
>>> Hi:
>>>
>>> I have a program that does multiple loops.
>>>
>>> I had this error:
>>>
>>> 0) i_dir = 6
>>>
>>> fatal:systemfunc: cannot create child process:[errno=12]
>>>
>>> fatal:["Execute.c":8640]:Execute: Error occurred at or near line 84 in
>>> file ./plot_loop_700mb.ncl
>>>
>>>
>>> This occurred when reading the file from the sixth directory of 20.
>>>
>>>
>>> Here was the read from the 5th directory:
>>>
>>> (0) i_dir = 5
>>>
>>> (0) all_files(n) = /home/cust100021_vol1/barry/cu
>>> st100021_vol2/GEFS/GEFS_05/gep05.t00z.pgrb2b.0p50.f084.grb
>>>
>>> Here is a listing of both files:
>>>
>>> They are both present, but the second attempted allocation/definition of
>>> the file fails with the read that follows:
>>>
>>>
>>> [barry at cust100021-login1 GEFS]$ ls -ltr /home/cust100021_vol1/barry/cu
>>> st100021_vol2/GEFS/GEFS_05/gep05.t00z.pgrb2b.0p50.f084.grb
>>>
>>> -rw-r--r-- 1 barry cust100021 78985320 Jul 16 04:58
>>> /home/cust100021_vol1/barry/cust100021_vol2/GEFS/GEFS_05/gep
>>> 05.t00z.pgrb2b.0p50.f084.grb
>>>
>>> [barry at cust100021-login1 GEFS]$ ls -ltr /home/cust100021_vol1/barry/cu
>>> st100021_vol2/GEFS/GEFS_06/gep06.t00z.pgrb2b.0p50.f084.grb
>>>
>>> -rw-r--r-- 1 barry cust100021 80466702 Jul 16 04:58
>>> /home/cust100021_vol1/barry/cust100021_vol2/GEFS/GEFS_06/gep
>>> 06.t00z.pgrb2b.0p50.f084.grb
>>>
>>> Here is the read:
>>>
>>>
>>>   print("i_dir = " + i_dir)
>>>
>>>   diri = dirWRF(i_dir) + "/"
>>>
>>> ; define individual file read
>>>
>>>    all_files = systemfunc("ls " + diri + "ge*pgrb2b*grb")
>>>
>>>    print("all_files(n) = " + all_files(n))
>>>
>>>   filename = all_files(n)
>>>
>>>
>>> Is the problem that I make this allocation multiple times, each
>>> directory, each file? Perhaps I should read all files at once into a two
>>> dimensional filename array?
>>>
>>>
>>> Thanks
>>>
>>>
>>>
>>> --
>>> Barry H. Lynn, Ph.D
>>> Senior Lecturer,
>>> The Institute of the Earth Science,
>>> The Hebrew University of Jerusalem,
>>> Givat Ram, Jerusalem 91904, Israel
>>> Tel: 972 547 231 170
>>> Fax: (972)-25662581
>>>
>>> C.E.O, Weather It Is, LTD
>>> Weather and Climate Focus
>>> http://weather-it-is.com
>>> Jerusalem, Israel
>>> Local: 02 930 9525
>>> Cell: 054 7 231 170
>>> Int-IS: x972 2 930 9525
>>> US 914 432 3108 <(914)%20432-3108>
>>>
>>> _______________________________________________
>>> ncl-talk mailing list
>>> ncl-talk at ucar.edu
>>> List instructions, subscriber options, unsubscribe:
>>> http://mailman.ucar.edu/mailman/listinfo/ncl-talk
>>>
>>>
>>
>
>
> --
> Barry H. Lynn, Ph.D
> Senior Lecturer,
> The Institute of the Earth Science,
> The Hebrew University of Jerusalem,
> Givat Ram, Jerusalem 91904, Israel
> Tel: 972 547 231 170
> Fax: (972)-25662581
>
> C.E.O, Weather It Is, LTD
> Weather and Climate Focus
> http://weather-it-is.com
> Jerusalem, Israel
> Local: 02 930 9525
> Cell: 054 7 231 170
> Int-IS: x972 2 930 9525
> US 914 432 3108 <(914)%20432-3108>
>
> _______________________________________________
> ncl-talk mailing list
> ncl-talk at ucar.edu
> List instructions, subscriber options, unsubscribe:
> http://mailman.ucar.edu/mailman/listinfo/ncl-talk
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ucar.edu/pipermail/ncl-talk/attachments/20170717/93074a01/attachment.html 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plot_loop_700mb_mary_mods.ncl
Type: application/octet-stream
Size: 14441 bytes
Desc: not available
Url : http://mailman.ucar.edu/pipermail/ncl-talk/attachments/20170717/93074a01/attachment.obj 


More information about the ncl-talk mailing list