[ncl-talk] Speed up reading arrays?

Mary Haley haley at ucar.edu
Fri Jan 19 15:12:48 MST 2018


Tabish,

It's a bit hard to tell from your script which arrays are being read from
the files more than once, but it might go faster if you read these arrays
into local variables first. For example, instead of:

so4_sim   = a[:]->so4_a01(:,0,J,I) + a[:]->so4_a02(:,0,J,I) \
+a[:]->so4_a03(:,0,J,I) +a[:]->so4_a04(:,0,J,I) +a[:]->so4_a05(:,0,J,I) \
+a[:]->so4_a06(:,0,J,I)

so4_sim_260   = a[:]->so4_a01(:,3,J,I) + a[:]->so4_a02(:,3,J,I) \
+a[:]->so4_a03(:,3,J,I) +a[:]->so4_a04(:,3,J,I) +a[:]->so4_a05(:,3,J,I) \
+a[:]->so4_a06(:,3,J,I)

You would have:

so4_a01 = a[:]->so4_a01  ; store to local variable
so4_a02 = a[:]->so4_a02
so4_a03 = a[:]->so4_a03
so4_a04 = a[:]->so4_a04
so4_a05 = a[:]->so4_a05
so4_a06 = a[:]->so4_a06
so4_sim     = so4_a01(:,0,J,I) + so4_a02(:,0,J,I) + so4_a03(:,0,J,I) + \
              so4_a04(:,0,J,I) + so4_a05(:,0,J,I) + so4_a06(:,0,J,I)
so4_sim_260 = so4_a01(:,3,J,I) + so4_a02(:,3,J,I) + so4_a03(:,3,J,I) + \
              so4_a04(:,3,J,I) + so4_a05(:,3,J,I) + so4_a06(:,3,J,I)


Every time you do a "a[:]->" type of operation, NCL has to go out to the
file, find the correct location, grab the data, and then return it back to
NCL.

If you save it to a local variable, then you won't be jumping out to the
file every time.
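
Since you only use a single grid point, another option might be to read just
that column once per variable and then subscript the copy in memory. Here is
a minimal sketch. It assumes "a", "I", and "J" are set as in your script and
that the variables are dimensioned (time, level, south_north, west_east). The
"_col" names are made up for illustration:

```ncl
; Read each bin once, but only the column at (J,I).
; Each result is a small 2D array dimensioned (time, level).
so4_a01_col = a[:]->so4_a01(:,:,J,I)
so4_a02_col = a[:]->so4_a02(:,:,J,I)
so4_a03_col = a[:]->so4_a03(:,:,J,I)
so4_a04_col = a[:]->so4_a04(:,:,J,I)
so4_a05_col = a[:]->so4_a05(:,:,J,I)
so4_a06_col = a[:]->so4_a06(:,:,J,I)

; Sum the bins in memory; no further file access happens from here on.
so4_col = so4_a01_col + so4_a02_col + so4_a03_col + \
          so4_a04_col + so4_a05_col + so4_a06_col

so4_sim     = so4_col(:,0)   ; level index 0
so4_sim_260 = so4_col(:,3)   ; level index 3
```

This makes six reads per species instead of twelve, and the arrays held in
memory stay small because the two horizontal dimensions are never read in
full. I haven't timed it on your files, so treat it as something to try.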

The other thing about this script is that it seems prone to user error.
You have to be careful that you are not subscripting something incorrectly,
or pulling the wrong variable off the file.  Sometimes, even though
it might take longer, it can be better to do something inefficiently
but correctly. By this I mean that instead of listing out all the variables
for a summation on one line, you could write a function.

For example:

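function add_vars(f,var_prefix,k,j,i)
local n, var_names, var_sum
begin
  ; Build zero-padded names: so4_a01, so4_a02, ...
  var_names = var_prefix + sprinti("%0.2i",(/1,2,3,4,5,6/))
  ; Start from the first variable so var_sum has the right size
  var_sum = f[:]->$var_names(0)$(:,k,j,i)
  do n=1,dimsizes(var_names)-1
    var_sum = var_sum + f[:]->$var_names(n)$(:,k,j,i)
  end do
  return(var_sum)
end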

begin
  . . .
  so4_sim     = add_vars(a,"so4_a",0,J,I)
  so4_sim_260 = add_vars(a,"so4_a",3,J,I)
end

Note that this still goes out to the files for every variable on every
call, but it might be easier to read. This is something you will have to
decide for your script. Also, I don't know what you are doing with all these
variables afterwards, so that could make a difference in how you write all
this code.
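
If you want the readability of a function without reading each variable once
per level, one possibility is to have the function return all levels at the
grid point and pick the levels afterwards. This is only a sketch: the function
name and the "nbins" argument are made up for illustration, and it assumes
the bin variables are named with a two-digit suffix (so4_a01, so4_a02, ...)
and dimensioned (time, level, south_north, west_east).

```ncl
; Hypothetical helper: sum bins 01..nbins of var_prefix at grid point (j,i).
; Returns a (time, level) array, so all levels come from the same reads.
undef("sum_bins_at_point")
function sum_bins_at_point(f, var_prefix:string, nbins:integer, \
                           j:integer, i:integer)
local n, var_names, var_sum
begin
  ; Zero-padded variable names, e.g. so4_a01 ... so4_a06
  var_names = var_prefix + sprinti("%0.2i", ispan(1,nbins,1))

  ; Start from the first bin so var_sum has the right dimensions
  var_sum = f[:]->$var_names(0)$(:,:,j,i)

  ; Guard the loop so a single-bin call does not iterate at all
  if (nbins .gt. 1) then
    do n = 1, nbins-1
      var_sum = var_sum + f[:]->$var_names(n)$(:,:,j,i)
    end do
  end if
  return(var_sum)
end

; Usage, after a, I, and J have been set as in the original script:
so4         = sum_bins_at_point(a, "so4_a", 6, J, I)
so4_sim     = so4(:,0)   ; level index 0
so4_sim_260 = so4(:,3)   ; level index 3
```

The same call would work for the "b" and "c" file lists and for the other
prefixes ("no3_a", "nh4_a", "cl_a", "oc_a", "bc_a"), which removes most of
the places where a subscript or variable name could be mistyped.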

The other possibility might be to use NCO (the netCDF Operators). They are
not part of NCL, but they are frequently used in lieu of NCL if you need to
do quick operations across files from the UNIX command line (for example,
ncks to extract variables or hyperslabs and ncrcat to concatenate records
across files), and then write the results to a new file.

--Mary






On Tue, Jan 16, 2018 at 9:58 AM, Tabish Ansari via ncl-talk <
ncl-talk at ucar.edu> wrote:

> Hi
>
> I'm reading in some WRF data from multiple simulations and storing the
> relevant variables as 1D arrays. This process is taking several minutes. Is
> there something I can change structurally in the algorithm to make this
> faster?
>
> Here are the lines of code:
>
> print("reading wrf directories...")
> DATADir1 = "/data2/tabish/control-run-so4-ECMWF/"
> DATADir2 = "/data2/tabish/best-guess_run/"
> DATADir3 = "/data2/tabish/what-if_run/"
> ;DATADir4 = "/data2/tabish/NH3-sensitivity_run/"
> FILES1 = systemfunc (" ls -1 " + DATADir1 + "wrfout_d03_2014-* ")
> FILES2 = systemfunc (" ls -1 " + DATADir2 + "wrfout_d03_2014-* ")
> FILES3 = systemfunc (" ls -1 " + DATADir3 + "wrfout_d03_2014-* ")
> ;FILES4 = systemfunc (" ls -1 " + DATADir4 + "wrfout_d03_2014-* ")
> a = addfiles(FILES1,"r")
> b = addfiles(FILES2,"r")
> c = addfiles(FILES3,"r")
> ;d = addfiles(FILES4,"r")
> times  = wrf_times_c(a[:]->Times,0)
> ;print(times)
> ;times_crop = times(16:135)
> I=79
> J=144
> so4_sim   = a[:]->so4_a01(:,0,J,I) + a[:]->so4_a02(:,0,J,I) \
> +a[:]->so4_a03(:,0,J,I) +a[:]->so4_a04(:,0,J,I) +a[:]->so4_a05(:,0,J,I) \
> +a[:]->so4_a06(:,0,J,I)
> no3_sim   = a[:]->no3_a01(:,0,J,I) + a[:]->no3_a02(:,0,J,I) \
> +a[:]->no3_a03(:,0,J,I) +a[:]->no3_a04(:,0,J,I) +a[:]->no3_a05(:,0,J,I) \
> +a[:]->no3_a06(:,0,J,I)
> nh4_sim   = a[:]->nh4_a01(:,0,J,I) + a[:]->nh4_a02(:,0,J,I) \
> +a[:]->nh4_a03(:,0,J,I) +a[:]->nh4_a04(:,0,J,I) +a[:]->nh4_a05(:,0,J,I) \
> +a[:]->nh4_a06(:,0,J,I)
> chl_sim   = a[:]->cl_a01(:,0,J,I)  + a[:]->cl_a02(:,0,J,I) \
> +a[:]->cl_a03(:,0,J,I)  +a[:]->cl_a04(:,0,J,I)  +a[:]->cl_a05(:,0,J,I) \
> +a[:]->cl_a06(:,0,J,I)
> oc_sim    = a[:]->oc_a01(:,0,J,I) + a[:]->oc_a02(:,0,J,I) \
> +a[:]->oc_a03(:,0,J,I) +a[:]->oc_a04(:,0,J,I) +a[:]->oc_a05(:,0,J,I) \
> +a[:]->oc_a06(:,0,J,I)
> bc_sim    = a[:]->bc_a01(:,0,J,I) + a[:]->bc_a02(:,0,J,I) \
> +a[:]->bc_a03(:,0,J,I) +a[:]->bc_a04(:,0,J,I) +a[:]->bc_a05(:,0,J,I) \
> +a[:]->bc_a06(:,0,J,I)
> co_sim    = a[:]->co(:,0,J,I)
> no2_sim   = a[:]->no2(:,0,J,I)
> so2_sim   = a[:]->so2(:,0,J,I)
> o3_sim    = a[:]->o3(:,0,J,I)
>
> so4_sim_260   = a[:]->so4_a01(:,3,J,I) + a[:]->so4_a02(:,3,J,I) \
> +a[:]->so4_a03(:,3,J,I) +a[:]->so4_a04(:,3,J,I) +a[:]->so4_a05(:,3,J,I) \
> +a[:]->so4_a06(:,3,J,I)
> no3_sim_260   = a[:]->no3_a01(:,3,J,I) + a[:]->no3_a02(:,3,J,I) \
> +a[:]->no3_a03(:,3,J,I) +a[:]->no3_a04(:,3,J,I) +a[:]->no3_a05(:,3,J,I) \
> +a[:]->no3_a06(:,3,J,I)
> nh4_sim_260   = a[:]->nh4_a01(:,3,J,I) + a[:]->nh4_a02(:,3,J,I) \
> +a[:]->nh4_a03(:,3,J,I) +a[:]->nh4_a04(:,3,J,I) +a[:]->nh4_a05(:,3,J,I) \
> +a[:]->nh4_a06(:,3,J,I)
> chl_sim_260   = a[:]->cl_a01(:,3,J,I)  + a[:]->cl_a02(:,3,J,I) \
> +a[:]->cl_a03(:,3,J,I)  +a[:]->cl_a04(:,3,J,I)  +a[:]->cl_a05(:,3,J,I) \
> +a[:]->cl_a06(:,3,J,I)
> oc_sim_260    = a[:]->oc_a01(:,3,J,I) + a[:]->oc_a02(:,3,J,I) \
> +a[:]->oc_a03(:,3,J,I) +a[:]->oc_a04(:,3,J,I) +a[:]->oc_a05(:,3,J,I) \
> +a[:]->oc_a06(:,3,J,I)
>
>
> so4_sim_b   = b[:]->so4_a01(:,0,J,I) + b[:]->so4_a02(:,0,J,I) \
> +b[:]->so4_a03(:,0,J,I) +b[:]->so4_a04(:,0,J,I) +b[:]->so4_a05(:,0,J,I) \
> +b[:]->so4_a06(:,0,J,I)
> no3_sim_b   = b[:]->no3_a01(:,0,J,I) + b[:]->no3_a02(:,0,J,I) \
> +b[:]->no3_a03(:,0,J,I) +b[:]->no3_a04(:,0,J,I) +b[:]->no3_a05(:,0,J,I) \
> +b[:]->no3_a06(:,0,J,I)
> nh4_sim_b   = b[:]->nh4_a01(:,0,J,I) + b[:]->nh4_a02(:,0,J,I) \
> +b[:]->nh4_a03(:,0,J,I) +b[:]->nh4_a04(:,0,J,I) +b[:]->nh4_a05(:,0,J,I) \
> +b[:]->nh4_a06(:,0,J,I)
> chl_sim_b   = b[:]->cl_a01(:,0,J,I)  + b[:]->cl_a02(:,0,J,I) \
> +b[:]->cl_a03(:,0,J,I)  +b[:]->cl_a04(:,0,J,I)  +b[:]->cl_a05(:,0,J,I) \
> +b[:]->cl_a06(:,0,J,I)
> oc_sim_b    = b[:]->oc_a01(:,0,J,I)  + b[:]->oc_a02(:,0,J,I) \
> +b[:]->oc_a03(:,0,J,I)  +b[:]->oc_a04(:,0,J,I)  +b[:]->oc_a05(:,0,J,I) \
> +b[:]->oc_a06(:,0,J,I)
> bc_sim_b    = b[:]->bc_a01(:,0,J,I)  + b[:]->bc_a02(:,0,J,I) \
> +b[:]->bc_a03(:,0,J,I)  +b[:]->bc_a04(:,0,J,I)  +b[:]->bc_a05(:,0,J,I) \
> +b[:]->bc_a06(:,0,J,I)
> co_sim_b    = b[:]->co(:,0,J,I)
> no2_sim_b   = b[:]->no2(:,0,J,I)
> so2_sim_b   = b[:]->so2(:,0,J,I)
> o3_sim_b    = b[:]->o3(:,0,J,I)
> so4_sim_260_b   = b[:]->so4_a01(:,3,J,I) + b[:]->so4_a02(:,3,J,I) \
> +b[:]->so4_a03(:,3,J,I) +b[:]->so4_a04(:,3,J,I) +b[:]->so4_a05(:,3,J,I) \
> +b[:]->so4_a06(:,3,J,I)
> no3_sim_260_b   = b[:]->no3_a01(:,3,J,I) + b[:]->no3_a02(:,3,J,I) \
> +b[:]->no3_a03(:,3,J,I) +b[:]->no3_a04(:,3,J,I) +b[:]->no3_a05(:,3,J,I) \
> +b[:]->no3_a06(:,3,J,I)
> nh4_sim_260_b   = b[:]->nh4_a01(:,3,J,I) + b[:]->nh4_a02(:,3,J,I) \
> +b[:]->nh4_a03(:,3,J,I) +b[:]->nh4_a04(:,3,J,I) +b[:]->nh4_a05(:,3,J,I) \
> +b[:]->nh4_a06(:,3,J,I)
> chl_sim_260_b   = b[:]->cl_a01(:,3,J,I)  + b[:]->cl_a02(:,3,J,I) \
> +b[:]->cl_a03(:,3,J,I)  +b[:]->cl_a04(:,3,J,I)  +b[:]->cl_a05(:,3,J,I) \
> +b[:]->cl_a06(:,3,J,I)
> oc_sim_260_b    = b[:]->oc_a01(:,3,J,I)  + b[:]->oc_a02(:,3,J,I) \
> +b[:]->oc_a03(:,3,J,I)  +b[:]->oc_a04(:,3,J,I)  +b[:]->oc_a05(:,3,J,I) \
> +b[:]->oc_a06(:,3,J,I)
>
> so4_sim_c   = c[:]->so4_a01(:,0,J,I) + c[:]->so4_a02(:,0,J,I) \
> +c[:]->so4_a03(:,0,J,I) +c[:]->so4_a04(:,0,J,I) +c[:]->so4_a05(:,0,J,I) \
> +c[:]->so4_a06(:,0,J,I)
> no3_sim_c   = c[:]->no3_a01(:,0,J,I) + c[:]->no3_a02(:,0,J,I) \
> +c[:]->no3_a03(:,0,J,I) +c[:]->no3_a04(:,0,J,I) +c[:]->no3_a05(:,0,J,I) \
> +c[:]->no3_a06(:,0,J,I)
> nh4_sim_c   = c[:]->nh4_a01(:,0,J,I) + c[:]->nh4_a02(:,0,J,I) \
> +c[:]->nh4_a03(:,0,J,I) +c[:]->nh4_a04(:,0,J,I) +c[:]->nh4_a05(:,0,J,I) \
> +c[:]->nh4_a06(:,0,J,I)
> chl_sim_c   = c[:]->cl_a01(:,0,J,I)  + c[:]->cl_a02(:,0,J,I) \
> +c[:]->cl_a03(:,0,J,I)  +c[:]->cl_a04(:,0,J,I)  +c[:]->cl_a05(:,0,J,I) \
> +c[:]->cl_a06(:,0,J,I)
> oc_sim_c    = c[:]->oc_a01(:,0,J,I)  + c[:]->oc_a02(:,0,J,I) \
> +c[:]->oc_a03(:,0,J,I)  +c[:]->oc_a04(:,0,J,I)  +c[:]->oc_a05(:,0,J,I) \
> +c[:]->oc_a06(:,0,J,I)
> bc_sim_c    = c[:]->bc_a01(:,0,J,I)  + c[:]->bc_a02(:,0,J,I) \
> +c[:]->bc_a03(:,0,J,I)  +c[:]->bc_a04(:,0,J,I)  +c[:]->bc_a05(:,0,J,I) \
> +c[:]->bc_a06(:,0,J,I)
> co_sim_c    = c[:]->co(:,0,J,I)
> no2_sim_c   = c[:]->no2(:,0,J,I)
> so2_sim_c   = c[:]->so2(:,0,J,I)
> o3_sim_c    = c[:]->o3(:,0,J,I)
>
> so4_sim_260_c   = c[:]->so4_a01(:,3,J,I) + c[:]->so4_a02(:,3,J,I) \
> +c[:]->so4_a03(:,3,J,I) +c[:]->so4_a04(:,3,J,I) +c[:]->so4_a05(:,3,J,I) \
> +c[:]->so4_a06(:,3,J,I)
> no3_sim_260_c   = c[:]->no3_a01(:,3,J,I) + c[:]->no3_a02(:,3,J,I) \
> +c[:]->no3_a03(:,3,J,I) +c[:]->no3_a04(:,3,J,I) +c[:]->no3_a05(:,3,J,I) \
> +c[:]->no3_a06(:,3,J,I)
> nh4_sim_260_c   = c[:]->nh4_a01(:,3,J,I) + c[:]->nh4_a02(:,3,J,I) \
> +c[:]->nh4_a03(:,3,J,I) +c[:]->nh4_a04(:,3,J,I) +c[:]->nh4_a05(:,3,J,I) \
> +c[:]->nh4_a06(:,3,J,I)
> chl_sim_260_c   = c[:]->cl_a01(:,3,J,I)  + c[:]->cl_a02(:,3,J,I) \
> +c[:]->cl_a03(:,3,J,I)  +c[:]->cl_a04(:,3,J,I)  +c[:]->cl_a05(:,3,J,I) \
> +c[:]->cl_a06(:,3,J,I)
> oc_sim_260_c    = c[:]->oc_a01(:,3,J,I)  + c[:]->oc_a02(:,3,J,I) \
> +c[:]->oc_a03(:,3,J,I)  +c[:]->oc_a04(:,3,J,I)  +c[:]->oc_a05(:,3,J,I) \
> +c[:]->oc_a06(:,3,J,I)
>
> ;so4_sim_d   = d[:]->so4_a01(:,0,J,I) + d[:]->so4_a02(:,0,J,I) \
> ;+d[:]->so4_a03(:,0,J,I) +d[:]->so4_a04(:,0,J,I) +d[:]->so4_a05(:,0,J,I) \
> ;+d[:]->so4_a06(:,0,J,I)
> ;no3_sim_d   = d[:]->no3_a01(:,0,J,I) + d[:]->no3_a02(:,0,J,I) \
> ;+d[:]->no3_a03(:,0,J,I) +d[:]->no3_a04(:,0,J,I) +d[:]->no3_a05(:,0,J,I) \
> ;+d[:]->no3_a06(:,0,J,I)
> ;nh4_sim_d   = d[:]->nh4_a01(:,0,J,I) + d[:]->nh4_a02(:,0,J,I) \
> ;+d[:]->nh4_a03(:,0,J,I) +d[:]->nh4_a04(:,0,J,I) +d[:]->nh4_a05(:,0,J,I) \
> ;+d[:]->nh4_a06(:,0,J,I)
>
> ;so4_sim_260_d   = d[:]->so4_a01(:,3,J,I) + d[:]->so4_a02(:,3,J,I) \
> ;+d[:]->so4_a03(:,3,J,I) +d[:]->so4_a04(:,3,J,I) +d[:]->so4_a05(:,3,J,I) \
> ;+d[:]->so4_a06(:,3,J,I)
> ;no3_sim_260_d   = d[:]->no3_a01(:,3,J,I) + d[:]->no3_a02(:,3,J,I) \
> ;+d[:]->no3_a03(:,3,J,I) +d[:]->no3_a04(:,3,J,I) +d[:]->no3_a05(:,3,J,I) \
> ;+d[:]->no3_a06(:,3,J,I)
> ;nh4_sim_260_d   = d[:]->nh4_a01(:,3,J,I) + d[:]->nh4_a02(:,3,J,I) \
> ;+d[:]->nh4_a03(:,3,J,I) +d[:]->nh4_a04(:,3,J,I) +d[:]->nh4_a05(:,3,J,I) \
> ;+d[:]->nh4_a06(:,3,J,I)
>
> print("Stored relevant pollutant arrays.")
>
> Please note that I've commented out DATADir4 and its related arrays. When
> I try to load that too, I get a segmentation fault.
> I just need those arrays for further processing and ultimately for a
> combined time-series plot. Is there any way to speed this up, and possibly
> also read in the DATADir4 arrays without a seg fault?
>
> Thanks in advance!
>
> Tabish
>
>
> Tabish U Ansari
> PhD student, Lancaster Environment Center
> Lancaster University
> Bailrigg, Lancaster,
> LA1 4YW, United Kingdom
>