[ncl-talk] Running the NCL code becomes slower

Walter Kolczynski walter.kolczynski at noaa.gov
Fri Jun 26 11:05:29 MDT 2015


Loops are inherently slow in interpreted languages such as NCL. You 
should reduce your loops to as few as possible. Here is a code snippet I 
use in a lot of my code:

; convert times into epoch-seconds
startTime   = cd_inv_calendar( startYear, startMonth, startDay, startHour, \
                               0, 0, "seconds since 1970-1-1 00:00:00", 0 )
endTime     = cd_inv_calendar( endYear, endMonth, endDay, endHour, \
                               0, 0, "seconds since 1970-1-1 00:00:00", 0 )
frequency   = frequencyHr * 3600

; build array with all of the initialization times
nTimes = doubletointeger( (endTime - startTime) / frequency ) + 1
times  = fspan( startTime, endTime, nTimes )
copy_VarAtts( startTime, times )
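
For the 15-minute files in the question quoted below, the same idea might be adapted roughly as follows. This is only a sketch under stated assumptions: the dates and min_stride come from the loop bounds in the original post, end_day is a name introduced here for illustration, and the array is built with ispan in whole-step increments (rather than fspan) purely to keep the epoch-seconds exact in double precision. On NCL versions older than 6.2.0, contributed.ncl would need to be loaded first.

```ncl
; Assumed settings, taken from the loop bounds in the quoted question
start_year  = 2008
start_month = 6
start_day   = 1
end_day     = 10          ; hypothetical name; the day loop there stops at 10
min_stride  = 15

units     = "seconds since 1970-1-1 00:00:00"
startTime = cd_inv_calendar( start_year, start_month, start_day,  0,  0, 0, units, 0 )
endTime   = cd_inv_calendar( start_year, start_month, end_day,   23, 45, 0, units, 0 )
frequency = min_stride * 60        ; 15 minutes expressed in seconds

; round(x, 3) returns an integer and guards against floating-point error;
; with these dates it should give 960 (10 days * 96 slots per day)
nTimes = round( (endTime - startTime) / frequency, 3 ) + 1

; one entry per 15-minute slot, as double-precision epoch-seconds
times = startTime + ispan( 0, nTimes-1, 1 ) * frequency
copy_VarAtts( startTime, times )   ; make sure the units attribute is attached

; %H and %M add the hour and minute to the date stamp
stamps = cd_string( times, "%Y%N%D%H%M" )
print( nTimes )
print( stamps(0) + " ... " + stamps(nTimes-1) )
```

The stamps array can then be combined with whatever directory and file-name pattern the data actually use.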

If you need all of the data at the end (or otherwise can process all the 
data at once), you can then use times to build an array of filenames 
using cd_string and use addfiles to read them all at once:

yyyymmdd  = cd_string( times, "%Y%N%D")
filenames = dir +  "/" + yyyymmdd + ".hdf"
infiles   = addfiles( filenames, "r" )
ListSetType(infiles, "join")
variable  = infiles[:]->$variableName$(:, start_ind_lat:end_ind_lat, \
                                       start_ind_lon:end_ind_lon)
variable!0 = "time"
variable&time = times

If you need to do separate processing for each time, you've still 
reduced everything down to a single loop instead of many:

do t=0, nTimes-1, 1
     time = times(t)
     yyyymmdd  = cd_string( time, "%Y%N%D")
     filename = dir +  "/" + yyyymmdd + ".nc"
     infile = addfile( filename, "r" )
     variable = infile->$variableName$(start_ind_lat:end_ind_lat, \
                                       start_ind_lon:end_ind_lon)
     ; Do stuff to variable
     delete(variable)
end do

Even if you have to process each time separately, it may still be more 
efficient to read all the data at once and then step through it, 
provided you have the memory:

do t=0, nTimes-1, 1
     variable_temp = variable(t,:,:)
     ; Do stuff to variable_temp
     delete(variable_temp)
end do
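
Whether the "read everything at once" approach fits in memory can be estimated beforehand. The sketch below uses the index bounds from the quoted question; the number of times (960, i.e. ten days of 15-minute files) and the 4 bytes per value are assumptions and should be replaced with the real figures for your data.

```ncl
; Index bounds from the quoted question
start_ind_lat = 1400
end_ind_lat   = 3000
start_ind_lon = 1100
end_ind_lon   = 2600

nTimes       = 960      ; assumed: 10 days of 15-minute files
bytesPerElem = 4        ; assumed: values are held as 4-byte floats

nlat = end_ind_lat - start_ind_lat + 1      ; 1601 points
nlon = end_ind_lon - start_ind_lon + 1      ; 1501 points

; convert to double first: the full product overflows a 32-bit integer
nBytes = todouble(nlat) * nlon * nTimes * bytesPerElem
print( "approx. size of one variable: " + nBytes / 1.0e9 + " GB" )
```

Under these assumptions one variable comes to roughly 9.2 GB, so a handful of variables would fit in the 50 GB mentioned in the question, but not an arbitrary number of them. Once an array has actually been read, sizeof(variable) reports its size in bytes.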

Also, if your lat and lon are the same at every time (likely), you 
probably only need to read them in from the first file:
lat = infiles[0]->lat(start_ind_lat:end_ind_lat, start_ind_lon:end_ind_lon)
lon = infiles[0]->lon(start_ind_lat:end_ind_lat, start_ind_lon:end_ind_lon)

or

do t=0, nTimes-1, 1
     time = times(t)
     ...
     if( .not.isdefined("lat")) then
         lat = infile->lat(start_ind_lat:end_ind_lat, \
                           start_ind_lon:end_ind_lon)
         lon = infile->lon(start_ind_lat:end_ind_lat, \
                           start_ind_lon:end_ind_lon)
     end if
     ; Do stuff
     delete(variable)
     ; DON'T delete lat or lon
end do

Finally, when using loops you may want to delete any large variables at 
the end of each iteration (as I've done above). It shouldn't matter, but 
I've found it does make a difference sometimes. And if you are calling 
any external FORTRAN/C++ subroutines in your processing, make sure those 
functions clean up after themselves (particularly if you wrote them 
yourself).
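
When several large variables are created on each pass, they can be removed in a single call using the list form of delete (available in NCL 6.0.0 and later). A minimal sketch, in which big1 and big2 are hypothetical stand-ins for arrays read from a file:

```ncl
nTimes = 3                                    ; hypothetical small count
do t = 0, nTimes-1
    big1 = new( (/ 1000, 1000 /), float )     ; stand-ins for data read from file
    big2 = new( (/ 1000, 1000 /), float )
    ; Do stuff to big1 and big2

    ; the list syntax removes both variables in one call
    delete( [/ big1, big2 /] )
end do
```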

- Walter

On 26-Jun-15 12:23, Zhifeng Yang wrote:
> Hi
>
> I am trying to read SEVIRI data with a lot of variables, and the 
> dimension of each variable is 3712*3712. I know the data are pretty 
> large, but the computer should read them smoothly, since the memory 
> that I specified is about 50GB. Unfortunately, the code becomes 
> slower and slower as it runs through the time loop. Here is a sample 
> of my code.
>
> ;  SET UP THE START TIME AND END TIME
>    start_year = 2008
>    end_year   = 2008
>    start_month= 6
>    end_month  = 6
>    start_day  = 1
>    start_hour = 0
>    end_hour   = 23
>    start_min  = 0
>    end_min    = 45
>    min_stride = 15
>    start_ind_lat = 1400
>    end_ind_lat   = 3000
>    start_ind_lon = 1100
>    end_ind_lon   = 2600
>
> ;  DO YEAR LOOP
>    do iyear = start_year, end_year
>
> ;  DO MONTH LOOP
>       do imonth = start_month, end_month
>
> ;  CALCULATE THE NUMBER OF DAYS IN THIS MONTH
>          nday_month = days_in_month(iyear, imonth)
> ;  DO DAY LOOP
>          do iday = start_day, 10;nday_month
> ;  DO HOUR LOOP
>             do ihour = start_hour, end_hour
> ;  DO MINUTE LOOP
>                do imin = start_min, end_min, min_stride
> ;  READ VARIABLES FROM HDF FILE
>                      a     = addfile(dir + siyear + "/" + symd1 + "/" + filename, "r")
>                      lat   = (/a->MSG_Latitude(start_ind_lat:end_ind_lat, \
>                                                start_ind_lon:end_ind_lon)/)
>                      lon   = (/a->MSG_Longitude(start_ind_lat:end_ind_lat, \
>                                                 start_ind_lon:end_ind_lon)/)
>                      Cloud_Optical_Thickness_16 = \
>                         a->Cloud_Optical_Thickness_16(start_ind_lat:end_ind_lat, \
>                                                       start_ind_lon:end_ind_lon)
>
>                end do ;imin
>             end do ;ihour
>          end do ;iday
>       end do ;imonth
>    end do ;iyear
>
>
> Thank you
> Zhifeng

-- 
Walter Kolczynski, Jr.
Global Ensemble Team
NOAA/NWS/NCEP/EMC (via I.M. Systems Group)
(301) 683-3781
