[ncl-talk] sign_f90 slow down work flow

Thu Oct 19 03:43:31 MDT 2017

Hey Paolina,
first of all it would be nice to add a few print statements to find out which parts of the script are actually slow.
Something like this:

print("************************************************")
print("FILEs READ in "+get_cpu_time()+"s")
print("************************************************")

would help. 
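To time each section separately rather than printing the cumulative total, a minimal sketch could store the CPU time before each block and print the difference afterwards. The variable name t0 and the section labels are only placeholders, and the commented lines stand in for your own code:

```ncl
begin
  t0 = get_cpu_time()                 ; CPU seconds used so far

  ; ... read the files here ...
  print("FILES READ in " + (get_cpu_time() - t0) + "s")

  t0 = get_cpu_time()                 ; reset before the next section
  ; ... computation goes here ...
  print("COMPUTATION DONE in " + (get_cpu_time() - t0) + "s")
end
```

Keep in mind that get_cpu_time() reports CPU time, not wall-clock time, so time spent waiting on disk may be under-reported.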

Second, from a quick look at your script it seems that the loop part is what slows everything down. That's not how you're supposed to compute quantities in NCL :) 
Explicit loops over variable dimensions should be used ONLY when strictly necessary, because NCL executes them very slowly. 

For example, if you have a 3-D array of temperatures T(ntime,nlon,nlat) in Kelvin and you want to convert them to Celsius, the assignment

T=T-273.15 

has the same effect as

do i=0,ntime-1 
	do j=0,nlon-1
		do k=0,nlat-1
			T(i,j,k)=T(i,j,k)-273.15
		end do 
	end do 
end do 

but takes less time (and less effort to write). 
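If you want to see the difference on your own machine, here is a small sketch that times both versions on made-up data. The array sizes and the names T1, T2 and t0 are arbitrary choices for illustration, not taken from your script:

```ncl
begin
  ntime = 20
  nlon  = 100
  nlat  = 100
  T1 = random_uniform(250., 310., (/ntime,nlon,nlat/))  ; fake temperatures in K
  T2 = T1                                               ; copy for the loop version

  t0 = get_cpu_time()
  T1 = T1 - 273.15                                      ; whole-array assignment
  print("array syntax:  " + (get_cpu_time() - t0) + "s")

  t0 = get_cpu_time()
  do i = 0, ntime-1
    do j = 0, nlon-1
      do k = 0, nlat-1
        T2(i,j,k) = T2(i,j,k) - 273.15                  ; element by element
      end do
    end do
  end do
  print("explicit loop: " + (get_cpu_time() - t0) + "s")

  ; both versions apply the same operation to each element
  print("max difference: " + max(abs(T1 - T2)))
end
```

The exact timings depend on the machine and on the array size, so try it with dimensions close to those of your data.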

In your case, if you want to compute saturation vapor pressure, I'd suggest using one of the functions described here https://www.ncl.ucar.edu/Document/Functions/meteo.shtml on the full temperature and pressure arrays, without subsetting them! 
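The same whole-array idea also works when the formula is written out by hand. The sketch below uses Bolton's (1980) approximation for saturation vapor pressure over water purely as a stand-in (it is not necessarily the formula in your script), and the temperature array is random data standing in for the one read from file:

```ncl
begin
  ; fake temperatures in K; in practice use the full array read from file
  T  = random_uniform(250., 310., (/10,64,64/))

  Tc = T - 273.15                               ; K -> degC on the whole array
  es = 6.112 * exp(17.67 * Tc / (Tc + 243.5))   ; Bolton (1980), es in hPa

  print("es min/max (hPa): " + min(es) + " / " + max(es))
end
```

No loop over time, longitude or latitude is needed: the arithmetic operators and exp() act on every element of the array at once.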

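Furthermore, remember that in Fortran (and also in NCL) the order of nested loops matters for speed. Note that NCL stores arrays like C, so the rightmost dimension varies fastest, while in Fortran it is the leftmost. Here is an excerpt from an EXCELLENT explanation that Dennis gave a "long time ago":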

Looping in *any* interpreted language (Matlab, python, NCL) is slow
compared to compiled languages. Interpreted languages execute one line
at a time. Functions within interpreted languages are written in C
and/or fortran. These are optimised BUT the overall interpreted code
is not. [...] The reason is that compiled languages, in particular, fortran (due to
the 'rules of the language') often have 'look ahead' information that
allows the language to issue pre-fetch instructions prior to a code
segment being executed. This allows the hardware to pre-load cache and
registers with data. This makes execution faster.

[...]

This affects compiled languages also. A real world case. A post-doc
was calling a fortran code that had 4 levels of nested loops. She said it
was taking a *long* time. I looked at the code

   x(mlon,nlat,klev,ntim) , y(mlon,nlat,klev,ntim)

These are in fortran ordering and were *BIG* arrays. In fortran the
leftmost dimension is the fastest varying.

She had

   do kll=1,klev
      do nl=1,nlat
          do ml=1,mlon
            do nt=1,ntim
....
            end do
          end do
       end do
   end do

It took 20+ minutes

I looked at the code and saw nothing wrong. Later, I realized the
issue. The innermost loop ran over ntim, so she was striding over the
slowest varying dimension. She reordered the loops .... It took about
45 seconds.   Wow!

> On 19. Oct 2017, at 11:10, Paolina Bongioannini Cerlini <paolina.cerlini at unipg.it> wrote:
> 
> <BM-256x256x64_3km_SWVP-new_0.ncl>

Guido Cioni
http://guidocioni.altervista.org
