[ncl-talk] Slow code

Appo derbetini appopson4 at gmail.com
Tue Jan 12 23:44:55 MST 2016


Hi,
To convert hourly data to daily data, I use CDO.

For January 1980, for example:

cdo daymean hourly_01.1980.nc daily_01.1980.nc

Then I merge the monthly files into a single yearly file with:

cdo mergetime daily_*.1980.nc daily.1980.nc
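
The two commands above could be wrapped in a small shell function to cover a whole year. This is only a sketch: it assumes every month follows the hourly_MM.YYYY.nc naming from the example, and the function name hourly_to_daily_year and the CDO variable are illustrative choices, not part of CDO itself.

```shell
#!/bin/sh
# Sketch: daily means for each month of one year, then one merged yearly file.
# Assumption: inputs are named hourly_MM.YYYY.nc, as in the example above.
# CDO is a hypothetical override hook; CDO=echo previews the commands.
CDO="${CDO:-cdo}"

hourly_to_daily_year() {
    year="$1"
    for mm in 01 02 03 04 05 06 07 08 09 10 11 12; do
        # Same daymean call as above, once per month; stop on the first failure.
        "$CDO" daymean "hourly_${mm}.${year}.nc" "daily_${mm}.${year}.nc" || return 1
    done
    # Zero-padded months make the glob expand in calendar order.
    "$CDO" mergetime daily_??."${year}".nc "daily.${year}.nc"
}
```

Calling "hourly_to_daily_year 1980" would then run the twelve daymean steps followed by the mergetime step. Setting CDO=echo beforehand prints the commands instead of executing them, which is a cheap way to check the file names first.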



Thanks

Appolinaire

2016-01-12 21:47 GMT+01:00 Dennis Shea <shea at ucar.edu>:

> At my request, Mike sent me two sample files. I will look when I get a
> chance.
>
> 'My' strategy when dealing with multiple files is
>
> [1]
> Will use of 'addfiles' be advantageous for the problem? If hourly arrays
> are 20GB ... the answer is likely 'no'. Memory allocation (especially on
> multi-user systems) can be slow.
>
> [2]
> Most commonly
>
>   [a] I will process one year at a time, converting year, month, day, hr
> to (say) 'hours since ...'.
>   [b] Compute the daily averages; write each year or month to netCDF with
> 'time' unlimited
>   [c] use the 'ncks' or 'ncrcat' operators to combine the files if that is
> desirable.
>
> ---
> I am attaching the 'contributed.ncl' functions that are used to calculate
> monthly or daily quantities: avg, sum, min, max
>
>
>
>
> http://www.ncl.ucar.edu/Document/Functions/Contributed/calculate_daily_values.shtml
>
> http://www.ncl.ucar.edu/Document/Functions/Contributed/calculate_monthly_values.shtml
>
> D
>
> On Tue, Jan 12, 2016 at 12:48 PM, Michael Notaro <mnotaro at wisc.edu> wrote:
>
>> Thanks, Mary, for your further suggestions.
>>
>>
>> After I got Alan's first email which helped me reassess my code,
>>
>> I modified my code to remove the year dimension from most variables
>>
>> to make them more manageable.  Now the code
>>
>> runs in 1 hour, rather than 1 day+.
>>
>>
>> Michael
>>
>>
>> load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_code.ncl"
>> load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_csm.ncl"
>> load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"
>> load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/shea_util.ncl"
>> begin
>>
>> mns=(/"01","02","03","04","05","06","07","08","09","10","11","12"/)
>> ndays=(/31,28,31,30,31,30,31,31,30,31,30,31/)
>>
>> daily365=new((/365,20,141,217/),float) ; 365 day calendar data
>> daily365@_FillValue=1e+35
>> daily365=1e+35
>>
>> do iyr=0,19
>>
>>   data=new((/141,217,12,744/),float) ; hourly data
>>   data@_FillValue=1e+35
>>   data=1e+35
>>
>>   cnt=0
>>   do im=0,11
>>     prefix=(1980+iyr)+""+mns(im)
>>
>>     b=addfile("/volumes/data1/yafang/Downscaling/ACCESS1-0/historical/output/ACCESS_SRF."+(1980+iyr)+""+mns(im)+"0100.nc","r") ; read hourly SRF data
>>     iy=b->iy
>>     jx=b->jx
>>     xlat=b->xlat
>>     xlon=b->xlon
>>     snow=b->snv ; liquid equiv of snow on ground
>>     dims=dimsizes(snow)
>>     nt=dims(0)
>>     data(:,:,im,0:nt-1)=snow(iy|:,jx|:,time|:)
>>     delete(snow)
>>     delete(b)
>>     delete(dims)
>>     cnt=cnt+1
>>   end do
>>   data@_FillValue=1e+20
>>
>>   daily=new((/141,217,12,31/),float) ; daily data per month
>>   daily@_FillValue=1e+35
>>   daily=1e+35
>>
>>   cnt=0
>>   do id=0,30
>>     daily(:,:,:,id)=dim_avg(data(:,:,:,cnt:cnt+23)) ; convert hourly to daily data
>>     cnt=cnt+24
>>   end do
>>
>>   delete(data)
>>
>>
>>
>>   cnt=0
>>   do im=0,11
>>     do id=0,ndays(im)-1
>>       daily365(cnt,iyr,:,:)=daily(:,:,im,id) ; convert daily data per month to 365 day calendar
>>       cnt=cnt+1
>>     end do
>>   end do
>>
>>   delete(daily)
>>
>> end do
>>
>> daily212=new((/19,212,141,217/),float) ; 212 day calendar data for Sep-Mar
>> daily212@_FillValue=1e+35
>> daily212=1e+35
>>
>> do iyr=0,18
>>   daily212(iyr,0:121,:,:)=daily365(243:364,iyr,:,:) ; retrieve Sep-Mar
>>   daily212(iyr,122:211,:,:)=daily365(0:89,iyr+1,:,:)
>> end do
>>
>> delete(daily365)
>>
>> year=ispan(0,18,1)
>> year!0="year"
>> year&year=year
>>
>> time=ispan(0,211,1)
>> time!0="time"
>> time&time=time
>>
>> daily212!0="year"
>> daily212!1="time"
>> daily212!2="iy"
>> daily212!3="jx"
>> daily212&year=year
>> daily212&time=time
>> daily212&iy=iy
>> daily212&jx=jx
>>
>> daily212@long_name = "liquid snow water on ground"
>> daily212@units = "kg m-2"
>> daily212@coordinates="xlat xlon"
>> daily212@grid_mapping = "rcm_map"
>>
>>
>>
>> system("rm save_daily212_actual_snv_access_late20_faster.nc")
>> out=addfile("save_daily212_actual_snv_access_late20_faster.nc","c")
>> out->daily212=daily212
>> out->xlat=xlat
>> out->xlon=xlon
>> out->iy=iy
>> out->jx=jx
>>
>>
>>
>>
>>
>> Michael Notaro
>> Associate Director
>> Nelson Institute Center for Climatic Research
>> University of Wisconsin-Madison
>> Phone: (608) 261-1503
>> Email: mnotaro at wisc.edu
>>
>>
>> ------------------------------
>> *From:* Mary Haley <haley at ucar.edu>
>> *Sent:* Tuesday, January 12, 2016 1:34 PM
>> *To:* Alan Brammer
>> *Cc:* Michael Notaro; ncl-talk at ucar.edu
>>
>> *Subject:* Re: [ncl-talk] Slow code
>>
>> Hi folks,
>>
>> These are all good suggestions.
>>
>> Another thing that is expensive in NCL is reordering arrays with syntax
>> like:
>>
>> snow(iy|:,jx|:,time|:),
>>
>> NCL makes a copy of the array when you do this, and it has to swap all
>> the dimensions every time in the loop.
>>
>> Is reordering the array absolutely necessary? I see that you are
>> reordering and then calling "dim_avg_n". Since you are already using
>> dim_avg_n, why not leave the array as is and just change the dimension you
>> do the average on?
>>
>> --Mary
>>
>>
>> On Tue, Jan 12, 2016 at 8:30 AM, Alan Brammer <abrammer at albany.edu>
>> wrote:
>>
>>> Hi Michael,
>>>
>>>
>>> I was going to suggest reshaping the data array, but it’s 20GB and is
>>> going to be unnecessarily slow regardless.  Do you actually need to store all
>>> the hourly data? The edits below suggest that you don’t.  The version below
>>> uses less than 1GB of memory rather than 20+GB.
>>>
>>>  This is obviously untested so may need editing.
>>> (requires 6.1.1 or newer. )
>>>
>>>
>>> mns=(/"01","02","03","04","05","06","07","08","09","10","11","12"/)
>>> ndays=(/31,28,31,30,31,30,31,31,30,31,30,31/)
>>>
>>> daily=new((/141,217,20,12,31/),float) ; daily data per month
>>> daily@_FillValue=1e+35
>>> daily=1e+35
>>>
>>> cnt=0
>>> do iyr=0,19
>>>   do im=0,11
>>>     prefix=(1980+iyr)+""+mns(im)
>>>
>>>     b=addfile("/volumes/data1/yafang/Downscaling/ACCESS1-0/historical/output/ACCESS_SRF."+(1980+iyr)+""+mns(im)+"0100.nc","r") ; read hourly SRF data
>>>     iy=b->iy
>>>     jx=b->jx
>>>     xlat=b->xlat   ; These aren’t doing anything?
>>>     xlon=b->xlon ; These aren’t doing anything?
>>>     snow =b->snv ; liquid equiv of snow on ground
>>>     dims=dimsizes(snow)
>>>     nt=dims(0)
>>>
>>>     ; I assume snow is originally (time|:,iy|:,jx|:)
>>>     snow4d := reshape( snow(iy|:,jx|:,time|:), (/dims(1), dims(2), ndays(im), 24/) )
>>>     daily(:,:,iyr,im,:ndays(im)-1)=dim_avg_n(snow4d, 3)
>>>
>>>     delete(snow)
>>>     delete(b)
>>>     delete(dims)
>>>     cnt=cnt+1
>>>   end do
>>> end do
>>>
>>> daily@_FillValue=1e+20
>>>
>>>
>>> Good luck,
>>>
>>>
>>> Alan Brammer.
>>>
>>>
>>>
>>>
>>> On 12 Jan 2016, at 10:00, Michael Notaro <mnotaro at wisc.edu> wrote:
>>>
>>> Thanks for your email.
>>>
>>> Actually, this is the main part slowing me down, not the top part of the
>>> code with the addfiles.
>>>
>>> cnt=0
>>> do id=0,30
>>>   daily(:,:,:,:,id)=dim_avg(data(:,:,:,:,cnt:cnt+23)) ; convert hourly to daily data     ***** THIS PART IS SLOW *****
>>>   cnt=cnt+24
>>> end do
>>>
>>>
>>> Any way to perform this task quicker?
>>>
>>> Michael
>>>
>>>
>>>
>>>
>>>
>>> Michael Notaro
>>> Associate Director
>>> Nelson Institute Center for Climatic Research
>>> University of Wisconsin-Madison
>>> Phone: (608) 261-1503
>>> Email: mnotaro at wisc.edu
>>>
>>>
>>> ------------------------------
>>> *From:* Guido Cioni <guidocioni at gmail.com>
>>> *Sent:* Tuesday, January 12, 2016 8:57 AM
>>> *To:* Michael Notaro
>>> *Cc:* ncl-talk at ucar.edu
>>> *Subject:* Re: [ncl-talk] Slow code
>>>
>>> Everyone here will tell you that using loops in NCL is not efficient :)
>>> But from my experience I think that the main thing slowing you down is
>>> that you are using addfile at every iteration.
>>> Would creating a single file and reading it once at the beginning
>>> change what you are trying to compute?
>>>
>>> Guido Cioni
>>> http://guidocioni.altervista.org
>>>
>>>
>>> On 12 Jan 2016, at 15:35, Michael Notaro <mnotaro at wisc.edu> wrote:
>>>
>>> Does anyone have a recommendation to speed up my code?
>>> It's been running for a day now.  I put asterisks next to the real slow
>>> loop.
>>> Basically, that part is converting a large array of hourly data into
>>> daily data.
>>> Thanks, Michael
>>>
>>>
>>> mns=(/"01","02","03","04","05","06","07","08","09","10","11","12"/)
>>> ndays=(/31,28,31,30,31,30,31,31,30,31,30,31/)
>>>
>>> data=new((/141,217,20,12,744/),float) ; hourly data
>>> data@_FillValue=1e+35
>>> data=1e+35
>>>
>>> cnt=0
>>> do iyr=0,19
>>>   do im=0,11
>>>     prefix=(1980+iyr)+""+mns(im)
>>>
>>>     b=addfile("/volumes/data1/yafang/Downscaling/ACCESS1-0/historical/output/ACCESS_SRF."+(1980+iyr)+""+mns(im)+"0100.nc","r") ; read hourly SRF data
>>>     iy=b->iy
>>>     jx=b->jx
>>>     xlat=b->xlat
>>>     xlon=b->xlon
>>>     snow=b->snv ; liquid equiv of snow on ground
>>>     dims=dimsizes(snow)
>>>     nt=dims(0)
>>>     data(:,:,iyr,im,0:nt-1)=snow(iy|:,jx|:,time|:)
>>>     delete(snow)
>>>     delete(b)
>>>     delete(dims)
>>>     cnt=cnt+1
>>>   end do
>>> end do
>>> data@_FillValue=1e+20
>>>
>>> daily=new((/141,217,20,12,31/),float) ; daily data per month
>>> daily@_FillValue=1e+35
>>> daily=1e+35
>>>
>>> cnt=0
>>> do id=0,30
>>>   daily(:,:,:,:,id)=dim_avg(data(:,:,:,:,cnt:cnt+23)) ; convert hourly to daily data     ***** THIS PART IS SLOW *****
>>>   cnt=cnt+24
>>> end do
>>>
>>> delete(data)
>>>
>>>
>>>
>>>
>>> Michael Notaro
>>> Associate Director
>>> Nelson Institute Center for Climatic Research
>>> University of Wisconsin-Madison
>>> Phone: (608) 261-1503
>>> Email: mnotaro at wisc.edu
>>> _______________________________________________
>>> ncl-talk mailing list
>>> ncl-talk at ucar.edu
>>> List instructions, subscriber options, unsubscribe:
>>> http://mailman.ucar.edu/mailman/listinfo/ncl-talk