[ncl-talk] How to process numerous files(data) in NCL

Dennis Shea shea at ucar.edu
Fri May 22 09:19:44 MDT 2020


As noted by Rashed, the question is vague:

I am not sure what you mean by "efficiently", how large your variable is, or
what statistics are to be calculated.
Really, the issue is the size of your variable(s).

NCL uses the standard Unidata netCDF software [written in C]. As Unidata
has stated, their software was designed for robustness, not necessarily for
efficiency. There will be 2847 netCDF file openings. I don't think any
other language would do this any more or less efficiently than NCL.

What sizes are "level,lat,lon"?  What is the size of:

  nday  = 2847l                ; the appended 'l' makes it a long integer
                               ; [same as tolong(2847)]
  ntim  = nday*24
  klvl  =
  nlat  =
  mlon  =
  n     = ntim*klvl*nlat*mlon  ; number of elements
  print(n)

  nbyte = 4                    ; number of bytes per float number
  N     = n*nbyte
  print(N)                     ; [minimum] memory needed in bytes
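
For a concrete sense of scale, here is a minimal sketch with hypothetical grid
sizes (the klvl, nlat, and mlon values below are illustrative only, not taken
from your files):

  nday  = 2847l
  ntim  = nday*24              ; 68328 hourly time steps
  klvl  = 30                   ; hypothetical number of levels
  nlat  = 192                  ; hypothetical number of latitudes
  mlon  = 288                  ; hypothetical number of longitudes
  n     = ntim*klvl*nlat*mlon  ; ~1.1e11 elements
  nbyte = 4
  N     = n*nbyte              ; ~4.5e11 bytes (~450 GB) ... far beyond typical memory
  print(N)

If the printed N approaches or exceeds your machine's physical memory, use the
subset approach described further below.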

;============================
; *IF* you have the memory, then the following approach should work.
; NOTE: NCL uses the C-language 'malloc' to allocate memory.
; THERE IS NO MAGIC: IF N IS VERY LARGE, THIS WILL TAKE TIME!
;                    Maybe ... considerable time!
;=============================
  diri = "./"     ; input directory
  fili   = systemfunc("cd "+diri+" ; ls HUANG*nc")
  print(fili)                                    ; print al 2847 file names
;
  f     = addfiles(diri+fili, "r")         ; open all 2847 files
  wsp= f[:]->WSP                        ; input (mtim,klvl,nlat,mlon)
                                                   ; this will take time
because memory must be allocated

   wsp_avg = dim_avg_n_Wrap(wsp, 0)     ; 0 is the 'time' dimension
   printVarSummary(wsp_avg)                    ; (level,lat,lon)

... other statistics
....save to netCDF file
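
As a rough illustration of those last two steps, a minimal sketch (the
standard-deviation statistic and the output file name "wsp_stats.nc" are my
own examples, not part of the original recipe):

  wsp_std = dim_stddev_n_Wrap(wsp, 0)  ; example statistic: std. dev. over time

  system("/bin/rm -f wsp_stats.nc")    ; remove any pre-existing output file
  fout = addfile("wsp_stats.nc", "c")  ; create a new netCDF file
  fout->WSP_AVG = wsp_avg              ; write the time mean      (level,lat,lon)
  fout->WSP_STD = wsp_std              ; write the std. deviation (level,lat,lon)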

*******************************************************
If memory limitations prevent reading all the hourly grids at once, you will
have to process one grid subset at a time and, later, combine the results
using the CDO or NCO operators. You may have to experiment to determine the
subset sizes that can be accommodated by your local environment.

  fili = systemfunc("cd "+diri+" ; ls HUANG*nc")
  print(fili)                          ; print all 2847 file names

  f    = addfiles(diri+fili, "r")      ; open all 2847 files

  latS = -20                           ; example of a regional subset
  latN =   0
  lonL =   0
  lonR =  10

  wsp  = f[:]->WSP(:,:,{latS:latN},{lonL:lonR})   ; input (ntim,klvl,...,...)
                                       ; this will take time because
                                       ; memory must be allocated

  wsp_avg = dim_avg_n_Wrap(wsp, 0)     ; 0 is the 'time' dimension
  printVarSummary(wsp_avg)             ; (level,{latS:latN},{lonL:lonR})

... other statistics
... save to netCDF file
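
If even a single regional subset over all times is too large, one possible
pattern (a sketch only; the band width, file names, and variable names below
are my own illustration, not from the original recipe) is to loop over
latitude bands, average each band over time, and write each band to its own
file:

  dLat = 20                                      ; hypothetical latitude band width
  do latS = -90, 70, dLat                        ; loop over latitude bands
     latN = latS + dLat
     wsp  = f[:]->WSP(:,:,{latS:latN},:)         ; read one band, all times
     wsp_avg = dim_avg_n_Wrap(wsp, 0)            ; time mean for this band

     foutName = "wsp_avg_band_"+latS+"_"+latN+".nc"
     system("/bin/rm -f "+foutName)              ; remove any pre-existing file
     fout = addfile(foutName, "c")               ; create one file per band
     fout->WSP_AVG = wsp_avg                     ; (level, band of lat, lon)

     delete([/wsp, wsp_avg, fout/])              ; free memory before the next band
  end do

The per-band files can then be combined afterward with the CDO or NCO
operators, as noted above.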

On Fri, May 22, 2020 at 8:02 AM Rashed Mahmood via ncl-talk <
ncl-talk at ucar.edu> wrote:

> Hi Huang,
> This question is a bit vague; at least, I am not sure what statistics you
> are trying to calculate. NCL can do array processing for certain things;
> however, for a better answer to your specific needs, you would need to be
> specific about what exactly you are trying to do.
>
> Rashed
>
> On Thu, May 21, 2020 at 6:56 PM 时光足迹 via ncl-talk <ncl-talk at ucar.edu>
> wrote:
>
>> Hi there,
>>     I have problems processing data with numerous files (~2GB each file)
>> using NCL.
>> There are 10 years of hourly 3-dimensional meteorological data in NetCDF
>> format, covering 2010~2019, 2847 days in total. The data are stored daily
>> (so there are 2847 files).
>>     I want to calculate some statistical variables over all times at each
>> grid point. Take wsp(time|2847*24, level, lat, lon) for example: in the
>> current method, a do-loop over levels is used, and 'wsp' is then processed
>> along time. Finally I get what I need ( wsp_process(level,lat,lon) ).
>>     However, this is really inefficient, and it sometimes fails with the
>> error 'Segmentation fault (core dumped)'.
>>     Is there a more efficient way to meet my needs?
>> Best regards,
>> HUANG

