[ncl-talk] passing args within ncl scripts

Dave Allured - NOAA Affiliate dave.allured at noaa.gov
Wed Jul 21 09:20:06 MDT 2021


Jayant, task parallelism will be useful if you can come up with a strategy
to do partial calculations or data subsetting within subprocesses, in such
a way as to reduce the size of intermediate results.  There are many
possible strategies, depending on the kind of calculations.  For example,
daily stats, partial sums, or area averages could be calculated in
subprocesses, then reported back to the parent via relatively small NetCDF
files.  Another way to think about the strategy is as reduction along one
or more dimensions of the original array.
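
As a minimal sketch of the area-averaging variant, a child script could read
one hourly file, average over the two horizontal dimensions, and write only
the 56-level profile to a small NetCDF file.  Everything specific here is an
assumption for illustration: the script name reduce_one.ncl, the command-line
variables infile, rec, and outfile, the output name TEMP_prof, and a record
layout in which one record holds a full (56,450,900) float field.

```ncl
; reduce_one.ncl (hypothetical name)
; Assumed invocation, one call per hourly file:
;   ncl -Q reduce_one.ncl 'infile="hour0000.bin"' rec=0 'outfile="part0000.nc"'
begin
  dims = (/56, 450, 900/)                        ; assumed (lev, lat, lon) layout of one record
  temp = fbinrecread(infile, rec, dims, "float") ; rec is the record number found via the .ctl file
  prof = dim_avg_n(temp, (/1, 2/))               ; unweighted mean over lat and lon -> 56 values
  prof!0 = "lev"                                 ; name the remaining dimension

  system("rm -f " + outfile)                     ; "c" mode does not overwrite an existing file
  fout = addfile(outfile, "c")
  fout->TEMP_prof = prof                         ; hypothetical output variable name
end
```

With this kind of reduction the parent only has to hold 4320 x 56 values
(about 1 MB) rather than the full 4320 x 56 x 450 x 900 array.  The mean above
is unweighted; a latitude-weighted average (for example with wgt_areaave)
would need the grid's weights, which are not given in this thread.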

Example 3 uses individual PNG plot files to communicate back to the
parent.  These could just as easily be individual NetCDF files with
dimension-reduced partial results.
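
A rough sketch of the parent side follows, assuming a hypothetical child
script reduce_one.ncl that reads one binary file and writes a small NetCDF
file containing a 1-D variable TEMP_prof.  The file names, the job limit, and
the variable names are all made up for illustration, and the handling of the
subprocess_wait return value (a positive pid when a child has finished) is my
reading of how example 3 uses it, so please check it against the function
documentation.

```ncl
; combine_parts.ncl (hypothetical name)
begin
  nfiles  = 24                 ; e.g. one day of hourly files
  maxjobs = 8                  ; assumed cap on concurrent children
  sq = str_get_sq()            ; single quote, protects each argument from the shell
  dq = str_get_dq()            ; double quote, marks NCL string values
  active = 0

  do i = 0, nfiles-1
    tag = sprinti("%0.4i", i)
    cmd = "ncl -Q reduce_one.ncl " \
        + sq + "infile="  + dq + "hour" + tag + ".bin" + dq + sq + " rec=0 " \
        + sq + "outfile=" + dq + "part" + tag + ".nc"  + dq + sq
    pid = subprocess(cmd)      ; returns immediately; the child runs concurrently
    if (pid .gt. 0) then
      active = active + 1
    end if

    ; throttle while at the cap; after the last launch, drain all remaining children
    do while (active .ge. maxjobs .or. (i .eq. nfiles-1 .and. active .gt. 0))
      finished = subprocess_wait(0, True)   ; pid <= 0 means "any child"; True blocks
      if (finished .gt. 0) then
        active = active - 1
      end if
    end do
  end do

  ; read the small per-file results back into one array
  parts = systemfunc("ls part*.nc")
  f     = addfiles(parts, "r")
  ListSetType(f, "join")       ; stack the profiles along a new leading dimension
  prof  = f[:]->TEMP_prof      ; (nfiles, 56)
  printVarSummary(prof)
end
```

The parent never receives anything from the children through memory; it only
waits for them to exit and then reads the files they wrote, which is the same
communication pattern example 3 uses with PNG files.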


On Wed, Jul 21, 2021 at 7:14 AM Rick Brownrigg via ncl-talk <
ncl-talk at mailman.ucar.edu> wrote:

> Wow -- a 4320x56x450x900 floating-point variable is 391GB! In any case,
> NCL's subprocess feature can't be used for shared memory tasks -- the tasks
> are necessarily independent of each other.  I agree that a 1.5-hour runtime is
> rather painful. But I don't know of a good way to speed that up.
>
> Rick
>
>
> On Wed, Jul 21, 2021 at 1:00 AM Jayant <jayantkp2979 at gmail.com> wrote:
>
>> Hi Rick,
>> Thanks again for your prompt response.
>> I have around 24 x 30 x 6 months = 4320 files in binary format (along with
>> a .ctl descriptor file for each).
>>
>>    1. I read the ctl file first to get the record number of the variable
>>    of interest (say TEMP), and then
>>    2. use the fbinrecread function to read the binary file.
>>
>> *I guess the binary read takes a lot of time!* I define a 4-d variable
>> (say inp_temp(4320,56,450,900)) in the beginning and then in a do loop over
>> time, the above 2 steps are performed. After the do loop, I perform some
>> daily and monthly stats and then generate monthly (6) plots or a vertical
>> profile (pressure vs height) at a particular point over the entire period.
>> To give an estimate of the execution time, it takes about an hour and a half
>> to complete.
>>
>> To reduce the time spent on the binary read, I was thinking of adopting
>> task parallelism for the do loop part of the script.
>>
>> On Wed, Jul 21, 2021 at 1:38 AM Rick Brownrigg <brownrig at ucar.edu> wrote:
>>
>>> Hi Jayant,
>>>
>>> My apologies if I'm still not clear.  You say "It takes a lot of time to
>>> read a variable and generate a plot."  Are you trying to read 24 files and
>>> generate 24 plots? Or read 24 files and perform analysis and generate plots
>>> from the composite?
>>>
>>> It sounds like the latter -- are you trying to use subprocesses to read
>>> 24 files and end up with one array in memory composed from all 24 of them,
>>> so that you can perform analysis and/or plots on that array?  Then no --
>>> subprocesses won't do the job and NCL in general does not have a way to
>>> perform concurrent reads into a shared memory space. The parent NCL script
>>> executing other programs via the subprocess() function has no communication
>>> with those programs.
>>>
>>> The "addfiles" function is the NCL way of reading multiple files into a
>>> common array; it is not concurrent to the best of my knowledge, but it does
>>> the job.
>>>
>>> Rick
>>>
>>>
>>> On Tue, Jul 20, 2021 at 9:04 PM Jayant <jayantkp2979 at gmail.com> wrote:
>>>
>>>> Thanks Rick,
>>>> I want to use task parallelism. I have hourly files spanning a few
>>>> months from a high resolution simulation. It takes a lot of time to read a
>>>> variable and generate a plot. I have come across task parallelism (example
>>>> 3) and want to modify the example such that I can use 24 processors to read
>>>> 24 files at a time and save the desired variable in a parent array. And
>>>> once the reading is complete, I can perform calculations (daily/monthly
>>>> stats) on the parent array. I hope this helps understand what I intend to
>>>> do.
>>>> You mentioned a file-based approach, and perhaps example 3 does
>>>> save individual plots and later combine frames. I wonder if it's a good
>>>> idea in my case?
>>>> Best regards,
>>>> Jayant
>>>>
>>>> On Tue, Jul 20, 2021 at 11:50 PM Rick Brownrigg <brownrig at ucar.edu>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> If I understand you correctly, you are trying to get the second script
>>>>> to update the array in the first script?  If so, that would not be
>>>>> possible, as the two scripts execute as independent processes, operating in
>>>>> independent memory spaces. They would need some other mechanism to
>>>>> communicate results between each other, perhaps something like a file-based
>>>>> approach.
>>>>>
>>>>> Perhaps explain in more detail what you are trying to do and why there
>>>>> are two scripts involved, and others might be able to offer suggestions.
>>>>>
>>>>> Rick
>>>>>
>>>>>
>>>>> On Tue, Jul 20, 2021 at 8:26 PM Jayant via ncl-talk <
>>>>> ncl-talk at mailman.ucar.edu> wrote:
>>>>>
>>>>>> Hi,
>>>>>> I want to call one ncl script (test_second.ncl) from within another
>>>>>> ncl script (test_prime.ncl) using the system command (in fact, the
>>>>>> subprocess command). In doing so, I want to update an array (defined in
>>>>>> test_prime.ncl) in the second call. I am getting zeros (unchanged!). How
>>>>>> should I proceed? Is there something like global variables that can be defined?
>>>>>> Below are the working example scripts:
>>>>>> ;==================================================
>>>>>> *test_prime.ncl*
>>>>>> begin
>>>>>>  ninp=10
>>>>>>  inparr=new(ninp,float)
>>>>>>  inparr=0.0
>>>>>>
>>>>>>  do i=0,ninp-1
>>>>>>   command="ncl -Q test_second.ncl "+str_get_sq()+"ip="+i+str_get_sq()+" "+str_get_sq()+"tmparr="+inparr(i)+str_get_sq()
>>>>>>   system(command)
>>>>>>  end do
>>>>>> print(inparr)
>>>>>> end
>>>>>> ;==================================================
>>>>>> *test_second.ncl*
>>>>>> begin
>>>>>> tmparr=ip ; intend to perform some calculation and update
>>>>>> end
>>>>>>
>>>>>