[mpas-developers] mpas_timer.F synchronization issue.

Doug Jacobsen jacobsen.douglas at gmail.com
Fri Dec 9 09:08:47 MST 2011


Phil,

Yeah, I figured that max would give you the most relevant information to
see where a large amount of time is spent on any processor. The min
wouldn't be too difficult to print as well. I figured average would either
be fairly representative of the total set of processors, or it would get
corrupted by load imbalances if they were really bad.
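
To make the max/min/average trade-off concrete, here is a rough illustration
with made-up per-processor times (not from any MPAS run). The function name
and the numbers are only for this example.

```python
def summarize(times):
    """Return the min, max and mean of one timer's per-processor totals."""
    return {
        "min": min(times),
        "max": max(times),
        "mean": sum(times) / len(times),
    }


# Four processors with identical work: all three statistics agree.
balanced = summarize([10.0, 10.0, 10.0, 10.0])

# One processor does four times the work of the others.
imbalanced = summarize([10.0, 10.0, 10.0, 40.0])

print(balanced)
print(imbalanced)
```

In the imbalanced case the mean falls between the two extremes and matches no
processor, while the max shows the slowest processor and the max/min gap
shows the imbalance.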

The global reductions are done at the end of the simulation, and they are
only called if the domain_info variable points to domain % dminfo, which
means that the mpas_core line I previously mentioned needs to be added to
the initialization phase.

In terms of min and max, the global min and max times for a single call are
still printed, and the max total time spent in a routine (for a processor)
is printed. I could determine the min total time spent in a routine (for a
processor) as well if that's something everyone is interested in. As a
summary of the information that is currently printed, below is a line from
the current timer output.

 1  initialize                                6.56141         1        6.56141        6.56141        6.56141    0.01

The first column tells you how many timers were started prior to starting
this one. The second tells you the timer's name. Third is the total time
spent within this timer, including all starts and stops. Fourth is the
number of times this timer was started and stopped. Fifth is the min time
for a single call to this timer. Sixth is the max time for a single call to
this timer. Seventh is the average time for a single call to this timer (or
column 3 / column 4), and finally the eighth is the percent of the
total_time timer that was spent in this timer.

The global reduction takes the global max of columns 3 and 6, and the
global min of column 5. Columns 7 and 8 are then recomputed using the new
global max of column 3.
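
As a sketch of that reduction (not the actual mpas_timer.F code, which is
Fortran and uses MPI), the Python below simulates the per-processor values in
a plain list. The LocalTimer class, the reduce_timer function, and the sample
numbers are all hypothetical. It also assumes every processor calls the timer
the same number of times, and it reports the last column as a fraction of
total_time, which may not match the scale the real output uses.

```python
from dataclasses import dataclass


@dataclass
class LocalTimer:
    """One timer's statistics on a single processor (hypothetical layout)."""
    total: float     # total time over all calls
    calls: int       # number of start/stop pairs
    min_call: float  # shortest single call
    max_call: float  # longest single call


def reduce_timer(per_proc, total_time):
    """Combine one timer's per-processor stats into a single report row."""
    total = max(t.total for t in per_proc)        # global max of total time
    max_call = max(t.max_call for t in per_proc)  # global max single call
    min_call = min(t.min_call for t in per_proc)  # global min single call
    calls = per_proc[0].calls                     # assumed equal on every processor
    return {
        "total": total,
        "calls": calls,
        "min_call": min_call,
        "max_call": max_call,
        "avg_call": total / calls,       # recomputed from the global max total
        "fraction": total / total_time,  # share of total_time (scale assumed)
    }


# Two simulated processors reporting the same timer.
procs = [LocalTimer(6.0, 2, 2.5, 3.5), LocalTimer(8.0, 2, 3.0, 5.0)]
row = reduce_timer(procs, total_time=16.0)
print(row)
```

In the real module the max and min would come from global MPI reductions
rather than Python's built-in max and min over a list.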

Let me know if there is any more information anyone wants in the timers.

Doug

On Fri, Dec 9, 2011 at 8:35 AM, Jones, Philip W <pwjones at lanl.gov> wrote:

>
>  Doug,
>
>  Max time is most relevant and printing the min time is useful to get an
> idea of load imbalance.  Don't think the mean really tells you much.
>
>  And the global reductions are only in the timer print, right?
>
>  Thanks,
>
> Phil
>
> TSPA/Correspondence/DUSA EARTH
> ------
> Philip Jones (pwjones at lanl.gov)
> Climate, Ocean and Sea Ice Modeling
> Los Alamos National Laboratory
> T-3 MS B216
> P.O. Box 1663
> Los Alamos, NM 87545
>   ------------------------------
> *From:* mpas-developers-bounces at mailman.ucar.edu [
> mpas-developers-bounces at mailman.ucar.edu] on behalf of Doug Jacobsen [
> jacobsen.douglas at gmail.com]
> *Sent:* Thursday, December 08, 2011 7:36 PM
> *To:* mpas-developers at ucar.edu
> *Subject:* Re: [mpas-developers] mpas_timer.F synchronization issue.
>
>  Hi Everyone,
>
> Something else that I would like input on regarding this. I currently have
> two options for synchronizing the timers. First is the current version,
> which just uses the max of all of the processors' timers. The other option
> would be to average all of the timers across processors. Each has its
> own benefits and provides slightly different information. So if anyone has
> any preferences it would be good to have a discussion about them.
>
> Thanks!
> Doug
>
> On Thu, Dec 8, 2011 at 3:47 PM, Doug Jacobsen <jacobsen.douglas at gmail.com> wrote:
>
>> Hello Everyone,
>>
>> I recently noticed that when running an MPI job, processors would report
>> different times for sub-timers, i.e., not including the total_time timer.
>> This is mostly due to some processors having to wait for MPI calls to
>> finish while other ones don't. None of the previous versions of
>> mpas_timer.F have supported making sure the timers report the same time
>> over all of the processors. So I have attached a new version of
>> mpas_timer.F that supports this. It essentially makes each timer's total
>> time the maximum total time over all of the processors. It also gets the
>> global max and min single call time to print as well. I think this gives a
>> better overall profile for the time spent in routines rather than having
>> to go through each processor's log.*.out file to see how it behaved.
>>
>> To support this, the timer module now stores a pointer to domain % dminfo
>> so you don't have to pass it in to print out the timers. Doing this allows
>> the current timer implementation to stay the same, and allows the syncing
>> of timers by adding a single line to mpas_*_mpas_core.F within each core,
>> which is:
>>
>> call mpas_timer_init(domain)
>>
>> within mpas_core_init.
>>
>> I'm open to any comments or suggestions regarding this change, but I
>> would like to propagate it to the trunk. I will also propagate the above
>> addition to mpas_ocn_mpas_core.F but can add it to the other cores if
>> requested.
>>
>> Thanks for your input.
>> Doug
>>
>
>
