[mpas-developers] dmpar_abort

Xylar Asay-Davis xylar at lanl.gov
Mon Apr 26 16:25:29 MDT 2010


Hi Michael,

I can see cases in the future where a global abort would be very helpful.

Currently, I am using abort as a sanity check on certain flags that are 
passed into my routines, so an error code would not really be an 
appropriate way to handle these -- if the flag is invalid, then there is 
a bug in the code that needs to be fixed.  So it seems like calling a 
global abort function would be reasonable here.

Some searching reveals the use of stop in several places in the code 
that we will probably want to replace with dmpar_global_abort once 
it's created.

I'm happy to write the routine but you may be able to foresee problems 
that I might miss.  Let me know if you'd like to do it.
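
For concreteness, a rough and untested sketch of what such a routine 
might look like, following the description quoted below (print mesg, 
then call MPI_Abort on MPI_COMM_WORLD).  The module name, the error 
code of 1, the use of unit 0 for the message, and the check_flag 
caller are all placeholders for illustration, not existing code; a 
build without MPI would presumably need to fall back to stop instead.

```fortran
! Sketch only: names other than MPI_Abort and MPI_COMM_WORLD are placeholders.
module dmpar_abort_sketch

   use mpi   ! standard MPI Fortran module: MPI_COMM_WORLD, MPI_Abort

   implicit none

contains

   ! Print mesg, then abort all tasks in the global communicator.
   ! No dminfo argument is needed because the communicator is fixed.
   subroutine dmpar_global_abort(mesg)

      character (len=*), intent(in) :: mesg

      integer :: mpi_ierr

      write(0,*) trim(mesg)   ! assumes unit 0 is stderr, as on most compilers
      call MPI_Abort(MPI_COMM_WORLD, 1, mpi_ierr)   ! error code 1 is arbitrary

   end subroutine dmpar_global_abort

   ! Hypothetical caller: a sanity check on a flag, where an invalid
   ! value indicates a bug rather than a recoverable error.
   subroutine check_flag(flag)

      integer, intent(in) :: flag

      if (flag < 0 .or. flag > 2) then
         call dmpar_global_abort('check_flag: invalid flag value')
      end if

   end subroutine check_flag

end module dmpar_abort_sketch
```

The point of the sketch is that check_flag can abort without having 
dminfo passed down to it.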

Thanks for your thoughts on this!
-Xylar

On 4/26/10 3:21 PM, Michael Duda wrote:
> Hi, Xylar.
>
> One approach might be to return a status code from routines that
> might encounter errors, and allow a routine higher up in the call
> stack to handle the error with a dmpar_abort if it were deemed
> appropriate. Depending on the nature of the subroutine, this might
> be the preferable approach -- allow higher-level code to determine
> whether the error can be recovered from or whether it is fatal.
> However, this would either entail adding an error code argument to
> the subroutine, which is one thing we'd like to avoid, or
> converting the subroutine into a function, which wouldn't be an
> option if the subroutine was in fact already a function.
>
> Another approach, and one that would be very simple to implement,
> would be to add a dmpar_global_abort(mesg) routine that is
> callable from any code that uses the dmpar module, and that prints
> the message mesg before calling MPI_Abort with MPI_COMM_WORLD. The
> current dmpar_abort only needs the dminfo argument to get the
> communicator to abort on, and I'd be hard-pressed to find a case
> where it would be desirable to abort on a communicator other than
> the global one. Adding a dmpar_global_abort routine would obviate
> the need to pass dminfo into any subroutine that might need to
> abort, and adding it as a new subroutine would allow us to migrate
> from existing calls to dmpar_abort on an as-needed basis.
>
> I'd support adding a dmpar_global_abort routine in the dmpar
> module, but I'd also suggest considering whether the error being
> checked for is one that can be recovered from, in which case a
> return error code might be the cleanest approach in that
> particular case.
>
> Cheers,
> Michael
>
>
> On Mon, Apr 26, 2010 at 02:10:10PM -0600, Xylar Asay-Davis wrote:
>    
>> I'm trying to use dmpar_abort as a way to stop the code with an error
>> message when things go wrong with the code I'm testing.  I could just
>> use stop, but I figured dmpar_abort was the "proper" way.  The problem
>> is that dminfo, the argument needed by dmpar_abort, is a member of the
>> domain, which is not available in many subroutines.  And it's
>> inconvenient to have to pass around any extra arguments to my
>> subroutines just in case I might want to abort.
>>
>> Any suggestions?
>>
>> -Xylar
>>
>> -- 
>>
>> ***********************
>> Xylar S. Asay-Davis
>> E-mail: xylar at lanl.gov
>> Phone: (505) 606-0025
>> Fax: (505) 665-2659
>> CNLS, MS B258
>> Los Alamos National Laboratory
>> Los Alamos, NM 87545
>> ***********************
>>
>>
>> _______________________________________________
>> mpas-developers mailing list
>> mpas-developers at mailman.ucar.edu
>> http://mailman.ucar.edu/mailman/listinfo/mpas-developers
>>      


-- 

***********************
Xylar S. Asay-Davis
E-mail: xylar at lanl.gov
Phone: (505) 606-0025
Fax: (505) 665-2659
CNLS, MS B258
Los Alamos National Laboratory
Los Alamos, NM 87545
***********************