[Dart-users] DART broken on cheyenne

nancy collins nancy at ucar.edu
Fri Oct 27 19:05:44 MDT 2017


hello all,

if you don't use the NCAR supercomputer "cheyenne" you
can ignore the rest of this message. have a good weekend.

for you other folks - we were notified at 5pm today (yes, on
a friday) that the system people changed the default mpi
library (the MPT module) to fix an unrelated problem, and the
change disables some important MPI calls that filter depends
on to work.

if you get multiple errors like this:

MPT ERROR: rank:13, function:MPI_WIN_LOCK, Invalid win argument

it's this change that is the cause.
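
to check quickly whether a run hit this, you can count the matching
lines in the job output. this is only a minimal sketch in plain sh;
the function name is made up for illustration and is not part of DART,
and you need to point it at whatever file your job writes its output to:

```shell
# count_win_lock_errors is a hypothetical helper, not part of DART.
# usage: count_win_lock_errors path/to/job_output
count_win_lock_errors() {
    # grep -c prints the number of lines matching the MPT error (0 if none)
    grep -c 'MPT ERROR: rank:[0-9]*, function:MPI_WIN_LOCK, Invalid win argument' "$1"
}
```

any count above zero points at the library change described here.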

the current suggested fix is to load a different mpi library
both during compile and when you run.

before you compile filter, run this:

module swap mpt openmpi

and then recompile.  in addition, in your run script the
mpi run command has to change from:

mpiexec_mpt
   to
mpirun
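
if you have several run scripts to update, the rename can be done
with sed. a minimal sketch in plain sh, assuming the launcher name
appears literally in the script; the function name is made up for
illustration and is not part of DART:

```shell
# swap_mpi_launcher is a hypothetical helper, not part of DART.
# usage: swap_mpi_launcher path/to/run_script
swap_mpi_launcher() {
    # keep a backup copy, then rewrite the script with
    # mpirun in place of every mpiexec_mpt
    cp "$1" "$1.bak" &&
    sed 's/mpiexec_mpt/mpirun/g' "$1.bak" > "$1"
}
```

the original is left next to the script with a .bak suffix, so you can
switch back once the MPT module is fixed.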

you may also have to put the module swap line in your pbs job script
and/or in your .login or .tcshrc file.  i'm not sure yet, sorry.  i'll
sort this out on monday and send around more specific instructions.

bonus pain: don't forget that yellowstone and cheyenne will be
down from tuesday, oct 31 to friday, nov 3 for maintenance.
so hopefully you can get filter running if you're trying to finish
runs before the machines go down next week.

email dart at ucar.edu if you have problems with filter on cheyenne
and we will try to help you.

nancy

