[Wrf-users] WRF Problem running in Parallel on multiple nodes (cluster)

Ahsan Ali ahsanshah01 at gmail.com
Tue May 3 02:04:25 MDT 2011


Hello,

I am able to run WRFV3.2.1 with mpirun on multiple cores of a single machine,
but when I try to run it across multiple nodes of the cluster using a
hostlist, it fails. The compute nodes are NFS-mounted from the master node at
boot. I get the following error. Please help.

[root@pmd02 em_real]# mpirun -np 10 -hostfile /home/pmdtest/hostlist ./real.exe
bash: orted: command not found
bash: orted: command not found
--------------------------------------------------------------------------
A daemon (pid 22006) died unexpectedly with status 127 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
mpirun: clean termination accomplished
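
The "orted: command not found" lines and exit status 127 suggest that the
non-interactive shell started on the remote nodes cannot find Open MPI's
orted daemon on its PATH. The following is only a minimal sketch for checking
that, assuming passwordless ssh to every node and a hostfile whose first field
on each line is a host name. The helper names list_hosts and check_orted are
hypothetical and are not part of Open MPI.

```shell
#!/usr/bin/env bash
# Sketch: check whether orted is visible to a non-interactive shell
# on each node named in a hostfile. Helper names are hypothetical.

# Print the host names from a hostfile: the first field of each line,
# skipping blank lines and lines that start with '#'.
list_hosts() {
    awk 'NF && $1 !~ /^#/ { print $1 }' "$1"
}

# For each host, open a non-interactive ssh session (similar to the one
# mpirun uses to start its daemon) and look for orted on PATH.
check_orted() {
    local host
    for host in $(list_hosts "$1"); do
        if ssh -n -o BatchMode=yes "$host" 'command -v orted' >/dev/null 2>&1; then
            echo "$host: orted found"
        else
            echo "$host: orted NOT found on PATH"
        fi
    done
}

# Example, using the hostfile path from the command above:
#   check_orted /home/pmdtest/hostlist
```

If a node reports that orted is missing, one possible remedy is to add the
Open MPI bin and lib directories to PATH and LD_LIBRARY_PATH in a startup file
that non-interactive shells read on that node. Another is to pass the Open MPI
installation directory to mpirun with its --prefix option.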


-- 
Syed Ahsan Ali Bokhari
Electronic Engineer (EE)

Research & Development Division
Pakistan Meteorological Department H-8/4, Islamabad.
Phone # off  +92518358714
Cell # +923155145014