ompi
Why does mpirun -n not work (only 1 process is supported) through an SSH connection?
Details of the problem
I'm running 8 MPI processes on one server with the command mpirun -n 8 mympiexecutable. If I type the command in PuTTY, it works. But if I execute the command through node-ssh, paramiko (a Python SSH library), or WinSCP's SSH shell, the same error occurs:
Abort(xxxxxxxxxx) on node 0 (rank 0 in comm 0): Fatal error in internal_Send: Invalid rank, error stack:
internal_Send(xxx): MPI_Send(buf=xxxxxxxxxxxx, count=1, MPI_INT, 1, 0, MPI_COMM_WORLD) failed
internal_Send(xx).: Invalid rank has value 1 but must be nonnegative and less than 1
Then I changed my code to use only 1 MPI process and executed mpirun -n 1 mympiexecutable; no error appeared.
In another experiment, my code used 8 MPI processes but I executed mpirun -n 1 mympiexecutable, and the error appeared again:
Invalid rank has value 1 but must be nonnegative and less than 1
It seems the -n argument does not work through common SSH tools other than PuTTY.
Is there some configuration that needs to be preset for these SSH tools?
I need some help, please.
I'm afraid I do not understand the environment in which you're operating, or what you're trying to do. Can you explain further, and/or provide a recipe for reproducing the issue?
Also, can you supply the information that was requested in the github issue template? See https://github.com/open-mpi/ompi/blob/main/.github/issue_template.md.
Background information
Open MPI version: 4.0
Installed from: tarball
Server operating system/version: Ubuntu 18.04.6 LTS (GNU/Linux 5.4.0-122-generic x86_64)
Computer hardware: Intel® Xeon® Gold 6136 Processor, 12 cores, 24 threads, 3.0 GHz
Network type: LAN
Client operating system: Win10
Details of the problem
Steps to reproduce the problem:
1. Build a hello world MPI executable on the server (server side)
source code: mpi_hello_world.c
mpicc -o mpi_hello_world mpi_hello_world.c
2. Execute it through PuTTY with the command mpirun -n 8 ./mpi_hello_world (client side). The output is:
Hello world from processor xxx, rank 0 out of 8 processors
Hello world from processor xxx, rank 1 out of 8 processors
Hello world from processor xxx, rank 2 out of 8 processors
Hello world from processor xxx, rank 4 out of 8 processors
Hello world from processor xxx, rank 5 out of 8 processors
Hello world from processor xxx, rank 6 out of 8 processors
Hello world from processor xxx, rank 3 out of 8 processors
Hello world from processor xxx, rank 7 out of 8 processors
8 processors, correct.
3. Write a Python script that uses paramiko (client side):
import paramiko
ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(YourServerIP, username=YourUsername, port=22, password=YourPassword)
ssh_stdin, ssh_stdout, ssh_stderr = ssh.exec_command('cd YourExecutableDir; mpirun -n 8 ./mpi_hello_world')
print(ssh_stdout.read(), ssh_stderr.read())
ssh.close()
the result is:
Hello world from processor xxx, rank 0 out of 1 processors
Hello world from processor xxx, rank 0 out of 1 processors
Hello world from processor xxx, rank 0 out of 1 processors
Hello world from processor xxx, rank 0 out of 1 processors
Hello world from processor xxx, rank 0 out of 1 processors
Hello world from processor xxx, rank 0 out of 1 processors
Hello world from processor xxx, rank 0 out of 1 processors
Hello world from processor xxx, rank 0 out of 1 processors
Only 1 processor is found, the same as when running it through node-ssh or WinSCP's shell.
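To tell the two outcomes apart programmatically, for example inside an app that launches the job over SSH, the output can be checked for the "every process thinks it is rank 0 of 1" pattern. This is only a rough sketch: the function name and the regular expression are my own, and it assumes the output format printed by this particular hello world program.

```python
import re

# Matches lines such as:
#   Hello world from processor xxx, rank 3 out of 8 processors
LINE_RE = re.compile(r"rank (\d+) out of (\d+) processors")


def summarize_ranks(output: str, expected: int) -> dict:
    """Summarize the (rank, size) pairs reported in the hello world output."""
    pairs = [(int(rank), int(size)) for rank, size in LINE_RE.findall(output)]
    ranks = sorted(rank for rank, _ in pairs)
    sizes = {size for _, size in pairs}
    return {
        "lines": len(pairs),
        "sizes": sizes,
        # Healthy run: ranks 0..expected-1 appear once each, all with the same size.
        "ok": ranks == list(range(expected)) and sizes == {expected},
        # Symptom from this thread: several processes, each in its own world of size 1.
        "singletons": len(pairs) > 1 and sizes == {1},
    }


if __name__ == "__main__":
    bad = "Hello world from processor xxx, rank 0 out of 1 processors\n" * 8
    print(summarize_ranks(bad, expected=8))
```

If "singletons" is true, the launcher started the requested number of processes but they did not join a common MPI_COMM_WORLD.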
I'm trying to develop a Node.js app that invokes the server's MPI executable through node-ssh. The problem is that when I execute the command mpirun -n 8 ./mpi_hello_world, only 1 processor is found, but if I run the same command in PuTTY, it works fine.
What is the difference between these SSH tools? Why does Open MPI behave differently with the same command?
That typically occurs when you use MPICH's mpirun and your app uses Open MPI's libmpi.so (or the other way around).
Try using the absolute path to mpirun in your script, and see how it goes.
For debugging purposes, you can run type mpirun in both the terminal and your script.
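A minimal sketch of both suggestions, reusing the same paramiko calls as the script in the report. The helper names are made up for illustration, and /opt/openmpi/bin/mpirun is a hypothetical path: substitute whatever type mpirun prints in the terminal where the job works.

```python
import shlex


def diagnostic_command() -> str:
    """Shell snippet that reports PATH and which launchers the remote shell resolves."""
    return "echo PATH=$PATH; type mpirun; type mpiexec"


def mpirun_command(mpirun_path: str, nprocs: int, executable: str, workdir: str) -> str:
    """Build a command line that calls mpirun by absolute path instead of via PATH."""
    if not mpirun_path.startswith("/"):
        raise ValueError("mpirun_path must be an absolute path")
    return "cd {} && {} -n {} {}".format(
        shlex.quote(workdir), shlex.quote(mpirun_path), int(nprocs), shlex.quote(executable)
    )


def run_remote(host: str, username: str, password: str, command: str, port: int = 22):
    """Run one command over SSH and return (stdout, stderr) as text."""
    import paramiko  # third-party; imported here so the helpers above work without it

    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(host, username=username, port=port, password=password)
    try:
        _, stdout, stderr = ssh.exec_command(command)
        return stdout.read().decode(), stderr.read().decode()
    finally:
        ssh.close()


if __name__ == "__main__":
    # Hypothetical values; nothing is sent over the network here.
    print(diagnostic_command())
    print(mpirun_command("/opt/openmpi/bin/mpirun", 8, "./mpi_hello_world", "/home/user/mpi"))
```

Running diagnostic_command() through run_remote and comparing its output with the same commands typed in PuTTY should show whether the two sessions resolve different launchers.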
Thanks! I found that the environment paths are different, and both Open MPI and MPICH are installed on my server. The which mpirun command shows that PuTTY uses /usr/share/mpich-4.0/bin/mpirun, while the others use /usr/bin/mpiexec.
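For anyone hitting the same thing: a likely reason for the different paths is that PuTTY opens an interactive login shell, while exec-style SSH calls (paramiko's exec_command, node-ssh, and similar) run the command in a non-interactive, non-login shell that skips the login profile files. One possible workaround, sketched below, is to wrap the command in bash -lc so those files are read. This assumes the PATH entry is set in a login profile such as ~/.profile or /etc/profile, which the thread does not confirm; settings that live only in the interactive part of ~/.bashrc may still be skipped. The helper name is hypothetical.

```python
import shlex


def as_login_shell(command: str) -> str:
    """Wrap a command so it runs under `bash -l`, which reads the login profile files."""
    # shlex.quote keeps the whole command as a single argument to `bash -lc`.
    return "bash -lc " + shlex.quote(command)


if __name__ == "__main__":
    wrapped = as_login_shell("cd YourExecutableDir; mpirun -n 8 ./mpi_hello_world")
    print(wrapped)  # pass this string to ssh.exec_command(...) instead of the bare command
```

Calling mpirun by its absolute path remains the more explicit option, since it does not depend on how any shell is started.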