Two MPI programs running at the same time will influence each other
Background information
What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)
v4.0.1, downloaded from the official Open MPI website
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
from a source tarball
Please describe the system on which you are running
- Operating system/version: Ubuntu 18.04 LTS
- Computer hardware: old Dell laptop, Intel i7-4720HQ CPU @ 2.60GHz, 4 cores / 8 threads, 8 GB RAM
- Network type: I don't know, just the same network type as in the other laptops
Details of the problem
I have found that if you run two MPI programs at the same time from two different terminals, the programs will influence each other, resulting in a longer execution time, even if each MPI program only uses a single process via mpirun -n 1. To reproduce the problem, I copied the simple MPI code from mpi_pi.c for calculating pi and changed the number of steps N from 1E7 to 1E9 to prolong the execution time. Compile the source code
shell$ mpicc -o pi mpi_pi.c -lm
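For reference, here is a minimal sketch of what such a pi program looks like (a simplified stand-in for illustration only; the actual mpi_pi.c may differ in its details):

/* pi_sketch.c - minimal MPI pi estimate via midpoint-rule integration
 * of 4/(1+x^2) over [0,1]; a simplified stand-in for mpi_pi.c */
#include <stdio.h>
#include <mpi.h>

#define N 1000000000L   /* number of steps, raised from 1E7 to 1E9 */

int main(int argc, char *argv[])
{
    int rank, np;
    long i;
    double local = 0.0, pi = 0.0, t0, t1;
    const double h = 1.0 / (double)N;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &np);

    t0 = MPI_Wtime();
    /* each rank sums every np-th slice of the integral */
    for (i = rank; i < N; i += np) {
        double x = h * ((double)i + 0.5);
        local += 4.0 / (1.0 + x * x);
    }
    local *= h;

    /* combine the partial sums on rank 0 */
    MPI_Reduce(&local, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("np= %d; Time=%fs; PI=%f\n", np, t1 - t0, pi);

    MPI_Finalize();
    return 0;
}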
If I use one process to run
shell$ mpirun -n 1 ./pi
np= 1; Time=6.269241s; PI=-nan
Since I don't care about the result, I leave the -nan alone. Then I use two processes to run
shell$ mpirun -n 2 ./pi
np= 2; Time=3.250456s; PI=-nan
In terms of execution time, the results above all seem fine. However, if I run the same program from two different terminals (starting the program in one terminal first, then quickly switching to the other terminal to start the same program again), the results are
shell$ # terminal 1
shell$ mpirun -n 1 ./pi
np= 1; Time=11.739367s; PI=-nan
shell$ # terminal 2
shell$ mpirun -n 1 ./pi
np= 1; Time=11.814282s; PI=-nan
Now the execution time is almost twice the normal one. Why does this happen, and how can I avoid it?
The two mpirun instances don't know anything about each other, and so each assumes it has full use of the available resources. Since they use the same placement algorithm, they will both bind their respective application procs to the same places - thus resulting in an overload situation (i.e., where more than one proc is bound to a CPU).
There are several ways to solve this, but the easiest for your situation is probably to just tell each mpirun what CPUs you want that instance to use. Check the available options - if I remember correctly, you would add --cpu-set x,y,z.
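For example, something along these lines (the specific CPU numbers are just an illustration - pick any two different cores on your machine):
shell$ # terminal 1
shell$ mpirun -n 1 --cpu-set 1 ./pi
shell$ # terminal 2
shell$ mpirun -n 1 --cpu-set 2 ./pi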
This is because both MPI processes get pinned on core 0 and hence end up time sharing.
You have two options (see the example commands below):
- pass --bind-to none to the mpirun command line (disable process binding, and let the OS manage it)
- run in singleton mode (e.g. ./pi)
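For example, with the same ./pi binary as above:
shell$ # option 1: disable binding and let the OS schedule the process
shell$ mpirun -n 1 --bind-to none ./pi
shell$ # option 2: singleton mode, no mpirun at all
shell$ ./pi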
@jsquyres @bwbarrett Is the expected behavior to pin on a single core when running one MPI task?
We chose to pin to core by default with -np 2, but I do not remember if we also made this choice with -np 1.
@ggouaillardet I gather you did not read what I wrote 😄 We bind to core if np <= 2 - always have, still do.
Thank you all for your suggestions!
I use the following commands to run the same program in two different terminals
shell$ # terminal 1
shell$ mpirun -n 1 --cpu-set 1 --report-bindings ./pi
[laptop:07053] MCW rank 0 is not bound (or bound to all available processors)
np= 1; Time=6.481132s; PI=-nan
shell$ # terminal 2
shell$ mpirun -n 1 --cpu-set 2 --report-bindings ./pi
[laptop:07060] MCW rank 0 is not bound (or bound to all available processors)
np= 1; Time=6.456938s; PI=-nan
Now the execution time is all right, but the output messages are confusing. My understanding is that when I use --cpu-set 1, rank 0 will be bound to core 1. Why do the output messages say 'rank 0 is not bound'?
Besides, if I use
shell$ mpirun -n 1 --bind-to core --report-bindings ./pi
[laptop:06910] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]: [BB/../../..]
np= 1; Time=6.267589s; PI=-nan
then rank 0 is bound to core 0. Can I force the program to run on a specific core (such as core 1)?
The message is unfortunately not very helpful. Since you specified a single cpu and we bound you to a single cpu, then you are "bound to all available processors". If you specify more than one cpu, you'll get the more expected and helpful message as we will still bind you to only one and you won't, therefore, be bound to all available.
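For example (an illustration based on the explanation above - the exact binding reported may vary):
shell$ # with more than one cpu in the set, rank 0 is still bound to a
shell$ # single core, but the binding report now names that core
shell$ mpirun -n 1 --cpu-set 1,2 --report-bindings ./pi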
Thanks @rhc54 for the explanation.
Do you remember the rationale for binding to a single core when -np 1?
(fwiw, I wrote my reply long before yours ... until I realized I did not press the Comment button...)
Do you remember the rationale for binding to a single core when -np 1?
I don't recall there being one. We decided that np=2 should be bound to core because of benchmarks and typical user performance testing. There was a lot of debate about np > 2, and we wound up with bind to NUMA so threaded apps would quit immediately complaining to us. Not sure that was really the best solution, but that is what we did.
So I think the case of np=1 was just that np=2 was the breakpoint and nobody had a reason to do something different for np=1.
(fwiw, I wrote my reply long before yours ... until I realized I did not press the Comment button...)
Yeah, I was just pulling your chain 😄