
Open MPI fails with 480 processes on a single node

jstrodtb opened this issue 1 year ago • 5 comments

Thank you for taking the time to submit an issue!

Background information

I am testing OpenFOAM on a Power 10 server node with 768 hardware threads. If I run with -np 768 (anything over about 256, really), Open MPI crashes because the operating system runs out of file handles. I have increased the file-handle limit to 64k, and it still runs out. Another MPI code, LAMMPS, runs out at np = 240.

What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)

5.0.2

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

OS distribution package

If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.

Please describe the system on which you are running

  • Operating system/version: RHEL 9
  • Computer hardware: A single IBM Power 10 server node
  • Network type: None(?).

Details of the problem

I am running the OpenFOAM motorbike test with various mesh sizes. I expect to be able to run with MPI processes populating all the hardware threads, i.e. -np 768. However, the program crashes with an operating system error reporting insufficient file handles. Other MPI codes hit the same error once the process count gets well over 200.

Note: If you include verbatim output (or a code block), please use a GitHub Markdown code block like below:

shell$ mpirun -n 2 ./hello_world
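
To take OpenFOAM out of the picture, the same failure could presumably be reproduced with a trivial MPI program (a sketch; hello_world here stands in for any minimal MPI executable, and the process counts are illustrative):

shell$ # sweep the process count upward until the launch starts failing
shell$ for n in 64 128 256 384 480; do mpirun -np $n ./hello_world && echo "np=$n launched"; done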

jstrodtb avatar Apr 23 '24 18:04 jstrodtb

Sounds like the file limits on that machine are too low. Try running ulimit -n 2048 to increase that limit.

See https://stackoverflow.com/questions/34588/how-do-i-change-the-number-of-open-files-limit-in-linux for details.
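
For completeness, a rough sketch of checking and raising the limit in the shell that will launch mpirun (the 65536 value is illustrative; the soft limit cannot be raised past the hard limit, and raising the hard limit itself usually means editing /etc/security/limits.conf, as the linked answer describes):

shell$ ulimit -Sn      # current soft limit on open file descriptors
shell$ ulimit -Hn      # hard limit; the soft limit cannot be raised past this
shell$ ulimit -n 65536 # raise the soft limit for this shell and everything it launches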

devreal avatar Apr 24 '24 17:04 devreal

It looks like this issue is expecting a response, but hasn't gotten one yet. If there are no responses in the next 2 weeks, we'll assume that the issue has been abandoned and will close it.

github-actions[bot] avatar May 08 '24 21:05 github-actions[bot]

@devreal the upper limit on files is 65536. Upon further testing, the failure happens at around np = 250. 65536 = 256^2, so that tracks (obviously, the system has other file handles open).

Is it possible that Open MPI is creating a direct connection between every pair of processes on the same node? That would explain this np^2 behavior.
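
One way to check that hypothesis, as a sketch: count the open descriptors of a single rank while the job runs and see how the count scales with -np (here <pid> is a placeholder for the PID of any one MPI process):

shell$ ls /proc/<pid>/fd | wc -l   # number of file descriptors that rank currently holds

If the per-rank count grows roughly linearly with the process count, each rank is holding a descriptor per peer, which adds up to about np^2 descriptors across the node.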

jstrodtb avatar May 10 '24 15:05 jstrodtb

That can happen if communications use TCP, but that should not be the case by default. Try

mpirun --mca pml ob1 --mca btl self,vader -np 768 ...

to force the shared memory component.

mpirun --mca pml_base_verbose 100 --mca btl_base_verbose 100 -np 768 ...

should tell you what is going on by default. You will get more info if you configure Open MPI with --enable-debug.
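
As a quick sanity check, the components available in a given build can also be listed with ompi_info (a sketch; the exact output format depends on the Open MPI version):

shell$ ompi_info | grep "btl:"   # lists the BTL components this build provides, e.g. self, sm/vader, tcp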

ggouaillardet avatar May 10 '24 20:05 ggouaillardet

I no longer have access to the 768-thread machine. This one has 192 threads. I'm able to get up to 468 processes using the commands you suggested:

First I did "sudo ulimit -n 65536". Then:

mpirun --map-by :OVERSUBSCRIBE --mca pml ob1 --mca btl self,vader -np 468 a.out
PRTE ERROR: Unknown error in file odls_default_module.c at line 609
PRTE ERROR: Unknown error in file odls_default_module.c at line 609
 PMIX ERROR: PMIX_ERR_SYS_LIMITS_PIPES in file base/iof_base_setup.c at line 119
 PRTE ERROR: Unknown error in file base/odls_base_default_fns.c at line 1441
 PRTE ERROR: Unknown error in file odls_default_module.c at line 609
--------------------------------------------------------------------------
    This help section is empty because PRRTE was built without Sphinx.
--------------------------------------------------------------------------
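
Since the failure is reported as PMIX_ERR_SYS_LIMITS_PIPES during I/O-forwarding setup, the descriptor limits of the shell that actually runs mpirun are likely what matter (ulimit is a shell builtin, so it has to be raised in that same shell). A few limits worth checking there, as a sketch:

shell$ ulimit -n                     # per-process open file descriptors (soft)
shell$ ulimit -u                     # max user processes
shell$ cat /proc/sys/fs/file-max     # system-wide file handle limit
shell$ cat /proc/sys/fs/nr_open      # ceiling on what ulimit -n can be raised to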

jstrodtb avatar May 14 '24 20:05 jstrodtb