Internal pmix submodule does not have IPv6 support since Open MPI 5.0.3

Open: wenduwan opened this issue 6 months ago • 1 comment

Background information

What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)

main & v5.0.x branch, and releases >= v5.0.3

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

Built with the internal PMIx submodule

If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.

https://github.com/openpmix/openpmix/commit/d13c12efc1bd79c48dfad4566004653148202a87

Please describe the system on which you are running

  • Operating system/version: Any Linux
  • Computer hardware: Not related
  • Network type: Any network that requires TCP OOB

Details of the problem

Please describe, in detail, the problem that you are having, including the behavior you expect to see, the actual behavior that you are seeing, steps to reproduce the problem, etc. It is most helpful if you can attach a small program that a developer can use to reproduce your problem.

mpirun/prterun cannot launch jobs on IPv6-only subnets. For example:

prterun --hostfile ~/hostfile --map-by ppr:1:node hostname
--------------------------------------------------------------------------
PRTE has lost communication with a remote daemon.

  HNP daemon   : [prterun-ip-10-0-1-157-4316@0,0] on node ip-10-0-1-157
  Remote daemon: [prterun-ip-10-0-1-157-4316@0,2] on node i-0f5ca16fae376d4f4

This is usually due to either a failure of the TCP network
connection to the node, or possibly an internal failure of
the daemon itself. We cannot recover from this failure, and
therefore will terminate the job.
--------------------------------------------------------------------------
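
When reproducing this, it can help to first confirm that the remote node really resolves to IPv6 addresses only. The snippet below is a minimal sketch for that check; it is not part of Open MPI, PRRTE, or PMIx, and the helper names (`address_families`, `is_ipv6_only`, `check_host`) are hypothetical. It relies only on Python's standard `socket.getaddrinfo`, and it assumes host names are passed on the command line rather than parsed from the hostfile.

```python
import socket
import sys

# Map the address families we care about to readable names.
FAMILY_NAMES = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}


def address_families(addrinfos):
    """Return the set of family names present in getaddrinfo-style tuples."""
    return {FAMILY_NAMES[info[0]] for info in addrinfos if info[0] in FAMILY_NAMES}


def is_ipv6_only(addrinfos):
    """True when every resolved address is IPv6 and at least one exists."""
    return address_families(addrinfos) == {"IPv6"}


def check_host(host):
    """Resolve a host for TCP; return True/False, or None if resolution fails."""
    try:
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return None
    return is_ipv6_only(infos)


if __name__ == "__main__":
    # Hypothetical usage: python check_ipv6.py node1 node2 ...
    for host in sys.argv[1:]:
        result = check_host(host)
        if result is None:
            print(f"{host}: could not be resolved")
        elif result:
            print(f"{host}: IPv6 only")
        else:
            print(f"{host}: has IPv4 (or no usable) addresses")
```

A host reported as "IPv6 only" here is one where the TCP OOB connection would have to use IPv6, which is the case that fails above.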

wenduwan, Aug 21 '24 13:08