
[Open MPI main branch] mpirun/mpicc error while loading shared libraries

shijin-aws opened this issue 3 years ago • 15 comments

Thank you for taking the time to submit an issue!

Background information

What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)

main branch, commit

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

git clone https://github.com/open-mpi/ompi.git
cd ompi
git submodule update --recursive --init
./autogen.pl
./configure --prefix=/home/ec2-user/ompi/install --disable-man-pages
make -j install

If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.

 680331773926b62c245626dbc9cf78aed2d641d3 3rd-party/openpmix (v1.1.3-3327-g68033177)
 78825642e8594ebffda0942fa04e375077819732 3rd-party/prrte (psrvr-v2.0.0rc1-4147-g78825642e8)

Please describe the system on which you are running

  • Operating system/version: amazon linux 2
  • Computer hardware: aws ec2 instance c5n.18xlarge
  • Network type:

Details of the problem

We find that building the Open MPI main branch on a machine that has the CUDA toolkit installed in /usr/local/cuda causes mpirun/mpicc to fail while loading shared libraries:

[ec2-user@ip-172-31-49-61 ompi]$ /home/ec2-user/ompi/install/bin/mpirun --version
/home/ec2-user/ompi/install/bin/mpirun: error while loading shared libraries: libOpenCL.so.1: cannot open shared object file: No such file or directory

ldd shows that mpirun is linked against CUDA libraries like libOpenCL.so, which are not found in the default /lib64/ path:

[ec2-user@ip-172-31-49-61 ~]$ ldd /home/ec2-user/ompi/install/bin/mpirun
	linux-vdso.so.1 (0x00007ffe0e1bf000)
	libopen-pal.so.0 => /home/ec2-user/ompi/install/lib/libopen-pal.so.0 (0x00007f380d0cf000)
	libnl-3.so.200 => /lib64/libnl-3.so.200 (0x00007f380ceaf000)
	libnl-route-3.so.200 => /lib64/libnl-route-3.so.200 (0x00007f380cc43000)
	libpmix.so.0 => /home/ec2-user/ompi/install/lib/libpmix.so.0 (0x00007f380c80a000)
	libevent_core-2.1.so.7 => /home/ec2-user/ompi/install/lib/libevent_core-2.1.so.7 (0x00007f380c5d6000)
	libevent_pthreads-2.1.so.7 => /home/ec2-user/ompi/install/lib/libevent_pthreads-2.1.so.7 (0x00007f380c3d3000)
	libhwloc.so.15 => /home/ec2-user/ompi/install/lib/libhwloc.so.15 (0x00007f380c17c000)
	libudev.so.1 => /lib64/libudev.so.1 (0x00007f380bf68000)
	libOpenCL.so.1 => not found
	libcudart.so.11.0 => not found
	libnvidia-ml.so.1 => not found
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f380bd64000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f380bb46000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f380b93e000)
	libm.so.6 => /lib64/libm.so.6 (0x00007f380b5fe000)
	libutil.so.1 => /lib64/libutil.so.1 (0x00007f380b3fb000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f380b050000)
	libOpenCL.so.1 => not found
	libcudart.so.11.0 => not found
	libnvidia-ml.so.1 => not found
	libOpenCL.so.1 => not found
	libcudart.so.11.0 => not found
	libnvidia-ml.so.1 => not found
	libOpenCL.so.1 => not found
	libcudart.so.11.0 => not found
	libnvidia-ml.so.1 => not found
	libcap.so.2 => /lib64/libcap.so.2 (0x00007f380ae4b000)
	libdw.so.1 => /lib64/libdw.so.1 (0x00007f380abfa000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f380a9e4000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f380d3b2000)
	libattr.so.1 => /lib64/libattr.so.1 (0x00007f380a7df000)
	libelf.so.1 => /lib64/libelf.so.1 (0x00007f380a5c7000)
	libz.so.1 => /lib64/libz.so.1 (0x00007f380a3b2000)
	liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f380a18c000)
	libbz2.so.1 => /lib64/libbz2.so.1 (0x00007f3809f7c000)

These cuda libraries are actually installed in /usr/local/cuda/lib64:

[ec2-user@ip-172-31-49-61 ~]$ ls /usr/local/cuda/lib64
libOpenCL.so                  libcusolver.so              libnppim.so.11
libOpenCL.so.1                libcusolver.so.11           libnppim.so.11.1.1.269
libOpenCL.so.1.0              libcusolver.so.11.0.0.74    libnppim_static.a
libOpenCL.so.1.0.0            libcusolverMg.so            libnppist.so
....

We traced this issue back to commit https://github.com/open-mpi/ompi/commit/60e82dd7cbdb6a7b36aa6f9e5543afa4cf0502ee, which bumps hwloc to v2.7.

Before this bump, no CUDA dependency was introduced into Open MPI executables unless we built Open MPI with --with-cuda.

After an offline discussion with @bwbarrett, I understand this is not intended behavior, so I am reporting the issue here.
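For reference, one quick way to confirm that the CUDA dependency comes in through libhwloc rather than through Open MPI's own libraries is to check the DT_NEEDED entries (a sketch, assuming readelf from binutils is available and using the install prefix above):

# If the diagnosis is right, libhwloc should list the NVIDIA libraries as direct dependencies...
readelf -d /home/ec2-user/ompi/install/lib/libhwloc.so.15 | grep NEEDED
# ...and can be compared with libopen-pal, which is not expected to mention them directly.
readelf -d /home/ec2-user/ompi/install/lib/libopen-pal.so.0 | grep NEEDED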

shijin-aws avatar Jan 21 '22 20:01 shijin-aws

The tools are picking up their dependency on libcuda from hwloc. hwloc isn't a component in master/5.0, so this is a general library-linking problem, not the DSO/no-DSO problem. hwloc, unlike OMPI and Libfabric, doesn't dlopen libcuda, hence the inherited dependency.

bwbarrett avatar Jan 21 '22 20:01 bwbarrett

I tried to build Open MPI with --disable-cuda. I can see it's an option in hwloc's configure, but using this option still does not fix the issue.

shijin-aws avatar Jan 21 '22 23:01 shijin-aws

We had to add the following configure flags to make it work for us: --disable-cuda --disable-nvml --with-cuda=no. Open MPI's configure will complain about some of them, but will pass them down to hwloc, which picks out the ones it knows and disables the autolinking.
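Concretely, combined with the configure line from the original report, that would look something like this (a sketch; the prefix and --disable-man-pages are just carried over from above):

./configure --prefix=/home/ec2-user/ompi/install --disable-man-pages \
    --disable-cuda --disable-nvml --with-cuda=no
make -j install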

jjhursey avatar Jan 24 '22 22:01 jjhursey

The issue likely comes from hwloc >= 2.5 doing a better job of autodetecting CUDA. --with-cuda= is used to find NVIDIA libraries, including CUDA, OpenCL and NVML; that's why --with-cuda=no helps, but --disable-opencl would likely work too.

bgoglin avatar Jan 25 '22 11:01 bgoglin

hwloc, unlike OMPI and Libfabric, doesn't dlopen libcuda, hence the inherited dependency.

hwloc can dlopen its own plugins to avoid such dependencies (configure with --enable-plugins or --enable-plugins=cuda,nvml,opencl,...) but we don't do it by default, and I seem to remember that two levels of plugins (hwloc plugins loaded by the OMPI hwloc plugin) would cause some problems (with namespaces?).
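For a standalone/external hwloc build, that would look roughly like this (a sketch; the /opt/hwloc prefix is hypothetical):

# Build hwloc's CUDA/NVML/OpenCL support as dlopen'ed plugins instead of
# direct library dependencies.
./configure --prefix=/opt/hwloc --enable-plugins=cuda,nvml,opencl
make -j install
# The tools should then no longer link the NVIDIA libraries directly:
ldd /opt/hwloc/bin/lstopo | grep -E 'OpenCL|cudart|nvidia-ml'   # expect no output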

bgoglin avatar Jan 25 '22 11:01 bgoglin

In OMPI 5.0, hwloc is no longer loaded as a plugin, so we should be ok with --enable-plugins as a solution.

bwbarrett avatar Jan 25 '22 15:01 bwbarrett

I'm afraid this is proving to be a fundamental problem - we may wind up having to add a configure check in PMIx and PRRTE to disallow hwloc versions greater than 2.4. The problem is that we now fail if the system hwloc is built with the NVIDIA libraries. Remember, neither PMIx nor PRRTE carries its own version of HWLOC - we rely on the provided version, which is typically just what the system has installed. Neither PMIx nor PRRTE invokes HWLOC from a plugin.

I'm not sure what I can do about PMIx/PRRTE releases already in the wild - I imagine we will just have to deal with the complaints as they come in, recommending that people downgrade their HWLOC install.

This change in hwloc also breaks OMPI builds that use an external HWLOC exhibiting this behavior. I'm not sure where we go from here.

rhc54 avatar Jan 28 '22 07:01 rhc54

If people use an external hwloc built with CUDA in /usr/local/cuda without LD_LIBRARY_PATH pointing to /usr/local/cuda/lib64, then lstopo will fail with the same error. It's a bug in their hwloc installation; they must put /usr/local/cuda/lib64 in LD_LIBRARY_PATH or ld.so.conf.

I don't know what happened on the EC2 machine, but on CentOS, CUDA installs a /etc/ld.so.conf.d/cuda... file that points to a directory with all required NVIDIA libs, and everything works fine.
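For illustration, either of the following makes such an installation work (a sketch; /usr/local/cuda/lib64 is the path from this report, and the cuda.conf file name is just an example):

# Option 1: per-user/per-session, e.g. in a module file or shell profile.
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

# Option 2: system-wide, the way the CUDA packages do it on CentOS.
echo /usr/local/cuda/lib64 | sudo tee /etc/ld.so.conf.d/cuda.conf
sudo ldconfig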

For the internal hwloc, plugins are designed exactly for this.

bgoglin avatar Jan 30 '22 09:01 bgoglin

Err...I honestly don't believe that is what you want, for two reasons.

First, you've gone to all that trouble to write configure code that checks the CUDA library for its version, features, etc., setting controls to direct what you build inside HWLOC. You then throw all that away to accept an arbitrary version in someone's library path? You have no idea what version that is, whether your controls are still correct for that version, etc.

I'm told by others that having multiple CUDA versions on a machine is "a really bad thing", so perhaps you won't encounter this scenario in the real world. However, it is a somewhat odd thing to do.

Second, and more concerning, is that you are asking third-party libraries and users to play "guess where CUDA was installed". Users don't know or care where CUDA was installed - they simply assume that the library they are using (MPI or whatever) was smart enough to include it. If they get an error indicating that library was built with CUDA support but has no idea where it is located, they will just file a bug report with that library - they won't know that they have to search their system for a CUDA library.

Using plugins to shield from that has its own problem, and it's a concerning one: HWLOC will automatically reject the plugin when it fails to find the required library. This means that a user, who could have reasonably expected optimizations based on CUDA information, unknowingly doesn't get them. There is no warning - your application just fails to get the supporting information and has to "do without".

I therefore believe what you actually want to do is to "rpath" the CUDA library so its location is fixed within HWLOC. This resolves the above problems while still allowing the CUDA library to be dynamically loaded (which is good as the sys admin will occasionally update it). The linker will still ensure that the .so version matches what you built against - not a perfect protection to the version problem, but as good as you can do (and should be good enough as long as the CUDA library doesn't violate libtool version rules), and users no longer have to search for a CUDA library to use HWLOC.
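As a rough illustration of what I mean, the rpath could be passed through the linker flags when building hwloc (a sketch, not a tested recipe; it assumes CUDA lives in /usr/local/cuda as in this report):

# Embed the CUDA library directory in the rpath of hwloc's libraries and tools,
# so the loader can resolve libcudart/libnvidia-ml/libOpenCL without
# LD_LIBRARY_PATH or ld.so.conf changes.
./configure --with-cuda=/usr/local/cuda \
    LDFLAGS="-Wl,-rpath,/usr/local/cuda/lib64"
make -j install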

rhc54 avatar Jan 30 '22 15:01 rhc54

We're not asking third parties to guess where CUDA is; we're asking the person who installed a custom hwloc to set things up properly (export LD_LIBRARY_PATH in a "module", update ld.so.conf, ...) so that third parties get the info automatically if they use that hwloc. That's what all the spack/guix/module tools do all the time, and it works fine. Just running lstopo shows whether something is missing.

That said, I am fine with adding rpath (if somebody knows how to do it properly). I don't know why we didn't do it earlier; maybe @jsquyres remembers, but it was ~10-12 years ago.

By the way, having multiple CUDA versions installed at the same time is actually quite common because some apps like tensorflow require specific (old) versions while some HPC runtimes want a recent one for new features :/

bgoglin avatar Jan 30 '22 16:01 bgoglin

I guess I'll just leave it as "I disagree". To me, the point of "module" is to allow multiple versions of the same library to coexist in an environment so that users can easily switch between them. It is specifically not to enable a particular instance of a library to operate. If I install a library using a configure line that tells it exactly where a required dependency exists, I shouldn't have to then set my environment to allow that library to find the dependency I just told it about.

To me, that makes no logical sense. Likewise, plugins are there for the case where I build the software in an environment where the library exists, and then move that software to an environment where the library does not exist. Plugins are not a good mechanism (IMO) for hiding missing linkages on the same machine where the library was built. That just feels like a bad install to me.

But I'll leave it to you folks to resolve. Right now, it appears that many OMPI CI installations are busted (even though they are building with CUDA) due to this problem if they use HWLOC >= 2.5.0. Given that our people are pretty savvy admins, that seems ominous to me.

rhc54 avatar Jan 30 '22 18:01 rhc54

For reference, @bgoglin filed an issue to track this over in hwloc. I've been able to reproduce with a container with CUDA libraries and posted some details there.

  • https://github.com/open-mpi/hwloc/issues/515

I plan to take a look at the configure logic to see if I can improve it for CUDA. If that pans out, then we can try to apply it to the OpenCL logic (and to others, if needed).

jjhursey avatar Jan 31 '22 17:01 jjhursey

Tested the --enable-plugin option introduced in https://github.com/open-mpi/ompi/pull/9921. Verified it fixes the issue we had.

shijin-aws avatar Feb 03 '22 19:02 shijin-aws

@bwbarrett Do you think this is fixed, other than #9933?

gpaulsen avatar Mar 03 '22 15:03 gpaulsen

Yes and no. We've fixed all the issues for mpicc, although we haven't fixed the bigger issue for prterun. But I think you can remove the blocker tag for 5.0, if that's what you're getting at.

bwbarrett avatar Mar 03 '22 15:03 bwbarrett