[RESEARCH] finding libnvvm.so
Creating this as a place for myself to collect information related to finding libnvvm.so
Note that libnvvm.so is unusual: while most CTK .so files live under <loc>/lib or <loc>/lib64, it lives under <loc>/nvvm/lib64.
CTK 12.8.0 runfile installed into a scratch directory:
$ ./cuda_12.8.0_570.86.10_linux.run --toolkit --toolkitpath=$HOME/ctk_downloads/scratch
$ cd scratch
$ find . -name '*.so' | grep -v -e '^\./extras/' -e '^\./nsight-' -e '^\./compute-sanitizer/' -e '/eclipse_[0-9]*\.so$'
./nvvm/lib64/libnvvm.so
./targets/x86_64-linux/lib/libnvblas.so
./targets/x86_64-linux/lib/libcusolver.so
./targets/x86_64-linux/lib/libnppim.so
./targets/x86_64-linux/lib/libnpps.so
./targets/x86_64-linux/lib/libcudart.so
./targets/x86_64-linux/lib/libcufft.so
./targets/x86_64-linux/lib/libnvJitLink.so
./targets/x86_64-linux/lib/libcusolverMg.so
./targets/x86_64-linux/lib/libcuinj64.so
./targets/x86_64-linux/lib/libnvfatbin.so
./targets/x86_64-linux/lib/stubs/libcusolver.so
./targets/x86_64-linux/lib/stubs/libnvidia-ml.so
./targets/x86_64-linux/lib/stubs/libnppim.so
./targets/x86_64-linux/lib/stubs/libnpps.so
./targets/x86_64-linux/lib/stubs/libcufft.so
./targets/x86_64-linux/lib/stubs/libnvJitLink.so
./targets/x86_64-linux/lib/stubs/libcusolverMg.so
./targets/x86_64-linux/lib/stubs/libnvfatbin.so
./targets/x86_64-linux/lib/stubs/libnppif.so
./targets/x86_64-linux/lib/stubs/libcublas.so
./targets/x86_64-linux/lib/stubs/libnppial.so
./targets/x86_64-linux/lib/stubs/libcurand.so
./targets/x86_64-linux/lib/stubs/libcublasLt.so
./targets/x86_64-linux/lib/stubs/libnppidei.so
./targets/x86_64-linux/lib/stubs/libnvjpeg.so
./targets/x86_64-linux/lib/stubs/libcuda.so
./targets/x86_64-linux/lib/stubs/libnvrtc.alt.so
./targets/x86_64-linux/lib/stubs/libnppig.so
./targets/x86_64-linux/lib/stubs/libcusparse.so
./targets/x86_64-linux/lib/stubs/libnppisu.so
./targets/x86_64-linux/lib/stubs/libnppicc.so
./targets/x86_64-linux/lib/stubs/libnppitc.so
./targets/x86_64-linux/lib/stubs/libnppist.so
./targets/x86_64-linux/lib/stubs/libcufftw.so
./targets/x86_64-linux/lib/stubs/libnvrtc.so
./targets/x86_64-linux/lib/stubs/libnppc.so
./targets/x86_64-linux/lib/libnppif.so
./targets/x86_64-linux/lib/libcublas.so
./targets/x86_64-linux/lib/libcufile_rdma.so
./targets/x86_64-linux/lib/libcufile.so
./targets/x86_64-linux/lib/libnppial.so
./targets/x86_64-linux/lib/libaccinj64.so
./targets/x86_64-linux/lib/libcurand.so
./targets/x86_64-linux/lib/libcublasLt.so
./targets/x86_64-linux/lib/libnppidei.so
./targets/x86_64-linux/lib/libnvrtc-builtins.alt.so
./targets/x86_64-linux/lib/libnvToolsExt.so
./targets/x86_64-linux/lib/libOpenCL.so
./targets/x86_64-linux/lib/libnvjpeg.so
./targets/x86_64-linux/lib/libnvrtc.alt.so
./targets/x86_64-linux/lib/libnppig.so
./targets/x86_64-linux/lib/libcusparse.so
./targets/x86_64-linux/lib/libnvrtc-builtins.so
./targets/x86_64-linux/lib/libnppisu.so
./targets/x86_64-linux/lib/libnppicc.so
./targets/x86_64-linux/lib/libnppitc.so
./targets/x86_64-linux/lib/libnppist.so
./targets/x86_64-linux/lib/libcufftw.so
./targets/x86_64-linux/lib/libnvrtc.so
./targets/x86_64-linux/lib/libnppc.so
Inside a cccl [Dev Container: cuda12.8-gcc13 @ ... ]
$ cd /usr/local/cuda
$ find . -name '*.so' | grep -v -e '^\./compute-sanitizer/'
./nvvm/lib64/libnvvm.so
./targets/x86_64-linux/lib/libnvperf_target.so
./targets/x86_64-linux/lib/libcupti.so
./targets/x86_64-linux/lib/libcudart.so
./targets/x86_64-linux/lib/libnvJitLink.so
./targets/x86_64-linux/lib/libcuinj64.so
./targets/x86_64-linux/lib/stubs/libnvidia-ml.so
./targets/x86_64-linux/lib/stubs/libnvJitLink.so
./targets/x86_64-linux/lib/stubs/libcurand.so
./targets/x86_64-linux/lib/stubs/libcuda.so
./targets/x86_64-linux/lib/stubs/libnvrtc.alt.so
./targets/x86_64-linux/lib/stubs/libnvrtc.so
./targets/x86_64-linux/lib/libpcsamplingutil.so
./targets/x86_64-linux/lib/libaccinj64.so
./targets/x86_64-linux/lib/libcurand.so
./targets/x86_64-linux/lib/libnvperf_host.so
./targets/x86_64-linux/lib/libnvrtc-builtins.alt.so
./targets/x86_64-linux/lib/libnvToolsExt.so
./targets/x86_64-linux/lib/libnvrtc.alt.so
./targets/x86_64-linux/lib/libnvrtc-builtins.so
./targets/x86_64-linux/lib/libcheckpoint.so
./targets/x86_64-linux/lib/libnvrtc.so
Ubuntu 24.04 workstation with CTK 12.6.1 installed into /usr/local
$ cd /usr/local/cuda
$ find . -name '*.so' | grep -v -e '^\./compute-sanitizer/' -e '/eclipse_[0-9]*\.so$'
./nvvm/lib64/libnvvm.so
./targets/x86_64-linux/lib/libnvperf_target.so
./targets/x86_64-linux/lib/libcupti.so
./targets/x86_64-linux/lib/libnvblas.so
./targets/x86_64-linux/lib/libcusolver.so
./targets/x86_64-linux/lib/libnppim.so
./targets/x86_64-linux/lib/libnpps.so
./targets/x86_64-linux/lib/libcudart.so
./targets/x86_64-linux/lib/libcufft.so
./targets/x86_64-linux/lib/libnvJitLink.so
./targets/x86_64-linux/lib/libcusolverMg.so
./targets/x86_64-linux/lib/libcuinj64.so
./targets/x86_64-linux/lib/libnvfatbin.so
./targets/x86_64-linux/lib/stubs/libcusolver.so
./targets/x86_64-linux/lib/stubs/libnvidia-ml.so
./targets/x86_64-linux/lib/stubs/libnppim.so
./targets/x86_64-linux/lib/stubs/libnpps.so
./targets/x86_64-linux/lib/stubs/libcufft.so
./targets/x86_64-linux/lib/stubs/libnvJitLink.so
./targets/x86_64-linux/lib/stubs/libcusolverMg.so
./targets/x86_64-linux/lib/stubs/libnvfatbin.so
./targets/x86_64-linux/lib/stubs/libnppif.so
./targets/x86_64-linux/lib/stubs/libcublas.so
./targets/x86_64-linux/lib/stubs/libnppial.so
./targets/x86_64-linux/lib/stubs/libcurand.so
./targets/x86_64-linux/lib/stubs/libcublasLt.so
./targets/x86_64-linux/lib/stubs/libnppidei.so
./targets/x86_64-linux/lib/stubs/libnvjpeg.so
./targets/x86_64-linux/lib/stubs/libcuda.so
./targets/x86_64-linux/lib/stubs/libnppig.so
./targets/x86_64-linux/lib/stubs/libcusparse.so
./targets/x86_64-linux/lib/stubs/libnppisu.so
./targets/x86_64-linux/lib/stubs/libnppicc.so
./targets/x86_64-linux/lib/stubs/libnppitc.so
./targets/x86_64-linux/lib/stubs/libnppist.so
./targets/x86_64-linux/lib/stubs/libcufftw.so
./targets/x86_64-linux/lib/stubs/libnvrtc.so
./targets/x86_64-linux/lib/stubs/libnppc.so
./targets/x86_64-linux/lib/libnppif.so
./targets/x86_64-linux/lib/libcublas.so
./targets/x86_64-linux/lib/libcufile_rdma.so
./targets/x86_64-linux/lib/libcufile.so
./targets/x86_64-linux/lib/libnppial.so
./targets/x86_64-linux/lib/libpcsamplingutil.so
./targets/x86_64-linux/lib/libaccinj64.so
./targets/x86_64-linux/lib/libcurand.so
./targets/x86_64-linux/lib/libcublasLt.so
./targets/x86_64-linux/lib/libnvperf_host.so
./targets/x86_64-linux/lib/libnppidei.so
./targets/x86_64-linux/lib/libnvToolsExt.so
./targets/x86_64-linux/lib/libOpenCL.so
./targets/x86_64-linux/lib/libnvjpeg.so
./targets/x86_64-linux/lib/libnppig.so
./targets/x86_64-linux/lib/libcusparse.so
./targets/x86_64-linux/lib/libnvrtc-builtins.so
./targets/x86_64-linux/lib/libnppisu.so
./targets/x86_64-linux/lib/libnppicc.so
./targets/x86_64-linux/lib/libnppitc.so
./targets/x86_64-linux/lib/libnppist.so
./targets/x86_64-linux/lib/libcufftw.so
./targets/x86_64-linux/lib/libcheckpoint.so
./targets/x86_64-linux/lib/libnvrtc.so
./targets/x86_64-linux/lib/libnppc.so
Using a venv (named scratchenv here):
$ pip install cuda-bindings
Collecting cuda-bindings
Using cached cuda_bindings-12.8.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (13 kB)
Using cached cuda_bindings-12.8.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.2 MB)
Installing collected packages: cuda-bindings
Successfully installed cuda-bindings-12.8.0
$ pip install nvidia-cuda-nvcc-cu12
Collecting nvidia-cuda-nvcc-cu12
Using cached nvidia_cuda_nvcc_cu12-12.8.61-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl.metadata (1.7 kB)
Using cached nvidia_cuda_nvcc_cu12-12.8.61-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (40.1 MB)
Installing collected packages: nvidia-cuda-nvcc-cu12
Successfully installed nvidia-cuda-nvcc-cu12-12.8.61
$ find scratchenv -name '*.so*'
scratchenv/lib/python3.12/site-packages/cuda/ccudart.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/cudart.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/nvrtc.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/cnvrtc.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/cuda.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/bindings/_internal/nvjitlink.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/bindings/_internal/utils.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/bindings/cydriver.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/bindings/nvrtc.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/bindings/driver.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/bindings/cyruntime.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/bindings/nvjitlink.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/bindings/cynvjitlink.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/bindings/_bindings/cydriver.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/bindings/_bindings/cynvrtc.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/bindings/_lib/cyruntime/cyruntime.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/bindings/_lib/cyruntime/utils.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/bindings/_lib/utils.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/bindings/cynvrtc.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/bindings/runtime.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/cuda/ccuda.cpython-312-x86_64-linux-gnu.so
scratchenv/lib/python3.12/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
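For orientation: starting from the `cuda/bindings/_internal/` extension modules above, `../../../nvidia/cuda_nvcc/nvvm/lib64` lands inside the nvidia-cuda-nvcc-cu12 wheel, which is exactly the relationship the `$ORIGIN` rpath addition discussed below encodes. A minimal sketch to check this, assuming both wheels are installed in the active environment:

```python
import importlib.util
import pathlib

# Directory holding the cuda.bindings._internal extension modules
# (i.e. $ORIGIN for their rpath entries).
spec = importlib.util.find_spec("cuda.bindings._internal")
origin = pathlib.Path(list(spec.submodule_search_locations)[0])

# $ORIGIN/../../../nvidia/cuda_nvcc/nvvm/lib64 resolves into the
# nvidia-cuda-nvcc-cu12 wheel.
nvvm_dir = (origin / "../../../nvidia/cuda_nvcc/nvvm/lib64").resolve()

print(nvvm_dir)
print(sorted(p.name for p in nvvm_dir.glob("libnvvm.so*")))
```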
When using conda (miniforge), this directory appears to always exist after installing cuda-nvcc:
$CONDA_PREFIX/nvvm/lib64/
With these files:
libnvvm.so -> libnvvm.so.4.0.0
libnvvm.so.4 -> libnvvm.so.4.0.0
62M libnvvm.so.4.0.0
I tried:
- `conda install cuda-version=12.8 cuda-nvcc` into the `base` environment.
- `conda create --yes -n scratchenv python=3.12 cuda-version=12.8 cuda-nvcc`
Additionally, $CONDA_PREFIX/lib/python3.12/site-packages appears to always be included in sys.path.
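Based on those two observations, a conda-aware lookup could look roughly like this. This is only a sketch; `find_conda_nvvm` is a made-up helper name, and the `site-packages` walk-up assumes the standard `<prefix>/lib/pythonX.Y/site-packages` layout:

```python
import os
import pathlib
import sys


def find_conda_nvvm():
    """Look for <prefix>/nvvm/lib64/libnvvm.so* in a conda environment."""
    candidates = []
    # Direct route: the activated environment prefix, if any.
    conda_prefix = os.environ.get("CONDA_PREFIX")
    if conda_prefix:
        candidates.append(pathlib.Path(conda_prefix) / "nvvm" / "lib64")
    # Indirect route: walk up from site-packages entries on sys.path
    # (works even without CONDA_PREFIX being set).
    for entry in sys.path:
        p = pathlib.Path(entry)
        if p.name == "site-packages":
            candidates.append(p.parent.parent.parent / "nvvm" / "lib64")
    for cand in candidates:
        hits = sorted(cand.glob("libnvvm.so*"))
        if hits:
            return hits[0]
    return None


print(find_conda_nvvm())
```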
For the record, I also experimented with this script:
https://github.com/rwgk/stuff/blob/2398710e6be12f65c044b91c8cb105093c720f43/cuda-python/cuda_bindings_nvvm_proc_self_maps.py
Example:
Into a scratch venv:
pip install cuda_bindings-12.8.0-cp312-cp312-linux_x86_64.whl
Example output WITHOUT setting LD_LIBRARY_PATH:
label: after import cuda
NO DIFF
label: after import cuda.bindings
NO DIFF
label: after import cuda.bindings.nvvm
--- before
+++ after
@@ -1,8 +1,15 @@
+/home/rgrossekunst/scratchenv/lib/python3.12/site-packages/cuda/bindings/_internal/nvvm.cpython-312-x86_64-linux-gnu.so
+/home/rgrossekunst/scratchenv/lib/python3.12/site-packages/cuda/bindings/_internal/utils.cpython-312-x86_64-linux-gnu.so
+/home/rgrossekunst/scratchenv/lib/python3.12/site-packages/cuda/bindings/cynvvm.cpython-312-x86_64-linux-gnu.so
+/home/rgrossekunst/scratchenv/lib/python3.12/site-packages/cuda/bindings/nvvm.cpython-312-x86_64-linux-gnu.so
/usr/bin/python3.12
/usr/lib/locale/locale-archive
/usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache
/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
/usr/lib/x86_64-linux-gnu/libc.so.6
/usr/lib/x86_64-linux-gnu/libexpat.so.1.9.1
+/usr/lib/x86_64-linux-gnu/libgcc_s.so.1
/usr/lib/x86_64-linux-gnu/libm.so.6
+/usr/lib/x86_64-linux-gnu/libpthread.so.0
+/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.33
/usr/lib/x86_64-linux-gnu/libz.so.1.3
label: after cuda.bindings.nvvm.version() failure
--- before
+++ after
@@ -7,9 +7,12 @@
/usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache
/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
/usr/lib/x86_64-linux-gnu/libc.so.6
+/usr/lib/x86_64-linux-gnu/libcuda.so.560.35.03
+/usr/lib/x86_64-linux-gnu/libdl.so.2
/usr/lib/x86_64-linux-gnu/libexpat.so.1.9.1
/usr/lib/x86_64-linux-gnu/libgcc_s.so.1
/usr/lib/x86_64-linux-gnu/libm.so.6
/usr/lib/x86_64-linux-gnu/libpthread.so.0
+/usr/lib/x86_64-linux-gnu/librt.so.1
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.33
/usr/lib/x86_64-linux-gnu/libz.so.1.3
After `export LD_LIBRARY_PATH=/usr/local/cuda/nvvm/lib64` the output changes:
--- /home/rgrossekunst/z/vfailure 2025-02-11 15:53:15.693518916 -0800
+++ /home/rgrossekunst/z/vsuccess 2025-02-11 15:53:05.281172240 -0800
@@ -29,10 +29,10 @@
/usr/lib/x86_64-linux-gnu/libz.so.1.3
-label: after cuda.bindings.nvvm.version() failure
+label: after cuda.bindings.nvvm.version() success
--- before
+++ after
-@@ -7,9 +7,12 @@
+@@ -7,9 +7,13 @@
/usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache
/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
/usr/lib/x86_64-linux-gnu/libc.so.6
@@ -45,5 +45,6 @@
+/usr/lib/x86_64-linux-gnu/librt.so.1
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.33
/usr/lib/x86_64-linux-gnu/libz.so.1.3
++/usr/local/cuda-12.6/nvvm/lib64/libnvvm.so.4.0.0
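For reference, the linked script boils down to this technique (a simplified sketch, not the actual script): snapshot the file-backed entries in `/proc/self/maps` before and after an action, then diff to see which shared objects were mapped in:

```python
import difflib


def mapped_files():
    """Return the sorted set of file paths currently mapped into this process."""
    paths = set()
    with open("/proc/self/maps") as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 6 and parts[5].startswith("/"):
                paths.add(parts[5])
    return sorted(paths)


before = mapped_files()
import cuda.bindings.nvvm  # the import under investigation
after = mapped_files()

print("\n".join(difflib.unified_diff(before, after, "before", "after", lineterm="")))
```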
Possible approaches for locating libnvvm.so:
- Rely on the existing `$ORIGIN/../../../nvidia/cuda_nvcc/nvvm/lib64` rpath addition to discover `site-packages/nvidia/cuda_nvcc/nvvm/lib64`.
- Traverse `sys.path`, look for `lib` subdirectories, and look for `nvvm/lib64` in their parent directories (see the sketch below).
  - This will work for conda-based installations with a separate venv.
  - We wouldn't need the `$ORIGIN/../../../../../../nvvm/lib64` rpath addition anymore, although it would be fine to keep it.
  - Note that this is similar to looking for `$CONDA_PREFIX/nvvm/lib64`, but without requiring that environment variable.
- Look for `$CUDA_HOME/nvvm/lib64` or `$CUDA_PATH/nvvm/lib64` as the last resort, e.g. to find `/usr/local/cuda/nvvm/lib64`.
In theory we could look for /usr/local/cuda/nvvm/lib64 as the very last resort, but is that wise?
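A sketch of the `sys.path` traversal idea (second bullet above); the function name is made up for illustration:

```python
import pathlib
import sys


def nvvm_dirs_from_sys_path():
    """Yield candidate nvvm/lib64 directories derived from sys.path entries."""
    seen = set()
    for entry in sys.path:
        p = pathlib.Path(entry or ".").resolve()
        # Walk up the entry; for every lib/lib64 component, check whether its
        # parent directory also contains nvvm/lib64 (the conda/CTK layout).
        for parent in [p, *p.parents]:
            if parent.name in ("lib", "lib64"):
                cand = parent.parent / "nvvm" / "lib64"
                if cand not in seen and any(cand.glob("libnvvm.so*")):
                    seen.add(cand)
                    yield cand


print(list(nvvm_dirs_from_sys_path()))
```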
@rwgk could probably also learn from how Numba finds nvvm today: https://github.com/NVIDIA/numba-cuda/blob/bf487d78a40eea87f009d636882a5000a7524c95/numba_cuda/numba/cuda/cudadrv/libs.py#L54
Some ChatGPT queries, for more datapoints:
- https://chatgpt.com/share/67d03b9a-07c4-8008-bad7-5fcc777cf37e
For easy reference, copy-pasting summaries:
General:
- Use `cuda-toolkit-config` if available (it's direct and robust).
- Fallback to `which nvcc` or `CUDA_HOME` for general compatibility.
- For CMake-based projects, rely on `FindCUDA`.
- Handle Conda-based installations separately.
Python-based:
- `numba.cuda.runtime.get_libdevice()` – if numba is installed, this directly returns the path to libnvvm.
- `cupy.get_cuda_path()` – cleanest if you're already using CuPy.
- `torch.utils.cpp_extension.CUDA_HOME` – if PyTorch is in use.
- Fallback to `os.getenv()` – safest for broad compatibility.
site-packages/nvidia:
- For direct lookup → use `importlib` or `pkg_resources`
- For flexible discovery → use `sys.path` + `glob`
- For environment-based lookup → use `CUDA_PATH`
> - Use `cuda-toolkit-config` if available (it's direct and robust).
I assume this is referring to the apt package `cuda-toolkit-config-common`, which isn't an actually usable tool as far as I'm aware. There are too many different OS package managers; interacting directly with them is probably a bad idea.
> - Fallback to `which nvcc` or `CUDA_HOME` for general compatibility.
`CUDA_HOME` isn't an officially supported environment variable; it's just something that multiple OSS projects have adopted as a "failsafe" for when they can't find things in expected locations or a user has a non-standard install location. `which nvcc` will not work in situations where the user doesn't have `nvcc` installed, i.e. if they're using wheel or conda packages.
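For completeness, this is roughly what that fallback looks like in code. It is only a sketch, with the caveats above in mind: `CUDA_HOME`/`CUDA_PATH` are de-facto conventions, and `nvcc` may simply not be installed:

```python
import os
import pathlib
import shutil


def nvvm_dir_from_env_or_nvcc():
    """Best-effort guess at <toolkit root>/nvvm/lib64 via env vars or nvcc on PATH."""
    roots = []
    for var in ("CUDA_HOME", "CUDA_PATH"):
        value = os.environ.get(var)
        if value:
            roots.append(pathlib.Path(value))
    nvcc = shutil.which("nvcc")
    if nvcc:
        # nvcc usually lives in <root>/bin/nvcc.
        roots.append(pathlib.Path(nvcc).resolve().parent.parent)
    for root in roots:
        cand = root / "nvvm" / "lib64"
        if any(cand.glob("libnvvm.so*")):
            return cand
    return None


print(nvvm_dir_from_env_or_nvcc())
```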
> - For CMake-based projects, rely on `FindCUDA`.
We can't rely on CMake being installed, and `FindCUDA` is deprecated; `FindCUDAToolkit` is the more up-to-date version, but it still doesn't support finding libraries in wheel packages and doesn't work if an nvcc package isn't installed.
> - Handle Conda-based installations separately.
This is an option, but we need to handle more than Conda. We need to handle OS package managers putting things in standard locations found by the loader / linker, as well as the situation where users have installed the CUDA toolkit into a manual location and have set things like `LD_LIBRARY_PATH` and `PATH` appropriately.
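One way to cover both "standard locations known to the loader/linker" and "user set `LD_LIBRARY_PATH` appropriately" is to simply ask the dynamic loader. A sketch, assuming the `libnvvm.so.4` SONAME seen in the CTK 12.x layouts above:

```python
import ctypes

try:
    # dlopen() with a bare SONAME searches LD_LIBRARY_PATH, the ldconfig
    # cache, and the default system directories in the usual loader order.
    libnvvm = ctypes.CDLL("libnvvm.so.4")
except OSError as exc:
    print(f"loader could not find libnvvm: {exc}")
else:
    major, minor = ctypes.c_int(), ctypes.c_int()
    libnvvm.nvvmVersion(ctypes.byref(major), ctypes.byref(minor))
    print(f"libnvvm found via the loader, NVVM version {major.value}.{minor.value}")
```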
Python-based:

> - `numba.cuda.runtime.get_libdevice()` – if numba is installed, this directly returns the path to libnvvm.
We intend to move `numba.cuda` to depend on `cuda.core` for its internal usage of CUDA libraries, so this is a non-starter.
> - `cupy.get_cuda_path()` – cleanest if you're already using CuPy.
> - `torch.utils.cpp_extension.CUDA_HOME` – if PyTorch is in use.
We hope to have cupy and pytorch adopt and depend on `cuda.core` in the future, so these are non-starters.
> - Fallback to `os.getenv()` – safest for broad compatibility.
There isn't any officially supported environment variable to indicate the CUDA toolkit location, and different package managers can have different layouts for the various libraries of the CUDA toolkit.
site-packages/nvidia:

> - For direct lookup → use `importlib` or `pkg_resources`
There isn't really a Python package within the current CUDA wheels. In theory there's a `site-packages/nvidia` folder that you can import as a no-op, and then you can check `nvidia.__dir__()` for a `cuda_nvcc` entry (the cuda-nvcc package is what contains the nvvm DSO).
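A sketch of that idea: the import machinery is only used to locate the `nvidia` namespace directory, and the rest is plain directory probing (the helper name is made up):

```python
import importlib.util
import pathlib


def nvvm_dir_from_nvidia_wheels():
    """Probe site-packages/nvidia/cuda_nvcc/nvvm/lib64 via the nvidia namespace package."""
    spec = importlib.util.find_spec("nvidia")
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        cand = pathlib.Path(location) / "cuda_nvcc" / "nvvm" / "lib64"
        if any(cand.glob("libnvvm.so*")):
            return cand
    return None


print(nvvm_dir_from_nvidia_wheels())
```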
> - For flexible discovery → use `sys.path` + `glob`
If a user is using wheels, I don't think this helps us versus just importing the nvidia namespace package / directory as described above.
> - For environment-based lookup → use `CUDA_PATH`
`CUDA_PATH`, similar to `CUDA_HOME`, isn't an officially supported environment variable.
Thanks @kkraus14, that's very useful for me to better understand the environment.
> We intend to move `numba.cuda` to depend on `cuda.core` for its internal usage of CUDA libraries, so this is a non-starter.

> We hope to have cupy and pytorch adopt and depend on `cuda.core` in the future, so these are non-starters.
These are interesting to me, though, and I already looked pretty closely at the numba code, after you pointed me to it a month ago (your comment here from Feb 12, my #447 experiment). I figure, roughly, we need to provide a functional union of what numba.cuda, cupy, and pytorch do for discovering CUDA libraries and headers. I want to spend a small/modest amount of time to also look at the relevant code in cupy and pytorch.
Content moved here: https://github.com/NVIDIA/cuda-python/issues/451#issuecomment-2810937934