
openmm-cuda-12 doesn't find its nVidia dependencies on Windows

Open zjp opened this issue 6 months ago • 7 comments

Thank you from the ChimeraX team for your work putting OpenMM on PyPI. It has greatly simplified the process of packaging it with ChimeraX. I just wanted to drop a note on our experience getting CUDA working with the PyPI version and bring up an issue about CUDA dependencies.

@tristanic brought it to my attention a few weeks ago that he had to install the openmm-cuda-12 package and update CUDA on his system to get CUDA working as an OpenMM backend. When I went to install openmm-cuda-12, I noticed that it brought in several nVidia CUDA PyPI packages, so I held off on installing the CUDA Toolkit to see if OpenMM's CUDA backend would work with just those dependencies.

Unfortunately that did not work. Neither openmm nor openmm-cuda-12 adds the nVidia library directories to the DLL search path (os.add_dll_directory), which means that when I was testing it out on Windows, Python couldn't find the libraries. I did try to open a shell, add each individual nVidia library's directory under site-packages to my DLL search path, and then import OpenMM, but it didn't work. So in the end I also had to download and install the CUDA Toolkit to get it working.
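
For illustration, here is a minimal sketch of what registering those directories from Python could look like. It assumes the nVidia wheels unpack their DLLs under site-packages/nvidia/<component>/bin, which is an assumption about the wheel layout and not something confirmed in this thread. The helper names are hypothetical. As reported above, a manual attempt along these lines did not work, so treat this as a description of the mechanism, not a known fix.

```python
import os
import sys
from pathlib import Path


def find_nvidia_dll_dirs(site_packages):
    """Return directories under <site_packages>/nvidia that contain DLLs.

    The nvidia/<component>/bin layout is an assumption about the wheels.
    """
    root = Path(site_packages) / "nvidia"
    if not root.is_dir():
        return []
    dirs = {dll.parent for dll in root.glob("*/bin/*.dll")}
    return sorted(str(d) for d in dirs)


def register_dll_dirs(dirs):
    """Add each directory to the DLL search path; does nothing off Windows."""
    if sys.platform != "win32":
        return []
    # os.add_dll_directory exists only on Windows (Python 3.8+). Each
    # returned handle has a close() method that removes the directory again.
    return [os.add_dll_directory(d) for d in dirs]
```

A caller would run something like `register_dll_dirs(find_nvidia_dll_dirs(site.getsitepackages()[0]))` before `import openmm`; which site-packages directory to scan depends on the installation.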

My plan right now is to remove the extra nVidia libraries from our builds and advise Tristan to advise ISOLDE users to install the CUDA Toolkit instead.

zjp avatar Jul 07 '25 21:07 zjp

Is it possible you're missing the driver? When you install with pip it can automatically install the required runtime libraries, but it can't install the driver. And without it, you won't be able to load any of the other libraries.
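
One quick way to rule the driver in or out is to try loading the driver library directly. The sketch below assumes the usual driver library names (nvcuda.dll on Windows, libcuda.so.1 on Linux); the function name is hypothetical, and a successful load only shows that the library is present, not that the driver version is new enough.

```python
import ctypes
import sys


def cuda_driver_loadable():
    """Return True if the NVIDIA driver library can be loaded in this process."""
    # The driver library is installed with the GPU driver, not by pip.
    name = "nvcuda.dll" if sys.platform == "win32" else "libcuda.so.1"
    try:
        ctypes.CDLL(name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    print("driver library loadable:", cuda_driver_loadable())
```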

peastman avatar Jul 07 '25 23:07 peastman

While trying to troubleshoot I spent some time watching the Windows library loading process with Procmon (https://learn.microsoft.com/en-us/sysinternals/downloads/procmon) as I imported OpenMM for the first time. I'm afraid I never saw any hint of it trying any of the nvidia files in the Python tree, and I couldn't find anything in the nvidia-xxx wheel files that would add their directories to the library search path (in fact, they don't seem to contain any Python code at all). I think the missing piece of the puzzle is the cuda-bindings package (https://nvidia.github.io/cuda-python/cuda-bindings/latest/install.html): it looks like it's the container for all the actual Pythonic stuff (without digging too deeply, I'm guessing that importing it is designed to get everything linked up).

Anyway, using just the actual OpenMM libraries from the wheel files and otherwise relying on a system-wide CUDA toolkit installation works fine for the time being... which makes me wonder, given their size, if it's worth making the PyPI CUDA packages optional dependencies? If the CUDA plugin is installed but fails to load, you could print a warning message including the necessary pip install command so users can install the CUDA packages themselves.
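
A rough sketch of that warning idea follows. It assumes the failure messages are plain strings that mention "CUDA" somewhere (for example in the plugin file name), which is an assumption about the message format. The function name and the package list passed in are illustrative only, not part of OpenMM.

```python
def cuda_install_hint(failures, packages):
    """Build a warning if any plugin load failure mentions CUDA, else return None."""
    cuda_failures = [f for f in failures if "cuda" in f.lower()]
    if not cuda_failures:
        return None
    return (
        f"{len(cuda_failures)} CUDA plugin(s) failed to load. "
        "If the CUDA runtime libraries are missing, try:\n"
        f"    pip install {' '.join(packages)}"
    )


# With OpenMM installed, the failures could come from
# openmm.Platform.getPluginLoadFailures(); a literal list is used here.
hint = cuda_install_hint(
    ["Error loading library OpenMMCUDA.dll"],
    ["nvidia-cuda-runtime-cu12", "nvidia-cuda-nvrtc-cu12"],  # illustrative
)
if hint:
    print(hint)
```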

peastman left a comment (openmm/openmm#5000) https://github.com/openmm/openmm/issues/5000#issuecomment-3046870288

-- Altos Labs UK Limited | England | Company reg 13484917   Registered address: 3rd Floor 1 Ashley Road, Altrincham, Cheshire, United Kingdom, WA14 2DT

tristanic avatar Jul 08 '25 09:07 tristanic

Does this issue only happen on Windows? I've tested on Linux and everything works, but I don't have access to a Windows machine with an NVIDIA GPU.

Perhaps there's another package we need to install on Windows?

peastman avatar Jul 10 '25 17:07 peastman

Yeah - looks like the cuda-bindings library is what you need... while I haven't had the time to actually install and test, they have a fair bit of code dedicated to finding and linking their libraries (see https://github.com/NVIDIA/cuda-python/blob/main/cuda_bindings/cuda/bindings/_path_finder/find_nvidia_dynamic_library.py). Windows is a pain that way - as far as I know there's no way to record the paths to the libraries you want to link against at compile time the way you can on *nix, so they have to be either in the same directory as your compiled file or added to the Windows DLL search path. It looks like the cuda-bindings package will do that for you.

peastman left a comment (openmm/openmm#5000) https://github.com/openmm/openmm/issues/5000#issuecomment-3058292722

tristanic avatar Jul 14 '25 15:07 tristanic

If you install cuda-bindings, does that make it work? Or is that logic just what it uses to find libraries for itself, without affecting other packages? cuda-bindings provides Python wrappers for CUDA, which we don't use.

peastman avatar Jul 14 '25 23:07 peastman

So as I understand it, once a given library has been loaded once in a given process, then it's available to any other dependent library subsequently loaded in the same process. Anyway, I did some messing around and that pans out here:

  • renamed my system CUDA directory to make sure it can't be found, and confirmed that:
from openmm import Platform
Platform.getPluginLoadFailures()
# Lists failed CUDA imports as expected

Then:

pip install --user -i https://pypi.ngc.nvidia.com nvidia-cuda-runtime-cu12 nvidia-cuda-nvcc-cu12 nvidia-cuda-nvrtc-cu12 nvidia-cuda-cupti-cu12 nvidia-cufft-cu12
pip install --user cuda-bindings

... and in a fresh session:

from cuda.bindings._path_finder.find_nvidia_dynamic_library import find_nvidia_dynamic_library
from cuda.bindings._path_finder.load_dl_windows import load_with_abs_path
for lib in ('cudart', 'nvrtc', 'cufft'):
    print(f'lib: {load_with_abs_path(lib, find_nvidia_dynamic_library(lib))}')
from openmm import Platform
Platform.getPluginLoadFailures()
# None
for i in range(Platform.getNumPlatforms()):
    print(Platform.getPlatform(i).getName())
    
Reference
CPU
CUDA
OpenCL

Doesn't feel like the official way to do it, but it does appear to work. Haven't worked out if it has a way to find the nvcc executable though, and have to get back to my day job for now.
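
The snippet above goes through the underscore-prefixed `_path_finder` module, which is presumably not a public API. The same preloading idea can be sketched with only the standard library, assuming the absolute paths of the libraries are already known by some other means. The function name is hypothetical, and this is a sketch of the mechanism, not a tested replacement.

```python
import ctypes


def preload_libraries(paths):
    """Try to load each library by absolute path; return {path: handle or None}."""
    loaded = {}
    for path in paths:
        try:
            # A library loaded into the process can satisfy the dependencies
            # of plugins loaded later, as described above.
            loaded[path] = ctypes.CDLL(path)
        except OSError:
            loaded[path] = None
    return loaded
```

Calling `preload_libraries([...])` with the full paths to the cudart, nvrtc, and cufft libraries before `from openmm import Platform` would mirror the loop above; entries mapped to `None` show which libraries failed to load.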

tristanic avatar Jul 15 '25 09:07 tristanic

> So as I understand it, once a given library has been loaded once in a given process, then it's available to any other dependent library subsequently loaded in the same process.

This goes for Linux too, by the way - and with some minor tweaking could be used to straightforwardly support the scenario where OpenMM is installed in Python's global site-packages directory while one or more plugins have been later installed into the user site-packages (e.g. with pip install --user OpenMM-CUDA-12). I don't have a git clone of the OpenMM repo to do this properly, but hot-patching my own installation (with OpenMM-CUDA-12 in the user site-packages)...

Added to version.py:

import os  # needed for the path handling below, if not already imported
import site
openmm_user_library_path = os.path.abspath(os.path.join(site.getusersitepackages(), 'OpenMM.libs', 'lib'))

Added here:

if os.path.isdir(version.openmm_user_library_path) and version.openmm_user_library_path != version.openmm_library_path:
    pluginLoadedLibNames += Platform.loadPluginsFromDirectory(os.path.join(version.openmm_user_library_path, 'plugins'))

... then all plugins are successfully loaded - once libOpenMM.so has been loaded once in the Python process, the RUNPATH becomes largely superfluous.

from openmm import Platform

Platform.getNumPlatforms()
Out[2]: 4

for i in range(Platform.getNumPlatforms()):
    print(Platform.getPlatform(i).getName())
    
Reference
CPU
OpenCL
CUDA
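
The directory selection in the hot-patch can be sketched on its own, separate from OpenMM. The 'OpenMM.libs/lib/plugins' layout is taken from the patch above; the function name and its parameters are hypothetical, and the real change would live in OpenMM's own import code.

```python
import os
import site


def plugin_directories(primary_lib_dir, user_site=None):
    """Return the existing, distinct plugin directories to load from.

    primary_lib_dir is the library directory OpenMM already uses; the
    user-site directory is added only if it exists and differs from it.
    """
    if user_site is None:
        user_site = site.getusersitepackages()
    user_lib_dir = os.path.join(user_site, "OpenMM.libs", "lib")
    result = []
    for lib_dir in (primary_lib_dir, user_lib_dir):
        plugins = os.path.abspath(os.path.join(lib_dir, "plugins"))
        if os.path.isdir(plugins) and plugins not in result:
            result.append(plugins)
    return result
```

Each returned directory could then be passed to `Platform.loadPluginsFromDirectory`, as in the patch.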

tristanic avatar Jul 22 '25 13:07 tristanic