Why is libcublas.so.12 (a CUDA library) required, even though I installed the ROCm version of PyTorch for AMD GPUs?
I ran the following commands to install boltz2 with python/3.11.7:
python3 -m venv boltz2
source boltz2/bin/activate
pip3 install torch==2.6.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2.4
pip3 install boltz --index-url https://pypi.org/simple
However, I encountered a strange issue. I'm using AMD GPUs, not NVIDIA GPUs, yet the code fails due to a missing libcublas.so.12, which is part of NVIDIA’s CUDA libraries.
Here are the related cuequivariance packages installed in the environment:
(boltz2) $ pip freeze | grep cue
cuequivariance==0.5.0
cuequivariance-ops-cu12==0.5.0
cuequivariance-ops-torch-cu12==0.5.0
cuequivariance-torch==0.5.0
When running the following command:
(boltz2) $ boltz predict connexin-peptide.yaml --recycling_steps 20 --diffusion_samples 5 --use_msa_server
I get this error:
Error while loading libcue_ops.so: libcublas.so.12: cannot open shared object file: No such file or directory
Why is libcublas.so.12 (a CUDA library) required, even though I installed the ROCm version of PyTorch for AMD GPUs? It seems that some dependencies (e.g., cuequivariance-ops-cu12) pull in CUDA-specific components, which obviously won't work on AMD hardware.
cuequivariance should be optional in your case. I had the same issue on Apple Silicon and fixed it that way, though the fix would need to be adapted for AMD GPUs.
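Making cuequivariance optional likely amounts to guarding its import so the code falls back to a plain PyTorch path when the CUDA-only shared library cannot load. A minimal sketch of that pattern; the flag name HAS_CUEQUIVARIANCE and the helper are illustrative, not boltz's actual code:

```python
# Sketch: treat the CUDA-only kernel package as optional.
# Catch OSError as well as ImportError, because a missing shared object
# (e.g. libcublas.so.12) surfaces as an OSError at import time.
try:
    import cuequivariance_torch  # loads libcue_ops.so, which needs libcublas.so.12
    HAS_CUEQUIVARIANCE = True
except (ImportError, OSError):
    cuequivariance_torch = None
    HAS_CUEQUIVARIANCE = False

def use_fused_kernels() -> bool:
    """True only when the CUDA kernels imported cleanly; callers can
    branch to a pure-PyTorch implementation otherwise."""
    return HAS_CUEQUIVARIANCE
```

Call sites then check `use_fused_kernels()` instead of importing the package unconditionally, so ROCm and Apple Silicon environments degrade gracefully.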
Does --no_kernels not work? I just swap the triangular attention back to trifast in primitives.py when I run on non-NVIDIA hardware.
Sure, --no_kernels should work. I think I had an error while installing the dependencies from the pyproject.toml file, and some import errors afterwards.
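One way to confirm that CUDA-specific wheels slipped into a ROCm environment is to scan the installed distributions for "-cu12"-style names (as in the `pip freeze` output above). A small diagnostic sketch, assuming only that such packages keep "cu12" in their distribution name:

```python
# Sketch: list installed packages whose names look CUDA-12 specific;
# in a ROCm environment this list should ideally be empty.
from importlib import metadata

def cuda_specific_packages():
    """Return sorted names of installed distributions containing 'cu12'."""
    return sorted(
        dist.metadata["Name"]
        for dist in metadata.distributions()
        if "cu12" in (dist.metadata["Name"] or "").lower()
    )

print(cuda_specific_packages())
```

Any package this prints (e.g. cuequivariance-ops-cu12) is a candidate to uninstall or replace before rerunning `boltz predict` with `--no_kernels`.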