pytorch_sparse pip repeatedly selecting wrong CUDA version during install

Hi!

I am trying to get an installation running on an HPC cluster with somewhat older dependencies. I have found torch 1.8.0 with CUDA 11.1 to work, so now I want to install the matching torch_sparse version:

$ pip install torch-sparse -f https://data.pyg.org/whl/torch-1.8.0+cu111.html --no-cache-dir

Looking in links: https://data.pyg.org/whl/torch-1.8.0+cu111.html
Collecting torch-sparse
  Downloading torch_sparse-0.6.16.tar.gz (208 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 208.2/208.2 kB 11.8 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Requirement already satisfied: scipy in /home/icb/florin.ratajczak/anaconda3/envs/speostest/lib/python3.7/site-packages (from torch-sparse) (1.7.3)
Requirement already satisfied: numpy<1.23.0,>=1.16.5 in /home/icb/florin.ratajczak/anaconda3/envs/speostest/lib/python3.7/site-packages (from scipy->torch-sparse) (1.21.5)
Building wheels for collected packages: torch-sparse
  Building wheel for torch-sparse (setup.py) ... done
  Created wheel for torch-sparse: filename=torch_sparse-0.6.16-cp37-cp37m-linux_x86_64.whl size=1716194 sha256=46187b7f1bf7bae117f0e216f72bd6cf67ae95c2f9b6734d5369aa47099de042
  Stored in directory: /tmp/pip-ephem-wheel-cache-23v87jns/wheels/ff/5c/28/5d12cf8ac7bb8bc3de9dda8fa446cb4aeb9fffe19ef1028538
Successfully built torch-sparse
Installing collected packages: torch-sparse
Successfully installed torch-sparse-0.6.16

I added the --no-cache-dir flag to make sure it doesn't load some other version from the cache.

However, when I now try to import it, it says it's compiled for CUDA 12.0:

(speostest) [florin.ratajczak@hpc-submit03gui speos]$ python
Python 3.7.16 (default, Jan 17 2023, 22:20:44) 
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'1.8.0'
>>> torch.version.cuda
'11.1'
>>> import torch_sparse
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/icb/florin.ratajczak/anaconda3/envs/speostest/lib/python3.7/site-packages/torch_sparse/__init__.py", line 33, in <module>
    f'Detected that PyTorch and torch_sparse were compiled with '
RuntimeError: Detected that PyTorch and torch_sparse were compiled with different CUDA versions. PyTorch has CUDA version 11.1 and torch_sparse has CUDA version 12.0. Please reinstall the torch_sparse that matches your PyTorch install.

Any idea why it says it was compiled for CUDA 12.0? I might have to host a course on that cluster in a week or so, so any help is much appreciated!
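A note on the install log above: pip downloaded torch_sparse-0.6.16.tar.gz and then ran "Building wheel for torch-sparse", which suggests the package was compiled locally rather than taken from a prebuilt cu111 wheel. If so, the locally available CUDA toolkit, not the -f URL, would determine the CUDA version of the result. This is a reading of the log, not something confirmed in this thread. A minimal sketch for spotting that pattern in a saved pip log (the helper name `built_from_source` is hypothetical):

```python
import re


def built_from_source(pip_log: str) -> bool:
    """Return True if the log shows a source archive being downloaded
    and a wheel being built locally from it."""
    # A source distribution is downloaded as .tar.gz or .zip,
    # whereas a prebuilt binary would end in .whl.
    downloaded_sdist = re.search(r"Downloading \S+\.(tar\.gz|zip)", pip_log) is not None
    # pip prints this line when it compiles the package itself.
    built_wheel = "Building wheel for" in pip_log
    return downloaded_sdist and built_wheel


# Lines copied from the install log in this post.
sample_log = """\
Collecting torch-sparse
  Downloading torch_sparse-0.6.16.tar.gz (208 kB)
Building wheels for collected packages: torch-sparse
  Building wheel for torch-sparse (setup.py) ... done
"""

print(built_from_source(sample_log))  # prints True
```

If the check returns True, the -f index most likely had no wheel that pip considered compatible with this Python and package version, so pip fell back to compiling the source archive.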

Feb 09 '23 13:02 fratajcz

Hi, I have a similar problem.

I installed torch-sparse like this: pip install torch-scatter torch-sparse==0.6.12 torch-cluster torch-spline-conv torch-geometric==2.0.4 -f https://data.pyg.org/whl/torch-1.8.2+cpu.html --no-cache-dir

But while running I get the following error:

Traceback (most recent call last):
  File "/dss/dsshome1/lxc03/gobi005/BlockEQTL/speos/speos/scripts/explanation_scripts/explanation_one_model.py", line 2, in <module>
    from speos.models import ModelBootstrapper
  File "/dss/dsshome1/lxc03/gobi005/miniconda3/envs/speos/lib/python3.7/site-packages/speos/models.py", line 1, in <module>
    from speos.architectures import GeneNetwork, RelationalGeneNetwork, FCNN, LINKX, SimpleGCN
  File "/dss/dsshome1/lxc03/gobi005/miniconda3/envs/speos/lib/python3.7/site-packages/speos/architectures.py", line 4, in <module>
    import torch_geometric.nn as pyg_nn
  File "/dss/dsshome1/lxc03/gobi005/miniconda3/envs/speos/lib/python3.7/site-packages/torch_geometric/__init__.py", line 4, in <module>
    import torch_geometric.data
  File "/dss/dsshome1/lxc03/gobi005/miniconda3/envs/speos/lib/python3.7/site-packages/torch_geometric/data/__init__.py", line 1, in <module>
    from .data import Data
  File "/dss/dsshome1/lxc03/gobi005/miniconda3/envs/speos/lib/python3.7/site-packages/torch_geometric/data/data.py", line 9, in <module>
    from torch_sparse import SparseTensor
  File "/dss/dsshome1/lxc03/gobi005/miniconda3/envs/speos/lib/python3.7/site-packages/torch_sparse/__init__.py", line 16, in <module>
    f'{library}_{suffix}', [osp.dirname(__file__)]).origin)
  File "/dss/dsshome1/lxc03/gobi005/miniconda3/envs/speos/lib/python3.7/site-packages/torch/_ops.py", line 104, in load_library
    ctypes.CDLL(path)
  File "/dss/dsshome1/lxc03/gobi005/miniconda3/envs/speos/lib/python3.7/ctypes/__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /dss/dsshome1/lxc03/gobi005/miniconda3/envs/speos/lib/python3.7/site-packages/torch_sparse/_convert_cpu.so: undefined symbol: __kmpc_fork_call

I think this might also be caused by a mismatch between the CUDA versions of PyTorch and torch-sparse.

Feb 28 '23 14:02 arne-vdb

Do you have a PyTorch CUDA version installed? Otherwise, I don't think this is necessarily related to CUDA but more due to a mismatch in PyTorch versions.

Mar 01 '23 09:03 rusty1s

Thanks for the response!

I have tested several CUDA and CPU versions, and the only setup that worked was installing torch for CUDA 11.6 (which is CUDA 12.0 compatible) and then installing torch-sparse for 11.6 as well. With all other CUDA/CPU versions, torch-sparse ends up incompatible with torch, even when both have been installed following the right instructions. Also, from my output above you can see that the error message says torch is compiled for CUDA 11.1 and torch-sparse for 12.0, even though I explicitly requested the cu111 version. Perhaps some paths got scrambled and torch-sparse is installed for CUDA 12.0 no matter which -f argument is specified?
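One way to test this hypothesis is to compare the CUDA version torch reports with the toolkit release that `nvcc --version` prints, assuming a source build uses whichever nvcc is found first on PATH (an assumption, not confirmed in this thread). A minimal sketch; the helper names are hypothetical and the sample nvcc line is illustrative, not taken from the cluster:

```python
import re
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Extract the 'major.minor' release from `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def cuda_versions_match(torch_cuda: Optional[str], nvcc_output: str) -> bool:
    """True only if torch reports a CUDA version equal to the nvcc release."""
    release = parse_nvcc_release(nvcc_output)
    return torch_cuda is not None and release == torch_cuda


# Illustrative nvcc output line (assumed, not copied from the thread).
sample_nvcc = "Cuda compilation tools, release 12.0, V12.0.76"

print(parse_nvcc_release(sample_nvcc))           # prints 12.0
print(cuda_versions_match("11.1", sample_nvcc))  # prints False
```

On a real system, `torch.version.cuda` would be passed as the first argument and the captured output of `nvcc --version` as the second. A False result would be consistent with the extension having been compiled against a different toolkit than torch.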

Mar 01 '23 09:03 fratajcz

This issue had no activity for 6 months. It will be closed in 2 weeks unless there is some new activity. Is this issue already resolved?

Aug 29 '23 01:08 github-actions[bot]

Have you tried pip install --no-build-isolation? Without that flag, pip installs its own copy of torch to build the package and then installs the package afterward. I had that issue with ROCm, where it would ignore the ROCm torch version and install the default torch + CUDA version.
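A minimal sketch of that suggestion applied to the command from the original post. The function names are hypothetical, and the index URL pattern is inferred from the two URLs quoted in this thread. Whether the flag resolves the CUDA mismatch reported above is not confirmed here.

```python
import sys
from typing import List


def pyg_wheel_index(torch_version: str, cuda_tag: str) -> str:
    """Build the wheel index URL, following the pattern of the URLs in this thread."""
    return f"https://data.pyg.org/whl/torch-{torch_version}+{cuda_tag}.html"


def build_install_command(package: str, torch_version: str, cuda_tag: str) -> List[str]:
    """Assemble a pip invocation that builds against the already installed torch."""
    return [
        sys.executable, "-m", "pip", "install", package,
        "--no-build-isolation",  # use the current environment for the build
        "--no-cache-dir",        # same flag as in the original post
        "-f", pyg_wheel_index(torch_version, cuda_tag),
    ]


cmd = build_install_command("torch-sparse", "1.8.0", "cu111")
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Invoking pip through `sys.executable -m pip` keeps the install tied to the interpreter of the active environment, which matters on clusters where several Python installations are on PATH.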

Aug 21 '24 16:08 Delaunay
