Different behavior between CPU and CUDA

Open Boltzmachine opened this issue 4 years ago • 1 comments

When the CUDA devices are invisible, my program runs fine. When the CUDA devices are available, it fails with the following error:

Traceback (most recent call last):
  File "/home/boltzmachine/THG/train.py", line 10, in <module>
    from models.simple import Simple, Share
  File "/home/boltzmachine/THG/models/simple.py", line 5, in <module>
    from .GNN import DenseGatedRGCN
  File "/home/boltzmachine/THG/models/GNN.py", line 10, in <module>
    from torch_geometric.nn.inits import glorot, zeros
  File "/home/boltzmachine/miniconda3/envs/THG/lib/python3.7/site-packages/torch_geometric/__init__.py", line 5, in <module>
    import torch_geometric.data
  File "/home/boltzmachine/miniconda3/envs/THG/lib/python3.7/site-packages/torch_geometric/data/__init__.py", line 1, in <module>
    from .data import Data
  File "/home/boltzmachine/miniconda3/envs/THG/lib/python3.7/site-packages/torch_geometric/data/data.py", line 8, in <module>
    from torch_sparse import coalesce, SparseTensor
  File "/home/boltzmachine/miniconda3/envs/THG/lib/python3.7/site-packages/torch_sparse/__init__.py", line 15, in <module>
    f'{library}_{suffix}', [osp.dirname(__file__)]).origin)
AttributeError: 'NoneType' object has no attribute 'origin'
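
For context, the traceback is consistent with a loader that looks up one compiled library per name and device suffix and then reads `.origin` from the result: if the `*_cuda` file is missing, the lookup returns `None` and the attribute access fails. Below is a minimal diagnostic sketch under that assumption. It assumes the lookup is done with `importlib.machinery.PathFinder` (the traceback only shows the tail of the call), and the helper names `find_extension`, `report`, and `locate_package` are hypothetical. It does not import `torch_sparse` itself, so it can run even when the import is broken.

```python
import importlib.machinery
import importlib.util


def find_extension(pkg_dir, library, suffix):
    """Return the file path of `<library>_<suffix>` inside pkg_dir, or None.

    Mirrors the assumed lookup in torch_sparse/__init__.py: a missing file
    yields a None spec, which is what makes `.origin` raise AttributeError.
    """
    spec = importlib.machinery.PathFinder.find_spec(f"{library}_{suffix}", [pkg_dir])
    return None if spec is None else spec.origin


def report(pkg_dir, libraries=("_version", "_spmm"), suffixes=("cpu", "cuda")):
    """Map each suffix to {library: path or None} for a quick overview."""
    return {
        suffix: {lib: find_extension(pkg_dir, lib, suffix) for lib in libraries}
        for suffix in suffixes
    }


def locate_package(name="torch_sparse"):
    """Return the package directory without importing the package, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return list(spec.submodule_search_locations)[0]
```

Calling `report(locate_package())` on the affected environment should show a path for every `cpu` entry and `None` for every `cuda` entry if only the CPU build of the wheel was installed.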

I use torch 1.7.0 with cu101. I uninstalled torch-sparse repeatedly until nothing was left installed, and then installed the packages with:

pip install torch-sparse -f https://pytorch-geometric.com/whl/torch-1.7.0+cu101.html --no-cache-dir
pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-1.7.0+cu101.html --no-cache-dir
pip install torch-cluster -f https://pytorch-geometric.com/whl/torch-1.7.0+cu101.html --no-cache-dir
pip install torch-spline-conv -f https://pytorch-geometric.com/whl/torch-1.7.0+cu101.html --no-cache-dir
pip --no-cache-dir install torch-geometric

It seems that there are no CUDA *.so files in ~/miniconda3/envs/THG/lib/python3.7/site-packages/torch/cuda/:

__init__.py      _diag_cpu.so        _metis_cpu.so    _saint_cpu.so   _spspmm_cpu.so   bandwidth.py  convert.py  index_select.py   metis.py   padding.py  rw.py      select.py  storage.py    utils.py
__pycache__      _ego_sample_cpu.so  _relabel_cpu.so  _sample_cpu.so  _version_cpu.so  cat.py        diag.py     masked_select.py  mul.py     permute.py  saint.py   spmm.py    tensor.py
_convert_cpu.so  _hgt_sample_cpu.so  _rw_cpu.so       _spmm_cpu.so    add.py           coalesce.py   eye.py      matmul.py         narrow.py  reduce.py   sample.py  spspmm.py  transpose.py

One possible reason is that I am using a compute cluster. In my local environment, CUDA is not available until I submit a job through Slurm.

Boltzmachine, Aug 19 '21 15:08

The *_cuda.so files should be available nonetheless, even if there is no GPU available until a job is submitted. I therefore think that the issue is that installing from wheels fails for you. How long does the installation take? Can you try again with:

pip install --no-index torch-scatter -f https://pytorch-geometric.com/whl/torch-1.7.0+cu101.html

rusty1s, Aug 20 '21 11:08
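
Before retrying the install, it may help to confirm that the `-f` URL matches the torch build that is actually installed. The sketch below rebuilds the wheel index URL from a torch version and CUDA version. The URL pattern is inferred only from the commands in this thread and may not hold for every release; the function names are hypothetical, and the `cpu` tag for CPU-only builds is an assumption.

```python
def wheel_index_url(torch_version, cuda_version):
    """Build the wheel index URL, e.g. ("1.7.0", "10.1") -> ...torch-1.7.0+cu101.html.

    torch_version may carry a local suffix such as "1.7.0+cu101"; it is dropped.
    cuda_version is None for CPU-only builds (assumed to map to the "cpu" tag).
    """
    base = torch_version.split("+")[0]
    tag = "cpu" if cuda_version is None else "cu" + cuda_version.replace(".", "")
    return f"https://pytorch-geometric.com/whl/torch-{base}+{tag}.html"


def installed_wheel_index_url():
    """Same as above, using the torch build in the current environment."""
    import torch  # imported lazily so the helper above works without torch

    # torch.version.cuda is the CUDA version torch was built with, or None.
    return wheel_index_url(torch.__version__, torch.version.cuda)
```

If `installed_wheel_index_url()` prints a URL different from the one passed to `pip install -f`, the installed wheels were built for a different torch or CUDA combination than the one in the environment.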