
Benchmarks result in `INTERNAL ASSERT FAILED` when run with device `mps`

Open apullin opened this issue 2 years ago • 4 comments

Running the benchmark `main.py` with device `mps` results in an assertion failure:

Traceback (most recent call last):
  File "/Users/apullin/personal/pyg/pytorch_sparse/benchmark/main.py", line 174, in <module>
    correctness(dataset)
  File "/Users/apullin/anaconda3/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/apullin/personal/pyg/pytorch_sparse/benchmark/main.py", line 43, in correctness
    mat.fill_cache_()
  File "/Users/apullin/anaconda3/lib/python3.10/site-packages/torch_sparse-0.6.18-py3.10-macosx-11.0-arm64.egg/torch_sparse/tensor.py", line 286, in fill_cache_
    self.storage.fill_cache_()
  File "/Users/apullin/anaconda3/lib/python3.10/site-packages/torch_sparse-0.6.18-py3.10-macosx-11.0-arm64.egg/torch_sparse/storage.py", line 470, in fill_cache_
    self.rowptr()
  File "/Users/apullin/anaconda3/lib/python3.10/site-packages/torch_sparse-0.6.18-py3.10-macosx-11.0-arm64.egg/torch_sparse/storage.py", line 209, in rowptr
    rowptr = torch.ops.torch_sparse.ind2ptr(row, self._sparse_sizes[0])
  File "/Users/apullin/anaconda3/lib/python3.10/site-packages/torch/_ops.py", line 692, in __call__
    return self._op(*args, **kwargs or {})
RuntimeError: ind.device().is_cpu() INTERNAL ASSERT FAILED at "/Users/apullin/personal/pyg/pytorch_sparse/csrc/cpu/convert_cpu.cpp":8, please report a bug to PyTorch. ind must be CPU tensor

Invocation was: `python main.py --device=mps`

Notably, I also had to comment out lines 66 and 84 (`import torch.mps`) to get this to run; those imports should not be needed, should they?

A similar error occurs when running an applied problem with `mps` as the device, e.g. a GCN that uses a sparse adjacency matrix. (This came up in coursework, so I will have to recreate a minimal working example rather than post a solution.)
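
In the meantime, here is a hypothetical minimal sketch that I would expect to hit the same code path (it assumes the standard `SparseTensor` constructor; I have not reduced my actual coursework code to this yet):

```python
# Hypothetical minimal reproduction: fill_cache_() builds the rowptr via
# torch_sparse.ind2ptr, which is the op that asserts in the traceback above.
import torch
from torch_sparse import SparseTensor

device = torch.device("mps")
row = torch.tensor([0, 0, 1, 2], device=device)
col = torch.tensor([1, 2, 2, 0], device=device)

mat = SparseTensor(row=row, col=col, sparse_sizes=(3, 3))
mat.fill_cache_()  # expected to raise: "ind must be CPU tensor"
```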

Running:

torch==2.1.0
torch-scatter==2.1.2
torch-sparse==0.6.18
torch_geometric==2.4.0

on Python 3.10 on Apple Silicon (M2 Max)

apullin avatar Nov 20 '23 02:11 apullin

Sorry for the late reply. Our custom kernels do not support the `mps` backend at the moment.
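
As an untested workaround sketch, you could keep the `SparseTensor` (and the sparse propagation) on CPU and only move dense tensors to `mps`, e.g.:

```python
# Untested workaround sketch: keep the adjacency and the sparse ops on CPU,
# run dense math on mps, and move activations across devices as needed.
import torch
from torch_sparse import SparseTensor

device = torch.device("mps")
n, f = 100, 16

row = torch.randint(0, n, (500,))
col = torch.randint(0, n, (500,))
adj = SparseTensor(row=row, col=col, sparse_sizes=(n, n))  # stays on CPU
adj.fill_cache_()  # works, since ind2ptr now receives CPU tensors

x = torch.randn(n, f, device=device)
out = (adj @ x.cpu()).to(device)  # sparse matmul on CPU, result back to mps
```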

rusty1s avatar Nov 26 '23 12:11 rusty1s

Darn. Sadly, it looks like Apple does not provide any kind of GPU BLAS. Does this mean the sparse ops kernels would have to be manually implemented as Metal shaders?

apullin avatar Nov 27 '23 04:11 apullin

I haven't looked at the detailed backend code required for mps yet, but yeah, it would mean we need to register MPS as a backend for torch-sparse and then implement this functionality in Metal.
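
As a rough, untested illustration of the registration part, one could in principle add an `mps` override for a single op from Python via `torch.library` and fall back to the CPU kernel; the real fix would still be proper Metal kernels:

```python
# Rough, untested sketch: register an MPS implementation for
# torch_sparse.ind2ptr that falls back to the existing CPU kernel and
# copies the result back. A real MPS backend would use a Metal kernel.
import torch

lib = torch.library.Library("torch_sparse", "IMPL")

def ind2ptr_mps(ind, M):
    # Run the existing CPU kernel, then move the rowptr back to the mps device.
    return torch.ops.torch_sparse.ind2ptr(ind.cpu(), M).to(ind.device)

lib.impl("ind2ptr", ind2ptr_mps, "MPS")
```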

rusty1s avatar Nov 27 '23 06:11 rusty1s

This issue had no activity for 6 months. It will be closed in 2 weeks unless there is some new activity. Is this issue already resolved?

github-actions[bot] avatar May 26 '24 01:05 github-actions[bot]