torch.unique crashes on GPU

Open twuebi opened this issue 4 years ago • 4 comments

🐛 Bug

Calling torch.unique on a GPU tensor with any argument beyond the input (for example dim or return_counts) fails:

In [5]: torch.unique(torch.arange(10).view(2,5).cuda(), dim=1)
[1]    17003 abort (core dumped)  ipython
In [2]: torch.unique(torch.arange(10).view(2,5).cuda(), dim=1, return_counts=True)
...
RuntimeError: unique_by_key failed on 2nd step: hipErrorInvalidDeviceFunction
In [28]: torch.unique( preds['dep'][0][:,1:],dim=-1)
Memory access fault by GPU node-1 (Agent handle: 0x564cdae2ed90) on address 0x7fb309004000. Reason: Page not present or supervisor privilege.
[1]    15313 abort (core dumped)

With only the input tensor and no further arguments, it works:

In [1]: torch.unique( preds['dep'][0][:,1:].float())
Out[1]: 
tensor([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13.,
        14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25., 26., 27.,
        28., 30., 31., 32., 33., 34., 35., 36.], device='cuda:0')

To Reproduce

Steps to reproduce the behavior:

torch.unique(torch.arange(10).view(2,5).cuda(), dim=1)
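
For convenience, the one-liner above can be wrapped in a small standalone script. This is only a sketch: the `reproduce` helper and the CPU fallback are my own additions for illustration and are not part of the original report. On a machine without a visible GPU it runs on the CPU, where the call is expected to succeed.

```python
import torch


def reproduce(device: str) -> torch.Tensor:
    """Run the failing call from the report on the given device (hypothetical helper)."""
    x = torch.arange(10).view(2, 5).to(device)
    # On the reporter's ROCm setup, this call aborts the process when device == "cuda".
    return torch.unique(x, dim=1)


# ROCm builds of PyTorch expose the GPU under the "cuda" device name,
# which is why the report uses .cuda() on a Radeon VII.
device = "cuda" if torch.cuda.is_available() else "cpu"
print("torch", torch.__version__, "on", device)
print(reproduce(device))
```

Since the crash takes down the whole interpreter, it is probably easier to run this as a script than inside an interactive IPython session.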

Expected behavior

No crashes.
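
To make the expected result concrete, here is a short sketch that runs the two failing calls on a CPU tensor. I am assuming the GPU path should return the same values as the CPU path; the variable names are just for illustration.

```python
import torch

# Same input as in the report, but kept on the CPU.
x = torch.arange(10).view(2, 5)

# All five columns are distinct, so unique over dim=1 returns them unchanged.
values = torch.unique(x, dim=1)

# With return_counts=True, each column should be reported exactly once.
values_c, counts = torch.unique(x, dim=1, return_counts=True)

print(values)
print(counts)
```

The GPU calls from the report should produce these values (with device='cuda:0') instead of aborting.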

Environment

PyTorch version: 1.6.0a0+2a460c0
Is debug build: No
CUDA used to build PyTorch: Could not collect

OS: Linux Mint 19.1 Tessa
GCC version: (Ubuntu 8.4.0-1ubuntu1~18.04) 8.4.0
CMake version: version 3.17.2

Python version: 3.8
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] numpy==1.18.4
[pip3] numpydoc==0.9.2
[pip3] pytorch-lamb==1.0.0
[pip3] pytorch-lightning==0.8.0
[pip3] pytorch-pretrained-bert==0.6.2
[pip3] pytorch-transformers==1.1.0
[pip3] torch==1.6.0a0+2a460c0
[pip3] torchvision==0.6.0
[conda] Could not collect

Additional context

GPU: Radeon VII
ROCm version: 3.5.1

twuebi • Jul 12 '20 11:07