
Bug in creating `TensorGPU` when `stream` key is `None` in CUDA array interface

Open tadejsv opened this issue 10 months ago • 2 comments

Version

1.36.0

Describe the bug.

A `TensorGPU` can be created from any object that implements the CUDA Array Interface. Version 3 of this interface (according to the Numba docs) has a `stream` property, which can be either an integer or `None`.

DALI, however, always assumes that `stream` is an integer, which leads to an error when it is `None`, as in the example below, where I am trying to convert a `gpuarray` from the pycuda package.
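To illustrate what tolerant handling of the `stream` entry could look like on the consumer side, here is a minimal Python sketch. It is not DALI's implementation (the code in question is C++), and `resolve_stream` is a hypothetical helper name. It assumes the semantics given in the Numba documentation for version 3: the key is optional, `None` means no synchronization is required, `0` is disallowed, and any other integer identifies a stream.

```python
from typing import Any, Mapping, Optional


def resolve_stream(interface: Mapping[str, Any]) -> Optional[int]:
    """Return the stream to synchronize on, or None if no sync is requested.

    Hypothetical helper, not part of DALI or Numba.
    """
    # The key is optional, so a missing key and an explicit None are treated alike.
    stream = interface.get("stream")
    if stream is None:
        return None
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(stream, bool) or not isinstance(stream, int):
        raise TypeError(
            f"'stream' must be an int or None, got {type(stream).__name__}"
        )
    if stream == 0:
        raise ValueError("'stream' value 0 is disallowed by the interface")
    return stream


# The interface dictionary printed by pycuda in this report.
pycuda_interface = {
    "shape": (4, 4),
    "strides": (16, 4),
    "data": (139775629590528, False),
    "typestr": "<f4",
    "stream": None,
    "version": 3,
}
print(resolve_stream(pycuda_interface))  # prints None
```

The point of the sketch is only that the `None` case has to be checked before any integer conversion is attempted.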

The code responsible for this bug is

https://github.com/NVIDIA/DALI/blob/dedcfaec349c85387969bb78bda5d50aa61a4fa9/dali/python/backend_impl.cc#L283-L286

and the bug was introduced with https://github.com/NVIDIA/DALI/pull/5125, which was released in 1.32.0

Minimum reproducible example

import numpy as np
import nvidia.dali.backend as backend
import pycuda.autoinit  # noqa
import pycuda.gpuarray as gpuarray

test_input = np.random.randn(4, 4).astype(np.float32)
g = gpuarray.to_gpu(test_input)
print(g.__cuda_array_interface__)
# {'shape': (4, 4), 'strides': (16, 4), 'data': (139775629590528, False), 'typestr': '<f4', 'stream': None, 'version': 3}
backend.TensorGPU(g)

Relevant log output

SystemError: <built-in method __init__ of PyCapsule object at 0x7f3cca6db060> returned a result with an exception set
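A possible workaround on affected versions is to wrap the array in an object that exposes the same interface without the `None` entry. The sketch below is untested against DALI: `StreamlessView` is a hypothetical name, and it assumes that DALI accepts an interface dictionary in which the optional `stream` key is absent.

```python
class StreamlessView:
    """Hypothetical wrapper that hides a None 'stream' entry.

    Re-exposes the wrapped object's CUDA Array Interface, dropping the
    'stream' key when its value is None. Not part of DALI or pycuda.
    """

    def __init__(self, obj):
        # Keep a reference to the owner so the device memory stays alive.
        self._obj = obj

    @property
    def __cuda_array_interface__(self):
        # Copy the dictionary so the wrapped object's own view is untouched.
        iface = dict(self._obj.__cuda_array_interface__)
        if iface.get("stream") is None:
            iface.pop("stream", None)
        return iface
```

With the array from the example, the call would become `backend.TensorGPU(StreamlessView(g))`. Since no stream is passed along, any synchronization with pending work on the array remains the caller's responsibility.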

Other/Misc.

No response

Check for duplicates

  • [X] I have searched the open bugs/issues and have found no duplicates for this bug report

tadejsv avatar Apr 11 '24 12:04 tadejsv

@tadejsv Thanks for reporting. It does indeed look like an omission. We'll look into it.

mzient avatar Apr 11 '24 15:04 mzient

I think https://github.com/NVIDIA/DALI/pull/5425 should fix the issue. Please check the nightly build after it is merged.

JanuszL avatar Apr 12 '24 08:04 JanuszL

DALI 1.37 is now available. Please reopen the issue if this still doesn't work.

JanuszL avatar Jun 03 '24 11:06 JanuszL