pytorch-pwc
fix CUDA_ERROR_ILLEGAL_ADDRESS bug
I fixed the memory access bug described in #55 by forcing CuPy to allocate memory on the PyTorch device.
Huge thanks for bringing this up!
Could you provide some more technical details on how this makes a difference? Currently, all the involved tensors will be on the same device as the first input as per:
https://github.com/sniklaus/pytorch-pwc/blob/07c3f5a1675ad09bb073c4a03a480709c9a74fa8/correlation/correlation.py#L279-L285
I am hence a little bit confused on what the proposed fix would change. :thinking:
Sorry, I don't know for sure, but my guess is that the code allocates shared memory on the default device (GPU 0).
cupy_launch('kernel_Correlation_updateOutput', cupy_kernel('kernel_Correlation_updateOutput', {
    'rbot0': rbot0,
    'rbot1': rbot1,
    'top': output
}))(
    grid=tuple([ output.shape[3], output.shape[2], output.shape[0] ]),
    block=tuple([ 32, 1, 1 ]),
    shared_mem=one.shape[1] * 4,
    args=[ cupy.int32(n), rbot0.data_ptr(), rbot1.data_ptr(), output.data_ptr() ]
)
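For reference, here is a minimal sketch of what "forcing CuPy onto the PyTorch device" could look like. It is not necessarily what this PR does. It assumes the launch is wrapped in CuPy's `cupy.cuda.Device` context manager, keyed on the index of the input tensor's device. The helper name `launch_on_tensor_device` and its `device_ctx` parameter are hypothetical and exist only for illustration.

```python
def launch_on_tensor_device(tensor, launch, device_ctx=None):
    """Call `launch()` while the CUDA device that holds `tensor` is current.

    `device_ctx` is a hypothetical hook: a callable that maps a device index
    to a context manager. It defaults to `cupy.cuda.Device`.
    """
    if device_ctx is None:
        # Imported lazily so that the helper can be imported without CuPy.
        import cupy
        device_ctx = cupy.cuda.Device

    # For a CUDA tensor, `tensor.device.index` is the GPU ordinal (0, 1, ...).
    # For a CPU tensor it is None, in which case there is nothing to select.
    index = tensor.device.index
    if index is None:
        raise ValueError('expected a tensor that lives on a CUDA device')

    # Everything CuPy launches or allocates inside this block targets `index`
    # rather than whichever device happens to be current (often GPU 0).
    with device_ctx(index):
        return launch()
```

With the snippet above, the kernel call would be passed in as a closure, for example `launch_on_tensor_device(rbot0, lambda: kernel(grid=..., block=..., shared_mem=..., args=[...]))`, where `kernel` stands for the object returned by `cupy_launch`. Whether this resolves #55 depends on the device actually being mismatched at launch time, which is the open question in this thread.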