Backward method of the grid encoding

Open cppcph opened this issue 1 year ago • 0 comments

I am trying to get the gradient of the loss w.r.t. the input of the encoder in the signed distance field scenario in instant-ngp. However, the backward method of the grid encoding in tiny-cuda-nn seems to fail whenever the dL_dinput argument is not a null pointer.
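For context, here is a minimal CPU sketch of what the input gradient of a grid encoding computes. It assumes a simplified 1D dense grid with linear interpolation, not the multiresolution hash grid that tiny-cuda-nn actually uses. The names Grid1D, cell, forward and backward_input are hypothetical and chosen only for illustration; only dL_dinput is taken from the library.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 1D dense grid with linear interpolation (illustration only,
// not tiny-cuda-nn's implementation). Requires at least two vertices.
struct Grid1D {
    std::vector<float> values;  // one feature value per grid vertex

    // Position of x in grid units, with x clamped to [0, 1].
    float grid_pos(float x) const {
        float scale = static_cast<float>(values.size() - 1);
        return std::min(std::max(x, 0.0f), 1.0f) * scale;
    }

    // Index of the cell containing x, clamped so that i + 1 stays in bounds.
    std::size_t cell(float x) const {
        std::size_t i = static_cast<std::size_t>(std::floor(grid_pos(x)));
        return std::min(i, values.size() - 2);
    }

    // Encoded value: linear interpolation between the two neighbouring vertices.
    float forward(float x) const {
        std::size_t i = cell(x);
        float w = grid_pos(x) - static_cast<float>(i);
        return (1.0f - w) * values[i] + w * values[i + 1];
    }

    // Chain rule: dL/dx = dL/dy * dy/dx, where dy/dx is the slope of the
    // current cell. The result is written only if the caller passes a
    // non-null output pointer, mirroring the optional dL_dinput argument.
    void backward_input(float x, float dL_dy, float* dL_dinput) const {
        if (dL_dinput == nullptr) return;
        std::size_t i = cell(x);
        float scale = static_cast<float>(values.size() - 1);
        *dL_dinput = dL_dy * (values[i + 1] - values[i]) * scale;
    }
};
```

In this sketch the input-gradient path is the only one that writes through dL_dinput, which matches the observation that the crash appears only when that pointer is non-null.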

The following line appears to be where the illegal CUDA memory access occurs: https://github.com/NVlabs/tiny-cuda-nn/blob/235d1fde956dc04966940f9d1bec66aa3bdb705a/include/tiny-cuda-nn/encodings/grid.h#L893C3-L893C16

Running in debug mode produces the following message:

Thread 1 "instant-ngp" received signal CUDA_EXCEPTION_14, Warp Illegal Address.
0x00007fffe3b71bd0 in void tcnn::kernel_grid_backward_input<__half, 3u>(unsigned int, unsigned int, __half const*, float const*, tcnn::MatrixView)<<<(2048,1,1),(128,1,1)>>> ()
cuda block (0, 0, 0) thread (96, 0, 0)
CUDA focus unchanged.
cuda block (0, 0, 0) thread (96, 0, 0)
[Switching focus to CUDA kernel 0, grid 44, block (0,0,0), thread (96,0,0), device 0, sm 0, warp 0, lane 0]
0x00007fffe3b71bd0 in void tcnn::kernel_grid_backward_input<__half, 3u>(unsigned int, unsigned int, __half const*, float const*, tcnn::MatrixView)<<<(2048,1,1),(128,1,1)>>> ()

I am using CUDA 11.3 with NVIDIA driver version 530.30.02. The graphics card is a GeForce RTX 2060.

Feb 07 '24 05:02 cppcph