thrust icon indicating copy to clipboard operation
thrust copied to clipboard

CUDA-MEMCHECK reports Uninitialized access

Open liruipeng opened this issue 5 years ago • 5 comments

Hello, thrust team,

I found CUDA-MEMCHECK reported uninitialized access of thrust calls. I made a simple reproducer attached below. Please take a look. This is with cuda 10.1 on P100.

Thanks

-Ruipeng

#include <thrust/extrema.h>

#define N 64

int main()
{
   int *h_x, *d_x, i, k;

   h_x = (int *) malloc(N*sizeof(int));

   for (i = 0; i < N; i++)
   {
      h_x[i] = i;
   }

   cudaMalloc(&d_x, N*sizeof(int));

   cudaMemcpy(d_x, h_x, N*sizeof(int),  cudaMemcpyHostToDevice);

   int *ptr = thrust::max_element( thrust::device, d_x, d_x + N);

   cudaMemcpy(&k, ptr, sizeof(int),  cudaMemcpyDeviceToHost);

   printf("max = %d\n", k);

   cudaFree(d_x);
   free(h_x);

   return 0;
}

Output

li50@ray15:~/workspace/tmp/GPU/thrust$ cuda-memcheck  --tool initcheck   ./a.out                                                                                                                                                                                                
========= CUDA-MEMCHECK
========= Host API memory access error at host access to 0x100030e00600 of size 16 bytes
=========     Uninitialized access at 0x100030e00600 on access by cudaMemcopy source.
=========     Saved host backtrace up to driver entry point at error
=========     Host Frame:/lib64/libcuda.so.1 (cuMemcpyDtoHAsync_v2 + 0x1e4) [0x2ec984]
=========     Host Frame:./a.out [0x5357c]
=========     Host Frame:./a.out [0x208cc]
=========     Host Frame:./a.out [0x787a8]
=========     Host Frame:./a.out [0x117f8]
=========     Host Frame:./a.out [0x112dc]
=========     Host Frame:./a.out [0x10f00]
=========     Host Frame:./a.out [0x10608]
=========     Host Frame:./a.out [0x10084]
=========     Host Frame:./a.out [0x100e0]
=========     Host Frame:./a.out [0xe328]
=========     Host Frame:./a.out [0xe38c]
=========     Host Frame:./a.out [0xcdb0]
=========     Host Frame:./a.out [0xc524]
=========     Host Frame:./a.out [0xa388]
=========     Host Frame:./a.out [0xa314]
=========     Host Frame:./a.out [0x9db8]
=========     Host Frame:./a.out [0x4e0c]
=========     Host Frame:/lib64/libc.so.6 [0x25100]
=========     Host Frame:/lib64/libc.so.6 (__libc_start_main + 0xc4) [0x252f4]
=========
max = 63
========= ERROR SUMMARY: 1 error

liruipeng avatar Jan 28 '20 23:01 liruipeng

I confirm this behavior on cuda 10.2.

hwinkler avatar Jun 09 '20 00:06 hwinkler

I can confirm this behavior in CUDA 11 w/ a V100. I notice this issue was opened exactly 1 year ago. Has anyone found a fix for this yet? I've tried explicitly syncing the stream I'm using in the policy but I still get the error.

I'm experiencing this w/ the following code:

  thrust::device_ptr<value_idx> d_ptr_indptr =
    thrust::device_pointer_cast(indptr);

  thrust::device_ptr<value_idx> d_ptr_diff =
    thrust::device_pointer_cast(diff.data());
  ML::thrustAllocatorAdapter alloc(allocator, stream);
  thrust::adjacent_difference(thrust::cuda::par(alloc).on(stream),
                              d_ptr_indptr, d_ptr_indptr + n_rows, d_ptr_diff);

  return *(thrust::max_element(thrust::cuda::par(alloc).on(stream), d_ptr_diff,
                               d_ptr_diff + n_rows));

cjnolet avatar Jan 29 '21 15:01 cjnolet

Just confirmed this behavior on CUDA 10.2 for thrust::min_max

divyegala avatar Jan 29 '21 15:01 divyegala

Looks like this fell through the cracks. I'll try to look at this soon.

alliepiper avatar Jan 29 '21 17:01 alliepiper

Likely related to NVIDIA/cub#184.

alliepiper avatar Jan 29 '21 17:01 alliepiper