deviceProp.maxThreadsPerMultiProcessor != deviceProp.get_max_work_items_per_compute_unit() ?

Open zjin-lcf opened this issue 2 years ago • 2 comments

On an NVIDIA GPU, deviceProp.maxThreadsPerMultiProcessor is 2048, but deviceProp.get_max_work_items_per_compute_unit() is 1024.

  dpct version 16.0.0. Codebase:(536eeb8014b1570a8b65aee511cbe2ba664e3962)
  cudaDeviceProp deviceProp;
  cudaGetDeviceProperties(&deviceProp, 0);
  const int mTpSM = deviceProp.maxThreadsPerMultiProcessor;
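To illustrate why the mismatch can matter, here is a minimal sketch. It assumes the value is used to estimate how many thread blocks fit on one multiprocessor, which the report does not state. `residentBlocksPerSM` and the two constants are hypothetical names for illustration, not part of CUDA, SYCL, or dpct; the numbers 2048 and 1024 are the ones reported above.

```cpp
#include <cassert>

// Hypothetical helper (not a CUDA, SYCL, or dpct API): upper bound on the
// number of blocks of `blockSize` threads that can be resident on one
// multiprocessor, considering only the thread-count limit.
inline int residentBlocksPerSM(int maxThreadsPerSM, int blockSize) {
    assert(blockSize > 0);  // guard against division by zero
    return maxThreadsPerSM / blockSize;
}

// Values reported in this issue for the same NVIDIA GPU.
// From deviceProp.maxThreadsPerMultiProcessor (CUDA):
constexpr int kCudaMaxThreadsPerSM = 2048;
// From deviceProp.get_max_work_items_per_compute_unit() (migrated code):
constexpr int kMigratedMaxWorkItemsPerCU = 1024;
```

Under that assumption, a block size of 256 gives `residentBlocksPerSM(kCudaMaxThreadsPerSM, 256)` equal to 8 with the CUDA value but only 4 with the migrated value, so any launch configuration derived from the property would differ after migration.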

zjin-lcf avatar Jan 12 '23 19:01 zjin-lcf

@zjin-lcf We plan to sync with the compiler team to find the root cause.

tomflinda avatar Jan 16 '23 08:01 tomflinda

okay

zjin-lcf avatar Jan 16 '23 11:01 zjin-lcf

@zjin-lcf We have reported this issue to the compiler team, and https://github.com/intel/llvm/issues/7997 tracks its status. No further action is needed in SYCLomatic, so I am closing this issue.

tomflinda avatar Jun 05 '24 01:06 tomflinda