
[FEA]: Investigate if NVTX ranges in CUB algorithms support graph capture

Open gevtushenko opened this issue 1 year ago • 0 comments

Is this a duplicate?

  • [X] I confirmed there appear to be no duplicate issues for this request and that I agree to the Code of Conduct

Area

CUB

Is your feature request related to a problem? Please describe.

As of https://github.com/NVIDIA/cccl/issues/719, CUB device algorithms emit NVTX ranges. Most CUB device algorithms also support graph capture. It is currently unclear whether the NVTX ranges work correctly in the presence of graph capture.

Describe the solution you'd like

We need to understand whether NVTX ranges work correctly when CUB runs in graph capture mode. Since all of our *_.lid_2 tests run CUB algorithms in graph capture mode, one of these tests, say cub.cpp17.test.device_select_if.lid_2, can be used as an example. If the NVTX ranges do not contain the kernels they surround, I'd prefer that no NVTX ranges be reported in graph capture mode.
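For anyone who wants to look at this outside the test suite, a standalone reproducer might look roughly like the sketch below. It is not taken from the CUB tests: the `is_even` predicate, the `CHECK` macro, the problem size, and the explicit outer range named `captured-select-if` are all made up for illustration. The outer range is there only as a reference point on the timeline; any range CUB emits itself would show up in addition to it.

```cuda
#include <cub/device/device_select.cuh>
#include <cuda_runtime.h>
#include <nvtx3/nvToolsExt.h>

#include <cstdio>
#include <cstdlib>

// Hypothetical error-check helper, not part of CUB or the CUDA runtime.
#define CHECK(call)                                                     \
  do {                                                                  \
    cudaError_t err_ = (call);                                          \
    if (err_ != cudaSuccess) {                                          \
      std::fprintf(stderr, "%s failed: %s\n", #call,                    \
                   cudaGetErrorString(err_));                           \
      std::exit(EXIT_FAILURE);                                          \
    }                                                                   \
  } while (0)

// Hypothetical selection predicate.
struct is_even {
  __host__ __device__ bool operator()(int x) const { return x % 2 == 0; }
};

int main() {
  const int n = 1 << 20;
  int *d_in = nullptr, *d_out = nullptr, *d_num_selected = nullptr;
  CHECK(cudaMalloc(&d_in, n * sizeof(int)));
  CHECK(cudaMalloc(&d_out, n * sizeof(int)));
  CHECK(cudaMalloc(&d_num_selected, sizeof(int)));
  CHECK(cudaMemset(d_in, 0, n * sizeof(int)));

  cudaStream_t stream;
  CHECK(cudaStreamCreate(&stream));

  // Query and allocate temporary storage before capture starts,
  // so no allocation happens while the stream is capturing.
  void* d_temp = nullptr;
  size_t temp_bytes = 0;
  CHECK(cub::DeviceSelect::If(d_temp, temp_bytes, d_in, d_out,
                              d_num_selected, n, is_even{}, stream));
  CHECK(cudaMalloc(&d_temp, temp_bytes));

  // Capture the algorithm into a graph. Nothing executes on the GPU here.
  cudaGraph_t graph;
  CHECK(cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal));
  nvtxRangePushA("captured-select-if");  // illustrative reference range
  CHECK(cub::DeviceSelect::If(d_temp, temp_bytes, d_in, d_out,
                              d_num_selected, n, is_even{}, stream));
  nvtxRangePop();
  CHECK(cudaStreamEndCapture(stream, &graph));

  // The kernels only run when the instantiated graph is launched.
  cudaGraphExec_t exec;
  CHECK(cudaGraphInstantiateWithFlags(&exec, graph, 0));
  CHECK(cudaGraphLaunch(exec, stream));
  CHECK(cudaStreamSynchronize(stream));

  int h_num_selected = 0;
  CHECK(cudaMemcpy(&h_num_selected, d_num_selected, sizeof(int),
                   cudaMemcpyDeviceToHost));
  std::printf("selected %d of %d items\n", h_num_selected, n);

  CHECK(cudaGraphExecDestroy(exec));
  CHECK(cudaGraphDestroy(graph));
  CHECK(cudaStreamDestroy(stream));
  CHECK(cudaFree(d_temp));
  CHECK(cudaFree(d_num_selected));
  CHECK(cudaFree(d_out));
  CHECK(cudaFree(d_in));
  return 0;
}
```

Running this under a profiler such as Nsight Systems and comparing where the NVTX ranges sit relative to the kernels executed by `cudaGraphLaunch` should show whether the ranges still contain the kernels they are meant to surround.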

Describe alternatives you've considered

No response

Additional context

No response

gevtushenko, Apr 29 '24 20:04