rmm
[FEA] Add NVTX ranges to pool allocation/deallocation
Is your feature request related to a problem? Please describe. It would be helpful to see in an Nsight profile how much time is spent allocating and deallocating memory in the pool memory resource, especially in a multi-threaded environment with per-thread default stream.
Describe the solution you'd like Add NVTX ranges to memory allocation/deallocation in the pool.
Additional context cuDF has a macro defined: https://github.com/rapidsai/cudf/blob/branch-0.15/cpp/include/cudf/detail/nvtx/ranges.hpp
@harrism went about this in the past. We ended up not going through with it because most allocation events are faster than the recommended 1 us minimum duration for events annotated with NVTX. That said, I don't see anything wrong with adding it as an option that is disabled by default.
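For illustration, a "disabled by default" option could be a compile-time switch. The sketch below is only an assumption about how that might look: `MR_ENABLE_NVTX`, `MR_FUNC_RANGE`, `scoped_range`, and `range_log` are hypothetical names, and the range type records to an in-memory log instead of calling NVTX so the example builds without the NVTX headers.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the profiler: stores range events so the sketch runs without NVTX.
// Not thread-safe; for illustration only.
inline std::vector<std::string>& range_log()
{
  static std::vector<std::string> log;
  return log;
}

// Hypothetical RAII range. A real build would open an NVTX range in the
// constructor and close it in the destructor.
struct scoped_range {
  explicit scoped_range(char const* name)
  {
    range_log().push_back(std::string("push:") + name);
  }
  ~scoped_range() { range_log().push_back("pop"); }
  scoped_range(scoped_range const&)            = delete;
  scoped_range& operator=(scoped_range const&) = delete;
};

// Hypothetical opt-in switch: the annotation compiles away entirely unless
// MR_ENABLE_NVTX is defined at build time, so the default build pays nothing.
#ifdef MR_ENABLE_NVTX
#define MR_FUNC_RANGE() scoped_range mr_range_{__func__}
#else
#define MR_FUNC_RANGE() ((void)0)
#endif

// Placeholder for a very short allocation-like call.
inline int tiny_allocation()
{
  MR_FUNC_RANGE();
  return 42;
}
```

Building with `-DMR_ENABLE_NVTX` would turn the annotation on; without it, `tiny_allocation()` leaves `range_log()` empty.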
I think we can take a simpler, coarser-grained approach than https://github.com/rapidsai/rmm/pull/336 and just annotate the device_memory_resource base class allocate/deallocate calls. That way we can see the annotations no matter what resource is being used.
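A minimal sketch of that base-class idea, under stated assumptions: the public `allocate`/`deallocate` functions are non-virtual and open a range before forwarding to a virtual hook, so every derived resource is annotated automatically. `memory_resource_sketch`, `malloc_resource`, `recorded_range`, and `events` are hypothetical names, streams are omitted, and the range type logs to a vector rather than calling NVTX so the example is self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Stand-in event log so the sketch runs without NVTX. Not thread-safe.
inline std::vector<std::string>& events()
{
  static std::vector<std::string> log;
  return log;
}

// Hypothetical RAII range; a real build would open and close an NVTX range here.
struct recorded_range {
  explicit recorded_range(char const* name)
  {
    events().push_back(std::string("push:") + name);
  }
  ~recorded_range() { events().push_back("pop"); }
  recorded_range(recorded_range const&)            = delete;
  recorded_range& operator=(recorded_range const&) = delete;
};

// Simplified base class: the public entry points are annotated once, here.
class memory_resource_sketch {
 public:
  virtual ~memory_resource_sketch() = default;

  void* allocate(std::size_t bytes)
  {
    recorded_range r{"allocate"};  // range covers the derived implementation
    return do_allocate(bytes);
  }

  void deallocate(void* p, std::size_t bytes)
  {
    recorded_range r{"deallocate"};
    do_deallocate(p, bytes);
  }

 private:
  virtual void* do_allocate(std::size_t bytes)           = 0;
  virtual void do_deallocate(void* p, std::size_t bytes) = 0;
};

// Example derived resource backed by host malloc/free; it contains no
// annotation code of its own.
class malloc_resource final : public memory_resource_sketch {
  void* do_allocate(std::size_t bytes) override
  {
    events().push_back("do_allocate");
    return std::malloc(bytes);
  }
  void do_deallocate(void* p, std::size_t) override
  {
    events().push_back("do_deallocate");
    std::free(p);
  }
};
```

Because the range lives in the base class, a pool resource or any other derived resource would show up in the profile the same way, at the cost of not seeing the finer-grained steps inside the pool.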
We're likely going to run into some difficulty/conflicts with having the nvtx3.hpp header in both RMM and libcudf until https://github.com/NVIDIA/NVTX/pull/2 is merged and we can pull the header from there to ensure both have the same version of the header.
There is some concern about contention between different threads/streams around the CUDA events used in the pool. It would be nice to have better insight into that scenario.
I have an open PR #336, but (as Jake points out) I am waiting on NVIDIA/NVTX#2. I will just add NVTX regions at the top level rather than at low levels within pool_memory_resource.
This issue has been marked rotten due to no recent activity in the past 90d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.
This issue has been marked stale due to no recent activity in the past 30d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be marked rotten if there is no activity in the next 60d.
Still waiting on NVIDIA/NVTX#2 to be put in a release.