
[BUG] Unnecessary stream synchronization in cudf::is_valid

Open jlowe opened this issue 1 year ago • 1 comments

Describe the bug
While looking at an nsys trace of an aggregation with many aggregation functions, I noticed a lot of stream synchronization occurring. The trace contained many occurrences of cudf::is_valid, and each one synchronized with the stream. Computing this column should not require any information from the GPU, because the row count is the same as the input's and the null count is always zero.

Steps/Code to reproduce bug
Capture an Nsight Systems profile of code that uses is_valid and note the cudaStreamSynchronize call within the cudf::is_valid range.

Expected behavior
No stream synchronization is triggered by the is_valid call.

jlowe • Oct 17 '24 22:10

It looks like the stream synchronization is triggered by the use of rmm::exec_policy instead of rmm::exec_policy_nosync in cudf::detail::true_if at https://github.com/rapidsai/cudf/blob/branch-24.12/cpp/include/cudf/detail/unary.hpp#L62.

jlowe • Oct 17 '24 22:10
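To illustrate the difference between the two execution policies named above, here is a minimal sketch. It is not the code in cudf::detail::true_if. The `is_even` predicate and the `fill_flags` helper are hypothetical names chosen for illustration, and the buffer size is arbitrary. The sketch assumes RMM and Thrust are available and is compiled with nvcc. With rmm::exec_policy, Thrust is free to synchronize the stream before the algorithm returns. With rmm::exec_policy_nosync, Thrust is allowed to skip that synchronization when the algorithm does not need a result back on the host.

```cuda
#include <rmm/cuda_stream_view.hpp>
#include <rmm/device_uvector.hpp>
#include <rmm/exec_policy.hpp>

#include <thrust/iterator/counting_iterator.h>
#include <thrust/transform.h>

// Hypothetical predicate standing in for a per-row check.
struct is_even {
  __device__ bool operator()(int i) const { return i % 2 == 0; }
};

// Hypothetical helper: fills `out` with one boolean per row index.
void fill_flags(rmm::device_uvector<bool>& out, rmm::cuda_stream_view stream)
{
  auto const begin = thrust::counting_iterator<int>(0);
  auto const end   = begin + static_cast<int>(out.size());

  // Synchronizing policy: Thrust may call cudaStreamSynchronize before returning.
  thrust::transform(rmm::exec_policy(stream), begin, end, out.begin(), is_even{});

  // Non-synchronizing policy: the work is enqueued on `stream` and Thrust may
  // return without waiting, since no result has to be read back on the host.
  thrust::transform(rmm::exec_policy_nosync(stream), begin, end, out.begin(), is_even{});
}

int main()
{
  rmm::cuda_stream_view stream = rmm::cuda_stream_default;
  rmm::device_uvector<bool> flags(1000, stream);  // arbitrary size for the sketch
  fill_flags(flags, stream);
  stream.synchronize();  // the caller decides when to wait for the results
  return 0;
}
```

Profiling a program like this with Nsight Systems should make it possible to compare the cudaStreamSynchronize activity around each of the two calls. The exact behavior depends on the Thrust version in use.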