cudf
Add `bytes_per_second` to groupby max benchmark.
This patch adds memory statistics for the GROUPBY_NVBENCH benchmark using the max aggregation.
For this purpose, helper functions are introduced to compute the payload size for:
- Column
- Table
- Groupby execution results
This patch relates to #13735.
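A minimal sketch of what such helpers could look like (the names `required_bytes` and the recursion scheme are illustrative assumptions, not the actual PR code): the column overload handles fixed-width payloads and recurses into child columns to cover nested types, and the table overload sums over its columns.

```cpp
#include <cudf/column/column_view.hpp>
#include <cudf/table/table_view.hpp>
#include <cudf/utilities/traits.hpp>
#include <cudf/utilities/type_dispatcher.hpp>
#include <cstddef>
#include <numeric>

// Hypothetical helper: payload size of a column in bytes, including children.
std::size_t required_bytes(cudf::column_view const& col)
{
  std::size_t bytes = 0;
  if (cudf::is_fixed_width(col.type())) {
    bytes = static_cast<std::size_t>(col.size()) * cudf::size_of(col.type());
  }
  // Recurse into child columns to cover nested types (lists, structs, strings).
  for (auto it = col.child_begin(); it != col.child_end(); ++it) {
    bytes += required_bytes(*it);
  }
  return bytes;
}

// Hypothetical helper: payload size of a table as the sum over its columns.
std::size_t required_bytes(cudf::table_view const& table)
{
  return std::accumulate(table.begin(), table.end(), std::size_t{0},
                         [](std::size_t acc, cudf::column_view const& col) {
                           return acc + required_bytes(col);
                         });
}
```

A groupby-result overload could then simply sum `required_bytes` over the keys table and each result column.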
Checklist
- [x] I am familiar with the Contributing Guidelines.
Hi, I added a few helper functions which I would also use for the other groupby benchmarks.
PS: Should I add the other groupby benchmarks to this PR to reduce the amount of PR for #13735 a bit?
Would other benchmarks benefit from those helper functions as well? If not, we can move them to `benchmarks/groupby/group_common.hpp`.
Should I add the other groupby benchmarks to this PR to reduce the amount of PR for #13735 a bit?
Sounds reasonable to me.
Would other benchmarks benefit from those helper functions as well? If not, we can move them to `benchmarks/groupby/group_common.hpp`.
I think so. They could be used in a few of the benchmarks to simplify the code.
/ok to test
Hi, the PR is not ready; I haven't had enough time over the last two days, so there's no need to review it yet :smiley:.
Hi @PointKernel , @davidwendt,
In the current state of the PR, every NVBENCH benchmark has `bytes_per_second` added. However, I'm a bit unhappy with its current state. Specifically, I'd like to introduce test code to validate the calculations for `required_bytes`, especially when dealing with nested column types.
I'm considering adding these utility functions to the test/utilities directory and creating corresponding tests there. I've noticed that similar utility functions used in the tests are also employed within the benchmarks. However, I think that housing benchmark-specific helper functions under the test directories is not optimal. Do you have any better ideas?
PS: Is there something like a nested column iterator, which iterates over all columns of a table, including child columns? I've not found anything like this, and I'm also unsure if it's feasible to implement, but it would be great for `required_bytes` :sweat_smile:.
PPS: benchmark output perf_log.txt
Hi @Blonck, this is nice work and I'd like to get this change merged. However, it seems that you haven't had a chance to finish this up. Would you mind if I take it from where you left off and finish it?
You may want to look into using `cudf::row_bit_count` to get the column/table sizes: https://docs.rapids.ai/api/libcudf/stable/group__transformation__transform.html#gaa78354f8eda093519182149710215d6f
It already handles all cudf types. The output is the number of bits per row, but you can use `cudf::reduce` to get the total size and divide by 8 to get the number of bytes.
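The suggestion above could be sketched roughly as follows (an illustrative helper, not code from this PR; it assumes a valid `cudf::table_view` and a configured CUDA device):

```cpp
#include <cudf/aggregation.hpp>
#include <cudf/reduction.hpp>
#include <cudf/scalar/scalar.hpp>
#include <cudf/table/table_view.hpp>
#include <cudf/transform.hpp>
#include <climits>
#include <cstdint>

// Hypothetical helper: total payload size of a table in bytes, covering
// nested types via cudf::row_bit_count.
std::int64_t table_size_bytes(cudf::table_view const& view)
{
  // One INT32 entry per row, holding that row's size in bits.
  auto const bits_per_row = cudf::row_bit_count(view);

  // Sum the per-row bit counts into a single INT64 scalar.
  auto const agg    = cudf::make_sum_aggregation<cudf::reduce_aggregation>();
  auto const result = cudf::reduce(bits_per_row->view(), *agg,
                                   cudf::data_type{cudf::type_id::INT64});

  auto const total_bits =
    static_cast<cudf::numeric_scalar<std::int64_t> const&>(*result).value();
  return total_bits / CHAR_BIT;  // convert bits to bytes
}
```

The result could then be fed to NVBench via `state.add_global_memory_reads` / `add_global_memory_writes` so that `bytes_per_second` is reported.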
Thanks @davidwendt, this is very useful! I will check out the function you suggested.
Thanks for working on this @Blonck! It looks like this was a helpful starting point for #16126, which ultimately superseded this PR.