
[Minor] Extend the Parquet writer's dictionary encoding benchmark.

mhaseeb123 opened this pull request 1 year ago • 4 comments

Description

This PR extends the data cardinality and run-length ranges for the existing Parquet writer's dictionary encoding benchmark.

Checklist

  • [x] I am familiar with the Contributing Guidelines.
  • [x] New or existing tests cover these changes.
  • [x] The documentation is up to date with these changes.

mhaseeb123 avatar Aug 17 '24 01:08 mhaseeb123

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

copy-pr-bot[bot] avatar Aug 19 '24 21:08 copy-pr-bot[bot]

/ok to test

mhaseeb123 avatar Aug 19 '24 21:08 mhaseeb123

what's the reason for this change?

vuule avatar Aug 20 '24 11:08 vuule

> what's the reason for this change?

First of all, welcome back. Greg wanted me to push any updates I made to the benchmark for #16541. That said, my remaining local changes (even wider extended ranges) need not be pushed upstream unless needed.

mhaseeb123 avatar Aug 20 '24 17:08 mhaseeb123

/ok to test

mhaseeb123 avatar Sep 06 '24 23:09 mhaseeb123

/ok to test

mhaseeb123 avatar Sep 07 '24 01:09 mhaseeb123

/ok to test

mhaseeb123 avatar Sep 09 '24 18:09 mhaseeb123

/merge

mhaseeb123 avatar Sep 09 '24 21:09 mhaseeb123

Would be nice to know how much time this increases in benchmark runs. If it is not available now, follow up with Randy on benchmark runs.

Results are in #16541 (here), which is the work this benchmark is being extended for. Each new benchmark in the matrix takes roughly 0.5 s to run on my workstation (AMD Threadripper + RTX Ada 5880), so with 8 new benchmarks this should add roughly 4 s of total runtime.
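The runtime estimate above can be sketched as a quick back-of-the-envelope calculation. The axis values below are hypothetical placeholders chosen only to illustrate how extending two benchmark axes grows the matrix multiplicatively; the per-benchmark time (~0.5 s) and the count of 8 new configurations are the numbers quoted in this comment, not fresh measurements:

```python
from itertools import product

# Hypothetical axis values for illustration only; the actual values used
# in the PR's benchmark matrix may differ.
base_cardinality = [0, 1_000]
extended_cardinality = base_cardinality + [10_000, 100_000]

base_run_length = [1, 32]
extended_run_length = base_run_length + [1_024]

# A benchmark runs once per (cardinality, run_length) combination.
base_matrix = list(product(base_cardinality, base_run_length))
extended_matrix = list(product(extended_cardinality, extended_run_length))

seconds_per_benchmark = 0.5  # rough per-configuration time quoted above
new_benchmarks = len(extended_matrix) - len(base_matrix)
added_runtime = new_benchmarks * seconds_per_benchmark

print(f"{new_benchmarks} new benchmarks, ~{added_runtime:.1f} s added")
```

With these placeholder axes the matrix grows from 2×2 = 4 to 4×3 = 12 configurations, i.e. 8 new benchmarks at ~0.5 s each, matching the ~4 s estimate.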

mhaseeb123 avatar Sep 09 '24 21:09 mhaseeb123