cudf
[BUG] cuIO benchmarks generate larger files than expected
The cuIO ORC and Parquet benchmarks now generate larger files than they did previously (e.g. larger than the GDS blog data). Some observations:
- Only the cases where both cardinality and run length are set lead to small files.
- Cases where the data does not encode/compress well now generate files that are ~20% larger (based on integral columns).
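
To make the two data-profile knobs mentioned above concrete, here is a minimal, standalone sketch of how cardinality and run length affect the compressed size of an integer column. It is not the cuIO benchmark code: `make_column` and `compressed_size` are hypothetical helpers, the parameter values are arbitrary, and generic `zlib` compression over raw bytes stands in for the ORC/Parquet encoders, so absolute sizes (and the relative effect of each knob) will differ from what the benchmarks produce.

```python
import random
import struct
import zlib


def make_column(num_rows, cardinality=0, run_length=1, seed=0):
    """Generate int32 values (hypothetical helper, not the benchmark generator).

    cardinality=0 means values are drawn from the full int32 range;
    run_length is the number of times each drawn value is repeated.
    """
    rng = random.Random(seed)
    values = []
    while len(values) < num_rows:
        if cardinality > 0:
            value = rng.randrange(cardinality)
        else:
            value = rng.randrange(-2**31, 2**31)
        values.extend([value] * run_length)
    return values[:num_rows]


def compressed_size(values):
    """Size in bytes of the zlib-compressed little-endian int32 buffer."""
    raw = struct.pack(f"<{len(values)}i", *values)
    return len(zlib.compress(raw))


num_rows = 100_000
sizes = {
    "random": compressed_size(make_column(num_rows)),
    "cardinality only": compressed_size(
        make_column(num_rows, cardinality=1000)),
    "run length only": compressed_size(
        make_column(num_rows, run_length=32)),
    "cardinality + run length": compressed_size(
        make_column(num_rows, cardinality=1000, run_length=32)),
}

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

Comparing the four printed sizes is one way to sanity-check a data profile before attributing a file-size change to the writer itself.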
This issue has been labeled `inactive-30d` due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled `inactive-90d` if there is no activity in the next 60 days.