cudf
Use page statistics in Parquet reader
Description
#14000 added the ability for the Parquet writer to write new page statistics. This PR uses those statistics in the Parquet reader to avoid some string size computations. Benchmarks show read times improving by up to roughly 20%.
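As a rough illustration of why a precomputed size helps (a sketch only, not cudf's implementation): in PLAIN-encoded BYTE_ARRAY data each value carries a 4-byte little-endian length prefix, so without a statistic the total string size has to be found by walking the page. The helper names and the `stat_bytes` parameter below are hypothetical stand-ins for a per-page size statistic.

```python
import struct
from typing import Iterable, Optional


def encode_plain_byte_array(values: Iterable[bytes]) -> bytes:
    """Encode values as PLAIN BYTE_ARRAY: 4-byte LE length, then the raw bytes."""
    return b"".join(struct.pack("<I", len(v)) + v for v in values)


def string_bytes_by_scan(page: bytes) -> int:
    """Sum the string payload sizes by walking every length prefix in the page."""
    total = 0
    pos = 0
    while pos < len(page):
        (length,) = struct.unpack_from("<I", page, pos)
        pos += 4 + length  # skip the prefix and the string data
        total += length
    return total


def string_bytes(page: bytes, stat_bytes: Optional[int]) -> int:
    """Use the (hypothetical) page statistic when present, else fall back to a scan."""
    if stat_bytes is not None:
        return stat_bytes
    return string_bytes_by_scan(page)


page = encode_plain_byte_array([b"cudf", b"", b"parquet"])
scanned = string_bytes(page, None)          # no statistic: full pass over the page
from_stat = string_bytes(page, scanned)     # statistic available: no pass needed
```

The fallback path matters because files written without these statistics still have to be readable.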
Checklist
- [x] I am familiar with the Contributing Guidelines.
- [x] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.
parquet_read_decode benchmark
## [0] NVIDIA RTX A6000
| data_type | io_type | cardinality | run_length | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff |
|-------------|---------------|---------------|--------------|------------|-------------|------------|-------------|--------------|---------|
| INTEGRAL | DEVICE_BUFFER | 0 | 1 | 14.942 ms | 0.74% | 14.989 ms | 0.76% | 46.778 us | 0.31% |
| INTEGRAL | DEVICE_BUFFER | 1000 | 1 | 14.115 ms | 0.50% | 14.201 ms | 0.46% | 85.853 us | 0.61% |
| INTEGRAL | DEVICE_BUFFER | 0 | 32 | 12.908 ms | 0.95% | 13.012 ms | 0.41% | 103.184 us | 0.80% |
| INTEGRAL | DEVICE_BUFFER | 1000 | 32 | 12.272 ms | 1.04% | 12.323 ms | 0.49% | 51.350 us | 0.42% |
| STRING | DEVICE_BUFFER | 0 | 1 | 46.516 ms | 0.45% | 45.672 ms | 0.50% | -844.320 us | -1.82% |
| STRING | DEVICE_BUFFER | 1000 | 1 | 13.762 ms | 0.68% | 13.015 ms | 0.71% | -746.030 us | -5.42% |
| STRING | DEVICE_BUFFER | 0 | 32 | 46.612 ms | 0.50% | 45.692 ms | 0.35% | -920.004 us | -1.97% |
| STRING | DEVICE_BUFFER | 1000 | 32 | 12.815 ms | 1.11% | 10.469 ms | 1.45% | -2345.521 us | -18.30% |
| LIST | DEVICE_BUFFER | 0 | 1 | 76.023 ms | 0.11% | 75.662 ms | 0.05% | -361.155 us | -0.48% |
| LIST | DEVICE_BUFFER | 1000 | 1 | 64.130 ms | 0.20% | 63.906 ms | 0.22% | -224.179 us | -0.35% |
| LIST | DEVICE_BUFFER | 0 | 32 | 55.348 ms | 0.13% | 55.170 ms | 0.09% | -177.805 us | -0.32% |
| LIST | DEVICE_BUFFER | 1000 | 32 | 56.574 ms | 0.21% | 56.434 ms | 0.15% | -139.456 us | -0.25% |
| STRUCT | DEVICE_BUFFER | 0 | 1 | 48.973 ms | 2.36% | 49.407 ms | 2.21% | 433.937 us | 0.89% |
| STRUCT | DEVICE_BUFFER | 1000 | 1 | 30.172 ms | 0.78% | 30.140 ms | 0.73% | -32.744 us | -0.11% |
| STRUCT | DEVICE_BUFFER | 0 | 32 | 49.314 ms | 2.83% | 50.072 ms | 2.74% | 758.959 us | 1.54% |
| STRUCT | DEVICE_BUFFER | 1000 | 32 | 28.299 ms | 0.37% | 27.150 ms | 0.37% | -1148.298 us | -4.06% |
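For reading these tables, the Diff and %Diff columns appear consistent with Cmp Time minus Ref Time, and that difference taken relative to Ref Time. A small Python sketch, assuming that interpretation (the function name is made up for illustration); negative values mean the comparison build, i.e. this PR, is faster:

```python
from typing import Tuple


def diff_and_percent(ref_ms: float, cmp_ms: float) -> Tuple[float, float]:
    """Return (difference in microseconds, percent change relative to ref)."""
    diff_ms = cmp_ms - ref_ms
    diff_us = diff_ms * 1000.0
    pct = diff_ms / ref_ms * 100.0
    return diff_us, pct


# STRING, cardinality 1000, run_length 32 row, using the rounded times from the table.
diff_us, pct = diff_and_percent(12.815, 10.469)
```

Because the table times are rounded to three decimals, values recomputed this way can differ from the printed columns in the last digit.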
parquet_read_chunks benchmark
## [0] NVIDIA RTX A6000
| T | io_type | cardinality | run_length | byte_limit | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff |
|--------|---------------|---------------|--------------|--------------|------------|-------------|------------|-------------|---------------|---------|
| STRING | DEVICE_BUFFER | 0 | 1 | 0 | 45.542 ms | 0.13% | 45.354 ms | 0.52% | -187.811 us | -0.41% |
| STRING | DEVICE_BUFFER | 1000 | 1 | 0 | 13.455 ms | 0.43% | 13.059 ms | 0.47% | -396.648 us | -2.95% |
| STRING | DEVICE_BUFFER | 0 | 32 | 0 | 45.564 ms | 0.17% | 45.433 ms | 0.31% | -131.213 us | -0.29% |
| STRING | DEVICE_BUFFER | 1000 | 32 | 0 | 12.537 ms | 1.16% | 10.671 ms | 4.28% | -1865.298 us | -14.88% |
| STRING | DEVICE_BUFFER | 0 | 1 | 500000 | 225.866 ms | 0.04% | 184.424 ms | 0.09% | -41442.361 us | -18.35% |
| STRING | DEVICE_BUFFER | 1000 | 1 | 500000 | 77.381 ms | 0.35% | 74.072 ms | 0.27% | -3309.740 us | -4.28% |
| STRING | DEVICE_BUFFER | 0 | 32 | 500000 | 225.814 ms | 0.05% | 183.758 ms | 0.18% | -42056.079 us | -18.62% |
| STRING | DEVICE_BUFFER | 1000 | 32 | 500000 | 61.544 ms | 0.51% | 52.393 ms | 0.20% | -9151.281 us | -14.87% |
The same parquet_read_chunks benchmark, but forcing PLAIN encoding:
| T | io_type | cardinality | run_length | byte_limit | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff |
|--------|---------------|---------------|--------------|--------------|------------|-------------|------------|-------------|---------------|---------|
| STRING | DEVICE_BUFFER | 0 | 1 | 0 | 45.449 ms | 0.32% | 45.168 ms | 0.56% | -281.006 us | -0.62% |
| STRING | DEVICE_BUFFER | 1000 | 1 | 0 | 46.893 ms | 0.69% | 46.812 ms | 0.79% | -80.940 us | -0.17% |
| STRING | DEVICE_BUFFER | 0 | 32 | 0 | 45.666 ms | 0.47% | 45.352 ms | 0.25% | -314.615 us | -0.69% |
| STRING | DEVICE_BUFFER | 1000 | 32 | 0 | 24.542 ms | 0.16% | 24.602 ms | 0.38% | 59.229 us | 0.24% |
| STRING | DEVICE_BUFFER | 0 | 1 | 500000 | 227.223 ms | 0.04% | 183.026 ms | 0.13% | -44196.857 us | -19.45% |
| STRING | DEVICE_BUFFER | 1000 | 1 | 500000 | 230.145 ms | 0.11% | 185.712 ms | 0.13% | -44432.935 us | -19.31% |
| STRING | DEVICE_BUFFER | 0 | 32 | 500000 | 227.210 ms | 0.07% | 183.010 ms | 0.07% | -44200.238 us | -19.45% |
| STRING | DEVICE_BUFFER | 1000 | 32 | 500000 | 208.225 ms | 0.13% | 164.224 ms | 0.11% | -44001.523 us | -21.13% |
/ok to test
Looks like #15020 broke this (db 3 ets 0 :open_mouth: :sob: :rofl:). Regrouping...
/ok to test
/merge