datafusion-comet
Support DELTA_BINARY_PACKED and DELTA_BYTE_ARRAY
What is the problem the feature request solves?
Some tests in Spark 4.0 (for example, ParquetTypeWideningSuite) use parquet.writer.version=v2.
The V2 writer uses delta encodings (DELTA_BINARY_PACKED and DELTA_BYTE_ARRAY). Comet currently cannot read such files.
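For context, here is a rough Python sketch of the idea behind the two encodings. It is an illustration only, not the Parquet wire format: real DELTA_BINARY_PACKED splits values into blocks and miniblocks and bit-packs the adjusted deltas, and real DELTA_BYTE_ARRAY delta-encodes the prefix lengths and suffix lengths. All function names below are hypothetical and are not part of Comet, Spark, or the parquet library.

```python
def delta_encode(values):
    """Simplified DELTA_BINARY_PACKED idea: keep the first value,
    the minimum delta, and each delta relative to that minimum."""
    if not values:
        return None, 0, []
    deltas = [b - a for a, b in zip(values, values[1:])]
    min_delta = min(deltas) if deltas else 0
    adjusted = [d - min_delta for d in deltas]  # always >= 0
    return values[0], min_delta, adjusted


def delta_decode(first, min_delta, adjusted):
    """Inverse of delta_encode."""
    if first is None:
        return []
    out = [first]
    for d in adjusted:
        out.append(out[-1] + d + min_delta)
    return out


def bit_width(adjusted):
    """Bits needed per adjusted delta if they were bit-packed."""
    return max(adjusted).bit_length() if adjusted else 0


def prefix_encode(items):
    """Simplified DELTA_BYTE_ARRAY idea: for each byte string, store the
    length of the prefix shared with the previous one, plus the suffix."""
    prev = b""
    prefix_lengths, suffixes = [], []
    for s in items:
        n = 0
        limit = min(len(prev), len(s))
        while n < limit and prev[n] == s[n]:
            n += 1
        prefix_lengths.append(n)
        suffixes.append(s[n:])
        prev = s
    return prefix_lengths, suffixes


def prefix_decode(prefix_lengths, suffixes):
    """Inverse of prefix_encode."""
    prev = b""
    out = []
    for n, suffix in zip(prefix_lengths, suffixes):
        prev = prev[:n] + suffix
        out.append(prev)
    return out
```

A reader for these encodings has to reverse both steps, which is why a decoder that only understands plain and dictionary pages cannot handle files written this way.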
Describe the potential solution
No response
Additional context
No response
IIRC, the vectorized versions of these encodings in Spark did not improve performance much over the row-based implementation in the parquet library.