delta-rs
Include file stats when converting a parquet directory to a Delta table
Description
Currently the ConvertToDeltaBuilder skips fetching and populating the file stats:
https://github.com/delta-io/delta-rs/blob/81593e919497221a1a08bf8db9d20e8e4a39a8a6/crates/core/src/operations/convert_to_delta.rs#L332-L353
This results in log files missing the min/max/null count statistics.
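For illustration, here is a minimal sketch of the kind of per-file statistics that would be expected in the log. It assumes a single nullable integer column and builds the JSON by hand; `ColumnStats`, `column_stats`, and `stats_json` are hypothetical names, not part of delta-rs. The field names (`numRecords`, `minValues`, `maxValues`, `nullCount`) follow the per-file statistics described in the Delta protocol. Emitting `null` for an all-null column is a simplification made for this sketch.

```rust
/// Hypothetical per-column statistics (not a delta-rs type).
#[derive(Debug, PartialEq)]
struct ColumnStats {
    min: Option<i64>,
    max: Option<i64>,
    null_count: usize,
}

/// Compute min, max and null count over one nullable integer column.
fn column_stats(values: &[Option<i64>]) -> ColumnStats {
    // Iterate over the non-null values only.
    let non_null = || values.iter().flatten().copied();
    ColumnStats {
        min: non_null().min(),
        max: non_null().max(),
        null_count: values.iter().filter(|v| v.is_none()).count(),
    }
}

/// Render the statistics in the shape of the `stats` field of an `add` action.
/// Assumption: an all-null column is written as `null` (simplified for the sketch).
fn stats_json(column: &str, num_records: usize, s: &ColumnStats) -> String {
    let fmt = |v: Option<i64>| v.map_or("null".to_string(), |x| x.to_string());
    format!(
        "{{\"numRecords\":{},\"minValues\":{{\"{c}\":{}}},\"maxValues\":{{\"{c}\":{}}},\"nullCount\":{{\"{c}\":{}}}}}",
        num_records,
        fmt(s.min),
        fmt(s.max),
        s.null_count,
        c = column
    )
}
```

A real implementation would more likely read these values from the Parquet footer metadata of each file rather than scanning the data.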
Use Case
These stats are useful because they allow file skipping: a reader can prune files whose min/max range cannot match a query predicate, which directly affects performance.
Granted, it may be possible to use the stats stored in the Parquet files themselves, but that is sub-optimal compared to reading them from the log directly.
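To make the performance argument concrete, here is a rough sketch of how a reader could use min/max stats from the log to skip files for an equality predicate. `FileStats` and `files_to_scan` are hypothetical names for this example and do not reflect how delta-rs implements pruning.

```rust
/// Hypothetical min/max stats for one column of one data file.
struct FileStats {
    path: &'static str,
    min: i64,
    max: i64,
}

/// Return the files that may contain `value`; a file is skipped when the
/// value falls outside its [min, max] range.
fn files_to_scan(files: &[FileStats], value: i64) -> Vec<&'static str> {
    files
        .iter()
        .filter(|f| f.min <= value && value <= f.max)
        .map(|f| f.path)
        .collect()
}
```

Without stats in the log, every file is a candidate, and the reader has to open each Parquet footer to make the same decision.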
Related Issue(s)