[EPIC] Continued correct and improved extracting Parquet statistics into ArrayRefs
Is your feature request related to a problem or challenge?
I consolidated the content of our previous tickets about better statistics, https://github.com/apache/datafusion/issues/10806 and https://github.com/apache/datafusion/issues/10806, into a new Epic and cleaned up the subtasks.
Describe the solution you'd like
Subtasks:
- [ ] https://github.com/apache/datafusion/issues/10586
- [ ] https://github.com/apache/datafusion/issues/10609
- [ ] https://github.com/apache/datafusion/issues/11000
- [x] https://github.com/apache/datafusion/issues/10923
- [x] https://github.com/apache/datafusion/issues/10926
- [x] https://github.com/apache/datafusion/issues/10928
- [x] #10952
- [x] #11026
- [ ] #11184
- [ ] #11185
- [x] #11027
- [x] #11111
- [x] #11112
- [x] #11113
- [ ] #11114
- [x] https://github.com/apache/datafusion/issues/10965
- [x] #10951
- [ ] https://github.com/apache/datafusion/issues/10934
- [ ] Update the parquet code `prune_pages_in_one_row_group` (source) to use the new `StatisticsExtractor` code
- [ ] Remove redundant tests in parquet statistics
- [ ] Port code / tests upstream: https://github.com/apache/arrow-rs/issues/4328
Describe alternatives you've considered
No response
Additional context
No response