[EPIC] Continue correcting and improving the extraction of Parquet statistics into ArrayRefs

Open alamb opened this issue 1 year ago • 0 comments

Is your feature request related to a problem or challenge?

I consolidated the content of our previous tickets about better statistics, https://github.com/apache/datafusion/issues/10806 and https://github.com/apache/datafusion/issues/10806, into a new Epic and cleaned up the subtasks.

Describe the solution you'd like

Subtasks:

  • [ ] https://github.com/apache/datafusion/issues/10586
  • [ ] https://github.com/apache/datafusion/issues/10609
  • [ ] https://github.com/apache/datafusion/issues/11000
  • [x] https://github.com/apache/datafusion/issues/10923
  • [x] https://github.com/apache/datafusion/issues/10926
  • [x] https://github.com/apache/datafusion/issues/10928
  • [x] #10952
  • [x] #11026
  • [ ] #11184
  • [ ] #11185
  • [x] #11027
  • [x] #11111
  • [x] #11112
  • [x] #11113
  • [ ] #11114
  • [x] https://github.com/apache/datafusion/issues/10965
  • [x] #10951
  • [ ] https://github.com/apache/datafusion/issues/10934
  • [ ] Update the parquet code prune_pages_in_one_row_group (source) to use the new StatisticsExtractor code
  • [ ] Remove redundant tests in parquet statistics
  • [ ] Port code / tests upstream: https://github.com/apache/arrow-rs/issues/4328

Describe alternatives you've considered

No response

Additional context

No response

alamb, Jun 14 '24 20:06