parquet-format
Add: `sum_value` to statistics and ColumnIndex
Allows reading sums directly from the row group metadata (for any row group where the whole column is of interest) or from the ColumnIndex (for any pages where the whole page is of interest). In some cases this can drastically reduce the amount of data that has to be read for aggregations.
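A rough sketch of how a reader might use such a field, in Python rather than the Thrift definitions themselves. The `ColumnChunk` class and `column_sum` function are hypothetical stand-ins for illustration only, not part of any existing Parquet library; `sum_value` is modeled as optional, on the assumption that writers may leave it unset.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ColumnChunk:
    """Hypothetical stand-in for one column chunk (or page) and its statistics."""
    values: List[int]          # stand-in for the encoded data that would need decoding
    sum_value: Optional[int]   # proposed statistic; None if the writer did not record it


def column_sum(chunks: List[ColumnChunk]) -> Tuple[int, int]:
    """Return (total, values_read) for a full-column SUM.

    Chunks that carry sum_value are answered from metadata alone;
    only chunks without it fall back to reading the actual values.
    """
    total = 0
    values_read = 0
    for chunk in chunks:
        if chunk.sum_value is not None:
            total += chunk.sum_value          # metadata only, no data read
        else:
            total += sum(chunk.values)        # fallback: scan the data
            values_read += len(chunk.values)
    return total, values_read


chunks = [
    ColumnChunk(values=[1, 2, 3], sum_value=6),
    ColumnChunk(values=[4, 5], sum_value=None),          # written without the statistic
    ColumnChunk(values=[10, 20, 30, 40], sum_value=100),
]
total, values_read = column_sum(chunks)
print(total, values_read)
```

In this toy case only the chunk lacking `sum_value` is scanned. The same pattern would apply per page via the ColumnIndex when a filter selects whole pages.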
@ZJONSSON At a minimum, this should probably be discussed on the dev mailing list and/or in a dedicated JIRA issue.
Since it has been 3 years since you posted this PR, if you are not interested in advocating for this anymore, feel free to say so and we'll simply close the PR :-)