GH-502: Clarify Int96 status and add recommended ordering

Open alamb opened this issue 5 months ago • 0 comments

Rationale for this change

  • As https://github.com/apache/parquet-format/issues/502 and the linked issues show, the current status of Int96 is confusing.
  • Several open source parquet writers have proposed changes to make them work better with Int96 despite it being deprecated, so we should clarify what the spec says about Int96 and statistics.

What changes are included in this PR?

  1. Add comments clarifying that Int96 should not be written by new parquet writers (@rdblue's suggestion of what deprecated means in practice)
  2. Add a recommendation on the ordering of Int96 statistics to match what @rahulketch and @alkis are suggesting in https://github.com/apache/parquet-java/issues/3242 and https://github.com/apache/arrow-rs/issues/7686
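
For readers unfamiliar with why the ordering needs spelling out, here is a minimal sketch. It assumes the common legacy timestamp layout (first 8 bytes are nanoseconds within the day, last 4 bytes are the Julian day, both little-endian) and assumes the recommended order is chronological: compare the day first, then the nanoseconds. The exact wording of the recommendation is in the linked issues; the function names below are hypothetical and come from no parquet library.

```python
import struct


def encode_int96(julian_day, nanos_of_day):
    """Pack a timestamp into the assumed 12-byte legacy Int96 layout."""
    # "<qi" = little-endian, 8-byte signed nanos followed by 4-byte signed day.
    return struct.pack("<qi", nanos_of_day, julian_day)


def int96_sort_key(raw):
    """Return a key that sorts Int96 values chronologically (assumed order)."""
    if len(raw) != 12:
        raise ValueError("Int96 values are exactly 12 bytes")
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    # The day is the most significant part, so it goes first in the key.
    return (julian_day, nanos_of_day)


earlier = encode_int96(2460000, 5)
later = encode_int96(2460001, 0)

# Comparing the raw bytes gets the order wrong, because the nanoseconds
# come first in the layout and the fields are little-endian.
bytewise_min = min(earlier, later)
chronological_min = min(earlier, later, key=int96_sort_key)
```

In this sketch `bytewise_min` is `later` while `chronological_min` is `earlier`, which is why min/max statistics computed with a plain byte comparison would be misleading for Int96 columns.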

Do these changes have PoC implementations?

No -- in my mind this does not change the meaning or intent of the parquet spec, but instead adds clarification to help other implementers

alamb, Jun 25 '25 18:06