DO NOT MERGE: Expand BYTE_STREAM_SPLIT to support FIXED_LEN_BYTE_ARRAY, INT32 and INT64
Implement the format additions described in PARQUET-2414.
TODO:
- [ ] Wait for format additions to be voted on
- [ ] Create corresponding JIRA issue and link to it
- [ ] Update integration testing changeset once testing file is merged
Jira
- [ ] My PR addresses the following Parquet Jira issues and references
them in the PR title. For example, "PARQUET-1234: My Parquet PR"
- https://issues.apache.org/jira/browse/PARQUET-XXX
- In case you are adding a dependency, check if the license complies with the ASF 3rd Party License Policy.
Tests
- [x] My PR adds the following unit tests OR does not need testing for this extremely good reason:
Commits
- [ ] My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines
from "How to write a good git commit message":
- Subject is separated from body by a blank line
- Subject is limited to 50 characters (not including Jira issue reference)
- Subject does not end with a period
- Subject uses the imperative mood ("add", not "adding")
- Body wraps at 72 characters
- Body explains "what" and "why", not "how"
Style
- [x] My contribution adheres to the code style guidelines and Spotless passes.
- To apply the necessary changes, run
mvn spotless:apply -Pvector-plugins
- To apply the necessary changes, run
Documentation
- [x] In case of new functionality, my PR adds documentation that describes how to use it.
- All the public functions and the classes in the PR contain Javadoc that explain what it does
@wgtmac @etseidl @gszadovszky This is a draft PR. It works fine on the integration test but still needs unit tests. Since I'm quite new to this codebase (and to Java in general), I welcome any early comments.
This is ready for review now. The only failing test is the interop test, since the required file is not in parquet-testing yet (I have checked it passes locally).
For the record, I must still address some of the review comments.
@gszadovszky I think I addressed your comments, could you take another look?
@pitrou Do you want to have this in the 1.14.0 release?