parquet-java
PARQUET-1903: Improve Parquet Protobuf Usability
Make sure you have checked all steps below.
Jira
- [X] My PR addresses the following PARQUET-1903 issues
Tests
- [X] My PR adds the following unit tests OR does not need testing for this extremely good reason:
Commits
- [X] My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
- Subject is separated from body by a blank line
- Subject is limited to 50 characters (not including Jira issue reference)
- Subject does not end with a period
- Subject uses the imperative mood ("add", not "adding")
- Body wraps at 72 characters
- Body explains "what" and "why", not "how"
Documentation
- [X] In case of new functionality, my PR adds documentation that describes how to use it.
- All the public functions and classes in the PR contain Javadoc that explains what they do
@belugabehr This looks really good! I'm probably going to be working a lot with protobuf + parquet in the future, so I'm happy to see changes like this. I'm not a parquet committer, but I definitely give this a non-binding +1.
@dossett Thanks. There's a bunch more I'd like to add, but this is already a lot.
Also check out:
https://issues.apache.org/jira/browse/PARQUET-1914
That's a good one too, @belugabehr! I posed this question on the parquet dev list about proto3, if you are interested:
http://mail-archives.apache.org/mod_mbox/parquet-dev/202009.mbox/%3CCAMgkoMJmLX%2BNc0-qnMLqkx6aGL1wW%3DMeJkGGTONG3ypmj9LUyw%40mail.gmail.com%3E
I've been in parquet-protobuf code lately and these seem like good changes. @belugabehr I'd be happy to pick this PR up if you like and address some of the minor comments. The biggest concern was a possible breaking change, but I don't think that's the case and tried to explain in the PR comments.