parquet-format
Apache Parquet Format
Arrow recently introduced [FixedShapeTensor](https://arrow.apache.org/docs/format/CanonicalExtensions.html#fixed-shape-tensor) and [VariableShapeTensor](https://arrow.apache.org/docs/format/CanonicalExtensions.html#variable-shape-tensor) canonical extension types that use FixedSizeList and StructArray(List, FixedSizeList) as storage, respectively. These are targeted at machine learning and scientific applications that deal with...
Apache Iceberg is adding geospatial support. It would be good if Apache Parquet could support a geometry type natively. **Reporter**: [Gang Wu](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=wgtmac) / @wgtmac **Assignee**: [Gang Wu](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=wgtmac) / @wgtmac ####...
More and more deployments are being done on ARM64 machines. It would be good to make sure the Parquet MR project builds fine on them. The project moved from TravisCI to...
Timestamp with time zone (per SQL): timestamps are adjusted to UTC and stored as integers. Metadata in logical types PR: see the discussion here: https://github.com/apache/parquet-format/pull/51#discussion_r109667837 **Reporter**: [Julien Le Dem](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=julienledem) / @julienledem **Assignee**:...
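A minimal sketch of what "adjusted to UTC and stored as integers" could look like in practice. The helper name `to_utc_micros` and the choice of microsecond precision are assumptions made for illustration; the issue itself does not fix a unit or an API.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch as a timezone-aware UTC instant.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_utc_micros(ts: datetime) -> int:
    """Hypothetical helper: normalize an aware timestamp to UTC and
    return it as integer microseconds since the Unix epoch."""
    if ts.tzinfo is None:
        raise ValueError("expected a timezone-aware timestamp")
    delta = ts.astimezone(timezone.utc) - EPOCH
    # Integer division of two timedeltas yields an int.
    return delta // timedelta(microseconds=1)


# The same instant written with two different UTC offsets.
plus_two = timezone(timedelta(hours=2))
local = datetime(2017, 4, 3, 14, 30, tzinfo=plus_two)
utc = datetime(2017, 4, 3, 12, 30, tzinfo=timezone.utc)

# Both map to the same stored integer; the original offset is not kept.
same_instant = to_utc_micros(local) == to_utc_micros(utc)
```

In this sketch the original offset is lost after normalization, which is why any time zone information would have to travel separately as metadata.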
Currently int96 binary ordering doesn't match its natural ordering. We should either specify this or declare int96 not ordered and link to the type replacing it. **Reporter**: [Julien Le Dem](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=julienledem)...
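The mismatch can be illustrated with a small sketch. It assumes the layout commonly used for INT96 timestamps (8 bytes of nanoseconds within the day followed by 4 bytes of Julian day, both little-endian); the helper `encode_int96` is hypothetical and not part of any Parquet library.

```python
import struct


def encode_int96(julian_day: int, nanos_of_day: int) -> bytes:
    """Hypothetical encoder for the assumed layout:
    8-byte little-endian nanos of day, then 4-byte little-endian Julian day."""
    return struct.pack("<qi", nanos_of_day, julian_day)


# Two timestamps as (julian_day, nanos_of_day); tuple order is chronological.
earlier = (2457847, 2)  # day N, 2 ns after midnight
later = (2457848, 1)    # day N + 1, 1 ns after midnight

# Natural (chronological) order: earlier sorts first.
natural_ok = earlier < later

# Byte-wise order compares the low nanos byte first, so it disagrees:
# the encoding of `earlier` sorts AFTER the encoding of `later`.
binary_disagrees = encode_int96(*earlier) > encode_int96(*later)
```

Under these assumptions, min/max statistics computed by plain byte comparison would not bound the values chronologically, which is the concern the issue raises.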
Add a union type annotation for Group types that represent a Union rather than a struct. Models like Avro or Arrow would make use of it. **Reporter**: [Julien Le Dem](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=julienledem)...
For completeness and compatibility with Arrow and SQL types. Those are related to the existing INTERVAL type. Some references: - https://msdn.microsoft.com/en-us/library/ms716506(v=vs.85).aspx - http://www.techrepublic.com/article/sql-basics-datetime-and-interval-data-types/ - https://www.postgresql.org/docs/9.3/static/datatype-datetime.html - https://docs.oracle.com/html/E26088_01/sql_elements001.htm - http://www.ibm.com/support/knowledgecenter/SSGU8G_12.1.0/com.ibm.sqlr.doc/ids_sqr_123.htm **Reporter**:...
PARQUET-372 drops page and column chunk stats when values are larger than 4k to avoid storing very large values in page headers and the file footer. An alternative approach is...
**Reporter**: [Micah Kornfield](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=emkornfield) / @emkornfield **Assignee**: [Micah Kornfield](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=emkornfield) / @emkornfield #### PRs and other links: - [GitHub Pull Request #258](https://github.com/apache/parquet-format/pull/258) - [GitHub Pull Request #61](https://github.com/apache/parquet-site/pull/61) **Note**: *This issue was originally...
**Reporter**: [Fokko Driesprong](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=fokko) / @Fokko **Assignee**: [Fokko Driesprong](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=fokko) / @Fokko #### PRs and other links: - [GitHub Pull Request #247](https://github.com/apache/parquet-format/pull/247) **Note**: *This issue was originally created as [PARQUET-2482](https://issues.apache.org/jira/browse/PARQUET-2482). Please see...