Apache Parquet Format

Results 93 parquet-format issues

On the mailing list, I proposed the addition of a Half Float (float16) type in Parquet: https://lists.apache.org/thread/03vmcj7ygwvsbno764vd1hr954p62zr5 This type is becoming increasingly popular in Machine Learning, and there are a...

Bumps [org.apache.maven.plugins:maven-shade-plugin](https://github.com/apache/maven-shade-plugin) from 3.5.1 to 3.5.2. Commits 95e22b4 [maven-release-plugin] prepare release maven-shade-plugin-3.5.2 d807fea Bump org.vafer:jdependency from 2.9.0 to 2.10 6d60841 Bump org.apache.commons:commons-compress from 1.23.0 to 1.25.0 68457e5 [MSHADE-468] add system...

dependencies
java

Bumps [org.codehaus.mojo:exec-maven-plugin](https://github.com/mojohaus/exec-maven-plugin) from 3.1.1 to 3.2.0. Release notes Sourced from org.codehaus.mojo:exec-maven-plugin's releases. 3.2.0 🚀 New features and improvements Enable to exec:java runnables and not only mains with loosely coupled injections...

dependencies
java

This commit adds a new column order `IEEE754TotalOrder`, which can be used for floating point types (FLOAT, DOUBLE, FLOAT16). The advantage of the new order is a well-defined ordering between...
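For readers unfamiliar with the idea, here is a minimal sketch of how an IEEE 754 total ordering over doubles can be computed. It assumes the standard `totalOrder` predicate from IEEE 754 and is not taken from the commit itself; the helper name `total_order_key` is hypothetical and only illustrates the bit-level trick.

```python
import struct

def total_order_key(x: float) -> int:
    """Map a double to an integer whose natural order matches IEEE 754 totalOrder.

    Hypothetical helper for illustration; not part of any Parquet API.
    """
    # Reinterpret the 64-bit pattern of the double as a signed integer.
    bits = struct.unpack("<q", struct.pack("<d", x))[0]
    # For negative values (sign bit set), flip the lower 63 bits so that
    # larger magnitudes sort lower, as totalOrder requires.
    if bits < 0:
        bits ^= 0x7FFFFFFFFFFFFFFF
    return bits

values = [1.0, float("inf"), 0.0, -0.0, float("-inf"), -1.0]
ordered = sorted(values, key=total_order_key)
```

Under this key, `-0.0` sorts before `+0.0`, and NaN values get a fixed position determined by their sign bit and payload instead of being unordered, which is what makes min/max statistics well defined.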

This commit proposes an improvement for handling of NaN values in FLOAT and DOUBLE type columns. The goal is to allow reading engines, regardless of how they order NaN w.r.t....

Make sure you have checked _all_ steps below. ### Jira - [ ] My PR addresses the following [Parquet Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references them in the PR title. For example,...

See related mailing list discussion: https://lists.apache.org/thread/kd3k4q691lp5c4q3r767zb8jltrm9z33 ## Background In https://github.com/apache/parquet-site/pull/34 we are adding an "implementation status" matrix for different parquet implementations, to help people understand the supported feature sets of...

As proposed in https://github.com/apache/arrow/issues/34510 and on [ML](https://lists.apache.org/thread/khco6z9kd1spxlokrjxhyy83x9ogvtdm), [PARQUET-2474](https://issues.apache.org/jira/browse/PARQUET-2474). Arrow recently introduced [FixedShapeTensor](https://arrow.apache.org/docs/format/CanonicalExtensions.html#fixed-shape-tensor) and [VariableShapeTensor](https://arrow.apache.org/docs/format/CanonicalExtensions.html#variable-shape-tensor) canonical extension types that use FixedSizeList and StructArray(List, FixedSizeList) as storage, respectively. These are targeted at...

This splits the VARIABLE_SIZE_LIST proposal from #241, as suggested [here](https://github.com/apache/parquet-format/pull/241#discussion_r1648081100). ### GitHub issue - [x] My PR addresses #437 ### Commits - [ ] My commits all reference Jira...