
Apache Parquet Format

Results 93 parquet-format issues

**Reporter**: [Fokko Driesprong](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=fokko) / @Fokko **Assignee**: [Fokko Driesprong](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=fokko) / @Fokko **Note**: *This issue was originally created as [PARQUET-2481](https://issues.apache.org/jira/browse/PARQUET-2481). Please see the [migration documentation](https://issues.apache.org/jira/browse/PARQUET-2502) for further details.*

Priority: Major
Type: enhancement

In current Parquet implementations, if the ndv (number of distinct values) is not set for a BloomFilter, most implementations guess 1M as the ndv and use it together with the fpp (false positive probability). So, if fpp is 0.01, the...

Priority: Major
Type: enhancement
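
To illustrate why a guessed ndv matters for the Bloom filter issue above, here is a minimal sketch that sizes a filter from ndv and fpp. It uses the textbook formula for a classic Bloom filter with an optimal number of hash functions, which is an assumption made for illustration: Parquet's split-block Bloom filter and the individual implementations may size their filters differently. The function name `classic_bloom_bits` is hypothetical.

```python
import math


def classic_bloom_bits(ndv: int, fpp: float) -> int:
    """Bits needed by a classic Bloom filter (textbook formula, not Parquet-specific).

    m = -n * ln(p) / (ln 2)^2, where n is the number of distinct values
    and p is the target false positive probability.
    """
    if ndv <= 0:
        raise ValueError("ndv must be positive")
    if not 0.0 < fpp < 1.0:
        raise ValueError("fpp must be strictly between 0 and 1")
    return math.ceil(-ndv * math.log(fpp) / (math.log(2) ** 2))


if __name__ == "__main__":
    # Compare a small real ndv with the 1M default mentioned in the issue.
    for ndv in (1_000, 1_000_000):
        bits = classic_bloom_bits(ndv, 0.01)
        print(f"ndv={ndv:>9,}  bits={bits:>10,}  approx KiB={bits / 8 / 1024:,.1f}")
```

Under this formula the filter size grows linearly with ndv, so a column with about a thousand distinct values that falls back to a 1M guess would get a filter roughly a thousand times larger than it needs.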

Due to PARQUET-2078, RowGroup.file_offset is not reliable. This field is also calculated incorrectly in the C++ OSS Parquet implementation (PARQUET-2089). **Reporter**: [Gabor Szadovszky](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=gszadovszky) / @gszadovszky **Assignee**: [Gidon Gershinsky](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=gershinsky) / @ggershinsky...

Priority: Major
Type: bug

The Parquet format keeps gaining features, while the different implementations cannot keep pace and are left behind, with some features implemented and others not. In many cases...

Priority: Major
Type: enhancement

**Reporter**: Micah Kornfield / @emkornfield **Assignee**: Micah Kornfield / @emkornfield **Note**: *This issue was originally created as [PARQUET-1933](https://issues.apache.org/jira/browse/PARQUET-1933). Please see the [migration documentation](https://issues.apache.org/jira/browse/PARQUET-2502) for further details.*

Priority: Major
Type: enhancement

The current merge script is incompatible with Python 3. Copy over the merge_script from the Arrow project, which was itself initially developed from merge_parquet.py. **Reporter**: [Uwe Korn](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=uwe) / @xhochy...

Priority: Major
Type: enhancement

In the Parquet format specification, under [the section for Plain encoding](https://github.com/apache/parquet-format/blob/master/Encodings.md#plain-plain--0), boolean is encoded using the deprecated bit-packed encoding. However, [the section for bit-packed encoding](https://github.com/apache/parquet-format/blob/master/Encodings.md#bit-packed-deprecated-bit_packed--4) specifies that it is only...

Priority: Major
Type: enhancement
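
For readers unfamiliar with what bit-packing booleans looks like in the Plain encoding issue above, here is a minimal sketch. It assumes the layout described in the Plain encoding section of Encodings.md (one bit per value, least significant bit first) and does not model the level-encoding use of the deprecated BIT_PACKED encoding. The function names are hypothetical and do not come from any Parquet library.

```python
from typing import List, Sequence


def pack_booleans_lsb_first(values: Sequence[bool]) -> bytes:
    """Pack booleans one bit each, filling every byte from its lowest bit."""
    out = bytearray((len(values) + 7) // 8)
    for i, value in enumerate(values):
        if value:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def unpack_booleans_lsb_first(data: bytes, count: int) -> List[bool]:
    """Inverse of pack_booleans_lsb_first; count is the number of values stored."""
    if count > len(data) * 8:
        raise ValueError("count exceeds the number of bits available")
    return [bool((data[i // 8] >> (i % 8)) & 1) for i in range(count)]


if __name__ == "__main__":
    flags = [True, False, True, True, False, False, False, False, True]
    packed = pack_booleans_lsb_first(flags)
    print(packed.hex())
    print(unpack_booleans_lsb_first(packed, len(flags)) == flags)
```

The value count has to be carried separately, because the padding bits in the final byte are indistinguishable from stored `False` values.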

We recently discovered that the Makefile was broken, and it would be best to check it during the Travis tests. I have a fix locally that I'll rebase and...

Priority: Major
Type: bug

Although considered deprecated, they should be documented, as the format is quite special. **Reporter**: [Uwe Korn](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=uwe) / @xhochy **Assignee**: [Uwe Korn](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=uwe) / @xhochy #### PRs and other links: -...

Priority: Major
Type: enhancement

Apache Iceberg is adding geospatial support: https://docs.google.com/document/d/1iVFbrRNEzZl8tDcZC81GFt01QJkLJsI9E2NBOt21IRI. It would be good if Apache Parquet could support a geometry type natively.