asfimport
It looks like the parquet-java merge script is set up to run only on Python 2, which is EOL. We should update it to run on Python 3. I plan to do...
They are currently not supported. They would need their own set of operators, such as contains() and size(). **Reporter**: [Alex Levenson](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=alexlevenson) / @isnotinvain #### PRs and other links: - [GitHub...
Parquet relies on field names. In many usages, e.g. schema resolution, this is a problem. Iceberg uses field IDs and stores ID/name mappings. This Jira is to add...
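As background for the ID-based resolution described above, Parquet's schema builder already allows attaching an integer ID to each field. A minimal sketch (the record and field names here are hypothetical, chosen only for illustration):

```java
import org.apache.parquet.schema.MessageType;
import org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName;
import org.apache.parquet.schema.Types;

public class FieldIdSketch {
  public static void main(String[] args) {
    // Build a schema whose fields carry explicit IDs, so a reader could
    // resolve columns by ID rather than by name (the approach Iceberg takes).
    MessageType schema = Types.buildMessage()
        .required(PrimitiveTypeName.INT64).id(1).named("user_id")
        .optional(PrimitiveTypeName.BINARY).id(2).named("user_name")
        .named("user");
    System.out.println(schema);
  }
}
```

Resolving by ID makes the mapping survive column renames, which is exactly where name-based resolution breaks down.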
We sometimes run into an exception when closing a ParquetWriter instance: ```java 2024-06-10 10:44:01.398 org.apache.parquet.util.AutoCloseables$ParquetCloseResourceException: Unable to close resource 2024-06-10 10:44:01.398 at org.apache.parquet.util.AutoCloseables.uncheckedClose(AutoCloseables.java:85) 2024-06-10 10:44:01.398 at org.apache.parquet.util.AutoCloseables.uncheckedClose(AutoCloseables.java:94) 2024-06-10 10:44:01.398 at...
The Java Parquet library has no usage documentation besides the sparse information available in the README. The only things I could find were a few decade-old third-party tutorials...
In an effort to understand the parquet format better, I've so far written my own Thrift parser, and upon examining the output, I noticed something peculiar. To begin with, check...
Hi, we are trying to use [org.apache.parquet.avro](https://www.tabnine.com/code/java/packages/org.apache.parquet.avro).AvroParquetWriter to write a Parquet file to an S3 bucket. The file is successfully written to the S3 bucket, but we get an exception: com.amazonaws.SdkClientException: Unable to verify...
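For context on the setup described above, a minimal AvroParquetWriter sketch targeting an `s3a://` path might look like the following. The bucket, key, and record schema are placeholders, and the actual S3 transfer is handled by the configured Hadoop filesystem (e.g. hadoop-aws), not by Parquet itself:

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.metadata.CompressionCodecName;

public class S3WriteSketch {
  public static void main(String[] args) throws Exception {
    // Hypothetical Avro schema for illustration.
    Schema schema = SchemaBuilder.record("Event").fields()
        .requiredLong("id").requiredString("name").endRecord();

    Configuration conf = new Configuration();
    // Placeholder bucket/key; credentials and endpoint come from the
    // s3a filesystem configuration, not from this code.
    Path path = new Path("s3a://my-bucket/events/part-0.parquet");

    try (ParquetWriter<GenericRecord> writer = AvroParquetWriter
        .<GenericRecord>builder(path)
        .withSchema(schema)
        .withConf(conf)
        .withCompressionCodec(CompressionCodecName.SNAPPY)
        .build()) {
      GenericRecord record = new GenericData.Record(schema);
      record.put("id", 1L);
      record.put("name", "example");
      writer.write(record);
    }
  }
}
```

Exceptions like the `SdkClientException` above typically surface from the underlying filesystem layer during upload or checksum verification, which is why they appear even though the writer itself completed successfully.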
I'm unable to create a bloom filter for a field when I perform writes with repeating values. The bloom filter returned is null when I try to read such a Parquet file....
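For reference on the write-side configuration involved here, bloom filters are opt-in per column on the `ParquetWriter` builder. A sketch, assuming a hypothetical column named `name` and an example output path:

```java
import org.apache.hadoop.fs.Path;
import org.apache.parquet.example.data.Group;
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.example.ExampleParquetWriter;

public class BloomFilterWriteSketch {
  static ParquetWriter<Group> openWriter(Path path) throws Exception {
    return ExampleParquetWriter.builder(path)
        // Enable a bloom filter for the "name" column (off by default).
        .withBloomFilterEnabled("name", true)
        // Optionally size it via the expected number of distinct values;
        // 100_000 here is an illustrative guess, not a recommendation.
        .withBloomFilterNDV("name", 100_000L)
        .build();
  }
}
```

Note that a column with very few distinct values (heavily repeating data) gives the writer little reason to emit a useful filter, which may interact with the null-on-read behavior reported above.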
I tried to use the `appendFile` method of `ParquetFileWriter` to merge several smaller Parquet files into one large Parquet file. After I finished the merge, I tried deleting the smaller...
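A minimal sketch of the `appendFile` merge flow described above, assuming all inputs share the same schema (the paths and the `schema` variable are placeholders):

```java
import java.util.Collections;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.hadoop.ParquetFileWriter;
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.util.HadoopInputFile;
import org.apache.parquet.hadoop.util.HadoopOutputFile;
import org.apache.parquet.schema.MessageType;

public class MergeSketch {
  static void merge(List<Path> inputs, Path merged, MessageType schema,
                    Configuration conf) throws Exception {
    ParquetFileWriter writer = new ParquetFileWriter(
        HadoopOutputFile.fromPath(merged, conf), schema,
        ParquetFileWriter.Mode.CREATE,
        ParquetWriter.DEFAULT_BLOCK_SIZE,
        ParquetWriter.MAX_PADDING_SIZE_DEFAULT);
    writer.start();
    for (Path input : inputs) {
      // Copies row groups byte-for-byte without decoding pages.
      writer.appendFile(HadoopInputFile.fromPath(input, conf));
    }
    writer.end(Collections.emptyMap());
  }
}
```

Because `appendFile` copies row groups as-is, the merged file keeps the small row groups of its inputs; it concatenates rather than compacts.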
Motivation: The current behavior of ParquetFileReader#readNextRowGroup is to eagerly enumerate all chunks in the row group and then read all pages in each chunk. For distributed data workloads, this can cause...
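To make the eager behavior concrete, here is a sketch of the typical `readNextRowGroup` loop; each call materializes the pages of the whole row group before returning (the file path is a placeholder):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.column.page.PageReadStore;
import org.apache.parquet.hadoop.ParquetFileReader;
import org.apache.parquet.hadoop.util.HadoopInputFile;

public class RowGroupScanSketch {
  public static void main(String[] args) throws Exception {
    Path path = new Path("/tmp/example.parquet"); // placeholder path
    try (ParquetFileReader reader = ParquetFileReader.open(
        HadoopInputFile.fromPath(path, new Configuration()))) {
      PageReadStore rowGroup;
      // Each call reads the pages of an entire row group up front,
      // which is the eager behavior the motivation above refers to.
      while ((rowGroup = reader.readNextRowGroup()) != null) {
        System.out.println("row group with " + rowGroup.getRowCount() + " rows");
      }
    }
  }
}
```

A lazier variant would defer page I/O until a column is actually consumed, which matters when a distributed task only touches a few columns of a wide row group.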