Gang Wu
The CI failures are related:

```
[INFO] -------------------------------------------------------------
Error:  COMPILATION ERROR :
[INFO] -------------------------------------------------------------
Error:  /home/runner/work/parquet-java/parquet-java/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopPositionOutputStream.java:[54,16] cannot find symbol
  symbol:   method hasCapability(java.lang.String)
  location: variable wrapped of type org.apache.hadoop.fs.FSDataOutputStream
Error:  /home/runner/work/parquet-java/parquet-java/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopPositionOutputStream.java:[67,15]...
```
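For context: `hasCapability(String)` comes from Hadoop's `StreamCapabilities` interface, which only newer Hadoop releases expose on `FSDataOutputStream`, so a direct call fails to compile against the older Hadoop version on CI. Below is a minimal sketch of one common compatibility pattern, probing the method reflectively; the helper class name is made up here, and this is only an illustration, not necessarily the fix this PR should take:

```java
import java.lang.reflect.Method;
import org.apache.hadoop.fs.FSDataOutputStream;

public final class StreamCapabilityHelper {
  private StreamCapabilityHelper() {}

  /**
   * Returns true if the wrapped stream advertises the given capability.
   * Uses reflection so the code still compiles and runs against Hadoop
   * versions that predate StreamCapabilities#hasCapability.
   */
  public static boolean hasCapability(FSDataOutputStream wrapped, String capability) {
    try {
      Method m = wrapped.getClass().getMethod("hasCapability", String.class);
      return (boolean) m.invoke(wrapped, capability);
    } catch (ReflectiveOperationException e) {
      // Method not available (old Hadoop) or not invocable: assume unsupported.
      return false;
    }
  }
}
```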
I think this is expected from the layered design. The `InternalParquetRecordWriter.flushRowGroupToStore()` method simply flushes all column chunks to the `ParquetFileWriter`, which forwards the write to the `PositionOutputStream`. The `PositionOutputStream` decides...
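To make the layering concrete, here is a minimal sketch of a custom `PositionOutputStream` over a local file. The class name is invented for illustration, but `getPos()` is the real hook the file writer uses to record column-chunk offsets; everything above it (row-group buffering, column-chunk layout) is handled by the higher layers:

```java
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.parquet.io.PositionOutputStream;

/** Minimal PositionOutputStream backed by a local file (illustrative only). */
public class LocalPositionOutputStream extends PositionOutputStream {
  private final FileOutputStream out;
  private long pos = 0;

  public LocalPositionOutputStream(String path) throws IOException {
    this.out = new FileOutputStream(path);
  }

  @Override
  public long getPos() {
    return pos; // byte offset the writer records as chunk locations
  }

  @Override
  public void write(int b) throws IOException {
    out.write(b);
    pos++;
  }

  @Override
  public void write(byte[] b, int off, int len) throws IOException {
    out.write(b, off, len);
    pos += len;
  }

  @Override
  public void close() throws IOException {
    out.close();
  }
}
```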
Could you please check how many row groups are in a single Parquet file? Is it a single row group per file? Usually the entire row group is flushed when `flushRowGroupToStore`...
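If it helps, the row group count can be read from the file footer with the parquet-java API; a small sketch (the input path is a placeholder):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.hadoop.ParquetFileReader;
import org.apache.parquet.hadoop.util.HadoopInputFile;

public class RowGroupCount {
  public static void main(String[] args) throws Exception {
    Path path = new Path(args[0]); // e.g. /tmp/data.parquet
    try (ParquetFileReader reader = ParquetFileReader.open(
        HadoopInputFile.fromPath(path, new Configuration()))) {
      // Each BlockMetaData entry in the footer corresponds to one row group.
      System.out.println("row groups: " + reader.getFooter().getBlocks().size());
    }
  }
}
```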
I don't think we can directly use parquet-cli to rewrite files in a cloud object store. You may either download them and rewrite them locally, or use the ParquetWriter API to set...
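As a rough sketch of the local-rewrite route, using the example `Group` writer from parquet-hadoop; the schema, output path, and codec below are placeholders, and which writer properties to set depends on what the rewrite needs to change:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.example.data.Group;
import org.apache.parquet.example.data.simple.SimpleGroupFactory;
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.example.ExampleParquetWriter;
import org.apache.parquet.hadoop.metadata.CompressionCodecName;
import org.apache.parquet.schema.MessageType;
import org.apache.parquet.schema.MessageTypeParser;

public class LocalRewrite {
  public static void main(String[] args) throws Exception {
    // Placeholder schema; a real rewrite would reuse the input file's schema.
    MessageType schema = MessageTypeParser.parseMessageType(
        "message example { required int64 id; }");

    try (ParquetWriter<Group> writer = ExampleParquetWriter
        .builder(new Path("/tmp/rewritten.parquet")) // local destination
        .withConf(new Configuration())
        .withType(schema)
        .withCompressionCodec(CompressionCodecName.ZSTD) // example property
        .build()) {
      // Records read from the downloaded file would be written back here.
      writer.write(new SimpleGroupFactory(schema).newGroup().append("id", 1L));
    }
  }
}
```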
What's the benefit of adding this file? Usually we need to report a new release via https://reporter.apache.org/addrelease.html?parquet
@VarshaUN I just assigned it to you. Please feel free to add any documentation as you see fit. Thanks for your interest!
There is no prerequisite. I'm not sure whether your proposal is too broad in scope to complete. In my mind it may be some code examples like what we have in Apache...
No, I don't mean that other components (e.g. parquet-column and parquet-encoding) are not useful. They are widely used by query engines to implement Parquet I/O, which is transparent to end users.
Sorry, I may not have the bandwidth to work on this at the moment.