parquet-java
parquet-cli rewrite option
Describe the usage question you have. Please include as many useful details as possible.
Hi,
Is it possible to read directly from a GCS bucket to prune a column, with a command like rewrite -i gs://sourcebbucket/part-00549.parquet -o gs://targetbucket/newdata/dd --prune-columns col4?
I am getting the error java.lang.RuntimeException: org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "gs"
Component(s)
No response
I don't think parquet-cli can rewrite files directly in a cloud object store. You can either download them and rewrite them locally, or use the ParquetWriter API and set the file system configuration programmatically.
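For the download-and-rewrite-locally route, a minimal sketch could look like the following. It assumes that `gsutil` is installed and authenticated, and that parquet-cli is available on the PATH under the name `parquet` (the launcher name is an assumption; adjust it to however the CLI is invoked in your environment). The function name `prune_via_local` is hypothetical. The `rewrite` flags are the ones from the question.

```shell
#!/usr/bin/env bash

# Sketch: prune columns from a Parquet file stored in GCS by staging it
# on local disk, since parquet-cli cannot resolve the "gs" scheme itself.
# Usage: prune_via_local <gs-source> <gs-target> <columns-to-prune>
prune_via_local() {
  local src="$1" dst="$2" cols="$3"
  local workdir status=0

  # Temporary staging directory for the input and the rewritten output.
  workdir="$(mktemp -d)" || return 1

  # 1. Download, 2. rewrite locally, 3. upload. Stop at the first failure
  #    and remember its exit code.
  gsutil cp "$src" "$workdir/input.parquet" &&
    parquet rewrite -i "$workdir/input.parquet" \
                    -o "$workdir/output.parquet" \
                    --prune-columns "$cols" &&
    gsutil cp "$workdir/output.parquet" "$dst" || status=$?

  # Always clean up the staging directory, even after a failure.
  rm -rf "$workdir"
  return "$status"
}
```

With the paths from the question, the call would be `prune_via_local gs://sourcebbucket/part-00549.parquet gs://targetbucket/newdata/dd col4`. Staging needs enough local disk for both the input and the rewritten file.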