Xianyang Liu
This is the first part of #4842. It adds the schema ID to DataFile/DeleteFile/ManifestFile, which can then be used to evaluate filter expressions based on the schema.
This PR improves the config setting for the parquet bloom filter. 1. Log a warning for columns that do not exist or whose type is unsupported (boolean or complex data type)....
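The validation described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual implementation: the function name `select_bloom_filter_columns`, the dict-based schema representation, and the exact set of unsupported type names are all assumptions made for the example.

```python
import logging

logger = logging.getLogger("bloom_filter_config")

# Assumption: type names for which a bloom filter is skipped
# (boolean and complex types, per the description above).
UNSUPPORTED_TYPES = {"boolean", "struct", "list", "map"}


def select_bloom_filter_columns(schema, requested):
    """Return the requested columns that can get a bloom filter.

    schema: dict mapping column name -> type name (hypothetical representation).
    requested: iterable of column names taken from the configuration.
    """
    accepted = []
    for name in requested:
        col_type = schema.get(name)
        if col_type is None:
            # Column named in the config is not in the schema: warn and skip.
            logger.warning("Skipping bloom filter for missing column: %s", name)
        elif col_type in UNSUPPORTED_TYPES:
            # Column exists but its type is unsupported: warn and skip.
            logger.warning(
                "Skipping bloom filter for column %s: unsupported type %s",
                name,
                col_type,
            )
        else:
            accepted.append(name)
    return accepted
```

Warning and skipping, rather than raising, keeps a write from failing because of a stale or mistyped configuration entry.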
We got an IndexOutOfBoundsException: ``` 2022-08-03 09:33:34,076 Error executing query, currentState RUNNING, java.lang.RuntimeException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3315 in stage 5.0 failed 4 times, most recent...
The following errors are just error prints. This is a bug in Ray and will be fixed in the future. ```python 2020-12-01 20:44:59,081 ERROR worker.py:977 -- Possible unhandled error from worker:...
Support setting a placement group for the Spark cluster.
This patch aims to reduce the time spent on Iceberg task planning. We noticed that task planning does not benefit much from ParallelIterator when there are many manifest files to...
### Rationale for this change Closes #3344 ### What changes are included in this PR? ### Are these changes tested? Yes, added UT. ### Are there any user-facing changes? Yes,...