Xianyang Liu

Results 10 issues of Xianyang Liu

This is the first part of #4842. Add the schema id to DataFile/DeleteFile/ManifestFile, which could be used to evaluate the filter expression based on the schema.

python
API
spark
parquet
arrow
core
data
flink
pig
common
docs
build
hive

This PR improves the config setting for the parquet bloom filter. 1. Log a warning for columns that don't exist or whose type is unsupported (boolean or complex data type)....

parquet

We got an IndexOutOfBoundsException: ``` 2022-08-03 09:33:34,076 Error executing query, currentState RUNNING, java.lang.RuntimeException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3315 in stage 5.0 failed 4 times, most recent...

lang-java

The following errors are just error prints. This is a bug in Ray and will be fixed in the future. ```python 2020-12-01 20:44:59,081 ERROR worker.py:977 -- Possible unhandled error from worker:...

Support setting a placement group for the Spark cluster.

enhancement

This patch aims to reduce the time of Iceberg task planning. We noticed that task planning does not benefit much from ParallelIterator when there are many manifest files to...

core

Closes #7695.

spark
INFRA
build
not-stale

### Rationale for this change Closes #3344 ### What changes are included in this PR? ### Are these changes tested? Yes, added UT. ### Are there any user-facing changes? Yes,...