Xianyang Liu issues

Results: 10 issues of Xianyang Liu

Core: Add schema_id to ContentFile/ManifestFile

This is the first part of #4842. It adds the schema ID to DataFile/DeleteFile/ManifestFile, which can then be used to evaluate filter expressions based on the schema.

python

API

spark

parquet

arrow

core

data

flink

pig

common

docs

build

hive

Parquet: Set parquet bloom filter config with compatible column name

This PR improves the config setting for the Parquet bloom filter. 1. Log a warning for columns that don't exist or whose type is unsupported (boolean or complex data types)....

parquet

ARROW-17338: [Java] The maximum request memory of BaseVariableWidthVector should limit to Integer.MAX_VALUE

We got an IndexOutOfBoundsException: ``` 2022-08-03 09:33:34,076 Error executing query, currentState RUNNING, java.lang.RuntimeException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3315 in stage 5.0 failed 4 times, most recent...

lang-java

INFRA

build

not-stale

GH-3344: Adaptive compression for v2 page

### Rationale for this change Closes #3344 ### What changes are included in this PR? ### Are these changes tested? Yes, added UT. ### Are there any user-facing changes? Yes,...

Xianyang Liu

Core: Add schema_id to ContentFile/ManifestFile

Parquet: Set parquet bloom filter config with compatible column name

ARROW-17338: [Java] The maximum request memory of BaseVariableWidthVector should limit to Integer.MAX_VALUE

Add timeout for UT

Possible unhandled error from worker: ray::ParallelIteratorWorker.par_iter_next_batch()

Add Placement Group support for Spark executor

Cover more UT tests

Core: Avoid reading ManifestFile when create ManifestReader

Build: Apply spotless for scala code

GH-3344: Adaptive compression for v2 page