BInwei Yang comments

Results: 167 comments of BInwei Yang

Support complex types in sparksql hash and xxhash64 function

> > ```
> > ============================================================================
> > [...]hmarks/ExpressionBenchmarkBuilder.cpp     relative  time/iter   iters/s
> > ============================================================================
> > hash_BIGINT##hash                                          883.61us     1.13K
> > hash_BIGINT##xxhash64                                      927.77us     1.08K
> > ----------------------------------------------------------------------------
> > hash_INTEGER##hash                                         808.87us     1.24K
> > hash_INTEGER##xxhash64                                     815.35us     1.23K
> > ...
> > ```

Support complex types in sparksql hash and xxhash64 function

@marin-ma Did you test TPC-H in Gluten with this PR? If not, let's run a test and see what the perf loss is.

Add RetryStrategy for S3 file system

> > @yma11 thanks! Can you please confirm and update here? We cannot add CI tests for some of these changes, but the expectation is that the author tests the...

Improve the FlushPolicyFactory to make the client can specify flush policy

@xiaoxmeng could you help take a look? The PR adds the row group config for the Parquet writer. Otherwise each file has a single row group.

[VL] One task writes too many hive partitions causing OOM

So each write node takes >10M of memory, which caused the OOM. Should we have one write node and keep it open for each partition? @JkSelf

[VL] One task writes too many hive partitions causing OOM

It's not the same issue. Here your memory is allocated by ArrowContextInstance and R2C. What's your number of reducers?

[VL] One task writes too many hive partitions causing OOM

> Sorry, I didn't understand your question, are you asking for number of reducers? This query had a single stage with 23663 tasks, where each task does a union of...

[VL] One task writes too many hive partitions causing OOM

@wForget Is the issue still there on your side? It looks like it's not fixed.

[GLUTEN-5953][VL] Prevent pushdown filters with unsupported data types to scan node

Do you mean these fields are not supported in filter pushdown?

```
StructField("short_decimal_field", DecimalType(5, 2), true),
StructField("long_decimal_field", DecimalType(32, 8), true),
StructField("binary_field", BinaryType, true),
StructField("timestamp_field", TimestampType, true)
```

Support reading Iceberg split with equality deletes

> cc: @liujiayi771 Would you like to take a review? Thanks.

@liujiayi771 Can you take a look at the PR if possible? Should we add something on the Gluten side after...