BInwei Yang
> > ```
> > ============================================================================
> > [...]hmarks/ExpressionBenchmarkBuilder.cpp     relative  time/iter   iters/s
> > ============================================================================
> > hash_BIGINT##hash                                         883.61us     1.13K
> > hash_BIGINT##xxhash64                                     927.77us     1.08K
> > ----------------------------------------------------------------------------
> > hash_INTEGER##hash                                        808.87us     1.24K
> > hash_INTEGER##xxhash64                                    815.35us     1.23K
> > ...
> > ```
@marin-ma Did you test TPCH in Gluten using the PR? If not, let's run a test and see what the perf loss is.
> > @yma11 thanks! Can you please confirm and update here? We cannot add CI tests for some of these changes, but the expectation is that the author tests the...
@xiaoxmeng Can you help take a look? The PR adds the row group config for the Parquet writer. Otherwise each file has a single row group.
So each write node takes >10M of memory, which caused the OOM. Should we have one write node and keep it open for each partition? @JkSelf
It's not the same issue. Here your memory is allocated by ArrowContextInstance and R2C. What's your reducer#?
> Sorry, I didn't understand your question, are you asking for number of reducers? This query had a single stage with 23663 tasks, where each task does a union of...
@wForget Is the issue still there on your side? It looks like it's not fixed.
Do you mean these fields are not supported in filter pushdown?
```
StructField("short_decimal_field", DecimalType(5, 2), true),
StructField("long_decimal_field", DecimalType(32, 8), true),
StructField("binary_field", BinaryType, true),
StructField("timestamp_field", TimestampType, true)
```
> cc: @liujiayi771 Would you like to take a review? Thanks.

@liujiayi771 Can you take a look at the PR if possible? Should we add something on the Gluten side after...