[GLUTEN-7685][VL][RAS] Optimize rough model

Open zml1206 opened this issue 1 year ago • 3 comments

What changes were proposed in this pull request?

(Fixes: #7685) In production, we encountered a case where the cost of row-to-columnar (r2c) conversion on a large table far outweighed the performance gain from native execution. For example:

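// Disable Gluten's columnar file scan so the scan runs in vanilla Spark.
sql("set spark.gluten.sql.columnar.filescan=false")
// Write a 100M-row test table with a string key k and a small integer value v.
spark.range(100000000).toDF("id")
  .selectExpr("concat('id_', round(id/1000000)) as k", "id % 10 as v")
  .write.mode("overwrite").parquet("tmp/t1")
spark.read.parquet("tmp/t1").createOrReplaceTempView("t1")
// Aggregate over the whole table.
sql("select k, sum(v) as v from t1 group by k").collect()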

In a local test, this query takes 18 seconds with Gluten enabled and only 6 seconds with Gluten disabled. Therefore, I would like RAS to fall back to vanilla Spark in this case.

The optimization points are as follows:

  1. Add a bytesSize factor: cost = bytesSizeFactor * opCost.
  2. The r2c cost can be configured separately; the default is 100. If sizeBytes is below the threshold, the cost of RowToColumnarLike is ignored.
  3. Remove the r2c fallback for plans containing complex types.
  4. The vanilla op cost is configurable, with a default of 20; the Gluten op cost is 1.
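
The four points above can be illustrated with a minimal, self-contained sketch. It is not the code in this patch: the object, type, and field names are hypothetical, the way `bytesSizeFactor` is derived from `sizeBytes` (one unit per GiB) and the 1 GiB threshold are assumptions made only for illustration, and columnar-to-row cost is left out. Only the defaults 100, 20, and 1 and the formula `cost = bytesSizeFactor * opCost` come from the description.

```scala
// Hypothetical, simplified model of the described costing; not Gluten's API.
object RoughCostSketch {
  sealed trait Op
  case object GlutenOp extends Op      // operator offloaded to native
  case object VanillaOp extends Op     // operator left on vanilla Spark
  case object RowToColumnar extends Op // r2c transition

  final case class Conf(
      vanillaCost: Long = 20L,                // default from the description
      r2cCost: Long = 100L,                   // default from the description
      glutenCost: Long = 1L,                  // from the description
      r2cSizeBytesThreshold: Long = 1L << 30, // ASSUMED: 1 GiB
      sizeBytesPerFactor: Long = 1L << 30)    // ASSUMED: one factor unit per GiB

  // ASSUMED derivation of the size factor; the description does not specify it.
  def bytesSizeFactor(sizeBytes: Long, conf: Conf): Long =
    math.max(1L, sizeBytes / conf.sizeBytesPerFactor)

  // cost = bytesSizeFactor * opCost, with r2c ignored below the threshold.
  def cost(op: Op, sizeBytes: Long, conf: Conf = Conf()): Long = op match {
    case RowToColumnar if sizeBytes < conf.r2cSizeBytesThreshold => 0L
    case RowToColumnar => bytesSizeFactor(sizeBytes, conf) * conf.r2cCost
    case VanillaOp     => bytesSizeFactor(sizeBytes, conf) * conf.vanillaCost
    case GlutenOp      => bytesSizeFactor(sizeBytes, conf) * conf.glutenCost
  }

  // Total cost of a linear plan whose operators all see the same input size.
  def planCost(ops: Seq[Op], sizeBytes: Long, conf: Conf = Conf()): Long =
    ops.map(cost(_, sizeBytes, conf)).sum

  // Vanilla scan + r2c + native aggregate, versus an all-vanilla plan.
  val offloaded: Seq[Op] = Seq(VanillaOp, RowToColumnar, GlutenOp)
  val fallback: Seq[Op] = Seq(VanillaOp, VanillaOp)
}
```

Under these assumptions, a 4 GiB input makes the all-vanilla plan cheaper than the plan that pays for r2c, while a small input (where the r2c cost is ignored) still favors offloading the aggregate.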

How was this patch tested?

zml1206 avatar Oct 25 '24 08:10 zml1206

https://github.com/apache/incubator-gluten/issues/7685

github-actions[bot] avatar Oct 25 '24 08:10 github-actions[bot]

Run Gluten Clickhouse CI

github-actions[bot] avatar Oct 25 '24 08:10 github-actions[bot]

The CI failure is unrelated.

zml1206 avatar Oct 28 '24 05:10 zml1206

Run Gluten Clickhouse CI

zhztheplayer avatar Oct 28 '24 08:10 zhztheplayer

Is it possible for you to share some test results about this change? Thanks!

I ran it in our grayscale ad hoc query environment yesterday, and overall executorCpuTime was reduced by 10%. @zhztheplayer

zml1206 avatar Oct 29 '24 01:10 zml1206

Is it possible for you to share some test results about this change? Thanks!

I ran it in our grayscale ad hoc query environment yesterday, and overall executorCpuTime was reduced by 10%. @zhztheplayer

Sounds great.

BTW, my impression is that RoughCostModel is already used by a few users. If you want to continue working on the new costers, would you consider creating a new model, such as RoughCostModel2 with the alias rough2? That way we don't accidentally break current usages of the rough model. Once it is mature enough, we can move the logic into a more standardized cost model.
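
As a rough illustration of this suggestion, the sketch below keeps the existing model reachable under its current alias and registers the new one under a separate alias. The registry object, the `CostModel` trait, and the `resolve` function are hypothetical and are not Gluten's actual classes; only the names RoughCostModel, RoughCostModel2, and rough2 come from the comment above, and the alias `rough` for the existing model is an assumption.

```scala
// Hypothetical alias registry; not Gluten's actual implementation.
object CostModelRegistrySketch {
  trait CostModel { def name: String }

  // Existing model: left untouched so current users see no behavior change.
  object RoughCostModel extends CostModel { val name = "RoughCostModel" }

  // New model: carries the experimental costers from this change.
  object RoughCostModel2 extends CostModel { val name = "RoughCostModel2" }

  // ASSUMED aliases: "rough" for the old model, "rough2" for the new one.
  private val aliases: Map[String, CostModel] = Map(
    "rough" -> RoughCostModel,
    "rough2" -> RoughCostModel2)

  // Look up a model by alias, failing loudly on an unknown name.
  def resolve(alias: String): CostModel =
    aliases.getOrElse(
      alias,
      throw new IllegalArgumentException(s"Unknown cost model alias: $alias"))
}
```

With this layout, users opt in to the new behavior by selecting the new alias, and the old alias keeps resolving to the unchanged model.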

zhztheplayer avatar Oct 29 '24 02:10 zhztheplayer

@zml1206 Can you add one line in PR description to demonstrate the way to enable this cost model? Thanks.

zhztheplayer avatar Oct 29 '24 07:10 zhztheplayer

@zml1206 Can you add one line in PR description to demonstrate the way to enable this cost model? Thanks.

Added.

zml1206 avatar Oct 29 '24 07:10 zml1206