
[GLUTEN-4745][CH] support Sort Merge Join

Open loudongfeng opened this issue 11 months ago • 28 comments

Support inner and outer joins

What changes were proposed in this pull request?

(Fixes: #GLUTEN-4745)

How was this patch tested?

unit tests and integration tests

loudongfeng avatar Feb 29 '24 08:02 loudongfeng

https://github.com/oap-project/gluten/issues/4745

github-actions[bot] avatar Feb 29 '24 08:02 github-actions[bot]

Run Gluten Clickhouse CI

github-actions[bot] avatar Feb 29 '24 08:02 github-actions[bot]

This looks like an unrelated failure in static-build-centos7-test:

error: Failed to download from mirror set
error: https://web.mit.edu/kerberos/dist/krb5/1.20/krb5-1.20.tar.gz: curl failed to download with exit code 18

loudongfeng avatar Feb 29 '24 10:02 loudongfeng

@zzcclp This PR is ready for review. Thanks. To handle null join keys correctly, we need to change CH's MergeJoinTransform; I will add that to CH later.

loudongfeng avatar Feb 29 '24 10:02 loudongfeng

@taiyang-li @lgbo-ustc please help to review, thanks.

zzcclp avatar Mar 01 '24 07:03 zzcclp

From https://opencicd.kyligence.com/job/gluten/job/gluten-ci/7371/console I can see that all tests passed. I'm not sure why it reports a failure.

ERROR: script returned exit code 1

loudongfeng avatar Mar 04 '24 01:03 loudongfeng

PR to support nulls first: https://github.com/Kyligence/ClickHouse/pull/483 Maybe we can make the null ordering configurable in ClickHouse later.
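
To make the null-ordering point concrete, here is a minimal Python sketch. It is not the MergeJoinTransform code and not code from this PR; the function names and row layout are hypothetical. It assumes standard SQL equi-join semantics, where a NULL key never matches anything, and it assumes both inputs are sorted with nulls first. Under those assumptions a left outer sort merge join still has to emit the null-keyed left rows, and it has to skip null-keyed right rows while merging.

```python
from typing import List, Optional, Tuple

Row = Tuple[Optional[int], str]  # (join key, payload); hypothetical layout


def nulls_first_key(row: Row) -> Tuple[int, int]:
    # None maps to (0, 0), which sorts before (1, k) for every non-null k.
    key = row[0]
    return (0, 0) if key is None else (1, key)


def sort_merge_left_join(left: List[Row], right: List[Row]):
    left = sorted(left, key=nulls_first_key)
    right = sorted(right, key=nulls_first_key)
    out = []
    j = 0  # cursor into the sorted right side
    for lkey, lval in left:
        if lkey is None:
            # A NULL key never matches, but a left outer join keeps the row.
            out.append((lkey, lval, None))
            continue
        # Skip right rows with NULL keys or keys smaller than the left key.
        while j < len(right) and (right[j][0] is None or right[j][0] < lkey):
            j += 1
        # Scan matches without moving j, so duplicate left keys rematch.
        m = j
        matched = False
        while m < len(right) and right[m][0] == lkey:
            out.append((lkey, lval, right[m][1]))
            matched = True
            m += 1
        if not matched:
            out.append((lkey, lval, None))
    return out


if __name__ == "__main__":
    l = [(2, "a"), (None, "b"), (1, "c"), (2, "d")]
    r = [(None, "x"), (2, "y"), (3, "z")]
    for row in sort_merge_left_join(l, r):
        print(row)
```

If the two sides disagreed on where nulls sort, the single forward cursor on the right side would no longer be valid, which is the reason the ordering has to be consistent between the sort and the merge.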

loudongfeng avatar Mar 04 '24 06:03 loudongfeng

From https://opencicd.kyligence.com/job/gluten/job/gluten-ci/7371/console I can see that all tests passed. I'm not sure why it reports a failure.

ERROR: script returned exit code 1

@liuneng1994 could you check whether CI is running OK?

taiyang-li avatar Mar 04 '24 06:03 taiyang-li

PR to support nulls first: Kyligence/ClickHouse#483 Maybe we can make the null ordering configurable in ClickHouse later.

Does this change have any side effects? Please try to land it in upstream ClickHouse/ClickHouse first instead of directly modifying src/* files in Kyligence/ClickHouse. cc @baibaichen

taiyang-li avatar Mar 04 '24 07:03 taiyang-li

@zzcclp This PR is ready for review. Thanks. To handle null join keys correctly, we need to change CH's MergeJoinTransform; I will add that to CH later.

Is there already a PR in CH?

lgbo-ustc avatar Mar 04 '24 07:03 lgbo-ustc

@zzcclp This PR is ready for review. Thanks. To handle null join keys correctly, we need to change CH's MergeJoinTransform; I will add that to CH later.

Is there already a PR in CH?

After this PR is merged, we can set nulls as smallest in Gluten: https://github.com/ClickHouse/ClickHouse/pull/60896

loudongfeng avatar Mar 06 '24 07:03 loudongfeng

For now, we can disable outer joins whose join keys may be null and support only INNER joins first.
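
A rough Python sketch of that interim rule, for illustration only. `JoinKey` and `can_offload_sort_merge_join` are hypothetical names, not Gluten APIs, and the real check would live in the plan validation code rather than in a standalone function. The assumption is that each join key carries a nullability flag and that any non-offloaded join falls back to the vanilla engine.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class JoinKey:
    name: str
    nullable: bool  # True if the key column may contain NULL


def can_offload_sort_merge_join(join_type: str, keys: Sequence[JoinKey]) -> bool:
    """Return True if the join may use the native sort merge join.

    INNER joins are always allowed here, since null-keyed rows produce no
    output in an inner equi-join. Any other join type is allowed only when
    no join key can be null.
    """
    if join_type.strip().upper() == "INNER":
        return True
    return not any(key.nullable for key in keys)


if __name__ == "__main__":
    keys = [JoinKey("id", nullable=True)]
    print(can_offload_sort_merge_join("inner", keys))
    print(can_offload_sort_merge_join("left outer", keys))
```

Once null ordering is handled on the ClickHouse side, the nullable-key restriction could be dropped and the function would reduce to accepting all supported join types.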

loudongfeng avatar Mar 06 '24 07:03 loudongfeng

Ready for review. Thanks.

loudongfeng avatar Mar 30 '24 08:03 loudongfeng