incubator-gluten
[GLUTEN-4745][CH] support Sort Merge Join
Support inner and outer joins
What changes were proposed in this pull request?
(Fixes: #GLUTEN-4745)
How was this patch tested?
Unit tests and integration tests.
https://github.com/oap-project/gluten/issues/4745
Run Gluten Clickhouse CI
The failure in static-build-centos7-test seems unrelated to this PR:
error: Failed to download from mirror set
error: https://web.mit.edu/kerberos/dist/krb5/1.20/krb5-1.20.tar.gz: curl failed to download with exit code 18
@zzcclp This PR is ready for review. Thanks. To handle join keys that contain nulls correctly, we need to change CH's MergeJoinTransform; I will add that to CH later.
@taiyang-li @lgbo-ustc please help to review, thanks.
From https://opencicd.kyligence.com/job/gluten/job/gluten-ci/7371/console I can see that all tests passed. Not sure why it reports a failure.
ERROR: script returned exit code 1
PR to support nulls first: https://github.com/Kyligence/ClickHouse/pull/483. Maybe we can make the nulls order configurable in ClickHouse later.
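To illustrate why the nulls order matters here, below is a minimal Python sketch of an inner sort merge join, assuming both inputs are sorted ascending with nulls first. It is not the MergeJoinTransform code; `nulls_first_key` and `inner_merge_join` are hypothetical names used only for this illustration. Rows with a null key are skipped because a null key never equals another key in an ordinary equi-join.

```python
from typing import List, Optional, Tuple

Row = Tuple[Optional[int], str]  # (join key, payload)


def nulls_first_key(key: Optional[int]) -> Tuple[int, int]:
    # Hypothetical helper: treat None as the smallest value (nulls first).
    return (0, 0) if key is None else (1, key)


def inner_merge_join(left: List[Row], right: List[Row]) -> List[Tuple[int, str, str]]:
    # Sort both sides ascending, nulls first (assumed input order for the merge).
    left = sorted(left, key=lambda r: nulls_first_key(r[0]))
    right = sorted(right, key=lambda r: nulls_first_key(r[0]))
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk is None:
            i += 1  # a null key matches nothing in an inner equi-join
        elif rk is None:
            j += 1
        elif lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of equal keys on the right, then pair it with
            # every left row that carries the same key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            while i < len(left) and left[i][0] == lk:
                for r in right[j:j_end]:
                    out.append((lk, left[i][1], r[1]))
                i += 1
            j = j_end
    return out
```

If one side were sorted nulls last while the other was sorted nulls first, the two cursors would no longer agree on where the null rows sit, which is why both sides need the same nulls order.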
From https://opencicd.kyligence.com/job/gluten/job/gluten-ci/7371/console I can see that all tests passed. Not sure why it reports a failure.
ERROR: script returned exit code 1
@liuneng1994 could you check whether CI is running OK?
PR to support nulls first: Kyligence/ClickHouse#483. Maybe we can make the nulls order configurable in ClickHouse later.
Does this change have any side effects? Please try to complete it in upstream ClickHouse/ClickHouse first instead of directly modifying src/* files in Kyligence/ClickHouse. cc @baibaichen
@zzcclp This PR is ready for review. Thanks. To handle join keys that contain nulls correctly, we need to change CH's MergeJoinTransform; I will add that to CH later.
Is there already a PR in CH?
After this PR is merged, we can treat nulls as the smallest values in Gluten: https://github.com/ClickHouse/ClickHouse/pull/60896
For now we can disable outer joins whose join keys may be null, and support only INNER joins first.
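A rough sketch of that interim guard, in Python for brevity. This is one possible reading of the rule above, not the code in this PR: `JoinKey`, `OUTER_JOIN_TYPES`, and `can_offload_sort_merge_join` are hypothetical names, and the join type strings are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JoinKey:
    # Hypothetical description of one join key column.
    name: str
    nullable: bool


# Assumed spelling of the outer join types; not taken from the Gluten code.
OUTER_JOIN_TYPES = {"left_outer", "right_outer", "full_outer"}


def can_offload_sort_merge_join(join_type: str, keys: List[JoinKey]) -> bool:
    """Return True if the join may run natively, False if it should fall back."""
    if join_type == "inner":
        # Null keys never match in an inner equi-join, so they are safe to offload.
        return True
    if join_type in OUTER_JOIN_TYPES:
        # Outer joins must still emit unmatched rows with null keys, so fall
        # back whenever any join key may be null.
        return not any(k.nullable for k in keys)
    # Any other join type falls back in this sketch.
    return False
```

For example, `can_offload_sort_merge_join("left_outer", [JoinKey("id", nullable=True)])` returns `False`, so that join would stay on the vanilla Spark path until the nulls order change lands in ClickHouse.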
Ready for review. Thanks.