DRILL-4232 Support for EXCEPT and INTERSECT set operator

Open Leon-WTF opened this issue 3 years ago • 5 comments

Description

Set operators can be implemented in a hash-based or a sort-based form; this PR implements only the hash-based version. For each distinct tuple, compute the number of left-input duplicates (numLeft, probe side) and the number of right-input duplicates (numRight, build side), then:

- INTERSECT: if numRight > 0 and numLeft > 0, output one tuple
- INTERSECT ALL: if numRight > 0 and numLeft > 0, output min(numLeft, numRight) tuples
- EXCEPT: if numRight = 0 and numLeft > 0, output one tuple
- EXCEPT ALL: if numLeft >= numRight, output numLeft - numRight tuples
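
The counting rules above can be illustrated with a minimal, standalone Java sketch. It is not the operator code from this PR: the class `HashSetOpSketch`, the enum `Op`, and the method `apply` are hypothetical names, and plain in-memory lists and hash maps stand in for Drill's record batches and hash table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; names and structure are assumptions,
// not Drill's actual set-operator implementation.
public class HashSetOpSketch {
    enum Op { INTERSECT, INTERSECT_ALL, EXCEPT, EXCEPT_ALL }

    static <T> List<T> apply(Op op, List<T> left, List<T> right) {
        // Build side: count duplicates of each right-input tuple (numRight).
        Map<T, Integer> numRight = new HashMap<>();
        for (T t : right) {
            numRight.merge(t, 1, Integer::sum);
        }
        // Probe side: count duplicates of each left-input tuple (numLeft),
        // keeping first-seen order so the output is deterministic.
        Map<T, Integer> numLeft = new LinkedHashMap<>();
        for (T t : left) {
            numLeft.merge(t, 1, Integer::sum);
        }

        List<T> out = new ArrayList<>();
        for (Map.Entry<T, Integer> e : numLeft.entrySet()) {
            int l = e.getValue();                          // always > 0 here
            int r = numRight.getOrDefault(e.getKey(), 0);
            int copies;
            switch (op) {
                case INTERSECT:     copies = r > 0 ? 1 : 0;      break;
                case INTERSECT_ALL: copies = Math.min(l, r);     break;
                case EXCEPT:        copies = r == 0 ? 1 : 0;     break;
                case EXCEPT_ALL:    copies = Math.max(l - r, 0); break;
                default: throw new IllegalArgumentException("unknown op");
            }
            for (int i = 0; i < copies; i++) {
                out.add(e.getKey());
            }
        }
        return out;
    }
}
```

With left = [1, 1, 1, 2, 3] and right = [1, 1, 2, 4], the sketch yields [1, 2] for INTERSECT, [1, 1, 2] for INTERSECT ALL, [3] for EXCEPT, and [1, 3] for EXCEPT ALL.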

Documentation

TODO

Testing

TODO

Leon-WTF avatar Jul 17 '22 12:07 Leon-WTF

@Leon-WTF You might want to wait until https://github.com/apache/drill/pull/2602 is merged before you continue this.

cgivre avatar Jul 20 '22 15:07 cgivre

@cgivre Thanks for the info. #2602 is much needed. I will focus on implementing the operator first and adapt to the new Calcite once it's merged.

Leon-WTF avatar Jul 24 '22 07:07 Leon-WTF

@Leon-WTF Is this ready for review?

cgivre avatar Aug 28 '22 01:08 cgivre

@cgivre Not yet. I'm handling the EXCEPT case, which needs to remove duplicate records on the probe side (the other three cases work like a semi-join, so I just added the record counts to the hash map). I'm trying to add an Agg phase after the setop phase. Any suggestions on this?

Leon-WTF avatar Aug 28 '22 03:08 Leon-WTF

Hi @Leon-WTF how is this coming?

cgivre avatar Sep 22 '22 19:09 cgivre

@Leon-WTF I don't know if you saw this or not, but #2602 has been merged. We are actually running on Calcite 1.32 now!

cgivre avatar Oct 02 '22 03:10 cgivre

@cgivre Sorry for the late response. Yes, I have merged that in. I'm almost done and am adding unit tests.

Leon-WTF avatar Oct 16 '22 08:10 Leon-WTF

Hi @Leon-WTF, I saw that you added unit tests. Is this PR ready for review? If not, what is remaining?

cgivre avatar Oct 16 '22 16:10 cgivre

@cgivre I think it just needs more unit tests for fuller coverage, and I'll fix bugs as I add them. By the way, do we have any plan to release Drill 2.0? I will try to make that release.

Leon-WTF avatar Oct 17 '22 02:10 Leon-WTF

We are getting ready to release Drill 1.20.3. Once that's done, I'd like to start discussions around Drill 2.0. There has been a lot of major work in Drill and I'd like to see it getting used.

cgivre avatar Oct 24 '22 14:10 cgivre

@vvysotskyi Hi, any more comments on this PR?

Leon-WTF avatar Feb 03 '23 09:02 Leon-WTF

Hey @Leon-WTF, any chance you could address @vvysotskyi's comments soon? This is one of the last PRs slated to be merged for the next release.

cgivre avatar Feb 09 '23 12:02 cgivre

I will do it by this weekend. Is that OK?

Leon-WTF avatar Feb 10 '23 05:02 Leon-WTF

@Leon-WTF This weekend would be great! Very excited to get this merged.

cgivre avatar Feb 10 '23 12:02 cgivre

@Leon-WTF Thanks for this. @vvysotskyi Thanks for the review. Merging now!

cgivre avatar Feb 12 '23 20:02 cgivre