DRILL-4232 Support for EXCEPT and INTERSECT set operator
Description
Set operators can be implemented in a hash-based or a sort-based form; this PR implements only the hash-based version.
For each distinct tuple, compute the number of duplicates in the left input (numLeft, probe side) and in the right input (numRight, build side), then:
INTERSECT: if numRight > 0 and numLeft > 0, output one tuple
INTERSECT ALL: if numRight > 0 and numLeft > 0, output min(numLeft, numRight) tuples
EXCEPT: if numRight = 0 and numLeft > 0, output one tuple
EXCEPT ALL: if numLeft >= numRight, output numLeft - numRight tuples
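The counting rules above can be sketched in a few lines of Java. This is only an illustration of the semantics, not the operator added in this PR: the class `SetOpSketch`, the enum `SetOp`, and the methods `outputCount` and `apply` are hypothetical names, and the sketch works on in-memory lists rather than Drill record batches. It counts the build (right) side first and then the probe (left) side, as the description suggests.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the Drill code base.
final class SetOpSketch {

  enum SetOp { INTERSECT, INTERSECT_ALL, EXCEPT, EXCEPT_ALL }

  // Number of copies of one distinct tuple to emit, given its duplicate counts
  // on the left (probe) side and the right (build) side.
  static long outputCount(SetOp op, long numLeft, long numRight) {
    switch (op) {
      case INTERSECT:
        return (numLeft > 0 && numRight > 0) ? 1 : 0;
      case INTERSECT_ALL:
        return Math.min(numLeft, numRight);
      case EXCEPT:
        return (numLeft > 0 && numRight == 0) ? 1 : 0;
      case EXCEPT_ALL:
        return Math.max(numLeft - numRight, 0);
      default:
        throw new IllegalArgumentException("unknown operator: " + op);
    }
  }

  // Counts duplicates per distinct tuple, then emits tuples according to outputCount.
  static <T> List<T> apply(SetOp op, List<T> left, List<T> right) {
    // Value layout: index 0 = numLeft, index 1 = numRight.
    Map<T, long[]> counts = new LinkedHashMap<>();
    for (T t : right) {
      counts.computeIfAbsent(t, k -> new long[2])[1]++; // build side
    }
    for (T t : left) {
      counts.computeIfAbsent(t, k -> new long[2])[0]++; // probe side
    }
    List<T> out = new ArrayList<>();
    for (Map.Entry<T, long[]> e : counts.entrySet()) {
      long n = outputCount(op, e.getValue()[0], e.getValue()[1]);
      for (long i = 0; i < n; i++) {
        out.add(e.getKey());
      }
    }
    return out;
  }
}
```

For example, with left = (a, a, a, b, c) and right = (a, a, b, b, d), the tuple a has numLeft = 3 and numRight = 2, so INTERSECT ALL emits it twice and EXCEPT ALL emits it once. EXCEPT emits only c, the one tuple that never appears on the right.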
Documentation
TODO
Testing
TODO
@Leon-WTF You might want to wait until https://github.com/apache/drill/pull/2602 is merged before you continue this.
@cgivre Thanks for the info. #2602 is much needed. I will focus on implementing the operator first and adapt to the new Calcite once it's merged.
@Leon-WTF Is this ready for review?
@cgivre Not yet. I'm handling the EXCEPT case, which needs to remove duplicate records on the probe side. (The other three cases work like a semi-join; I just added the record counts to the hash map.) I'm trying to add an Agg phase after the setop phase. Any suggestions on this?
Hi @Leon-WTF how is this coming?
@Leon-WTF I don't know if you saw this or not, but #2602 has been merged. We are actually running on Calcite 1.32 now!
@cgivre Sorry for the late response. Yes, I have merged that in. I'm almost done and am adding unit tests.
Hi @Leon-WTF I saw that you added unit tests. Is this PR ready for review? If not, what is remaining?
@cgivre I think it just still needs more unit tests to cover it more completely, and I need to fix the bugs I find while adding them. By the way, do we have any plan to release Drill 2.0? I will try to catch that release.
We are getting ready to release Drill 1.20.3. Once that's done, I'd like to start discussions around Drill 2.0. There has been a lot of major work in Drill and I'd like to see it getting used.
@vvysotskyi Hi, any more comments on this PR?
Hey @Leon-WTF Any chance you could address @vvysotskyi's comments soon? This is one of the last PRs slated to be merged for the next release.
I will do it by this weekend. Is that OK?
@Leon-WTF This weekend would be great! Very excited to get this merged.
@Leon-WTF Thanks for this. @vvysotskyi Thanks for the review. Merging now!