databend
databend copied to clipboard
Feature: support left semi join and right semi join
Summary
TPCH q4 will be converted to semi join
MySQL [tpch]> explain
-> select
-> o_orderpriority,
-> count(*) as order_count
-> from
-> orders
-> where
-> o_orderdate >= to_date('1993-07-01')
-> and o_orderdate < add_months(to_date('1993-07-01'), 3)
-> and exists (
-> select
-> *
-> from
-> lineitem
-> where
-> l_orderkey = o_orderkey
-> and l_commitdate < l_receiptdate
-> )
-> group by
-> o_orderpriority
-> order by
-> o_orderpriority;
+---------------------------------------------------------------------------------------------------------------------+
| explain |
+---------------------------------------------------------------------------------------------------------------------+
| Project |
| ├── columns: [o_orderpriority (#5), order_count (#25)] |
| └── EvalScalar |
| ├── expressions: [COUNT(*) (#27)] |
| └── Sort |
| ├── sort keys: [o_orderpriority ASC NULLS LAST] |
| └── AggregateFinal |
| ├── group by: [o_orderpriority] |
| ├── aggregate functions: [count()] |
| └── AggregatePartial |
| ├── group by: [o_orderpriority] |
| ├── aggregate functions: [count()] |
| └── HashJoin |
| ├── join type: SEMI |
| ├── build keys: [lineitem.l_orderkey (#9)] |
| ├── probe keys: [orders.o_orderkey (#0)] |
| ├── filters: [] |
| ├── Project(Build) |
| │ ├── columns: [l_orderkey (#9)] |
| │ └── TableScan |
| │ ├── table: default.tpch.lineitem |
| │ ├── read rows: 600037902 |
| │ ├── read bytes: 4219861478 |
| │ ├── partitions total: 933 |
| │ ├── partitions scanned: 933 |
| │ └── push downs: [filters: [(l_commitdate < l_receiptdate)], limit: NONE] |
| └── TableScan(Probe) |
| ├── table: default.tpch.orders |
| ├── read rows: 150000000 |
| ├── read bytes: 1438183449 |
| ├── partitions total: 208 |
| ├── partitions scanned: 208 |
| └── push downs: [filters: [(o_orderdate >= 8582), (o_orderdate < 8674)], limit: NONE] |
+---------------------------------------------------------------------------------------------------------------------+
33 rows in set (0.056 sec)
After supporting left/right semi join, we can adjust the order of semi join to make the small table be the build side.