doris
doris copied to clipboard
[Enhancement] Grouping Sets predicate push down
Search before asking
- [X] I had searched in the issues and found no similar issues.
Description
First of all, create a test table and insert some data;
CREATE TABLE `a` (
`k1` int NOT NULL,
`k2` int NOT NULL,
`v` int SUM NULL
) ENGINE=OLAP
AGGREGATE KEY(`k1`, `k2`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`k1`) BUCKETS 10;
insert into a values(1,2,3),(4,5,6);
Problem: Then try to explain SQL with doris-2.1.5-rc02, we found the condition do not push down:
> explain select * from (select k1,coalesce(k2,999) as kk2,sum(v) from a group by grouping sets((k1,k2),(k1))) x where x.kk2=999;
+----------------------------------------------------------------------+
| Explain String(Nereids Planner) |
+----------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS: |
| k1[#16] |
| kk2[#17] |
| sum(v)[#18] |
| PARTITION: UNPARTITIONED |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| VRESULT SINK |
| MYSQL_PROTOCAL |
| |
| 5:VEXCHANGE |
| offset: 0 |
| distribute expr lists: k1[#16] |
| |
| PLAN FRAGMENT 1 |
| |
| PARTITION: HASH_PARTITIONED: k1[#8], k2[#9], GROUPING_ID[#10] |
| |
| HAS_COLO_PLAN_NODE: true |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 05 |
| UNPARTITIONED |
| |
| 4:VAGGREGATE (merge finalize)(279) |
| | output: sum(partial_sum(v)[#11])[#15] |
| | group by: k1[#8], k2[#9], GROUPING_ID[#10] |
| | having: (coalesce(k2[#13], 999) = 999) |
| | cardinality=2 |
| | final projections: k1[#12], coalesce(k2[#13], 999), sum(v)[#15] |
| | final project output tuple id: 5 |
| | distribute expr lists: k1[#8], k2[#9], GROUPING_ID[#10] |
| | |
| 3:VEXCHANGE |
| offset: 0 |
| distribute expr lists: |
| |
| PLAN FRAGMENT 2 |
| |
| PARTITION: HASH_PARTITIONED: k1[#0] |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 03 |
| HASH_PARTITIONED: k1[#8], k2[#9], GROUPING_ID[#10] |
| |
| 2:VAGGREGATE (update serialize)(273) |
| | STREAMING |
| | output: partial_sum(v[#6])[#11] |
| | group by: k1[#4], k2[#5], GROUPING_ID[#7] |
| | cardinality=2 |
| | distribute expr lists: |
| | |
| 1:VREPEAT_NODE(270) |
| | repeat: repeat 1 lines [[4, 5], [4]] |
| | exprs: k1[#0], k2[#1], v[#2] |
| | output slots: `null`, `null`, `null`, `GROUPING_ID` |
| | distribute expr lists: k1[#0] |
| | |
| 0:VOlapScanNode(267) |
| TABLE: example_db.a(a), PREAGGREGATION: ON |
| partitions=1/1 (a) |
| tablets=10/10, tabletList=38035,38037,38039 ... |
| cardinality=2, avgRowSize=0.0, numNodes=1 |
| pushAggOp=NONE |
| |
| |
| Statistics |
| planed with unknown column statistics |
+----------------------------------------------------------------------+
Starrocks: But while we try to use Starrocks(Version: 3.2.6-2585333), we found the predicate pushed down to REPEAT_NODE:
> explain select * from (select k1,coalesce(k2,999) as kk2,sum(v) from a group by grouping sets((k1,k2),(k1))) x where x.kk2=999;
+-------------------------------------------------------------+
| Explain String |
+-------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS:1: k1 | 6: coalesce | 4: sum |
| PARTITION: HASH_PARTITIONED: 1: k1, 2: k2, 5: GROUPING_ID |
| |
| RESULT SINK |
| |
| 5:Project |
| | <slot 1> : 1: k1 |
| | <slot 4> : 4: sum |
| | <slot 6> : coalesce(2: k2, 999) |
| | |
| 4:AGGREGATE (merge finalize) |
| | output: sum(4: sum) |
| | group by: 1: k1, 2: k2, 5: GROUPING_ID |
| | |
| 3:EXCHANGE |
| |
| PLAN FRAGMENT 1 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 03 |
| HASH_PARTITIONED: 1: k1, 2: k2, 5: GROUPING_ID |
| |
| 2:AGGREGATE (update serialize) |
| | STREAMING |
| | output: sum(3: v) |
| | group by: 1: k1, 2: k2, 5: GROUPING_ID |
| | |
| 1:REPEAT_NODE |
| | repeat: repeat 1 lines [[1, 2], [1]] |
| | PREDICATES: coalesce(2: k2, 999) = 999 |
| | |
| 0:OlapScanNode |
| TABLE: a |
| PREAGGREGATION: ON |
| partitions=1/1 |
| rollup: a |
| tabletRatio=1/1 |
| tabletList=14112 |
| cardinality=2 |
| avgRowSize=12.0 |
+-------------------------------------------------------------+
Starrocks: We use another condition, still use Starrocks(Version: 3.2.6-2585333), we found the predicate pushed down to SCAN_NODE:
> explain select * from (select k1,coalesce(k2,999) as kk2,sum(v) from a group by grouping sets((k1,k2),(k1))) x where x.kk2=111;
+-------------------------------------------------------------+
| Explain String |
+-------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS:1: k1 | 6: coalesce | 4: sum |
| PARTITION: HASH_PARTITIONED: 1: k1, 2: k2, 5: GROUPING_ID |
| |
| RESULT SINK |
| |
| 5:Project |
| | <slot 1> : 1: k1 |
| | <slot 4> : 4: sum |
| | <slot 6> : coalesce(2: k2, 999) |
| | |
| 4:AGGREGATE (merge finalize) |
| | output: sum(4: sum) |
| | group by: 1: k1, 2: k2, 5: GROUPING_ID |
| | |
| 3:EXCHANGE |
| |
| PLAN FRAGMENT 1 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 03 |
| HASH_PARTITIONED: 1: k1, 2: k2, 5: GROUPING_ID |
| |
| 2:AGGREGATE (update serialize) |
| | STREAMING |
| | output: sum(3: v) |
| | group by: 1: k1, 2: k2, 5: GROUPING_ID |
| | |
| 1:REPEAT_NODE |
| | repeat: repeat 1 lines [[1, 2], [1]] |
| | PREDICATES: coalesce(2: k2, 999) = 111 |
| | |
| 0:OlapScanNode |
| TABLE: a |
| PREAGGREGATION: ON |
| PREDICATES: coalesce(2: k2, 999) = 111 |
| partitions=1/1 |
| rollup: a |
| tabletRatio=1/1 |
| tabletList=14112 |
| cardinality=1 |
| avgRowSize=12.0 |
+-------------------------------------------------------------+
Solution
No response
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct