[Feature] support recursive cte
Why I'm doing:
What I'm doing:
1. Grammar Support (ANTLR Grammar Definition)
- fe/fe-grammar:
- Add RECURSIVE keyword to WITH clause grammar rule
- StarRocks.g4:
WITH RECURSIVE? commonTableExpression - Add RECURSIVE to non-reserved keywords list
2. AST Data Structure Extensions
- CTERelation:
- Add
isRecursivefield to mark recursive CTE - Add
isAnchorfield to mark anchor part (initial query) - Provide corresponding getter/setter methods
- Add
- QueryRelation: Add
hasRecursiveCTEflag - SelectRelation: Support recursive CTE flag propagation
- AstTraverser: Add visitCTE visitor method
3. Recursive CTE Validation and Execution
-
RecursiveCTEAstCheck (new file):
- Detect presence of recursive CTE in statement
- Validate recursive CTE legality (no multi-level recursion support)
- Mark recursive references during AST traversal
-
RecursiveCTEExecutor (new file):
- Core execution engine for iterative query processing
- Manage working tables and result sets
- Implement UNION ALL iterative merge logic
- Handle termination condition detection (stop when result set unchanged)
4. Query Processing Pipeline Integration
- StmtExecutor: Invoke recursive CTE detection and execution before SQL execution
- QueryAnalyzer: Enhance CTE analysis logic with recursive semantic validation
- AstBuilder: Parse WITH RECURSIVE syntax and construct CTERelation objects
- AST2StringVisitor: Support recursive CTE string representation
5. Session Configuration
- SessionVariable: Add
enable_recursive_ctesession variable (optional enable/disable)
6.Recursive CTE Execution Model
- Detection: RecursiveCTEAstCheck identifies recursive CTE presence
- Validation: Check recursive CTE legality (single-level recursion, correct UNION ALL structure)
- Execution:
- First iteration: Execute anchor query (WHERE manager_id IS NULL)
- Iterative loop:
- Execute recursive part using previous iteration results
- Merge results via UNION ALL
- Stop when no new rows added
- Return: Merged result set from all iterations
Example
MySQL td> with recursive t1 as (select cast(1 as bigint) as l union select l + 1 as l from t1 where l < 10 )
select * from t1
+---+
| l |
+---+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
+---+
5 rows in set
Time: 0.477s
MySQL td> -- SQL:
WITH RECURSIVE org_hierarchy AS (
SELECT
employee_id,
name,
manager_id,
title, cast(1 as bigint) AS `level`,
name AS path
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT
e.employee_id,
e.name,
e.manager_id,
e.title,
oh.`level` + 1,
CONCAT(oh.path, ' -> ', e.name)
FROM employees e
INNER JOIN org_hierarchy oh ON e.manager_id = oh.employee_id
)
SELECT
employee_id,
name,
title,`level`,
path
FROM org_hierarchy;
+-------------+--------+---------------------+-------+-----------------------------------------+
| employee_id | name | title | level | path |
+-------------+--------+---------------------+-------+-----------------------------------------+
| 1 | Alicia | CEO | 1 | Alicia |
| 2 | Bob | CTO | 2 | Alicia -> Bob |
| 3 | Carol | CFO | 2 | Alicia -> Carol |
| 4 | David | VP of Engineering | 3 | Alicia -> Bob -> David |
| 5 | Eve | VP of Research | 3 | Alicia -> Bob -> Eve |
| 6 | Frank | VP of Finance | 3 | Alicia -> Carol -> Frank |
| 7 | Grace | Engineering Manager | 4 | Alicia -> Bob -> David -> Grace |
| 8 | Heidi | Tech Lead | 4 | Alicia -> Bob -> David -> Heidi |
| 9 | Ivan | Research Manager | 4 | Alicia -> Bob -> Eve -> Ivan |
| 10 | Judy | Senior Engineer | 5 | Alicia -> Bob -> David -> Grace -> Judy |
+-------------+--------+---------------------+-------+-----------------------------------------+
10 rows in set
Time: 0.476s
MySQL td>
MySQL td> -- SQL:
explain WITH RECURSIVE org_hierarchy AS (-- 锚点查询:从CEO开始(manager_id IS NULL)
SELECT
employee_id,
name,
manager_id,
title, cast(1 as bigint) AS `level`,
name AS path
FROM employees
WHERE manager_id IS NULL
UNION ALL-- 递归查询:逐级向下查找
SELECT
e.employee_id,
e.name,
e.manager_id,
e.title,
oh.`level` + 1,
CONCAT(oh.path, ' -> ', e.name)
FROM employees e
INNER JOIN org_hierarchy oh ON e.manager_id = oh.employee_id
)
-- 4. 最终查询
SELECT
employee_id,
name,
title,`level`,
path
FROM org_hierarchy;
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Explain String |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Recursive CTE Name: org_hierarchy |
| Temporary Table: org_hierarchy_1764834889831 |
| Start Statement: SELECT anchar_1764834889831.*, 0 AS `_cte_level` |
| FROM (SELECT `td`.`employees`.`employee_id`, `td`.`employees`.`name`, `td`.`employees`.`manager_id`, `td`.`employees`.`title`, CAST(1 AS BIGINT) AS `level`, `td`.`employees`.`name` AS `path` |
| FROM `td`.`employees` |
| WHERE `td`.`employees`.`manager_id` IS NULL) `anchar_1764834889831` |
| Recursive Statement: SELECT `td`.`e`.`employee_id`, `td`.`e`.`name`, `td`.`e`.`manager_id`, `td`.`e`.`title`, `oh`.`level` + 1 AS `oh.level + 1`, concat(`oh`.`path`, ' -> ', `td`.`e`.`name`) AS `concat(oh.path, ' -> ', e.name)` |
| FROM `td`.`employees` AS `e` INNER JOIN (SELECT `td`.`org_hierarchy_1764834889831`.`employee_id`, `td`.`org_hierarchy_1764834889831`.`name`, `td`.`org_hierarchy_1764834889831`.`manager_id`, `td`.`org_hierarchy_1764834889831`.`title`, `td`.`org_hierarchy_1764834889831`.`level`, `td`.`org_hierarchy_1764834889831`.`path` |
| FROM `td`.`org_hierarchy_1764834889831`) `oh` ON `td`.`e`.`manager_id` = `oh`.`employee_id` |
| -------------------------------------------------- |
| Outer Statement After Rewriting: |
| SELECT `org_hierarchy`.`employee_id`, `org_hierarchy`.`name`, `org_hierarchy`.`title`, `org_hierarchy`.`level`, `org_hierarchy`.`path` |
| FROM (SELECT `td`.`org_hierarchy_1764834889831`.`employee_id`, `td`.`org_hierarchy_1764834889831`.`name`, `td`.`org_hierarchy_1764834889831`.`manager_id`, `td`.`org_hierarchy_1764834889831`.`title`, `td`.`org_hierarchy_1764834889831`.`level`, `td`.`org_hierarchy_1764834889831`.`path` |
| FROM `td`.`org_hierarchy_1764834889831`) `org_hierarchy` |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
14 rows in set
Time: 0.015s
Fixes #issue
What type of PR is this:
- [ ] BugFix
- [x] Feature
- [ ] Enhancement
- [ ] Refactor
- [ ] UT
- [ ] Doc
- [ ] Tool
Does this PR entail a change in behavior?
- [ ] Yes, this PR will result in a change in behavior.
- [x] No, this PR will not result in a change in behavior.
If yes, please specify the type of change:
- [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
- [ ] Parameter changes: default values, similar parameters but with different default values
- [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
- [ ] Feature removed
- [ ] Miscellaneous: upgrade & downgrade compatibility, etc.
Checklist:
- [ ] I have added test cases for my bug fix or my new feature
- [ ] This pr needs user documentation (for new or modified features or behaviors)
- [ ] I have added documentation for my new feature or new function
- [ ] This is a backport pr
Bugfix cherry-pick branch check:
- [ ] I have checked the version labels which the pr will be auto-backported to the target branch
- [ ] 4.0
- [ ] 3.5
- [ ] 3.4
- [ ] 3.3
[!NOTE] Introduces full recursive CTE support (WITH RECURSIVE) including parsing, validation, iterative execution via temporary tables, session controls, and tests.
- SQL/Parser/Formatter:
- Add
WITH RECURSIVEgrammar andRECURSIVEkeyword in ANTLR (StarRocks.g4,StarRocksLex.g4).- Parse and render recursive CTEs (
parser/AstBuilder,formatter/AST2StringVisitor).- AST & Traversal:
- Extend
CTERelationwithisRecursive/isAnchor;QueryRelationwithhasRecursiveCTE;SelectRelation#setPredicate.- Update traverser to handle recursive CTE anchors (
AstTraverser).- Analyzer:
- Validate and analyze recursive CTEs in
QueryAnalyzer(anchor/recursive parts, name uniqueness, types, ban ORDER BY/LIMIT in recursive CTE).- Execution Pipeline:
- Integrate recursive CTE path in
StmtExecutor(detect, explain, prepare, finalize).- Add
RecursiveCTEAstCheckandRecursiveCTEExecutorto stage iterative execution using temporary tables and loop depth.- Session Variables:
- Add
enable_recursive_cte,recursive_cte_max_depth,recursive_cte_finalize_temporal_tablewith getters/setters (SessionVariable).- Optimizer Guard:
- Reject planning of recursive CTEs in
RelationTransformer.- Connector Parser:
- Adjust Trino
AstBuilderCTE construction to include new flags.- Tests:
- Add analyzer and plan UTs for recursive CTEs and SQL tests (
AnalyzeCTETest,RecursiveCTETest,test/sql/test_cte/T/test_recursive_cte).Written by Cursor Bugbot for commit a03df4274b01e03c8929673eee88d244b67b6b4e. This will update automatically on new commits. Configure here.
@cursor review
@cursor review
@cursor review
@cursor review
@cursor review
@cursor review
🧪 CI Insights
Here's what we observed from your CI run for a03df427.
🟢 All jobs passed!
But CI Insights is watching 👀
@cursor review
@cursor review
@cursor review
Quality Gate passed
Issues
30 New issues
0 Accepted issues
Measures
0 Security Hotspots
0.0% Coverage on New Code
2.4% Duplication on New Code
@cursor review
[Java-Extensions Incremental Coverage Report]
:white_check_mark: pass : 0 / 0 (0%)
[FE Incremental Coverage Report]
:x: fail : 224 / 339 (66.08%)
file detail
| path | covered_line | new_line | coverage | not_covered_line_detail | |
|---|---|---|---|---|---|
| :large_blue_circle: | com/starrocks/sql/ast/SelectRelation.java | 0 | 2 | 00.00% | [161, 162] |
| :large_blue_circle: | com/starrocks/qe/StmtExecutor.java | 3 | 17 | 17.65% | [721, 722, 723, 724, 725, 727, 728, 729, 731, 732, 733, 734, 736, 1022] |
| :large_blue_circle: | com/starrocks/qe/recursivecte/RecursiveCTEExecutor.java | 103 | 192 | 53.65% | [117, 119, 120, 121, 122, 124, 125, 127, 131, 132, 133, 135, 136, 137, 138, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 154, 155, 156, 157, 158, 159, 160, 162, 164, 165, 166, 167, 168, 171, 172, 176, 177, 179, 181, 182, 183, 184, 185, 186, 187, 188, 194, 195, 197, 198, 199, 200, 201, 204, 205, 206, 207, 208, 209, 212, 214, 215, 216, 221, 222, 223, 225, 226, 227, 228, 229, 230, 232, 233, 235, 237, 241, 266, 267, 287, 309, 334, 337, 344] |
| :large_blue_circle: | com/starrocks/qe/SessionVariable.java | 8 | 10 | 80.00% | [2129, 2133] |
| :large_blue_circle: | com/starrocks/sql/analyzer/QueryAnalyzer.java | 71 | 78 | 91.03% | [374, 399, 405, 409, 410, 417, 430] |
| :large_blue_circle: | com/starrocks/qe/recursivecte/RecursiveCTEAstCheck.java | 20 | 21 | 95.24% | [58] |
| :large_blue_circle: | com/starrocks/sql/optimizer/transformer/RelationTransformer.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/ast/CTERelation.java | 7 | 7 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/ast/AstTraverser.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/formatter/AST2StringVisitor.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/ast/QueryRelation.java | 6 | 6 | 100.00% | [] |
[BE Incremental Coverage Report]
:white_check_mark: pass : 0 / 0 (0%)