starrocks icon indicating copy to clipboard operation
starrocks copied to clipboard

[Feature] Support text based mv rewrite

Open LiShuMing opened this issue 1 year ago • 3 comments

Why I'm doing:

Original MV Rewrite only support SPJG operators, but we can support mv rewrite by text match by simple.

What I'm doing:

Basic Idea

The basic idea is:

If mv's defined qeury and user query's the same in AST format(ParseNode), and mv is not outdated(refresh complete), so the mv can be used for query rewrite.

The Key comparator is AST(ParseNode)'s toSQL:

new AstToSQLBuilder.AST2SQLBuilderVisitor(true, false).visit(parseNode);

If two ParseNode's toSQL is equal to other's, we think they are equal.

Based the principle above, we can check whether mv and query are equal or not.

Implements

  • Collect MV's AST into CachingMvPlanContextBuilder#astToMvPlanContextMap to used in query.
  • Add TextMatchBasedRewriteRule at the RBO's entrance, it will
    1. check whether query's AST is same with mvs
    2. check whether query's subquery AST is same with mvs.

Add two parameters:

-- default: off, use text based rewrite when it's on
set enable_materialized_view_text_match_rewrite = on;
-- default: 16, max of one query's subquery that used  to match. disable if it's -1
set materializedViewSubQueryTextMatchMaxCount = 16;

Test

  • Add some basic tests in UT and sql-tester.
  • Test all TPCH/TPCDS queries if mv is the same as the query.

Limitations

  1. MV and query must have the same output order, otherwise match will fail. This is a naive policy and there are a lot of limitations:
  • don't support different output orders
  • don't support different aliases
  • don't support query is subset of mv's output
  1. Query/MV with order/limit is supported, extra TopN operator is added after rewrite. But it's not supported when order by columns are not in the outputs:
select user_id, time, sum(tag_id) from user_tags group by user_id, time order by user_id + 1, time;";
  1. All query inputs' AST need to be normalized again, because MV does not save the original define query rather than a normalized query by AstToSQLBuilder.toSQL. NOTE It will add extra parser time if query is very complex:
    /**
     * Since @{LocalMetastore#createMaterializedView} uses
     * {@code statement.setInlineViewDef(AstToSQLBuilder.toSQL(queryStatement));} to store user's define query,
     * and {@link AstToSQLBuilder} is not reentrant for now, so needs to normalize input query as mv's define query.
     * TODO: But this is expensive, remove this if {@link AstToSQLBuilder} is reentrant.
     * @param queryAst : input query parse node.
     */
    private ParseNode normalizeAst(ParseNode queryAst) {
        String query = AstToSQLBuilder.toSQL(queryAst);
        return MvUtils.getQueryAst(query);
    }

Fixes #issue

What type of PR is this:

  • [ ] BugFix
  • [x] Feature
  • [ ] Enhancement
  • [ ] Refactor
  • [ ] UT
  • [ ] Doc
  • [ ] Tool

Does this PR entail a change in behavior?

  • [ ] Yes, this PR will result in a change in behavior.
  • [x] No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • [ ] Parameter changes: default values, similar parameters but with different default values
  • [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
  • [ ] Feature removed
  • [ ] Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • [x] I have added test cases for my bug fix or my new feature
  • [ ] This pr needs user documentation (for new or modified features or behaviors)
    • [ ] I have added documentation for my new feature or new function
  • [ ] This is a backport pr

Bugfix cherry-pick branch check:

  • [x] I have checked the version labels which the pr will be auto-backported to the target branch
    • [x] 3.2
    • [ ] 3.1
    • [ ] 3.0
    • [ ] 2.5

LiShuMing avatar Feb 20 '24 11:02 LiShuMing

why not implement it based on SQL Signature? it could adapt to more cases?

Seaven avatar Mar 11 '24 03:03 Seaven

why not implement it based on SQL Signature? it could adapt to more cases?

Can you figure out what's SQL Signature? To be simple, just use AstTree's text as the signature.

LiShuMing avatar Mar 12 '24 11:03 LiShuMing

Quality Gate Failed Quality Gate failed

Failed conditions
B Reliability Rating on New Code (required ≥ A)

See analysis details on SonarCloud

Catch issues before they fail your Quality Gate with our IDE extension SonarLint

sonarqubecloud[bot] avatar Mar 12 '24 11:03 sonarqubecloud[bot]

[FE Incremental Coverage Report]

:white_check_mark: pass : 251 / 313 (80.19%)

file detail

path covered_line new_line coverage not_covered_line_detail
:large_blue_circle: com/starrocks/sql/optimizer/rule/transformation/materialization/MvUtils.java 5 8 62.50% [1207, 1208, 1210]
:large_blue_circle: com/starrocks/alter/AlterMVJobExecutor.java 2 3 66.67% [251]
:large_blue_circle: com/starrocks/sql/optimizer/CachingMvPlanContextBuilder.java 33 49 67.35% [55, 58, 62, 75, 148, 167, 180, 189, 190, 191, 192, 193, 194, 196, 197, 198]
:large_blue_circle: com/starrocks/qe/SessionVariable.java 6 8 75.00% [3021, 3022]
:large_blue_circle: com/starrocks/sql/optimizer/rule/transformation/materialization/rule/TextMatchBasedRewriteRule.java 152 191 79.58% [115, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 210, 211, 214, 218, 219, 220, 223, 225, 226, 233, 234, 239, 240, 249, 250, 251, 252, 253, 254, 292, 293, 294, 389]
:large_blue_circle: com/starrocks/catalog/MaterializedView.java 11 12 91.67% [1076]
:large_blue_circle: com/starrocks/sql/optimizer/Optimizer.java 6 6 100.00% []
:large_blue_circle: com/starrocks/sql/optimizer/transformer/RelationTransformer.java 8 8 100.00% []
:large_blue_circle: com/starrocks/sql/optimizer/MvRewritePreprocessor.java 5 5 100.00% []
:large_blue_circle: com/starrocks/sql/optimizer/transformer/QueryTransformer.java 3 3 100.00% []
:large_blue_circle: com/starrocks/sql/optimizer/transformer/TransformerContext.java 6 6 100.00% []
:large_blue_circle: com/starrocks/sql/optimizer/rule/RuleType.java 1 1 100.00% []
:large_blue_circle: com/starrocks/sql/StatementPlanner.java 10 10 100.00% []
:large_blue_circle: com/starrocks/sql/optimizer/MaterializedViewOptimizer.java 3 3 100.00% []

github-actions[bot] avatar Mar 12 '24 13:03 github-actions[bot]

[BE Incremental Coverage Report]

:white_check_mark: pass : 0 / 0 (0%)

github-actions[bot] avatar Mar 12 '24 13:03 github-actions[bot]

@mergify backport branch-3.2

LiShuMing avatar Mar 14 '24 02:03 LiShuMing

backport branch-3.2

✅ Backports have been created

mergify[bot] avatar Mar 14 '24 02:03 mergify[bot]