[Feature] Support virtual column _tablet_id_

Open murphyatwork opened this issue 3 weeks ago • 9 comments

Why I'm doing:

What I'm doing:

Add a virtual column _tablet_id_ for data diagnosis.

Fixes #63923

What type of PR is this:

  • [ ] BugFix
  • [x] Feature
  • [ ] Enhancement
  • [ ] Refactor
  • [ ] UT
  • [ ] Doc
  • [ ] Tool

Does this PR entail a change in behavior?

  • [ ] Yes, this PR will result in a change in behavior.
  • [x] No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • [ ] Parameter changes: default values, similar parameters but with different default values
  • [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
  • [ ] Feature removed
  • [ ] Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • [ ] I have added test cases for my bug fix or my new feature
  • [ ] This pr needs user documentation (for new or modified features or behaviors)
    • [ ] I have added documentation for my new feature or new function
  • [ ] This is a backport pr

Bugfix cherry-pick branch check:

  • [x] I have checked the version labels which the pr will be auto-backported to the target branch
    • [ ] 4.0
    • [ ] 3.5
    • [ ] 3.4
    • [ ] 3.3

[!NOTE] Introduces hidden, read-only virtual column _tablet_id_ for OLAP/Lake tables, with FE/BE support, docs, configs, and tests, while preventing pushdown and adjusting planners/optimizations.

  • Virtual Column Support
    • Add hidden, read-only _tablet_id_ virtual column surfaced in queries but excluded from SELECT *, DESCRIBE, and SHOW CREATE TABLE.
    • FE: expose virtual column via analyzer/planner; new enable_virtual_column config; propagate is_virtual_column through SlotDescriptor (Thrift/Proto).
    • BE: implement virtual_column.{h,cpp} and integrate into OLAP/Lake scan paths to populate _tablet_id_ and map slots; skip predicate pushdown on virtual columns; handle scans with only virtual columns by reading a key column for row count.
  • Planner/Optimizer Adjustments
    • Skip virtual columns in predicate parsing, runtime filters, low-cardinality dict/MinMax stats, and MV rewrite utilities.
  • APIs/Descriptors
    • Extend SlotDescriptor (Thrift/Protobuf) with is_virtual_column.
  • Docs & Tests
    • Add docs (EN/JA/ZH) for virtual columns.
    • Add SQL tests validating _tablet_id_ behavior and table-defined _tablet_id_ precedence.
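
The behavior summarized above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the FE/BE implementation in this PR: the `Table`, `Tablet`, `expand_star`, and `scan` names are hypothetical, rows are plain dicts, and the predicate is a Python callable that may only reference projected columns. It shows the four points from the summary: `_tablet_id_` is hidden from `SELECT *`, it is filled in per tablet at scan time, predicates on it are evaluated after the scan instead of being pushed down, and a scan that requests only the virtual column still reads a key column to learn the row count. A table-defined `_tablet_id_` column takes precedence.

```python
from dataclasses import dataclass, field

VIRTUAL_COLUMN = "_tablet_id_"


@dataclass
class Tablet:
    tablet_id: int
    rows: list  # list of dicts holding physical column values only


@dataclass
class Table:
    columns: list  # physical column names, key column first (assumption)
    tablets: list = field(default_factory=list)

    def expand_star(self):
        # SELECT * expands to physical columns only; the virtual column stays hidden.
        return list(self.columns)

    def scan(self, projection, predicate=None):
        # A table-defined column with the same name shadows the virtual column.
        use_virtual = VIRTUAL_COLUMN in projection and VIRTUAL_COLUMN not in self.columns
        physical = [c for c in projection if c in self.columns]
        if not physical:
            # Only the virtual column was requested: read the key column anyway
            # so the scan knows how many rows each tablet contains.
            physical = [self.columns[0]]
        out = []
        for tablet in self.tablets:
            for row in tablet.rows:
                read = {c: row[c] for c in physical}
                if use_virtual:
                    # The value is the same for every row of a tablet.
                    read[VIRTUAL_COLUMN] = tablet.tablet_id
                # The predicate runs after the virtual column is filled in,
                # i.e. it is not pushed down to storage.
                if predicate is None or predicate(read):
                    out.append(tuple(read[c] for c in projection))
        return out


if __name__ == "__main__":
    t = Table(
        columns=["k", "v"],
        tablets=[
            Tablet(10001, [{"k": 1, "v": "a"}, {"k": 2, "v": "b"}]),
            Tablet(10002, [{"k": 3, "v": "c"}]),
        ],
    )
    print(t.expand_star())
    print(t.scan(["_tablet_id_"]))
    print(t.scan(["k", "_tablet_id_"], lambda r: r["_tablet_id_"] == 10002))
```

The tablet IDs here are made up. In a real cluster the values would come from the storage layer, and whether the column is available at all would depend on the enable_virtual_column config mentioned above.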

Written by Cursor Bugbot for commit f79a1baf7d643e4bf4536d2925bd181aaeaf65d3. This will update automatically on new commits.

murphyatwork avatar Dec 02 '25 12:12 murphyatwork

@cursor review

alvin-celerdata avatar Dec 02 '25 17:12 alvin-celerdata

🧪 CI Insights

Here's what we observed from your CI run for f79a1baf.

🟢 All jobs passed!

But CI Insights is watching 👀

mergify[bot] avatar Dec 09 '25 07:12 mergify[bot]

@cursor review

alvin-celerdata avatar Dec 09 '25 17:12 alvin-celerdata

[FE Incremental Coverage Report]

:white_check_mark: pass : 74 / 82 (90.24%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| :large_blue_circle: com/starrocks/sql/optimizer/Utils.java | 6 | 7 | 85.71% | [106] |
| :large_blue_circle: com/starrocks/sql/analyzer/QueryAnalyzer.java | 15 | 17 | 88.24% | [749, 751] |
| :large_blue_circle: com/starrocks/catalog/Column.java | 49 | 54 | 90.74% | [367, 418, 419, 421, 545] |
| :large_blue_circle: com/starrocks/sql/optimizer/rule/transformation/materialization/MaterializedViewRewriter.java | 3 | 3 | 100.00% | [] |
| :large_blue_circle: com/starrocks/common/Config.java | 1 | 1 | 100.00% | [] |

github-actions[bot] avatar Dec 10 '25 09:12 github-actions[bot]

[BE Incremental Coverage Report]

:x: fail : 12 / 79 (15.19%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| :large_blue_circle: be/src/connector/lake_connector.cpp | 0 | 10 | 00.00% | [152, 230, 231, 232, 254, 257, 258, 260, 262, 263] |
| :large_blue_circle: be/src/exec/pipeline/scan/olap_chunk_source.cpp | 0 | 14 | 00.00% | [344, 345, 346, 376, 379, 380, 382, 384, 385, 618, 619, 620, 626, 716] |
| :large_blue_circle: be/src/storage/virtual_column.cpp | 2 | 40 | 05.00% | [34, 36, 37, 38, 39, 45, 46, 48, 49, 50, 51, 52, 53, 58, 59, 60, 66, 67, 68, 69, 70, 71, 75, 76, 77, 82, 83, 87, 88, 89, 90, 91, 96, 97, 102, 103, 105, 106] |
| :large_blue_circle: be/src/storage/predicate_parser.cpp | 1 | 4 | 25.00% | [131, 161, 162] |
| :large_blue_circle: be/src/exec/olap_scan_prepare.cpp | 9 | 11 | 81.82% | [1284, 1285] |

github-actions[bot] avatar Dec 10 '25 12:12 github-actions[bot]

@cursor review

alvin-celerdata avatar Dec 10 '25 17:12 alvin-celerdata

[Java-Extensions Incremental Coverage Report]

:white_check_mark: pass : 0 / 0 (0%)

github-actions[bot] avatar Dec 11 '25 09:12 github-actions[bot]

@cursor review

alvin-celerdata avatar Dec 11 '25 15:12 alvin-celerdata