
[Feature] Add index_size meta function to query column index sizes

Open zhan7236 opened this issue 1 month ago • 5 comments

Why I'm doing:

This PR implements the feature requested in https://github.com/StarRocks/starrocks/issues/62680 to provide a meta function for querying column index sizes.

Users need a way to analyze and monitor the size of various indexes (BITMAP, BLOOM, ZONEMAP) on their table columns for storage optimization and troubleshooting purposes.

What I'm doing:

Fixes https://github.com/StarRocks/starrocks/issues/62680. Add a new index_size() meta function that works with the [_META_] hint to query the size of column indexes.

Syntax

SELECT index_size(column_ref [, index_type]) FROM table_name [_META_];

Parameters

  • column_ref: The column whose index size (in bytes) is returned
  • index_type (optional): The type of index to query
    • 'BITMAP': Bitmap index size
    • 'BLOOM': Bloom filter index size
    • 'ZONEMAP': Zone map index size
    • 'ALL' (default): Total size of all indexes
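
The parameter semantics above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `IndexSizes` and `select_index_size` are hypothetical names, the exact-match handling of the type string is an assumption, and the -1 return for an unknown type is only a placeholder for whatever error handling the real function uses.

```cpp
#include <cstdint>
#include <string>

// Hypothetical per-column index sizes in bytes (illustrative, not the PR's types).
struct IndexSizes {
    int64_t bitmap = 0;
    int64_t bloom = 0;
    int64_t zonemap = 0;
};

// Returns the size for the requested index type, or -1 for an unknown type.
// Omitting index_type behaves like 'ALL' and sums every index type.
// Assumption: the type string is matched exactly (upper case) in this sketch.
inline int64_t select_index_size(const IndexSizes& sizes,
                                 const std::string& index_type = "ALL") {
    if (index_type == "BITMAP") return sizes.bitmap;
    if (index_type == "BLOOM") return sizes.bloom;
    if (index_type == "ZONEMAP") return sizes.zonemap;
    if (index_type == "ALL") return sizes.bitmap + sizes.bloom + sizes.zonemap;
    return -1;  // placeholder for "unsupported index type"
}
```

Under these assumptions, a column with a 10-byte bitmap index, a 20-byte bloom filter, and a 30-byte zone map would report 60 bytes for the default 'ALL' and 20 bytes for 'BLOOM'.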

Examples

-- Get total index size for column k1
SELECT index_size(k1) FROM my_table [_META_];

-- Get only ZONEMAP index size
SELECT index_size(k1, 'ZONEMAP') FROM my_table [_META_];

-- Get BLOOM filter index size
SELECT index_size(k2, 'BLOOM') FROM my_table [_META_];

Changes Made

Backend (C++)

  • be/src/storage/rowset/column_reader.h/cpp: Added get_index_size() method to retrieve index sizes from metadata
  • be/src/storage/meta_reader.h/cpp: Added META_INDEX_SIZE constant and _collect_index_size() function

Frontend (Java)

  • fe/.../FunctionSet.java: Registered INDEX_SIZE as a built-in aggregate function
  • fe/.../PushDownAggToMetaScanRule.java: Added support for index_size in meta scan push down
  • fe/.../RewriteSimpleAggToMetaScanRule.java: Added support for index_size in simple agg rewrite

Tests

  • Added test cases in test/sql/test_meta_scan/

Documentation

  • Added function documentation in docs/en/ and docs/zh/

What type of PR is this:

  • [ ] BugFix
  • [x] Feature
  • [ ] Enhancement
  • [ ] Refactor
  • [ ] UT
  • [ ] Doc
  • [ ] Tool

Does this PR entail a change in behavior?

  • [ ] Yes, this PR will result in a change in behavior.
  • [x] No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

N/A - New feature

Checklist:

  • [x] I have added test cases for my code
  • [x] I have added documentation for my code

[!NOTE] Introduces index_size(column[, type]) meta function (BITMAP/BLOOM/ZONEMAP/ALL) with planner pushdown, backend collectors, docs, and tests.

  • Meta function: index_size
    • Returns index size in bytes for a column; optional index_type = BITMAP | BLOOM | ZONEMAP | ALL.
  • Backend (BE):
    • ColumnReader: add get_index_size() aggregating sizes from bitmap/bloom/zonemap metas (column_reader.h/cpp).
    • SegmentMetaCollecter and MetaReader: support META_INDEX_SIZE, parse field names with index type suffix, collect recursively for nested types (meta_reader.h/cpp).
  • Frontend (FE):
    • Register INDEX_SIZE aggregate (1-arg and 2-arg) in FunctionSet.
    • Extend pushdown/rewrites to support index_size, including meta column naming with type suffix and SUM rewrite (PushDownAggToMetaScanRule, RewriteSimpleAggToMetaScanRule).
  • Docs: Add index_size usage guides in docs/en/.../index_size.md and docs/zh/.../index_size.md.
  • Tests: Add SQL tests validating index_size across index types (test/sql/test_meta_scan).
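
The recursive collection for nested types and the SUM rewrite mentioned above could look roughly like the sketch below. It is a sketch under stated assumptions, not the PR's implementation: `ColumnNode`, `collect_index_size`, and `sum_over_segments` are hypothetical names, and it assumes that a nested column's index size is its own index bytes plus those of its sub-columns, and that the SUM rewrite adds per-segment results together.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a column reader (not the PR's ColumnReader).
struct ColumnNode {
    uint64_t own_index_bytes = 0;      // index bytes stored for this column itself
    std::vector<ColumnNode> children;  // sub-columns of a nested type, if any
};

// Adds this column's index bytes to those of all nested sub-columns.
uint64_t collect_index_size(const ColumnNode& node) {
    uint64_t total = node.own_index_bytes;
    for (const ColumnNode& child : node.children) {
        total += collect_index_size(child);
    }
    return total;
}

// Sums the per-segment results for one column, mirroring a SUM aggregate.
uint64_t sum_over_segments(const std::vector<ColumnNode>& segments) {
    uint64_t total = 0;
    for (const ColumnNode& segment_column : segments) {
        total += collect_index_size(segment_column);
    }
    return total;
}
```

Keeping the per-column recursion separate from the cross-segment sum reflects the split described in the summary, where the backend collects sizes and the frontend rewrite aggregates them.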

Written by Cursor Bugbot for commit ac02a99e1e50be53ec22b825052031601c580463.

zhan7236, Nov 26 '25 15:11

🧪 CI Insights

Here's what we observed from your CI run for ac02a99e.

🟢 All jobs passed!


mergify[bot], Dec 11 '25 04:12

  [ERROR] /root/starrocks/fe/fe-core/src/main/java/com/starrocks/catalog/FunctionSet.java:[1545,115] cannot find symbol
  [ERROR]   symbol:   variable VARCHAR
  [ERROR]   location: class com.starrocks.type.Type
  [ERROR] 

@zhan7236

murphyatwork, Dec 11 '25 06:12

@cursor review

alvin-celerdata, Dec 11 '25 15:12

@cursor review

alvin-celerdata, Dec 16 '25 17:12

@cursor review

alvin-celerdata, Dec 17 '25 18:12

Quality Gate failed

Failed conditions
36.2% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

sonarqubecloud[bot], Dec 18 '25 05:12

[Java-Extensions Incremental Coverage Report]

✅ pass : 0 / 0 (0%)

github-actions[bot], Dec 18 '25 07:12

[FE Incremental Coverage Report]

✅ pass : 22 / 27 (81.48%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| 🔵 com/starrocks/sql/optimizer/rule/transformation/RewriteSimpleAggToMetaScanRule.java | 7 | 12 | 58.33% | [123, 124, 125, 126, 128] |
| 🔵 com/starrocks/sql/optimizer/rule/transformation/PushDownAggToMetaScanRule.java | 13 | 13 | 100.00% | [] |
| 🔵 com/starrocks/catalog/FunctionSet.java | 2 | 2 | 100.00% | [] |

github-actions[bot], Dec 18 '25 07:12

[BE Incremental Coverage Report]

❌ fail : 33 / 45 (73.33%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| 🔵 be/src/storage/rowset/column_reader.cpp | 11 | 18 | 61.11% | [619, 624, 625, 627, 628, 635, 642] |
| 🔵 be/src/storage/meta_reader.cpp | 22 | 27 | 81.48% | [60, 62, 273, 572, 573] |

github-actions[bot], Dec 18 '25 07:12

@cursor review

alvin-celerdata, Dec 18 '25 15:12