[Feature] Support data ingestion for range distribution table

Open srlch opened this issue 3 weeks ago • 15 comments

What I'm doing:

Implement end-to-end range-based routing for range distribution tables on both FE and BE sides:

FE: Add TabletRange and Tuple.toThrift / Variant.toThrift to serialize tablet ranges as TTabletRange/TVariant. OlapTableSink / FrontendServiceImpl now populate range_distributed_columns and per-tablet range in TOlapTablePartitionParam and TOlapTableIndexTablets, and temporarily disable colocate MV index for RANGE distribution tables.

BE: Introduce RangeRouter to route rows to tablets based on TTabletRange, supporting multi-column ranges and open, closed, and ±inf interval bounds; extend TabletSinkSender with RangeTabletSinkSender.
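The routing idea described for the BE can be illustrated with a minimal, self-contained sketch. This is not the PR's RangeRouter: the `TabletRange` struct, `Value`/`Key` aliases, and the `compare_keys`, `contains`, and `route` functions below are hypothetical stand-ins for TTabletRange/TVariant, and the linear scan is chosen for clarity only. It assumes each column holds the same value type in the row key and in the bounds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for the Thrift types named in the PR.
using Value = std::variant<int64_t, std::string>;
using Key = std::vector<Value>;  // one value per range distribution column

struct TabletRange {
    int64_t tablet_id;
    std::optional<Key> lower;  // std::nullopt models -inf
    std::optional<Key> upper;  // std::nullopt models +inf
    bool lower_inclusive = true;
    bool upper_inclusive = false;
};

// Lexicographic comparison across columns: negative, zero, or positive.
int compare_keys(const Key& a, const Key& b) {
    assert(a.size() == b.size());  // assumption: key and bound have the same arity
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] < b[i]) return -1;
        if (b[i] < a[i]) return 1;
    }
    return 0;
}

// True when the key falls inside the range, honoring open/closed bounds.
bool contains(const TabletRange& r, const Key& key) {
    if (r.lower) {
        int c = compare_keys(key, *r.lower);
        if (c < 0 || (c == 0 && !r.lower_inclusive)) return false;
    }
    if (r.upper) {
        int c = compare_keys(key, *r.upper);
        if (c > 0 || (c == 0 && !r.upper_inclusive)) return false;
    }
    return true;
}

// Returns the first tablet whose range covers the key, or std::nullopt
// when no range matches (the error case a caller would have to handle).
std::optional<int64_t> route(const std::vector<TabletRange>& ranges, const Key& key) {
    for (const auto& r : ranges) {
        if (contains(r, key)) return r.tablet_id;
    }
    return std::nullopt;
}
```

With ranges `(-inf, (10,"m"))`, `[(10,"m"), (20,"a")]`, and `((20,"a"), +inf)`, a row keyed `(10,"m")` falls in the second range because the first range's upper bound is exclusive. A production router over sorted, non-overlapping ranges would more likely use a binary search than a scan.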

Tests: Add/extend FE tests (VariantTest, TabletRangeTest) to cover numeric, string, boolean and date/datetime serialization to Thrift. Add BE tests (RangeRouterTest, TabletSinkSenderRangeTest) covering typical routing paths, boundary behavior, sparse selections and error cases.

Fixes #64986

What type of PR is this:

  • [ ] BugFix
  • [x] Feature
  • [ ] Enhancement
  • [ ] Refactor
  • [ ] UT
  • [ ] Doc
  • [ ] Tool

Does this PR entail a change in behavior?

  • [ ] Yes, this PR will result in a change in behavior.
  • [x] No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • [ ] Parameter changes: default values, similar parameters but with different default values
  • [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
  • [ ] Feature removed
  • [ ] Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • [x] I have added test cases for my bug fix or my new feature
  • [ ] This pr needs user documentation (for new or modified features or behaviors)
    • [ ] I have added documentation for my new feature or new function
  • [ ] This is a backport pr

Bugfix cherry-pick branch check:

  • [x] I have checked the version labels which the pr will be auto-backported to the target branch
    • [ ] 4.0
    • [ ] 3.5
    • [ ] 3.4
    • [ ] 3.3

[!NOTE] Implements range-based row routing for range-distributed tables, adding FE/BE support, Thrift/Proto range serialization, and comprehensive tests.

  • Backend (BE):
    • Add RangeRouter and RangeTabletSinkSender to route rows by TTabletRange; wire into OlapTableSink when is_range_distribution().
    • Enhance OlapTablePartitionParam to carry distribution_type; compute hashes only for HASH; randomize for RANGE/RANDOM.
    • Update build/tests; add range_router_test and tablet_sink_sender_range_test.
  • Frontend (FE):
    • Introduce TabletRange, Tuple.toThrift(), and Variant.toThrift() (BOOL/INT/LARGEINT/STRING/DATE) to serialize tablet ranges.
    • OlapTableSink/FrontendServiceImpl: populate distribution_type, range-distributed columns, and per-tablet range in TOlapTablePartitionParam/TOlapTableIndexTablets; disable colocate MV for RANGE.
    • Add helpers in MetaUtils for range distribution columns/ids; Tablet now holds TabletRange.
    • Add tests VariantTest, TabletRangeTest; adjust existing tests for distribution info.
  • Thrift/Proto:
    • Add TOlapTableDistributionType and distribution_type to TOlapTablePartitionParam.
    • Add TOlapTableTablet.range and TOlapTableIndexTablets.tablets.
    • TTabletRange now includes bounds and inclusivity; TVariant/VariantPB use long_value and string_value fields.

Written by Cursor Bugbot for commit a007d96e43ac3bcf2705f4b3868a30e9d4fa9863. This will update automatically on new commits.

srlch avatar Dec 02 '25 07:12 srlch

@cursor review

alvin-celerdata avatar Dec 02 '25 17:12 alvin-celerdata

@cursor review

alvin-celerdata avatar Dec 03 '25 17:12 alvin-celerdata

🧪 CI Insights

Here's what we observed from your CI run for a007d96e.

🟢 All jobs passed!

But CI Insights is watching 👀

mergify[bot] avatar Dec 05 '25 01:12 mergify[bot]

@cursor review

alvin-celerdata avatar Dec 05 '25 17:12 alvin-celerdata

@cursor review

alvin-celerdata avatar Dec 08 '25 17:12 alvin-celerdata

@cursor review

alvin-celerdata avatar Dec 09 '25 03:12 alvin-celerdata

@cursor review

alvin-celerdata avatar Dec 09 '25 04:12 alvin-celerdata

🚨 Bugbot couldn't run

Something went wrong. Try again by commenting "Cursor review" or "bugbot run", or contact support (requestId: serverGenReqId_b6e0ba7e-f4b8-410c-9135-5955b40f212f).

cursor[bot] avatar Dec 09 '25 04:12 cursor[bot]

@cursor review

alvin-celerdata avatar Dec 09 '25 06:12 alvin-celerdata

@cursor review

alvin-celerdata avatar Dec 09 '25 17:12 alvin-celerdata

@cursor review

alvin-celerdata avatar Dec 10 '25 02:12 alvin-celerdata

@cursor review

alvin-celerdata avatar Dec 15 '25 17:12 alvin-celerdata

[BE Incremental Coverage Report]

:x: fail : 0 / 12 (00.00%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
| --- | --- | --- | --- | --- |
| :large_blue_circle: src/exec/tablet_info.cpp | 0 | 4 | 00.00% | [276, 278, 280, 281] |
| :large_blue_circle: src/exec/tablet_sink_sender.cpp | 0 | 2 | 00.00% | [49, 50] |
| :large_blue_circle: src/exec/tablet_sink_sender.h | 0 | 2 | 00.00% | [72, 73] |
| :large_blue_circle: src/exec/tablet_sink.cpp | 0 | 4 | 00.00% | [322, 324, 325, 326] |

github-actions[bot] avatar Dec 17 '25 03:12 github-actions[bot]

@cursor review

alvin-celerdata avatar Dec 17 '25 05:12 alvin-celerdata

Quality Gate failed

Failed conditions
3.5% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

sonarqubecloud[bot] avatar Dec 17 '25 05:12 sonarqubecloud[bot]

@cursor review

alvin-celerdata avatar Dec 17 '25 18:12 alvin-celerdata

[Java-Extensions Incremental Coverage Report]

:white_check_mark: pass : 0 / 0 (0%)

github-actions[bot] avatar Dec 18 '25 02:12 github-actions[bot]

[FE Incremental Coverage Report]

:x: fail : 54 / 69 (78.26%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
| --- | --- | --- | --- | --- |
| :large_blue_circle: com/starrocks/catalog/LargeIntVariant.java | 0 | 4 | 00.00% | [62, 63, 64, 65] |
| :large_blue_circle: com/starrocks/sql/common/MetaUtils.java | 0 | 6 | 00.00% | [249, 254, 259, 263, 267, 271] |
| :large_blue_circle: com/starrocks/catalog/Tablet.java | 1 | 4 | 25.00% | [41, 42, 48] |
| :large_blue_circle: com/starrocks/planner/OlapTableSink.java | 11 | 12 | 91.67% | [517] |
| :large_blue_circle: com/starrocks/catalog/TabletRange.java | 11 | 12 | 91.67% | [30] |
| :large_blue_circle: com/starrocks/catalog/BoolVariant.java | 4 | 4 | 100.00% | [] |
| :large_blue_circle: com/starrocks/catalog/IntVariant.java | 4 | 4 | 100.00% | [] |
| :large_blue_circle: com/starrocks/catalog/Tuple.java | 3 | 3 | 100.00% | [] |
| :large_blue_circle: com/starrocks/catalog/DateVariant.java | 4 | 4 | 100.00% | [] |
| :large_blue_circle: com/starrocks/catalog/StringVariant.java | 4 | 4 | 100.00% | [] |
| :large_blue_circle: com/starrocks/service/FrontendServiceImpl.java | 12 | 12 | 100.00% | [] |

github-actions[bot] avatar Dec 18 '25 02:12 github-actions[bot]

@cursor review

alvin-celerdata avatar Dec 18 '25 04:12 alvin-celerdata