[Enhancement] Support repairing tablet metadata in frontend
Why I'm doing:
What I'm doing:
This PR adds a `repairTabletMetadata` method to the frontend's `TabletRepairHelper` class, which repairs valid tablet metadata through the backends. It also adds some cleanup in the backend before tablet metadata is repaired.
https://github.com/StarRocks/starrocks/issues/66015
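For reviewers who want a mental model of the FE side, here is a minimal sketch of grouping tablets by node and building one repair request per node. It is not the code in this PR: `RepairSketch`, `RepairRequest`, `buildRequests`, and the tablet-to-node map are hypothetical names chosen for illustration. The sketch simply copies the bundling flag into every request, whereas the real helper also coordinates bundling writes and aligns versions, which is not shown here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: none of these types exist in StarRocks under these names.
final class RepairSketch {
    /** Hypothetical stand-in for a per-node repair request. */
    public static final class RepairRequest {
        public final long nodeId;
        public final List<Long> tabletIds;
        public final boolean writeBundlingFile;

        public RepairRequest(long nodeId, List<Long> tabletIds, boolean writeBundlingFile) {
            this.nodeId = nodeId;
            this.tabletIds = tabletIds;
            this.writeBundlingFile = writeBundlingFile;
        }
    }

    /**
     * Groups tablets by the node that owns them and builds one request per node.
     * Assumption: the bundling flag is passed through unchanged to every request.
     */
    public static List<RepairRequest> buildRequests(Map<Long, Long> tabletToNode,
                                                    boolean writeBundlingFile) {
        // LinkedHashMap keeps nodes in first-seen order so the output is deterministic.
        Map<Long, List<Long>> tabletsByNode = new LinkedHashMap<>();
        for (Map.Entry<Long, Long> entry : tabletToNode.entrySet()) {
            tabletsByNode.computeIfAbsent(entry.getValue(), k -> new ArrayList<>())
                         .add(entry.getKey());
        }
        List<RepairRequest> requests = new ArrayList<>();
        for (Map.Entry<Long, List<Long>> entry : tabletsByNode.entrySet()) {
            requests.add(new RepairRequest(entry.getKey(), entry.getValue(), writeBundlingFile));
        }
        return requests;
    }
}
```

In this sketch, three tablets spread over two nodes yield two requests, each carrying the same bundling flag.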
What type of PR is this:
- [ ] BugFix
- [ ] Feature
- [x] Enhancement
- [ ] Refactor
- [ ] UT
- [ ] Doc
- [ ] Tool
Does this PR entail a change in behavior?
- [ ] Yes, this PR will result in a change in behavior.
- [x] No, this PR will not result in a change in behavior.
If yes, please specify the type of change:
- [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
- [ ] Parameter changes: default values, similar parameters but with different default values
- [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
- [ ] Feature removed
- [ ] Miscellaneous: upgrade & downgrade compatibility, etc.
Checklist:
- [x] I have added test cases for my bug fix or my new feature
- [ ] This pr needs user documentation (for new or modified features or behaviors)
- [ ] I have added documentation for my new feature or new function
- [ ] This is a backport pr
Bugfix cherry-pick branch check:
- [x] I have checked the version labels which the pr will be auto-backported to the target branch
- [ ] 4.0
- [ ] 3.5
- [ ] 3.4
- [ ] 3.3
[!NOTE] Adds FE/BE support to repair tablet metadata with pre-repair cache/index cleanup, bundling write control, and corresponding RPC/proto updates and tests.
- Backend (BE):
  - Implement `LakeServiceImpl::repair_tablet_metadata` with validation, optional bundling, and per-tablet puts.
  - Add `_cleanup_before_repair()` to drop local data/meta cache, bundle cache, and clear PK index (including local persistent index files).
  - Expose `TabletManager::drop_local_cache` publicly; minor warning log formatting changes.
- Frontend (FE):
  - Add `TabletRepairHelper.repairTabletMetadata(...)` to orchestrate repair across nodes, including bundling write coordination and version alignment.
  - Extend `LakeService` and `LakeServiceWithMetrics` with `repairTabletMetadata` RPC and `TIMEOUT_REPAIR_METADATA`.
- Protocol:
  - Update `RepairTabletMetadataRequest` to include `write_bundling_file` flag.
- Tests:
  - Add/extend BE and FE unit tests covering repair flows (bundling/non-bundling), errors, cancellations, and metrics wrappers; update pseudocluster stubs.
Written by Cursor Bugbot for commit f47b5ac4486eeba3acd1f707367cda24e4bb6a46.
🧪 CI Insights
Here's what we observed from your CI run for f47b5ac4.
🟢 All jobs passed!
@cursor review
Quality Gate passed
Issues
15 New issues
0 Accepted issues
Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code
[Java-Extensions Incremental Coverage Report]
:white_check_mark: pass : 0 / 0 (0%)
[FE Incremental Coverage Report]
:white_check_mark: pass : 58 / 60 (96.67%)
file detail
| | path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|---|
| :large_blue_circle: | com/starrocks/rpc/LakeServiceWithMetrics.java | 0 | 2 | 00.00% | [193, 194] |
| :large_blue_circle: | com/starrocks/lake/TabletRepairHelper.java | 58 | 58 | 100.00% | [] |
[BE Incremental Coverage Report]
:white_check_mark: pass : 25 / 31 (80.65%)
file detail
| | path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|---|
| :large_blue_circle: | be/src/service/service_be/lake_service.cpp | 25 | 31 | 80.65% | [1520, 1521, 1524, 1525, 1526, 1590] |