[WIP][Feature] Support larger sizes for the cloud-native primary key persistent index
Why I'm doing:
What I'm doing:
Fixes #issue
What type of PR is this:
- [ ] BugFix
- [ ] Feature
- [ ] Enhancement
- [ ] Refactor
- [ ] UT
- [ ] Doc
- [ ] Tool
Does this PR entail a change in behavior?
- [x] Yes, this PR will result in a change in behavior.
- [ ] No, this PR will not result in a change in behavior.
If yes, please specify the type of change:
- [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
- [ ] Parameter changes: default values, similar parameters but with different default values
- [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
- [ ] Feature removed
- [ ] Miscellaneous: upgrade & downgrade compatibility, etc.
Checklist:
- [ ] I have added test cases for my bug fix or my new feature
- [ ] This pr needs user documentation (for new or modified features or behaviors)
- [ ] I have added documentation for my new feature or new function
- [ ] This is a backport pr
Bugfix cherry-pick branch check:
- [ ] I have checked the version labels which the pr will be auto-backported to the target branch
- [ ] 4.0
- [ ] 3.5
- [ ] 3.4
- [ ] 3.3
[!NOTE] Scales cloud-native primary-key persistent index via fileset-aware SST ranges, parallel compaction/get, and parallel spill merge, with new configs and thread pools.
- Cloud-Native PK Index (Lake):
- Introduces `fileset` model and SST key ranges; propagates ranges through write/compaction (`TxnLogPB`, writers, builders).
- Adds parallel compaction framework (`LakePersistentIndexParallelCompactMgr`) and size-tiered compaction strategy; supports async compaction and multi-output SSTs.
- Enables parallel get and memtable flush with new thread pools; refactors memtable to async flush and SST merge into filesets.
- Updates compaction scoring to favor index (fileset-based) and always choose cloud-native index compaction.
- Lowers PK parallel exec threshold to 100 MB; adds numerous tunables for the PK index (compaction/get/memtable/thread pools, task split, target file size, score ratio, ingest compaction threshold).
- Spill/Merge:
- Parallel spill merge within a tablet and iterator retrieval API; adds per-merge memory limit and executor for in-tablet parallelism.
- Runtime/Env:
- Adds new thread pools in `ExecEnv` (PK index get/flush) and initializes/shuts down the parallel compact manager.
- HTTP config updates to dynamically resize related thread pools.
- SSTable/Core:
- Adds concatenating iterator and exposes table key ranges; optional per-read file handle.
- Misc:
- Writer cloning/merge support for parallel paths; tracing/metrics sprinkled.
- Tests:
- Comprehensive UTs for filesets, size-tiered strategy, parallel compaction mgr, concatenating iterator, and adjusted expectations.
Written by Cursor Bugbot for commit 391594cdc27e6d3080f612f13023e4130faf2727. This will update automatically on new commits.
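For reviewers unfamiliar with the fileset idea, here is a minimal, self-contained sketch of how per-SST key ranges could speed up point lookups and allow a concatenating scan. It is not the code in this PR: `SstSketch`, `FilesetSketch`, and `make_example_fileset` are hypothetical names, and the sketch assumes that SSTs inside one fileset are non-empty, sorted by start key, and have non-overlapping key ranges, which the summary above does not state explicitly.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one SST: sorted rows, assumed non-empty.
struct SstSketch {
    std::map<std::string, int> rows;  // std::map keeps keys sorted
    const std::string& start_key() const { return rows.begin()->first; }
    const std::string& end_key() const { return rows.rbegin()->first; }
};

// Hypothetical fileset: SSTs ordered by start key with
// non-overlapping key ranges (an assumption of this sketch).
struct FilesetSketch {
    std::vector<SstSketch> ssts;

    // Point lookup: binary-search the key ranges, then probe at most one SST.
    std::optional<int> get(const std::string& key) const {
        // Find the first SST whose end key is >= key.
        auto it = std::lower_bound(
            ssts.begin(), ssts.end(), key,
            [](const SstSketch& s, const std::string& k) { return s.end_key() < k; });
        if (it == ssts.end() || key < it->start_key()) {
            return std::nullopt;  // key falls outside every range
        }
        auto row = it->rows.find(key);
        if (row == it->rows.end()) {
            return std::nullopt;  // inside the range, but not present
        }
        return row->second;
    }

    // Concatenating scan: because ranges do not overlap, visiting the SSTs
    // in order yields globally sorted keys without a merge heap.
    std::vector<std::string> scan_keys() const {
        std::vector<std::string> out;
        for (const auto& sst : ssts) {
            for (const auto& kv : sst.rows) {
                out.push_back(kv.first);
            }
        }
        return out;
    }
};

// Builds a two-SST fileset covering [a, b] and [d, e].
inline FilesetSketch make_example_fileset() {
    FilesetSketch fs;
    fs.ssts.push_back(SstSketch{{{"a", 1}, {"b", 2}}});
    fs.ssts.push_back(SstSketch{{{"d", 4}, {"e", 5}}});
    return fs;
}
```

In this sketch a lookup for a key that falls in the gap between two ranges (for example `"c"`) returns early without touching any SST, which is the kind of saving key ranges are meant to provide. Overlapping SSTs from different filesets would still need a merging iterator.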
@cursor review
[Java-Extensions Incremental Coverage Report]
:white_check_mark: pass : 0 / 0 (0%)
[FE Incremental Coverage Report]
:white_check_mark: pass : 0 / 0 (0%)
[BE Incremental Coverage Report]
:white_check_mark: pass : 35 / 43 (81.40%)
file detail
| | path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|---|
| :large_blue_circle: | src/storage/storage_engine.h | 0 | 1 | 00.00% | [268] |
| :large_blue_circle: | src/storage/rowset/segment_iterator.cpp | 1 | 2 | 50.00% | [71] |
| :large_blue_circle: | src/storage/lake/rowset_update_state.cpp | 9 | 12 | 75.00% | [46, 55, 56] |
| :large_blue_circle: | src/storage/lake/update_manager.cpp | 9 | 12 | 75.00% | [332, 333, 336] |
| :large_blue_circle: | src/storage/lake/general_tablet_writer.cpp | 7 | 7 | 100.00% | [] |
| :large_blue_circle: | src/storage/lake/compaction_policy.cpp | 3 | 3 | 100.00% | [] |
| :large_blue_circle: | src/runtime/exec_env.cpp | 6 | 6 | 100.00% | [] |
🧪 CI Insights
Here's what we observed from your CI run for 6dd848f2.
🟢 All jobs passed!