Why I'm doing:

What I'm doing:

Fixes #issue

What type of PR is this:

[ ] BugFix
[ ] Feature
[ ] Enhancement
[ ] Refactor
[ ] UT
[ ] Doc
[ ] Tool

Does this PR entail a change in behavior?

[x] Yes, this PR will result in a change in behavior.
[ ] No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

[ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
[ ] Parameter changes: default values, similar parameters but with different default values
[ ] Policy changes: use new policy to replace old one, functionality automatically enabled
[ ] Feature removed
[ ] Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

[ ] I have added test cases for my bug fix or my new feature
[ ] This PR needs user documentation (for new or modified features or behaviors)
- [ ] I have added documentation for my new feature or new function
[ ] This is a backport PR

Bugfix cherry-pick branch check:

[ ] I have checked the version labels which the PR will be auto-backported to the target branch
- [ ] 4.0
- [ ] 3.5
- [ ] 3.4
- [ ] 3.3

[!NOTE] Scales cloud-native primary-key persistent index via fileset-aware SST ranges, parallel compaction/get, and parallel spill merge, with new configs and thread pools.

Cloud-Native PK Index (Lake):

Introduces fileset model and SST key ranges; propagates ranges through write/compaction (TxnLogPB, writers, builders).
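
To make the fileset idea concrete, here is a minimal sketch of how per-SST key ranges can narrow a point lookup to a single file. It is illustrative only: `SstRange`, `Fileset`, and `find_sst` are hypothetical names, and the assumption that SSTs inside one fileset are sorted and non-overlapping is mine, not something the summary confirms.

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <vector>

// Hypothetical types for illustration; not the PR's actual classes.
struct SstRange {
    std::string start_key; // inclusive lower bound of the SST
    std::string end_key;   // inclusive upper bound of the SST
    int sst_id;
};

// Assumption: SSTs in one fileset are sorted by start_key and never overlap.
struct Fileset {
    std::vector<SstRange> ssts;

    // Returns the id of the only SST whose range can contain `key`, if any.
    std::optional<int> find_sst(const std::string& key) const {
        // Binary search for the first SST whose end_key is >= key.
        auto it = std::lower_bound(
                ssts.begin(), ssts.end(), key,
                [](const SstRange& r, const std::string& k) { return r.end_key < k; });
        // The key falls in a gap between ranges, or past the last range.
        if (it == ssts.end() || key < it->start_key) return std::nullopt;
        return it->sst_id;
    }
};
```

Under that assumption a lookup touches at most one SST per fileset, which is why carrying ranges in the metadata can pay off as the index grows.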

Adds parallel compaction framework (LakePersistentIndexParallelCompactMgr) and size-tiered compaction strategy; supports async compaction and multi-output SSTs.
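
For readers unfamiliar with size-tiered compaction, the following is a generic sketch of the technique, not the strategy implemented in this PR. Files of similar size are bucketed into tiers, and a tier becomes a compaction candidate once it holds enough files. The function names and the `ratio` and `min_files` parameters are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Generic size-tiered bucketing; `ratio` is a hypothetical tuning parameter.
std::vector<std::vector<uint64_t>> bucket_by_size(std::vector<uint64_t> sizes, double ratio) {
    std::sort(sizes.begin(), sizes.end());
    std::vector<std::vector<uint64_t>> tiers;
    for (uint64_t s : sizes) {
        // Open a new tier when this file is more than `ratio` times
        // the smallest file of the current tier.
        if (tiers.empty() ||
            static_cast<double>(s) > static_cast<double>(tiers.back().front()) * ratio) {
            tiers.emplace_back();
        }
        tiers.back().push_back(s);
    }
    return tiers;
}

// Picks the smallest tier holding at least `min_files` files; empty if none qualifies.
std::vector<uint64_t> pick_compaction_input(const std::vector<uint64_t>& sizes, double ratio,
                                            size_t min_files) {
    for (const auto& tier : bucket_by_size(sizes, ratio)) {
        if (tier.size() >= min_files) return tier;
    }
    return {};
}
```

With sizes 10, 11, 12, 100, 110, 1000, a ratio of 2.0, and a minimum of 3 files, the sketch selects the three small files and leaves the larger ones alone, so each compaction rewrites inputs of comparable size.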

Enables parallel get and memtable flush with new thread pools; refactors the memtable to flush asynchronously and to merge the resulting SSTs into filesets.

Updates compaction scoring to favor index (fileset-based) and always choose cloud-native index compaction.

Lowers the PK parallel exec threshold to 100MB; adds numerous tunables for the PK index (compaction/get/memtable/threadpools, task split, target file size, score ratio, ingest compaction threshold).

Spill/Merge:

Adds parallel spill merge within a tablet and an iterator retrieval API; adds a per-merge memory limit and an executor for in-tablet parallelism.

Runtime/Env:

Adds new thread pools in ExecEnv (PK index get/flush) and initializes/shuts down parallel compact manager.

Config updates made over HTTP now dynamically resize the related thread pools.

SSTable/Core:

Adds concatenating iterator and exposes table key ranges; optional per-read file handle.
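
As background on what a concatenating iterator does, here is a small sketch under stated assumptions. When a sequence of sorted runs is known not to overlap and is already ordered by key range, the runs can be walked back to back with no heap-based merge. `Run`, `ConcatIterator`, and `drain` are hypothetical names, and an in-memory vector stands in for an SST.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Stand-in for one sorted SST; hypothetical, for illustration only.
using Run = std::vector<std::string>;

// Assumption: runs are ordered by key range and do not overlap, so visiting
// them one after another yields keys in globally sorted order.
class ConcatIterator {
public:
    explicit ConcatIterator(std::vector<Run> runs) : runs_(std::move(runs)) { skip_exhausted(); }

    bool valid() const { return run_ < runs_.size(); }
    const std::string& key() const { return runs_[run_][pos_]; }

    void next() {
        ++pos_;
        skip_exhausted();
    }

private:
    // Moves to the next run while the position is past the end of the current one;
    // this also skips empty runs.
    void skip_exhausted() {
        while (run_ < runs_.size() && pos_ >= runs_[run_].size()) {
            ++run_;
            pos_ = 0;
        }
    }

    std::vector<Run> runs_;
    size_t run_ = 0;
    size_t pos_ = 0;
};

// Collects every key the iterator yields, in order.
std::vector<std::string> drain(ConcatIterator it) {
    std::vector<std::string> out;
    for (; it.valid(); it.next()) out.push_back(it.key());
    return out;
}
```

Each `next()` is constant time apart from skipping exhausted runs, whereas a general k-way merge pays a heap operation per key. That difference is the usual reason to expose key ranges alongside such an iterator.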

Misc:

Adds writer cloning/merge support for the parallel paths; adds tracing and metrics in several places.

Tests:

Adds comprehensive UTs for filesets, the size-tiered strategy, the parallel compaction manager, and the concatenating iterator; adjusts expectations in existing tests.

Written by Cursor Bugbot for commit 391594cdc27e6d3080f612f13023e4130faf2727. This will update automatically on new commits.

Dec 02 '25 02:12 luohaha

@cursor review

Dec 02 '25 05:12 alvin-celerdata

@cursor review

Dec 02 '25 17:12 alvin-celerdata

[Java-Extensions Incremental Coverage Report]

:white_check_mark: pass : 0 / 0 (0%)

Dec 03 '25 07:12 github-actions[bot]

[FE Incremental Coverage Report]

:white_check_mark: pass : 0 / 0 (0%)

Dec 03 '25 07:12 github-actions[bot]

[BE Incremental Coverage Report]

:white_check_mark: pass : 35 / 43 (81.40%)

file detail

| | path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|---|
| :large_blue_circle: | src/storage/storage_engine.h | 0 | 1 | 00.00% | [268] |
| :large_blue_circle: | src/storage/rowset/segment_iterator.cpp | 1 | 2 | 50.00% | [71] |
| :large_blue_circle: | src/storage/lake/rowset_update_state.cpp | 9 | 12 | 75.00% | [46, 55, 56] |
| :large_blue_circle: | src/storage/lake/update_manager.cpp | 9 | 12 | 75.00% | [332, 333, 336] |
| :large_blue_circle: | src/storage/lake/general_tablet_writer.cpp | 7 | 7 | 100.00% | [] |
| :large_blue_circle: | src/storage/lake/compaction_policy.cpp | 3 | 3 | 100.00% | [] |
| :large_blue_circle: | src/runtime/exec_env.cpp | 6 | 6 | 100.00% | [] |

Dec 03 '25 07:12 github-actions[bot]

@cursor review

Dec 03 '25 17:12 alvin-celerdata

🧪 CI Insights

Here's what we observed from your CI run for 6dd848f2.

🟢 All jobs passed!

But CI Insights is watching 👀

Dec 06 '25 02:12 mergify[bot]

@cursor review

Dec 08 '25 03:12 alvin-celerdata

[WIP][Feature] cloud native primary key persistent index support larger size
