
[Feature] Support pk index major compaction in cloud native table

Open TszKitLo40 opened this issue 1 year ago • 2 comments

Why I'm doing:

PK index major compaction is not supported in cloud native tables, which causes write amplification.

What I'm doing:

Support pk index major compaction in cloud native table.

Without pk index major compaction

| Workload | publish cost (ms) | MAX IO (MB/s) |
| --- | --- | --- |
| 1G + insert + 1/100 | 121442 | 54 |
| 1G + upsert + 1/100 | 291326 | 73 |
| 1G + upsert + 1/10000 | 338894 | 46.7 |
| 2.5G + insert + 1/100 | 710491 | 121 |
| 2.5G + upsert + 1/100 | 1419680 | 121 |
| 2.5G + upsert + 1/10000 | 307463 | 121 |

With pk index major compaction

| Workload | publish cost (ms) | MAX IO (MB/s) |
| --- | --- | --- |
| 1G + insert + 1/100 | 52866 | 64.1 |
| 1G + upsert + 1/100 | 125956 | 80.8 |
| 1G + upsert + 1/10000 | 405559 | 46.7 |
| 2.5G + insert + 1/100 | 191436 | 117 |
| 2.5G + upsert + 1/100 | 389931 | 108 |
| 2.5G + upsert + 1/10000 | 223999 | 114 |
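
To make the comparison easier to read, the publish costs from the two tables above can be turned into per-workload ratios. This is only a small sketch: the dictionary and function names are illustrative and not part of the PR, and the numbers are copied from the tables as reported.

```python
# Publish cost in milliseconds, copied from the two tables above.
WITHOUT_COMPACTION_MS = {
    "1G + insert + 1/100": 121442,
    "1G + upsert + 1/100": 291326,
    "1G + upsert + 1/10000": 338894,
    "2.5G + insert + 1/100": 710491,
    "2.5G + upsert + 1/100": 1419680,
    "2.5G + upsert + 1/10000": 307463,
}
WITH_COMPACTION_MS = {
    "1G + insert + 1/100": 52866,
    "1G + upsert + 1/100": 125956,
    "1G + upsert + 1/10000": 405559,
    "2.5G + insert + 1/100": 191436,
    "2.5G + upsert + 1/100": 389931,
    "2.5G + upsert + 1/10000": 223999,
}


def publish_speedups(before, after):
    """Return the before/after publish-cost ratio per workload (>1 means faster)."""
    return {name: before[name] / after[name] for name in before}


speedups = publish_speedups(WITHOUT_COMPACTION_MS, WITH_COMPACTION_MS)
for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
```

A ratio above 1 means publish got cheaper with major compaction. On these numbers, the `1G + upsert + 1/10000` case is the one workload where the ratio falls below 1.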

In the trace log, we can see that primary_index_commit_latency_us is much lower than without major compaction, because merge compaction is no longer needed during commit.

I0301 13:51:28.410473  3076 lake_service.cpp:237] Published txns=5050. tablets=10163 cost=1869193us, trace: {"child_traces":[["PublishTablet",{"base_version":93,"deletes":0,"do_update_latency_us":924748,"new_del":0,"primary_index_commit_latency_us":909008,"primary_index_load_latency_us":0,"queuing_latency_us":20,"rewrite_segment_latency_us":19,"rowsetid":93,"state_bytes":9205820,"tablet_id":10163,"total_del":0,"update_index_latency_us":924788,"upsert_rows":151201,"upserts":1}]]}
I0301 13:52:56.940850  4938 lake_service.cpp:237] Published txns=5031. tablets=10139 cost=11237925us, trace: {"child_traces":[["PublishTablet",{"base_version":102,"deletes":0,"do_update_latency_us":870584,"new_del":0,"primary_index_commit_latency_us":10342835,"primary_index_load_latency_us":0,"queuing_latency_us":16,"rewrite_segment_latency_us":14,"rowsetid":102,"state_bytes":10569270,"tablet_id":10139,"total_del":0,"update_index_latency_us":870614,"upsert_rows":151124,"upserts":1}]]}
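
For comparing such lines in bulk, the commit latency can be pulled out of the trace JSON. The sketch below assumes each line has the shape shown above (`trace: ` followed by one JSON object whose `child_traces` entries are `[name, fields]` pairs); the helper name is hypothetical, and the sample lines are trimmed copies of the two log lines with most trace fields omitted.

```python
import json


def commit_latency_us(log_line):
    """Return primary_index_commit_latency_us for each PublishTablet trace in one log line."""
    # Everything after "trace: " is assumed to be a single JSON object.
    _, _, trace_json = log_line.partition("trace: ")
    trace = json.loads(trace_json)
    return [
        fields["primary_index_commit_latency_us"]
        for name, fields in trace["child_traces"]
        if name == "PublishTablet"
    ]


# Trimmed copies of the two log lines above (most trace fields omitted).
LINE_A = (
    'I0301 13:51:28.410473  3076 lake_service.cpp:237] Published txns=5050. '
    'tablets=10163 cost=1869193us, trace: {"child_traces":[["PublishTablet",'
    '{"primary_index_commit_latency_us":909008,"tablet_id":10163,"upsert_rows":151201}]]}'
)
LINE_B = (
    'I0301 13:52:56.940850  4938 lake_service.cpp:237] Published txns=5031. '
    'tablets=10139 cost=11237925us, trace: {"child_traces":[["PublishTablet",'
    '{"primary_index_commit_latency_us":10342835,"tablet_id":10139,"upsert_rows":151124}]]}'
)

for line in (LINE_A, LINE_B):
    print(commit_latency_us(line))
```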

Fixes #issue

What type of PR is this:

  • [ ] BugFix
  • [x] Feature
  • [ ] Enhancement
  • [ ] Refactor
  • [ ] UT
  • [ ] Doc
  • [ ] Tool

Does this PR entail a change in behavior?

  • [ ] Yes, this PR will result in a change in behavior.
  • [x] No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • [ ] Parameter changes: default values, similar parameters but with different default values
  • [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
  • [ ] Feature removed
  • [ ] Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • [x] I have added test cases for my bug fix or my new feature
  • [ ] This pr needs user documentation (for new or modified features or behaviors)
    • [ ] I have added documentation for my new feature or new function
  • [ ] This is a backport pr

Bugfix cherry-pick branch check:

  • [x] I have checked the version labels which the pr will be auto-backported to the target branch
    • [x] 3.2
    • [ ] 3.1
    • [ ] 3.0
    • [ ] 2.5

TszKitLo40 • Feb 27 '24 07:02

[FE Incremental Coverage Report]

:white_check_mark: pass : 0 / 0 (0%)

github-actions[bot] • May 21 '24 07:05

[BE Incremental Coverage Report]

:white_check_mark: pass : 112 / 132 (84.85%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
| --- | --- | --- | --- | --- |
| :large_blue_circle: be/src/storage/tablet_manager.cpp | 0 | 1 | 00.00% | [769] |
| :large_blue_circle: be/src/storage/lake/local_pk_index_manager.cpp | 44 | 61 | 72.13% | [210, 211, 216, 217, 221, 222, 232, 233, 234, 247, 248, 261, 262, 263, 267, 275, 283] |
| :large_blue_circle: be/src/storage/persistent_index_compaction_manager.cpp | 16 | 18 | 88.89% | [52, 81] |
| :large_blue_circle: be/src/storage/lake/local_pk_index_manager.h | 1 | 1 | 100.00% | [] |
| :large_blue_circle: be/src/storage/olap_server.cpp | 4 | 4 | 100.00% | [] |
| :large_blue_circle: be/src/storage/lake/lake_primary_index.h | 4 | 4 | 100.00% | [] |
| :large_blue_circle: be/src/storage/persistent_index.cpp | 1 | 1 | 100.00% | [] |
| :large_blue_circle: be/src/storage/lake/lake_primary_index.cpp | 13 | 13 | 100.00% | [] |
| :large_blue_circle: be/src/storage/lake/update_manager.cpp | 13 | 13 | 100.00% | [] |
| :large_blue_circle: be/src/storage/lake/txn_log_applier.cpp | 1 | 1 | 100.00% | [] |
| :large_blue_circle: be/src/storage/tablet_updates.h | 1 | 1 | 100.00% | [] |
| :large_blue_circle: be/src/storage/storage_engine.cpp | 2 | 2 | 100.00% | [] |
| :large_blue_circle: be/src/storage/persistent_index_tablet_loader.cpp | 3 | 3 | 100.00% | [] |
| :large_blue_circle: be/src/storage/lake/lake_local_persistent_index.cpp | 1 | 1 | 100.00% | [] |
| :large_blue_circle: be/src/storage/lake/lake_local_persistent_index.h | 2 | 2 | 100.00% | [] |
| :large_blue_circle: be/src/storage/lake/lake_local_persistent_index_tablet_loader.h | 2 | 2 | 100.00% | [] |
| :large_blue_circle: be/src/storage/lake/lake_local_persistent_index_tablet_loader.cpp | 4 | 4 | 100.00% | [] |

github-actions[bot] • May 21 '24 07:05

@Mergifyio backport branch-3.2

github-actions[bot] • May 21 '24 08:05

backport branch-3.2

✅ Backports have been created

mergify[bot] • May 21 '24 08:05

@Mergifyio backport branch-3.3

TszKitLo40 • May 21 '24 08:05

backport branch-3.3

✅ Backports have been created

mergify[bot] • May 21 '24 08:05