[Feature] Support pk index major compaction in cloud native table

Open TszKitLo40 opened this issue 5 months ago • 2 comments

Why I'm doing:

PK index major compaction is not supported in cloud native tables, which causes write amplification.

What I'm doing:

Support pk index major compaction in cloud native table.

Without pk index major compaction

| Test case | Publish cost (ms) | Max IO (MB/s) |
| --- | --- | --- |
| 1G + insert + 1/100 | 121442 | 54 |
| 1G + upsert + 1/100 | 291326 | 73 |
| 1G + upsert + 1/10000 | 338894 | 46.7 |
| 2.5G + insert + 1/100 | 710491 | 121 |
| 2.5G + upsert + 1/100 | 1419680 | 121 |
| 2.5G + upsert + 1/10000 | 307463 | 121 |

With pk index major compaction

| Test case | Publish cost (ms) | Max IO (MB/s) |
| --- | --- | --- |
| 1G + insert + 1/100 | 52866 | 64.1 |
| 1G + upsert + 1/100 | 125956 | 80.8 |
| 1G + upsert + 1/10000 | 405559 | 46.7 |
| 2.5G + insert + 1/100 | 191436 | 117 |
| 2.5G + upsert + 1/100 | 389931 | 108 |
| 2.5G + upsert + 1/10000 | 223999 | 114 |
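
To make the comparison between the two tables easier to read, here is a minimal sketch that computes the publish-cost ratio per test case. The numbers are copied from the tables above; the names `publish_cost_ms` and `speedup` are only illustrative and are not part of this PR.

```python
# Publish cost in ms, copied from the two tables above:
# (without major compaction, with major compaction).
publish_cost_ms = {
    "1G + insert + 1/100": (121442, 52866),
    "1G + upsert + 1/100": (291326, 125956),
    "1G + upsert + 1/10000": (338894, 405559),
    "2.5G + insert + 1/100": (710491, 191436),
    "2.5G + upsert + 1/100": (1419680, 389931),
    "2.5G + upsert + 1/10000": (307463, 223999),
}


def speedup(without_ms: int, with_ms: int) -> float:
    """Return without/with; a ratio above 1 means publish got faster."""
    return without_ms / with_ms


speedups = {case: speedup(*costs) for case, costs in publish_cost_ms.items()}

for case, ratio in speedups.items():
    print(f"{case}: {ratio:.2f}x")
```

On these numbers, five of the six cases get faster with major compaction (roughly 1.4x to 3.7x), while `1G + upsert + 1/10000` comes out slower (ratio below 1).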

In the trace log, we can see that primary_index_commit_latency_us is much lower with major compaction than without it (909008 us versus 10342835 us in the two entries below), because merge compaction is no longer needed during commit.

I0301 13:51:28.410473  3076 lake_service.cpp:237] Published txns=5050. tablets=10163 cost=1869193us, trace: {"child_traces":[["PublishTablet",{"base_version":93,"deletes":0,"do_update_latency_us":924748,"new_del":0,"primary_index_commit_latency_us":909008,"primary_index_load_latency_us":0,"queuing_latency_us":20,"rewrite_segment_latency_us":19,"rowsetid":93,"state_bytes":9205820,"tablet_id":10163,"total_del":0,"update_index_latency_us":924788,"upsert_rows":151201,"upserts":1}]]}
I0301 13:52:56.940850  4938 lake_service.cpp:237] Published txns=5031. tablets=10139 cost=11237925us, trace: {"child_traces":[["PublishTablet",{"base_version":102,"deletes":0,"do_update_latency_us":870584,"new_del":0,"primary_index_commit_latency_us":10342835,"primary_index_load_latency_us":0,"queuing_latency_us":16,"rewrite_segment_latency_us":14,"rowsetid":102,"state_bytes":10569270,"tablet_id":10139,"total_del":0,"update_index_latency_us":870614,"upsert_rows":151124,"upserts":1}]]}
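
For comparing such entries without reading the JSON by eye, the following is a rough sketch that pulls `primary_index_commit_latency_us` out of each line. It assumes the log format is exactly as in the two samples above (the literal marker `trace: ` followed by one JSON object); the helper name `publish_tablet_traces` is hypothetical, and the sample lines are abbreviated copies that keep only a few fields.

```python
import json


def publish_tablet_traces(log_line: str) -> list:
    """Return the PublishTablet metric dicts from one 'Published txns' line."""
    # Assumption: the JSON payload is everything after the marker "trace: ".
    _, _, payload = log_line.partition("trace: ")
    trace = json.loads(payload)
    return [
        metrics
        for name, metrics in trace["child_traces"]
        if name == "PublishTablet"
    ]


# Abbreviated copies of the two log lines above (most fields dropped).
lines = [
    'I0301 13:51:28.410473  3076 lake_service.cpp:237] Published txns=5050. '
    'tablets=10163 cost=1869193us, trace: {"child_traces":[["PublishTablet",'
    '{"primary_index_commit_latency_us":909008,"tablet_id":10163}]]}',
    'I0301 13:52:56.940850  4938 lake_service.cpp:237] Published txns=5031. '
    'tablets=10139 cost=11237925us, trace: {"child_traces":[["PublishTablet",'
    '{"primary_index_commit_latency_us":10342835,"tablet_id":10139}]]}',
]

# Map tablet_id -> commit latency in microseconds.
commit_latencies = {
    m["tablet_id"]: m["primary_index_commit_latency_us"]
    for line in lines
    for m in publish_tablet_traces(line)
}

for tablet_id, latency_us in commit_latencies.items():
    print(f"tablet {tablet_id}: commit latency {latency_us / 1e6:.2f}s")
```

For the two entries shown, the commit latency differs by roughly a factor of 11.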

Fixes #issue

What type of PR is this:

  • [ ] BugFix
  • [x] Feature
  • [ ] Enhancement
  • [ ] Refactor
  • [ ] UT
  • [ ] Doc
  • [ ] Tool

Does this PR entail a change in behavior?

  • [ ] Yes, this PR will result in a change in behavior.
  • [x] No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • [ ] Parameter changes: default values, similar parameters but with different default values
  • [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
  • [ ] Feature removed
  • [ ] Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • [x] I have added test cases for my bug fix or my new feature
  • [ ] This pr needs user documentation (for new or modified features or behaviors)
    • [ ] I have added documentation for my new feature or new function
  • [ ] This is a backport pr

Bugfix cherry-pick branch check:

  • [x] I have checked the version labels which the pr will be auto-backported to the target branch
    • [x] 3.2
    • [ ] 3.1
    • [ ] 3.0
    • [ ] 2.5

TszKitLo40 avatar Feb 27 '24 07:02 TszKitLo40