
raftstore: move the handling of `CompactedEvent` to split-check worker.

Open LykxSassinator opened this issue 5 months ago • 3 comments

What is changed and how it works?

Issue Number: Close https://github.com/tikv/tikv/issues/18532

What's Changed:

As described in https://github.com/tikv/tikv/issues/18532, the performance bottleneck occurs when Raftstore processes StoreMsg::CompactedEvent after RocksDB compactions: updating region sizes in store.meta requires a global lock and a linear traversal of all regions.

In the current implementation, the most time-consuming part is the calc_ranges_declined_bytes operation, which scales poorly with high region counts and causes significant Raftstore stalls during the mandatory post-compaction updates.
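To illustrate why this step is expensive, here is a minimal sketch of a range walk under a single lock. It is not the actual TiKV code: `StoreMeta`, `region_ranges`, and `influenced_regions` are hypothetical stand-ins, and keys are treated as plain byte strings with no special handling for an empty end key.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Unbounded};
use std::sync::Mutex;

/// Hypothetical stand-in for store-wide metadata guarded by one lock.
struct StoreMeta {
    /// Maps each region's end key to its region id (assumed layout).
    region_ranges: BTreeMap<Vec<u8>, u64>,
}

/// Collects the ids of regions overlapping the compacted range
/// `[start_key, end_key)`. The lock is held for the whole walk, so the
/// time other threads wait grows with the number of regions visited.
fn influenced_regions(meta: &Mutex<StoreMeta>, start_key: &[u8], end_key: &[u8]) -> Vec<u64> {
    let guard = meta.lock().unwrap();
    let mut ids = Vec::new();
    // Start at the first region whose end key is strictly greater than start_key.
    for (region_end, id) in guard
        .region_ranges
        .range::<[u8], _>((Excluded(start_key), Unbounded))
    {
        ids.push(*id);
        // Stop once a region reaches or passes the end of the compacted range.
        if region_end.as_slice() >= end_key {
            break;
        }
    }
    ids
}
```

With many small regions inside one compacted range, the loop body runs once per region while every other user of the lock waits.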

This PR moves the handling of StoreMsg::CompactedEvent out of Raftstore's StoreMsg processing: the event is converted to a SplitCheckTask::CompactedEvent and offloaded to the dedicated split-check worker thread, which then does the compaction event processing.
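The offloading pattern can be sketched roughly as follows, using a standard-library channel in place of TiKV's worker scheduler. Only the `SplitCheckTask::CompactedEvent` name comes from this PR; the payload field, the `Stop` variant, and both functions are hypothetical and exist only for illustration.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical task type; the real SplitCheckTask carries different data.
enum SplitCheckTask {
    CompactedEvent { declined_bytes: u64 },
    Stop,
}

/// Spawns a worker thread that drains tasks from the channel and returns
/// the total declined bytes it processed once it is told to stop.
fn spawn_split_check_worker() -> (Sender<SplitCheckTask>, JoinHandle<u64>) {
    let (tx, rx) = channel();
    let handle = thread::spawn(move || {
        let mut total = 0u64;
        for task in rx {
            match task {
                // The heavy per-event work would happen here, off the store thread.
                SplitCheckTask::CompactedEvent { declined_bytes } => total += declined_bytes,
                SplitCheckTask::Stop => break,
            }
        }
        total
    });
    (tx, handle)
}

/// What the store loop would do on a compaction event in this sketch:
/// convert it into a task and hand it off without processing it inline.
fn on_compacted_event(scheduler: &Sender<SplitCheckTask>, declined_bytes: u64) {
    // A send error only means the worker has already exited; ignore it here.
    let _ = scheduler.send(SplitCheckTask::CompactedEvent { declined_bytes });
}
```

In this sketch the sender returns immediately after queuing the task, so the calling thread's latency no longer depends on how long the event takes to process.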

Optimize the handling of `CompactedEvent` in raftstore by moving it to `split-check` worker.

Related changes

  • [ ] PR to update pingcap/docs / pingcap/docs-cn:
  • [ ] Need to cherry-pick to the release branch

Check List

Tests

  • [x] Unit test
  • [ ] Integration test
  • [ ] Manual test (add detailed scripts or steps below)
  • [ ] No code

Side effects

  • [ ] Performance regression: Consumes more CPU
  • [ ] Performance regression: Consumes more Memory
  • [ ] Breaking backward compatibility

Release note

Optimize the handling of `CompactedEvent` in raftstore by moving it to `split-check` worker.

LykxSassinator avatar Jun 18 '25 09:06 LykxSassinator

Can you add more details about the issue? Which part do you think slowed down the message processing, acquiring meta locks or sending messages to overlapped regions?

Updated

LykxSassinator avatar Jun 19 '25 03:06 LykxSassinator

[LGTM Timeline notifier]

Timeline:

  • 2025-06-20 04:16:39.631327192 +0000 UTC m=+418052.354506169: :ballot_box_with_check: agreed by v01dstar.
  • 2025-06-20 05:56:37.467341789 +0000 UTC m=+424050.190520768: :ballot_box_with_check: agreed by overvenus.

ti-chi-bot[bot] avatar Jun 20 '25 05:06 ti-chi-bot[bot]

/hold

LykxSassinator avatar Jun 20 '25 07:06 LykxSassinator

/unhold

LykxSassinator avatar Jun 20 '25 08:06 LykxSassinator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: overvenus, v01dstar, zhangjinpeng87

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

ti-chi-bot[bot] avatar Jun 20 '25 18:06 ti-chi-bot[bot]

@LykxSassinator: You cannot manually add or delete the cherry pick branch category labels. It will be added automatically by bot when the PR is created.

In response to adding label named type/cherry-pick-for-release-7.5.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

ti-chi-bot[bot] avatar Jun 24 '25 06:06 ti-chi-bot[bot]

In response to a cherrypick label: new pull request created to branch release-8.1: #18579. But this PR has conflicts, please resolve them!

ti-chi-bot avatar Jun 24 '25 06:06 ti-chi-bot

In response to a cherrypick label: new pull request created to branch release-7.5: #18580. But this PR has conflicts, please resolve them!

ti-chi-bot avatar Jun 24 '25 06:06 ti-chi-bot

In response to a cherrypick label: new pull request created to branch release-8.5: #18581. But this PR has conflicts, please resolve them!

ti-chi-bot avatar Jun 24 '25 06:06 ti-chi-bot