storage: reject new commands if memory quota exceeded (#16473)
This is an automated cherry-pick of #16473
This cherry-pick rolls up three PRs:
- #16440
- #16473
- #16482
They are intended to be merged together.
What is changed and how it works?
Issue Number: ref #16234
What's Changed:
Related changes
- Need to cherry-pick to the release branch
Check List
Tests
- [x] Unit test
- [x] Manual test
Test Details
The OOM issue in #16234 is hard to reproduce reliably, so I had to change the default configs.
The test used a single-node cluster with the following configs.
TiKV:
[storage]
# Try not to limit concurrent tasks
scheduler-concurrency = 2097152
# Don’t let blockcache affect memory usage
[storage.block-cache]
capacity = "100MB"
TiDB:
lease = "600s"
token-limit = 100000000
[txn-local-latches]
enabled = false
SET GLOBAL tidb_txn_mode = 'optimistic';
SET GLOBAL tidb_enable_async_commit = off;
SET GLOBAL tidb_enable_1pc = off;
Workload:
# Prepare
mysql> create database tpcc1k;
/root/.tiup/components/bench/v1.12.0/go-tpc \
tpcc prepare \
-H 10.2.12.86 -P 31825 \
-D tpcc1k --warehouses 1000 -T 500
# Run
while true; do { \
/root/.tiup/components/bench/v1.12.0/go-tpc \
tpcc run \
-H 10.2.12.86 -P 31825 \
-D tpcc1k --warehouses 1000 --time 4s -T 500 & \
pid=$!; sleep 5; kill -9 $pid; \
} done;
| TiKV Config | Metrics |
|---|---|
| OOM if memory-quota is unlimited. [storage] | |
| Does not OOM if memory-quota is configured properly. [storage] | |
Release note
Fix an issue that the txn scheduler may cause OOM if TiKV writes too slowly.
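For readers unfamiliar with the idea in the title (rejecting new commands once a memory quota is exceeded), here is a minimal sketch of that admission pattern. It is an illustration only and is not taken from this PR's diff: `MemoryQuota`, `QuotaExceeded`, `try_alloc`, and `free` are hypothetical names, and the real scheduler code may track and release memory differently.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical quota tracker (an assumption for illustration, not TiKV's type).
pub struct MemoryQuota {
    capacity: usize,
    in_use: AtomicUsize,
}

/// Error returned when a reservation would push usage past the quota.
#[derive(Debug, PartialEq)]
pub struct QuotaExceeded;

impl MemoryQuota {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, in_use: AtomicUsize::new(0) }
    }

    /// Try to reserve `bytes` for a new command. Fails immediately instead of
    /// queueing, so pending commands cannot pile up without bound.
    pub fn try_alloc(&self, bytes: usize) -> Result<(), QuotaExceeded> {
        let mut current = self.in_use.load(Ordering::Relaxed);
        loop {
            // Reject on overflow or when the new total exceeds the capacity.
            let next = match current.checked_add(bytes) {
                Some(n) if n <= self.capacity => n,
                _ => return Err(QuotaExceeded),
            };
            match self.in_use.compare_exchange_weak(
                current,
                next,
                Ordering::AcqRel,
                Ordering::Relaxed,
            ) {
                Ok(_) => return Ok(()),
                // Another thread changed the counter (or the CAS failed
                // spuriously); retry with the observed value.
                Err(observed) => current = observed,
            }
        }
    }

    /// Release a reservation once the command finishes. Callers must free
    /// exactly what they reserved; this sketch does not guard against misuse.
    pub fn free(&self, bytes: usize) {
        self.in_use.fetch_sub(bytes, Ordering::AcqRel);
    }

    /// Bytes currently reserved.
    pub fn in_use(&self) -> usize {
        self.in_use.load(Ordering::Acquire)
    }
}
```

In this sketch, a caller would invoke `try_alloc` with an estimated command size before scheduling it, return an error to the client on `Err(QuotaExceeded)`, and call `free` with the same size when the command completes. Failing fast rather than blocking is a design choice of the sketch: it bounds memory held by pending commands when writes are slow.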
[REVIEW NOTIFICATION]
This pull request has been approved by:
- Connor1996
/test
@overvenus: The /test command needs one or more targets.
The following commands are available to trigger optional jobs:
- /debug pull-unit-test
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
PR needs rebase.
This cherry pick PR is for a release branch and has not yet been approved by triage owners.
Adding the do-not-merge/cherry-pick-not-approved label.
To merge this cherry pick:
- It must first be approved by the approvers.
- AFTER it has been approved by approvers, please wait for the cherry-pick merging approval from triage owners.
@ti-chi-bot: The following test failed. Say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| pull-unit-test | 85e80d72b020a7907ccf6dc8e5d76ec7c96eb78b | link | true | /test pull-unit-test |