server
10.11 innodb wsrep applier lock wait timeout
- [x] The Jira issue number for this PR is: MDEV-29684
Description
This patch implements wsrep applier lock wait timeout functionality.
Transactions executed by appliers have already passed certification in the cluster, so they must be applied and committed successfully. However, BF aborting of local transactions occasionally fails because of race conditions or unforeseen behavior of the lock manager, which may leave appliers waiting for locks indefinitely. In particular, a local transaction that has already reached the commit stage will not yield via lock wait timeout.
To resolve indefinite applier waits, a short applier lock wait timeout is introduced. However, instead of giving up on the lock wait when the timeout expires, a background thread retries the BF abort on behalf of the applier that is waiting for the lock.
The applier lock wait timeout is controlled by the variable innodb_wsrep_applier_lock_wait_timeout, which has a default value of five seconds. If the value is zero, background BF aborting is disabled.
The value of innodb_wsrep_applier_lock_wait_timeout is set to zero in the Galera suite MTR test configuration to avoid non-deterministic behavior.
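The retry mechanism described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the code in this patch: `Waiter`, `background_retry`, `run_applier` and the `try_bf_abort` callback are hypothetical names, and the real InnoDB lock manager and wsrep BF abort path are far more involved. It shows an applier that keeps waiting for its lock while a background thread wakes up once per timeout period and retries the abort on its behalf. A timeout of zero disables the retries.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-in for an applier waiting on a lock held by a local transaction.
struct Waiter {
    std::mutex m;
    std::condition_variable cv;
    bool granted = false;  // becomes true once the lock holder has been aborted
    int retries = 0;       // number of background BF abort attempts made
};

// Background thread body: each time the timeout expires while the applier is
// still waiting, retry the (hypothetical) BF abort on the applier's behalf.
// try_bf_abort(attempt) returns true when the abort gets through.
void background_retry(Waiter& w, std::chrono::milliseconds timeout,
                      std::function<bool(int)> try_bf_abort) {
    if (timeout.count() == 0) return;  // zero disables background BF aborting
    std::unique_lock<std::mutex> lk(w.m);
    while (!w.granted) {
        // wait_for returns true if the lock was granted before the timeout.
        if (w.cv.wait_for(lk, timeout, [&] { return w.granted; })) break;
        ++w.retries;
        if (try_bf_abort(w.retries)) {
            w.granted = true;   // the victim rolled back, the applier may proceed
            w.cv.notify_all();
        }
    }
}

// The applier never gives up its lock wait; it only returns once the lock is
// granted. Returns how many background retries were needed.
// Note: with a zero timeout and no other releaser this would wait forever,
// which mirrors the "disabled" behavior described above.
int run_applier(std::chrono::milliseconds timeout,
                std::function<bool(int)> try_bf_abort) {
    Waiter w;
    std::thread bg(background_retry, std::ref(w), timeout, try_bf_abort);
    {
        std::unique_lock<std::mutex> lk(w.m);
        w.cv.wait(lk, [&] { return w.granted; });
    }
    bg.join();
    return w.retries;
}
```

For example, calling `run_applier` with a 5 ms timeout and a callback that only succeeds on its third attempt returns after three background retries, while the applier thread itself never abandons the wait.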
How can this PR be tested?
The PR contains an MTR test that exercises the functionality.
Basing the PR against the correct MariaDB version
- [x] This is a new feature and the PR is based against the latest MariaDB development branch
- [ ] This is a bug fix and the PR is based against the earliest branch in which the bug can be reproduced
Backward compatibility
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.
:white_check_mark: sjaakola
:x: temeo
You have signed the CLA already but the status is still pending? Let us recheck it.
This is not exactly related to MDEV-29684, which is a real bug in 10.4. It is more closely related to https://jira.mariadb.org/browse/MDEV-29496
Closing as rejected.