MDEV-33133: MDL conflict handling code should skip BF-aborted trxs
- [x] The Jira issue number for this PR is: MDEV-33133
Description
This is a backport from the 10.6 branch. The issue does not reproduce on 10.4, but the fix is worth having there together with the MTR test.
The MDL conflict handling code can be called more than once for the same transaction when:
- it holds more than one conflicting MDL lock
- reschedule_waiters() is executed, which results in repeated attempts to BF-abort an already aborted transaction. In such situations, the BF-abort logic may see a partially rolled back transaction and make wrong decisions about what to do with it next.
The specific situation tested and fixed is an SR transaction that is being applied on the node and gets BF-aborted by a TOI operation that has just started. The transaction is then caught with the server transaction already rolled back, but with none of its MDL locks released yet. This caused its state to be detected incorrectly when the MDL conflict handling code ran again.
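The idea of the fix can be illustrated with a minimal, self-contained sketch. This is not the patch itself: `Trx`, `TrxState`, `handle_mdl_conflict()` and `run_conflicts()` are hypothetical names invented for this illustration and do not correspond to server or wsrep-lib identifiers. It only models a conflict handler that is invoked once per conflicting MDL lock and skips a victim that has already been BF-aborted.

```cpp
#include <cassert>

// Hypothetical, simplified model. None of these names are real
// MariaDB or wsrep-lib identifiers.
enum class TrxState { Executing, Aborting, RolledBack };

struct Trx {
    TrxState state = TrxState::Executing;
    bool bf_aborted = false;  // set by the first BF-abort of this trx
    int mdl_locks_held = 0;   // locks stay held until rollback finishes
    int abort_count = 0;      // how many times the abort logic really ran
};

// Returns true if this call performed the BF-abort, false if it was skipped.
bool handle_mdl_conflict(Trx& victim) {
    if (victim.bf_aborted) {
        // Already BF-aborted by an earlier call: skip it instead of
        // inspecting a transaction that may be only partially rolled back.
        return false;
    }
    victim.bf_aborted = true;
    victim.state = TrxState::Aborting;
    ++victim.abort_count;
    return true;
}

// Simulates the handler being invoked once per conflicting MDL lock,
// with the rollback completing between calls while locks are still held.
int run_conflicts(Trx& victim) {
    int performed = 0;
    for (int i = 0; i < victim.mdl_locks_held; ++i) {
        if (handle_mdl_conflict(victim)) {
            ++performed;
        }
        victim.state = TrxState::RolledBack;  // locks not yet released
    }
    return performed;
}
```

With a victim holding two conflicting locks, `run_conflicts()` invokes the handler twice, but only the first call performs the abort; the second call sees `bf_aborted` and returns without touching the already rolled back transaction.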
How can this PR be tested?
An MTR test is provided.
Basing the PR against the correct MariaDB version
- [ ] This is a new feature and the PR is based against the latest MariaDB development branch.
- [ ] This is a bug fix and the PR is based against the earliest maintained branch in which the bug can be reproduced.
PR quality check
- [x] I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
- [x] For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.
Thanks, the fix has been merged with the head revision: https://github.com/MariaDB/server/commit/235f33e3606b79c5e3b75f4cfd1ca6d92320e9a2