MDEV-33133: MDL conflict handling code should skip BF-aborted trxs

Open denis-protivensky opened this issue 1 year ago • 1 comments

  • [x] The Jira issue number for this PR is: MDEV-33133

Description

This is a backport from the 10.6 branch. The issue doesn't reproduce on 10.4, but the fix is good to have along with the MTR test.

It's possible that the MDL conflict handling code is called more than once for a transaction when:

  • it holds more than one conflicting MDL lock
  • reschedule_waiters() is executed

This results in repeated attempts to BF-abort an already aborted transaction. In such situations, the BF-aborting logic may see a partially rolled back transaction and erroneously decide on future actions for it.

The specific situation tested and fixed is when an SR (streaming replication) transaction being applied on the node gets BF-aborted by a TOI operation that has started. The repeated conflict handling call then catches it with the server transaction already rolled back, but with its MDL locks not yet released. This caused wrong state detection for such a transaction during the repeated execution of the MDL conflict handling code.
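
The idea of the fix can be illustrated with a minimal sketch. It is not the code from this PR: `Trx`, `TrxState`, `is_bf_aborted()`, `handle_mdl_conflict()` and `resolve_conflicts()` are hypothetical, simplified stand-ins and not MariaDB or wsrep identifiers. It only assumes that the conflict handler may run once per conflicting MDL lock and shows a guard that skips a victim which has already been BF-aborted.

```cpp
// Hypothetical, simplified model; these are NOT the real MariaDB/wsrep types.
enum class TrxState { Executing, Aborting, RolledBack };

struct Trx {
    TrxState state = TrxState::Executing;
    int held_mdl_locks = 0;  // conflicting MDL locks the victim still holds
    int abort_requests = 0;  // number of BF-aborts actually issued
};

// Assumption: any state other than Executing means the victim was already
// BF-aborted, including "rolled back but MDL locks not yet released".
bool is_bf_aborted(const Trx& victim) {
    return victim.state != TrxState::Executing;
}

// Conflict handler for one conflicting lock.
// Returns true only if this call issued a BF-abort.
bool handle_mdl_conflict(Trx& victim) {
    if (is_bf_aborted(victim)) {
        return false;  // skip: do not re-evaluate a partially rolled back trx
    }
    victim.state = TrxState::Aborting;
    ++victim.abort_requests;
    return true;
}

// Simulates the handler being invoked once per conflicting MDL lock.
// Returns how many BF-aborts were issued in total.
int resolve_conflicts(Trx& victim) {
    int issued = 0;
    for (int i = 0; i < victim.held_mdl_locks; ++i) {
        if (handle_mdl_conflict(victim)) {
            ++issued;
        }
    }
    return issued;
}
```

In this model a victim holding three conflicting locks is aborted once, and the remaining two handler calls return early instead of inspecting a transaction that is already on its way out.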

How can this PR be tested?

MTR test is provided.

Basing the PR against the correct MariaDB version

  • [ ] This is a new feature and the PR is based against the latest MariaDB development branch.
  • [ ] This is a bug fix and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • [x] I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • [x] For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

denis-protivensky avatar Jan 18 '24 09:01 denis-protivensky

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

CLAassistant avatar Jan 18 '24 09:01 CLAassistant

Thanks, the fix has been merged with the head revision: https://github.com/MariaDB/server/commit/235f33e3606b79c5e3b75f4cfd1ca6d92320e9a2

sysprg avatar Sep 02 '24 03:09 sysprg