MDEV-32633: Fix Galera cluster <-> native replication interaction

Open denis-protivensky opened this issue 11 months ago • 1 comments

  • [x] The Jira issue number for this PR is: MDEV-32633

Description

Galera multi-cluster setups can be connected through native replication when every Galera cluster is configured with a separate domain ID. For such a setup to work, the domain ID in each generated GTID event must be replaced, when the event is written at transaction commit, with the value configured for Wsrep replication.

At the same time, a GTID event may already contain the correct domain ID if it arrives through native replication from another Galera cluster. In this case, when such an event is applied either by a native replication slave thread or by a Wsrep applier, we write the GTID event at transaction start and avoid writing it again at transaction commit.
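The two cases above can be summarized in a small model. This is only an illustrative sketch, not the server code: `Gtid`, `plan_gtid_write`, and the `applied_from_replication` flag are hypothetical names, and the real logic lives in the server's binary logging and Wsrep code paths.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Gtid:
    domain_id: int
    server_id: int
    seq_no: int


def plan_gtid_write(gtid: Gtid,
                    applied_from_replication: bool,
                    wsrep_domain_id: int) -> Tuple[str, Gtid]:
    """Return (when the GTID event is written, which GTID is written).

    applied_from_replication is a hypothetical flag: True when the GTID
    event arrived via native replication from another cluster and is being
    applied by a slave thread or a Wsrep applier.
    """
    if applied_from_replication:
        # The event already carries the correct domain ID: write it as-is
        # at transaction start and skip writing it at commit.
        return ("start", gtid)
    # Locally generated event: written at commit, with the domain ID
    # replaced by the value configured for Wsrep replication.
    return ("commit", Gtid(wsrep_domain_id, gtid.server_id, gtid.seq_no))


local = plan_gtid_write(Gtid(0, 1, 42), False, wsrep_domain_id=100)
remote = plan_gtid_write(Gtid(200, 5, 7), True, wsrep_domain_id=100)
print(local)
print(remote)
```

In this model an event is written exactly once: either at start with its original domain ID, or at commit with the cluster's configured domain ID.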

The code contained multiple problems, which are now fixed:

  • Applying GTID events didn't work because a GTID event is applied without a running server transaction, and the Wsrep transaction was not started.
  • The GTID event generated at transaction start lacked the "standalone" and "is_transactional" flags that the original applied GTID event carried.
  • The condition for deciding that the GTID event was already written at transaction start (and so must not be written again at commit) relied on the GTID event being the first event found in the transaction/statement caches. That was not always the case, which resulted in duplicate GTID events being written.
  • Instead of searching the caches for a GTID event, a simple check is introduced that follows the exact rules described above for deciding whether the event is written at transaction start.
  • The test case is improved to check the exact GTID events applied after two Galera clusters have synced.
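To illustrate why the cache-based condition could produce duplicates, here is a simplified model. All names (`GtidEvent`, `cache_based_check`, `events_logged`, the `"other event"` placeholder) are hypothetical and do not correspond to actual server functions; the model only assumes, as described above, that the GTID event is not always the first entry in the cache.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GtidEvent:
    domain_id: int
    seq_no: int
    standalone: bool
    is_transactional: bool


def cache_based_check(cache: List[object]) -> bool:
    # Old-style condition: "the GTID was written at start" is inferred
    # from the first cached event being a GTID event.
    return bool(cache) and isinstance(cache[0], GtidEvent)


def events_logged(cache: List[object], written_at_start: bool,
                  use_cache_check: bool) -> List[object]:
    # Decide whether a GTID event must still be written at commit.
    skip = cache_based_check(cache) if use_cache_check else written_at_start
    commit_gtid = GtidEvent(1, 99, standalone=False, is_transactional=True)
    prefix = [] if skip else [commit_gtid]
    return prefix + list(cache)


def count_gtids(events: List[object]) -> int:
    return sum(isinstance(e, GtidEvent) for e in events)


# The applied GTID was written at start, but it is not first in the cache.
applied = GtidEvent(2, 7, standalone=False, is_transactional=True)
cache = ["other event", applied]

old = count_gtids(events_logged(cache, written_at_start=True, use_cache_check=True))
new = count_gtids(events_logged(cache, written_at_start=True, use_cache_check=False))
print(old, new)
```

With the cache-based condition the model logs two GTID events for one transaction; with the explicit rule it logs one.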

Release Notes

This fix applies to version 10.4 only.

How can this PR be tested?

The previously failing MTR test has been re-enabled.

Basing the PR against the correct MariaDB version

  • [ ] This is a new feature and the PR is based against the latest MariaDB development branch.
  • [x] This is a bug fix and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • [x] I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • [x] For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

denis-protivensky, Mar 06 '24 15:03

Thanks, the fix has been merged into the head revision, but only for versions 10.5 and later, since version 10.4 is no longer in active development: https://github.com/MariaDB/server/commit/0cc9b49751fc86f7942e19fe6ed8f7227f847c02 https://github.com/MariaDB/server/commit/a4838721a252ea5570b5fed9ab56e38e5b234865 https://github.com/MariaDB/server/commit/c21aa486a86d9db29ad16ff279380969443df00f

sysprg, Jun 04 '24 02:06