MDEV-11675 fixup: MDEV-35474 Start Alter GTID Error Message Can Use Wrong Server_Id

Open ParadoxV5 opened this issue 11 months ago • 1 comments

  • [x] The Jira issue number for this PR is: MDEV-35474

Description

Store a whole GTID in start_alter_info entries.

While Two-Phase ALTER only cares about the domain_id and seq_no, logs/warnings that show full GTIDs also need the ALTER’s server_id.

What problem is the patch trying to solve?

The Two-Phase ALTER work (MDEV-11675) left an error message whose GTID report was incomplete, out of order, and marked TODO. The fix https://github.com/MariaDB/server/pull/3518/commits/8c18af91bf4bfe58d7ac02e04d0648af75d6886f then filled it in incorrectly with the current primary server_id, which isn’t necessarily the server_id of the ALTER (the two can differ depending on the replication setup or server_id configuration).

Do you think this patch might introduce side-effects in other parts of the server?

Instead of adding a struct start_alter_info::server_id member, I decided to replace its current sa_seq_no & domain_id members with a whole GTID (inline-memory) member. I find this design structurally more intuitive.

I normally wouldn’t refactor APIs in a bug fix, but it looks like only Two-Phase ALTER uses this struct, so I made an exception. Of course, this would still conflict with colleagues’ related work, if any.
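To illustrate the design choice, here is a minimal, self-contained sketch. It is not the actual patch: the `Gtid` and `StartAlterInfo` types, the member name `gtid`, and both formatting helpers are hypothetical stand-ins for the server's real definitions. It only shows why keeping the whole GTID in the entry lets a message print the ALTER's own server_id instead of borrowing the current one.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the server's GTID type.
// MariaDB prints GTIDs as domain_id-server_id-seq_no.
struct Gtid {
  uint32_t domain_id;
  uint32_t server_id;
  uint64_t seq_no;
};

// Hypothetical simplified entry: the whole GTID is stored inline,
// replacing separate sa_seq_no and domain_id members.
struct StartAlterInfo {
  Gtid gtid;
};

// Formats the GTID exactly as stored, so the ALTER's own server_id is reported.
std::string format_gtid(const Gtid &g) {
  return std::to_string(g.domain_id) + "-" + std::to_string(g.server_id) +
         "-" + std::to_string(g.seq_no);
}

// Models the earlier behavior for comparison: only domain_id and seq_no come
// from the entry, and the server_id is taken from the current server.
std::string format_with_current_server_id(const StartAlterInfo &info,
                                          uint32_t current_server_id) {
  return std::to_string(info.gtid.domain_id) + "-" +
         std::to_string(current_server_id) + "-" +
         std::to_string(info.gtid.seq_no);
}

// Example: an ALTER that originated on server 1, reported while the
// current server_id is 2.
inline void example() {
  StartAlterInfo info{Gtid{0, 1, 100}};
  assert(format_gtid(info.gtid) == "0-1-100");
  assert(format_with_current_server_id(info, 2) == "0-2-100");
}
```

Matching a Start ALTER with its commit or rollback can keep comparing only `gtid.domain_id` and `gtid.seq_no`; the extra `server_id` is carried along solely for logs and warnings.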

Release Notes

(We can nest this under #3518.) Fixed the GTID in the warning message shown when replication encounters a two-phase ALTER query that failed to complete on the primary.

How can this PR be tested?

MDEV-11675 includes the test rpl_start_alter_restart_master for testing this “failed to complete” behavior. I included the transaction server ID in its warning suppression to check for the current session server ID.

Basing the PR against the correct MariaDB version

  • [ ] This is a new feature or a refactoring, and the PR is based against the main branch.
  • [x] This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • [x] I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • [x] For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

ParadoxV5 · Nov 25 '24 23:11