MDEV-21322: Report slave progress to the master
This PR presents a patch to extend the command SHOW REPLICA HOSTS with three columns:
1) Gtid_State_Sent. This represents the latest GTIDs sent to the replica in each domain. It is always populated, regardless of the semi-sync status (i.e. asynchronous connections still update this column with the latest GTID state sent to the replica).
2) Gtid_State_Ack. For semi-synchronous connections only, this column represents the last GTID in each domain that the replica has acknowledged.
3) Sync_Status. This value represents the synchronization status of the replica, and helps determine how to interpret the Gtid_State_Ack column. There are four possible values:
   3.1) Initializing: The binlog dump thread is still initializing, and has not yet determined the synchronization status of the replica.
   3.2) Asynchronous: The replica is not configured for semi-sync replication, so Gtid_State_Ack should always be empty.
   3.3) Semi-sync Stale: The replica is configured for semi-sync replication, but connected using an old state, and is not yet able to send ACKs for new transactions. Functionally, this means that the primary will try to bring the replica up to date by sending transactions which will not be ACKed. The value shown by Gtid_State_Ack will be empty until the replica catches up and ACKs its first transaction.
   3.4) Semi-sync Active: The replica is configured for semi-sync replication, and is sending ACKs for the new transactions it receives. Gtid_State_Ack can still be empty while Sync_Status is "Semi-sync Active" if no new transactions have been executed on the primary since the replica connected.
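To make the relationship between Sync_Status and an empty Gtid_State_Ack concrete, here is a minimal C++ sketch of how a monitoring client might interpret the two columns together. The enum, the function names, and the explanatory strings are hypothetical and are not taken from the patch; only the four display values come from the description above.

```cpp
#include <string>

// Hypothetical enum; the identifiers used in the actual patch may differ.
enum class SyncStatus
{
  Initializing,
  Asynchronous,
  SemiSyncStale,
  SemiSyncActive
};

// Display strings as listed for the Sync_Status column.
inline const char *sync_status_name(SyncStatus s)
{
  switch (s)
  {
  case SyncStatus::Initializing:   return "Initializing";
  case SyncStatus::Asynchronous:   return "Asynchronous";
  case SyncStatus::SemiSyncStale:  return "Semi-sync Stale";
  case SyncStatus::SemiSyncActive: return "Semi-sync Active";
  }
  return "Unknown";
}

// Explain what a (possibly empty) Gtid_State_Ack value means for a status.
// The wording of the explanations is illustrative only.
inline std::string interpret_ack(SyncStatus s, const std::string &gtid_state_ack)
{
  if (!gtid_state_ack.empty())
    return "last acknowledged GTID state: " + gtid_state_ack;

  switch (s)
  {
  case SyncStatus::Initializing:
    return "dump thread still initializing";
  case SyncStatus::Asynchronous:
    return "not semi-sync, no ACKs expected";
  case SyncStatus::SemiSyncStale:
    return "replica catching up, no ACK yet";
  case SyncStatus::SemiSyncActive:
    return "no new transactions since the replica connected";
  }
  return "unknown status";
}
```

The point of the sketch is that an empty Gtid_State_Ack is ambiguous on its own, and Sync_Status is what disambiguates it.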
Additionally, this patch gives the configuration rpl_semi_sync_master_timeout=0 a new meaning. When the timeout is 0, 1) new transactions will not wait for an ACK before completing, and 2) the primary will still request ACKs from the replica for new transactions. This means that Gtid_State_Ack will be updated for each ACK from the replica and Sync_Status will read as "Semi-sync Active". Effectively, this creates a mode that mimics the behavior of an asynchronous connection, while still allowing one to monitor how far the primary has progressed in sending transactions to the replica via the new columns Gtid_State_Sent and Gtid_State_Ack.
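The new timeout semantic can be summarized as two independent decisions: whether the primary requests an ACK, and whether the committing transaction waits for it. The sketch below is only an illustration of that rule; `AckPolicy` and `ack_policy` are hypothetical names and do not appear in the patch.

```cpp
// Hypothetical helper types, for illustration only.
struct AckPolicy
{
  bool request_ack;   // primary asks the replica to ACK the transaction
  bool wait_for_ack;  // committing transaction blocks until ACK or timeout
};

// timeout_ms stands in for the value of rpl_semi_sync_master_timeout.
inline AckPolicy ack_policy(bool replica_is_semi_sync, unsigned long timeout_ms)
{
  AckPolicy policy;
  // Asynchronous replicas are never asked for ACKs.
  policy.request_ack = replica_is_semi_sync;
  // With a timeout of 0, ACKs are still requested but never waited for.
  policy.wait_for_ack = replica_is_semi_sync && timeout_ms != 0;
  return policy;
}
```

With `ack_policy(true, 0)` the replica is still asked for ACKs, so Gtid_State_Ack keeps advancing, but no transaction blocks on them.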
Also note that a new error message was added for the case where Gtid_State_Sent or Gtid_State_Ack refers to a binary log file that has been purged or cannot be found.
The overall implementation is rather simple. It leverages the existing semi-sync framework, where the replica uses binlog file:pos to ACK transactions, and infers the GTID state by performing a binlog lookup at the time SHOW REPLICA HOSTS is executed. In particular, the Slave_info struct is extended to store 1) the binlog file:pos pair of the transaction which was last sent to the replica, 2) the binlog file:pos pair that was last ACKed by the replica, and 3) an enum to represent the Sync_Status.
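As a rough sketch of the design described above, the per-replica state can be thought of as two file:pos pairs plus a status enum, with the GTID lookup deferred until the row is built. None of the names below are taken from the patch: `BinlogPos`, `SlaveInfoProgress`, `GtidLookup`, and `build_row` are hypothetical, the lookup is passed in as a callback rather than implemented, and treating an empty file name as "nothing recorded yet" is an assumption of this sketch.

```cpp
#include <cstdint>
#include <functional>
#include <string>

// A binlog coordinate, as used by semi-sync ACKs.
struct BinlogPos
{
  std::string file;
  uint64_t pos;
};

enum class SyncStatus { Initializing, Asynchronous, SemiSyncStale, SemiSyncActive };

// Hypothetical stand-in for the fields added to Slave_info.
struct SlaveInfoProgress
{
  BinlogPos last_sent{"", 0};   // last transaction sent to the replica
  BinlogPos last_acked{"", 0};  // last transaction ACKed by the replica
  SyncStatus sync_status= SyncStatus::Initializing;
};

inline void record_sent(SlaveInfoProgress &info, const std::string &file, uint64_t pos)
{
  info.last_sent.file= file;
  info.last_sent.pos= pos;
}

inline void record_ack(SlaveInfoProgress &info, const std::string &file, uint64_t pos)
{
  info.last_acked.file= file;
  info.last_acked.pos= pos;
}

// Placeholder for the binlog lookup that maps file:pos to a GTID state.
using GtidLookup= std::function<std::string(const BinlogPos &)>;

struct ProgressRow
{
  std::string gtid_state_sent;
  std::string gtid_state_ack;
};

// Resolve GTID states lazily, only when the row is requested.
// An empty file name is treated as "nothing recorded", leaving the column empty.
inline ProgressRow build_row(const SlaveInfoProgress &info, const GtidLookup &lookup)
{
  ProgressRow row;
  if (!info.last_sent.file.empty())
    row.gtid_state_sent= lookup(info.last_sent);
  if (!info.last_acked.file.empty())
    row.gtid_state_ack= lookup(info.last_acked);
  return row;
}
```

Storing only file:pos on the hot path keeps the send and ACK paths cheap; the cost of translating a position into a GTID state is paid only when someone runs SHOW REPLICA HOSTS.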
This patch was started by @JackSlateur in PR#1427, then transferred to @an3l, who buffed it out in PR#2374, and the final touches were put on by @bnestere.