orchestrator How does The DelayMasterPromotionIfSQLThreadNotUpToDate parameter work ?

Hi, I want to test how orchestrator works when the master is down and all the slaves lagged behand master；

I found that the orchestrator will stop io_thread and sql_thread on candidate replica,so it will never catch up. Could you tell me why, I would be very grateful!

here is my env:

version

orchestrator ：3.2.6 MySQL：MySQL Community Server (GPL) 5.6.36-log（one master with two slaves）

orchestrator main config parameters

{
  "ReplicationLagQuery": "",
  "FailMasterPromotionOnLagMinutes": 0,
  "FailMasterPromotionIfSQLThreadNotUpToDate": false,
  "DelayMasterPromotionIfSQLThreadNotUpToDate": true,
  "ReasonableReplicationLagSeconds": 300,
  "RecoveryPeriodBlockSeconds": 3600,
  "RecoveryIgnoreHostnameFilters": [],
  "RecoverMasterClusterFilters": [
    "*"
  ],
  "RecoverIntermediateMasterClusterFilters": [
    "*"
  ],
}

test steps

use sysbench for stress testing;
stop all slaves ( stop slave sql_thread);
wait 3 minutes (within ReasonableReplicationLagSeconds)
start all slaves(start slave sql_thread ,then all slaves was lagged behand master);
shutdown master;
orchestrator promotion one slave to be ideal candidate;

orchestrator recover log

2022-03-01 10:26:04	will handle DeadMaster event on dbpnew84v:3100
2022-03-01 10:26:04	Running 1 PreFailoverProcesses hooks
2022-03-01 10:26:04	Running PreFailoverProcesses hook 1 of 1: 
2022-03-01 10:26:04	Completed PreFailoverProcesses hook 1 of 1 in 10.76651ms
2022-03-01 10:26:04	done running PreFailoverProcesses hooks
2022-03-01 10:26:04	RecoverDeadMaster: will recover dbpnew84v:3100
2022-03-01 10:26:04	RecoverDeadMaster: masterRecoveryType=MasterRecoveryGTID
2022-03-01 10:26:04	RecoverDeadMaster: regrouping replicas via GTID
2022-03-01 10:26:15	RecoverDeadMaster: promotedReplicaIsIdeal(dbpnew85v:3100)
2022-03-01 10:26:16	RecoverDeadMaster: found dbpnew85v:3100 to be ideal candidate; will optimize recovery
2022-03-01 10:26:16	RecoverDeadMaster: 1 postponed functions
2022-03-01 10:26:16	promoted replica: dbpnew85v:3100
2022-03-01 10:26:16	RecoverDeadMaster: promoted replica lag seconds: 0
2022-03-01 10:26:16	RecoverDeadMaster: promoted replica sql thread up-to-date: false
2022-03-01 10:26:16	DelayMasterPromotionIfSQLThreadNotUpToDate: waiting for SQL thread on dbpnew85v:3100
2022-03-01 10:31:16	DelayMasterPromotionIfSQLThreadNotUpToDate error: 2022-03-01 10:31:16 ERROR WaitForSQLThreadUpToDate stale coordinates timeout on dbpnew85v:3100 after duration 5m0s
2022-03-01 10:31:16	Running 1 PostUnsuccessfulFailoverProcesses hooks
2022-03-01 10:31:16	Running PostUnsuccessfulFailoverProcesses hook 1 of 1:
2022-03-01 10:31:16	Completed PostUnsuccessfulFailoverProcesses hook 1 of 1 in 14.579948ms
2022-03-01 10:31:16	done running PostUnsuccessfulFailoverProcesses hooks
2022-03-01 10:31:17	Waiting for 1 postponed functions
2022-03-01 10:31:17	Executed 1 postponed functions
2022-03-01 10:31:17	Executed postponed functions: regroup-replicas-gtid dbpnew85v:3100

dbpnew85v general log

 866 Connect   [email protected] on 
                  866 Query     stop slave io_thread
                  866 Query     start slave sql_thread
                  866 Query     stop slave

Mar 01 '22 03:03 hellogitee

好像确实有bug

recoverDeadMaster 会调用 RegroupReplicasGTID, RegroupReplicasGTID-> GetCandidateReplica-> sortedReplicasDataCenterHint-> StopReplicas->StopReplicationNicely

StopReplicationNicely会先WaitForSQLThreadUpToDate. 这里WaitForSQLThreadUpToDate超时是InstanceBulkOperationsWaitTimeoutSeconds默认10s 所以你的情况是InstanceBulkOperationsWaitTimeoutSeconds秒超时了. 然后由于超时, err!=nil, 所以return 但是StopReplicas没做任何判断, 还是会执行StopReplication stop slave

到最后checkAndRecoverDeadMaster会执行下面的代码

		if config.Config.DelayMasterPromotionIfSQLThreadNotUpToDate && !promotedReplica.SQLThreadUpToDate() {
			AuditTopologyRecovery(topologyRecovery, fmt.Sprintf("DelayMasterPromotionIfSQLThreadNotUpToDate: waiting for SQL thread on %+v", promotedReplica.Key))
			if _, err := inst.WaitForSQLThreadUpToDate(&promotedReplica.Key, 0, 0); err != nil {
				return nil, fmt.Errorf("DelayMasterPromotionIfSQLThreadNotUpToDate error: %+v", err)
			}
			AuditTopologyRecovery(topologyRecovery, fmt.Sprintf("DelayMasterPromotionIfSQLThreadNotUpToDate: SQL thread caught up on %+v", promotedReplica.Key))
		}

这就蛋疼了, sql_thread已经停了. 所以不管ReasonableReplicationLagSeconds设置多大这个 WaitForSQLThreadUpToDate 最终还是会超时. 这也是下面这条日志输出的原因

DelayMasterPromotionIfSQLThreadNotUpToDate error: 2022-03-01 10:31:16 ERROR WaitForSQLThreadUpToDate stale coordinates timeout on dbpnew85v:3100 after duration 5m0s

May 03 '22 08:05 Fanduzi

graph TD;
S(recoverDeadMaster) --> A
A(RegroupReplicasGTID) --> A1(GetCandidateReplica)
A1 --> B(sortedReplicasDataCenterHint)
B --> B1(StopReplicas<br><并发执行>)
B1 --> |<1>| C(StopReplicationNicely)
C --> C1(stop slave io_thread)
C --> C2(stop slave sql_thread)
C1 --> D
C2 --> D
D(instance.SQLDelay == 0<br><不是延时从库>) --> |timeout=InstanceBulkOperationsWaitTimeoutSeconds=10s| D1(WaitForSQLThreadUpToDate)
D1 --> |由于延时比较大,执行慢,在10s内无法触达SQLThreadUpToDate| E(case <-generalTimer.C<br>触发了10s超时)
E --> |<2>| B1
B1 --> |<3><br>此时io_thread和sql_thread均为OFF| F(StopReplication<br>当StopReplicationNicely返回error时,此时orc认为没关闭成功)
F --> F1(stop slave<br>对所有副本又执行了一遍)
S -->  G(Expr:DelayMasterPromotionIfSQLThreadNotUpToDate&&!promotedReplica.SQLThreadUpToDate)
G --> G1(此时io_thread和sql_thread均为OFF)
G1 --> |原实现| H(WaitForSQLThreadUpToDate)
H --> H1(触发ReasonableReplicationLagSeconds超时,无论该参数设置多大)
G1 --> |新实现| I(StartAndWaitForSQLThreadUpToDate)
I --> |start slave sql_thread| I1(WaitForSQLThreadUpToDate)

May 18 '22 06:05 lazzyfu

我用的orchestrator-3.2.6 版本，在代码里没有搜到你说的 "新实现 StartAndWaitForSQLThreadUpToDate" 这个函数，只有原实现WaitForSQLThreadUpToDate函数，你是用的什么版本？怎么解决的啊？

Jul 05 '22 08:07 yangzan66

我用的orchestrator-3.2.6 版本，在代码里没有搜到你说的 "新实现 StartAndWaitForSQLThreadUpToDate" 这个函数，只有原实现WaitForSQLThreadUpToDate函数，你是用的什么版本？怎么解决的啊？

调整逻辑

在执行WaitForSQLThreadUpToDate时，先执行start slave sql_thread

代码贡献

这是我们内部的实现，代码来自@Fanduzi 贡献

改动代码如下：

新封装一个方法StartAndWaitForSQLThreadUpToDate

// StartAndWaitForSQLThreadUpToDate starts a replica
// It will actually START the sql_thread even if the replica is completely stopped.
// And will wait for sql thread up to date
func StartAndWaitForSQLThreadUpToDate(instanceKey *InstanceKey, overallTimeout time.Duration, staleCoordinatesTimeout time.Duration) (*Instance, error) {
    instance, err := ReadTopologyInstance(instanceKey)
    if err != nil {
        return instance, log.Errore(err)
    }
 
    if !instance.ReplicationThreadsExist() {
        return instance, fmt.Errorf("instance is not a replica: %+v", instanceKey)
    }
 
    // start sql_thread but catch any errors
    for _, cmd := range []string{`start slave sql_thread`} {
        if _, err := ExecInstance(instanceKey, cmd); err != nil {
            return nil, log.Errorf("%+v: StartReplicationAndWaitForSQLThreadUpToDate: '%q' failed: %+v", *instanceKey, cmd, err)
        }
    }
 
    if instance.SQLDelay == 0 {
        // Otherwise we don't bother.
        if instance, err = WaitForSQLThreadUpToDate(instanceKey, overallTimeout, staleCoordinatesTimeout); err != nil {
            return instance, err
        }
    }
 
    instance, err = ReadTopologyInstance(instanceKey)
    log.Infof("Started replication and wait for sql_thread up to date on %+v, Read:%+v, Exec:%+v", *instanceKey, instance.ReadBinlogCoordinates, instance.ExecBinlogCoordinates)
    return instance, err
}

引用替换

if config.Config.DelayMasterPromotionIfSQLThreadNotUpToDate && !promotedReplica.SQLThreadUpToDate() {
            AuditTopologyRecovery(topologyRecovery, fmt.Sprintf("DelayMasterPromotionIfSQLThreadNotUpToDate: waiting for SQL thread on %+v", promotedReplica.Key))
            if _, err := inst.StartAndWaitForSQLThreadUpToDate(&promotedReplica.Key, 0, 0); err != nil {
                return nil, fmt.Errorf("DelayMasterPromotionIfSQLThreadNotUpToDate error: %+v", err)
            }
            AuditTopologyRecovery(topologyRecovery, fmt.Sprintf("DelayMasterPromotionIfSQLThreadNotUpToDate: SQL thread caught up on %+v", promotedReplica.Key))
        }
 
 
if config.Config.DelayMasterPromotionIfSQLThreadNotUpToDate {
            AuditTopologyRecovery(topologyRecovery, fmt.Sprintf("Waiting to ensure the SQL thread catches up on %+v", promotedReplica.Key))
            if _, err := inst.StartAndWaitForSQLThreadUpToDate(&promotedReplica.Key, 0, 0); err != nil {
                return promotedReplica, lostReplicas, err
            }
            AuditTopologyRecovery(topologyRecovery, fmt.Sprintf("SQL thread caught up on %+v", promotedReplica.Key))
        }

完事后，你自己在重新打包下即可

Jul 05 '22 08:07 lazzyfu

我用的orchestrator-3.2.6 版本，在代码里没有搜到你说的 "新实现 StartAndWaitForSQLThreadUpToDate" 这个函数，只有原实现WaitForSQLThreadUpToDate函数，你是用的什么版本？怎么解决的啊？

调整逻辑

在执行WaitForSQLThreadUpToDate时，先执行start slave sql_thread

代码贡献

这是我们内部的实现，代码来自@Fanduzi 贡献

改动代码如下：

新封装一个方法StartAndWaitForSQLThreadUpToDate

// StartAndWaitForSQLThreadUpToDate starts a replica
// It will actually START the sql_thread even if the replica is completely stopped.
// And will wait for sql thread up to date
func StartAndWaitForSQLThreadUpToDate(instanceKey *InstanceKey, overallTimeout time.Duration, staleCoordinatesTimeout time.Duration) (*Instance, error) {
    instance, err := ReadTopologyInstance(instanceKey)
    if err != nil {
        return instance, log.Errore(err)
    }
 
    if !instance.ReplicationThreadsExist() {
        return instance, fmt.Errorf("instance is not a replica: %+v", instanceKey)
    }
 
    // start sql_thread but catch any errors
    for _, cmd := range []string{`start slave sql_thread`} {
        if _, err := ExecInstance(instanceKey, cmd); err != nil {
            return nil, log.Errorf("%+v: StartReplicationAndWaitForSQLThreadUpToDate: '%q' failed: %+v", *instanceKey, cmd, err)
        }
    }
 
    if instance.SQLDelay == 0 {
        // Otherwise we don't bother.
        if instance, err = WaitForSQLThreadUpToDate(instanceKey, overallTimeout, staleCoordinatesTimeout); err != nil {
            return instance, err
        }
    }
 
    instance, err = ReadTopologyInstance(instanceKey)
    log.Infof("Started replication and wait for sql_thread up to date on %+v, Read:%+v, Exec:%+v", *instanceKey, instance.ReadBinlogCoordinates, instance.ExecBinlogCoordinates)
    return instance, err
}

引用替换

if config.Config.DelayMasterPromotionIfSQLThreadNotUpToDate && !promotedReplica.SQLThreadUpToDate() {
            AuditTopologyRecovery(topologyRecovery, fmt.Sprintf("DelayMasterPromotionIfSQLThreadNotUpToDate: waiting for SQL thread on %+v", promotedReplica.Key))
            if _, err := inst.StartAndWaitForSQLThreadUpToDate(&promotedReplica.Key, 0, 0); err != nil {
                return nil, fmt.Errorf("DelayMasterPromotionIfSQLThreadNotUpToDate error: %+v", err)
            }
            AuditTopologyRecovery(topologyRecovery, fmt.Sprintf("DelayMasterPromotionIfSQLThreadNotUpToDate: SQL thread caught up on %+v", promotedReplica.Key))
        }
 
 
if config.Config.DelayMasterPromotionIfSQLThreadNotUpToDate {
            AuditTopologyRecovery(topologyRecovery, fmt.Sprintf("Waiting to ensure the SQL thread catches up on %+v", promotedReplica.Key))
            if _, err := inst.StartAndWaitForSQLThreadUpToDate(&promotedReplica.Key, 0, 0); err != nil {
                return promotedReplica, lostReplicas, err
            }
            AuditTopologyRecovery(topologyRecovery, fmt.Sprintf("SQL thread caught up on %+v", promotedReplica.Key))
        }

完事后，你自己在重新打包下即可

测试了，没有问题，感谢大佬！

Jul 06 '22 06:07 yangzan66

@Fanduzi 你好，请教您关于StartAndWaitForSQLThreadUpToDate 部分， if instance.SQLDelay == 0 { // Otherwise we don't bother. if instance, err = WaitForSQLThreadUpToDate(instanceKey, overallTimeout, staleCoordinatesTimeout); err != nil { return instance, err } }

为什么是instance.SQLDelay == 0 延迟等于0 的时候执行呢？我理解的这样在追延迟的时候说不起作用的？

Jun 18 '23 14:06 553589912

SQLDelay

这个SQLDelay我没记错的话指的是CHANGE MASTER TO中的MASTER_DELAY

Jun 19 '23 01:06 Fanduzi

是的，感谢老哥

Jun 24 '23 07:06 553589912

orchestrator orchestrator copied to clipboard

How does The DelayMasterPromotionIfSQLThreadNotUpToDate parameter work ?

here is my env:

version

orchestrator main config parameters

test steps

orchestrator recover log

dbpnew85v general log

调整逻辑

代码贡献

改动代码如下：

引用替换

调整逻辑

代码贡献

改动代码如下：

引用替换

orchestrator
orchestrator copied to clipboard