Switchover does not happen when killing the master pod
Environments
- Version: moco:0.10.9
- OS: Ubuntu 20.04.3 LTS
To Reproduce
Steps to reproduce the behavior:
- Kill the master pod
- The master role does not switch to an available replica; instead, the operator waits for the killed master pod to come back up and become available again
Expected behavior
The operator should switch the master role to a replica pod.
Additional context
Switchover works fine when the master pod is deleted instead of killed.
@slavonicsniper
Could you tell us how to reproduce the problem using kubectl?
@slavonicsniper Thank you for the report! I found that failover does not work in the following two cases. I will fix the first one.
1. Immediately after a failover or a switchover.
When a failover occurs, the moco-controller selects the most advanced replica based on each replica's Retrieved_Gtid_Set.
However, immediately after a failover or a switchover, Retrieved_Gtid_Set is blank.
Therefore, if another failover occurs in this state, the moco-controller cannot select a replica to promote, and the failover fails.
2. When MySQLCluster is empty.
In this case, the moco-controller judges the cluster status as Lost.
This decision is unavoidable given the clustering specification:
https://github.com/cybozu-go/moco/blob/v0.10.9/docs/clustering.md#mysqlcluster
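To illustrate why case 1 fails, here is a minimal sketch in Go. It is not the moco-controller's actual code: the `replica` type, the `selectCandidate` and `countTransactions` functions, and the pod names are all hypothetical, and it assumes "most advanced" can be approximated by counting the transactions in a plain `uuid:interval[:interval]` GTID set. The point is only that when every replica reports a blank Retrieved_Gtid_Set, there is nothing to rank and no candidate can be chosen.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// replica is a hypothetical, simplified view of one replica's status.
type replica struct {
	Name             string
	RetrievedGtidSet string // e.g. "uuid:1-5:7"; blank right after a failover
}

// errNoCandidate is returned when no replica has retrieved any transaction.
var errNoCandidate = errors.New("no replica has a non-empty Retrieved_Gtid_Set")

// countTransactions returns the number of transactions in a GTID set string
// such as "uuid:1-5:7,uuid2:1-3". A blank string yields 0.
func countTransactions(set string) (int, error) {
	total := 0
	for _, part := range strings.Split(set, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		fields := strings.Split(part, ":")
		if len(fields) < 2 {
			return 0, fmt.Errorf("malformed GTID set %q", part)
		}
		// fields[0] is the source UUID; the rest are "N" or "N-M" intervals.
		for _, iv := range fields[1:] {
			bounds := strings.SplitN(iv, "-", 2)
			lo, err := strconv.Atoi(bounds[0])
			if err != nil {
				return 0, err
			}
			hi := lo
			if len(bounds) == 2 {
				if hi, err = strconv.Atoi(bounds[1]); err != nil {
					return 0, err
				}
			}
			if hi < lo {
				return 0, fmt.Errorf("invalid interval %q", iv)
			}
			total += hi - lo + 1
		}
	}
	return total, nil
}

// selectCandidate picks the replica with the most retrieved transactions.
// If every Retrieved_Gtid_Set is blank, it cannot choose and returns an error.
func selectCandidate(replicas []replica) (string, error) {
	best, bestCount := "", 0
	for _, r := range replicas {
		n, err := countTransactions(r.RetrievedGtidSet)
		if err != nil {
			return "", err
		}
		if n > bestCount {
			best, bestCount = r.Name, n
		}
	}
	if best == "" {
		return "", errNoCandidate
	}
	return best, nil
}

func main() {
	const uuid = "3e11fa47-71ca-11e1-9e33-c80aa9429562"

	// Normal case: the replicas have retrieved transactions from the primary.
	fmt.Println(selectCandidate([]replica{
		{Name: "moco-test-1", RetrievedGtidSet: uuid + ":1-10"},
		{Name: "moco-test-2", RetrievedGtidSet: uuid + ":1-8"},
	}))

	// Right after a failover or switchover: every set is blank.
	fmt.Println(selectCandidate([]replica{
		{Name: "moco-test-1"},
		{Name: "moco-test-2"},
	}))
}
```

In the second call no replica can be ranked, which corresponds to the situation where a second failover happens before any replica has retrieved new transactions.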
> @slavonicsniper Could you tell us how to reproduce the problem using kubectl?
By killing the master pod in k9s (Ctrl+K).
We have fixed the first case in https://github.com/cybozu-go/moco/issues/370#issuecomment-1055089091