
Switchover does not happen when killing the master pod

Open erikgalajda opened this issue 4 years ago • 3 comments

Describe the bug A clear and concise description of what the bug is.

Environments

  • Version: moco:0.10.9
  • OS: Ubuntu 20.04.3 LTS

To Reproduce Steps to reproduce the behavior:

  1. Kill the master pod
  2. The master role does not switch to an available replica; instead, the operator waits for the killed master pod to come back up and become available again

Expected behavior Operator should switch the master role to a replica pod

Additional context Switchover works fine when the master pod is deleted instead of killed

Feb 08 '22 12:02 erikgalajda

@slavonicsniper Could you tell us how to reproduce the problem using kubectl?

Feb 09 '22 08:02 ymmt2005

@slavonicsniper Thank you for the report! I found that failover does not work in the following two cases. I will fix the first case.

1. Immediately after a failover or a switchover.

When a failover occurs, the moco-controller selects the most advanced replica based on the replica's Retrieved_Gtid_Set.

However, immediately after a failover or a switchover, the Retrieved_Gtid_Set is blank. If another failover is needed while it is still blank, the moco-controller cannot select a replica to promote, so the failover fails.

2. When MySQLCluster is empty.

In this case, the moco-controller judges the cluster status as Lost. This decision is unavoidable due to the clustering specifications.

https://github.com/cybozu-go/moco/blob/v0.10.9/docs/clustering.md#mysqlcluster
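To illustrate the first case, here is a minimal sketch of how picking the "most advanced" replica can come up empty when every Retrieved_Gtid_Set is blank. This is not moco's actual code: the `replica` type, `pickCandidate`, `lastTxn`, and the replica names are hypothetical, and the comparison is deliberately simplified to GTID sets with a single source UUID and a single interval (real GTID sets need a proper subset comparison).

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// replica is a hypothetical, simplified view of one replica's status.
type replica struct {
	Name             string
	RetrievedGtidSet string // e.g. "uuid:1-120"; "" if nothing has been retrieved yet
}

var errNoCandidate = errors.New("no replica has a non-empty Retrieved_Gtid_Set")

// lastTxn returns the upper bound N of a set shaped like "uuid:1-N" or "uuid:N".
// Assumption: one source UUID and one interval only.
func lastTxn(set string) (int64, error) {
	i := strings.LastIndex(set, ":")
	if i < 0 {
		return 0, fmt.Errorf("malformed GTID set %q", set)
	}
	interval := set[i+1:]
	if j := strings.LastIndex(interval, "-"); j >= 0 {
		interval = interval[j+1:]
	}
	return strconv.ParseInt(interval, 10, 64)
}

// pickCandidate returns the replica that has retrieved the most transactions.
// Replicas with a blank set are skipped; if all are blank, no candidate exists.
func pickCandidate(replicas []replica) (string, error) {
	best, bestTxn := "", int64(-1)
	for _, r := range replicas {
		if r.RetrievedGtidSet == "" {
			continue
		}
		n, err := lastTxn(r.RetrievedGtidSet)
		if err != nil {
			return "", err
		}
		if n > bestTxn {
			best, bestTxn = r.Name, n
		}
	}
	if best == "" {
		return "", errNoCandidate
	}
	return best, nil
}

func main() {
	const uuid = "3e11fa47-71ca-11e1-9e33-c80aa9429562"

	// Steady state: both replicas have retrieved transactions from the primary.
	steady := []replica{
		{Name: "replica-1", RetrievedGtidSet: uuid + ":1-120"},
		{Name: "replica-2", RetrievedGtidSet: uuid + ":1-118"},
	}
	name, err := pickCandidate(steady)
	fmt.Println(name, err)

	// Right after a failover or switchover: every Retrieved_Gtid_Set is blank.
	fresh := []replica{{Name: "replica-1"}, {Name: "replica-2"}}
	name, err = pickCandidate(fresh)
	fmt.Println(name, err)
}
```

In the second call no replica qualifies, which mirrors the situation described above: the controller has nothing to rank, so it cannot promote anyone.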

Mar 01 '22 06:03 masa213f

> @slavonicsniper Could you tell us how to reproduce the problem using kubectl?

By killing the master pod in k9s (Ctrl+K).

Mar 01 '22 08:03 erikgalajda

We have fixed the first case in https://github.com/cybozu-go/moco/issues/370#issuecomment-1055089091

Jul 27 '23 01:07 masa213f