garnet
Replicas not attaching correctly on failover
NOTE: Currently we do not support leader election on failover because we assume replicas cannot initiate failover on their own
Issuing CLUSTER FAILOVER without any arguments should make the old primary a replica of the new primary. This is currently not happening. Our initial assumption was that CLUSTER FAILOVER would always be called with the TAKEOVER option, since the primary would not be reachable. We did not account for planned failovers, which are still a valid case. In addition, the implementation of attachReplicas is buggy and may throw an exception if any one of the remote nodes is unreachable, resulting in the reachable nodes never being informed of the primary change.
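The attachReplicas failure mode described above can be sketched as follows. This is an illustrative Python model, not Garnet's actual API; `attach_replicas_buggy` and `notify` are hypothetical names. Because the try/except wraps the whole loop, the first unreachable node aborts notification of every remaining node:

```python
def attach_replicas_buggy(nodes, notify):
    # Hypothetical sketch of the bug: the exception handler sits OUTSIDE
    # the loop, so one unreachable node stops all further notifications.
    informed = []
    try:
        for node in nodes:
            notify(node)          # may raise if the node is unreachable
            informed.append(node)
    except ConnectionError:
        pass                      # remaining nodes never hear of the primary change
    return informed

def notify(node):
    # stand-in for the real remote call; "down" simulates an unreachable node
    if node == "down":
        raise ConnectionError(node)

print(attach_replicas_buggy(["a", "down", "c"], notify))  # ['a'] -- "c" is never informed
```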
Proposed solution:
- Refactor attachReplicas so the catch statement is inside the loop and an exception for one node does not interfere with communication to the remaining remote nodes.
- Make the old primary a replica of the new primary when CLUSTER FAILOVER is issued with the default option.
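The first proposed change can be sketched as below, again with hypothetical Python names rather than Garnet's actual code. Moving the exception handler inside the loop means each unreachable node is recorded and skipped, while every reachable node is still informed:

```python
def attach_replicas_fixed(nodes, notify):
    # Hypothetical sketch of the proposed refactor: the exception handler
    # is INSIDE the loop, so one unreachable node no longer blocks the rest.
    informed, failed = [], []
    for node in nodes:
        try:
            notify(node)          # may raise if the node is unreachable
            informed.append(node)
        except ConnectionError:
            failed.append(node)   # record the failure and keep going
    return informed, failed

def notify(node):
    # stand-in for the real remote call; "down" simulates an unreachable node
    if node == "down":
        raise ConnectionError(node)

print(attach_replicas_fixed(["a", "down", "c"], notify))  # (['a', 'c'], ['down'])
```

Reachable nodes now always learn of the primary change, and the list of failed nodes can be retried or logged separately.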
NOTE: Avoid changing information directly owned by remote nodes by editing the local configuration. In the failover case, if we make remote nodes replicas of the new primary by directly updating the local configuration, we will put the cluster into an inconsistent state if those nodes are unreachable. We rely on issuing requests to remote nodes to change their own state, so we can make sure that the change is acknowledged when the two communicating nodes are well connected. The exception to this rule is changing ownership of slots, as in the failover case. We cannot avoid doing this because slots can be owned by any instance at any given point in time. For the latter scenario, we should be extremely careful to avoid any split-brain inconsistencies, at least until leader election is properly implemented.
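The request-and-acknowledge rule above can be sketched as follows. This is a minimal illustration with invented names (`request_state_change`, `fake_send`), not Garnet's transport layer: the local node only treats the remote state as changed once the remote node itself acknowledges the request, so an unreachable node never leaves the local view inconsistent.

```python
def request_state_change(node, send_request):
    # Hypothetical sketch: ask the remote node to update ITS OWN state and
    # return True only if it acknowledges. On failure, no local assumption
    # is made about the remote node, avoiding an inconsistent cluster view.
    try:
        return send_request(node, "attach-to-new-primary") == "OK"
    except ConnectionError:
        return False              # no ack: leave the local view unchanged

def fake_send(node, request):
    # stand-in transport; "down" simulates an unreachable node
    if node == "down":
        raise ConnectionError(node)
    return "OK"

print(request_state_change("replica-1", fake_send))  # True  (acknowledged)
print(request_state_change("down", fake_send))       # False (no ack, no local change)
```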