How to deal with the split-brain problem in kvrocks
Hi guys, while analyzing the use of a controller to manage Kvrocks clusters in a multi-AZ deployment, we identified a risk of split-brain scenarios. For example, in the diagram below, when a network partition occurs:
- The connection between the Load Balancer (LB) and Node N1 remains functional
- But the connection between the Controller Master and Node N1 fails due to network issues

This causes the Controller Master to mistakenly mark N1 as faulty and trigger a failover, promoting N1's slave to master. However, N1 is actually healthy and continues to accept writes from the LB. Result: two active masters (split-brain) exist for the same shard, leading to data inconsistency.
With the current design, this situation can arise in more scenarios than just network partitions: an instance may also hang for a while, get failed over, and then recover and continue serving writes as if it were still the master.
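To make the failure sequence concrete, here is a toy Go sketch of the scenario above. The `Node` type and the hard-coded partition flag are purely illustrative; this is not Kvrocks or kvrocks-controller code:

```go
package main

import "fmt"

type Node struct {
	name     string
	isMaster bool
	writes   []string
}

func (n *Node) Write(v string) {
	if n.isMaster {
		n.writes = append(n.writes, v)
	}
}

func main() {
	n1 := &Node{name: "N1", isMaster: true}
	replica := &Node{name: "N1-replica"}

	// Network partition: the Controller Master's health checks to N1 fail,
	// but the LB can still reach N1.
	controllerSeesN1Alive := false

	// The controller wrongly declares N1 dead and promotes its replica,
	// but has no way to tell N1 (or the LB) that N1 was demoted.
	if !controllerSeesN1Alive {
		replica.isMaster = true
	}

	// Both nodes now accept writes for the same shard.
	n1.Write("x=1")      // routed by the LB, which still sees N1 as healthy
	replica.Write("x=2") // routed by clients that saw the topology change

	fmt.Printf("%s: master=%v writes=%v\n", n1.name, n1.isMaster, n1.writes)
	fmt.Printf("%s: master=%v writes=%v\n", replica.name, replica.isMaster, replica.writes)
	// Prints two masters holding divergent data for the same key: split-brain.
}
```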
Hi @Allen315, thanks for your report.
> The connection between the Load Balancer (LB) and Node N1 remains functional
Yes, this is a known issue. Perhaps we could improve it by having multiple nodes check an instance's health instead of relying on the leader node only, so a single partitioned observer cannot trigger a failover by itself. I will do this when I get time, and a PR is also welcome. A rough sketch of the idea is below.
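For illustration only, a minimal sketch of what majority-based detection could look like, assuming each observer probes the node independently and the verdicts are then tallied; the function names are hypothetical, not the kvrocks-controller API:

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// probe reports whether this observer can open a TCP connection to addr
// within the timeout. A real check would send a Redis PING over the
// connection instead of only dialing.
func probe(addr string) bool {
	conn, err := net.DialTimeout("tcp", addr, 500*time.Millisecond)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

// declareFailed returns true only when a majority of observers report the
// node as unreachable, so a single partitioned observer (e.g. the leader)
// cannot trigger a failover on its own.
func declareFailed(votes []bool) bool {
	down := 0
	for _, alive := range votes {
		if !alive {
			down++
		}
	}
	return down > len(votes)/2
}

func main() {
	// votes[0] is this process's own probe; the other two stand in for
	// verdicts that would be collected from other observers over the network.
	votes := []bool{probe("127.0.0.1:6666"), true, true}
	fmt.Println("trigger failover?", declareFailed(votes))
}
```

In the split-brain scenario above, the leader's probe to N1 fails but the other observers (which, like the LB, can still reach N1) vote "alive", so no failover is triggered.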
> an instance may also hang for a while, get failed over, and then recover and continue serving writes as if it were still the master
Promoting a new master when an instance is unresponsive is the expected behavior. You can also increase the failover detection retry count and probe interval to mitigate this issue, at the cost of slower failover for genuinely dead nodes.
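As a sketch of why raising those knobs helps, here is a small Go example where a short hang is ridden out by counting consecutive probe failures; the knob names here are illustrative, so check your kvrocks-controller version's configuration for the real settings:

```go
package main

import (
	"fmt"
	"time"
)

// confirmDown reports a node as failed only after maxRetries consecutive
// failed probes spaced interval apart; any single success resets the
// verdict, so a brief hang or pause does not trigger a failover.
func confirmDown(probe func() bool, maxRetries int, interval time.Duration) bool {
	for i := 0; i < maxRetries; i++ {
		if probe() {
			return false
		}
		time.Sleep(interval)
	}
	return true
}

func main() {
	// Simulate an instance that hangs for ~2.5s and then recovers.
	hangUntil := time.Now().Add(2500 * time.Millisecond)
	probe := func() bool { return time.Now().After(hangUntil) }

	// With 5 retries one second apart, the hang is ridden out and no
	// failover is triggered; with 2 retries it would be.
	fmt.Println("trigger failover?", confirmDown(probe, 5, time.Second))
}
```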