Support clients accessing the Kvrocks cluster through an LB
Search before asking
- [x] I had searched in the issues and found no similar issues.
Motivation
A master-slave switchover in a Kvrocks cluster may lead to split brain in the following situations:
- The Kvrocks process hangs and then recovers
- The network card on the Kvrocks host fails and then recovers
- The network partition on the Kvrocks host recovers

When the old Kvrocks process becomes reachable again, client requests may be directed to the old master node for a brief period, causing a split-brain situation and data inconsistency. If clients access Kvrocks through a load balancer (LB), then when a cluster switchover occurs, redirecting the LB address to the new master node ensures that client requests are not forwarded to the old master node. The Kvrocks cluster deployment architecture is as follows:
Benefits of this architecture:
- It covers most split-brain scenarios, including process restarts, process hangs, host network interface downtime, and network partitions.
- Using a load balancer (LB) makes failover significantly faster than approaches such as switching dynamic domain names on the client side.
This approach is used by both AWS ElastiCache and Azure Cache for Redis.
Solution
To ensure that client access to the Kvrocks cluster goes through the load balancer (LB), while keeping local IPs for internal Kvrocks cluster communication (master-slave synchronization and slot migration), the following modifications are required:
- Add two optional parameters to the `clusterx setnodes` command to specify an LB host and port for each node. The following example shows that the LB address can be specified for some or all of the nodes: CLUSTERX SETNODES "ZrFserb4Mqi5dbyCLCMUm9zFXMNPhzKE4RbFauJa 10.190.28.10 7115 master - 0-5460 \n qhbVLW9dc2VF9CZOmvBJOzzqkhNA3oHoD3NVWVyg 10.190.28.10 7133 master - 5461-10921 \n 9cnjXnaNK5KfPtX01V0wn7ZtM7FDPnz7va9qPJBs 10.190.28.10 7237 master - 10922-16383 192.168.0.2 1222\n fuCHh8Ru9tFSY313QPfBcPQ3QeP63Z4rWpqpyaOl 10.190.28.10 7255 slave ZrFserb4Mqi5dbyCLCMUm9zFXMNPhzKE4RbFauJa \n cIqdj5C37cMrr7r7ydZDZ8zm1T1vPLMWj8516rrF 10.190.28.10 6741 slave qhbVLW9dc2VF9CZOmvBJOzzqkhNA3oHoD3NVWVyg \n 3D8JRN8YVONqRU57jP99E3JdH2QnLHzZpFN4rcQQ 10.190.28.10 6765 slave 9cnjXnaNK5KfPtX01V0wn7ZtM7FDPnz7va9qPJBs 192.168.0.1 1234" 1754120940
- When persisting nodes.conf to a local file, also write the LB address information into the file. When the kvrocks process starts, it will load the corresponding LB address information as well (a rough sketch of the persisted format follows this list).
- When executing the cluster nodes/slots/replicas commands, if a node has an LB host configured, replace the node's advertised address with the LB address in the output (also illustrated in the sketch below).
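For illustration only: assuming the optional LB host and port are simply appended as two trailing fields on each node line, as in the SETNODES example above, the persisted nodes.conf entry and the rewritten CLUSTER NODES output might look roughly like this (the exact field layout and cport handling are open design details, not a confirmed format):

```
# nodes.conf sketch: the trailing "192.168.0.2 1222" is the optional LB host/port
9cnjXnaNK5KfPtX01V0wn7ZtM7FDPnz7va9qPJBs 10.190.28.10 7237 master - 10922-16383 192.168.0.2 1222

# CLUSTER NODES sketch: the advertised address is replaced by the LB address,
# so clients connect through the LB instead of the node's local IP
9cnjXnaNK5KfPtX01V0wn7ZtM7FDPnz7va9qPJBs 192.168.0.2:1222@2222 master - 0 0 1 connected 10922-16383
```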
Are you willing to submit a PR?
- [x] I'm willing to submit a PR!
So before this feature is implemented, how do we ensure there is no data corruption?
One way I can think of is that the old master should not be put back as a replica directly. We need to truncate its data first so that it triggers a full sync from the new master.
This won't solve everything, but at least it can make sure the data stays consistent.
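As a rough operational sketch of that idea (the exact commands depend on how the cluster topology is managed; the ports, the use of FLUSHALL, and the topology push shown here are assumptions, not a prescribed procedure): wipe the recovered old master before re-adding it as a replica, so it cannot resume from a diverged dataset and instead performs a full sync.

```
# On the recovered old master (sketch only): drop the possibly diverged dataset
redis-cli -h 10.190.28.10 -p 7115 FLUSHALL

# Then push a new topology in which the old master is listed as a slave of the
# new master, so replication starts from scratch with a full sync
redis-cli -h 10.190.28.10 -p 7115 CLUSTERX SETNODES "..." <new_version>
```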
I personally don't like the idea of binding the proxy to the cluster, because it only serves this specific scenario, and it feels strange to bind the LB address into the cluster information.