kvrocks
kvrocks copied to clipboard
KVRocks (not clustered mode) does not support the FAILOVER command from Redis
Search before asking
- [x] I had searched in the issues and found no similar issues.
Motivation
Redis has a FAILOVER command which can be used to gracefully switch which node is the master node in a multi-node (non-cluster) setup.
The details are described in the documentation, but the gist is that it will gracefully pauses writes on the master until at least one replica catches up, then make that replica a master and make the old master into a replica of it.
A command like this is necessary for doing per-node maintenance windows in a rolling fashion, keeping high availability with minimum downtime, and most importantly, without losing acknowledged writes.
Apache KVRocks advertises that you can operate it in high availability mode with Redis Sentinel, so I was quite surprised to see that this feature doesn't exist. I can't really treat this as a high availability system without this feature, as there is no clean way to take down a node for maintenance without causing potential loss of acknowledged writes.
Solution
Implement a Redis-compatible FAILOVER command.
Are you willing to submit a PR?
- [ ] I'm willing to submit a PR!
Also worth noting that KVRocks also doesn't implement the CLIENT PAUSE WRITE command from Redis either, which would be the first step of a graceful failover, so I can't use that as a workaround either...
@jemc if you are not already working on it, I'd like to give this a try. I can start from implementing the CLIENT PAUSE WRITE command and then move onto the FAILOVER
@jemc @geonove Kvrocks cluster mode isn't the same as Redis, and it uses https://github.com/apache/kvrocks-controller to manage the cluster information. So we don't expect to support the failover on the Kvrocks inself.
I see, thanks for the explanation @git-hulk . I am going to find another issue to work on then :)
@git-hulk @geonove - I specifically said I wasn't using Clustered mode in the title and description of this issue ticket.
The README.md of this repo says:
High Availability: Support Redis sentinel to failover when master or slave was failed.
That is my use case. I have a two kvrocks node deployed in an HA setup, with redis-sentinel watching them.
I have no need for cluster management tools because I am not using Clustered mode.
What I need is a command or sequence of commands I can run to gracefully fail over from one master to the other, with a guarantee that acknowledged writes will not be dropped.
As far as I can tell by reading the KVRocks source code, the only commands which can induce failover are:
-
CLUSTERX SETNODES [...](used by the kvrocks controller to sync cluster state when in Clustered mode, including master/replica relationships) -
SLAVEOF [...](used when not in Clustered mode)
What's not clear to me is whether simply calling the SLAVEOF command on the current master node is guaranteed to fully replicate all acknowledged writes before switching to being a replica of the other node. In Redis this is not safe to do, in that acknowledged writes may be lost.
KVRocks doesn't seem to have any documentation or indication of how to do a controlled failover gracefully.
@jemc You don't need to use CLUSTERX command if you're running in HA mode. And you could just use the redis-sentinel's failover command to switch the role.
Looks not yet solved. Reopened.
You don't need to use
CLUSTERXcommand if you're running in HA mode
Yes, I know that. The only reason I mentioned it here was after you brought up kvrocks-controller and cluster mode.
I'm not using cluster mode, so kvrocks-controller isn't what I need, nor do I need the CLUSTERX command that kvrocks-controller uses to orchestrate kvrocks.
And you could just use the redis-sentinel's failover command to switch the role.
Changing the state of redis-sentinel is not necessarily enough to cover the specific concern raised in this issue ticket.
The issue here is that HA KVRocks (appears to?) lack any way to guarantee that acknowledged writes won't be lost during a failover. Redis has a specific command for this (FAILOVER) because (at least in Redis), simply switching the master is not enough to make this guarantee.
So the scope of this ticket should be to do one of the following for KVRocks:
- implement this command or something similar and document it, OR
- add documentation that describes how the "acknowledged writes are not lost during planned failover" guarantee is achieved without needing to run any particular command
@jemc Thanks for your clarification. I'm sorry for misunderstanding that you're saying the cluster failover initially.
In some maintance scenarios, we want migrate kvrocks nodes (not slots) without any data inconsistency or lost, the Failover command is very important for this. I call this postive failover. The kvrocks-controller already implements negative Failover, The sample way to implement postive failover with kvrocks-controller is that kvrocks master node support CLIENT PAUSE WRITE (READONLY), kvrocks-controller check the replica's offset equal master's , then update the slots topology. Does anyone begin to do that ? I can help to implement it.
@yuzegao Great, you can take it since there's no assignee for now.
@git-hulk Hi, I’d like to take this issue and implement the FAILOVER feature. Plan:
- write a design proposal ;
- implement this feature and add unit & integration tests and CI;
- open PR and iterate. Could you please assign this issue to me? Thanks!
@git-hulk Hi, I’d like to take this issue and implement the FAILOVER feature. Plan:
- write a design proposal ;
- implement this feature and add unit & integration tests and CI;
- open PR and iterate. Could you please assign this issue to me? Thanks!
Assigned, thank you.
This would be super helpful. We've been unable to perform a manual failover for maintenance with the redis-sentinel becausse there is a mess with the WAL between the master, new master and we ended up having a full-sync of all the replicas.
You can read more about it here
@ethervoid @jemc @PragmaTwice Hi guys, I have compiled a proposal for failover. I have two options and I hope you can provide some suggestions.
https://github.com/apache/kvrocks/discussions/3218