kvrocks icon indicating copy to clipboard operation
kvrocks copied to clipboard

KVRocks (not clustered mode) does not support the FAILOVER command from Redis

Open jemc opened this issue 8 months ago • 10 comments

Search before asking

  • [x] I had searched in the issues and found no similar issues.

Motivation

Redis has a FAILOVER command which can be used to gracefully switch which node is the master node in a multi-node (non-cluster) setup.

The details are described in the documentation, but the gist is that it will gracefully pauses writes on the master until at least one replica catches up, then make that replica a master and make the old master into a replica of it.

A command like this is necessary for doing per-node maintenance windows in a rolling fashion, keeping high availability with minimum downtime, and most importantly, without losing acknowledged writes.

Apache KVRocks advertises that you can operate it in high availability mode with Redis Sentinel, so I was quite surprised to see that this feature doesn't exist. I can't really treat this as a high availability system without this feature, as there is no clean way to take down a node for maintenance without causing potential loss of acknowledged writes.

Solution

Implement a Redis-compatible FAILOVER command.

Are you willing to submit a PR?

  • [ ] I'm willing to submit a PR!

jemc avatar Mar 22 '25 02:03 jemc

Also worth noting that KVRocks also doesn't implement the CLIENT PAUSE WRITE command from Redis either, which would be the first step of a graceful failover, so I can't use that as a workaround either...

jemc avatar Mar 22 '25 03:03 jemc

@jemc if you are not already working on it, I'd like to give this a try. I can start from implementing the CLIENT PAUSE WRITE command and then move onto the FAILOVER

geonove avatar Mar 30 '25 22:03 geonove

@jemc @geonove Kvrocks cluster mode isn't the same as Redis, and it uses https://github.com/apache/kvrocks-controller to manage the cluster information. So we don't expect to support the failover on the Kvrocks inself.

git-hulk avatar Mar 31 '25 13:03 git-hulk

I see, thanks for the explanation @git-hulk . I am going to find another issue to work on then :)

geonove avatar Mar 31 '25 13:03 geonove

@git-hulk @geonove - I specifically said I wasn't using Clustered mode in the title and description of this issue ticket.

The README.md of this repo says:

High Availability: Support Redis sentinel to failover when master or slave was failed.

That is my use case. I have a two kvrocks node deployed in an HA setup, with redis-sentinel watching them.

I have no need for cluster management tools because I am not using Clustered mode.

What I need is a command or sequence of commands I can run to gracefully fail over from one master to the other, with a guarantee that acknowledged writes will not be dropped.

jemc avatar May 05 '25 14:05 jemc

As far as I can tell by reading the KVRocks source code, the only commands which can induce failover are:

  • CLUSTERX SETNODES [...] (used by the kvrocks controller to sync cluster state when in Clustered mode, including master/replica relationships)

  • SLAVEOF [...] (used when not in Clustered mode)

What's not clear to me is whether simply calling the SLAVEOF command on the current master node is guaranteed to fully replicate all acknowledged writes before switching to being a replica of the other node. In Redis this is not safe to do, in that acknowledged writes may be lost.

KVRocks doesn't seem to have any documentation or indication of how to do a controlled failover gracefully.

jemc avatar May 05 '25 16:05 jemc

@jemc You don't need to use CLUSTERX command if you're running in HA mode. And you could just use the redis-sentinel's failover command to switch the role.

git-hulk avatar May 06 '25 02:05 git-hulk

Looks not yet solved. Reopened.

PragmaTwice avatar May 06 '25 02:05 PragmaTwice

You don't need to use CLUSTERX command if you're running in HA mode

Yes, I know that. The only reason I mentioned it here was after you brought up kvrocks-controller and cluster mode.

I'm not using cluster mode, so kvrocks-controller isn't what I need, nor do I need the CLUSTERX command that kvrocks-controller uses to orchestrate kvrocks.

And you could just use the redis-sentinel's failover command to switch the role.

Changing the state of redis-sentinel is not necessarily enough to cover the specific concern raised in this issue ticket.

The issue here is that HA KVRocks (appears to?) lack any way to guarantee that acknowledged writes won't be lost during a failover. Redis has a specific command for this (FAILOVER) because (at least in Redis), simply switching the master is not enough to make this guarantee.

So the scope of this ticket should be to do one of the following for KVRocks:

  • implement this command or something similar and document it, OR
  • add documentation that describes how the "acknowledged writes are not lost during planned failover" guarantee is achieved without needing to run any particular command

jemc avatar May 08 '25 14:05 jemc

@jemc Thanks for your clarification. I'm sorry for misunderstanding that you're saying the cluster failover initially.

git-hulk avatar May 12 '25 12:05 git-hulk

In some maintance scenarios, we want migrate kvrocks nodes (not slots) without any data inconsistency or lost, the Failover command is very important for this. I call this postive failover. The kvrocks-controller already implements negative Failover, The sample way to implement postive failover with kvrocks-controller is that kvrocks master node support CLIENT PAUSE WRITE (READONLY), kvrocks-controller check the replica's offset equal master's , then update the slots topology. Does anyone begin to do that ? I can help to implement it.

yuzegao avatar Aug 23 '25 04:08 yuzegao

@yuzegao Great, you can take it since there's no assignee for now.

git-hulk avatar Aug 23 '25 06:08 git-hulk

@git-hulk Hi, I’d like to take this issue and implement the FAILOVER feature. Plan:

  1. write a design proposal ;
  2. implement this feature and add unit & integration tests and CI;
  3. open PR and iterate. Could you please assign this issue to me? Thanks!

yuzegao avatar Oct 01 '25 08:10 yuzegao

@git-hulk Hi, I’d like to take this issue and implement the FAILOVER feature. Plan:

  1. write a design proposal ;
  2. implement this feature and add unit & integration tests and CI;
  3. open PR and iterate. Could you please assign this issue to me? Thanks!

Assigned, thank you.

git-hulk avatar Oct 01 '25 08:10 git-hulk

This would be super helpful. We've been unable to perform a manual failover for maintenance with the redis-sentinel becausse there is a mess with the WAL between the master, new master and we ended up having a full-sync of all the replicas.

You can read more about it here

ethervoid avatar Oct 02 '25 14:10 ethervoid

@ethervoid @jemc @PragmaTwice Hi guys, I have compiled a proposal for failover. I have two options and I hope you can provide some suggestions.

https://github.com/apache/kvrocks/discussions/3218

yuzegao avatar Oct 09 '25 06:10 yuzegao