redisraft

redisraft as primary for redis followers

Open gpl opened this issue 3 years ago • 8 comments

Hello,

We have a use case where we'd like to use redis raft as a primary database in one location, and replicate that dataset to a number of read-only followers in other datacenters.

In our initial testing, this doesn't seem to be supported -- more specifically, it seems like it currently does not support PSYNC(?)

On a redis instance that has been configured with replicaof <redis raft node ip> <redis raft node port>, the replica logs the following:

1099948:S 05 Mar 2022 03:31:37.755 * MASTER <-> REPLICA sync started
1099948:S 05 Mar 2022 03:31:37.755 * Non blocking connect for SYNC fired the event.
1099948:S 05 Mar 2022 03:31:37.755 * Master replied to PING, replication can continue...
1099948:S 05 Mar 2022 03:31:37.806 * Trying a partial resynchronization (request 00e0947b6786ab11ee3054a37422f030ae946dad:1).
1099948:S 05 Mar 2022 03:31:37.806 * Master does not support PSYNC or is in error state (reply: -ERR not supported by RedisRaft)
1099948:S 05 Mar 2022 03:31:37.806 * Retrying with SYNC...
1099948:S 05 Mar 2022 03:31:37.806 # MASTER aborted replication with an error: ERR not supported by RedisRaft
1099948:S 05 Mar 2022 03:31:37.806 * Reconnecting to MASTER localhost:5001 after failure
1099948:S 05 Mar 2022 03:31:37.806 * MASTER <-> REPLICA sync started
1099948:S 05 Mar 2022 03:31:37.806 * Non blocking connect for SYNC fired the event.
1099948:S 05 Mar 2022 03:31:37.806 * Master replied to PING, replication can continue...

gpl, Mar 05 '22 11:03

@gpl This is indeed not supported at the moment, but it's an interesting idea. Replicas will obviously not be able to provide the same consistency guarantees but I can see why this can be useful.

Some details will still need to be sorted out -- first ones that come to mind:

  • Do we allow replication from followers or only leaders?
  • What happens if we replicate from a leader and it becomes a follower / candidate (for a long time)?

yossigo, Mar 05 '22 18:03

At least in our use case, we are okay with delays of multiple seconds for data in redis, so we would be ok with replicating off of a redis follower. However, I'm not sure what failover scenarios would look like, e.g. when a redis follower goes offline.

gpl, Jul 07 '22 04:07

@gpl If we don't require replication from a leader, it's actually easier to handle failure using something like a load balancer endpoint doing round-robin over all cluster nodes.
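To illustrate the round-robin idea, here is a minimal Python sketch of an endpoint that rotates over all cluster nodes and skips the ones that are down. Everything in it is hypothetical: the `RoundRobinEndpoint` class, the `is_healthy` callback, and the node addresses are illustration only, not part of RedisRaft or of any particular load balancer.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Node = Tuple[str, int]  # (host, port); addresses used below are made up


class RoundRobinEndpoint:
    """Hypothetical helper: rotates over cluster nodes, skipping unhealthy ones."""

    def __init__(self, nodes: Sequence[Node], is_healthy: Callable[[Node], bool]):
        if not nodes:
            raise ValueError("at least one node is required")
        self._nodes: List[Node] = list(nodes)
        self._is_healthy = is_healthy  # assumed health probe, e.g. a PING check
        self._next = 0                 # index to try first on the next pick

    def pick(self) -> Optional[Node]:
        """Return the next healthy node in rotation, or None if all are down."""
        count = len(self._nodes)
        for step in range(count):
            idx = (self._next + step) % count
            node = self._nodes[idx]
            if self._is_healthy(node):
                # Continue the rotation after the node we just handed out.
                self._next = (idx + 1) % count
                return node
        return None


if __name__ == "__main__":
    cluster = [("10.0.0.1", 5001), ("10.0.0.2", 5001), ("10.0.0.3", 5001)]
    offline = {("10.0.0.2", 5001)}
    endpoint = RoundRobinEndpoint(cluster, lambda n: n not in offline)
    for _ in range(3):
        print(endpoint.pick())
```

A replica pointed at such an endpoint would simply be handed a different node when it reconnects, which is why not requiring the leader makes failure handling simpler.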

yossigo, Jul 07 '22 05:07

I think our only concern there is that if the connection gets lost for whatever reason, it would pick up the log at the wrong place and cause a follower to do a full resync.

Not sure if that's a real concern there though.

gpl, Dec 08 '22 00:12

@gpl If this capability is implemented with a proper replication backlog, that should not be a problem as long as the backlog, which is a cyclic buffer, is not exhausted.
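As a rough illustration of why a cyclic backlog makes reconnects safe, the Python sketch below models a fixed-size ring buffer indexed by a global replication offset. It is a simplified model under stated assumptions, not Redis's or RedisRaft's actual implementation: the `ReplicationBacklog` class and its methods are hypothetical names, and offsets start at 0 here for simplicity.

```python
from typing import Optional


class ReplicationBacklog:
    """Hypothetical cyclic buffer holding the most recent replication bytes."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._buf = bytearray(capacity)
        self.end_offset = 0  # total number of bytes ever fed into the backlog

    @property
    def first_offset(self) -> int:
        """Oldest offset still held; anything older has been overwritten."""
        return max(0, self.end_offset - self.capacity)

    def feed(self, data: bytes) -> None:
        """Append replicated bytes, overwriting the oldest ones when full."""
        for byte in data:
            self._buf[self.end_offset % self.capacity] = byte
            self.end_offset += 1

    def read_from(self, offset: int) -> Optional[bytes]:
        """Return the bytes from `offset` onward for a partial resync.

        Returns None when the offset is no longer (or not yet) covered,
        which in this model means a full resync is required.
        """
        if offset < self.first_offset or offset > self.end_offset:
            return None
        return bytes(
            self._buf[i % self.capacity] for i in range(offset, self.end_offset)
        )


if __name__ == "__main__":
    backlog = ReplicationBacklog(capacity=8)
    backlog.feed(b"abcdef")
    print(backlog.read_from(2))   # replica reconnects while offset 2 is covered
    backlog.feed(b"ghijkl")       # wraps around and overwrites the oldest bytes
    print(backlog.read_from(2))   # offset 2 was overwritten: full resync needed
```

In this model a reconnecting replica resumes from its last offset as long as that offset is still inside the buffer; only when the backlog has wrapped past it does the replica fall back to a full resync.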

yossigo, Dec 08 '22 17:12

Ah interesting.

I was curious - do you happen to know if redis-raft replication support is on the project timeline anytime soon?

We'd really like the feature but I know it could be somewhat of an obscure feature request.

gpl, Dec 08 '22 19:12

@gpl It makes sense, so it's not an obscure feature request, but I assume it will not be very high on the priority list soon.

yossigo, Dec 14 '22 18:12

JIRA ticket: https://redislabs.atlassian.net/browse/RR-315

fadidahanna, Jul 06 '23 15:07