redisraft as primary for redis followers
Hello,
We have a use case where we'd like to use redis raft as a primary database in one location, and replicate that dataset to a number of read-only followers in other datacenters.
In our initial testing, this doesn't seem to be supported -- more specifically, it seems like it currently does not support PSYNC(?)
On a redis instance that has been configured with replicaof <redis raft node ip> <redis raft node port>, the log shows:
1099948:S 05 Mar 2022 03:31:37.755 * MASTER <-> REPLICA sync started
1099948:S 05 Mar 2022 03:31:37.755 * Non blocking connect for SYNC fired the event.
1099948:S 05 Mar 2022 03:31:37.755 * Master replied to PING, replication can continue...
1099948:S 05 Mar 2022 03:31:37.806 * Trying a partial resynchronization (request 00e0947b6786ab11ee3054a37422f030ae946dad:1).
1099948:S 05 Mar 2022 03:31:37.806 * Master does not support PSYNC or is in error state (reply: -ERR not supported by RedisRaft)
1099948:S 05 Mar 2022 03:31:37.806 * Retrying with SYNC...
1099948:S 05 Mar 2022 03:31:37.806 # MASTER aborted replication with an error: ERR not supported by RedisRaft
1099948:S 05 Mar 2022 03:31:37.806 * Reconnecting to MASTER localhost:5001 after failure
1099948:S 05 Mar 2022 03:31:37.806 * MASTER <-> REPLICA sync started
1099948:S 05 Mar 2022 03:31:37.806 * Non blocking connect for SYNC fired the event.
1099948:S 05 Mar 2022 03:31:37.806 * Master replied to PING, replication can continue...
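For anyone reading the log above, here is a rough sketch of the request the replica sends and how it reads the reply. It is a simplified model, not the actual Redis code: the helper names `encode_command` and `classify_psync_reply` are made up for illustration, and the real handshake includes further steps (such as REPLCONF) that are left out here.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        out.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(out)


def classify_psync_reply(line: bytes) -> str:
    """Classify the first reply line a replica receives after PSYNC."""
    text = line.rstrip(b"\r\n").decode()
    if text.startswith("+FULLRESYNC"):
        return "full"      # master will send a full snapshot
    if text.startswith("+CONTINUE"):
        return "partial"   # master will stream from the requested offset
    if text.startswith("-"):
        return "error"     # e.g. "-ERR not supported by RedisRaft"
    return "unknown"


# Replication ID and offset taken from the log line
# "Trying a partial resynchronization (request <replid>:1)".
REPLID = "00e0947b6786ab11ee3054a37422f030ae946dad"
request = encode_command("PSYNC", REPLID, "1")
reply_kind = classify_psync_reply(b"-ERR not supported by RedisRaft\r\n")
```

In the log, the error reply is what triggers the fallback to SYNC, which RedisRaft rejects in the same way, so the replica reconnects and the cycle repeats.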
@gpl This is indeed not supported at the moment, but it's an interesting idea. Replicas will obviously not be able to provide the same consistency guarantees but I can see why this can be useful.
Some details will still need to be sorted out -- first ones that come to mind:
- Do we allow replication from followers or only leaders?
- What happens if we replicate from a leader and it becomes a follower / candidate (for a long time)?
At least in our use case, we are okay with delays of multiple seconds for data in redis, so we would be ok with replicating off of a RedisRaft follower. However, I'm not sure what failover scenarios would look like, e.g. when the follower we replicate from goes offline.
@gpl If we don't require replication from a leader, it's actually easier to handle failure using something like a load balancer endpoint doing round-robin over all cluster nodes.
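To make the round-robin idea concrete, here is a minimal sketch of what such an endpoint would do when picking a node for a replica to connect to. This is only an illustration: `RoundRobin` and `tcp_alive` are hypothetical names, the health check is a plain TCP connect, and a real load balancer would normally handle this for you.

```python
import itertools
import socket
from typing import Callable, Iterable, Optional, Tuple

Node = Tuple[str, int]  # (host, port)


def tcp_alive(node: Node, timeout: float = 0.5) -> bool:
    """Assumed health check: the node counts as alive if a TCP connect succeeds."""
    try:
        with socket.create_connection(node, timeout=timeout):
            return True
    except OSError:
        return False


class RoundRobin:
    """Rotate over cluster nodes, skipping the ones that fail the health check."""

    def __init__(self, nodes: Iterable[Node],
                 is_alive: Callable[[Node], bool] = tcp_alive) -> None:
        self._nodes = list(nodes)
        self._cycle = itertools.cycle(self._nodes)
        self._is_alive = is_alive

    def next_node(self) -> Optional[Node]:
        # Try each node at most once per call so we never loop forever.
        for _ in range(len(self._nodes)):
            node = next(self._cycle)
            if self._is_alive(node):
                return node
        return None  # every node failed the health check
```

A replica that loses its connection would ask for `next_node()` again and reconnect to whichever node comes back, leader or follower.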
I think our only concern there is that if the connection gets lost for whatever reason, it would pick up the log at the wrong place and cause a follower to do a full resync.
Not sure if that's a real concern there though.
@gpl If this capability is implemented with a proper replication backlog, that should not be a problem as long as the backlog, which is a cyclic buffer, is not exhausted.
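As a rough illustration of why that works, here is a toy model of a cyclic backlog indexed by replication offset. It is a sketch under simplifying assumptions, not how Redis or RedisRaft implement it: `ReplicationBacklog` and its methods are made-up names, and offsets here start at 0.

```python
from typing import Optional


class ReplicationBacklog:
    """Fixed-size cyclic buffer holding the most recent bytes of the stream."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.buf = bytearray(capacity)
        self.end_offset = 0  # total number of bytes ever written

    @property
    def start_offset(self) -> int:
        """Oldest offset still held in the buffer."""
        return max(0, self.end_offset - self.capacity)

    def feed(self, data: bytes) -> None:
        """Append data, overwriting the oldest bytes once the buffer is full."""
        for byte in data:
            self.buf[self.end_offset % self.capacity] = byte
            self.end_offset += 1

    def read_from(self, offset: int) -> Optional[bytes]:
        """Return the bytes from offset onward, or None if a full resync is needed."""
        if offset < self.start_offset or offset > self.end_offset:
            return None
        return bytes(self.buf[i % self.capacity]
                     for i in range(offset, self.end_offset))


backlog = ReplicationBacklog(capacity=8)
backlog.feed(b"abcdef")
resume_ok = backlog.read_from(2)    # offset 2 is still in the buffer
backlog.feed(b"ghijkl")             # wraps around and overwrites offsets 0-3
resume_lost = backlog.read_from(2)  # offset 2 is gone, so this is None
```

A replica that reconnects with an offset still inside the window can resume from where it left off; one that has fallen further behind than the buffer size has to do a full resync.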
Ah interesting.
I was curious - do you happen to know if redis-raft replication support is on the project timeline anytime soon?
We'd really like the feature but I know it could be somewhat of an obscure feature request.
@gpl It makes sense, so I wouldn't call it an obscure feature request, but I assume it will not be very high on the priority list in the near term.
JIRA ticket: https://redislabs.atlassian.net/browse/RR-315