
slave-priority 0 seems not to work together with active-replica and multi-master

Open ecerroni opened this issue 6 years ago • 6 comments

Sentinel seems to respect this:

docker run -p 6381:6381 --net redis-local --name local-replica-1 eqalpha/keydb keydb-server /etc/keydb/redis.conf  --bind 0.0.0.0 --port 6381 --replicaof local-master 6380 --slave-priority 0

But not this

docker run -p 6379:6379 --net redis-local --name global-replicator-1 -d eqalpha/keydb keydb-server /etc/keydb/redis.conf --active-replica yes --multi-master yes --bind 0.0.0.0 --replicaof global-replicator-2 6380 --replicaof local-master 6380 --slave-priority 0

In the latter case, Sentinel seems to ignore the --slave-priority flag and promotes global-replicator-1 to master anyway.

I am new to Redis/KeyDB, so I am not sure whether this is intended behavior or a conflict with one of --active-replica and --multi-master.

ecerroni avatar Oct 31 '19 18:10 ecerroni

Hi @ecerroni,

Active replication is not designed to be used with Sentinel; in fact, its main purpose is to make Sentinel unnecessary. With active replication both nodes are masters simultaneously, and you have the option to direct traffic to either as needed. The feature is intended to be used with a traditional TCP load balancer such as HAProxy for this purpose.

When you use the active-replica flag you are implying that the node will behave as a master. As a result, Sentinel is not needed to promote the node, since it is already in the master state.
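For reference, a minimal active-replica pair looks roughly like this (just a sketch; node-a, node-b, the network name and the default port 6379 are placeholders rather than anything from your setup):

# Two nodes that replicate each other; both stay in the master state and both accept writes.
docker run -d --net redis-local --name node-a eqalpha/keydb keydb-server /etc/keydb/redis.conf --active-replica yes --replicaof node-b 6379
docker run -d --net redis-local --name node-b eqalpha/keydb keydb-server /etc/keydb/redis.conf --active-replica yes --replicaof node-a 6379

A TCP load balancer in front of the pair can then send connections to either node; there is nothing for Sentinel to promote.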

Let me know if you have any more questions!

JohnSully avatar Nov 01 '19 03:11 JohnSully

I see, that makes sense. However, let me explain further what exactly I am trying to achieve.

The stack I'm using is:

  • Apollo Server
  • Apollo Server Redis Cache (which uses the ioredis package under the hood)

Goal:

  • Having multiple masters as global replicators on different host servers.
  • Each host server having local replicas of its own global-replicator
  • Apollo Server** reads from its own local replicas
  • Apollo Server writes to its own global replicator (see the sketch below)

** There is one Apollo Server on each host
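In other words, each host's Apollo Server would effectively do something like this (the key is just an example; ioredis would issue the equivalent commands, and the ports match the containers above):

# Writes go to the host's own global replicator ...
keydb-cli -h global-replicator-1 -p 6379 SET some:key some-value
# ... while reads are served by a local replica on the same host.
keydb-cli -h local-replica-1 -p 6381 GET some:key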

I see that with ioredis I can use either a single host, sentinels, or cluster mode, but none of them seems to fit my goal.

The following represents my last attempt, where I tried to put --active-replica and --slave-priority together; it did not work for the reason I explained in the previous post:

-----------------------------------------------------------------
 server A                              |   server B              |       
-----------------------------------------------------------------|
                                       |                         |   
global-replicator-1 <------------------|---> global-replicator-2 |
   ^                                   |                    .    |
   |                                   |                    .    | 
   |                                   |                    .    |
local-master|-----> local-replica-1    |                         |
   |        |-----> local-replica-n    |                         |
   |                                   |                         |
local-sentinel                         |                         |
------------------------------------------------------------------

However, what I am actually trying to achieve looks more like this:

-----------------------------------------------------------------
 server A                              |   server B              |       
-----------------------------------------------------------------|
                                       |                         |   
global-replicator-1 <------------------|---> global-replicator-2 |
     |-----> local-replica-1           |                         |
     |-----> local-replica-n           |                         |
                                       |                         |
------------------------------------------------------------------

The idea is that if global-replicator-1 fails, another local slave might be promoted to master or, if that is not possible at all, I would at least have more local replicas for reads.

I know this is probably out of scope, but it'd be interesting to explore what can be done with KeyDB in this case, that is, when someone wants to combine multi-master, active-replica and local slaves.
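In concrete terms, the second schema would be something like this on server A (mirroring the commands above; I am not sure whether chaining plain replicas off an active replica like this is even the right approach):

# The multi-master / active-replica node that talks to server B (which is configured symmetrically).
docker run -d -p 6379:6379 --net redis-local --name global-replicator-1 eqalpha/keydb keydb-server /etc/keydb/redis.conf --active-replica yes --multi-master yes --bind 0.0.0.0 --replicaof global-replicator-2 6380
# A plain local replica chained off the local global replicator, used only for reads.
docker run -d -p 6381:6381 --net redis-local --name local-replica-1 eqalpha/keydb keydb-server /etc/keydb/redis.conf --bind 0.0.0.0 --port 6381 --replicaof global-replicator-1 6379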

Any suggestions?

ecerroni avatar Nov 01 '19 10:11 ecerroni

Before answering I want to understand better why you want to run multiple replicas on the same server. Is it for performance?

JohnSully avatar Nov 01 '19 14:11 JohnSully

Reading and writing locally on the host for performance reasons is one thing.

However, implementing local fault tolerance is my main goal. If the host's global replicator fails for whatever reason, the node can still read and write locally. Once the global replicator comes back, it will also synchronize with the other global replicators.

If the local master goes down, Sentinel will take care of promoting a new master.

I've achieved that with the first schema, where global-replicator-1 is slaveof local-master. The only issue is that I cannot hold Sentinel back from promoting it to local master. Once that happens, Sentinel pulls all the other global replicators in as slaves of the new local master as well, breaking the schema.

I'd happily implement any other approach that would guarantee local fault tolerance while keeping all the global replicators in sync, but I cannot think of any. That's why I am open to suggestions.

ecerroni avatar Nov 02 '19 08:11 ecerroni

This sounds like a valid use case; I will target this fix for v5.2.

In the meantime the only workaround for your topology is to switch to a mesh topology. KeyDB is capable of removing duplicate messages from the mesh, but you will see additional network traffic.
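Roughly, a mesh means every node lists every other node as a replication source, for example (placeholder names; adjust hosts and ports to your environment):

# Each node is an active replica of both peers; KeyDB suppresses the duplicate messages.
docker run -d --net redis-local --name keydb-a eqalpha/keydb keydb-server /etc/keydb/redis.conf --active-replica yes --multi-master yes --replicaof keydb-b 6379 --replicaof keydb-c 6379
docker run -d --net redis-local --name keydb-b eqalpha/keydb keydb-server /etc/keydb/redis.conf --active-replica yes --multi-master yes --replicaof keydb-a 6379 --replicaof keydb-c 6379
docker run -d --net redis-local --name keydb-c eqalpha/keydb keydb-server /etc/keydb/redis.conf --active-replica yes --multi-master yes --replicaof keydb-a 6379 --replicaof keydb-b 6379

Every node accepts writes and pulls from both peers, so losing any single node does not stop reads or writes on the others.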

JohnSully avatar Nov 05 '19 03:11 JohnSully

The work here is a bit more involved than I previously thought. Sentinel doesn't understand the concept of multiple masters and is just picking out the first slave_priority it sees in "info replication". It will take a bit of time to make the necessary changes and test them. I'm going to have to bump this to v5.3.

JohnSully avatar Nov 22 '19 01:11 JohnSully