jedis icon indicating copy to clipboard operation
jedis copied to clipboard

Improve cluster nodes and slots information

Open marcosnils opened this issue 9 years ago • 31 comments

JedisCluster currently exposes nodes information through the getClusterNodes method but this returns very limited information about node information (only host and port). It'd be nice if we could return some sort of ClusterNodeInformation object with all the information of the corresponding node. It's also necessary to extend the current ClusterNodeInformation bean and to add information such as master/slave role.

marcosnils avatar May 17 '15 18:05 marcosnils

How about the following proposal? We create a set of new classes: Cluster Node MasterNode SlaveNode

We add an instance of Cluster in JedisCluster and (assuming we have an instance of JedisCluster called jedis) we can do things like: jedis.cluster.nodes to return a list of Node that are all nodes of the cluster jedis.cluster.masters to return a list of MasterNode that are all master nodes of the cluster jedis.cluster.slaves to return a list of SlaveNode that are all slave nodes of the cluster jedis.cluster.getNodeById() to return a Node with a specific id

On MasterNode we can do (assuming we have an instance of MasterNode called master): master.slaves to get a list of SlaveNode that are slaves of the master

On SlaveNode we can do (assuming we have an instance of SlaveNode called slave): slave.master to get a MasterNode that is the master of that slave

On Node in general we can do things like: node.getPool to get a JedisPool where you can run commands on that specific node.

xetorthio avatar May 20 '15 00:05 xetorthio

I found this proposal very interesting. But can we have more than one cluster?

I'm asking this because I'm thinking that we can add all those data directly in JedisCluster without needing Cluster class.

nykolaslima avatar May 20 '15 12:05 nykolaslima

The idea is that JedisCluster only manages one cluster. So in this context, you cannot have more than one cluster. The reason why I think putting everything under JedisCluster.cluster is to keep everything related to cluster discovery separated and easier to see. If we don't do that, it would be mixed with all the commands.

xetorthio avatar May 20 '15 13:05 xetorthio

That makes sense. To isolate those "cluster info" in it. :+1:

This looks as a good solution for me. @marcosnils @HeartSaVioR what do you guys think?

Just another question related to another issue. When we want to execute a command in a specific node, this method will be directly in JedisCluster and not in Cluster right? (I know that it's another issue, but I believe that they are related) Maybe something like: jedis.get("key", node)

nykolaslima avatar May 20 '15 17:05 nykolaslima

Do we also want to be able to specify a list of nodes, i.e. only look here for the key?

Can a Node be a slave instance. This gets back to the older issue of being able to do readonly reads to a slave and do writes to the master when you're tolerant of masters and slaves not being totally synced at the time of the request. The moved type errors would not occur if it is the wrong slave for the key and would just return null.

Do we need to surface a convenience routine to pick the master to send a command to. Something like:

public Node getMasterForKey(String key) { ... }
public List<Node> getSlavesForKey(String key) { ... }
public List<Node> getMastersForKeys(String... keys) { ... }```

allanwax avatar May 20 '15 18:05 allanwax

Yes. Actually when I thought about my proposal the methods you mentioned were there, but when I wrote it here I forgot to add them. So yes. I think those methods are important to add.

By adding all these methods, I think the cluster discovery API is pretty decent and complete and gives a lot of freedom to users.

xetorthio avatar May 20 '15 19:05 xetorthio

So this proposal allows for things like:

Running commands on the master that owns key "foo"

try (Jedis j = jedisCluster.cluster.getMasterForKey("foo").getPool().getResource()) {
  j.ping();
  ...
}

Running commands on one of the slaves which master owns key "foo"

try (Jedis j = jedisCluster.cluster.getSlavesForKey("foo")[0].getPool().getResource()) {
  j.ping();
  ...
}

etc...

xetorthio avatar May 20 '15 19:05 xetorthio

I think we should also consider getting rid of idea of needing to go to a pool to get a resource to do the operation. That should be hidden. Kind-of like jedisCluster.get(key); some master is chosen and some pool member is used and when I'm done I'm not worried about returning resources.

Something like Node.getJedis().get(key) or Node.getJedis().ping(). This is fuzzy right now. Someone has to handle the resources but it shouldn't need to be explicit.

allanwax avatar May 21 '15 15:05 allanwax

Could be. But there are people that dont use the pool. How to deal with that? Always creating a default pool with one instance? On Thu, May 21, 2015 at 12:26 PM allanwax [email protected] wrote:

I think we should also consider getting rid of idea of needing to go to a pool to get a resource to do the operation. That should be hidden. Kind-of like jedisCluster.get(key); some master is chosen and some pool member is used and when I'm done I'm not worried about returning resources.

Something like Node.getJedis().get(key) or Node.getJedis().ping(). This is fuzzy right now. Someone has to handle the resources but it shouldn't need to be explicit.

— Reply to this email directly or view it on GitHub https://github.com/xetorthio/jedis/issues/988#issuecomment-104318881.

nykolaslima avatar May 21 '15 16:05 nykolaslima

@nykolaslima JedisCluster already has internal Pools for each master nodes. @allanwax We can shorten path but users still need to close instance explicitly (including try-with-resources). I think we should reuse Jedis class's methods in order to keep Jedis simple. It is really important design concept.

HeartSaVioR avatar May 21 '15 22:05 HeartSaVioR

Though we can make new Jedis instance instead of borrowing, but users still need to close resource since users can send some (not one) commands after getting instance. It is same picture for normal Jedis.

HeartSaVioR avatar May 21 '15 22:05 HeartSaVioR

@HeartSaVioR @xetorthio I agree with @allanwax comment about hiding redundant getPool method. I know it's a small detail but the API seems much nicer like this:

try (Jedis j = jedisCluster.cluster.getSlavesForKey("foo")[0].getConnection()) {
  j.ping();
  ...
}

What do you think?

I'd like to hear @christophstrobl, @olivergierke and @thomasdarimont opinions about this

marcosnils avatar May 21 '15 23:05 marcosnils

I agree for abstraction of getting instance since we can make it borrowed from pool, or creating new instance at that time. My only concern is just how to clean up resource. It doesn't make sense it can be implicit .

HeartSaVioR avatar May 22 '15 00:05 HeartSaVioR

@HeartSaVioR try-with-resources will clean it up automatically, otherwise we'll add a javadoc to the getConnection method that resource needs to be closed after using.

marcosnils avatar May 22 '15 00:05 marcosnils

OK, we can document it. :)

Btw, I think it's better to create new instance per each getConnection() cause if we share Pool and users doesn't return resource to pool, JedisCluster may stuck, too.

HeartSaVioR avatar May 22 '15 00:05 HeartSaVioR

Please don't forget we can't force users to use JDK7 or upper unless we release Jedis 3.0.0 AT LEAST NOW. We can make it applied to 2.8.0, too.

HeartSaVioR avatar May 22 '15 00:05 HeartSaVioR

@HeartSaVioR I do also agree that a new connection should be returned to avoid JedisCluster pool possible errors. The good thing about this approach is that if user forgets to close connection and it falls out of scope, the GC will close the socket.

We still need to remark that connection should be gracefully closed

marcosnils avatar May 22 '15 00:05 marcosnils

Sockets need to be explicitly closed or they stay open in the kernel. GC itself won't do it. SEE http://stackoverflow.com/questions/2454550/should-i-close-sockets-from-both-ends


I think the explicit return is a call to close() on the jedis instance.

I know everyone complains about the use of finalize() but as I've said before, there are cases where it is usefull. If the means to do the call is something like Jedis jedis = node.getJedis(); , and I'm not saying that is the way to go, then if people just drop the instance and let it go to the GC, then the GC will call finalize(), where we call close(), and that will either dispose of the connection or return it to the pool.

A secondary issue is that Jedis implements Closeable not AutoCloseable. For a try-with-resources block, the classes within need to implement AutoCloseable

allanwax avatar May 22 '15 15:05 allanwax

@allanwax Closable extends AutoClosable so Jedis implements AutoClosable as well.

marcosnils avatar May 22 '15 15:05 marcosnils

Thanks for the correction. Next time I need to look deeper.

allanwax avatar May 22 '15 16:05 allanwax

So we all agree on those design points? Can we proceed with it?

nykolaslima avatar May 22 '15 18:05 nykolaslima

I totaly agree with hinding the Pool details so that one can call getConnection() without having to worry about the underlying pooling.

Concerning the Node abstraction I think having dedicated types for master/slave nodes makes perfect sense. so that one can have eg.

public MasterNode getMasterNodeForKey(byte [] key);
public List<SlaveNode> getSlavesForKey(byte [] key);
public List<MasterNode> getNodesForKeys(Iterable<byte[]> keys); 

The thing I'm not sure about is if the Nodes should allow access to the connection directly. Would it make sense to keep Nodes as information containers about slots, ids,... while having to ask JedisCluster for a connection to one of the nodes.

MasterNode node = jedisCluster.getMasterNodeForKey(key);
try (Jedis jedis = jedisCluster.getConnection(node)) {
 // ...
}

This would leave connection handling to JedisCluster and the ClusterConnectionHandler and would alos allow somethig linke having a MultiNodeJedis (sorry for the naming) via eg. jedisCluster.getConnection(node1, node2, node3) which potentially could send commands in parallel to all specified nodes.

christophstrobl avatar Jun 02 '15 08:06 christophstrobl

I somewhat disagree about separating MasterNode and SlaveNode, but not violently. I think the abstraction should take care of the detail. I do see a need for specialized operation on masters/slaves and so i do agree there is a need for those classes. For example, there is an old thread requesting that readOnly operations be allowed on slaves without redirection in the cases where the code is tolerant of the master and slave not being in sync.

For normal stuff, it should be sufficient to do Node node = getNodeForKey(key); node.operation(...) and not care what kind of Node was returned.

allanwax avatar Jun 02 '15 15:06 allanwax

What happens if during processing a MasterNode becomes a SlaveNode or vice-verso? If this is an issue, is this more of a case to abstract these into a Node and let the node deal with the change in environment? Just thoughts.

allanwax avatar Jul 30 '15 21:07 allanwax

I add a function to get HostAndPort struct from JedisCluster to delete data from every node using scan and pipeline. And I should judge does this node is master.I think just return HostAndPort is better. Here is how I use these function now.MasterNode and SlaveNode maybe kind of redundancy.

ExecutorService executorService = Executors.newCachedThreadPool();
JedisCluster jedisCluster = ((RedisStoreClient) storeClient).getJedisClient();
Set<HostAndPort> nodes = jedisCluster.getCluserNodesHostAndPort();
final AtomicLong stat = new AtomicLong(1);
final int step = 2000;
Set<HostAndPort> masters = new HashSet<HostAndPort>();

for (HostAndPort node : nodes) {
    Jedis jedis = new Jedis(node.getHost(), node.getPort());
    try {
        String info = jedis.info("Replication");
        if (info.indexOf("role:master") < 0)
            continue;
        masters.add(node);
    } catch (Throwable t) {
        continue;
    }
}
ArrayList<Future> masterTask = new ArrayList<Future>();
for (final HostAndPort node : masters) {
    Future f = executorService.submit(new Runnable() {
        @Override
        public void run() {
            Jedis jedis = new Jedis(node.getHost(), node.getPort());
            ScanParams scanParams = new ScanParams();
            scanParams.match(category + "*");
            scanParams.count(step);
            ScanResult<String> result;
            result = jedis.scan("0", scanParams);
            while (!Thread.currentThread().isInterrupted() && (!result.getStringCursor().equals("0") || result.getResult().size() != 0)) {
                try {
                    List<Response<Long>> responses = new ArrayList<Response<Long>>();
                    Pipeline pipeline = jedis.pipelined();
                    for (String key : result.getResult()) {
                        Response<Long> rl = pipeline.del(key);
                        responses.add(rl);
                    }
                    pipeline.sync();
                    for(Response<Long> r : responses) {
                        stat.addAndGet(r.get());
                    }
                    result = jedis.scan(result.getStringCursor(), scanParams);
                } catch (Throwable t) {
                    continue;
                }
            }
        }
    });
    masterTask.add(f);
}

ghost avatar Jan 28 '16 16:01 ghost

Looks like Jedis going in completely different direction from this task, even that minimal ClusterNodeInformation was deleted from current version. Using pipelining and scans in cluster mode became not easier, but more painful in 2.8.0 Any progress on this scope?

Spikhalskiy avatar Mar 25 '16 11:03 Spikhalskiy

@Spikhalskiy ClusterNodeInformation represents the output of CLUSTER NODES, and @antirez announced output of CLUSTER NODES will be changed from Redis 3.2.

https://twitter.com/antirez/status/691993742606274560 https://www.reddit.com/r/redis/comments/42kxjq/community_help_needed_redis_cluster_32/

Jedis doesn't have any strategies (different versioning, maven profile, and so on) when Redis breaks backward compatibility. (I was having a hard time when Redis broke backward compatibility with new ZADD options... #1067)

Since active maintainers are only two now, we would like to simplify our work, and let other library wraps Jedis and provide high-level operations. One of them is Spring Data Redis.

Anyway, since we can't rely on CLUSTER NODES, we should accomplish same things with CLUSTER SLOTS. I guess we don't have enough time to address this high-level feature of JedisCluster, so I'd rather want amazing contributors to contribute on this and finally become maintainers (collaborators), and you can become one of them! :)

HeartSaVioR avatar Mar 27 '16 05:03 HeartSaVioR

Is this implemented ??

mjmanav4 avatar Sep 17 '19 06:09 mjmanav4

Couple thoughts on this thread.

  1. It does not make sense to have master and slave fields because they are dependent on each other. Having them as separate fields looses that relationship. I'd rather have getClusterSlots() similar to JedisCluster.getClusterNodes() which would have data from CLUSTER SLOTS command:
class ClusterSlot {
  int slotRangeStart;
  int slotRangeEnd;
  Node masterNode;
  List<Node> replicaNodes; // or slaveNodes
}
  1. Better yet, Map<String, JedisPool> getClusterNodes() should be fixed and include data returned by the CLUSTER NODES command, which includes master/slave flag which is what this entire conversation is about. I.e. something like:
List<ClusterNode> getClusterNodes();
class ClusterNode {
  HostAndPort host;
  String nodeId;
  String masterNodeId; // set for replica nodes
  bool master;
  bool slave;
  List<Range> slots;
}
  1. Examples above for getMasterNodeForKey(key); seem wrong to me. Entire purpose of cluster is that you shouldn't need to do this, cluster follows redirect responses. I understand there are valid use cases, but this seems bogus.

maksymkovalenko avatar Oct 15 '19 19:10 maksymkovalenko

@maksymkovalenko I can see a use for getMasterNodeForKey where you want to know where you're writing to and what its slaves are (at least at the moment in time when you ask). Some optimizations in Redis access are to read from the slaves and write to the masters so that in a read heavy application you can spread the reads around among the slaves and get some parallelism.

allanwax avatar Oct 15 '19 19:10 allanwax