
JedisCluster functionLoadReplace throws JedisBroadcastException

Open oscarboher opened this issue 9 months ago • 6 comments

Expected behavior

functionLoadReplace(script) using a JedisCluster should load the function without errors.

The script is a dummy script:

#!lua name=mylib
redis.register_function({
    function_name = 'mylib',
    callback = function() return 0 end,
    description = "hi how are you"
})

Using Jedis 5.2.0

Actual behavior

functionLoadReplace throws a JedisBroadcastException:

Caused by: redis.clients.jedis.exceptions.JedisBroadcastException: A failure occurred while broadcasting the command.
	at redis.clients.jedis.executors.ClusterCommandExecutor.broadcastCommand(ClusterCommandExecutor.java:47) ~[jedis-5.2.0.jar:?]
	at redis.clients.jedis.UnifiedJedis.broadcastCommand(UnifiedJedis.java:312) ~[jedis-5.2.0.jar:?]
	at redis.clients.jedis.UnifiedJedis.checkAndBroadcastCommand(UnifiedJedis.java:325) ~[jedis-5.2.0.jar:?]
	at redis.clients.jedis.UnifiedJedis.functionLoadReplace(UnifiedJedis.java:3552) ~[jedis-5.2.0.jar:?]

When I check the cluster, the function does exist, so I understand that the load succeeded on at least some nodes but the broadcast as a whole failed.

This happens both for functions that write to the database and for functions with the 'no-write' flag, so it does not seem to be caused by trying to upload a writing function to a replica node.

Doing functionLoadReplace on each of the nodes of the cluster using a Jedis resource does work, both for replicas and masters, with or without the no-write flag.

oscarboher avatar Apr 15 '25 15:04 oscarboher

@oscarboher Thanks for the report. It would speed up the investigation if you could provide a reproducible example, so that we are on the same page about how JedisCluster is created and configured.

ggivo avatar Apr 16 '25 10:04 ggivo

Sure, this is the simplest way to test it, using Jedis 5.2.0. We use user/pwd authentication and SSL, but the error is the same without them.

I checked against a redis-cluster installed locally as well as against our environment. The error only happens when the cluster has replicas; with only 3 masters, the JedisBroadcastException does not appear.

package org.example;

import redis.clients.jedis.ConnectionPoolConfig;
import redis.clients.jedis.DefaultJedisClientConfig;
import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;

import java.time.Duration;
import java.util.Set;

public class Main {
    public static void main(String[] args) {

        String user = "<my-user>";
        String pwd = "<my-pwd>";
        HostAndPort hostAndPort = HostAndPort.from("<my-host>"); // expects "host:port"

        ConnectionPoolConfig poolConfig = new ConnectionPoolConfig();
        poolConfig.setMaxTotal(50);
        poolConfig.setMaxWait(Duration.ofMillis(2000));
        poolConfig.setBlockWhenExhausted(true);


        poolConfig.setMaxIdle(10);
        poolConfig.setMinIdle(5);
        poolConfig.setTimeBetweenEvictionRuns(Duration.ofMillis(60_000));
        poolConfig.setSoftMinEvictableIdleDuration(Duration.ofMillis(60_000));

        String script = """
            #!lua name=mylib
            redis.register_function({
                function_name = 'mylib',
                callback = function() return 0 end,
                description = "hi how are you"
            })
            """;

        DefaultJedisClientConfig clientConfig =
                DefaultJedisClientConfig.builder()
                        .user(user)
                        .password(pwd)
                        .ssl(true)
                        .timeoutMillis(10000)
                        .build();
        try(var client = new JedisCluster(Set.of(hostAndPort), clientConfig, poolConfig)){
            var testClient = client.set("testKey", "testValue");
            System.out.println("Test client: " + testClient);
            var functionName = client.functionLoadReplace(script);
            System.out.println("Function loaded: " + functionName);
        }

    }
}

This results in the error:

Exception in thread "main" redis.clients.jedis.exceptions.JedisBroadcastException: A failure occurred while broadcasting the command.
	at redis.clients.jedis.executors.ClusterCommandExecutor.broadcastCommand(ClusterCommandExecutor.java:47)
	at redis.clients.jedis.UnifiedJedis.broadcastCommand(UnifiedJedis.java:312)
	at redis.clients.jedis.UnifiedJedis.checkAndBroadcastCommand(UnifiedJedis.java:325)
	at redis.clients.jedis.UnifiedJedis.functionLoadReplace(UnifiedJedis.java:3552)
	at org.example.Main.main(Main.java:48)

The client.set call is only there to validate that we can reach the cluster, and it returns correctly: Test client: OK

oscarboher avatar Apr 16 '25 13:04 oscarboher

Thanks! Just confirmed the error is reproducible with the provided example code.

ggivo avatar Apr 16 '25 14:04 ggivo

The actual cause of the error can be checked by printing the replies from individual nodes, which are stored in the JedisBroadcastException.

      try {
        var functionName = client.functionLoadReplace(script);
        System.out.println("Ok! Function name: " + functionName);
      } catch (JedisBroadcastException e) {
        System.out.println("Error getting testKey: " + e);
        e.getReplies().forEach((node, reply) -> {
          System.out.println("Node: " + node + ", Reply: " + reply);
        });
      }

This prints:

Error getting testKey: redis.clients.jedis.exceptions.JedisBroadcastException: A failure occurred while broadcasting the command.
Node: 127.0.0.1:6384, Reply: redis.clients.jedis.exceptions.JedisDataException: READONLY You can't write against a read only replica.
Node: 127.0.0.1:6383, Reply: redis.clients.jedis.exceptions.JedisDataException: READONLY You can't write against a read only replica.
Node: 127.0.0.1:6382, Reply: redis.clients.jedis.exceptions.JedisDataException: READONLY You can't write against a read only replica.
Node: 127.0.0.1:6381, Reply: mylib
Node: 127.0.0.1:6380, Reply: mylib
Node: 127.0.0.1:6379, Reply: mylib

The root cause of the exception is that we broadcast the command not only to primary nodes, as specified in the "Functions in cluster" documentation, but also to replicas, which reject it with READONLY.

Broadcasting for functionLoadReplace was introduced with https://github.com/redis/jedis/issues/3303
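Until the broadcast is restricted to primaries, one way to tell this situation apart from a real failure is to check whether every failed reply is a READONLY rejection. The following is only a sketch under stated assumptions: the class and method names are hypothetical, the map uses plain String keys instead of the node type returned by e.getReplies(), and RuntimeException stands in for JedisDataException so the example runs without Jedis on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BroadcastReplyCheck {

    /**
     * Returns true when at least one node replied successfully and every
     * failed reply is a READONLY error, i.e. only read-only replicas
     * rejected the command. The map shape (node -> reply or exception) is an
     * assumption modelled on the per-node output printed in this thread.
     */
    public static boolean onlyReplicaFailures(Map<String, Object> replies) {
        boolean sawSuccess = false;
        for (Object reply : replies.values()) {
            if (reply instanceof Throwable) {
                String message = ((Throwable) reply).getMessage();
                if (message == null || !message.startsWith("READONLY")) {
                    return false; // a failure that is not a replica rejecting a write
                }
            } else {
                sawSuccess = true;
            }
        }
        return sawSuccess;
    }

    public static void main(String[] args) {
        // Stand-in data shaped like the replies shown in this thread.
        Map<String, Object> replies = new LinkedHashMap<>();
        replies.put("127.0.0.1:6384",
                new RuntimeException("READONLY You can't write against a read only replica."));
        replies.put("127.0.0.1:6379", "mylib");
        System.out.println("Only replica failures: " + onlyReplicaFailures(replies));
    }
}
```

In real code the same check could be applied to the values of e.getReplies() inside the catch block. Note that it does not prove every primary accepted the function; it only rules out failures other than READONLY.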

It's a bug that needs to be addressed.

@oscarboher as a workaround I can suggest having a dedicated connection to individual primary nodes and executing
client.functionLoadReplace(script); on each of them.
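For the workaround, the addresses of the primaries have to come from somewhere. One option is to parse the text returned by the Redis CLUSTER NODES command and keep only the entries flagged as master. The sketch below does just that part with the standard library; the class name, method name, and sample node IDs are hypothetical, and opening the dedicated Jedis connection to each returned address is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PrimaryNodeFilter {

    /**
     * Extracts "host:port" for every node whose flags contain "master"
     * (and not "fail") from the raw output of CLUSTER NODES.
     */
    public static List<String> primaryAddresses(String clusterNodesOutput) {
        List<String> primaries = new ArrayList<>();
        for (String line : clusterNodesOutput.split("\\r?\\n")) {
            // Fields: <id> <ip:port@cport[,hostname]> <flags> <master-id> ...
            String[] fields = line.trim().split("\\s+");
            if (fields.length < 3) {
                continue; // blank or malformed line
            }
            List<String> flags = Arrays.asList(fields[2].split(","));
            if (!flags.contains("master") || flags.contains("fail")) {
                continue; // replica, or a primary marked as failed
            }
            String address = fields[1];
            int at = address.indexOf('@');
            // Drop the cluster bus port (and optional hostname) after '@'.
            primaries.add(at >= 0 ? address.substring(0, at) : address);
        }
        return primaries;
    }

    public static void main(String[] args) {
        // Hypothetical sample; node IDs are shortened for readability.
        String sample =
                "aaa1 127.0.0.1:6379@16379 myself,master - 0 0 1 connected 0-5460\n"
              + "bbb2 127.0.0.1:6380@16380 master - 0 0 2 connected 5461-10922\n"
              + "ccc3 127.0.0.1:6382@16382 slave aaa1 0 0 1 connected\n";
        for (String address : primaryAddresses(sample)) {
            System.out.println("Primary: " + address);
        }
    }
}
```

Each returned address could then be used to open a dedicated connection and call functionLoadReplace(script) on it, as suggested above.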

ggivo avatar Apr 17 '25 06:04 ggivo

Hey, is there any estimate for the fix? Will it be available in 5.2 or only in 6.0.0? Thank you

oscarboher avatar Apr 17 '25 13:04 oscarboher

Our hands are kind of full right now, and I cannot commit to when we will have free slots to work on it. The plan is to backport the fix to 5.2 once it is ready.

Any contribution is welcome!

ggivo avatar Apr 17 '25 14:04 ggivo

Hi @ggivo,

I just tried to fix the problem. Any feedback would be appreciated.

https://github.com/redis/jedis/pull/4219

Kguswo avatar Jul 31 '25 05:07 Kguswo