jedis icon indicating copy to clipboard operation
jedis copied to clipboard

Eviction gives "Could not return the broken resource to the pool" even if all resources are fine

Open steinrr opened this issue 2 years ago • 10 comments

Expected behavior

I have a Spring Boot application where I loop until a key is found in Redis (Azure Redis). Everything works fine, and there are no issues. I run a load test with 1250 requests to the REST interface and all of them succeeds OK. There are no exceptions thrown from either getting resource or getting key - everything looks fine.

But when the all requests has finished, and there are currently no requests, I find these in my logs:

2021-12-01 10:03:19,764 WARN redis.clients.jedis.JedisFactory [commons-pool-evictor] Error while close
redis.clients.jedis.exceptions.JedisException: Could not return the broken resource to the pool
	at redis.clients.jedis.util.Pool.returnBrokenResourceObject(Pool.java:126)
	at redis.clients.jedis.util.Pool.returnBrokenResource(Pool.java:103)
	at redis.clients.jedis.JedisPool.returnResource(JedisPool.java:382)
	at redis.clients.jedis.JedisPool.returnResource(JedisPool.java:15)
	at redis.clients.jedis.Jedis.close(Jedis.java:4067)
	at redis.clients.jedis.JedisFactory.destroyObject(JedisFactory.java:166)
	at org.apache.commons.pool2.PooledObjectFactory.destroyObject(PooledObjectFactory.java:127)
	at org.apache.commons.pool2.impl.GenericObjectPool.destroy(GenericObjectPool.java:611)
	at org.apache.commons.pool2.impl.GenericObjectPool.evict(GenericObjectPool.java:729)
	at org.apache.commons.pool2.impl.BaseGenericObjectPool$Evictor.run(BaseGenericObjectPool.java:160)
	at org.apache.commons.pool2.impl.EvictionTimer$WeakRunner.run(EvictionTimer.java:113)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.lang.IllegalStateException: Invalidated object not currently part of this pool
	at org.apache.commons.pool2.impl.GenericObjectPool.invalidateObject(GenericObjectPool.java:936)
	at org.apache.commons.pool2.impl.GenericObjectPool.invalidateObject(GenericObjectPool.java:914)
	at redis.clients.jedis.util.Pool.returnBrokenResourceObject(Pool.java:124)
	... 16 common frames omitted

I don't understand that I can have a broken resource - everything worked fine and all my Jedis was able to get the keys.

@Configuration class:

	@Bean
	JedisPool jedisPool() {
		JedisPoolConfig poolConfig = buildPoolConfig();
		return new JedisPool(poolConfig, REDISCACHEHOSTNAME, 6379, 10, REDISCACHEKEY);
		
	}
	

	private JedisPoolConfig buildPoolConfig() {
	    final JedisPoolConfig poolConfig = new JedisPoolConfig();
	    poolConfig.setMaxTotal(128);
	    poolConfig.setMaxIdle(128);
	    poolConfig.setMinIdle(16);
	    poolConfig.setTestOnBorrow(true);
	    poolConfig.setTestOnReturn(true);
	    poolConfig.setTestWhileIdle(true);
	    poolConfig.setMinEvictableIdleTimeMillis(Duration.ofSeconds(60).toMillis());
	    poolConfig.setTimeBetweenEvictionRunsMillis(Duration.ofSeconds(30).toMillis());
	    poolConfig.setNumTestsPerEvictionRun(3);
	    poolConfig.setBlockWhenExhausted(true);
	    return poolConfig;
	}

NB! I see that the setMinEvictableIdleTimeMillis and setTimeBetweenEvictionRunsMillis are deprecated - can this be the issue?

@Restcontroller class:

    	Jedis jedis = null;
		
	    try  {
	    	jedis = localJedisPool.getResource();
	    } catch (Exception e) {
		    logger.error("Could not get REDIS instance for session id: " + sessionid);
	    }

	    boolean foundKey = false;
	    int i = 0;

	    String response = null;

	    do {
	        response = jedis.get(sessionid);
	        if (response == null) {
	        	i++;
	    	    logger.info("Failed REDIS key for session id: " + sessionid + ". Retry #" + i);
	        	try {
	        	    Thread.sleep(25);
	        	} catch (InterruptedException ie) {
	        	    Thread.currentThread().interrupt();
	        	}
	        } else {
	        	foundKey = true;
	    	    logger.info("Found REDIS key for session id: " + sessionid + " AFTER retry #" + i);
	        }
	        
	    } 
	    while (foundKey == false && i<20);
	    
        localJedisPool.returnResource(jedis);

NB! The code is "quick and dirty" and I could probably use pub/sub with keyspace notification, but this is a proof-of-concept app and "does the job". It checks for the existence of a key. If not found, it will pause 25 ms and try once more.

Actual behavior

See above stacktrace.

Steps to reproduce:

See above code

Redis / Jedis Configuration

See code.

Jedis version:

3.7.0

Redis version:

Azure Redis cache

Java version:

11

steinrr avatar Dec 01 '21 10:12 steinrr

I did some more testing. I removed those two deprecated config entries, and changed from .returnResource to .close. It still works fine, but now I am seeing these in the logs.

The config now looks like this:

	private JedisPoolConfig buildPoolConfig() {
	    final JedisPoolConfig poolConfig = new JedisPoolConfig();
	    poolConfig.setMaxTotal(400);
	    poolConfig.setMaxIdle(400);
	    poolConfig.setMinIdle(200);
	    poolConfig.setBlockWhenExhausted(true);
	    poolConfig.setTestOnBorrow(true);
	    poolConfig.setTestOnReturn(true);
	    poolConfig.setTestWhileIdle(true);
	    poolConfig.setNumTestsPerEvictionRun(3);
	    return poolConfig;
	}

These are now seen in the logs:

2021-12-01 10:52:03,237 WARN redis.clients.jedis.JedisFactory [commons-pool-evictor] Error while QUIT
redis.clients.jedis.exceptions.JedisConnectionException: Unexpected end of stream.
        at redis.clients.jedis.util.RedisInputStream.ensureFill(RedisInputStream.java:202)
        at redis.clients.jedis.util.RedisInputStream.readByte(RedisInputStream.java:43)
        at redis.clients.jedis.Protocol.process(Protocol.java:162)
        at redis.clients.jedis.Protocol.read(Protocol.java:227)
        at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:352)
        at redis.clients.jedis.Connection.getStatusCodeReply(Connection.java:270)
        at redis.clients.jedis.BinaryJedis.quit(BinaryJedis.java:471)
        at redis.clients.jedis.JedisFactory.destroyObject(JedisFactory.java:160)
        at org.apache.commons.pool2.PooledObjectFactory.destroyObject(PooledObjectFactory.java:127)
        at org.apache.commons.pool2.impl.GenericObjectPool.destroy(GenericObjectPool.java:611)
        at org.apache.commons.pool2.impl.GenericObjectPool.evict(GenericObjectPool.java:729)
        at org.apache.commons.pool2.impl.BaseGenericObjectPool$Evictor.run(BaseGenericObjectPool.java:160)
        at org.apache.commons.pool2.impl.EvictionTimer$WeakRunner.run(EvictionTimer.java:113)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
        at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.base/java.lang.Thread.run(Unknown Source)

I am also seeing these:

2021-12-01 12:24:02,017 WARN redis.clients.jedis.JedisFactory [commons-pool-evictor] Error while QUIT
redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketException: Broken pipe (Write failed)
	at redis.clients.jedis.Connection.flush(Connection.java:342)
	at redis.clients.jedis.Connection.getStatusCodeReply(Connection.java:269)
	at redis.clients.jedis.BinaryJedis.quit(BinaryJedis.java:471)
	at redis.clients.jedis.JedisFactory.destroyObject(JedisFactory.java:160)
	at org.apache.commons.pool2.PooledObjectFactory.destroyObject(PooledObjectFactory.java:127)
	at org.apache.commons.pool2.impl.GenericObjectPool.destroy(GenericObjectPool.java:611)
	at org.apache.commons.pool2.impl.GenericObjectPool.evict(GenericObjectPool.java:729)
	at org.apache.commons.pool2.impl.BaseGenericObjectPool$Evictor.run(BaseGenericObjectPool.java:160)
	at org.apache.commons.pool2.impl.EvictionTimer$WeakRunner.run(EvictionTimer.java:113)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.net.SocketException: Broken pipe (Write failed)
	at java.base/java.net.SocketOutputStream.socketWrite0(Native Method)
	at java.base/java.net.SocketOutputStream.socketWrite(Unknown Source)
	at java.base/java.net.SocketOutputStream.write(Unknown Source)
	at redis.clients.jedis.util.RedisOutputStream.flushBuffer(RedisOutputStream.java:52)
	at redis.clients.jedis.util.RedisOutputStream.flush(RedisOutputStream.java:133)
	at redis.clients.jedis.Connection.flush(Connection.java:339)
	... 14 common frames omitted

steinrr avatar Dec 01 '21 10:12 steinrr

@steinrr try this

    		
try (Jedis jedis = localJedisPool.getResource())  {
        boolean foundKey = false;
        int i = 0;

        String response = null;

        do {
            response = jedis.get(sessionid);
            if (response == null) {
                i++;
                logger.info("Failed REDIS key for session id: " + sessionid + ". Retry #" + i);
                try {
                    Thread.sleep(25);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                }
            } else {
                foundKey = true;
                logger.info("Found REDIS key for session id: " + sessionid + " AFTER retry #" + i);
            }

        }
        while (foundKey == false && i<20);

    } catch (Exception e) {
        logger.error("Could not get REDIS instance for session id: " + sessionid);
    }

you don need to call localJedisPool.returnResource(jedis);, you need call jedis.close after query.

yangbodong22011 avatar Dec 10 '21 07:12 yangbodong22011

@yangbodong22011

I already did rewrite to using close instead of return. BUT the issue is not this part of the code - it works as expected and does as I wish - but if you see the exception log I posted - the error happens in the eviction process.

steinrr avatar Dec 10 '21 08:12 steinrr

I already did rewrite to using close instead of return. BUT the issue is not this part of the code - it works as expected and does as I wish - but if you see the exception log I posted - the error happens in the eviction process.

The potential problem here is that you will return the Jedis of broken to the normal connection pool through localJedisPool.returnResource(jedis), and it should call localJedisPool.returnBrokenResource(jedis). Of course, if you call Jedis.close(), all this will be done automatically.

but if you see the exception log I posted

If the error occurs after you have used jedis.close(), then it is possible: when evict is running, before calling the quit command, the connection has been disconnected from redis, so you will also encounter Broken Pipe error.

If this error may recur stably, maybe you can use wireshark or tcpdump to capture the packets and see what happened.

yangbodong22011 avatar Dec 10 '21 08:12 yangbodong22011

Yes, the error happens after I have used jedis.close() (many minutes after). The app is run in an Azure app service, so it is not easy to debug the TCP traffic here unfortunately. Do you have any experiences with Azure app service/Azure Redis and how quickly it will disconnect clients if not used?

steinrr avatar Dec 10 '21 08:12 steinrr

Do you have any experiences with Azure app service/Azure Redis and how quickly it will disconnect clients if not used?

Unfortunately, I didn't.

yangbodong22011 avatar Dec 10 '21 09:12 yangbodong22011

what value of timeout configration of redis-server ?

zeekling avatar Dec 29 '21 14:12 zeekling

Yes, the error happens after I have used jedis.close() (many minutes after). The app is run in an Azure app service, so it is not easy to debug the TCP traffic here unfortunately. Do you have any experiences with Azure app service/Azure Redis and how quickly it will disconnect clients if not used?

@yangbodong22011 is right, use close instead of returnResource will solve the eviction problem, because close will also set dataSource property of Jedis instance to null in addition to call returnResource. without set dataSource property, eviction will call close again after remove the instance from pool, witch cause the object not currently part of this pool exception.

iamgd67 avatar Nov 23 '22 06:11 iamgd67

I solved this by calling pool.getResource() again before pool.destroy() or pool.close(). In my case, this is triggered only when the application stops. I don't know if this is your case, but it worked for me. I'm using jedis 3.6.3

v-barros avatar Dec 01 '23 21:12 v-barros

@steinrr solved the initial issue.

I think his second comment is happening because some Redis offerings based on plain open source Redis server do not support QUIT command. Such cases should be fixed by #3353 (#3377 for Jedis 3.x which is included in Jedis 3.10.0).

sazzad16 avatar Dec 03 '23 10:12 sazzad16