Support Lock TTL configuration using `DefaultRedisCacheWriter`

Open artrodkin opened this issue 3 years ago • 2 comments

We ran into the situation that was referenced in DATAREDIS-1052.

We had an exception during the attempt to unlock:

    ExceptionCause="redis.clients.jedis.exceptions.JedisConnectionException: Attempting to read from a broken connection"
    ExceptionStack="org.springframework.data.redis.RedisConnectionFailureException: Attempting to read from a broken connection; nested exception is redis.clients.jedis.exceptions.JedisConnectionException: Attempting to read from a broken connection
        at org.springframework.data.redis.connection.jedis.JedisExceptionConverter.convert(JedisExceptionConverter.java:65)
        at org.springframework.data.redis.connection.jedis.JedisExceptionConverter.convert(JedisExceptionConverter.java:42)
        at org.springframework.data.redis.PassThroughExceptionTranslationStrategy.translate(PassThroughExceptionTranslationStrategy.java:44)
        at org.springframework.data.redis.FallbackExceptionTranslationStrategy.translate(FallbackExceptionTranslationStrategy.java:42)
        at org.springframework.data.redis.connection.jedis.JedisConnection.convertJedisAccessException(JedisConnection.java:135)
        at org.springframework.data.redis.connection.jedis.JedisKeyCommands.del(JedisKeyCommands.java:122)
        at org.springframework.data.redis.connection.DefaultedRedisConnection.del(DefaultedRedisConnection.java:82)
        at org.springframework.data.redis.cache.DefaultRedisCacheWriter.doUnlock(DefaultRedisCacheWriter.java:228)

After this, all threads trying to access the key were stuck waiting:

    java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.springframework.data.redis.cache.DefaultRedisCacheWriter.checkAndPotentiallyWaitUntilUnlocked(DefaultRedisCacheWriter.java:274)
        at org.springframework.data.redis.cache.DefaultRedisCacheWriter.execute(DefaultRedisCacheWriter.java:247)
        at org.springframework.data.redis.cache.DefaultRedisCacheWriter.get(DefaultRedisCacheWriter.java:110)
        at org.springframework.data.redis.cache.RedisCache.lookup(RedisCache.java:88)

And there was no client to release this lock.

We were wondering whether it would be possible to pass an expiration time with the `SET NX` command when the lock is created, so that the system could recover automatically from this situation.
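To illustrate the recovery behavior being asked for, here is a minimal sketch that uses an in-memory stand-in rather than a real Redis connection. `FakeRedis`, `LockTtlDemo`, the manual clock, and the key name `cache~lock` are all hypothetical and only illustrative; none of this is Spring Data Redis code. It only mimics the semantics of `SET key value NX PX ttl`: the lock is granted if the key is absent or has expired.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a Redis server; not Spring Data Redis code. */
class FakeRedis {
    private final Map<String, Long> expiresAtMillis = new HashMap<>();
    private long nowMillis = 0; // manual clock keeps the example deterministic

    /** Mimics SET key value NX PX ttl: succeeds only if the key is absent or expired. */
    boolean setIfAbsent(String key, Duration ttl) {
        Long expiry = expiresAtMillis.get(key);
        if (expiry != null && expiry > nowMillis) {
            return false; // another client still holds the lock
        }
        expiresAtMillis.put(key, nowMillis + ttl.toMillis());
        return true;
    }

    /** Simulates the passage of time on the server. */
    void advance(Duration elapsed) {
        nowMillis += elapsed.toMillis();
    }
}

class LockTtlDemo {
    /** Returns true if a second client can lock after the first one failed to unlock. */
    static boolean recoversAfter(Duration lockTtl, Duration waited) {
        FakeRedis redis = new FakeRedis();
        redis.setIfAbsent("cache~lock", lockTtl); // first client takes the lock
        // The first client's DEL fails here (broken connection), so no unlock happens.
        redis.advance(waited);
        return redis.setIfAbsent("cache~lock", lockTtl); // second client tries to lock
    }
}
```

With a 60-second TTL in this sketch, a second client is still blocked after 10 seconds but can take the lock once the TTL has elapsed, without anyone having deleted the key.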

Thanks, Art

artrodkin avatar Apr 11 '22 22:04 artrodkin

The issue is caused by the lock entry not being cleaned up. In this case, the cleanup failed because the connection became unusable. There are a couple of ways out here:

  1. Setting a TTL
  2. Retrying the cleanup

While we could retry the cleanup if the connection is disconnected, the retry could still fail if the Redis server went down.

I think that we should extend our `RedisCacheWriter` creation to allow specifying a lock TTL. We already have a few configuration properties, such as `BatchStrategy` and `SleepTimeout`, that we could pull together into another configuration object, adding the lock TTL as a third configuration option so that different caches can use different lock timeouts.
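As a rough sketch of what such a configuration object might look like, the immutable holder below groups a sleep time and a lock TTL. The class name `CacheWriterSettings`, its methods, and the convention that a zero TTL means "no expiration" are assumptions made for illustration; this is not an existing Spring Data Redis type, and a real version would presumably also carry the `BatchStrategy`.

```java
import java.time.Duration;
import java.util.Objects;

/** Hypothetical settings holder; names are illustrative, not the Spring Data Redis API. */
final class CacheWriterSettings {
    private final Duration sleepTime; // wait between lock checks
    private final Duration lockTtl;   // assumption: zero means the lock never expires

    private CacheWriterSettings(Duration sleepTime, Duration lockTtl) {
        this.sleepTime = Objects.requireNonNull(sleepTime, "sleepTime");
        this.lockTtl = Objects.requireNonNull(lockTtl, "lockTtl");
    }

    /** Defaults chosen for this sketch only: 50 ms sleep time, no lock expiration. */
    static CacheWriterSettings defaults() {
        return new CacheWriterSettings(Duration.ofMillis(50), Duration.ZERO);
    }

    /** Returns a copy with a different sleep time; the original is unchanged. */
    CacheWriterSettings withSleepTime(Duration newSleepTime) {
        return new CacheWriterSettings(newSleepTime, lockTtl);
    }

    /** Returns a copy with a different lock TTL; the original is unchanged. */
    CacheWriterSettings withLockTtl(Duration newLockTtl) {
        return new CacheWriterSettings(sleepTime, newLockTtl);
    }

    Duration sleepTime() {
        return sleepTime;
    }

    Duration lockTtl() {
        return lockTtl;
    }
}
```

Because each `with...` call returns a new instance, two caches could share the defaults and differ only in lock TTL, for example `CacheWriterSettings.defaults().withLockTtl(Duration.ofSeconds(30))` for one and a longer value for another.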

mp911de avatar Apr 20 '22 14:04 mp911de

We require a revision of our configuration support for `DefaultRedisCacheWriter`. The number of supported features is growing, and adding another argument to the factory methods would harm their usability.

Likely, a configuration abstraction can help evolve the default implementation.

mp911de avatar Oct 11 '22 13:10 mp911de