redis-py

Thread safety in Connection Pool - connections growing

Open joshverma opened this issue 1 year ago • 5 comments

Version: 3.5.3

Platform: python:3.9 Docker image

Description:

I'm trying to understand the behavior of ConnectionPool. If I have a thread pool of size 20 each trying to get a redis connection from a connection pool, shouldn't they all end up using the same connection object? Since a lock is used to pop and push to the connection pool, there should either be 0 or 1 connections in use by the connection pool, right?

However, this is not the behavior I am seeing. Sometimes the pool is empty, and at that point new connections are created and added to the pool. This results in around 20 connections (give or take; it varies) being opened in total. The number of open connections also grows slowly over time, which surprises me since a lock is used to pop/push from the pool.

Has anyone experienced this/can they shed some light on what may be happening?

Thanks in advance!

joshverma, Jan 17 '24 23:01

I think I've figured it out. First of all, the lock is released right after a connection is popped from the pool, so at that point other threads can acquire the lock and pop another connection. The same pattern applies when a connection is released back to the pool. I misread the code and was under the impression that the lock was also held for the entire time the redis command was being sent. This explains how several connections can be in use at once.

As for the number of connections slowly growing over time, this is due to the use of thread pools. If each thread in a thread pool asks for a connection, the order and "interleaving" of the threads' operations is effectively random.

An example is a thread pool of size 2:

  • In one scenario, Thread 1 can get a connection, execute the redis command, and release the connection back to the pool, all before Thread 2 has even asked for a connection yet (maybe Thread 1 ran very fast or Thread 2 ran very slow). Thread 2 will then ask for a connection, but since Thread 1 already released its connection back to the pool, that same connection is available, so Thread 2 will simply use that connection. This means that in this scenario only 1 unique connection was created and used.
  • In a second scenario, Thread 1 could acquire a lock, get a connection, then release the lock. While Thread 1 is executing the redis command, Thread 2 now acquires the lock and asks for a connection. There will be no connections available in the pool, since Thread 1 is using the connection, so a second connection is created and used by Thread 2. This means that in this scenario, 2 unique connections were created and used.
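The two scenarios above can be replayed with a small toy pool. This is only a sketch: `ToyPool` is a hypothetical stand-in, not redis-py's actual `ConnectionPool`, and it assumes only what is described above (the lock covers just the pop and the push). The interleavings are replayed step by step on a single thread so the result is deterministic.

```python
import threading


class ToyPool:
    """Simplified model of a lazily growing pool (not redis-py's real code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._available = []  # idle connections waiting to be reused
        self.created = 0      # total connections ever created

    def get_connection(self):
        # The lock guards only the pop/create bookkeeping, not the command.
        with self._lock:
            if self._available:
                return self._available.pop()
            self.created += 1
            return f"conn-{self.created}"

    def release(self, conn):
        # The lock guards only the push back onto the idle list.
        with self._lock:
            self._available.append(conn)


# Scenario 1: Thread 1 finishes before Thread 2 asks for a connection.
pool_a = ToyPool()
c1 = pool_a.get_connection()  # Thread 1 gets a connection
pool_a.release(c1)            # Thread 1 releases it
c2 = pool_a.get_connection()  # Thread 2 reuses that same connection
pool_a.release(c2)

# Scenario 2: Thread 2 asks while Thread 1 still holds its connection.
pool_b = ToyPool()
d1 = pool_b.get_connection()  # Thread 1 gets a connection
d2 = pool_b.get_connection()  # pool is empty, so a second one is created
pool_b.release(d1)
pool_b.release(d2)

print(pool_a.created, pool_b.created)  # 1 2
```

The lock never serializes the work done while a connection is held, so the number of connections created depends only on how many are held at the same moment.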

If we extend this logic to a thread pool of size 20, there is more variability in the order and interleaving of the threads, simply because there are more of them. On the first execution of the code, those 20 threads might get by with only 12 connections, for example. On the second execution, the threads may interleave in such a way that they need 14. Whenever the number of connections needed at the same time exceeds what the pool has created so far, the total number of open connections to redis goes up. There is technically an upper limit in this case (20, one per thread), but for complex applications the upper limit may be difficult to determine.
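One way to picture the slow growth is as a high-water mark. The sketch below assumes the pool keeps idle connections open and never closes them on its own; under that assumption the number of open connections after each run is the running maximum of the peak demand so far. The 12 and 14 are the figures from the example above, and the third run of 9 is made up for illustration.

```python
from itertools import accumulate

# Peak number of connections held at the same time in each run.
# 12 and 14 come from the example above; 9 is a hypothetical third run.
peak_per_run = [12, 14, 9]

# Idle connections are kept, so the total open is the running maximum.
open_after_each_run = list(accumulate(peak_per_run, max))

print(open_after_each_run)  # [12, 14, 14]
```

A quieter run (9 here) does not shrink the total, which is why the count only ever creeps upward until it reaches the number of threads.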

Hopefully this explanation helped someone else in the same boat as me. Or if I am wrong about anything, please feel free to correct me.

joshverma, Jan 18 '24 04:01

If you want to limit the number of connections, you can set the max_connections parameter when initializing the ConnectionPool object.
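To show what such a cap does, here is a toy model rather than redis-py itself. `CappedToyPool` and `TooManyConnections` are hypothetical names for this sketch. As far as I know, redis-py's `ConnectionPool` raises a `ConnectionError` once `max_connections` is reached, while `BlockingConnectionPool` waits for a free connection instead; treat both statements as assumptions and check them against the version you run.

```python
import threading


class TooManyConnections(Exception):
    """Hypothetical error type for this sketch."""


class CappedToyPool:
    """Toy pool with a hard cap (not redis-py's real code)."""

    def __init__(self, max_connections):
        self.max_connections = max_connections
        self._lock = threading.Lock()
        self._available = []  # idle connections
        self.created = 0      # total connections ever created

    def get_connection(self):
        with self._lock:
            if self._available:
                return self._available.pop()
            # Refuse to grow past the cap instead of creating a new one.
            if self.created >= self.max_connections:
                raise TooManyConnections("pool exhausted")
            self.created += 1
            return f"conn-{self.created}"

    def release(self, conn):
        with self._lock:
            self._available.append(conn)


pool = CappedToyPool(max_connections=2)
a = pool.get_connection()
b = pool.get_connection()

try:
    pool.get_connection()  # a third simultaneous request exceeds the cap
    hit_cap = False
except TooManyConnections:
    hit_cap = True

pool.release(a)
c = pool.get_connection()  # works again once a connection is released

print(hit_cap, pool.created)  # True 2
```

With a raising cap, callers need to handle the error (or retry); a blocking pool trades that for possible waiting when all connections are busy.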

James-Leong, Jul 16 '24 07:07

Yeah that is what I ended up doing, thanks @James-Leong!

joshverma, Jul 16 '24 14:07

What if I want some idle connections to be created when the connection pool is initialised?

darshakofficial, Dec 24 '24 11:12

I don't think it is possible to initialize connections automatically. I think that connections are created lazily in a connection pool, so you may need to run a loop that creates n connections after creating the pool.

@darshakofficial
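One detail of such a loop, sketched with a toy pool: if each iteration gets a connection and releases it straight away, the same connection is handed back every time, so only one is ever created. To end up with n idle connections, all n have to be held at once before any is released. `ToyPool` and `prewarm` are hypothetical names; with redis-py the equivalent calls would be the pool's `get_connection` and `release`, whose exact signatures depend on the version.

```python
import threading


class ToyPool:
    """Simplified model of a lazily growing pool (not redis-py's real code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._available = []  # idle connections
        self.created = 0      # total connections ever created

    def get_connection(self):
        with self._lock:
            if self._available:
                return self._available.pop()
            self.created += 1
            return f"conn-{self.created}"

    def release(self, conn):
        with self._lock:
            self._available.append(conn)


def prewarm(pool, n):
    """Hold n connections at once, then hand them all back as idle."""
    held = [pool.get_connection() for _ in range(n)]
    for conn in held:
        pool.release(conn)


# Naive loop: get and release each time, so one connection is reused.
naive = ToyPool()
for _ in range(5):
    naive.release(naive.get_connection())

# Pre-warming: hold all five before releasing any of them.
warm = ToyPool()
prewarm(warm, 5)

print(naive.created, warm.created)  # 1 5
```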

joshverma, Jan 02 '25 16:01

Closing this issue, since the second comment already gives a very good and detailed explanation of how the connection pool currently creates connections. Thanks @joshverma for the explanation!

petyaslavova, Nov 14 '25 17:11