redis-py
redis-py copied to clipboard
pubsub.run_in_thread : 'sleep_time' and documentation
The pubsub.run_in_thread
implementation works great for us, very nice work indeed.
We do however have a question about the parameter name 'sleep_time' plus its default value, and about the documentation.
Last week, we had a production issue, which was entirely our fault - not tested thoroughly enough. An example from the documentation had been taken too literally, and our CPU cost estimate was wrong.
Note: where I say 'documentation', it may fall under your control or maybe not. We also collected info from StackOverflow and such, we are not entirely sure where the examples came from ;)
We already solved all issues, but recommend the following:
Change sleep_time
to socket_timeout
. The name sleep_time
was very confusing to us, since it's no sleep at all. It's used as a TCP timeout value in the select() statement, where control is returned to the OS.
It's initial value is 0
. This is very CPU unfriendly. The documentation says 'be nice to the CPU' by setting it to 0.001
. This is a tiny bit more nice than 0
, but turned out to be not nice enough for our scenario. On a live server, the run_in_thread was running in hundreds of processes on one server, each causing 0.5-1.0% CPU, clogging everything. Revert and reboot needed (and alas, an hour of production downtime due to dependencies).
The initial value for sleep_time
(socket_timeout
) should be None
in our opinion. Since this week, we run the pubsub as a daemon thread with sleep_time=None
, and after initialization, this takes 0.0% CPU. It hands over control to the OS. Our hundreds of processes on that live server run fine now.
Caveat: You should not try to use this thread for pubsub connections to more than 1 redis instance. It's better to fire up an extra thread to accomplish that. Also see the python docs on select: https://docs.python.org/2/howto/sockets.html . Excerpt: give it a nice long timeout (say a minute) unless you have good reason to do otherwise
Kind regards, TW
Changing the sleep_time
to a value greater than 0
seems like a good idea. If I recall correctly, if we set sleep_time=None
, we'd never get notified if/when the socket dies. I'll need to test this scenario.
Good point, that might well be the case. I'll test it as well.
Note: a larger value like 60.0
worked fairly good in our scenario as well. 1.0
was already too short, the os didn't park processes as idle anymore.
I agree, the name sleep_time
for that parameter is very confusing.
this is now broken because you are unable to stop the thread
This issue is marked stale. It will be closed in 30 days if it is not updated.
any updates?
Changing the name to socket_timeout
makes sense. We'll do that in 4.0 as it's a backwards incompatible change. I'm also fine with increasing the default value, but I'm not quite sure what the optimal value would be. I believe that defaulting to None
seems like a bad idea. If I remember correctly, some types of socket disconnections won't be raised when no timeout is specified and the thread will hang forever. Perhaps 5 or 10 seconds is a reasonable compromise?
Sounds good, thx.
I think 2 seconds would be a proper initial value. With a docstring that refers to the official recommendation:
See: https://docs.python.org/2/howto/sockets.html . give it a nice long timeout (say a minute) unless you have good reason to do otherwise
2 seconds is slightly too CPU intensive for our special requirements, but we'll set it to None anyway.
Sidenote: There are always alternatives. An improved implementation could still have a socket_timeout of None, but with an event loop and a non-blocking socket and possibly a signal mask. Programs that have noticable CPU usage while idle should be prevented if at all possible. But I know this adds a lot of logic/maintenance to your module, and from experience I know it's very hard to maintain py2 compat, going that route. So changing the default socket_timeout to 2 seconds sounds reasonable to me.
Thanks for the feedback. I'll add this to the redis-py 4.0 TODO list. FYI, 4.0 will be Python 3.5+ only.
This issue is marked stale. It will be closed in 30 days if it is not updated.
Bump
@sav-norem. Wanna?
This issue is marked stale. It will be closed in 30 days if it is not updated.