django-q2
qcluster stuck forever when initial connection failed in ORM broker
When starting qcluster, the following happens:
- `Cluster.start()` starts a new process, calls `Sentinel()` in it, and waits until the Sentinel emits `start_event`.
- `Sentinel()` instantiates the broker through `get_broker()`.
- The broker's `__init__()` calls `get_connection()`.
If the connection attempt fails in the ORM broker, `get_connection()` raises an exception.
As a result, the Sentinel process dies and the main process waits forever for `start_event`.
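One way the main process could avoid waiting forever is to poll the event with a timeout and check whether the child is still alive. The following is only a minimal sketch of that idea, not django-q2 code: `wait_for_start`, its parameters, and the error messages are hypothetical names chosen for illustration. It assumes the child object exposes `is_alive()` and `exitcode` (as `multiprocessing.Process` does) and that the event exposes `wait(timeout)` (as `multiprocessing.Event` does).

```python
import time


def wait_for_start(process, start_event, poll_interval=0.1, timeout=30.0):
    """Wait for start_event, but give up if the child dies or time runs out.

    Hypothetical helper: returns True once the event is set and raises
    RuntimeError if the child exits first or the timeout elapses.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # Event.wait returns True as soon as the event has been set.
        if start_event.wait(poll_interval):
            return True
        # The child died before signalling, so waiting longer cannot succeed.
        if not process.is_alive():
            raise RuntimeError(
                f"child exited with code {process.exitcode} before start_event"
            )
    raise RuntimeError("timed out waiting for start_event")
```

With a check like this, a Sentinel that dies during broker construction would surface as an error in the main process instead of a silent hang.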
There is no indication from the outside (besides the log entries) that the qcluster is permanently non-functional.
The root cause seems to be that `ORM.__init__()`, through `get_connection()`, actually tries to establish a connection. Redis, on the other hand, seems to only set up the client without making any network connection.
In the best case this is unnecessary overhead: `self.connection` is never used, and the constructor is called from the Sentinel process, so the pusher process needs to establish a new connection anyway. In the worst case the above happens.
I think ideally `Broker.connection` should be initialized lazily. That would generally reduce the amount of code that is run in the Sentinel process.
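To illustrate what lazy initialization could look like, here is a rough sketch. It is not the actual django-q2 `Broker` class: `LazyBroker` and the `connect` callable are hypothetical stand-ins, used only to show that the constructor performs no I/O and that the connection is opened on first access.

```python
class LazyBroker:
    """Hypothetical broker that defers connecting until first use."""

    def __init__(self, connect):
        # `connect` is an assumed zero-argument callable that opens a
        # connection; nothing is contacted while constructing the broker.
        self._connect = connect
        self._connection = None

    @property
    def connection(self):
        # Open the connection on first access and reuse it afterwards.
        if self._connection is None:
            self._connection = self._connect()
        return self._connection
```

With this shape, a failing backend would not raise inside the constructor, so the Sentinel could still start and emit `start_event`; the error would only appear in whichever process first touches `connection`.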