paho.mqtt.python
Connection stalled in multithreaded app with 1.6.x branch
Hi,
I have 5 Python 3 processes, each running 2 threads. Each process has one MQTT connection shared by its 2 threads. After a random amount of time, some threads stall inside the wait_for_publish method. My understanding is that you're working on supporting multithreading in the 1.6.x branch, but apparently it is not there yet. Currently, the only workaround that works for me is to avoid wait_for_publish and instead wait in a loop with a timeout. If the timeout is reached, we disconnect and then reconnect. Do you have any update to share, and maybe a release date? Many thanks in advance.
Best regards,
Cyril
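For readers following along, the workaround described above could look roughly like the sketch below. This is not the poster's actual code: the helper name publish_with_deadline and the timeout and poll values are made up for illustration. It only relies on Client.publish(), MQTTMessageInfo.is_published(), Client.disconnect() and Client.reconnect(), and it assumes the network loop is already running (for example via loop_start()).

```python
import time


def publish_with_deadline(client, topic, payload, qos=1, timeout=5.0, poll=0.05):
    """Publish, then poll is_published() until done or the deadline passes.

    Hypothetical helper: returns True if the message was published in time.
    On timeout it cycles the connection and returns False.
    """
    info = client.publish(topic, payload, qos=qos)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if info.is_published():
            return True
        time.sleep(poll)  # avoid a busy loop while waiting
    # Timed out: assume the connection is stuck, so drop it and reconnect.
    client.disconnect()
    client.reconnect()
    return False
```

The caller decides what to do with a False result, for example re-queueing the payload after the reconnect.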
Thanks for testing the 1.6.x branch. There is already multithreaded support in the Python client, although I have certainly been trying to improve it.
The 1.6.x branch adds an optional timeout parameter to wait_for_publish(), so you can call msg.wait_for_publish(3.1) to time out after 3.1 seconds, for example.
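A minimal sketch of how that timeout might be used, assuming a client whose network loop is already running. The helper name publish_and_wait is hypothetical. Because the call returns once the timeout expires whether or not the message went out, the sketch checks is_published() afterwards.

```python
def publish_and_wait(client, topic, payload, qos=1, timeout=3.1):
    """Publish and wait up to `timeout` seconds for completion.

    Hypothetical helper: returns True only if the message was published.
    """
    msg = client.publish(topic, payload, qos=qos)
    # Returns when the message is published or the timeout has elapsed.
    msg.wait_for_publish(timeout)
    return msg.is_published()
```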
I'd be interested in fixing the underlying problem though, of course. Do you have some example code you could share?
I'm planning on a release around the end of September.
Side question: is there some type of is-loop check? It would make sense for the parent process to be able to know whether the 'loop' logic has stalled, halted or similar, no? I am checking is_connected of course, but that could be misleading if the loop thread has failed, right?
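One possible way to get such a check, sketched here purely as an assumption and not as something the library provides: drive the network loop from your own thread by calling client.loop() repeatedly, and record a heartbeat timestamp after each iteration. The class name LoopHeartbeat and its methods are hypothetical. Reconnect handling, which loop_forever() normally takes care of, is left out.

```python
import threading
import time


class LoopHeartbeat:
    """Run client.loop() in our own thread and record when it last iterated."""

    def __init__(self, client, loop_timeout=1.0):
        self._client = client
        self._loop_timeout = loop_timeout
        self._last_beat = time.monotonic()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        while not self._stop.is_set():
            rc = self._client.loop(timeout=self._loop_timeout)
            self._last_beat = time.monotonic()
            if rc != 0:
                # Non-zero means an error (e.g. not connected); back off
                # briefly instead of spinning. Reconnecting is omitted here.
                time.sleep(0.1)

    def is_healthy(self, max_age=5.0):
        """True if the thread is alive and iterated within max_age seconds."""
        age = time.monotonic() - self._last_beat
        return self._thread.is_alive() and age <= max_age
```

A parent could poll is_healthy() alongside is_connected(): the heartbeat only shows that the loop thread is still iterating, not that the broker connection is usable.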
Sorry for the late reply. Here is a link to my GitHub project: https://github.com/SoftwareAG/cumulocity-python-device-onboarding This is a bit specific, as it's about connecting devices to Cumulocity, but it can easily be changed to connect to a standard MQTT broker. It actually uses wait_for_publish, not the workaround I was talking about, and it uses multiprocessing and queues, but that didn't solve my issue when I run multiple instances of my script: if I run 5 instances, 2 end up stalled after some time (this can take hours, but it may happen sooner if you increase the frequency at which data is sent).
Is it possible that a reconnection happens and you are using QoS 0 messages? If so, it's probably fixed by #796.