paho.mqtt.python
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2483)
My Env is:
Python 3.9.7 (default, Sep 16 2021, 08:50:36)
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
paho-mqtt==1.6.1
I am trying to build an application that receives events/messages from multiple sources at a fairly high rate (2000 - 3000 msgs/sec). My application does some data massaging and publishes the result to ActiveMQ. With a few messages everything works well, but with a few hundred messages things start falling apart and I get the following error messages:
File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/ssl.py", line 1173, in send
File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/threading.py", line 973, in _bootstrap_inner
return self.loop_write()
File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/site-packages/paho/mqtt/client.py", line 1577, in loop_write
Exception in thread ws_msg_handler:
Traceback (most recent call last):
File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/threading.py", line 973, in _bootstrap_inner
return self._sslobj.write(data)
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2483)
rc = self._packet_write()
File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/site-packages/paho/mqtt/client.py", line 649, in _sock_send
rc = self._packet_write()
self.run()
mqtt_client.publish_message(get_topic_name(kyra_msg), orjson.dumps(kyra_msg))
File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/site-packages/paho/mqtt/client.py", line 2464, in _packet_write
File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/site-packages/paho/mqtt/client.py", line 2464, in _packet_write
return self._sock.send(buf)
File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/ssl.py", line 1173, in send
self.run()
File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/threading.py", line 910, in run
write_length = self._sock_send(
File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/site-packages/paho/mqtt/client.py", line 649, in _sock_send
File "/Users/ankurpandey/Documents/projects/guardhat/ss-intl/integration/./libs/kyra_mqtt_client.py", line 46, in publish_message
File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/threading.py", line 910, in run
write_length = self._sock_send(
File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/site-packages/paho/mqtt/client.py", line 649, in _sock_send
self._target(*self._args, **self._kwargs)
File "/Users/ankurpandey/Documents/projects/guardhat/ss-intl/integration/./libs/slate_safety/ss_message_handler.py", line 23, in process_ss_message
The code that is sending the data to ActiveMQ is as follows:
def on_message(self, wsapp, message):
    """
    Handles a message from the WebSocket.
    """
    payload = orjson.loads(message)
    logger.debug("message received from server: " + str(payload))
    # Each incoming message is handed to its own worker thread.
    threading.Thread(target=process_ss_message,
                     name='ws_msg_handler',
                     kwargs={"message": payload}).start()
AND:
def process_ss_message(message):
    start = time()
    target_msg = message_translater_to_target(message)
    logger.debug("msg::" + str(target_msg))
    # A new client object is constructed for every message.
    mqtt_client = SSMQTTClient({})
    mqtt_client.publish_message(get_topic_name(target_msg), orjson.dumps(target_msg))
    logger.debug(f"Message conversion and processing time: {(time() - start) * 1000} ms")
I did a quick Google search to resolve this, and one of the suggested solutions was to use pyOpenSSL.
So I made a quick change in lib/python3.9/site-packages/paho/mqtt/client.py
as follows:
import collections
import errno
import os
import platform
import select
import socket
ssl = None
try:
    # import ssl
    import urllib3.contrib.pyopenssl as ssl
except ImportError:
    pass
After this change I no longer saw the error, and I was even able to increase the message consumption rate. It may not be a bug, but I am trying to understand the reason behind it and how I can avoid it without changing the library's code. I searched previous issues but was unable to find anything.
One reference that I found was here: https://github.com/psf/requests/issues/3006 However, this is for the requests package.
Did you figure this problem out? I would like some pointers because I am also facing this issue.
I am also highly interested in the solution to this issue.
Me too, I don't understand where it's coming from.
@cartertinney @PaulFaguet
pip install ndg-httpsclient
pip install pyopenssl
pip install pyasn1
This solved my issue. I had been seeing this error, but after installing these packages it did not come back.
It's my understanding that this happens because the client is trying to publish a message and a PUBACK at the same time (from two threads), but I'm not sure of the correct way to avoid/fix this
Any luck? I'm connecting from an AWS EC2 instance to IoT Core.
I got mine to work. If you are using certificates and you don't wait until the connection handshake finishes before you publish, you will get: ssl.ssleoferror: eof occurred in violation of protocol (_ssl.c:2396)
Not sure if this is your issue, but it worked for me.
@ericGTT I think this is the best explanation so far. I had guessed (without looking into the code) that too many handshake requests were being generated, because I was trying to send messages from different threads/async functions. This is certainly not an issue with any particular server, so I am sure Mosquitto, AWS IoT, or any other platform will behave the same way. If we had something like a fully initialized connection pool, this problem might not occur.
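The "initialized once, then shared" idea can be sketched without a full pool. This is only an illustration: `SharedClient` and the `factory` callable are hypothetical names, and the factory is assumed to create the MQTT client, connect it, and wait for the handshake before returning.

```python
import threading


class SharedClient:
    """Creates one client on first use and hands the same instance to every caller."""

    def __init__(self, factory):
        # factory: hypothetical zero-argument callable that returns a
        # fully connected client object.
        self._factory = factory
        self._lock = threading.Lock()
        self._client = None

    def get(self):
        # The lock ensures that only one thread ever runs the factory,
        # so only one connection (and one TLS handshake) is started.
        with self._lock:
            if self._client is None:
                self._client = self._factory()
            return self._client
```

Each handler thread would then call `get()` instead of constructing its own client for every message.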
@matrixbegins I was running the same Python script on my computer at home and on a Windows EC2 instance in AWS. The one at home worked; the one on EC2 didn't. After a lot of googling (without finding much) and inspecting the traffic in Wireshark, I just put a delay ("time.sleep") after the connection call to give the handshake a little time, and it works on both now. Not the most elegant solution, but my script is just a test script. I'm sure there is a way to check the connection status, but I didn't look for that. Good luck.
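A fixed delay can be replaced by waiting on the connect callback. Below is a minimal sketch, assuming a paho-style client that exposes `on_connect` / `on_disconnect` attributes and a `publish(topic, payload)` method, and assuming the connect callback receives a result code of 0 on success (as in paho-mqtt 1.6.x). `ConnectionGate` is a hypothetical helper, not part of the library.

```python
import threading


class ConnectionGate:
    """Blocks publishers until the client's on_connect callback reports success."""

    def __init__(self, client):
        self._client = client
        self._connected = threading.Event()
        # Assumption: the client invokes these attributes as callbacks.
        client.on_connect = self._on_connect
        client.on_disconnect = self._on_disconnect

    def _on_connect(self, client, userdata, flags, rc, *extra):
        # rc == 0 is taken to mean the broker accepted the connection.
        if rc == 0:
            self._connected.set()

    def _on_disconnect(self, client, userdata, rc, *extra):
        # Publishers block again until the next successful connect.
        self._connected.clear()

    def publish(self, topic, payload, timeout=10.0):
        # Wait for the connection instead of sleeping a fixed amount of time.
        if not self._connected.wait(timeout):
            raise TimeoutError("MQTT connection was not established in time")
        return self._client.publish(topic, payload)
```

The gate would need to be created before the client connects and starts its network loop, so that the callback is already in place when the broker's connection acknowledgement arrives.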
I have seen this too, when the interface towards the broker is down and the address is gone.
It seems there are probably two things going on:
- The socket really can get errored because the TCP connection fails, and this should lead to disconnect so the reconnect callbacks can happen.
- It seems like there is a failure to have adequate locking. Clients should not have to sequence.
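Until the locking question is settled, an application can avoid concurrent writes on its own side by funnelling every publish through one thread. This is a rough sketch, assuming only that the client object has a `publish(topic, payload)` method. `SerializedPublisher` is a hypothetical helper, and it does not change any locking inside the library itself.

```python
import queue
import threading


class SerializedPublisher:
    """Funnels publishes from many threads through a single worker thread."""

    def __init__(self, client):
        self._client = client
        self._queue = queue.Queue()
        self._worker = threading.Thread(
            target=self._run, name="mqtt_publisher", daemon=True
        )
        self._worker.start()

    def submit(self, topic, payload):
        # Safe to call from any thread; the message is only queued here.
        self._queue.put((topic, payload))

    def _run(self):
        while True:
            item = self._queue.get()
            if item is None:
                # Sentinel from close(): stop the worker.
                return
            topic, payload = item
            # Only this thread ever calls publish() on the client.
            self._client.publish(topic, payload)

    def close(self):
        # Messages queued before the sentinel are still published.
        self._queue.put(None)
        self._worker.join()
```

Handler threads would call `submit()` instead of publishing directly, which keeps the per-message threads but removes simultaneous application-side publish calls.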
I have opened #750 about mishandling of a network error. I'm declaring this issue to be about the error happening other than when the network causes it.
Flagging this as a bug for now (it's likely that #797 will provide a solution in many cases, but let's see how that works out after the next release).
Traceback (most recent call last):
File "/usr/lib/python3.10/wsgiref/handlers.py", line 138, in run
self.finish_response()
File "/usr/local/lib/python3.10/dist-packages/django/core/servers/basehttp.py", line 173, in finish_response
super().finish_response()
File "/usr/lib/python3.10/wsgiref/handlers.py", line 184, in finish_response
self.write(data)
File "/usr/lib/python3.10/wsgiref/handlers.py", line 293, in write
self._write(data)
File "/usr/lib/python3.10/wsgiref/handlers.py", line 467, in _write
result = self.stdout.write(data)
File "/usr/lib/python3.10/socketserver.py", line 826, in write
self._sock.sendall(b)
File "/usr/lib/python3.10/ssl.py", line 1266, in sendall
v = self.send(byte_view[count:])
File "/usr/lib/python3.10/ssl.py", line 1235, in send
return self._sslobj.write(data)
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2426)
This happened to me as well.
I will go through the solutions above.
Mine is a Django production app running on the dev server.
We have the same problem while using Locust with MQTT. When using TLS, messages are not published if the payload exceeds a certain limit: 2679 bytes seems to work, 319740 does not. After a certain period of time the broker disconnects the client due to a timeout. A solution would be highly appreciated!
The same code without TLS works flawlessly. As far as I can tell, it also works with TLS if the payload stays small (< 1 KB). It seems to be some kind of race condition or low-level socket handling issue.