paho.mqtt.python

ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2483)

Open matrixbegins opened this issue 3 years ago • 14 comments

My Env is:

Python 3.9.7 (default, Sep 16 2021, 08:50:36) 
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
paho-mqtt==1.6.1

I am trying to build an application that receives events/messages from multiple sources at a fairly high rate (2000 - 3000 msgs/sec). My application does some data massaging and publishes the result to ActiveMQ. With a few messages everything works well; with a few hundred messages, however, things start falling apart and I get the following error messages:

File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/ssl.py", line 1173, in send
  File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    return self.loop_write()
  File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/site-packages/paho/mqtt/client.py", line 1577, in loop_write
Exception in thread ws_msg_handler:
Traceback (most recent call last):
  File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/threading.py", line 973, in _bootstrap_inner

return self._sslobj.write(data)
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2483)
    rc = self._packet_write()
  File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/site-packages/paho/mqtt/client.py", line 649, in _sock_send
    rc = self._packet_write()
    self.run()
    mqtt_client.publish_message(get_topic_name(kyra_msg), orjson.dumps(kyra_msg))
  File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/site-packages/paho/mqtt/client.py", line 2464, in _packet_write
  File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/site-packages/paho/mqtt/client.py", line 2464, in _packet_write
    return self._sock.send(buf)
  File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/ssl.py", line 1173, in send
    self.run()
  File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/threading.py", line 910, in run
    write_length = self._sock_send(
  File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/site-packages/paho/mqtt/client.py", line 649, in _sock_send
  File "/Users/ankurpandey/Documents/projects/guardhat/ss-intl/integration/./libs/kyra_mqtt_client.py", line 46, in publish_message
  File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/threading.py", line 910, in run
    write_length = self._sock_send(
  File "/Users/ankurpandey/opt/anaconda3/envs/slatesafety/lib/python3.9/site-packages/paho/mqtt/client.py", line 649, in _sock_send
    self._target(*self._args, **self._kwargs)
  File "/Users/ankurpandey/Documents/projects/guardhat/ss-intl/integration/./libs/slate_safety/ss_message_handler.py", line 23, in process_ss_message

The code that is sending the data to ActiveMQ is as follows:

def on_message(self, wsapp, message):
    """
    Handles a message from the WebSocket.
    """
    payload = orjson.loads(message)
    logger.debug("message received from server: " + str(payload))

    threading.Thread(target=process_ss_message,
                     name='ws_msg_handler',
                     kwargs={"message": payload}).start()

AND:

def process_ss_message(message):
    start = time()
    target_msg = message_translater_to_target(message)
    logger.debug("msg::" + str(target_msg))

    mqtt_client = SSMQTTClient({})

    mqtt_client.publish_message(get_topic_name(target_msg), orjson.dumps(target_msg))
    logger.debug(f"Message conversion and processing time: {(time() - start) * 1000} ms")

I did a quick Google search to resolve this, and one of the suggested solutions was to use pyOpenSSL. So I made a quick change in lib/python3.9/site-packages/paho/mqtt/client.py as follows:

import collections
import errno
import os
import platform
import select
import socket

ssl = None
try:
    # import ssl
    import urllib3.contrib.pyopenssl as ssl
except ImportError:
    pass

After this I didn't see the error, and I was even able to increase the message consumption rate. It may not be a bug, but I am trying to understand the reason behind it and how I can avoid it without changing the library's code. I searched previous issues but was unable to find anything.

matrixbegins avatar Jan 20 '22 15:01 matrixbegins

One reference that I found was here: https://github.com/psf/requests/issues/3006 However, that is for the requests package.

matrixbegins avatar Jan 20 '22 15:01 matrixbegins

Did you figure this problem out? I would like some pointers because I am also facing this issue.

Sohaib90 avatar Jun 24 '22 12:06 Sohaib90

I am also highly interested in the solution to this issue

cartertinney avatar Jun 27 '22 16:06 cartertinney

Me too, I don't understand where it's coming from.

PaulFaguet avatar Jun 28 '22 08:06 PaulFaguet

@cartertinney @PaulFaguet

pip install ndg-httpsclient
pip install pyopenssl
pip install pyasn1

This helped solve my issue. I was experiencing this issue before but after I installed the packages, I did not face this issue again.

Sohaib90 avatar Jul 04 '22 14:07 Sohaib90

It's my understanding that this happens because the client is trying to publish a message and a PUBACK at the same time (from two threads), but I'm not sure of the correct way to avoid/fix this
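One way to rule out concurrent writes from application threads is to funnel every publish through a single worker thread. The following is a minimal sketch, not a confirmed fix: `SerialPublisher` is a hypothetical helper (not part of paho-mqtt), and it only assumes a client object that exposes `publish(topic, payload)`. It does not serialize against the library's own network thread, so it may not remove the error in every setup.

```python
import queue
import threading


class SerialPublisher:
    """Hypothetical helper: funnels publishes from many threads through one worker."""

    def __init__(self, client, maxsize=10000):
        # `client` is any object with a publish(topic, payload) method.
        self._client = client
        # A bounded queue applies backpressure when producers outrun the broker.
        self._queue = queue.Queue(maxsize=maxsize)
        self._worker = threading.Thread(
            target=self._run, name="mqtt_publisher", daemon=True
        )
        self._worker.start()

    def submit(self, topic, payload):
        # Safe to call from any thread; blocks while the queue is full.
        self._queue.put((topic, payload))

    def close(self):
        # The None sentinel tells the worker to stop after draining earlier items.
        self._queue.put(None)
        self._worker.join()

    def _run(self):
        while True:
            item = self._queue.get()
            if item is None:
                break
            topic, payload = item
            # Only this thread ever calls publish() on the shared client.
            self._client.publish(topic, payload)
```

With this in place, message handler threads would call `submit()` on one shared `SerialPublisher` instead of each creating a client and publishing directly.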

furgoose avatar Jul 11 '22 12:07 furgoose

Any luck? I'm on an AWS EC2 to IOT Core.

ericGTT avatar Jul 12 '22 20:07 ericGTT

I got mine to work. If you are using certificates and you don't wait until the connection handshake finishes before you publish, then you will get: ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2396)

Not sure if this is your issue. But worked for me.

ericGTT avatar Jul 13 '22 21:07 ericGTT

@ericGTT I think this is the best explanation so far. I had guessed (without looking into the code) that too many handshake requests were being generated, since I was trying to send messages from different threads/async functions. This is certainly not an issue with any particular server, so I am sure Mosquitto, AWS IoT, or any other platform will work. If we had something like a completely initialized connection pool, this problem might not occur.

matrixbegins avatar Jul 14 '22 02:07 matrixbegins

@matrixbegins I was running the same Python script on my computer at home and on a Windows EC2 instance in AWS. The one at home worked; the one on EC2 didn't. After a lot of googling without finding much, and after inspecting the traffic in Wireshark, I just put a delay (time.sleep) after the connection to give it a little time for the handshake, and it works on both now. Not the most elegant solution, but my script is just a test script. I'm sure there is a way to check the connection status, but I didn't look for that. Good luck.
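Instead of a fixed sleep, the wait can be tied to the client's connect callback. The sketch below assumes the paho-mqtt 1.x callback signatures, where `on_connect` receives `rc == 0` on success. `ConnectGate` is a hypothetical wrapper name, not a library class, and it works with any object that has `on_connect`, `on_disconnect`, and `publish` attributes.

```python
import threading


class ConnectGate:
    """Hypothetical wrapper: blocks publishers until on_connect reports success."""

    def __init__(self, client):
        self._client = client
        self._connected = threading.Event()
        # Install the callbacks before connect() is called on the client.
        client.on_connect = self._on_connect
        client.on_disconnect = self._on_disconnect

    def _on_connect(self, client, userdata, flags, rc, *extra):
        # Assumption: paho-mqtt 1.x style, rc == 0 means the CONNACK was accepted.
        if rc == 0:
            self._connected.set()

    def _on_disconnect(self, client, userdata, *extra):
        # Publishers wait again until the next successful connect.
        self._connected.clear()

    def publish(self, topic, payload, timeout=10.0):
        # Wait for the connection instead of sleeping for a fixed time.
        if not self._connected.wait(timeout):
            raise TimeoutError(
                "MQTT connection not established within %.1f s" % timeout
            )
        return self._client.publish(topic, payload)
```

With paho-mqtt 1.6.1 the gate would be created after `tls_set()` and before `connect()` and `loop_start()`, so that the callbacks are in place when the handshake completes; the application then publishes through `gate.publish(...)`.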

ericGTT avatar Jul 14 '22 13:07 ericGTT

I have seen this too, when the interface towards the broker is down and the address is gone.

It seems there are probably two things going on:

  • The socket really can get errored because the TCP connection fails, and this should lead to disconnect so the reconnect callbacks can happen.
  • It seems like there is a failure to have adequate locking. Clients should not have to sequence.

I have opened #750 about mishandling of a network error. I'm declaring this issue to be about the error happening other than when the network causes it.
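Until the library handles both cases itself, an application can guard its own publish calls. This is only a sketch under stated assumptions: `guarded_publish` and `_publish_lock` are hypothetical names, the lock serializes application threads only (not the library's network thread), and the error handling relies on `ssl.SSLEOFError` being a subclass of `OSError`, which is true in the Python standard library.

```python
import logging
import ssl
import threading

logger = logging.getLogger(__name__)

# One lock shared by every thread that publishes through the same client.
_publish_lock = threading.Lock()


def guarded_publish(client, topic, payload):
    """Publish under a lock; return False instead of raising on a socket error."""
    with _publish_lock:
        try:
            client.publish(topic, payload)
            return True
        except OSError as exc:
            # ssl.SSLEOFError -> ssl.SSLError -> OSError, so this also covers
            # plain TCP failures such as a reset connection.
            kind = "TLS" if isinstance(exc, ssl.SSLError) else "socket"
            logger.warning("publish to %s failed (%s error): %s", topic, kind, exc)
            return False
```

A caller can treat a `False` result as a signal to retry after the client has reconnected, rather than letting the exception kill the handler thread.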

gdt avatar Oct 01 '23 11:10 gdt

Flagging this as a bug for now (it's likely that #797 will provide a solution in many cases, but let's see how that works out after the next release).

MattBrittan avatar Jan 08 '24 01:01 MattBrittan

Traceback (most recent call last):
  File "/usr/lib/python3.10/wsgiref/handlers.py", line 138, in run
    self.finish_response()
  File "/usr/local/lib/python3.10/dist-packages/django/core/servers/basehttp.py", line 173, in finish_response
    super().finish_response()
  File "/usr/lib/python3.10/wsgiref/handlers.py", line 184, in finish_response
    self.write(data)
  File "/usr/lib/python3.10/wsgiref/handlers.py", line 293, in write
    self._write(data)
  File "/usr/lib/python3.10/wsgiref/handlers.py", line 467, in _write
    result = self.stdout.write(data)
  File "/usr/lib/python3.10/socketserver.py", line 826, in write
    self._sock.sendall(b)
  File "/usr/lib/python3.10/ssl.py", line 1266, in sendall
    v = self.send(byte_view[count:])
  File "/usr/lib/python3.10/ssl.py", line 1235, in send
    return self._sslobj.write(data)
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2426)

This happened to me as well.

I will go through the solutions above.

Mine is a Django production app running on the dev server.

NafiGit avatar Feb 03 '24 12:02 NafiGit

We have the same problem while using Locust with MQTT. When using TLS, messages are not published if the payload exceeds a certain limit: 2679 bytes seems to work, 319740 bytes does not. After a certain time period the broker disconnects the client due to a timeout. A solution would be highly appreciated!

The same code without TLS works flawlessly. As far as I can tell, it also works with TLS if the payload stays small (< 1 KB). It seems to be some kind of race condition or low-level socket handling issue.

Holundermann avatar Mar 19 '24 13:03 Holundermann