paho.mqtt.python

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 0: invalid start byte

Open mi3z opened this issue 7 years ago • 7 comments

Hello :)

I'm facing the following problem: after connecting to iot.eclipse.org (198.41.30.241:1883) with paho-mqtt, subscribing to all topics ("#"), and printing them, the program crashes with the following error:

  File "/usr/local/lib/python3.4/dist-packages/paho/mqtt/client.py", line 377, in topic
    return self._topic.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 0: invalid start byte

The problem: of course, I know MQTT requires UTF-8, but in case a topic is not valid UTF-8, there should be error handling around all UTF-8 decoding.

Best Regards mi3z

mi3z avatar Apr 05 '17 16:04 mi3z

According to the spec (http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html: [MQTT-1.5.3-1]) "If a Server or Client receives a Control Packet containing ill-formed UTF-8 it MUST close the Network Connection."

jamesmyatt avatar Apr 10 '17 14:04 jamesmyatt

This error should be within your control. It occurs because you access msg.topic in an on_message callback. Something like:

    def on_message(client, userdata, msg):
        msg.topic  # <--- this causes the error.

You can therefore handle this error with a try/except around the access to msg.topic.
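A minimal sketch of that workaround, assuming a message object whose `topic` property decodes raw bytes the way the traceback above shows. `FakeMessage` and `safe_topic` are hypothetical names used only so the snippet runs without a broker; in a real callback, `msg` would be the object the library passes to `on_message`.

```python
class FakeMessage:
    """Hypothetical stand-in for the message object; mimics the lazy topic decode."""

    def __init__(self, raw_topic, payload=b""):
        self._topic = raw_topic
        self.payload = payload

    @property
    def topic(self):
        # Same decode call that appears in the traceback.
        return self._topic.decode("utf-8")


def safe_topic(msg):
    """Return the decoded topic, or None if the topic is not valid UTF-8."""
    try:
        return msg.topic
    except UnicodeDecodeError:
        return None


def on_message(client, userdata, msg):
    topic = safe_topic(msg)
    if topic is None:
        # Skip the message instead of letting the exception escape the callback.
        return
    print("%s : %s" % (topic, msg.payload))
```

Only the access to `msg.topic` is guarded, so the rest of the callback does not need its own try/except for this error.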

PierreF avatar Apr 10 '17 15:04 PierreF

That's right. But instead of closing the connection, the paho library crashes because of an unhandled exception raised by ".decode", which it passes on to the caller.

I could handle this error with a try/except, but then I would have to wrap all paho calls in try/except, since I can't trust the library not to break my application.

I agree it conforms to the spec. But I would suggest better exception handling inside the library. IMHO it should not even call the on_message() method if the packet is not valid.

Best Regards mi3z

mi3z avatar Apr 10 '17 15:04 mi3z

I agree with mi3z. I just started a project with MQTT, I saw the getting started code and it looked nice. But when I ran that exact code, I also got this UTF-8 decode error. Which is interesting for the 'getting started code' haha, a good first impression. Don't get me wrong, I have no clue how hard it was to write this module (thanks everyone), just trying to give some honest feedback.

Humphreybas avatar Aug 15 '17 14:08 Humphreybas

I agree that the getting-started example should be updated to avoid this error.

As for strictly following the specification (i.e. disconnecting): that would mean the client connects, receives a non-UTF-8 topic, disconnects, then reconnects, and so on. (This is what happens in the getting-started case, because the message comes from a retained topic and/or a periodic $SYS topic. But it could also occur as soon as you use QoS > 0 and clean_session=False.) For this reason I would prefer that users of the library have the choice of what to do with a non-UTF-8 topic.

Maybe a flag should be set before connection or another callback (on_bad_message ?) should be added.
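To illustrate the second idea, here is a rough sketch of how a separate callback for undecodable topics could behave. `dispatch` and `on_bad_message` are hypothetical names for this illustration only; nothing in this thread shows them as part of the library's API.

```python
def dispatch(raw_topic, payload, on_message, on_bad_message=None):
    """Decode the topic and call on_message; route undecodable topics elsewhere.

    Returns True if on_message was called, False otherwise.
    """
    try:
        topic = raw_topic.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Let the application decide what to do with the raw bytes.
        if on_bad_message is not None:
            on_bad_message(raw_topic, payload, exc)
        return False
    on_message(topic, payload)
    return True


# Example use: collect good and bad messages separately.
good, bad = [], []
dispatch(b"sensors/temp", b"21", lambda t, p: good.append((t, p)),
         lambda raw, p, exc: bad.append(raw))
dispatch(b"\xa1broken", b"?", lambda t, p: good.append((t, p)),
         lambda raw, p, exc: bad.append(raw))
```

With this shape, the normal callback never sees a topic it cannot read, and an application that does not care about bad topics can simply leave `on_bad_message` unset.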

PierreF avatar Sep 21 '17 13:09 PierreF

This still isn't fixed today. Running the subscribe_callback.py example produces:

Traceback (most recent call last):
  File "/home/nlunghiadm/tmp/python/paho.mqtt.python/examples/subscribe_callback.py", line 24, in <module>
    subscribe.callback(print_msg, "#", hostname="iot.eclipse.org")
  File "/home/nlunghiadm/tmp/python/paho.mqtt.python/src/paho/mqtt/subscribe.py", line 165, in callback
    client.loop_forever()
  File "/home/nlunghiadm/tmp/python/paho.mqtt.python/src/paho/mqtt/client.py", line 1481, in loop_forever
    rc = self.loop(timeout, max_packets)
  File "/home/nlunghiadm/tmp/python/paho.mqtt.python/src/paho/mqtt/client.py", line 1003, in loop
    rc = self.loop_read(max_packets)
  File "/home/nlunghiadm/tmp/python/paho.mqtt.python/src/paho/mqtt/client.py", line 1284, in loop_read
    rc = self._packet_read()
  File "/home/nlunghiadm/tmp/python/paho.mqtt.python/src/paho/mqtt/client.py", line 1849, in _packet_read
    rc = self._packet_handle()
  File "/home/nlunghiadm/tmp/python/paho.mqtt.python/src/paho/mqtt/client.py", line 2305, in _packet_handle
    return self._handle_publish()
  File "/home/nlunghiadm/tmp/python/paho.mqtt.python/src/paho/mqtt/client.py", line 2500, in _handle_publish
    self._handle_on_message(message)
  File "/home/nlunghiadm/tmp/python/paho.mqtt.python/src/paho/mqtt/client.py", line 2647, in _handle_on_message
    self.on_message(self, self._userdata, message)
  File "/home/nlunghiadm/tmp/python/paho.mqtt.python/src/paho/mqtt/subscribe.py", line 40, in _on_message_callback
    userdata['callback'](client, userdata['userdata'], message)
  File "/home/nlunghiadm/tmp/python/paho.mqtt.python/examples/subscribe_callback.py", line 22, in print_msg
    print("%s : %s" % (message.topic, message.payload))
  File "/home/nlunghiadm/tmp/python/paho.mqtt.python/src/paho/mqtt/client.py", line 360, in topic
    return self._topic.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa1 in position 0: invalid start byte

Process finished with exit code 1

nicola-lunghi avatar Jul 10 '18 12:07 nicola-lunghi

I am using v1.6.1 and am getting a similar crash elsewhere:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.7/threading.py", line 917, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.7/threading.py", line 865, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.7/dist-packages/paho/mqtt/client.py", line 3591, in _thread_main
    self.loop_forever(retry_first_connection=True)
  File "/usr/local/lib/python3.7/dist-packages/paho/mqtt/client.py", line 1756, in loop_forever
    rc = self._loop(timeout)
  File "/usr/local/lib/python3.7/dist-packages/paho/mqtt/client.py", line 1164, in _loop
    rc = self.loop_read()
  File "/usr/local/lib/python3.7/dist-packages/paho/mqtt/client.py", line 1556, in loop_read
    rc = self._packet_read()
  File "/usr/local/lib/python3.7/dist-packages/paho/mqtt/client.py", line 2439, in _packet_read
    rc = self._packet_handle()
  File "/usr/local/lib/python3.7/dist-packages/paho/mqtt/client.py", line 3033, in _packet_handle
    return self._handle_publish()
  File "/usr/local/lib/python3.7/dist-packages/paho/mqtt/client.py", line 3305, in _handle_publish
    props, props_len = message.properties.unpack(packet)
  File "/usr/local/lib/python3.7/dist-packages/paho/mqtt/properties.py", line 429, in unpack
    buffer, attr_type, propslenleft)
  File "/usr/local/lib/python3.7/dist-packages/paho/mqtt/properties.py", line 402, in readProperty
    value1, valuelen1 = readUTF(buffer, propslen - valuelen)
  File "/usr/local/lib/python3.7/dist-packages/paho/mqtt/properties.py", line 70, in readUTF
    buf = buffer[2:2+length].decode("utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xee in position 1: invalid continuation byte

What seems to fix the problem is this change in paho/mqtt/properties.py:

    try:
        buf = buffer[2:2+length].decode("utf-8")
    except UnicodeDecodeError:
        # Report ill-formed UTF-8 as a malformed packet instead of crashing.
        raise MalformedPacket("Cannot decode topic to utf-8")

Can somebody include these changes in the main code?
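For anyone who wants to try the idea outside the library, here is a self-contained sketch of the same pattern. It assumes the standard MQTT string layout (a 2-byte big-endian length followed by UTF-8 bytes). `read_utf` is a simplified illustration, not the library's function, and `MalformedPacket` is defined locally here as a stand-in.

```python
import struct


class MalformedPacket(Exception):
    """Local stand-in exception defined for this sketch."""


def read_utf(buffer):
    """Read a length-prefixed string; return (text, bytes_consumed)."""
    if len(buffer) < 2:
        raise MalformedPacket("Not enough data to read string length")
    (length,) = struct.unpack("!H", bytes(buffer[:2]))
    if length > len(buffer) - 2:
        raise MalformedPacket("Length-prefixed string is truncated")
    try:
        text = bytes(buffer[2:2 + length]).decode("utf-8")
    except UnicodeDecodeError as exc:
        # Turn the decode failure into a protocol-level error the caller can handle.
        raise MalformedPacket("Cannot decode string as UTF-8") from exc
    return text, 2 + length
```

A caller can then catch `MalformedPacket` in one place (for example, to close the connection) rather than having a `UnicodeDecodeError` escape from the network thread.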

ksprs avatar Mar 15 '22 08:03 ksprs