
Send queue has a size limit of 65555 when using QoS > 0

Open yschroeder opened this issue 5 years ago • 9 comments

Contrary to the documentation, which states that there is an unlimited queue size by default, the send queue for messages with QoS > 0 is limited to 65555 entries via this if statement: https://github.com/eclipse/paho.mqtt.python/blob/master/src/paho/mqtt/client.py#L1210

I think the code could be changed so that message IDs are assigned later, when the message is actually about to be sent and a message ID is available again.

yschroeder avatar Apr 24 '19 08:04 yschroeder

Any insight from the devs about this issue? A brief look at the code suggests this is true: the queue is actually limited to 65535 messages, since it is not a real queue but a dict indexed by message IDs. It would be great to have some confirmation. I wanted to use paho as my sole message queue, but with this constraint it seems I'd start losing data after about 18 hours of network downtime (sending one sample per second).

fargiolas avatar Dec 19 '20 03:12 fargiolas

I will shamelessly summon @ralight, as I think this issue got lost among the support requests from Python beginners.

As my above link is outdated, I will describe the problem again:

Here the message ID for an outgoing message is generated via _mid_generate(). It assigns the numbers from 1 to 65535 as message IDs in a round-robin style.
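A simplified sketch (not paho's actual code) of that round-robin generation: IDs cycle through 1..65535 and wrap back to 1, because MQTT packet identifiers are 16-bit and 0 is not a valid ID.

```python
# Simplified sketch of round-robin message ID generation as described above.
# This mimics the behavior of paho's _mid_generate(); it is not the real code.

class MidGenerator:
    def __init__(self):
        self._last_mid = 0

    def next_mid(self):
        self._last_mid += 1
        if self._last_mid == 65536:  # IDs are 16-bit; 0 is reserved, so wrap to 1
            self._last_mid = 1
        return self._last_mid
```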

Here the generated message ID is checked against the dictionary that holds outgoing messages. If the message ID is already present, the new message is discarded. This can happen when the network connection is down while 2^16 messages are queued for publishing. This dictionary should really be a queue that allows an arbitrary number of messages.

Here the new message is put into the dictionary if the message ID does not exist there (i.e. the previous message with that ID has already been delivered successfully).

In my opinion, this behavior is a bug when using QoS 1 or QoS 2. I expect all my messages that I put into the MQTT stack to reach the broker eventually.
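The drop behavior described above can be boiled down to this hypothetical sketch (the dict stands in for paho's internal outgoing-messages mapping; names are illustrative, not paho's):

```python
# Hypothetical sketch of the problem: outgoing messages are keyed by their
# 16-bit ID in a dict, so a colliding ID silently discards the new message.

out_messages = {}  # mid -> message; stands in for paho's internal dict

def try_queue(mid, message):
    if mid in out_messages:      # ID still in flight: new message is dropped
        return False
    out_messages[mid] = message  # ID free: message is queued
    return True
```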

yschroeder avatar Jan 29 '21 23:01 yschroeder

I believe this comes from the protocol limiting IDs to 16 bits, but as far as I can tell nothing prevents an implementation from storing messages in an independent local queue and assigning them an ID as soon as one becomes available for reuse.
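A sketch of that idea, under the stated assumption that messages wait in an unbounded local queue and only receive a 16-bit ID once one is free (all names here are illustrative, not paho's):

```python
# Sketch of the suggested fix: keep an unbounded pending queue and assign a
# 16-bit ID lazily, when one becomes free, instead of at publish() time.

from collections import deque

class LazyMidQueue:
    MAX_MID = 65535

    def __init__(self):
        self.pending = deque()   # unbounded: messages waiting for an ID
        self.in_flight = {}      # mid -> message; at most 65535 entries
        self._last_mid = 0

    def publish(self, message):
        self.pending.append(message)   # never dropped
        self._drain()

    def ack(self, mid):
        self.in_flight.pop(mid, None)  # the ID becomes reusable
        self._drain()

    def _free_mid(self):
        # Round-robin search for an unused ID; None if all 65535 are in use.
        for _ in range(self.MAX_MID):
            self._last_mid = self._last_mid % self.MAX_MID + 1
            if self._last_mid not in self.in_flight:
                return self._last_mid
        return None

    def _drain(self):
        # Move pending messages into flight while free IDs exist.
        while self.pending:
            mid = self._free_mid()
            if mid is None:
                return
            self.in_flight[mid] = self.pending.popleft()
```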

In my opinion, this is a quite serious issue both because the doc promises unlimited message queue ~~and because it silently drops messages without throwing an error~~. (EDIT: it indeed does return an error code in message.info)

fargiolas avatar Jan 31 '21 11:01 fargiolas

It is not strictly silent. The publish() call returns a message.info object which contains a return code, but it is rarely checked.
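That check looks roughly like this. The sketch avoids a real broker connection by faking the info object; in real code the object comes from paho's `client.publish()`, whose `rc` field is `MQTT_ERR_SUCCESS` (0) only if the message was accepted.

```python
# Illustrative sketch only: paho's publish() returns an MQTTMessageInfo whose
# rc field should be checked. FakeInfo just mimics that shape for the demo.

MQTT_ERR_SUCCESS = 0  # paho uses 0 for success

class FakeInfo:
    def __init__(self, rc):
        self.rc = rc

def was_accepted(info):
    # The check this thread notes is "rarely" done in practice
    return info.rc == MQTT_ERR_SUCCESS
```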

yschroeder avatar Jan 31 '21 20:01 yschroeder

> It is not strictly silent. The publish() call returns a message.info object which contains a return code, but it is rarely checked.

You're right. I thought the error was being set only if the queue length exceeded _max_queued_messages, but it also checks whether the current ID is already in the queue. Fixed my comment, thanks!

fargiolas avatar Feb 01 '21 07:02 fargiolas

I've created a pull request to mention this limitation in the documentation. It would have saved me a lot of time if this had been documented, at least until the limitation is fixed.

JsBergbau avatar May 01 '21 09:05 JsBergbau

I am not entirely sure how I came up with the value of 65555. Message IDs are 16-bit, so their maximum is of course 65535. And I am pretty sure it is not a typo, as I typed it twice and I know my powers of two...

Is there another in-flight queue or something that can accommodate 20 additional messages?

yschroeder avatar May 19 '21 19:05 yschroeder

To answer my own question:

There are exactly 20 inflight messages according to this line.

So 65555 seems to be indeed correct.
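As a quick sanity check, the arithmetic (taking the in-flight default of 20 from the line referenced above):

```python
# The observed limit is the 16-bit message ID space plus the default
# in-flight window paho used at the time of this thread.
MAX_MESSAGE_IDS = 65535  # 2**16 - 1; ID 0 is reserved by MQTT
DEFAULT_INFLIGHT = 20    # paho's default maximum of in-flight messages

assert MAX_MESSAGE_IDS + DEFAULT_INFLIGHT == 65555
```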

yschroeder avatar May 19 '21 19:05 yschroeder

I'm going to flag this as an enhancement; as the limitation is now documented, I guess it's not a bug! Having said that, there does look to be a related bug here: when nearing 65535 messages in progress you are going to get ID reuse, and there does not seem to be any check for this.

MattBrittan avatar Jan 07 '24 22:01 MattBrittan