mosquitto icon indicating copy to clipboard operation
mosquitto copied to clipboard

"per_listener_settings true" causes messages for clients to no longer be stored

Open JSchy65 opened this issue 3 years ago • 16 comments

mosquitto broker version 2.0.15 (Oracle Linux 8)

Start mosquitto with the following configuration: sys_interval 1 per_listener_settings true listener 1883 allow_anonymous true

mosquitto_sub -p 1883 -v -t '$SYS/broker/clients/connected' -t '$SYS/broker/clients/total' -t '$SYS/broker/store/messages/count'

what leads to:

$SYS/broker/clients/total 1 $SYS/broker/clients/connected 1 $SYS/broker/store/messages/count 53

mosquitto_sub -p 1883 -q 2 -t test/# -d -c -i clientA

$SYS/broker/clients/connected 2 $SYS/broker/clients/total 2

stop client / ctrl-c / -> disconnect

$SYS/broker/clients/connected 1

mosquitto_pub -p 1883 -q 2 -t test/xxx -m "Testnachricht"

The count of messages should increment by 1 but is not !!!

mosquitto_sub -p 1883 -q 2 -t test/# -d -c -i clientA

Does not return the expected message !!!

Retry with per_listener_settings false

Now it works like expected.

Thank You very much, Regards Joachim Schachermayer

JSchy65 avatar Aug 19 '22 07:08 JSchy65

The issue occurs in versions 2.0.12, 2.0.13, 2.0.14 and 2.0.15. The issue occurs NOT in 1.6.9, 1.6.15, 2.0.0, 2.0.10 and 2.0.11 (others not tested).

JSchy65 avatar Aug 19 '22 12:08 JSchy65

I found the same Bug, for people who rely on the queueing of QOS 1 or 2 Messages this is a Showstopper for updating to the latest versions.

Steps to reproduce:

  • start mosquitto with "per_listener_settings" true
  • .\mosquitto_sub.exe -q 2 -i "testclient" -c -t "#"
  • .\mosquitto_pub.exe -d -q 2 -t "topic1/subtopic1" -m "hello, world" -i "test"
  • Messages arrive at subscriber.
  • Stop the subscriber
  • publish more messages
  • .\mosquitto_pub.exe -d -q 2 -t "topic1/subtopic1" -m "hello, world" -i "test"
  • Start the subscriber again
  • No Messages will be shown

Change the Setting to "false" and repeat

You will get the queued messages!

jh-profilsys-de avatar Aug 25 '22 06:08 jh-profilsys-de

I've found the same issue today 2.0.14 and 2.0.15 mosquitto.conf

persistence true
persistence_location /mosquitto/data
log_type all

per_listener_settings true


listener 8883
acl_file /mosquitto/config/acl

listener 1883
acl_file /mosquitto/config/acl
allow_anonymous true

test script

timeout 1 mosquitto_sub -c -i test -t /test -q 1
mosquitto_pub -t /test -m "$(date)" -q 1
RET=$(timeout 1 mosquitto_sub -c -i test -t /test -q 1)
if [ ! -z "$RET" ]
   then
    echo "Success! $RET"
else
    echo "Failed!"
fi

Rotario avatar Oct 03 '22 16:10 Rotario

To emphasize the importance of this topic, I repeat what jh-profilsys-de already wrote above:

"for people who rely on the queueing of QOS 1 or 2 Messages this is a Showstopper for updating to the latest versions"

I think this problem affects a large group of users. Therefore, in my opinion, its priority should be increased.

JSchy65 avatar Oct 07 '22 08:10 JSchy65

+1

strahlc avatar Feb 06 '23 15:02 strahlc

I found this bug was introduced by the commit 79542158f47ac589b2248ad40102a3c8e545d413

gblach avatar Feb 10 '23 23:02 gblach

related to #2526

pat1 avatar Mar 06 '23 08:03 pat1

revert commit https://github.com/eclipse/mosquitto/commit/79542158f47ac589b2248ad40102a3c8e545d413 solve the problem

diff -ru mosquitto-2.0.15-orig/lib/net_mosq.c mosquitto-2.0.15/lib/net_mosq.c
--- mosquitto-2.0.15-orig/lib/net_mosq.c	2022-08-16 15:34:02.000000000 +0200
+++ mosquitto-2.0.15/lib/net_mosq.c	2023-03-06 09:43:51.515905263 +0100
@@ -250,7 +250,7 @@
 #ifdef WITH_BROKER
 	if(mosq->listener){
 		mosq->listener->client_count--;
-		mosq->listener = NULL;
+		//mosq->listener = NULL;
 	}
 #endif
 

but I do not know other feed back and bugs this can create

pat1 avatar Mar 06 '23 10:03 pat1

+1 took me quite a while to work out this was the cause!

LiamDavey avatar May 16 '23 03:05 LiamDavey

Reverting commit https://github.com/eclipse/mosquitto/commit/79542158f47ac589b2248ad40102a3c8e545d413 do not solve the problem; after some time the broker do not send delayed messages and need a restart. After one year any news about this bug ?

pat1 avatar Aug 01 '23 08:08 pat1

I'm also having exactly the same issue. Here is my dockerized setup that reveals the problem, should be easy to reproduce:

docker-compose.yml

version: "3.9"

services:
  broker:
    image: eclipse-mosquitto:${VERSION}
    command: /usr/sbin/mosquitto -v -c /mosquitto.conf
    volumes:
      - ./mosquitto.conf:/mosquitto.conf

  clients:
    image: eclipse-mosquitto:${VERSION}
    command: >
        /bin/sh -c "
          /usr/bin/mosquitto_sub -h broker -t TEST -i SUB -q 2 -c -d -v -W 5 ;
          /usr/bin/mosquitto_pub -h broker -t TEST -m HELLO -q 2 -i PUB -d ;
          /usr/bin/mosquitto_sub -h broker -t TEST -i SUB -q 2 -c -d -v -W 5
        "
    depends_on:
      - broker

mosquitto.conf (variant 1, fails with mosquitto >= 2.0.12)

per_listener_settings true
listener 1883
allow_anonymous true

mosquitto.conf (variant 2, works fine in all versions)

per_listener_settings false
listener 1883
allow_anonymous true

Run with different broker versions, using either variant 1 or 2 of the config:

VERSION=2.0.10 docker compose up --force-recreate --abort-on-container-exit
VERSION=2.0.11 docker compose up --force-recreate --abort-on-container-exit
VERSION=2.0.12 docker compose up --force-recreate --abort-on-container-exit
VERSION=2.0.13 docker compose up --force-recreate --abort-on-container-exit
VERSION=2.0.14 docker compose up --force-recreate --abort-on-container-exit
VERSION=2.0.15 docker compose up --force-recreate --abort-on-container-exit
VERSION=2.0.16 docker compose up --force-recreate --abort-on-container-exit
VERSION=2.0.17 docker compose up --force-recreate --abort-on-container-exit
VERSION=2.0.18 docker compose up --force-recreate --abort-on-container-exit

Expected output is something like this:

mosquitto_session-broker-1   | 1695734381: mosquitto version 2.0.11 starting
mosquitto_session-broker-1   | 1695734381: Config loaded from /mosquitto.conf.
mosquitto_session-broker-1   | 1695734381: Opening ipv4 listen socket on port 1883.
mosquitto_session-broker-1   | 1695734381: Opening ipv6 listen socket on port 1883.
mosquitto_session-broker-1   | 1695734381: mosquitto version 2.0.11 running
mosquitto_session-clients-1  | Client SUB sending CONNECT
mosquitto_session-broker-1   | 1695734381: New connection from 172.18.0.3:47330 on port 1883.
mosquitto_session-broker-1   | 1695734381: New client connected from 172.18.0.3:47330 as SUB (p2, c0, k60).
mosquitto_session-broker-1   | 1695734381: No will message specified.
mosquitto_session-broker-1   | 1695734381: Sending CONNACK to SUB (0, 0)
mosquitto_session-broker-1   | 1695734381: Received SUBSCRIBE from SUB
mosquitto_session-broker-1   | 1695734381:      TEST (QoS 2)
mosquitto_session-broker-1   | 1695734381: SUB 2 TEST
mosquitto_session-broker-1   | 1695734381: Sending SUBACK to SUB
mosquitto_session-broker-1   | 1695734386: Received DISCONNECT from SUB
mosquitto_session-broker-1   | 1695734386: Client SUB disconnected.
mosquitto_session-clients-1  | Timed out
mosquitto_session-clients-1  | Client SUB received CONNACK (0)
mosquitto_session-clients-1  | Client SUB sending SUBSCRIBE (Mid: 1, Topic: TEST, QoS: 2, Options: 0x00)
mosquitto_session-clients-1  | Client SUB received SUBACK
mosquitto_session-clients-1  | Subscribed (mid: 1): 2
mosquitto_session-clients-1  | Client SUB sending DISCONNECT
mosquitto_session-clients-1  | Client PUB sending CONNECT
mosquitto_session-broker-1   | 1695734386: New connection from 172.18.0.3:47332 on port 1883.
mosquitto_session-broker-1   | 1695734386: New client connected from 172.18.0.3:47332 as PUB (p2, c1, k60).
mosquitto_session-broker-1   | 1695734386: No will message specified.
mosquitto_session-broker-1   | 1695734386: Sending CONNACK to PUB (0, 0)
mosquitto_session-broker-1   | 1695734386: Received PUBLISH from PUB (d0, q2, r0, m1, 'TEST', ... (5 bytes))
mosquitto_session-broker-1   | 1695734386: Sending PUBREC to PUB (m1, rc0)
mosquitto_session-broker-1   | 1695734386: Received PUBREL from PUB (Mid: 1)
mosquitto_session-broker-1   | 1695734386: Sending PUBCOMP to PUB (m1)
mosquitto_session-broker-1   | 1695734386: Received DISCONNECT from PUB
mosquitto_session-clients-1  | Client PUB received CONNACK (0)
mosquitto_session-broker-1   | 1695734386: Client PUB disconnected.
mosquitto_session-clients-1  | Client PUB sending PUBLISH (d0, q2, r0, m1, 'TEST', ... (5 bytes))
mosquitto_session-clients-1  | Client PUB received PUBREC (Mid: 1)
mosquitto_session-clients-1  | Client PUB sending PUBREL (m1)
mosquitto_session-clients-1  | Client PUB received PUBCOMP (Mid: 1, RC:0)
mosquitto_session-clients-1  | Client PUB sending DISCONNECT
mosquitto_session-clients-1  | Client SUB sending CONNECT
mosquitto_session-broker-1   | 1695734386: New connection from 172.18.0.3:47334 on port 1883.
mosquitto_session-broker-1   | 1695734386: New client connected from 172.18.0.3:47334 as SUB (p2, c0, k60).
mosquitto_session-broker-1   | 1695734386: No will message specified.
mosquitto_session-broker-1   | 1695734386: Sending CONNACK to SUB (1, 0)
mosquitto_session-broker-1   | 1695734386: Sending PUBLISH to SUB (d0, q2, r0, m1, 'TEST', ... (5 bytes))
mosquitto_session-clients-1  | Client SUB received CONNACK (0)
mosquitto_session-broker-1   | 1695734386: Received SUBSCRIBE from SUB
mosquitto_session-clients-1  | Client SUB sending SUBSCRIBE (Mid: 1, Topic: TEST, QoS: 2, Options: 0x00)
mosquitto_session-clients-1  | Client SUB received PUBLISH (d0, q2, r0, m1, 'TEST', ... (5 bytes))
mosquitto_session-clients-1  | Client SUB sending PUBREC (m1, rc0)
mosquitto_session-clients-1  | Client SUB received SUBACK
mosquitto_session-clients-1  | Subscribed (mid: 1): 2
mosquitto_session-broker-1   | 1695734386:      TEST (QoS 2)
mosquitto_session-broker-1   | 1695734386: SUB 2 TEST
mosquitto_session-broker-1   | 1695734386: Sending SUBACK to SUB
mosquitto_session-broker-1   | 1695734386: Received PUBREC from SUB (Mid: 1)
mosquitto_session-broker-1   | 1695734386: Sending PUBREL to SUB (m1)
mosquitto_session-broker-1   | 1695734386: Received PUBCOMP from SUB (Mid: 1, RC:0)
mosquitto_session-clients-1  | Client SUB received PUBREL (Mid: 1)
mosquitto_session-clients-1  | Client SUB sending PUBCOMP (m1)
mosquitto_session-clients-1  | TEST HELLO                         <=============== HELLO received here
mosquitto_session-clients-1  | Timed out
mosquitto_session-broker-1   | 1695734391: Received DISCONNECT from SUB
mosquitto_session-clients-1  | Client SUB sending DISCONNECT
mosquitto_session-broker-1   | 1695734391: Client SUB disconnected.

Summary of my results ("OK" - HELLO received, "NOT OK" - HELLO not received on reconnect):

Version Variant 1 Variant 2
2.0.10 OK OK
2.0.11 OK OK
2.0.12 NOT OK OK
2.0.13 NOT OK OK
2.0.14 NOT OK OK
2.0.15 NOT OK OK
2.0.16 NOT OK OK
2.0.17 NOT OK OK
2.0.18 NOT OK OK

Jasihien avatar Sep 26 '23 14:09 Jasihien

Anyone knows if there is any plans or progress on this bug? I would think having the option of queuing messages while at the same time having several differently configured listeners would be a major feature for many users.

torrunes avatar Mar 05 '24 20:03 torrunes

Anyone knows if there is any plans or progress on this bug? I would think having the option of queuing messages while at the same time having several differently configured listeners would be a major feature for many users.

We also just encountered this in our project. Wasn't easy to find out that the issue of QoS 1 messages not being delivered properly was this (from my pov) unrelated config setting.

xammmue avatar Mar 14 '24 07:03 xammmue

After also running into this issue, I managed to track down the problem commit which came in 2.0.12: https://github.com/eclipse/mosquitto/commit/79542158f47ac589b2248ad40102a3c8e545d413

This commit sets mosq->listener = NULL upon disconnect, however mosquitto_acl_check() (called by subs__send() for new messages) requires the listener to still exist if per_listener_settings is true:

if(db.config->per_listener_settings){
        if(context->listener){
            opts = &context->listener->security_options;
        }else{
            return MOSQ_ERR_ACL_DENIED;
        }

I'm not sure how mosq->listener = NULL fixes the original issue of Fix max_connections option not being correctly counted. which that commit intended.

@ralight

llamaonaskateboard avatar Jul 02 '24 12:07 llamaonaskateboard

It looks like the original max_connections issue stems from connect__on_authorised() calling do_disconnect() which in turn calls context__disconnect() if the context already exists upon client connection.

context__disconnect() ends up with net__socket_close() getting called twice in this scenario (decrementing the client_count each time), once directly (mosq_cs_duplicate) and then a second time as part of context__add_to_disused()/context__freedisused()/context__cleanup() (mosq_cs_disused) .

The result without mosq->listener = NULL or some other mitigation is that listener->client_count can end up as a negative number after the multiple decrements.

Another possible solution could be as below? If I'm following this all correctly, the state only ends up being mosq_cs_duplicate or mosq_cs_disused for persisted or existing clients upon reconnect.

if(mosq->listener && mosq->state != mosq_cs_duplicate && mosq->state != mosq_cs_disused){
	mosq->listener->client_count--;
}

llamaonaskateboard avatar Jul 04 '24 10:07 llamaonaskateboard