Adafruit_CircuitPython_MiniMQTT icon indicating copy to clipboard operation
Adafruit_CircuitPython_MiniMQTT copied to clipboard

honor recv_timeout in _sock_exact_recv() and ping()

Open vladak opened this issue 1 year ago • 6 comments

This logical change uses recv_timeout as receive timeout in _sock_exact_recv().

Casually tested with CPython using the test case supplied in the bug, more testing will follow.

vladak avatar Nov 26 '23 22:11 vladak

Added a simple test case.

vladak avatar Jan 20 '24 19:01 vladak

Tested on QtPy running Adafruit CircuitPython 8.2.6 on 2023-09-12; Adafruit QT Py ESP32-S3 no psram with ESP32S3 with the code from the issue adapted for CP:

#!/usr/bin/env python3

import ssl
import time
import wifi
import socketpool

import adafruit_minimqtt.adafruit_minimqtt as MQTT

import adafruit_logging as logging

from secrets import secrets


def main():
    logger = logging.getLogger(__name__)
    logger.setLevel(logging.DEBUG)

    logger.info(f"Connecting to wifi {secrets['SSID']}")
    wifi.radio.connect(secrets["SSID"], secrets["password"], timeout=20)
    logger.info(f"Connected to {secrets['SSID']}")
    logger.debug(f"IP: {wifi.radio.ipv4_address}")

    pool = socketpool.SocketPool(wifi.radio)

    keep_alive_timeout = 16
    host = "172.40.0.3"
    port = 1883
    mqtt_client = MQTT.MQTT(
        broker=host,
        port=port,
	socket_pool=pool,
        ssl_context=ssl.create_default_context(),
        connect_retries=1,
        recv_timeout=5,
        keep_alive=keep_alive_timeout,
    )

    mqtt_client.logger = logger

    logger.debug(f"connecting with keep alive = {keep_alive_timeout}")
    mqtt_client.connect()

    delay = 2 * keep_alive_timeout
    logger.debug(f"sleeping for {delay} seconds")
    time.sleep(delay)
    logger.debug("pinging the server")
    try:
        mqtt_client.ping()
    except MQTT.MMQTTException as e:
        logger.error(f"ping failed: {e}")


if __name__ == "__main__":
    main()

the output was as follows:

1311.760: INFO - Connecting to wifi XXX
1311.761: INFO - Connected to XXX
1311.763: DEBUG - IP: 172.40.0.23
1311.766: DEBUG - connecting with keep alive = 16
1311.767: DEBUG - Attempting to connect to MQTT broker (attempt #0)
1311.768: DEBUG - Attempting to establish MQTT connection...
1311.770: INFO - Establishing an INSECURE connection to 172.40.0.3:1883
1311.775: DEBUG - Sending CONNECT to broker...
1311.777: DEBUG - Fixed Header: bytearray(b'\x10\x14')
1311.778: DEBUG - Variable Header: bytearray(b'\x00\x04MQTT\x04\x02\x00\x10')
1311.782: DEBUG - Receiving CONNACK packet from broker
1311.785: DEBUG - Got message type: 0x20 pkt: 0x20
1311.787: DEBUG - Resetting reconnect backoff
1311.789: DEBUG - sleeping for 32 seconds
1343.790: DEBUG - pinging the server
1343.791: DEBUG - Sending PINGREQ
1348.794: ERROR - ping failed: PINGRESP not returned from broker within 5 seconds.

On CP it works differently than with CPython - the timeout from _sock_exact_recv() is propagated via _wait_for_msg() as None into the higher level functions which then check the timeout themselves, in this case this is ping(). The important part is that the time difference between the PINREQ sent and the detection of PINGRESP not received matches the receive timeout, not keep alive timeout.

vladak avatar Jan 20 '24 20:01 vladak

Checking the code after this change for the remaining keep_alive occurrences, they are only in places related to the CONNECT message marshalling and inside the loop() in order to ping the server periodically.

vladak avatar Jan 20 '24 20:01 vladak

The change in ping() probably deserves some comment. The MQTT 3.1.1 spec in section 2.2 says:

If a client does not receive a PINGRESP message within a Keep Alive time period after sending a PINGREQ, it should close the TCP/IP socket connection.

The current code in ping() merely raises the MMQTTException exception without closing the connection, perhaps with the expectation that the caller will close it. Anyhow, I think the receive timeout should trump the keep alive timeout also in this case - the information about not being able to receive data should get to the caller as soon as possible (assuming receive timeout is usually much smaller than the keep alive timeout).

vladak avatar Jan 21 '24 22:01 vladak

@vladak this now has merge conflicts.

dhalbert avatar Mar 27 '24 14:03 dhalbert

@vladak this now has merge conflicts.

rebased

vladak avatar Mar 27 '24 16:03 vladak