
reconnect mqtt issue

Open smarta1980 opened this issue 3 years ago • 31 comments

I tried this library (async-mqtt-client, develop branch) and ran into an issue I would like to know how to solve:

1. The WiFi network is available but the internet connection is lost. How do I reconnect to the MQTT broker when the internet comes back?
2. The WiFi network and the internet are both available but the MQTT broker is down. How do I keep trying to connect to the broker?

Is there a setting for the MQTT reconnect interval? With the example FullyFeatured-ESP8266.ino, if WiFi stays available and the internet drops for a second and then returns, the ESP8266 takes a few minutes to reconnect to the MQTT broker. How can I make it reconnect within a few seconds?

smarta1980 avatar Jul 12 '21 11:07 smarta1980

give a try https://github.com/marvinroger/async-mqtt-client/issues/23#issuecomment-305274025

cyber-junkie9 avatar Jul 12 '21 11:07 cyber-junkie9

give a try #23 (comment)

I tested it, and the issue is the same: it takes a few minutes to reconnect to MQTT after the internet is lost and comes back. Note: the slow reconnect happens when WiFi stays available and only the internet is lost. Once the ESP8266 is back on the internet, it does not reconnect to the MQTT broker immediately; it takes a few minutes.

smarta1980 avatar Jul 12 '21 11:07 smarta1980

I see another issue. I publish the message tester from a PC, but when the ESP8266 receives it, it shows up as tester[⸮⸮⸮Q⸮. It should be tester. Why do I get unknown characters?

smarta1980 avatar Jul 12 '21 15:07 smarta1980

Let me guess - you treat the message on the ESP8266 side as a C string, right? If so, you are missing the terminating \0 character, because the message payload doesn't contain it. To get the right message, use the given pointer AND the given length.

Pablo2048 avatar Jul 12 '21 15:07 Pablo2048

I did it like this:

void onMqttMessage(char* topic, char* payload, AsyncMqttClientMessageProperties properties, size_t len, size_t index, size_t total)
{
  String data = payload;
  Serial.println("data");
}

smarta1980 avatar Jul 12 '21 15:07 smarta1980

:D so it is as I expected. The char *payload is not a C string. It is actually a uint8_t *payload, with the length of the message in size_t len (it's a little more complicated because of index, total, and all that stuff, but for small messages you can just use the len parameter). The simplest way is to memcpy() into a temporary char[] and then add the \0 terminator via ...[len] = 0;

Pablo2048 avatar Jul 12 '21 15:07 Pablo2048

That is complicated. Could you please post an example to test?

smarta1980 avatar Jul 12 '21 15:07 smarta1980

void onMqttMessage(char* topic, char* payload, AsyncMqttClientMessageProperties properties, size_t len, size_t index, size_t total)
{
  char dummy[128];

  if (len < sizeof(dummy)) {
    memcpy(dummy, payload, len);
    dummy[len] = 0;
    Serial.println(dummy);
  } else {
    Serial.println(F("Msg too big to fit in buffer!"));
  }
}
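
For messages that arrive in more than one piece, the index and total parameters mentioned above come into play. The following is a minimal sketch, assuming that index is the offset of the current chunk within the full payload, len is the length of that chunk, and total is the length of the full payload. The helper name collectPayload is hypothetical and not part of the library, and std::string is used only to keep the sketch short.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not part of async-mqtt-client): copies one payload
// chunk into `out` at offset `index`. Returns true once the last byte of
// the message has been stored, false while chunks are still missing or
// when the arguments are inconsistent.
bool collectPayload(std::string& out, const char* payload,
                    std::size_t len, std::size_t index, std::size_t total) {
  if (index == 0) {
    out.assign(total, '\0');  // first chunk: size the buffer for the whole message
  }
  if (out.size() != total || index + len > total) {
    return false;  // chunk does not fit the message currently being collected
  }
  if (len > 0) {
    std::memcpy(&out[index], payload, len);  // payload is NOT null-terminated
  }
  return index + len == total;
}
```

Inside onMqttMessage this would be called as collectPayload(buffer, payload, len, index, total), with buffer kept outside the callback, and the message printed only when the call returns true. On a memory-constrained board a fixed-size char array with a bounds check, as in the example above, may be the safer choice.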

Pablo2048 avatar Jul 12 '21 15:07 Pablo2048

It works with your example, thanks. Another point: how do I handle reconnecting to the MQTT broker when the internet is cut and then returns? It takes a few minutes to reconnect to MQTT, which is a long time. How can I reconnect within a few seconds?

smarta1980 avatar Jul 12 '21 15:07 smarta1980

Can you please post your code?

Pablo2048 avatar Jul 12 '21 16:07 Pablo2048

https://github.com/marvinroger/async-mqtt-client/blob/develop/examples/FullyFeatured-ESP8266/FullyFeatured-ESP8266.ino

smarta1980 avatar Jul 12 '21 18:07 smarta1980

That is just an example - instead of calling WiFi.begin(...) every 2 seconds you should use the SDK's internal functions. Try WiFi.setAutoConnect(true); and WiFi.setAutoReconnect(true); - then there is no need to call WiFi.begin() periodically, and the module reconnects by itself as fast as possible.

Pablo2048 avatar Jul 13 '21 04:07 Pablo2048

@Pablo2048 the issue is with connecting to the MQTT broker, not with WiFi. Let me explain again what test I do: the ESP8266 is connected to the WiFi of an access point. I unplug the internet cable of the access point to cut the internet service for a few seconds, then plug the cable back in. With the above code, the ESP8266 then takes 5 minutes to reconnect to the MQTT broker. So even if the internet is cut for just a second and comes back, the ESP8266 does not reconnect to the MQTT broker immediately; it takes 5 minutes.

smarta1980 avatar Jul 13 '21 09:07 smarta1980

I don't think the problem is with MQTT - I'm a heavy user of async-mqtt-client, my software is built the way I've described, and I have never observed such behavior (actually >150 devices in the field).

Pablo2048 avatar Jul 13 '21 15:07 Pablo2048

I've got your mail, but I prefer to respond here to keep the discussion public. Unfortunately I'm not allowed to share the whole code, but I can give fragments and guidance. I wrote a class that wraps async-mqtt-client and handles all connect, reconnect, last will and testament, multiple publish and subscribe, so here are the important parts:

This is onDisconnect method of my MQTTClient class:

void MQTTClient::onMqttDisconnect(AsyncMqttClientDisconnectReason reason)
{
    int retryTimer = 2;

    TRACE(TRACE_ERROR, F("MC: Mqtt disconnected! (%d)"), (int)reason);
    _retriesCount++;
    if (_retriesCount < 5) {
        retryTimer = 2;
    } else if (_retriesCount < 10) {
        retryTimer = 15;
    } else if (_retriesCount < 20) {
        retryTimer = 30;
    } else {
        retryTimer = 60;
    }
    _timer.attach_scheduled(retryTimer, std::bind(&MQTTClient::connect, this));
}

called directly from onDisconnect of async-mqtt-client. The _retriesCount variable is cleared after every successful publish to the broker in onPublish method:

void MQTTClient::onPublish(uint16_t packetId)
{

    _retriesCount = 0;
}

and this is the connect method:

void MQTTClient::connect(void)
{

    if (WiFi.isConnected()) {
        TRACE(TRACE_DEBUG, F("MC: Connecting to %s"), _broker);
        _client.connect();
        _timer.detach();
    }
}

As I already wrote the WiFi relies on autoConnect and autoReconnect from inside the SDK. That's all.
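
The tiered retry schedule from onMqttDisconnect above can also be isolated as a plain function, which makes it easy to check off-device. This is only a sketch of that idea: the function name retryDelaySeconds is hypothetical, and the thresholds and delays are copied from the fragment above.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the tiers in the fragment above:
// returns the delay in seconds before the next connect attempt,
// given the number of disconnects since the last successful publish.
std::uint32_t retryDelaySeconds(std::uint32_t retriesCount) {
  if (retriesCount < 5) {
    return 2;    // first few attempts: retry quickly
  }
  if (retriesCount < 10) {
    return 15;
  }
  if (retriesCount < 20) {
    return 30;
  }
  return 60;     // broker has been unreachable for a while: back off
}
```

With such a helper, the timer call in onMqttDisconnect would take retryDelaySeconds(_retriesCount) as its interval instead of the local retryTimer variable.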

Pablo2048 avatar Jul 14 '21 06:07 Pablo2048

@smarta1980 It all depends on which events get triggered. As long as your device doesn't detect the loss of the TCP connection (which may also be invalidated by loss of WiFi), the device assumes it is still connected. Then MQTT keepalive does its thing.

Otherwise, it is the underlying libs (WiFi, Lwip, Async TCP) that propagate the events to the mqtt lib and your code.

Why does it take 5 minutes in your case? Could be a lot of things. I did a similar test and it reconnected within seconds.

Your modem might wait for a new IP from your provider?

The test I did was by using a local broker on my laptop and putting the laptop in flight mode for a few moments, forcing ungraceful connection loss.

bertmelis avatar Jul 14 '21 06:07 bertmelis

@bertmelis I am testing the example code with the HiveMQ broker. How do I check whether the TCP connection is disconnected?

smarta1980 avatar Jul 14 '21 10:07 smarta1980

how to check if TCP connection disconnected?

Are you talking about HiveMQ or the ESP8266? If it's about HiveMQ, check their documentation about logging possibilities. If it's about the ESP8266, there are some serial debug printouts in the example you are using - record them using the Arduino serial monitor (with timestamps turned on, of course) and post the result here. Otherwise we are unable to help, I'm afraid.

Pablo2048 avatar Jul 14 '21 10:07 Pablo2048

@Pablo2048 I tested example code https://github.com/marvinroger/async-mqtt-client/blob/develop/examples/FullyFeatured-ESP8266/FullyFeatured-ESP8266.ino with HiveMQ Broker broker.hivemq.com:1883

smarta1980 avatar Jul 14 '21 10:07 smarta1980

Do you have any logs from your tests? Otherwise we have nothing to work with.

Pablo2048 avatar Jul 14 '21 10:07 Pablo2048

@Pablo2048
13:45:54.088 -> ⸮⸮|v`llE⸮D>⸮$⸮⸮
13:45:54.181 ->
13:45:54.181 -> Connecting to Wi-Fi...
13:45:55.490 -> Connected to Wi-Fi.
13:45:55.490 -> Connecting to MQTT...
13:45:55.817 -> Connected to MQTT.
13:45:55.817 -> Session present: 0
13:45:55.817 -> Subscribing at QoS 2, packetId: 1
13:45:55.863 -> Publishing at QoS 0
13:45:55.909 -> Publishing at QoS 1, packetId: 2
13:45:55.909 -> Publishing at QoS 2, packetId: 3
13:45:56.048 -> Subscribe acknowledged.
13:45:56.048 -> packetId: 1
13:45:56.048 -> qos: 2
13:45:56.142 -> Publish acknowledged.
13:45:56.142 -> packetId: 2
13:45:56.517 -> Publish acknowledged.
13:45:56.517 -> packetId: 3
13:51:07.812 -> Disconnected from MQTT.
13:51:09.782 -> Connecting to MQTT...
13:51:09.970 -> Connected to MQTT.
13:51:09.970 -> Session present: 0
13:51:10.016 -> Subscribing at QoS 2, packetId: 4
13:51:10.016 -> Publishing at QoS 0
13:51:10.062 -> Publishing at QoS 1, packetId: 5
13:51:10.109 -> Publishing at QoS 2, packetId: 6
13:51:10.156 -> Subscribe acknowledged.
13:51:10.156 -> packetId: 4
13:51:10.203 -> qos: 2
13:51:10.250 -> Publish acknowledged.
13:51:10.250 -> packetId: 5
13:51:10.625 -> Publish acknowledged.
13:51:10.625 -> packetId: 6

I cut the internet cable for just a second and plugged it back in at 13:45:56.517. You can see that the ESP8266 only tries to reconnect to MQTT at 13:51:09.970, after 5 minutes.

smarta1980 avatar Jul 14 '21 10:07 smarta1980

So from the log it seems the 5-minute gap comes not from the connecting part but from the disconnecting part - the gap is between packetId: 3 and the Disconnected message. Anyway, I don't see this message here: https://github.com/marvinroger/async-mqtt-client/blob/2991968a97193aaa6402d146490b93ea671c7e02/examples/FullyFeatured-ESP8266/FullyFeatured-ESP8266.ino#L34 It seems like you have modified the test sketch, haven't you?

Pablo2048 avatar Jul 14 '21 11:07 Pablo2048

@Pablo2048 yes, I made some modifications. Here is the test with the example code as it is, without any modifications.

I am using core 2.7.4

14:10:33.986 -> Connecting to Wi-Fi...
14:10:34.222 -> Connected to Wi-Fi.
14:10:34.222 -> Connecting to MQTT...
14:10:34.409 -> Connected to MQTT.
14:10:34.409 -> Session present: 0
14:10:34.409 -> Subscribing at QoS 2, packetId: 1
14:10:34.409 -> Publishing at QoS 0
14:10:34.409 -> Publishing at QoS 1, packetId: 2
14:10:34.409 -> Publishing at QoS 2, packetId: 3
14:10:34.643 -> Subscribe acknowledged.
14:10:34.643 -> packetId: 1
14:10:34.643 -> qos: 2
14:10:34.643 -> Publish received.
14:10:34.643 -> topic: test/lol
14:10:34.643 -> qos: 2
14:10:34.643 -> dup: 0
14:10:34.643 -> retain: 1
14:10:34.643 -> len: 6
14:10:34.643 -> index: 0
14:10:34.643 -> total: 6
14:10:34.737 -> Publish acknowledged.
14:10:34.737 -> packetId: 2
14:10:35.345 -> Publish acknowledged.
14:10:35.345 -> packetId: 3
14:10:35.855 -> Publish received.
14:10:35.855 -> topic: test/lol
14:10:35.855 -> qos: 1
14:10:35.855 -> dup: 0
14:10:35.855 -> retain: 0
14:10:35.855 -> len: 6
14:10:35.855 -> index: 0
14:10:35.855 -> total: 6
14:10:35.855 -> Publish received.
14:10:35.855 -> topic: test/lol
14:10:35.855 -> qos: 0
14:10:35.855 -> dup: 0
14:10:35.855 -> retain: 0
14:10:35.855 -> len: 6
14:10:35.855 -> index: 0
14:10:35.901 -> total: 6
14:10:35.901 -> Publish received.
14:10:35.901 -> topic: test/lol
14:10:35.901 -> qos: 2
14:10:35.901 -> dup: 0
14:10:35.901 -> retain: 0
14:10:35.901 -> len: 6
14:10:35.901 -> index: 0
14:10:35.901 -> total: 6

14:15:47.548 -> Disconnected from MQTT.
14:15:49.559 -> Connecting to MQTT...
14:15:49.744 -> Connected to MQTT.
14:15:49.744 -> Session present: 0
14:15:49.744 -> Subscribing at QoS 2, packetId: 4
14:15:49.744 -> Publishing at QoS 0
14:15:49.744 -> Publishing at QoS 1, packetId: 5
14:15:49.744 -> Publishing at QoS 2, packetId: 6
14:15:49.931 -> Subscribe acknowledged.
14:15:49.931 -> packetId: 4
14:15:49.931 -> qos: 2
14:15:49.931 -> Publish received.
14:15:49.931 -> topic: test/lol
14:15:49.931 -> qos: 2
14:15:49.977 -> dup: 0
14:15:49.977 -> retain: 1
14:15:49.977 -> len: 6
14:15:49.977 -> index: 0
14:15:49.977 -> total: 6
14:15:50.023 -> Publish acknowledged.
14:15:50.023 -> packetId: 5
14:15:50.396 -> Publish acknowledged.
14:15:50.396 -> packetId: 6
14:15:50.821 -> Publish received.
14:15:50.821 -> topic: test/lol
14:15:50.821 -> qos: 1
14:15:50.821 -> dup: 0
14:15:50.821 -> retain: 0
14:15:50.821 -> len: 6
14:15:50.821 -> index: 0
14:15:50.821 -> total: 6
14:15:50.821 -> Publish received.
14:15:50.821 -> topic: test/lol
14:15:50.821 -> qos: 0
14:15:50.821 -> dup: 0
14:15:50.821 -> retain: 0
14:15:50.821 -> len: 6
14:15:50.821 -> index: 0
14:15:50.821 -> total: 6
14:15:50.821 -> Publish received.
14:15:50.821 -> topic: test/lol
14:15:50.821 -> qos: 2
14:15:50.821 -> dup: 0
14:15:50.821 -> retain: 0
14:15:50.821 -> len: 6
14:15:50.867 -> index: 0
14:15:50.867 -> total: 6

I cut the internet cable for just a second and plugged it back in at 14:10:35.901. You can see that the ESP8266 only tries to reconnect to MQTT at 14:15:47.548, after 5 minutes.

smarta1980 avatar Jul 14 '21 11:07 smarta1980

@Pablo2048 did you check it? We need to detect that the TCP connection is disconnected so we can reconnect to MQTT, but how do I check whether the TCP connection is disconnected?

smarta1980 avatar Jul 14 '21 12:07 smarta1980

Such behavior makes no sense to me if you really pull the cable for just one second. Can you ping the router whose cable you are pulling out and watch whether it restarts itself after you plug the cable back in? Or can you try to access some web pages from your web browser right after you plug the cable back in?

Pablo2048 avatar Jul 14 '21 14:07 Pablo2048

What keepalive value are you using? Do you use any non-standard compiler flags?

bertmelis avatar Jul 15 '21 06:07 bertmelis

@bertmelis I just use the example from GitHub without any modification. What do you mean by non-standard compiler flags, and what about keepalive?

smarta1980 avatar Jul 15 '21 10:07 smarta1980

The mqtt lib should disconnect based on the keepalive value. Default in this lib is 15 seconds.

So it should not take 5 minutes to detect connection loss.
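
To put numbers on that, the silent interval in the two logs posted earlier can be computed from the serial monitor timestamps. This is a small sketch, assuming the HH:MM:SS.mmm format shown in those logs; the function names timestampToMs and gapMs are made up for illustration.

```cpp
#include <cstdio>

// Converts an "HH:MM:SS.mmm" timestamp into milliseconds since midnight.
// Returns -1 if the text does not match that format.
long timestampToMs(const char* ts) {
  int h = 0, m = 0, s = 0, ms = 0;
  if (std::sscanf(ts, "%d:%d:%d.%3d", &h, &m, &s, &ms) != 4) {
    return -1;
  }
  return ((h * 60L + m) * 60L + s) * 1000L + ms;
}

// Milliseconds elapsed between two timestamps taken on the same day.
long gapMs(const char* from, const char* to) {
  return timestampToMs(to) - timestampToMs(from);
}
```

For the first log, gapMs("13:45:56.517", "13:51:07.812") gives 311295 ms, and for the second, gapMs("14:10:35.901", "14:15:47.548") gives 311647 ms. Both gaps are about 311 seconds between the last logged line and "Disconnected from MQTT.", roughly twenty times the 15-second keepalive.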

bertmelis avatar Jul 16 '21 05:07 bertmelis

Give PangolinMQTT a try.

HamzaHajeir avatar Jul 18 '21 18:07 HamzaHajeir

Give PangolinMQTT a try.

Thank you for your helpful contribution.

A more sensible thing would be to enable debugging messages in this lib. Compile with DEBUG_ASYNC_MQTT_CLIENT = 1

bertmelis avatar Jul 18 '21 18:07 bertmelis

How do I get the value of AsyncMqttClientDisconnectReason and store it in a variable?

smarta1980 avatar Jul 23 '21 21:07 smarta1980
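
One possible approach, sketched under assumptions: the reason is passed as the argument of the disconnect callback (as in the onMqttDisconnect method shown earlier in the thread), so the callback can copy it into a variable, for example as an int using the same kind of cast as the (int)reason in that fragment. To keep the sketch compilable off-device, DisconnectReason below is a stand-in for the library's AsyncMqttClientDisconnectReason, and its two values are illustrative; the real names and values should be taken from the library header.

```cpp
#include <cstdint>

// Stand-in for AsyncMqttClientDisconnectReason so this compiles without
// the library. The enumerators here are illustrative assumptions.
enum class DisconnectReason : std::int8_t {
  TCP_DISCONNECTED = 0,
  MQTT_SERVER_UNAVAILABLE = 3,
};

// Most recent disconnect reason as a plain integer; -1 means
// "no disconnect seen yet".
int lastDisconnectReason = -1;

// Disconnect callback: store the reason so other code can read it later.
void onMqttDisconnect(DisconnectReason reason) {
  lastDisconnectReason = static_cast<int>(reason);
}
```

In a real sketch the callback parameter would be of type AsyncMqttClientDisconnectReason, and lastDisconnectReason could then be printed or checked elsewhere, for example in loop().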