MQTT.js
After the mqtt client reconnects, it is unable to continue publishing messages
Hi there,
I am using the latest version of mqttjs (5.1.3), and this issue happens in older versions as well.
In my app, I receive the offline event even when the mosquitto MQTT broker is up and running. After the mqttjs client goes through the sequence offline -> closed -> reconnect -> connected, it can no longer publish messages.
Is there a workaround?
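One way to capture the offline -> closed -> reconnect -> connected sequence described above is to log every lifecycle event with a timestamp. This is a minimal sketch, not code from the report: `attachLifecycleLogger` is a hypothetical helper, and it assumes `client` is the object returned by `mqtt.connect()`, which emits `connect`, `reconnect`, `close`, `offline`, `end`, and `error`.

```javascript
// Minimal sketch: log each connection-lifecycle event with a timestamp.
// `client` is assumed to be an MQTT.js client; any EventEmitter works,
// which keeps the helper easy to exercise without a broker.
const LIFECYCLE_EVENTS = ["connect", "reconnect", "close", "offline", "end", "error"];

function attachLifecycleLogger(client, log = console.log) {
  for (const name of LIFECYCLE_EVENTS) {
    client.on(name, (arg) => {
      // Only the "error" event carries an argument that is printed here.
      const detail = name === "error" && arg ? `: ${arg.message}` : "";
      log(`${new Date().toISOString()} mqtt ${name}${detail}`);
    });
  }
  return client;
}

// Hypothetical usage:
//   const client = attachLifecycleLogger(mqtt.connect(url, options));
```

Comparing these timestamps with the broker log should show whether the last `connect` was followed by any successful publish.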
app log:
mqtt broker log:
Can you provide a script that reproduces the issue?
Hey @robertsLando, we are having the same issue, this time in the frontend of our React application.
The error in Chrome only shows that it failed at createWebsocket with an array buffer
(sorry, I did not take a screenshot of that).
The only way to get it working again is to close and re-open the Chrome browser in our case.
I think React and Node.js both have the same issue now. Can I get more attention on this issue? Thanks.
@jaketakula I need to see more details about the error and how to reproduce it. Also what version are you using?
- The mqttjs version being used is `5.1.3`.
- This issue happens randomly in the Chrome browser, roughly once every 1 or 2 weeks. So I think you can easily set up a ping-pong demo app using `5.1.3` and any MQTT broker you like. After that, keep it running for several days and you should see the reconnection-drop issue.
this issue happens randomly in the Chrome browser, roughly once every 1 or 2 weeks.
Do you mean 1 or 2 weeks with the browser open, or what?
The browser never sends data to the broker; it stays open and only receives messages from the broker.
This could be fixed by #1779. Could someone give 5.3.5 a try?
Thank you very much. I will bump the version now. Will update you later on.
@jaketakula Thanks! Any news?
It usually takes 1 or 2 weeks to see the issue, so please be patient. Thanks.
Any news on this? We have a similar issue. We did upgrade to the latest version, but we want to make sure it won't happen again before we deploy (IoT devices).
I didn't check on my side as I have never faced this issue. I don't know if @jaketakula has news (but I think he would have written if the bug had happened again). BTW, recent changes fixed a very old bug in reconnect/keepalive that could have caused it.
We deployed 10 IoT devices and, slowly, they're getting disconnected one by one without reconnecting. So the issue still persists.
Here's the config I'm using:
this.#conn = mqtt.connect(url, {
  cert: CERT_PATH,
  key: KEY_PATH,
  ca: CA_PATH,
  protocolId: "MQTT",
  protocolVersion: 5,
  encoding: "binary",
  clean: false,
  clientId: 'specific-IoT-id',
  keepalive: 60,
  reconnectPeriod: 1000,
  connectTimeout: 30000,
  reschedulePings: false,
  queueQoSZero: true,
  resubscribe: true,
  manualConnect: false,
});
I also tried to add a timeout on publish (since, when sending QoS 1 messages, it waits until delivery happens), but that failed miserably. Pseudocode:
await Promise.race([publish, timer]);
if timeout then client.reconnect();
But once I call the reconnect() method, every subsequent publish throws a "client disconnecting" error and the client can't recover from it.
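The timeout part of that pseudocode could be sketched as follows. This is an illustration under assumptions, not a confirmed workaround: `publishWithTimeout` and `PublishTimeoutError` are hypothetical names, and the only client method relied on is `publishAsync(topic, payload, options)`, which MQTT.js v5 clients expose.

```javascript
// Sketch of a publish with a deadline. `client` is assumed to expose
// publishAsync(topic, payload, options), as MQTT.js v5 clients do.
class PublishTimeoutError extends Error {}

async function publishWithTimeout(client, topic, payload, options = {}, timeoutMs = 10000) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new PublishTimeoutError(`publish to ${topic} timed out after ${timeoutMs} ms`)),
      timeoutMs
    );
  });
  try {
    // Whichever promise settles first wins. The losing publish is NOT
    // cancelled, so a QoS 1 message may still be delivered later.
    return await Promise.race([client.publishAsync(topic, payload, options), deadline]);
  } finally {
    clearTimeout(timer); // never leave a stray timer behind
  }
}
```

The wrapper only detects the stall. What to do next is a separate decision, and per the report above, calling `reconnect()` at that point led to "client disconnecting" errors.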
EDIT: Can you suggest what I should do? A workaround will also do because we're in a rush right now.
~~EDIT2: I forgot to run npm install :man_facepalming: I'll test again and get back as soon as I have news.~~
Yup, same thing.
@overflowz Please open a new bug issue and follow the steps in order to also attach DEBUG logs
The problem is that it's hard to reproduce, and you have to wait days or even weeks for it to trigger (this is for the reconnect issue). As for the reconnect() problem, I will do that later today.
I understand that, but it's hard for me to know what's going on here without more info. What you could do is also patch the log function in the client and print logs to a file so you don't lose them when this happens.
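A minimal sketch of that file-logging idea, under stated assumptions: `createFileLogger` and the log path are hypothetical, and passing the function as a `log` option to `mqtt.connect()` relies on the custom `log` option described in the MQTT.js README, so verify it against the version you run.

```javascript
const fs = require("node:fs");
const util = require("node:util");

// Sketch: a log function that appends timestamped lines to a file so the
// debug output survives until the problem shows up days or weeks later.
function createFileLogger(filePath) {
  return (...args) => {
    // util.format expands printf-style placeholders such as %s, %d and %o.
    const line = `${new Date().toISOString()} ${util.format(...args)}\n`;
    // Synchronous append keeps the sketch simple; it blocks briefly per line.
    fs.appendFileSync(filePath, line);
  };
}

// Hypothetical usage (assumes the `log` connect option is supported):
//   const client = mqtt.connect(url, { ...options, log: createFileLogger("./mqtt-debug.log") });
```

On long-running devices the file grows without bound, so some form of rotation would be needed in practice.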
I'm not sure if it's related but I'm facing a similar problem.
Context: my architecture looks like this:
device -- Ethernet connection --> manager -- 5G connection --> MQTT broker
                                    |
                                    L______ 5G connection --> HTTP server
What happens is that every X hours the 5G modem goes down for a short period of time. When this happens, I can see in the manager's logs that when it tries to communicate via HTTP it receives an EHOSTUNREACH error, which disappears when the 5G connection is back.
The problem is that when the connection is back, client.connected flips to true but the messages are not being sent.
My sender function looks something like this:
async function sendToBroker(topic, message) {
  if (!this.client.connected) {
    console.warn("Client is not connected, storing message");
    this.storage.store({ topic, message });
    return;
  }
  console.debug("Sending message to broker", message);
  await this.client.publishAsync(topic, message);
}
that is called like this:
async function send(ctx) {
  ctx.call(
    "mqtt.sendToBroker",
    {
      topic: "topic",
      message: "message",
    },
    {
      timeout: 10000, // Throw an error if it takes more than 10 seconds
    }
  );
}
In the logs, I can see "Sending message to broker", followed by a timeout error saying that sendToBroker() did not resolve within 10 seconds.
I'm assuming that it gets stuck on the await this.client.publishAsync(topic, message); line.
Here's how I create my client:
this.client = mqtt.connect(url, {
  username: token,
  cert: fs.readFileSync(certPath, "utf8"),
  rejectUnauthorized: false,
});
Question: any idea why this is happening? Is there a way to check whether the connection is really up?
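One possible way to check whether the connection is really up, independent of `client.connected`, is to watch inbound traffic. The sketch below is an assumption-laden illustration, not a confirmed fix: `watchInbound` is a hypothetical helper, and it relies on the client emitting `packetreceive` for every inbound packet (MQTT.js documents this event; whether it fires for PINGRESP in your version is worth verifying). Silence longer than about 1.5 times the keepalive is treated as a dead link.

```javascript
// Sketch of a liveness watchdog based on inbound packets.
// `now` is injectable so the logic can be tested without waiting.
function watchInbound(client, { keepaliveSec = 60, onStale = () => {}, now = Date.now } = {}) {
  let lastSeen = now();
  const onPacket = () => { lastSeen = now(); };
  client.on("packetreceive", onPacket);

  const limitMs = keepaliveSec * 1500; // 1.5 x keepalive, in milliseconds
  const check = () => {
    const silentMs = now() - lastSeen;
    if (silentMs > limitMs) onStale(silentMs);
    return silentMs;
  };

  const timer = setInterval(check, (keepaliveSec * 1000) / 2);
  timer.unref?.(); // in Node.js, don't keep the process alive for the watchdog

  return {
    check,
    stop() {
      clearInterval(timer);
      client.removeListener("packetreceive", onPacket);
    },
  };
}

// Hypothetical usage:
//   watchInbound(client, { keepaliveSec: 60, onStale: (ms) => console.warn(`no inbound packets for ${ms} ms`) });
```

What `onStale` should do (log, alert, or tear down and recreate the client) is left open here, since the thread has not established a recovery path that works.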
Might be vaguely related to: https://github.com/mqttjs/MQTT.js/issues/1825
@AndreMaz Could you create a full script that I can use to reproduce the issue? By checking the other issues it seems this happens only when working with tls?
@robertsLando It will be difficult as it's proprietary code, but I'll try to create a repro example.
By checking the other issues it seems this happens only when working with tls?
I've been using TLS since the beginning but can't say for sure if it's the source of the problem
Can confirm, we're also using with tls, haven't tried otherwise.
I don't need your source code, just a script that reproduces the issue: a simple one that connects to a broker with TLS (you can use the HiveMQ public one, https://www.hivemq.com/mqtt/public-mqtt-broker/) and then tries to reproduce the disconnect to see if the problem happens. I tried last time without success, and if I cannot reproduce it on my side it's hard to fix.
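A starting point for such a script might look like the sketch below. Everything in it is illustrative: `startPingPong` and the topic name are hypothetical, the client is only assumed to offer `subscribe`, `publish`, `connected`, and `connect`/`message` events as MQTT.js clients do, and the broker URL in the usage comment is an assumption to check against the HiveMQ page linked above.

```javascript
// Sketch of a ping-pong soak test: publish a counter to a topic the client
// itself subscribes to, and compare how many messages went out and came back.
function startPingPong(client, { topic = "repro/pingpong", intervalMs = 5000, log = console.log } = {}) {
  let sent = 0;
  let received = 0;

  client.on("connect", () => client.subscribe(topic, { qos: 1 }));
  client.on("message", (incomingTopic) => {
    if (incomingTopic === topic) received += 1;
  });

  const timer = setInterval(() => {
    const id = ++sent; // capture the id now; `sent` may change before the callback runs
    client.publish(topic, String(id), { qos: 1 }, (err) => {
      if (err) log(`publish #${id} failed: ${err.message}`);
    });
    // A gap that keeps growing while connected=true is the reported symptom.
    log(`connected=${client.connected} sent=${sent} received=${received}`);
  }, intervalMs);

  return { stats: () => ({ sent, received }), stop: () => clearInterval(timer) };
}

// Hypothetical usage over TLS (host and port are assumptions):
//   const mqtt = require("mqtt");
//   startPingPong(mqtt.connect("mqtts://broker.hivemq.com:8883"), { topic: "repro/<unique-id>" });
```

On a shared public broker the topic should be unique to the test, otherwise other clients' messages would inflate the `received` count.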
Yep yep, I know that you don't want it. I've created a simple script that mimics the data flow, but I can't reproduce the issue at home. I've been switching my laptop between WiFi, Ethernet, and a 5G hotspot, and so far no luck, i.e., I did not see the timeout that I mentioned.
If it helps, since we updated to version 5.3.5, we're facing this issue less frequently, but it's still there. I believe the relevant changes that could have affected it were in #1779.
However, the latest release again contains fixes for a possible race condition (#1848), and it's hard for us to keep updating many IoT devices since we have to run npm install where the connection is very unstable.
These issues are really hard for us to test in a production environment, and it also costs us a lot. Is there any version that is considered stable that we can pick for now, until these issues get fixed?
thank you!
I also think that this is a regression that was introduced in the latest releases (don't know which one tho)
I'll have to check my previous releases and test them out. Will keep you guys updated
I'm sorry for the issues, guys. I would like to help you, but it's hard to guess what the root cause could be here. We should first try to find an easy way to reproduce the issue.
No worries at all! We're devs too and we understand the frustration :-) Just to be clear, I didn't mean it as a "fix it asap", I do appreciate the work the maintainers are doing, truly. I was asking if there are any old version(s) I could try so I won't get pressured by the company to fix a problem that is hard to explain why it does not work sometimes xD
Regardless, I really do wish to help somehow too, but it's really hard to reproduce :( We're currently running the code on 50 devices and it might happen once a week, two weeks or even months per one device, it's really unpredictable and random.
I'm the only maintainer here, unfortunately. I started because I use this package in almost all my projects and I wanted to help keep it maintained (as it was almost dead).
Based on the first comment of this issue, it seems this was also happening with older versions, so I don't know. I'm sorry :(
Hey @robertsLando no need to apologize. Huge kudos to you for what you're doing :muscle:
I'll have to check my previous releases and test them out.
Just checked, I went from 5.3.3 -> 5.5.2 -> 5.5.4
I'm rolling back to 5.3.3 and going to let it run for a while. Will keep you updated
@AndreMaz Thanks!
If it helps, we were using 5.3.3 previously and the issue appeared more frequently than it does now (about 3-4 times a week).