Meross Devices Constantly Becoming Unavailable, Setup With MQTT via Custom Pairer App, Hacks Device Key

Open timnolte opened this issue 2 years ago • 72 comments

Version of the custom_component

v2.6.1

Configuration

I am using the Mosquitto MQTT Broker, which I also use with Zigbee2MQTT. I have a unique HA user set up for each device, based on its MAC address, and I've configured each device with its user through the Android Custom Pairer app.

Describe the bug

As visible in the History/Logbook, I can see the devices constantly going unavailable, then coming back online. This seems to have gotten worse with each new release of HA since the 2022.8 releases. While trying to debug this I can't even get/download the diagnostics for any of the Meross devices. My devices that were successfully set up via MQTT don't seem to also be connected via HTTP. I am unable to change the Connection Protocol to anything other than MQTT for some of the devices, as there is no Host Address and no Device Key (since they were configured with the Hacks Mode), and there seems to be no way to correct this. I have a few devices that are connected only via HTTP, which are HomeKit-compatible devices; those can't communicate via MQTT, it seems, and they also become unavailable. I didn't want any of my devices registered through the Meross Cloud, which is why I went the route of configuring them through the Custom Pairer App, but I still can't seem to find the right setup that gets my devices connected via both HTTP & MQTT.

I'm really at a loss as to what to do other than remove all of my Meross devices, and the integration, and start all over again with no choice but to have my devices all connected to the Meross Cloud.

Debug log

Trying to provide relevant log entries with some of the data masked. I have the full raw logs that I'd be happy to provide in a more secure way.

2022-08-17 20:24:09.830 WARNING (MainThread) [homeassistant.config_entries] Config entry 'Living Room Lights (mss510x)' for meross_lan integration not ready yet: MQTT unavailable; Retrying in background
2022-08-17 20:24:09.835 WARNING (MainThread) [homeassistant.config_entries] Config entry 'MQTT Hub' for meross_lan integration not ready yet: MQTT unavailable; Retrying in background
2022-08-17 20:24:09.839 WARNING (MainThread) [homeassistant.config_entries] Config entry 'Bathroom Fan (mss510x)' for meross_lan integration not ready yet: MQTT unavailable; Retrying in background
2022-08-17 20:24:11.696 DEBUG (MainThread) [custom_components.meross_lan] MerossHttpClient(192.168.12.71): HTTP Response ({"header":{"messageId":"ab0247af137e4fef8b9c667e68c14b6a","namespace":"Appliance.System.All","method":"GETACK","payloadVersion":1,"from":"/appliance/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/publish","timestamp":1660782251,"timestampMs":989,"sign":"********************************"},"payload":{"all":{"system":{"hardware":{"type":"mss550x","subType":"us","version":"4.0.0","chipType":"MT7686","uuid":"XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX","macAddress":"xx:xx:xx:xx:xx:xx"},"firmware":{"version":"4.2.2","homekitVersion":"2.0.1","compileTime":"Sep 23 2021 17:21:34","encrypt":1,"wifiMac":"xx:xx:xx:xx:xx:xx","innerIp":"192.168.12.71","server":"192.168.12.6","port":8883,"userId":0},"time":{"timestamp":1660782251,"timezone":"","timeRule":[]},"online":{"status":0,"bindId":"","who":0}},"digest":{"togglex":[{"channel":0,"onoff":0,"lmTime":1660708156}],"triggerx":[],"timerx":[]}}}}
)
2022-08-17 20:24:11.697 DEBUG (MainThread) [custom_components.meross_lan] MerossDevice(XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX) back online!
2022-08-17 20:24:16.303 DEBUG (MainThread) [custom_components.meross_lan] MerossApi: MQTT RECV device_id:(XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX) method:(GETACK) namespace:(Appliance.System.All)
2022-08-17 20:24:16.304 DEBUG (MainThread) [custom_components.meross_lan] MerossDevice(XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX) back online!
2022-08-17 20:24:16.305 DEBUG (MainThread) [custom_components.meross_lan] MerossApi: MQTT SEND device_id:(XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX) method:(GET) namespace:(Appliance.System.Runtime)

timnolte avatar Aug 18 '22 03:08 timnolte

I've noticed this too.

My Meross devices are reporting "unavailable" back and forth all the time.

None of my other devices that use WiFi (located close to the same locations) are having these issues.

bradleysimard avatar Sep 26 '22 12:09 bradleysimard

Hello, this reminds me of a particular mqtt timeout disconnect behaviour which was discussed here #192

If this is the case and meross_lan was configured through MQTT (you can tell because the configuration panel does not allow you to enter/modify the device address), then once the MQTT connection stalls, meross_lan should switch automatically to the last known IP address and start using HTTP (seamlessly: you just see a log entry stating a protocol switch).

If this doesn't work by itself, it could be that the device, once disconnected by mosquitto, enters a reboot state and is offline for something like 30-60 seconds; you'd see this in the history/log for the device/entities. (The MQTT specification states the broker should disconnect clients after a certain timeout, but the implementation in either mosquitto or the Meross device could be 'funny', so to say.)

krahabb avatar Sep 29 '22 13:09 krahabb

OK, I think this brings me to another aspect that I feel like my devices that are "primarily" setup via MQTT are not properly being seen as available via HTTP/IP. I do have all of my Meross devices setup in DHCP with fixed IP addresses.

timnolte avatar Sep 29 '22 14:09 timnolte

Same here, everything is OK on the meross app

tunisiano187 avatar Oct 19 '22 05:10 tunisiano187

Same - OK in the app but keeps dropping using this integration.

Kilberz avatar Oct 30 '22 19:10 Kilberz

For @tunisiano187 and @Kilberz: this issue is hardly related to your cases, since your devices are paired to the app and the Meross cloud MQTT servers, so meross_lan is not using MQTT to communicate with them.

When your devices are paired to/with the official Meross app, meross_lan can only use HTTP to communicate with them. In this scenario, it would be better to fix their IP addresses (this is usually achieved by configuring your own router/DHCP settings) so they're always reachable at the same address. meross_lan is not able to detect an address change of the device (it can happen when the device reboots, or even on a timeout, depending on how the router is configured/behaving).

In my experience, I also see some random disconnects which usually recover in a few seconds (provided the device has a fixed IP). This issue (HTTP disconnection) appears unresolvable to me since it is caused by the device itself, which sometimes rejects meross_lan requests for no apparent reason: it just doesn't reply. This issue, however, shouldn't happen so often as to make the devices unusable.

If the issue is very persistent the reason might be the configuration of the device key in meross_lan. If you're unsure, enter the configuration panel, delete the content of the key field and select the 'cloud retrieve' option for the key mode: this will prompt you to enter your Meross account login information in order to retrieve the correct key for the device.
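
To illustrate why a wrong key matters, here is a minimal sketch of how requests are signed, assuming the commonly documented Meross scheme where the `sign` header field is the MD5 of `messageId + key + timestamp`. The function names are hypothetical and are not meross_lan's API; the header fields mirror the ones visible in the debug log of the original post.

```python
import hashlib
import time
import uuid


def compute_sign(message_id: str, key: str, timestamp: int) -> str:
    """MD5 hex digest over messageId + key + timestamp (assumed Meross scheme)."""
    return hashlib.md5(f"{message_id}{key}{timestamp}".encode("utf-8")).hexdigest()


def build_message(namespace: str, method: str, payload: dict, key: str) -> dict:
    """Build a signed request envelope (the 'from' header is omitted for brevity)."""
    message_id = uuid.uuid4().hex
    timestamp = int(time.time())
    return {
        "header": {
            "messageId": message_id,
            "namespace": namespace,
            "method": method,
            "payloadVersion": 1,
            "timestamp": timestamp,
            "sign": compute_sign(message_id, key, timestamp),
        },
        "payload": payload,
    }


def sign_matches(message: dict, key: str) -> bool:
    """Check whether a message was signed with the given key."""
    header = message["header"]
    expected = compute_sign(header["messageId"], key, header["timestamp"])
    return header["sign"] == expected
```

If the key configured in meross_lan differs from the one the device was paired with, the signature computed on each side will not match, which would be consistent with the device silently ignoring requests.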

krahabb avatar Nov 03 '22 13:11 krahabb

The IPs are fixed. The main problem is the lost connection: it connects and disconnects in less than 30 seconds, so the IPs do not change in that time.

tunisiano187 avatar Nov 03 '22 17:11 tunisiano187

Chiming in: after updating from HAS 10.something to the latest, this is also occurring on just one of my devices. I have 14 Meross plugs and all work OK but one. All are set up in the app, all have the correct IP address listed in the device, and all are on static IPs.

The behavior is that the entities become available for a second, then unavailable for a few seconds:

Living Room - DNDMode became unavailable 11:03:29 - 29 seconds ago
Living Room - DNDMode turned on 11:03:29 - 30 seconds ago
Living Room - outlet turned on 11:03:28 - 30 seconds ago
Living Room - outlet became unavailable 11:03:19 - 39 seconds ago
Living Room - DNDMode became unavailable 11:03:19 - 39 seconds ago
Living Room - DNDMode turned on 11:03:19 - 42 seconds ago
Living Room - outlet turned on 11:03:18 - 42 seconds ago
Living Room - outlet became unavailable 11:03:14 - 1 minute ago
Living Room - DNDMode became unavailable 11:03:14 - 1 minute ago
Living Room - DNDMode turned on 11:03:14 - 1 minute ago
Living Room - outlet turned on 11:03:13 - 1 minute ago
Living Room - outlet became unavailable 11:03:09 - 1 minute ago

jragarw avatar Nov 08 '22 11:11 jragarw

This is very interesting: 1 device out of 14 showing this behaviour is likely related to a very specific firmware. If you could inspect this a bit and see if the fw version is different from the others (devices of the same type), that could provide some hints. Also, the really 'fast' disconnection and reconnection is very strange, since devices should be polled every 30 seconds. The delay could vary anyway, especially when devices don't respond, since meross_lan tries a little harder to silently re-establish the connection over a few attempts and only reports the disconnection to HA when the last attempt also fails.
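
As a rough illustration of that retry behaviour (a sketch only, not the actual meross_lan code), a poller can keep a device marked available until several consecutive polls have failed. `PollTracker` and `max_failures` are made-up names, and the threshold of 3 is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PollTracker:
    """Tracks availability, tolerating a few consecutive failed polls."""

    max_failures: int = 3  # assumed threshold, not taken from meross_lan
    failures: int = 0
    online: bool = False

    def record(self, success: bool) -> bool:
        """Update the state from one poll result and return the availability."""
        if success:
            self.failures = 0
            self.online = True
        else:
            self.failures += 1
            # Only report the device offline once the last attempt also failed.
            if self.failures >= self.max_failures:
                self.online = False
        return self.online
```

With a 30-second polling interval, a scheme like this would not be expected to flip the state every few seconds, which is why the fast flapping above looks unusual.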

krahabb avatar Nov 10 '22 20:11 krahabb

https://imgur.com/a/GtXNlwt

Interestingly, 10 of my plugs are hardware revision 2, with a 2.X firmware, and the newest two are revision 6 with a 6.X firmware

The 6.X are the ones which are reading as unavailable with the quick disconnect reconnect.

jragarw avatar Nov 11 '22 01:11 jragarw

I'm getting the same problem with my newly purchased Meross switches. They have FW 6.3.6, and after about 12-16 hours they become unavailable. The only way to bring them back on is to physically turn the power off at the wall socket.

Devices with FW 2.1.16 are solid.

For all these devices, I set them up to use an mqtt broker using the instructions detailed here: https://github.com/bytespider/Meross/wiki/MQTT

enomam avatar Feb 23 '23 09:02 enomam

Needs more info. These are the places to check:

[Plug] --- A --- [Wifi] --- B --- [MQTT Broker] --- C --- [HA meross lan integration]

A: Is the device successfully connected to Wifi? Check via the router or ping the IP address. If not, make sure the wifi signal is strong enough and try to use a channel that has low noise (this can usually be checked and configured in the router settings).
B: Is the device connected to the MQTT Broker and successfully publishing data? Check the mqtt logs and observe what's sent and received, e.g. with mqtt explorer.
C: Check the HA logs, enable debug logs for the meross lan integration and check for hints about what's going wrong.
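
For checks A and B, a small script can at least confirm that the device and the broker accept TCP connections. This is a minimal sketch using only the standard library; `tcp_reachable` and `run_checks` are hypothetical helper names, and the ports (80 for the device's HTTP interface, 8883 for the broker) are assumptions based on the logs in this thread.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def run_checks(targets: dict) -> dict:
    """Map each label to the reachability of its (host, port) pair."""
    return {label: tcp_reachable(host, port) for label, (host, port) in targets.items()}
```

For example, `run_checks({"plug http": ("192.168.12.71", 80), "broker": ("192.168.12.6", 8883)})` would use the addresses from the original post; replace them with your own. A successful TCP connection only shows the port is open. It does not prove the device is publishing data, so the broker logs are still worth checking.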

DominikGebhart avatar Feb 23 '23 12:02 DominikGebhart

I just want to point out that my issues were related to my zigbee network causing micro-outages due to competing 2.4ghz channels.

I guess most of my other devices can just re-establish a connection quickly, while the meross devices were requiring manual reconnections.

Just in case anyone else has a similar setup.

bradleysimard avatar Feb 23 '23 13:02 bradleysimard

@enomam, as @DominikGebhart pointed out, you should enable debug logging for meross_lan. You can now enable debug logging for an integration from the '...' menu on the integration panel (just enable it for any configuration entry you have and it will apply globally to the meross_lan integration).

When a device works on a private MQTT broker, meross_lan never tries to contact the device (no polling in general) unless it needs to actually send a command message. This might lead to a state where the device is reported offline (because of any transient disconnection) and you can't actually send any command (since the UI prevents you from interacting with a device that is unavailable). In general, on MQTT, the device should push some messages every now and then, and meross_lan recognizes this as the device being online. The only 'safety' measure in meross_lan is a heartbeat (roughly on a 5-minute timeout) where meross_lan tries to 'ping' the device over MQTT in order to see if it's there or not, but this is actually only meant to prevent meross_lan from thinking the device is disconnected when it's not. So, in the end, if you can't see your devices coming online at all, they're likely effectively offline (with respect to HA).

In order to see what's going on, you should check the mosquitto connect/disconnect log to see what is happening to the paired devices. The issue in #192 could be a hint for this too.
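
A simplified sketch of that kind of heartbeat logic follows. It is an illustration under assumptions, not the real meross_lan implementation: the class and method names are hypothetical, and the 300-second value only mirrors the "roughly 5 minutes" mentioned above.

```python
from typing import Optional

HEARTBEAT_TIMEOUT = 300.0  # seconds; assumed from "roughly 5 minutes"


class MqttPresence:
    """Tracks whether a device looks online based on pushed MQTT messages."""

    def __init__(self, timeout: float = HEARTBEAT_TIMEOUT) -> None:
        self.timeout = timeout
        self.last_seen: Optional[float] = None
        self.online = False

    def on_message(self, now: float) -> None:
        """Any message from the device (push or ping reply) marks it online."""
        self.last_seen = now
        self.online = True

    def needs_ping(self, now: float) -> bool:
        """True when an online device has been silent for the whole timeout."""
        if not self.online or self.last_seen is None:
            return False
        return now - self.last_seen >= self.timeout

    def on_ping_timeout(self) -> None:
        """Called when a heartbeat ping got no reply: the device is offline."""
        self.online = False
```

Note that in this scheme an offline device is never pinged: it only comes back online when it pushes a message itself, which matches the point that a device stuck after a disconnection stays unavailable in HA.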

krahabb avatar Feb 23 '23 15:02 krahabb

I have the same problem (I believe). I discovered it when the light in a room switched on by itself during the night several times - the wall switch was rather unstable before but I was not paying much attention so far. I ended up disabling the automation that switches the lamps when the wall switch switches to 'on'.

The switch (michael main) is a Meross mss510x (wall switch powered by the mains), connected to a Unifi AP. MQTT is a standalone Mosquitto, HA is 2023.10.4 and meross_lan is Cloudy.3 (4.3.0).

This is what I see in the device log:

image

The WiFi is stable, so the next check is Mosquitto. Around the 10:35:30 timestamp from the meross_lan log, I have this in Mosquitto (I removed the "saving in-memory DB" lines that clutter the log):

domotique-mqtt-1  | 2023-10-22T10:34:09: Client fmware:2102089305516425584748e1e94bd8ba_NVZuM0oHVNYMgCeS has exceeded timeout, disconnecting.
domotique-mqtt-1  | 2023-10-22T10:34:17: New connection from 192.168.10.51:52603 on port 8883.
domotique-mqtt-1  | 2023-10-22T10:35:00: New connection from 192.168.10.51:52605 on port 8883.
domotique-mqtt-1  | 2023-10-22T10:35:02: New client connected from 192.168.10.51:52605 as fmware:2102089305516425584748e1e94bd8ba_NVZuM0oHVNYMgCeS (p1, c1, k30, u'48:e1:e9:4b:d8:ba').
domotique-mqtt-1  | 2023-10-22T10:35:02: OpenSSL Error[0]: error:140370E5:SSL routines:ACCEPT_SR_KEY_EXCH:ssl handshake failure
domotique-mqtt-1  | 2023-10-22T10:35:02: Client <unknown> disconnected: Protocol error.
domotique-mqtt-1  | 2023-10-22T10:35:51: Client fmware:2102089305516425584748e1e94bd8ba_NVZuM0oHVNYMgCeS has exceeded timeout, disconnecting.
domotique-mqtt-1  | 2023-10-22T10:36:29: New connection from 192.168.10.51:52606 on port 8883.
domotique-mqtt-1  | 2023-10-22T10:36:29: New client connected from 192.168.10.51:52606 as fmware:2102089305516425584748e1e94bd8ba_NVZuM0oHVNYMgCeS (p1, c1, k30, u'48:e1:e9:4b:d8:ba').
(no more relevant logs, the device has reconnected fine)

192.168.10.51 is indeed the IP of the switch. All this happened without any manual interaction with the switch.
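
To quantify how long the device stays away after each timeout, log lines like the ones above can be parsed with a short script. This is a sketch that assumes the exact message wording shown in this log; `reconnect_gaps` is a hypothetical helper and the sample is a subset of the lines above.

```python
import re
from datetime import datetime

LINE_RE = re.compile(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}): (.*)$")
TIMEOUT_RE = re.compile(r"Client (\S+) has exceeded timeout, disconnecting\.")
CONNECT_RE = re.compile(r"New client connected from \S+ as (\S+) ")

SAMPLE = """\
domotique-mqtt-1  | 2023-10-22T10:34:09: Client fmware:2102089305516425584748e1e94bd8ba_NVZuM0oHVNYMgCeS has exceeded timeout, disconnecting.
domotique-mqtt-1  | 2023-10-22T10:35:02: New client connected from 192.168.10.51:52605 as fmware:2102089305516425584748e1e94bd8ba_NVZuM0oHVNYMgCeS (p1, c1, k30, u'48:e1:e9:4b:d8:ba').
domotique-mqtt-1  | 2023-10-22T10:35:51: Client fmware:2102089305516425584748e1e94bd8ba_NVZuM0oHVNYMgCeS has exceeded timeout, disconnecting.
domotique-mqtt-1  | 2023-10-22T10:36:29: New client connected from 192.168.10.51:52606 as fmware:2102089305516425584748e1e94bd8ba_NVZuM0oHVNYMgCeS (p1, c1, k30, u'48:e1:e9:4b:d8:ba').
"""


def reconnect_gaps(lines):
    """Return (client_id, seconds) for each timeout followed by a reconnect."""
    pending = {}  # client id -> time of the last timeout disconnect
    gaps = []
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        when = datetime.fromisoformat(match.group(1))
        message = match.group(2)
        if (timed_out := TIMEOUT_RE.search(message)):
            pending[timed_out.group(1)] = when
        elif (connected := CONNECT_RE.search(message)):
            dropped = pending.pop(connected.group(1), None)
            if dropped is not None:
                gaps.append((connected.group(1), (when - dropped).total_seconds()))
    return gaps


if __name__ == "__main__":
    for client, seconds in reconnect_gaps(SAMPLE.splitlines()):
        print(f"{client}: back after {seconds:.0f}s")
```

The `k30` in the connect lines is the keepalive the client requested, in seconds. The MQTT specification lets a broker drop a client that sends nothing for one and a half times that interval, so a device that stalls for under a minute is enough to trigger these timeouts.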

I saw that many of you in this thread had similar issues - is there a consensus on where to go next?

When searching for the issue, the usual culprit is "bad certificate" but this is not my case: the handshake failure is intermittent and eventually everything is fine.

wsw70 avatar Oct 22 '23 09:10 wsw70

OpenSSL Error[0]: error:140370E5:SSL routines:ACCEPT_SR_KEY_EXCH:ssl handshake failure

@wsw70 , could it be we have 'another' mosquitto bug? In my experience it happened (not really lately..more a couple years ago) that some mosquitto releases were inconsistent to say the least.

I'm actually using mosquitto 2.0.12 and it reports some disconnections too (but I don't have detailed logs for that). These in turn don't affect (at least not that I'm aware of) the overall device availability in meross_lan

My devices are anyway configured for automatic protocol switching (in meross_lan) and are therefore usually accessed or accessible via HTTP so I guess I cannot detect MQTT unavailability in every-day life....

krahabb avatar Oct 22 '23 13:10 krahabb

I'm actually using mosquitto 2.0.12 and it reports some disconnections too (but I don't have detailed logs for that). These in turn don't affect (at least not that I'm aware of) the overall device availability in meross_lan

I have 2.0.18 and several devices that connect to MQTT. Somehow only the Meross devices disconnect and reconnect (through a timeout).

My devices are anyway configured for automatic protocol switching (in meross_lan) and are therefore usually accessed or accessible via HTTP so I guess I cannot detect MQTT unavailability in every-day life....

Mine too (I think: there is nothing chosen between auto, mqtt and http). I will try to force one device to use only http to see how it goes.

One question: my firmware is 3.1.5, I see mentions of 4.x versions - have you upgraded? if so - how?

wsw70 avatar Oct 22 '23 16:10 wsw70

My firmware for mss310(s) is 2.1.4 and it is the latest available since one of them is Meross-binded and doesn't notify any update...all of my devices are really 'almost legacy'

krahabb avatar Oct 23 '23 08:10 krahabb

My firmware for mss310(s) is 2.1.4 and it is the latest available since one of them is Meross-binded and doesn't notify any update...all of my devices are really 'almost legacy'

Ahhh, you mean that upgrading to the latest version means that it is not possible anymore to use a local MQTT because of some binding made between the device and Meross cloud? Or that they do not use MQTT anymore (and rely on HTTP)? Or something else?

I upgraded one device to 4.something and the pairer would not work anymore (and the latest version could not be installed on my phone). I had to go to bed so I did not investigate further but I will do so because I need to understand which strategy to take for the new devices I will buy :)

wsw70 avatar Oct 23 '23 09:10 wsw70

I really don't know, but I hardly think the whole MQTT protocol has been dropped. Newer devices/firmwares have always added protocols/features (like HK and now MATTER) rather than removing the original MQTT/HTTP interfaces.

krahabb avatar Oct 23 '23 12:10 krahabb

I upgraded one device to 4.something and the pairer would not work anymore (and the latest version could not be installed on my phone).

Newer version might need to use the newer WifiX pairing stuff, see https://github.com/bytespider/Meross/pull/60

DominikGebhart avatar Oct 26 '23 17:10 DominikGebhart

Newer version might need to use the newer WifiX pairing stuff, see bytespider/Meross#60

Thank you! I managed to pair it without problems.

wsw70 avatar Oct 30 '23 17:10 wsw70

The only way to bring them back on is to physically turn the power off at the wall socket.

There is actually a way to do this, at least on my meross 315. I've discovered that they drop the mqtt connection but stay connected to the wifi. If I disconnect them from the wifi from my unifi control panel they reconnect to mqtt without needing to physically interact with them or turning off the device! image

fuomag9 avatar Feb 10 '24 12:02 fuomag9

I'm also getting frequent disconnects from my power sockets. I'm on the latest version of MerossLan 5.0.2

I have quite a few devices, all still connected to Meross Cloud, all with reserved DHCP entries, all connected in the integration via http method.

image

Can nothing be done about this?

gary-sargent avatar Mar 14 '24 13:03 gary-sargent

Also just to add I have ping monitoring of all the devices, and they are being pinged every 20 seconds. Every single ping is coming back with a reply - suggesting no network issues.

gary-sargent avatar Mar 14 '24 13:03 gary-sargent

Also just to add I have ping monitoring of all the devices, and they are being pinged every 20 seconds. Every single ping is coming back with a reply - suggesting no network issues.

Yeah, I'm 100% sure it's a software issue, maybe in Meross-lan? The disconnect method seems to have broken recently (I.e. the plug does not recover) so I'm stuck with non functional ones until I replug them 😭

fuomag9 avatar Mar 14 '24 13:03 fuomag9

For those having disconnect issues, I'd make certain that you aren't using the blank device key hack. Since I've gone through and properly set up all of my devices with a device key, including creating local-only users for every device in Home Assistant with the proper password that corresponds to the device key and MAC address, I've no longer experienced any disconnects. My devices have been rock solid and working over both IP & MQTT automatically.
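
For reference, this is a sketch of how such a per-device password could be derived. It rests on an assumption about the pairing scheme rather than on anything confirmed in this thread: that the device logs in with its MAC address as the username and `<userId>_<md5(mac + key)>` as the password. `mqtt_credentials` is a hypothetical helper, so verify the result against what your broker actually receives before relying on it.

```python
import hashlib


def mqtt_credentials(mac: str, key: str, user_id: int = 0) -> tuple[str, str]:
    """Return (username, password) under the ASSUMED Meross local-MQTT scheme.

    mac must be written exactly as the device reports it (for example the
    colon-separated form shown in the broker log); key is the device key.
    """
    digest = hashlib.md5(f"{mac}{key}".encode("utf-8")).hexdigest()
    return mac, f"{user_id}_{digest}"
```

Under this assumption a blank key still yields a valid-looking password (the MD5 of the MAC alone), which may be why the blank-key setup connects to the broker at all.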

timnolte avatar Mar 14 '24 15:03 timnolte

Oh, I would also say, if at all possible, set up your local IP addressing such that your devices always get assigned the same IP address. I've configured my local LAN DHCP server to statically assign fixed IP addresses to all of my Meross devices. This ensures the IP addresses don't change, which can otherwise make devices more susceptible to connectivity issues.

timnolte avatar Mar 14 '24 15:03 timnolte

@timnolte what do you mean by "blank device key hack"? If I click configure integration in HA, then the "device key" field is filled in for devices that are going offline. My devices already always get the same static IP.

I'm not using MQTT I'm using the http method.

gary-sargent avatar Mar 14 '24 16:03 gary-sargent

@gary-sargent well, I will point out that this issue was targeted around using MQTT. So I'm not certain that it's appropriate to be continuing to have folks posting other issues to this thread that aren't specific to the original issue. Using the Custom Pairer App is specifically for use with MQTT. It sounds like you perhaps also have your devices connected to Meross Cloud and are using local IP control? Do you have your devices set to http only and not auto?

timnolte avatar Mar 14 '24 17:03 timnolte