Matter over Thread Devices unavailable - Home Assistant Only
The problem
I have a problem with Matter devices. I use an Apple Thread network with 60+ Matter over Thread devices (Eve and Aqara). Everything worked perfectly until the last one or two HA / Matter Server updates.
Now some devices are unavailable in Home Assistant but still work perfectly in Apple Home, so it is not a network problem.
What version of Home Assistant Core has the issue?
core-2024.8.1
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant OS
Integration causing the issue
Matter Server
Link to integration documentation on our website
https://www.home-assistant.io/integrations/matter/
Diagnostics information
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
No response
Additional information
No response
EDIT by @marcelveldt:
We have identified this as a global/common issue. Please see this post for more info how to help troubleshoot. Thanks! https://github.com/home-assistant/core/issues/123835#issuecomment-2324263470
I'm seeing a similar issue too. Not sure if this started with the update to 2024.8.0 or Matter Server 6.4.1, but a Matter light I have (Nanoleaf Essentials) no longer responds to on/off commands. Changing the colour and the brightness seems to work, but it remains in an Off state. A button set up in HA to toggle the light no longer works either.
As soon as I restart the Matter Server, on/off commands start to work again for a few days, then they stop.
Same problem here. After the restart following today's operating system update, some of my Matter devices (Eve Energy plug and Eve Thermo) are unavailable, but only in Home Assistant. In Apple HomeKit and the Eve app everything works fine, as it did in HA before the update/restart. Some other users have had the same problem after a Matter Server update or a core update to 2024.8.*
The same issue here, I assume starting with Home Assistant OS 2024.8. Devices become unavailable after a few hours. A restart of the Matter Server mostly brings the devices back again. Sometimes I have to restart the Matter Server twice.
It's strange, because everything worked fine without any issues from the beginning of the year until Wednesday last week.
- core-2024.8.1
- supervisor-2024.08.0
- Home Assistant OS 13.0
- Matter Server Add-on 6.4.1
- OpenThread Border Router Add-on 2.9.1
As soon as one or more devices go offline, this error pops up in the Matter Server log. Mostly it is the Eve Energy that goes offline, but all other devices also go offline at irregular intervals.
2024-08-15 00:46:40.905 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:176366169 on exchange 6698i with Node: <0000000000000027, 1> sendCount: 4 max retries: 4
2024-08-15 00:46:41.213 (Dummy-2) CHIP_PROGRESS [chip.native.EM] Retransmitting MessageCounter:81826591 on exchange 6700i with Node: <000000000000003C, 1> Send Cnt 4
2024-08-15 00:46:43.394 (Dummy-2) CHIP_PROGRESS [chip.native.SC] SecureSession[0x7f4c980a4b00, LSID:572]: State change 'kActive' --> 'kDefunct'
2024-08-15 00:46:43.394 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 6698i with Node: <0000000000000027, 1>
After disabling the OpenThread Border Router, Thread looks more stable. I am now running only with the existing 7 Apple Thread Border Routers, with no trouble in the last 24h.
Edit: Unfortunately, devices still go offline!! They come back online after restarting the Matter Server. The whole thing is very frustrating!
Same here. Also, when that happens on my HA green, the matter add-on shows cpu usage >>25% constantly (stuck at always the same number) where it normally would be <<10% (varying). So it looks like the add-on freezes.
Another observation: Although the number of devices doesn't change the amount of memory used (up to the point where it "freezes") will constantly rise to somewhere between 10 and 20%
Same here. Stopping OTBR similarly improved the situation for me and brought some (but not all) devices back online.
With the current HA (2024.8.2), not all devices come back online, even after a restart of the Matter Server?!
2024-08-25 16:12:14.638 (Dummy-2) CHIP_ERROR [chip.native.DIS] Timeout waiting for mDNS resolution.
2024-08-25 16:12:22.210 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 45003 ms
2024-08-25 16:12:22.210 (Dummy-2) CHIP_ERROR [chip.native.DIS] OperationalSessionSetup[1:000000000000003C]: operational discovery failed: src/lib/address_resolve/AddressResolve_DefaultImpl.cpp:119: CHIP Error 0x00000032: Timeout
2024-08-25 16:12:22.211 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 43489 ms
2024-08-25 16:12:22.211 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 42529 ms
2024-08-25 16:12:22.211 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 38577 ms
2024-08-25 16:12:22.211 (MainThread) WARNING [matter_server.server.device_controller] <Node:60> Setup for node failed: Unable to establish CASE session with Node 60
2024-08-25 16:12:23.722 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 45000 ms
2024-08-25 16:12:23.723 (Dummy-2) CHIP_ERROR [chip.native.DIS] OperationalSessionSetup[1:0000000000000030]: operational discovery failed: src/lib/address_resolve/AddressResolve_DefaultImpl.cpp:119: CHIP Error 0x00000032: Timeout
2024-08-25 16:12:23.723 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 44041 ms
2024-08-25 16:12:23.724 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 40090 ms
2024-08-25 16:12:23.725 (MainThread) WARNING [matter_server.server.device_controller] <Node:48> Setup for node failed: Unable to establish CASE session with Node 48
2024-08-25 16:12:24.683 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 45001 ms
2024-08-25 16:12:24.683 (Dummy-2) CHIP_ERROR [chip.native.DIS] OperationalSessionSetup[1:0000000000000029]: operational discovery failed: src/lib/address_resolve/AddressResolve_DefaultImpl.cpp:119: CHIP Error 0x00000032: Timeout
2024-08-25 16:12:24.683 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 41049 ms
2024-08-25 16:12:24.683 (MainThread) WARNING [matter_server.server.device_controller] <Node:41> Setup for node failed: Unable to establish CASE session with Node 41
2024-08-25 16:12:25.571 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 41937 ms
2024-08-25 16:12:25.571 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 200 ms
2024-08-25 16:12:26.166 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Found an existing secure session to [1:0000000000000027]!
2024-08-25 16:12:26.167 (Dummy-2) CHIP_PROGRESS [chip.native.EM] <<< [E:65038i S:55630 M:12336338] (S) Msg TX to 1:0000000000000027 [45F7] [UDP:[fd5d:81a0:e70b:0:1e7f:7cfb:c99a:5a0a]:5540] --- Type 0001:02 (IM:ReadRequest)
2024-08-25 16:12:26.401 (Dummy-2) CHIP_PROGRESS [chip.native.EM] >>> [E:65038i S:55630 M:160473088 (Ack:12336338)] (S) Msg RX from 1:0000000000000027 [45F7] --- Type 0001:05 (IM:ReportData)
2024-08-25 16:12:26.402 (Dummy-2) CHIP_PROGRESS [chip.native.EM] <<< [E:65038i S:55630 M:12336339 (Ack:160473088)] (S) Msg TX to 1:0000000000000027 [45F7] [UDP:[fd5d:81a0:e70b:0:1e7f:7cfb:c99a:5a0a]:5540] --- Type 0000:10 (SecureChannel:StandaloneAck)
2024-08-25 16:12:28.634 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 45000 ms
2024-08-25 16:12:28.634 (Dummy-2) CHIP_ERROR [chip.native.DIS] OperationalSessionSetup[1:000000000000003E]: operational discovery failed: src/lib/address_resolve/AddressResolve_DefaultImpl.cpp:119: CHIP Error 0x00000032: Timeout
2024-08-25 16:12:28.635 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 3264 ms
2024-08-25 16:12:28.635 (MainThread) WARNING [matter_server.server.device_controller] <Node:62> Setup for node failed: Unable to establish CASE session with Node 62
2024-08-25 16:12:51.833 (Dummy-2) CHIP_PROGRESS [chip.native.EM] >>> [E:38994r S:55630 M:160473089] (S) Msg RX from 1:0000000000000027 [45F7] --- Type 0001:05 (IM:ReportData)
2024-08-25 16:12:51.833 (Dummy-2) CHIP_PROGRESS [chip.native.DMG] Refresh LivenessCheckTime for 64224 milliseconds with SubscriptionId = 0x1db97744 Peer = 01:0000000000000027
2024-08-25 16:12:51.833 (Dummy-2) CHIP_PROGRESS [chip.native.EM] <<< [E:38994r S:55630 M:12336340 (Ack:160473089)] (S) Msg TX to 1:0000000000000027 [45F7] [UDP:[fd5d:81a0:e70b:0:1e7f:7cfb:c99a:5a0a]:5540] --- Type 0001:01 (IM:StatusResponse)
2024-08-25 16:12:51.948 (Dummy-2) CHIP_PROGRESS [chip.native.EM] >>> [E:38994r S:55630 M:160473090 (Ack:12336340)] (S) Msg RX from 1:0000000000000027 [45F7] --- Type 0000:10 (SecureChannel:StandaloneAck)
2024-08-25 16:12:56.373 (Dummy-2) CHIP_ERROR [chip.native.DIS] Timeout waiting for mDNS resolution.
2024-08-25 16:12:58.403 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Found an existing secure session to [1:0000000000000027]!
2024-08-25 16:12:58.404 (Dummy-2) CHIP_PROGRESS [chip.native.EM] <<< [E:65039i S:55630 M:12336341] (S) Msg TX to 1:0000000000000027 [45F7] [UDP:[fd5d:81a0:e70b:0:1e7f:7cfb:c99a:5a0a]:5540] --- Type 0001:02 (IM:ReadRequest)
2024-08-25 16:12:58.648 (Dummy-2) CHIP_PROGRESS [chip.native.EM] >>> [E:65039i S:55630 M:160473091 (Ack:12336341)] (S) Msg RX from 1:0000000000000027 [45F7] --- Type 0001:05 (IM:ReportData)
2024-08-25 16:12:58.648 (Dummy-2) CHIP_PROGRESS [chip.native.EM] <<< [E:65039i S:55630 M:12336342 (Ack:160473091)] (S) Msg TX to 1:0000000000000027 [45F7] [UDP:[fd5d:81a0:e70b:0:1e7f:7cfb:c99a:5a0a]:5540] --- Type 0000:10 (SecureChannel:StandaloneAck)
2024-08-25 16:13:10.371 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 45000 ms
2024-08-25 16:13:10.371 (Dummy-2) CHIP_ERROR [chip.native.DIS] OperationalSessionSetup[1:000000000000003C]: operational discovery failed: src/lib/address_resolve/AddressResolve_DefaultImpl.cpp:119: CHIP Error 0x00000032: Timeout
2024-08-25 16:13:13.574 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 200 ms
2024-08-25 16:13:30.651 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Found an existing secure session to [1:0000000000000027]!
2024-08-25 16:13:30.652 (Dummy-2) CHIP_PROGRESS [chip.native.EM] <<< [E:65040i S:55630 M:12336343] (S) Msg TX to 1:0000000000000027 [45F7] [UDP:[fd5d:81a0:e70b:0:1e7f:7cfb:c99a:5a0a]:5540] --- Type 0001:02 (IM:ReadRequest)
2024-08-25 16:13:30.885 (Dummy-2) CHIP_PROGRESS [chip.native.EM] >>> [E:65040i S:55630 M:160473092 (Ack:12336343)] (S) Msg RX from 1:0000000000000027 [45F7] --- Type 0001:05 (IM:ReportData)
2024-08-25 16:13:30.886 (Dummy-2) CHIP_PROGRESS [chip.native.EM] <<< [E:65040i S:55630 M:12336344 (Ack:160473092)] (S) Msg TX to 1:0000000000000027 [45F7] [UDP:[fd5d:81a0:e70b:0:1e7f:7cfb:c99a:5a0a]:5540] --- Type 0000:10 (SecureChannel:StandaloneAck)
2024-08-25 16:13:44.376 (Dummy-2) CHIP_ERROR [chip.native.DIS] Timeout waiting for mDNS resolution.
2024-08-25 16:13:51.922 (Dummy-2) CHIP_PROGRESS [chip.native.EM] >>> [E:38995r S:55630 M:160473093] (S) Msg RX from 1:0000000000000027 [45F7] --- Type 0001:05 (IM:ReportData)
2024-08-25 16:13:51.923 (Dummy-2) CHIP_PROGRESS [chip.native.DMG] Refresh LivenessCheckTime for 64224 milliseconds with SubscriptionId = 0x1db97744 Peer = 01:0000000000000027
2024-08-25 16:13:51.923 (Dummy-2) CHIP_PROGRESS [chip.native.EM] <<< [E:38995r S:55630 M:12336345 (Ack:160473093)] (S) Msg TX to 1:0000000000000027 [45F7] [UDP:[fd5d:81a0:e70b:0:1e7f:7cfb:c99a:5a0a]:5540] --- Type 0001:01 (IM:StatusResponse)
2024-08-25 16:13:52.037 (Dummy-2) CHIP_PROGRESS [chip.native.EM] >>> [E:38995r S:55630 M:160473094 (Ack:12336345)] (S) Msg RX from 1:0000000000000027 [45F7] --- Type 0000:10 (SecureChannel:StandaloneAck)
2024-08-25 16:13:58.374 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 45000 ms
2024-08-25 16:13:58.374 (Dummy-2) CHIP_ERROR [chip.native.DIS] OperationalSessionSetup[1:000000000000003C]: operational discovery failed: src/lib/address_resolve/AddressResolve_DefaultImpl.cpp:119: CHIP Error 0x00000032: Timeout
2024-08-25 16:13:58.374 (MainThread) WARNING [matter_server.server.device_controller] <Node:60> Setup for node failed: Unable to establish CASE session with Node 60
2024-08-25 16:14:02.889 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Found an existing secure session to [1:0000000000000027]!
2024-08-25 16:14:02.890 (Dummy-2) CHIP_PROGRESS [chip.native.EM] <<< [E:65041i S:55630 M:12336346] (S) Msg TX to 1:0000000000000027 [45F7] [UDP:[fd5d:81a0:e70b:0:1e7f:7cfb:c99a:5a0a]:5540] --- Type 0001:02 (IM:ReadRequest)
2024-08-25 16:14:03.126 (Dummy-2) CHIP_PROGRESS [chip.native.EM] >>> [E:65041i S:55630 M:160473095 (Ack:12336346)] (S) Msg RX from 1:0000000000000027 [45F7] --- Type 0001:05 (IM:ReportData)
2024-08-25 16:14:03.126 (Dummy-2) CHIP_PROGRESS [chip.native.EM] <<< [E:65041i S:55630 M:12336347 (Ack:160473095)] (S) Msg TX to 1:0000000000000027 [45F7] [UDP:[fd5d:81a0:e70b:0:1e7f:7cfb:c99a:5a0a]:5540] --- Type 0000:10 (SecureChannel:StandaloneAck)
2024-08-25 16:14:05.922 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 200 ms
2024-08-25 16:14:05.922 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 201 ms
2024-08-25 16:14:05.922 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 200 ms
2024-08-25 16:14:35.128 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Found an existing secure session to [1:0000000000000027]!
2024-08-25 16:14:35.129 (Dummy-2) CHIP_PROGRESS [chip.native.EM] <<< [E:65042i S:55630 M:12336348] (S) Msg TX to 1:0000000000000027 [45F7] [UDP:[fd5d:81a0:e70b:0:1e7f:7cfb:c99a:5a0a]:5540] --- Type 0001:02 (IM:ReadRequest)
2024-08-25 16:14:35.358 (Dummy-2) CHIP_PROGRESS [chip.native.EM] >>> [E:65042i S:55630 M:160473096 (Ack:12336348)] (S) Msg RX from 1:0000000000000027 [45F7] --- Type 0001:05 (IM:ReportData)
2024-08-25 16:14:35.359 (Dummy-2) CHIP_PROGRESS [chip.native.EM] <<< [E:65042i S:55630 M:12336349 (Ack:160473096)] (S) Msg TX to 1:0000000000000027 [45F7] [UDP:[fd5d:81a0:e70b:0:1e7f:7cfb:c99a:5a0a]:5540] --- Type 0000:10 (SecureChannel:StandaloneAck)
2024-08-25 16:14:36.725 (Dummy-2) CHIP_ERROR [chip.native.DIS] Timeout waiting for mDNS resolution.
2024-08-25 16:14:36.725 (Dummy-2) CHIP_ERROR [chip.native.DIS] Timeout waiting for mDNS resolution.
2024-08-25 16:14:36.725 (Dummy-2) CHIP_ERROR [chip.native.DIS] Timeout waiting for mDNS resolution.
2024-08-25 16:14:50.722 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 45001 ms
2024-08-25 16:14:50.722 (Dummy-2) CHIP_ERROR [chip.native.DIS] OperationalSessionSetup[1:0000000000000030]: operational discovery failed: src/lib/address_resolve/AddressResolve_DefaultImpl.cpp:119: CHIP Error 0x00000032: Timeout
2024-08-25 16:14:50.723 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 45001 ms
2024-08-25 16:14:50.723 (Dummy-2) CHIP_ERROR [chip.native.DIS] OperationalSessionSetup[1:0000000000000029]: operational discovery failed: src/lib/address_resolve/AddressResolve_DefaultImpl.cpp:119: CHIP Error 0x00000032: Timeout
2024-08-25 16:14:50.723 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 45000 ms
2024-08-25 16:14:50.723 (Dummy-2) CHIP_ERROR [chip.native.DIS] OperationalSessionSetup[1:000000000000003E]: operational discovery failed: src/lib/address_resolve/AddressResolve_DefaultImpl.cpp:119: CHIP Error 0x00000032: Timeout
2024-08-25 16:14:52.028 (Dummy-2) CHIP_PROGRESS [chip.native.EM] >>> [E:38996r S:55630 M:160473097] (S) Msg RX from 1:0000000000000027 [45F7] --- Type 0001:05 (IM:ReportData)
2024-08-25 16:14:52.028 (Dummy-2) CHIP_PROGRESS [chip.native.DMG] Refresh LivenessCheckTime for 64224 milliseconds with SubscriptionId = 0x1db97744 Peer = 01:0000000000000027
2024-08-25 16:14:52.029 (Dummy-2) CHIP_PROGRESS [chip.native.EM] <<< [E:38996r S:55630 M:12336350 (Ack:160473097)] (S) Msg TX to 1:0000000000000027 [45F7] [UDP:[fd5d:81a0:e70b:0:1e7f:7cfb:c99a:5a0a]:5540] --- Type 0001:01 (IM:StatusResponse)
2024-08-25 16:14:52.139 (Dummy-2) CHIP_PROGRESS [chip.native.EM] >>> [E:38996r S:55630 M:160473098 (Ack:12336350)] (S) Msg RX from 1:0000000000000027 [45F7] --- Type 0000:10 (SecureChannel:StandaloneAck)
2024-08-25 16:14:53.925 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 200 ms
2024-08-25 16:14:53.925 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 200 ms
2024-08-25 16:14:53.925 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 199 ms
2024-08-25 16:14:53.925 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Keeping DNSSD lookup active
2024-08-25 16:14:53.927 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 202 ms
2024-08-25 16:14:53.927 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 201 ms
2024-08-25 16:14:53.927 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 200 ms
2024-08-25 16:15:07.361 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Found an existing secure session to [1:0000000000000027]!
2024-08-25 16:15:07.362 (Dummy-2) CHIP_PROGRESS [chip.native.EM] <<< [E:65043i S:55630 M:12336351] (S) Msg TX to 1:0000000000000027 [45F7] [UDP:[fd5d:81a0:e70b:0:1e7f:7cfb:c99a:5a0a]:5540] --- Type 0001:02 (IM:ReadRequest)
2024-08-25 16:15:07.596 (Dummy-2) CHIP_PROGRESS [chip.native.EM] >>> [E:65043i S:55630 M:160473099 (Ack:12336351)] (S) Msg RX from 1:0000000000000027 [45F7] --- Type 0001:05 (IM:ReportData)
2024-08-25 16:15:07.597 (Dummy-2) CHIP_PROGRESS [chip.native.EM] <<< [E:65043i S:55630 M:12336352 (Ack:160473099)] (S) Msg TX to 1:0000000000000027 [45F7] [UDP:[fd5d:81a0:e70b:0:1e7f:7cfb:c99a:5a0a]:5540] --- Type 0000:10 (SecureChannel:StandaloneAck)
2024-08-25 16:15:24.726 (Dummy-2) CHIP_ERROR [chip.native.DIS] Timeout waiting for mDNS resolution.
2024-08-25 16:15:24.726 (Dummy-2) CHIP_ERROR [chip.native.DIS] Timeout waiting for mDNS resolution.
2024-08-25 16:15:24.727 (Dummy-2) CHIP_ERROR [chip.native.DIS] Timeout waiting for mDNS resolution.
2024-08-25 16:15:38.726 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 45001 ms
2024-08-25 16:15:38.726 (Dummy-2) CHIP_ERROR [chip.native.DIS] OperationalSessionSetup[1:0000000000000030]: operational discovery failed: src/lib/address_resolve/AddressResolve_DefaultImpl.cpp:119: CHIP Error 0x00000032: Timeout
2024-08-25 16:15:38.727 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 45001 ms
2024-08-25 16:15:38.727 (Dummy-2) CHIP_ERROR [chip.native.DIS] OperationalSessionSetup[1:0000000000000029]: operational discovery failed: src/lib/address_resolve/AddressResolve_DefaultImpl.cpp:119: CHIP Error 0x00000032: Timeout
2024-08-25 16:15:38.727 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Checking node lookup status after 45000 ms
2024-08-25 16:15:38.727 (Dummy-2) CHIP_ERROR [chip.native.DIS] OperationalSessionSetup[1:000000000000003E]: operational discovery failed: src/lib/address_resolve/AddressResolve_DefaultImpl.cpp:119: CHIP Error 0x00000032: Timeout
2024-08-25 16:15:38.728 (MainThread) WARNING [matter_server.server.device_controller] <Node:48> Setup for node failed: Unable to establish CASE session with Node 48
2024-08-25 16:15:38.728 (MainThread) WARNING [matter_server.server.device_controller] <Node:41> Setup for node failed: Unable to establish CASE session with Node 41
2024-08-25 16:15:38.728 (MainThread) WARNING [matter_server.server.device_controller] <Node:62> Setup for node failed: Unable to establish CASE session with Node 62
2024-08-25 16:15:39.598 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Found an existing secure session to [1:0000000000000027]!
2024-08-25 16:15:39.599 (Dummy-2) CHIP_PROGRESS [chip.native.EM] <<< [E:65044i S:55630 M:12336353] (S) Msg TX to 1:0000000000000027 [45F7] [UDP:[fd5d:81a0:e70b:0:1e7f:7cfb:c99a:5a0a]:5540] --- Type 0001:02 (IM:ReadRequest)
2024-08-25 16:15:39.836 (Dummy-2) CHIP_PROGRESS [chip.native.EM] >>> [E:65044i S:55630 M:160473100 (Ack:12336353)] (S) Msg RX from 1:0000000000000027 [45F7] --- Type 0001:05 (IM:ReportData)
2024-08-25 16:15:39.837 (Dummy-2) CHIP_PROGRESS [chip.native.EM] <<< [E:65044i S:55630 M:12336354 (Ack:160473100)] (S) Msg TX to 1:0000000000000027 [45F7] [UDP:[fd5d:81a0:e70b:0:1e7f:7cfb:c99a:5a0a]:5540] --- Type 0000:10 (SecureChannel:StandaloneAck)
For me, Matter is out of the game for now.
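A note for anyone reading logs like the one above: the CHIP SDK lines print node IDs as 16 hex digits (for example `000000000000003C`), while the `matter_server` warnings print them in decimal (`Node 60`). The following is only a rough sketch for tallying failed setups per node from a saved log. The function names and the `sample` text are made up for illustration, and the regular expression assumes the warning wording shown in this thread.

```python
import re
from collections import Counter

# Assumed wording, copied from the warnings in this thread:
#   <Node:60> Setup for node failed: Unable to establish CASE session with Node 60
CASE_FAILURE = re.compile(
    r"<Node:(\d+)> Setup for node failed: Unable to establish CASE session"
)


def count_case_failures(log_text: str) -> Counter:
    """Count failed CASE session setups per decimal node ID."""
    return Counter(int(m.group(1)) for m in CASE_FAILURE.finditer(log_text))


def sdk_node_id(node_id: int) -> str:
    """Format a decimal node ID as the 16 hex digits used in the CHIP SDK lines."""
    return f"{node_id:016X}"


# Hypothetical sample: three warning lines taken from the log above.
sample = """\
2024-08-25 16:12:22.211 (MainThread) WARNING [matter_server.server.device_controller] <Node:60> Setup for node failed: Unable to establish CASE session with Node 60
2024-08-25 16:12:23.725 (MainThread) WARNING [matter_server.server.device_controller] <Node:48> Setup for node failed: Unable to establish CASE session with Node 48
2024-08-25 16:13:58.374 (MainThread) WARNING [matter_server.server.device_controller] <Node:60> Setup for node failed: Unable to establish CASE session with Node 60
"""

for node, failures in count_case_failures(sample).most_common():
    print(f"Node {node} ({sdk_node_id(node)}): {failures} failed setup attempt(s)")
```

This makes it easier to match a `Setup for node failed` warning to the `OperationalSessionSetup[1:...]` discovery errors for the same device.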
Same problem here also. Noticed that with Home Assistant Core 2024.8.0 and Matter Server 6.4.1 all of my Eve Matter devices have gone offline and haven't come back at all. Restarting Home Assistant, Disabling Open Thread Border Router and Restarting the Matter server didn't help at all. So as of right now, none of my Matter devices are available in Home Assistant, but are working without issues in Apple Home.
Reverting to Matter server 6.3.0 fixes this for me. Surely this is a regression at 6.4
@PENorris How can I downgrade to 6.3?
> @PENorris How can I downgrade to 6.3?
I'm using python-matter-server in a docker environment, so it's a bit simpler. If you're using Matter as an add-on, you could try this method, but I can't vouch for it. https://community.home-assistant.io/t/how-to-downgrade-addons/331223/27
Unfortunately I couldn't downgrade. No matter what I do, the local add-ons section is missing?! I followed the instructions exactly.
Thought I could test the beta 6.5.0. I activated beta in the configuration, but no matter how often I restart the Matter server, the version remains 6.4.1.
I also have the same issue, mostly with Eve devices.
> Unfortunately I couldn't downgrade. No matter what I do, the local add-ons section is missing?! I followed the instructions exactly.
The "local add-ons" section only appears if you develop a custom add-on locally. You likely won't have that one.
If you use the official Matter Server add-on, you need to use the Backup functionality: on every upgrade, Supervisor creates a backup. They are named like addon_core_matter_server_6.4.0 or addon_core_matter_server_6.3.1; simply click on one and press restore. Note that the cryptographic material is stored on the Matter Server, and restoring an older backup restores the old state as well, so you'll lose connectivity to any devices added since that backup. If you have added devices since then, they won't work anymore. Ideally you remove those first, then downgrade, and then add them again.
We are looking into this issue, as we seem to get multiple reports about this issue, although the logging shows strong evidence of signal issues. So either the Thread network itself is not healthy (signal issues, interference) or the Matter Server somehow picks a bad route to the devices.
Anyway, I had this issue on my own home network, with all the logging about timeouts, failing mDNS resolution, etc. I then started to optimize my network by removing a few HomePod minis that had a somewhat bad Wi-Fi signal, and I picked new channels for Wi-Fi and Zigbee. Now all my Wi-Fi APs are on channel 1, my Zigbee is on channel 20 (which is more or less Wi-Fi channel 6) and Thread runs on channel 25 (more or less Wi-Fi channel 11).
Since then: a totally silent Matter Server log, and all my devices stay available.
If you start replying with "but it works with apple or google": do note that these ecosystems do not care as much about network health as we do; they probably do a ton of retries to completely hide network instability issues, or they have tweaked certain things we are not (yet) aware of. Also, multi-admin is a cool feature, but it increases network traffic big time: having a device added to multiple fabrics duplicates the traffic. So if the signal is already somewhat disturbed, it will amplify the issue even more.
@marcelveldt
Good explanation, I agree with it. But why did the problem start after the last update of the Thread/Matter server? Did you change something in the sensitivity of the server?
> Why did the problem start after the last update of the Thread/Matter server? Did you change something in the sensitivity of the server?
That is something we're trying to find out, but it is still good to have a healthy network as a starting point. Many reports of timeouts and/or failed mDNS resolution are an indication of Thread network instability. It could be that we are a bit more sensitive to that nowadays, but it is still good to get your network as optimized as possible. That is something you can actually do with Home Assistant, as we provide this level of detail.
We are looking into this issue, as we seem to get multiple reports about it, both here and on Discord, although the logging shows strong evidence of (Thread) network issues. So either the Thread network itself is not healthy (signal issues, interference) or the Matter Server somehow picks a bad route to the devices.
If you experience this issue, please reply here with a short message:
- What version of HAOS ?
- What version of the Matter Server ?
- What is the cpu/memory usage of the Matter Server when this issue exists ?
- What Thread Border Router(s) you use ?
- Are your devices also joined to other ecosystems (If so, to which and do they keep working there?)
- What devices (vendor+model) are you experiencing issues with ?
- Can you still ping the device from the HA device details page ?
- Does a restart of the device(s) bring the device back online in Home Assistant ?
- Does a restart of your Thread Border Router(s) bring the device back online in Home Assistant ?
- Does a Home Assistant core restart fix the issue ?
- Does a Matter Server restart fix the issue ?
- If none of the above helps: Does a complete (HAOS) host reboot fix the issue ?
If you have anything else relevant to share, please do so, but do not paste logging output here. If you want to share logging, please attach it as a file to your message. Otherwise this thread becomes unreadable very quickly.
Thanks!
I'm experiencing this as well; see the info below. I have a very simple setup with 1 HomePod mini as Thread border router and only one node, an Aqara U200 smart lock.
- What version of HAOS ? 13.1
- What version of the Matter Server ? 6.4.1
- What Thread Border Router(s) you use ? HomePod mini, 2 meters from the lock.
- Are your devices also joined to other ecosystems (If so, to which and do they keep working there?) Apple Home, and it keeps working there. The HomePod mini is also working in HA as a media device.
- What devices (vendor+model) are you experiencing issues with ? Aqara U200 smart lock
- Can you still ping the device from the HA device details page ? no
- Does a restart of the device(s) bring the device back online in Home Assistant ? No
- Does a restart of your Thread Border Router(s) bring the device back online in Home Assistant ? No
- Does a Home Assistant core restart fix the issue ? no
- Does a Matter Server restart fix the issue ? Yes
Previously with HAOS 2024.8.3
- What version of HAOS ? 2024.8.3
- What version of the Matter Server ? 6.4.1
- What is the cpu/memory usage of the Matter Server when this issue exists ? No impact
- What Thread Border Router(s) you use ? ConBee II
- Are your devices are also joined to other ecosystems (If so, to which and do they keep working there?) Apple Home / YES
- What devices (vendor+model) are you experiencing issues with ? Doesn't matter (EVE Systems, Aqara, Siegenia, Nanoleaf)
- Can you still ping the device from the HA device details page ? YES
- Does a restart of the device(s) bring the device back online in Home Assistant ? NO
- Does a restart of your Thread Border Router(s) bring the device back online in Home Assistant ? NO
- Does a Home Assistant core restart fix the issue ? NO
- Does a Matter Server restart fix the issue ? YES
- If none of the above helps: Does a complete (HAOS) host reboot fix the issue ? Sometimes
Current HAOS 2024.9.0b2
Devices still go offline randomly, but they come back online automatically after a few seconds or minutes.
@marcelveldt
While this is being looked at, do you know of a way for HAOS users to downgrade to 6.3.0?
> While this is being looked at, do you know of a way for HAOS users to downgrade to 6.3.0?
Restore your backup
Today the connection was lost again, 4 days after I last restarted the Matter Server. The device still updates in Apple Home. I found this in the Matter Server log. The setup is still the same: 1 HomePod mini talking to the Aqara U200. After a restart of the Matter Server it's back again. Hope this helps in finding the solution.
2024-09-06 10:29:35.782 (MainThread) INFO [matter_server.server.device_controller.mdns] Node:1 Re-discovered on mDNS
2024-09-06 10:29:35.783 (MainThread) INFO [matter_server.server.device_controller] Node:1 Setting-up node...
2024-09-06 10:29:35.788 (MainThread) INFO [matter_server.server.device_controller] Node:1 Setting up attributes and events subscription.
2024-09-06 10:30:14.104 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:175275614 on exchange 58792i with Node: <0000000000000001, 2> sendCount: 4 max retries: 4
2024-09-06 10:30:17.259 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 58792i with Node: <0000000000000001, 2>
2024-09-06 10:30:17.261 (MainThread) WARNING [matter_server.server.device_controller] Node:1 Unable to subscribe to Node: src/app/ReadClient.cpp:682: CHIP Error 0x00000032: Timeout
I have 3 Border Routers, a SkyConnect and 2 HomePod minis, and my Matter network was buggy as heck. Devices went offline and might or might not come back online, there were problems with pairing, and so on.
So I tried shutting down one of my 2 HomePod minis. I thought maybe too many Border Routers might cause problems, and that actually was the case. Now that I only have the SkyConnect and 1 HomePod mini as Border Routers, everything has been working flawlessly for days. Knock on wood.
I’m experiencing the same issue. Everything works flawlessly in Apple Home, but not in Home Assistant. @marcelveldt, in your previous response, you mentioned:
> If you start replying with "but it works with apple or google": Do note that these ecosystems do not care as much about network health as we do; they probably do a ton of retries to completely hide network instability issues or they have tweaked certain things we are not (yet) aware of.
With respect, that’s missing the point. End users don’t care about the technical details of network health; they care that their devices work as expected, and right now, Apple Home manages to do this while Home Assistant does not. Whether Apple Home retries more often or not is irrelevant—retries are a best practice for network resilience, especially in IoT environments where instability is common. The fact remains that when I control a device in Apple Home, it works. When I physically change a device’s state, it reports back accurately. The same devices consistently fail to perform as expected in Home Assistant.
Your suggestion to “fix the root problem” has been made repeatedly over the last year (e.g., #110218), but we’re still seeing the same issues. The focus on Thread network health is undermining the user experience and breaking integrations. You should shift focus away from chasing network perfection and towards the end user experience, like Apple Home. Please strive to make Home Assistant at least as good as Apple Home.
To illustrate how much time I’ve invested in troubleshooting, here’s what I’ve tried so far:
- Apple TV 4K as the exclusive border router and controller (works great).
- Home Assistant as the exclusive border router and controller (doesn’t work well).
- Apple TV as the border router, Home Assistant as the controller (still doesn’t work well).
- Both Apple TV and Home Assistant as border routers and controllers (Apple Home works great, Home Assistant does not).
- Spreading out border routers to provide better coverage.
- Configuring WiFi to use only channels 1 and 6, leaving channel 11 free (no improvement).
- Adding routing end devices like Eve Home Energy (no improvement).
- Disabling Nanoleaf devices due to known instability (no improvement).
- Disabling advanced mDNS and IGMP snooping (no improvement).
To pour salt in the wound, Apple Home has worked flawlessly the entire time. And all of my devices successfully ping from Home Assistant.
Anyway, here's the info you requested here:
- What version of HAOS? 2024.9.1
- What version of the Matter Server? 6.4.2
- What is the CPU/memory usage of the Matter Server when this issue exists? 0% CPU, 1.5% RAM
- What Thread Border Router(s) do you use? Apple TV 4K (3rd gen) (latest)
- Are your devices also joined to other ecosystems (if so, to which, and do they keep working there)? Apple Home, which works great
- What devices (vendor + model) are you experiencing issues with? Inovelli light switches, Eve Energy, Eve Motion, Nanoleaf bulbs. All of them work great in Apple Home. All of them have experienced issues in Home Assistant
- Can you still ping the device from the HA device details page? Yes, I can ping every device
- Does a restart of the device(s) bring the device back online in Home Assistant? No
- Does a restart of your Thread Border Router(s) bring the device back online in Home Assistant? No
- Does a Home Assistant core restart fix the issue? No
- Does a Matter Server restart fix the issue? Yes, temporarily
- If none of the above helps: does a complete (HAOS) host reboot fix the issue? Yes, temporarily
We released a new version of the Matter Server today. Please test whether it fixes the issues. We have identified the root cause but have not yet updated this ticket.
To illustrate how much time I’ve invested in troubleshooting, here’s what I’ve tried so far:
- Apple TV 4K as the exclusive border router and controller (works great).
- Home Assistant as the exclusive border router and controller (doesn’t work well).
- Apple TV as the border router, Home Assistant as the controller (still doesn’t work well).
- Both Apple TV and Home Assistant as border routers and controllers (Apple Home works great, Home Assistant does not).
- Spreading out border routers to provide better coverage.
- Configuring WiFi to use only channels 1 and 6, leaving channel 11 free (no improvement).
- Adding routing end devices like Eve Home Energy (no improvement).
- Disabling Nanoleaf devices due to known instability (no improvement).
- Disabling advanced mDNS and IGMP snooping (no improvement).
I also spent days trying to figure out the problem and tried all of the above. As I wrote, in my case it helped to shut down one of my HomePod minis, and since then I haven't had any problems. I'm actually too scared to turn it back on and try it with the new Matter Server version that came out today.
The issue here has nothing to do with the number of border routers, using multiple controllers, WiFi interference, or any of the other factors @marcelveldt has suggested. The fact that Apple Home works perfectly for everyone, combined with the observation that restarting the Home Assistant Matter server temporarily resolves the problem, and that older versions of the Home Assistant Matter integration worked fine, points clearly to the software in Home Assistant.
The team maintaining Matter for Home Assistant has made the explicit decision to surface network issues to users, as seen in #110218. This decision is what’s driving these problems, which is why I made my previous comment. If anyone here wants to perfect their Thread network, they are welcome to dig into the logs and spend time doing that, but we don't need to expose the entire Home Assistant community to this effort.
The issue here has nothing to do with the number of border routers, using multiple controllers, WiFi interference, or any of the other factors @marcelveldt has suggested. The fact that Apple Home works perfectly for everyone, combined with the observation that restarting the Home Assistant Matter server temporarily resolves the problem, and that older versions of the Home Assistant Matter integration worked fine, points clearly to the software in Home Assistant.
Well, it might not have anything to do with it, but in my case it solved all my Matter problems; that's all I'm saying. Anyway, I'm trying the new Matter Server update now with the 3 border routers, like before.