zigbee2mqtt
zigbee2mqtt copied to clipboard
Connects to MQTT but never starts the http listener
What happened?
After a restart, it failed to start. Debug shows it connecting to mqtt successfully (mqtts://mqtt.DOMAIN
no errors, valid cert) and then it just sits until the health check restarts it. I raised that limit to 15 minutes with no change.
It originally occurred with the 1.37.0 update, but rolling back didn't help. I suspect the restart was the cause, not the update.
What did you expect to happen?
HTTP listener starts eventually
How to reproduce it (minimal and precise)
No response
Zigbee2MQTT version
unknown, frontend isn't starting. container v1.37.0
Adapter firmware version
again I can't get to it but it is updated w/in the past year
Adapter
Sonoff zb-dongle-p
Setup
helm on k3s on rpi4
Debug log
The log lists literally every device and half of my configuration secrets. I'm happy to extract anything that would help. Here are some hopefully useful excerpts:
[2024-05-07 09:34:40] info: z2m: Logging to console, file (filename: log.log)
[2024-05-07 09:34:44] debug: z2m: Loaded state from file /data/state.json
[2024-05-07 09:34:45] info: z2m: Starting Zigbee2MQTT version 1.37.0 (commit #46f34c8)
[2024-05-07 09:34:45] info: z2m: Starting zigbee-herdsman (0.45.0)
[2024-05-07 09:34:45] debug: z2m: Using zigbee-herdsman with settings: '{"adapter":{"concurrent":null,"delay":null,"disableLED":false},"backupPath":"/data/coordinator_backup.json","databaseBackupPath":"/data/database.db.backup","databasePath":"/data/database.db","network":{"channelList":[20],"extendedPanID":[.......],"networkKey":"HIDDEN","panID":....},"serialPort":{"adapter":"auto","path":"tcp://ser2sock-zigbee:10000"}}'
[2024-05-07 09:34:45] debug: zh:controller: Starting with options '{"network":{"networkKeyDistribute":false,"networkKey":[...],"panID":...,"extendedPanID":[......],"channelList":[20]},"serialPort":{"path":"tcp://ser2sock-zigbee:10000","adap
ter":"auto"},"databasePath":"/data/database.db","databaseBackupPath":"/data/database.db.backup","backupPath":"/data/coordinator_backup.json","adapter":{"disableLED":false,"concurrent":null,"delay":null}}'
[2024-05-07 09:34:45] info: zh:zstack:znp: Opening TCP socket with ser2sock-zigbee:10000
[2024-05-07 09:34:46] info: zh:zstack:znp: Socket connected
[2024-05-07 09:34:46] info: zh:zstack:znp: Socket ready
[2024-05-07 09:34:46] info: zh:zstack:znp: Writing CC2530/CC2531 skip bootloader payload
[2024-05-07 09:34:46] debug: zh:zstack:unpi:writer: --> buffer [239]
[2024-05-07 09:34:46] debug: zh:zstack:unpi:parser: <-- [33,83,69,82,50,83,79,67,75,32,67,111,110,110,101,99,116,101,100,13,10,33,83,69,82,50,83,79,67,75,32,83,69,82,73,65,76,95,67,79,78,78,69,67,84,69,68,13,10]
[2024-05-07 09:34:46] debug: zh:zstack:unpi:parser: --- parseNext [33,83,69,82,50,83,79,67,75,32,67,111,110,110,101,99,116,101,100,13,10,33,83,69,82,50,83,79,67,75,32,83,69,82,73,65,76,95,67,79,78,78,69,67,84,69,68,13,10]
[2024-05-07 09:34:47] info: zh:zstack:znp: Skip bootloader for CC2652/CC1352
[2024-05-07 09:34:47] debug: zh:zstack:znp: SREQ: --> SYS - ping - {"capabilities":1}
[2024-05-07 09:34:47] debug: zh:zstack:unpi:writer: --> frame [254,0,33,1,32]
[2024-05-07 09:34:47] debug: zh:zstack:unpi:parser: <-- [254,2,97,1,89,6,61]
[2024-05-07 09:34:47] debug: zh:zstack:unpi:parser: --- parseNext [33,83,69,82,50,83,79,67,75,32,67,111,110,110,101,99,116,101,100,13,10,33,83,69,82,50,83,79,67,75,32,83,69,82,73,65,76,95,67,79,78,78,69,67,84,69,68,13,10,254,2,97,1,89,6,61]
[2024-05-07 09:34:47] debug: zh:zstack:unpi:parser: --> parsed 2 - 3 - 1 - 1 - [89,6] - 61
[2024-05-07 09:34:47] debug: zh:zstack:znp: SRSP: <-- SYS - ping - {"capabilities":1625}
[2024-05-07 09:34:47] debug: zh:zstack:unpi:parser: --- parseNext []
[2024-05-07 09:34:47] debug: zh:zstack:znp: SREQ: --> SYS - version - {}
[2024-05-07 09:34:47] debug: zh:zstack:unpi:writer: --> frame [254,0,33,2,35]
[2024-05-07 09:34:47] debug: zh:zstack:unpi:parser: <-- [254,10,97,2,2,1,2,7,1,107,177,52,1,0,129]
[2024-05-07 09:34:47] debug: zh:zstack:unpi:parser: --- parseNext [254,10,97,2,2,1,2,7,1,107,177,52,1,0,129]
[2024-05-07 09:34:47] debug: zh:zstack:unpi:parser: --> parsed 10 - 3 - 1 - 2 - [2,1,2,7,1,107,177,52,1,0] - 129
[2024-05-07 09:34:47] debug: zh:zstack:znp: SRSP: <-- SYS - version - {"transportrev":2,"product":1,"majorrel":2,"minorrel":7,"maintrel":1,"revision":20230507}
[2024-05-07 09:34:47] debug: zh:zstack:unpi:parser: --- parseNext []
[2024-05-07 09:34:47] debug: zh:zstack: Adapter concurrent: 16
2024-05-07 09:34:47] debug: zh:zstack: Detected znp version 'zStack3x0' ({"transportrev":2,"product":1,"majorrel":2,"minorrel":7,"maintrel":1,"revision":20230507})
[2024-05-07 09:34:47] debug: zh:adapter:zstack:manager: beginning znp startup
...
[2024-05-07 09:34:47] debug: zh:adapter:zstack:manager: endpoint '1' already registered
[2024-05-07 09:34:47] debug: zh:adapter:zstack:manager: endpoint '2' already registered
[2024-05-07 09:34:47] debug: zh:adapter:zstack:manager: endpoint '3' already registered
[2024-05-07 09:34:47] debug: zh:adapter:zstack:manager: endpoint '4' already registered
[2024-05-07 09:34:47] debug: zh:adapter:zstack:manager: endpoint '5' already registered
[2024-05-07 09:34:47] debug: zh:adapter:zstack:manager: endpoint '6' already registered
[2024-05-07 09:34:47] debug: zh:adapter:zstack:manager: endpoint '8' already registered
[2024-05-07 09:34:47] debug: zh:adapter:zstack:manager: endpoint '10' already registered
[2024-05-07 09:34:47] debug: zh:adapter:zstack:manager: endpoint '11' already registered
[2024-05-07 09:34:47] debug: zh:adapter:zstack:manager: endpoint '110' already registered
[2024-05-07 09:34:47] debug: zh:adapter:zstack:manager: endpoint '12' already registered
[2024-05-07 09:34:47] debug: zh:adapter:zstack:manager: endpoint '13' already registered
[2024-05-07 09:34:47] debug: zh:adapter:zstack:manager: endpoint '47' already registered
[2024-05-07 09:34:47] debug: zh:adapter:zstack:manager: endpoint '242' already registered
...
[2024-05-07 09:34:47] debug: zh:controller: Started with result 'resumed'
[2024-05-07 09:34:47] debug: zh:controller: Injected database: true, adapter: true
...
[2024-05-07 09:34:48] debug: zh:controller:device: Request Queue (0x00124b002a9a8400): default expiration timeout set to 0
[2024-05-07 09:34:48] debug: zh:controller:device: Request Queue (0x680ae2fffef4596f): default expiration timeout set to 43200000
[2024-05-07 09:34:48] debug: zh:controller:device: Request Queue (0x0017880110372268): default expiration timeout set to 0
...
[2024-05-07 09:34:48] info: z2m: zigbee-herdsman started (resumed)
[2024-05-07 09:34:48] info: z2m: Coordinator firmware version: '{"meta":{"maintrel":1,"majorrel":2,"minorrel":7,"product":1,"revision":20230507,"transportrev":2},"type":"zStack3x0"}'
...
[2024-05-07 09:34:48] debug: z2m: Zigbee network parameters: {"channel":20,"extendedPanID":"0xZZZZZZZZZZZZZZZZ","panID":00000}
[2024-05-07 09:34:48] info: z2m: Currently 38 devices are joined:
...
[2024-05-07 09:34:48] debug: zh:controller: Disable joining
...
[2024-05-07 09:34:48] info: z2m: Connecting to MQTT server at mqtts://mqtt.DOMAIN
[2024-05-07 09:34:48] debug: z2m: Using MQTT keepalive: 60
[2024-05-07 09:34:48] debug: z2m: Using MQTT login with username: zigbee2mqtt
[2024-05-07 09:34:48] debug: z2m: Using MQTT client ID: 'zigbee2mqtt'
[2024-05-07 09:34:48] info: z2m: Connected to MQTT server
[2024-05-07 09:34:48] debug: z2m: MQTT publish: topic 'zigbee2mqtt/bridge/state', payload '{"state":"online"}'
[2024-05-07 09:34:49] debug: z2m: Received MQTT message on 'zigbee2mqtt/Bay Lights/availability' with data '{"state":"offline"}'
...
[2024-05-07 09:34:49] debug: z2m: Received MQTT message on 'zigbee2mqtt/bridge/extensions' with data '[]'
[2024-05-07 09:34:49] debug: z2m: Received MQTT message on 'zigbee2mqtt/bridge/groups' with data '[]'
...
[2024-05-07 09:34:49] debug: z2m: Received MQTT message on 'zigbee2mqtt/bridge/devices' with data .......
...
[2024-05-07 09:34:50] debug: z2m: Received MQTT message on 'zigbee2mqtt/bridge/extensions' with data '[]'
[2024-05-07 09:34:50] debug: z2m: Received MQTT message on 'zigbee2mqtt/bridge/groups' with data '[]'
...
[2024-05-07 09:34:56] debug: z2m: Received Zigbee message from 'Living Room Window', type 'attributeReport', cluster 'manuSpecificSamsungAccelerometer', data '{"acceleration":0,"x_axis":0,"y_axis":0,"z_axis":0}' from endpoint 1 with groupID 0
... It continues receiving zigbee messages
[2024-05-07 09:39:08] debug: z2m: Received Zigbee message from 'Another Device', type 'attributeReport', cluster 'genOnOff', data '{"onOff":1}' from endpoint 1 with groupID 0
...
[2024-05-07 09:45:00] debug: zh:zstack:backup: fetched adapter network security material table (capacity=1, used=1)
[2024-05-07 09:45:00] debug: zh:zstack:backup: Following devices with link key are missing from new backup but present in old backup and database, adding them back:
[2024-05-07 09:45:00] info: zh:controller: Wrote coordinator backup to '/data/coordinator_backup.json'
Lots of received messages but nothing is showing on MQTT under zigbee/#
except HA commands.
Eventually the health check times out because nothing is listening on 8080. I raised it to 15 minutes but it still doesn't come ready in time. It is running on a node with minimal other responsibilities (just the associated ser2sock.)
To rule out as much as possible, I've moved the radio from the original node to a new node and replaced the hub+cable with a normal extension cable. Nothing changed.
(Somewhat unrelated, but the zwave stack is still functioning on the original node and with the original USB hub. On the hub the radios are about 4 inches apart from each other, and the whole thing is 2 feet from the nearest hardware.)
Just to rule out the obvious : can you make sure that the frontend is activated in your configuration ? :)
frontend: true
The frontend extension is firing a log message at startup when binding the port. Or maybe your log message is too short and it is starting after, as it is one of the last extension to be loaded.
Advanced form, but yes:
frontend:
port: 8080
host: 0.0.0.0
url: https://zigbee.XXXXXX
The only change was the version and the restarts. (I tried 1.37.1 today but it still hung. After about an hour I reverted to 1.36.1, but it hasn't recovered yet.) Along the way it has survived restarts without a version change, so I wonder if there is something getting stuck on the radio side in the 1.37 update.
There is plenty of available cpu and ram, and according to prometheus it is not using much of either (under 10% of a core and 500M, no limits/requests.) The nodes have ipv6 disabled, but if that matters I would expect 0.0.0.0
to resolve it.
It acts normally, in that if I kick it from MQTT it reconnects, and it is bridging zigbee and MQTT. It just never opens the http listener.
Edit re logs: it logs the bind message if it recovers, but not when it is hung.
Dumb question : are you using docker or a container ? Did you try by commenting the host
property ?
0.0.0.0
traditionally means 'every available IPV4 address'. I removed the host line, but so far (8 minutes) it is still just hanging. These are info, sorry. I will restart with debug for the next run.
Nothing is logged after that last message. If I watch MQTT, it is publishing availability updates but incoming commands aren't being processed. (So I was wrong earlier, it is not behaving normally on the mqtt side either.)
Using '/data' as data directory
Zigbee2MQTT:info 2024-05-15 11:40:16: Logging to console and directory: '/data/log/2024-05-15.11-40-15' filename: log.txt
Zigbee2MQTT:info 2024-05-15 11:40:16: Starting Zigbee2MQTT version 1.36.1 (commit #ffc2ff1)
Zigbee2MQTT:info 2024-05-15 11:40:16: Starting zigbee-herdsman (0.40.3)
Zigbee2MQTT:info 2024-05-15 11:40:18: zigbee-herdsman started (resumed)
Zigbee2MQTT:info 2024-05-15 11:40:18: Coordinator firmware version: '{"meta":{"maintrel":1,"majorrel":2,"minorrel":7,"product":1,"revision":20230507,"transportrev":2},"type":"zStack3x0"}'
Zigbee2MQTT:info 2024-05-15 11:40:18: Currently 38 devices are joined:
...
Zigbee2MQTT:info 2024-05-15 11:40:18: Zigbee: disabling joining new devices.
Zigbee2MQTT:info 2024-05-15 11:40:19: Connecting to MQTT server at mqtts://mqtt.xxxxxxxxx
Zigbee2MQTT:info 2024-05-15 11:40:19: Connected to MQTT server
Zigbee2MQTT:info 2024-05-15 11:40:19: MQTT publish: topic 'zigbee2mqtt/bridge/state', payload '{"state":"online"}'
Zigbee2MQTT:error 2024-05-15 11:40:19: MQTT error: write EPIPE
Zigbee2MQTT:info 2024-05-15 11:40:20: Connected to MQTT server
Zigbee2MQTT:info 2024-05-15 11:40:20: MQTT publish: topic 'zigbee2mqtt/bridge/state', payload '{"state":"online"}'
0.0.0.0
traditionally means 'every available IPV4 address'. I removed the host line, but so far (8 minutes) it is still just hanging. These are info, sorry. I will restart with debug for the next run.
Nothing is logged after that last message. If I watch MQTT, it is publishing availability updates but incoming commands aren't being processed. (So I was wrong earlier, it is not behaving normally on the mqtt side either.)
Using '/data' as data directory
Zigbee2MQTT:info 2024-05-15 11:40:16: Logging to console and directory: '/data/log/2024-05-15.11-40-15' filename: log.txt
Zigbee2MQTT:info 2024-05-15 11:40:16: Starting Zigbee2MQTT version 1.36.1 (commit #ffc2ff1)
Zigbee2MQTT:info 2024-05-15 11:40:16: Starting zigbee-herdsman (0.40.3)
Zigbee2MQTT:info 2024-05-15 11:40:18: zigbee-herdsman started (resumed)
Zigbee2MQTT:info 2024-05-15 11:40:18: Coordinator firmware version: '{"meta":{"maintrel":1,"majorrel":2,"minorrel":7,"product":1,"revision":20230507,"transportrev":2},"type":"zStack3x0"}'
Zigbee2MQTT:info 2024-05-15 11:40:18: Currently 38 devices are joined:
...
Zigbee2MQTT:info 2024-05-15 11:40:18: Zigbee: disabling joining new devices.
Zigbee2MQTT:info 2024-05-15 11:40:19: Connecting to MQTT server at mqtts://mqtt.xxxxxxxxx
Zigbee2MQTT:info 2024-05-15 11:40:19: Connected to MQTT server
Zigbee2MQTT:info 2024-05-15 11:40:19: MQTT publish: topic 'zigbee2mqtt/bridge/state', payload '{"state":"online"}'
Zigbee2MQTT:error 2024-05-15 11:40:19: MQTT error: write EPIPE
Zigbee2MQTT:info 2024-05-15 11:40:20: Connected to MQTT server
Zigbee2MQTT:info 2024-05-15 11:40:20: MQTT publish: topic 'zigbee2mqtt/bridge/state', payload '{"state":"online"}'
Zigbee2MQTT:info 2024-05-15 11:49:50: MQTT publish: topic 'zigbee2mqtt/bridge/state', payload '{"state":"offline"}'
Zigbee2MQTT:info 2024-05-15 11:49:50: Disconnecting from MQTT server
(The offline is me restarting it for debug logging)
Debug includes it receiving zigbee messages, and those debug logs - but not the actual data - are sent to mqtt as well.
"message": "Received Zigbee message from 'Humidor', type 'attributeReport', cluster 'haElectricalMeasurement', data '{\"activePower\":59}' from endpoint 1 with groupID 0"
When sending commands from HA:
Zigbee2MQTT:debug 2024-05-15 12:03:54: Received MQTT message on 'zigbee2mqtt/Device/set' with data '{"state":"OFF"}'
but so far (12:07) it hasn't logged any followup. Just more incoming messages.
1.37.1 has not come online yet, running overnight with about an hour between restarts.
I'm downgrading to 1.36.1 to see if it recovers. (If it doesn't recover, I'll shut it all down and unplug it for an hour.)
This is still completely down for me. I have tried unplugging everything for a full 24 hours, and it still behaves exactly the same. Downgrading did not help.
What can I do to help diagnose this?