node-zwave-js
node-zwave-js copied to clipboard
Aeotec Z-Stick 700 with FW 7.17.2.0 - "Duplicate command" errors after backup restoration
Is your problem within Home Assistant (Core or Z-Wave JS Integration)?
NO, my problem is NOT within Home Assistant or the ZWave JS integration
Is your problem within ZWaveJS2MQTT?
NO, my problem is NOT within ZWaveJS2MQTT
Checklist
-
[X] I have checked the troubleshooting section and my problem is not described there.
-
[X] I have read the changelog and my problem was not mentioned there.
Describe the bug
Yesterday, after updating the stick to FW 7.17.2.0, I restored a current NVM backup from my 500 series stick. The driver started up fine and the nodes seem to work OK. Unlike previous attempts, there were no dead nodes and no automatic re-interviews.
However, when checking the log files this morning, I noticed that it does contain a lot of "Duplicate command" errors, and not (just) from the Sensative Strips Guard 700. Could this be a side effect of a "Health Check" attempt I made (under the erroneous assumption that certain issues had been fixed since the last time I tried)? I did restart the driver afterwards though, to stop the network flooding.
Device information
Manufacturer: Aeotec Model name: Z-Stick 700 Node ID in your network: 1
How are you using node-zwave-js
?
- [ ]
zwavejs2mqtt
Docker image (latest) - [ ]
zwavejs2mqtt
Docker image (dev) - [ ]
zwavejs2mqtt
Docker manually built (please specify branches) - [ ]
ioBroker.zwave2
adapter (please specify version) - [ ]
HomeAssistant zwave_js
integration (please specify version) - [ ]
pkg
- [ ]
node-red-contrib-zwave-js
(please specify version, double click node to find out) - [ ] Manually built from GitHub (please specify branch)
- [X] Other (please describe)
Which branches or versions?
version:
node-zwave-js
branch: 8.11.6
zwavejs2mqtt
branch: 6.5.2
zwavejs2mqtt community addon
: 0.36.0
Did you change anything?
yes (please describe)
If yes, what did you change?
Switched back to Z-Stick 700
Did this work before?
Yes (please describe)
If yes, where did it work?
With Z-Stick 5gen+
Attach Driver Logfile
👋 Hey @mundschenk-at!
It looks like you attached a logfile, but its filename doesn't look like it a driver log that came from Z-Wave JS. Please make sure you upload the correct one.
It looks like you attached a logfile, but its filename doesn't look like it a driver log.
It is.
For me, the logs look like the nodes don't get the expected acknowledgement from the controller and retry sending. Also there are some instances where the controller can't reach a node while the same node is sending reports to the controller, which would support my suspicion further.
I don't see any mention of you doing a network heal. Even if the backup restores the known routes, the physical differences between sticks may result in incorrect routes being used for attempting communication with the nodes.
I had not done a network heal when that log file was created, no. I've since started one, takes some time for the battery powered nodes. I'll upload tomorrow's logfile which should be mostly after the healing (only the smoke detector is not done yet, not sure when it's scheduled for its daily wakeup).
Addendum: According to the network graph, range still appears to be much worse with 700 series (though it's been advertised as having an improved range). I think I mentioned this previously in another ticket, I've a Strips Guard 700 on a window in the same room as the controller, maybe 5 m distance that takes 2 hops with the 700 series and only 1 hop with the 5gen+. This might explain the issues before the network heal (but not why range is so abysmal with the newer stick).
Here's today's log, @AlCalzone. The network heal finished at 01:00 with the last remaining node (45 - the others were healed before the log turnover), but unfortunately to me it looks very similar (lots of invalid payloads/duplicate commands): zwavejslog_20220313.zip
Not sure if we went through this already - did you follow the instructions for controller placement mentioned here? https://zwave-js.github.io/node-zwave-js/#/troubleshooting/connectivity-issues?id=general-troubleshooting
We didn't, but I know about it and the stick's attached to a 1,5 m extension cord (above a wooden door frame, so no metal surface nearby either).
Ok. What bugs me a bit are the low RSSI values (-88...-100 dBm), which are probably not much above the background noise, and they gradually get worse. Maybe this is less of a controller placement issue than a mesh strength issue.
You can see that at least node 18 has a stronger connection than most others:
2022-03-13T09:11:21.169Z DRIVER « [REQ] [SendDataBridge]
callback id: 228
transmit status: OK, took 100 ms
repeater node IDs: 18, 10
routing attempts: 1
protocol & route speed: Z-Wave, 100 kbit/s
ACK RSSI: -89 dBm
^^ measured by the controller
ACK RSSI on repeaters: -63 dBm, N/A
^^ ^^^ measured by node 10
^^ measured by node 18
ACK channel no.: 0
TX channel no.: 0
In zwavejs2mqtt, on the network map, you have the ability to test the connection quality to individual nodes. I think this is your best bet now - identify nodes with a bad connectivity (might be the repeaters, not necessarily the "problematic" nodes) and try to optimize their connection.
You may also try to increase the TX power of the controller. At least partially your issue is caused by the nodes not being able to "hear" the controller. Note that this doesn't help if the controller can't "hear" some nodes though.
How would increase the controller's TX power? The health check on the network map is not working for me (it rapidly swamps the network, similar to https://github.com/zwave-js/node-zwave-js/issues/4130).
This is a relatively small appartment and the non-battery nodes are fairely evenly spaced. Node 18 is in the same direction as Node 10, but the latter is nearer to controller (basically they are located on opposite sites of the same wall: 1 -- wall -- "empty" room -- 10 -- wall -- 18 -- other room).
For comparison, this is the log from a few days ago using the Gen5+ controller (also Aeotec). Everything is the same except the controller (i.e. same nodes and placement): zwavejslog_20220309.zip
How would increase the controller's TX power?
This is a bit tricky, since no UI exposes it AFAIK. You can do it via code though:
I suggest the following:
- Go to the log tab, open log in a new window
- Go to the driver function dialog, execute this:
await driver.controller.getPowerlevel()
- Look at the log and remember the two numbers. Should be something like -3.3 and 0 I think - mine are messed up (outside the -12.8...+12.7 dBm range).
- execute this:
await driver.controller.setPowerlevel( 3, // the normal powerlevel plus 3 (or so, need to test) 0 // the output power at 0 dBm, unchanged! )
I would start with a slight adjustment of the normal powerlevel, e.g. +3 dBm. Note that increasing the powerlevel of a radio device may be against the law, so don't go overboard.
The health check on the network map is not working for me (it rapidly swamps the network
Any chance I could see a log of that?
I would start with a slight adjustment of the normal powerlevel, e.g. +3 dBm. Note that increasing the powerlevel of a radio device may be against the law, so don't go overboard.
What's normal here? GetPowerlevel
returned these values:
command: GetPowerlevel
normal powerlevel: 3.2 dBm
output power at 0 dBm: 14.8 dBm
The health check on the network map is not working for me (it rapidly swamps the network
Any chance I could see a log of that?
I've triggered the health check a few minutes ago: zwavejslog_20220314.zip
I've triggered the health check a few minutes ago
Ok that's https://github.com/zwave-js/node-zwave-js/issues/4075
What's normal here? GetPowerlevel returned these values
I'm trying to find that out. Unfortunately my test stick has very strange values there. Will let you know once I've found the defaults
What's normal here? GetPowerlevel returned these values
I'm trying to find that out. Unfortunately my test stick has very strange values there. Will let you know once I've found the defaults
Is the power level of a Gen5 stick comparable (or interesting)? Then I could switch back to my other stick.
Also, regarding network strength, do you see similar issues in the Gen5 log or is it the 700 series stick? (To me, things seem better with the Gen5 stick, but I'm not that well-versed in reading the logs.)
From what I've seen in the NVM data, the settings for 500 and 700 series sticks aren't comparable.
In your Gen5 log, I don't see RSSI for incoming messages, but the ACKs have RSSI values in the 50s/60s, roughly 30-40 dBm higher than on the 700 series. This could be caused by incorrect settings on the 700 series stick.
Start by setting the TX power to:
normal powerlevel: 0
measured: +3.3
which are the default settings. For good measure also set the RF region to EU, you never know...
await driver.controller.setRFRegion(0)
@mundschenk-at It might be that you're also affected by this bug: https://github.com/zwave-js/node-zwave-js/issues/4367 In that case you won't be able to change the settings on your own without editing the NVM backup first. I can do that for you if you send me the NVM backup via e-mail.
@mundschenk-at It might be that you're also affected by this bug: #4367 In that case you won't be able to change the settings on your own without editing the NVM backup first. I can do that for you if you send me the NVM backup via e-mail.
Yes it definitely looks that way:
2022-03-15T15:12:04.787Z DRIVER « [RES] [SerialAPISetup]
command: GetRFRegion
region: Unknown
I'll edit the backup myself, should not be a problem.
Sure, see here for details: https://github.com/zwave-js/node-zwave-js/issues/3906#issuecomment-1068031298
I tried this: Made a new 700 series backup, converted to JSON and replaced "rfConfig": null
with the object from https://github.com/zwave-js/node-zwave-js/issues/3906#issuecomment-1068031298. It worked, in that GetRFRegion
returns Europe
and the power levels are now 0.0 dBm
and 3.3 dBm
.
However, the whole network started to be reinterviewed immediately. As a result, several nodes had incomplete information and/or security classes. I manually re-interviewed them (using the "reset security classes" option) and was able to fix most of them (two are missing, one is a smoke detector that I would need a ladder to wake up, the other an Aeotec Water Sensor 6, which is quite buggy when using S0
security - I may have to reinclude that, maybe dropping security this time). Unfortunately, even after this, things are not really good.
I've tried starting a network heal, but that has stalled after just 4 nodes (i.e. not even all mains powered nodes have been healed).
Unfortunately, even after this, things are not really good. I've tried starting a network heal, but that has stalled after just 4 nodes .
Can you post another log so I can see for myself? You don't have to encrypt them, unless you re-include a secure device and/or change user codes while you're logging.
Unfortunately, even after this, things are not really good. I've tried starting a network heal, but that has stalled after just 4 nodes .
Can you post another log so I can see for myself? You don't have to encrypt them, unless you re-include a secure device and/or change user codes while you're logging.
Driver restarts as well (home ID), no? Since these are pretty long logs (whole day), I feel more comfortable with encrypted uploads.
The home ID is just a number that's transmitted in plaintext via radio anyways. If someone is standing near your network with a Zniffer, they will see it. Encrypted is fine for me though.
Here it is: zwavejslog_20220315.zip
(Sorry, a bit messy due to repeated re-interview/heal attempts.)
From what I'm seeing in the logs, the heal is slowly progressing. You had a full interview of node 99 going on in parallel, which delays the process quite a bit.
The RSSI values didn't change, but that's expected, since you didn't change anything physical. I guess now you can try this and see if the nodes can "hear" the controller better then.
The RSSI values didn't change, but that's expected, since you didn't change anything physical. I guess now you can try this and see if the nodes can "hear" the controller better then.
Just to be clear, do you mean I should initially try 3, 0
as the parameters (as in the example), or rather 3, 3.3
(3
added to the previous "normal powerlevel" of 0.0
, 0
added to the previous "output power at 0 dBm`)?
I meant 3, 3.3
.
Here's the new log. I think it is somewhat better, but node 10 still appears to be problematic (though I don't know why, it's a FGWP102, though you might consider it a legacy device - I didn't know how to read the specifications properly when ordering my first nodes).
Also, when doing a health check for node 11, node 80 started freaking out completely. This one is a bit remote (at the bottom of a cupboard at the other end of the flat), but there are plenty of repeaters in between if necessary.