Introduce a delay between update entity calls
Breaking change
Any skipped updates in the zwave_js update entities will be cleared when you first upgrade your HA instance to this version. This will be a one-time occurrence per update entity; the integration will persist the state going forward.
Proposed change
We discovered that in some cases the way we handled update entity updates caused floods of network traffic. Even though we used a semaphore to limit parallel requests, the next call started as soon as the previous one finished. For large networks, at startup and every 24 hours after, we would generate a lot of traffic, which ended up causing bit flips.
In this change, I used balloob's idea to introduce a 5 minute delay before releasing the lock (we now limit it to a single update at a time), which spaces out the network requests and the subsequently scheduled updates.
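As a rough illustration of the idea above (not the code in this PR), the sketch below holds a lock for a delay after each request so the next waiter cannot start right away. The names `poll_with_spacing`, `run_demo`, and `UPDATE_DELAY_SECONDS` are hypothetical, and the list append stands in for the real firmware update request.

```python
import asyncio

# Hypothetical constant mirroring the 5 minute spacing described above.
UPDATE_DELAY_SECONDS = 300.0


async def poll_with_spacing(lock, node_id, log, delay=UPDATE_DELAY_SECONDS):
    """Do one request while holding the lock, then keep holding it for `delay`."""
    async with lock:
        log.append(node_id)  # placeholder for the real update request
        await asyncio.sleep(delay)  # the lock is only released after the delay


async def run_demo(node_ids, delay):
    """Start all polls at once and report the order and total elapsed time."""
    lock = asyncio.Lock()
    log = []
    loop = asyncio.get_running_loop()
    start = loop.time()
    await asyncio.gather(
        *(poll_with_spacing(lock, node_id, log, delay) for node_id in node_ids)
    )
    return log, loop.time() - start
```

Calling `asyncio.run(run_demo([1, 2, 3], 0.01))` runs the three requests one after another, each separated by the delay, instead of back to back.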
I also fixed a bug where we weren't properly unsubscribing from the callback.
CC @AlCalzone @kpine
Type of change
- [ ] Dependency upgrade
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New integration (thank you!)
- [ ] New feature (which adds functionality to an existing integration)
- [ ] Deprecation (breaking change to happen in the future)
- [ ] Breaking change (fix/feature causing existing functionality to break)
- [ ] Code quality improvements to existing code or addition of tests
Additional information
- This PR fixes or closes issue: fixes #
- This PR is related to issue:
- Link to documentation pull request:
Checklist
- [ ] The code change is tested and works locally.
- [ ] Local tests pass. Your PR cannot be merged unless tests pass
- [ ] There is no commented out code in this PR.
- [ ] I have followed the development checklist
- [ ] I have followed the perfect PR recommendations
- [ ] The code has been formatted using Black (black --fast homeassistant tests)
- [ ] Tests have been added to verify that the new code works.
If user exposed functionality or configuration variables are added/changed:
- [ ] Documentation added/updated for www.home-assistant.io
If the code communicates with devices, web services, or third-party tools:
- [ ] The manifest file has all fields filled out correctly. Updated and included derived files by running: python3 -m script.hassfest.
- [ ] New or updated dependencies have been added to requirements_all.txt. Updated by running python3 -m script.gen_requirements_all.
- [ ] For the updated dependencies - a link to the changelog, or at minimum a diff between library versions is added to the PR description.
- [ ] Untested files have been added to .coveragerc.
To help with the load of incoming pull requests:
- [ ] I have reviewed two other open pull requests in this repository.
Hey there @home-assistant/z-wave, mind taking a look at this pull request as it has been labeled with an integration (zwave_js) you are listed as a code owner for? Thanks!
Code owner commands
Code owners of zwave_js can trigger bot actions by commenting:
- @home-assistant close: Closes the pull request.
- @home-assistant rename Awesome new title: Renames the pull request.
- @home-assistant reopen: Reopens the pull request.
- @home-assistant unassign zwave_js: Removes the current integration label and assignees on the pull request; add the integration domain after the command.
Please take a look at the requested changes, and use the Ready for review button when you are done, thanks :+1:
We should test this or write a test for it. Not sure how to do the latter though as we don't use the Home Assistant event helpers for the update delay.
Do we want to switch to using the event helpers? I thought about it; it adds a little complexity, but it would make writing a test easier, which I think is preferred.
It would be good to write a test, yes. With the event helper approach, will you calculate the schedule time depending on the number of nodes and schedule all updates in one go?
No, I was planning to use the call later helper as an exact replacement for asyncio.sleep. The complexity I was referring to was just having another callback to manage, unsub, etc. The code is easier to read in its current form, but for this little added complexity I can add a test that verifies only one update happens within the first 5 minutes after start.
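To make the "another callback to manage, unsub" point concrete, here is a minimal sketch using plain asyncio's `loop.call_later` as a stand-in for the Home Assistant helper. The wrapper `call_later` and `run_demo` are hypothetical names for illustration only.

```python
import asyncio


def call_later(loop, delay, action):
    """Schedule `action` after `delay` seconds and return an unsubscribe function."""
    handle = loop.call_later(delay, action)
    return handle.cancel


async def run_demo():
    loop = asyncio.get_running_loop()
    fired = []
    # This timer is left alone, so its callback runs.
    call_later(loop, 0.01, lambda: fired.append("kept"))
    # This timer is unsubscribed before it fires, so its callback never runs.
    unsub = call_later(loop, 0.01, lambda: fired.append("cancelled"))
    unsub()
    await asyncio.sleep(0.05)
    return fired
```

Unlike a bare `asyncio.sleep`, the scheduled callback leaves something to keep track of: the returned unsubscribe function has to be stored and called when the entity is removed.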
Ok. I don't understand how the lock will work with that approach, but I'll take a look when you push.
OK, so I switched to the helper. Two problems:
- I can't figure out how to cover this in tests. I tested it on my instance and it successfully staggered the firmware updates 5 minutes at a time.
- Because most of the updates are waiting to acquire the lock, the task never gets canceled. Not sure how to address this, but it would have been a problem with either approach.
2023-03-17 22:41:48.116 WARNING (MainThread) [homeassistant.core] Task <Task pending name='Task-1762' coro=<ZWaveNodeFirmwareUpdate._async_update() running at ./homeassistant/components/zwave_js/update.py:174> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[set.remove()]> was still running after stage 2 shutdown; Integrations should cancel non-critical tasks when receiving the stop event to prevent delaying shutdown
Maybe I need to create the HassJob myself for the update and then add a listener for a stop event to cancel the job?
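One way to picture the cancel-on-stop idea, sketched with plain asyncio rather than HassJob or the real stop event: keep a reference to every pending task and cancel them all when shutdown begins. `UpdatePoller`, `async_stop`, and `run_demo` are hypothetical names, not Home Assistant APIs.

```python
import asyncio


class UpdatePoller:
    """Hypothetical helper that tracks pending update tasks."""

    def __init__(self):
        self._lock = asyncio.Lock()
        self._tasks = set()

    def schedule(self, node_id, delay):
        """Start a poll task and remember it so it can be cancelled later."""
        task = asyncio.create_task(self._poll(node_id, delay))
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return task

    async def _poll(self, node_id, delay):
        # Most tasks spend their time here, waiting to acquire the lock.
        async with self._lock:
            await asyncio.sleep(delay)

    async def async_stop(self):
        """Cancel everything still pending, e.g. from a stop-event listener."""
        tasks = list(self._tasks)
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
        return len(tasks)


async def run_demo():
    poller = UpdatePoller()
    tasks = [poller.schedule(node_id, 60) for node_id in range(3)]
    await asyncio.sleep(0)  # let the tasks start and queue up on the lock
    cancelled_count = await poller.async_stop()
    return cancelled_count, all(task.cancelled() for task in tasks)
```

With this shape, tasks blocked on the lock are cancelled along with the one currently sleeping, so nothing is left running at shutdown.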
OK, so I think my new solution avoids the task problem and removes the need for a lock entirely. Basically, for every entity we add, we increment a counter which we use to determine the initial delay. Because we can't guarantee that hass is running during the first run, we just push the run to 24 hours later so that we preserve the 5 minute delays.
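The counter approach described above could look roughly like the sketch below. It is an assumption-laden illustration, not the merged code: `DelayAllocator`, `first_poll_delay`, `UPDATE_DELAY_INTERVAL`, and `POLL_INTERVAL` are hypothetical names, and deferring by a full poll interval when hass is not running is one reading of the comment.

```python
from datetime import timedelta

# Hypothetical constants taken from the numbers mentioned in the discussion.
UPDATE_DELAY_INTERVAL = timedelta(minutes=5)
POLL_INTERVAL = timedelta(days=1)


class DelayAllocator:
    """Hand out increasing initial delays, one slot per added entity."""

    def __init__(self, interval=UPDATE_DELAY_INTERVAL):
        self._interval = interval
        self._count = 0

    def next_delay(self):
        """Return the delay for the next entity and advance the counter."""
        delay = self._interval * self._count
        self._count += 1
        return delay


def first_poll_delay(stagger, hass_running, poll_interval=POLL_INTERVAL):
    """Keep the stagger, but defer by one poll interval if hass is not running."""
    return stagger if hass_running else stagger + poll_interval
```

Because every entity gets its own offset up front, no lock is needed and there are no tasks parked waiting on one.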
It's not a breaking change anymore, right?
nope, fixing that
Maybe also update the PR description for the latest iteration of the approach.