Reject Core Backup if sending backup start WebSocket message fails
Proposed change
If sending the WebSocket message to Home Assistant Core to inform it about the start of a backup fails, we should reject the backup process entirely. This prevents taking a backup without Home Assistant Core being aware of it.
As a side effect, this likely prevents situations where Core does not learn about the backup progress, since in those cases we also won't be able to send WebSocket messages about Job progress updates.
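The intended behavior can be sketched roughly as follows. This is a minimal illustration, not the actual Supervisor code: `BackupError`, `CoreWebSocketError`, `FakeCoreWebSocket`, `do_backup`, `try_backup` and the message payload are all hypothetical names chosen for the example.

```python
import asyncio


class BackupError(Exception):
    """Raised when a backup cannot be started (hypothetical name)."""


class CoreWebSocketError(Exception):
    """Raised when a message cannot be delivered to Core (hypothetical name)."""


class FakeCoreWebSocket:
    """Stand-in for the Core WebSocket client, for illustration only."""

    def __init__(self, reachable: bool) -> None:
        self.reachable = reachable
        self.sent: list[dict] = []

    async def send_message(self, message: dict) -> None:
        # Simulate a Core API that does not respond.
        if not self.reachable:
            raise CoreWebSocketError("Core API is not responding")
        self.sent.append(message)


async def do_backup(websocket: FakeCoreWebSocket) -> str:
    """Announce the backup to Core first; refuse to continue if that fails."""
    try:
        # Placeholder payload; the real message format is not shown here.
        await websocket.send_message({"type": "backup_start"})
    except CoreWebSocketError as err:
        # Reject the backup instead of continuing without Core's knowledge.
        raise BackupError("Cannot inform Core about backup start") from err

    # ... the actual backup work would happen here ...
    return "backup-created"


def try_backup(reachable: bool) -> str:
    """Run one backup attempt and report the outcome as a string."""
    try:
        return asyncio.run(do_backup(FakeCoreWebSocket(reachable)))
    except BackupError:
        return "backup-rejected"
```

With a reachable Core, `try_backup(True)` completes the backup; with an unreachable Core, `try_backup(False)` is rejected before any backup work starts.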
Type of change
- [ ] Dependency upgrade
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (which adds functionality to the supervisor)
- [ ] Breaking change (fix/feature causing existing functionality to break)
- [ ] Code quality improvements to existing code or addition of tests
Additional information
- This PR fixes or closes issue: fixes #
- This PR is related to issue:
- Link to documentation pull request:
- Link to cli pull request:
- Link to client library pull request:
Checklist
- [ ] The code change is tested and works locally.
- [ ] Local tests pass. Your PR cannot be merged unless tests pass
- [ ] There is no commented out code in this PR.
- [ ] I have followed the development checklist
- [ ] The code has been formatted using Ruff (`ruff format supervisor tests`)
- [ ] Tests have been added to verify that the new code works.
If API endpoints or add-on configuration are added/changed:
- [ ] Documentation added/updated for developers.home-assistant.io
- [ ] CLI updated (if necessary)
- [ ] Client library updated (if necessary)
In cases where the Core API does not respond (e.g. after an update or restart), we still allow a backup to be taken. It seems that in such cases we cannot inform Core about the backup starting via WebSocket either (the backup_start WebSocket message does not get delivered). As a consequence, Job updates do not get delivered either, and the new Backup integration relies on them (an example error report from such a system seems to be https://github.com/home-assistant/core/issues/143158). With this change, we reject a backup entirely if informing Core fails.
A downside is that on such systems we won't take a backup at all. From Sentry it seems there are quite a few systems where the Core API does not respond properly (HomeAssistantStartupTimeout etc.). So I am not sure whether a hanging backup is maybe the better outcome than a rejected one from a user's perspective :thinking:
Ideally we'd find the root cause of why Core API does not respond, so the WebSocket messages get delivered again.
/cc @emontnemery