core
core copied to clipboard
Intermittent issue when activating Doorbird relay
The problem
Hello,
I face an intermittent issue when trying to press one of my DoorBird relay using the DoorBird integration.
The following scenario works most of the time:
alias: Gate Button
description: ""
trigger:
- platform: device
domain: mqtt
device_id: b3bf4b3170cb8d4f347a203bd1d76430
type: action
subtype: single
condition:
- condition: time
after: "07:00:00"
before: "20:00:00"
action:
- device_id: 95eb710efa81754c8cf60b2442fe9e57
domain: button
entity_id: cf84f0e45e42e298b6b195e31bbfebb5
type: press
mode: single
However, when chaining call (pressing the push button with like a minute or less), nothing happen on the DoorBird and I get weird Expecting value: line 1 column 1 (char 0) error in the scenario traces.
Here is one that run:
And here one triggered ~8 minutes after and failed:
Here are the relevant logs extracted from home assistant container: traceback.log
I'm suspecting a not handled bad api response returned by the DoorBird device (quota? permission token expiration? trigger safety delay?). Any idea?
What version of Home Assistant Core has the issue?
core-2024.6.1
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant Container
Integration causing the issue
DoorBird
Link to integration documentation on our website
https://www.home-assistant.io/integrations/doorbird/
Diagnostics information
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
Something is definitely happening in the doorbirdpy library, but not enough informations in the log to conclude (response http code, body, etc):
File "/usr/src/homeassistant/homeassistant/components/doorbird/button.py", line 81, in press
self.entity_description.press_action(self._door_station.device, self._relay)
File "/usr/src/homeassistant/homeassistant/components/doorbird/button.py", line 30, in <lambda>
press_action=lambda device, relay: device.energize_relay(relay),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/doorbirdpy/__init__.py", line 78, in energize_relay
data = self._get_json(
^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/doorbirdpy/__init__.py", line 321, in _get_json
return response.json()
^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/requests/models.py", line 978, in json
raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
I'll try to edit this file in situ and reproduce the problem to log more details when the error is raised.
Additional information
No response
Hey there @oblogic7, @bdraco, @flacjacket, mind taking a look at this issue as it has been labeled with an integration (doorbird) you are listed as a code owner for? Thanks!
Code owner commands
Code owners of doorbird can trigger bot actions by commenting:
@home-assistant closeCloses the issue.@home-assistant rename Awesome new titleRenames the issue.@home-assistant reopenReopen the issue.@home-assistant unassign doorbirdRemoves the current integration label and assignees on the issue, add the integration domain after the command.@home-assistant add-label needs-more-informationAdd a label (needs-more-information, problem in dependency, problem in custom component) to the issue.@home-assistant remove-label needs-more-informationRemove a label (needs-more-information, problem in dependency, problem in custom component) on the issue.
(message by CodeOwnersMention)
doorbird documentation doorbird source (message by IssueLinks)
I'm almost certain the device is returning a "successful" HTTP status code because doorbirdby calls raise_for_status() (which would throw an error if it was a 4xx or 5xx status code) before it attempts to parse a JSON response, which in this case is initiated in energize_relay().
"Expected value at char 0" means the response was completely empty, which is strange. This weekend I'll try to get some time to set my D101 up with a script that calls that endpoint over and over and logs each response to see if it's a bug in the DoorBird firmware.
Update: I ran a script that requested /bha-api/open-door.cgi?r=1 and logged the result every 5 minutes between 8/17 and 8/25. In that time, it logged 2,516 requests, 2 of which had an empty response body instead of the expected JSON.
The script used cURL directly instead of doorbirdpy, so this points to a bug in the DoorBird firmware. I'm not sure how to handle this going forward, since not knowing whether the relay was energized successfully or not is a problem that should be logged in Home Assistant.
Doorbird support has been pretty good about escalating bugs. I'd try contacting them next. Let them know you are a developer working on the Home Assistant integration.
Update: DoorBird support suggested that the device may be auto-rebooting, so it is not online when these requests are sent. I don't believe this is the problem because I'd be seeing a cURL error and not a 200 response, but I have restarted my test with more logging per their request, so we'll see how that goes.
After more extensive testing (4,181 test requests over 30 days), the device failed to respond to three of them. Two of those failures were "no route to host" and one was "connection refused".
===> Tue Sep 24 08:10:01 UTC 2024
curl: (7) Failed to connect to 192.168.2.95 port 80: No route to host
===> Sat Sep 28 08:30:01 UTC 2024
curl: (7) Failed to connect to 192.168.2.95 port 80: No route to host
===> Sun Sep 29 08:20:01 UTC 2024
curl: (7) Failed to connect to 192.168.2.95 port 80: Connection refused
I feel like that should be raising an exception long before it gets to parsing the response as JSON, but it's possible there is a different issue with your device or network that didn't show up in my testing.
I do think it's interesting that all of the failures were around the same time (~1am local time), so I asked DoorBird support if the device reboots itself on a schedule, though this seems unlikely to be the cause of the original problem given its (in)frequency.
If the failures are transient we could wrap the api calls with https://github.com/jd/tenacity
Tiny update from DoorBird support:
I wanted to update you on the findings from our development team regarding the device restarts you reported. They have confirmed that these are routine daily software restarts and nothing critical.
Here are the restart logs from the device, which match the times you provided:
Restarting at Tue Sep 24 01:09:36 PDT 2024 (Mode: normal) Restarting at Sat Sep 28 01:29:28 PDT 2024 (Mode: normal) Restarting at Sun Sep 29 01:19:13 PDT 2024 (Mode: normal)Please note that the time difference of +7 hours is due to different time zones.
Seems like tenacity with a backoff would be the way to go
@bdraco I've pushed a new version with @retry annotations to the doorbirdpy dev branch. If it looks good to you, I'll go ahead and release it to PyPI: https://gitlab.com/klikini/doorbirdpy/-/commit/73e8ea277077dccd97f090fde5a85c7b622b5f1b
Looks good. Will test shortly.
I put it on my production and didn't discovery any unexpected behavior. I think its one of those things only time will tell so seems good to release and we can bump it in HA before the next beta so we have a week to revert if it causes any unwanted side effects.
I pushed it to master and the build failed, so I'll look into it after work today or tomorrow.
Update: fixed the test, 3.0.6 is now in PyPI.
@Klikini I opened https://github.com/home-assistant/core/pull/129114 but in the process of testing I realized we shouldn't retry on info since it will block startup for the retry period if the device is offline
@Klikini gentle nudge in case you missed my comment above. Beta is Wed so it would be nice to get one more bump and get this sorted, but also no rush.
@bdraco I removed the retry annotation from info() and pushed 3.0.7 to PyPI.
Thanks!