Issue with Lifx integration
The problem
Since the latest integration update I have seen many occurrences of LIFX lights becoming "not available". This happens on most (but not all) of them (I have 20+ lights).
The behavior is not consistent during the day, which makes me suspect some relation to the Wi-Fi environment (I have 3 APs broadcasting the same SSID on channels 1, 6, and 11), but according to the AP logs it doesn't seem to be related to LIFX bulbs disconnecting from one AP and reconnecting to another.
Also, I see they usually become unavailable for 10 s and then come back online. I wonder whether this has something to do with the polling cycle of the integration, since the integration's discovery interval is 10 (seconds?):
"""Const for LIFX."""
import logging
DOMAIN = "lifx"
TARGET_ANY = "00:00:00:00:00:00"
DISCOVERY_INTERVAL = 10
MESSAGE_TIMEOUT = 1.65
MESSAGE_RETRIES = 5
OVERALL_TIMEOUT = 9
UNAVAILABLE_GRACE = 90
So could it be that discovery, in my environment, simply can't keep pace and drops connections?
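As a rough illustration of the timing question above, here is a minimal Python sketch of the arithmetic behind those constants. The values are copied from the excerpt; how the integration actually combines them is not shown in this thread, so the assumption that each attempt waits up to MESSAGE_TIMEOUT seconds is only a guess.

```python
# Values copied from the const.py excerpt quoted above.
DISCOVERY_INTERVAL = 10
MESSAGE_TIMEOUT = 1.65
MESSAGE_RETRIES = 5
OVERALL_TIMEOUT = 9
UNAVAILABLE_GRACE = 90

# Assumption (not confirmed by the thread): every retry waits up to
# MESSAGE_TIMEOUT seconds, so one fully failing message costs this long.
worst_case_single_message = MESSAGE_TIMEOUT * MESSAGE_RETRIES

# Headroom between a fully failing message and the overall timeout,
# and between the overall timeout and the next 10-second cycle.
slack_before_overall_timeout = OVERALL_TIMEOUT - worst_case_single_message
slack_before_next_cycle = DISCOVERY_INTERVAL - OVERALL_TIMEOUT

print(f"worst case for one message: {worst_case_single_message:.2f} s")
print(f"slack before overall timeout: {slack_before_overall_timeout:.2f} s")
print(f"slack before next cycle: {slack_before_next_cycle} s")
```

Under that assumption, a single unanswered message uses most of the 9-second budget, which leaves little margin on a congested or weak Wi-Fi link.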
What version of Home Assistant Core has the issue?
2022.9.5
What was the last working version of Home Assistant Core?
the one before LIFX integration update
What type of installation are you running?
Home Assistant OS
Integration causing the issue
LIFX
Link to integration documentation on our website
https://www.home-assistant.io/integrations/lifx/
Diagnostics information
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
No response
Additional information
No response
lifx documentation lifx source (message by IssueLinks)
Hey there @bdraco, @djelibeybi, mind taking a look at this issue as it has been labeled with an integration (lifx) you are listed as a code owner for? Thanks!
(message by CodeOwnersMention)
It's more likely that your bulbs have always been doing this, we're just better at reporting it now than before. Does Home Assistant consistently re-establish connectivity to each bulb? Are they responsive to automation and manual control?
So, in general, they do re-establish their connection to HA.
In the past I never had automation or responsiveness issues, whereas now they sometimes occur when the state is unavailable and an action fires.
I'm not sure whether something changed on the HA side (e.g. an increased amount of broadcast traffic) that made the situation worse recently. I have quite a lot of Wi-Fi devices in a single LAN segment, which could be the issue (I need to segregate them into VLANs at some stage, but I can't find the time). Airtime shouldn't be an issue, as the devices are split across 3 APs (20-25 each).
There shouldn't have been any significant increase in the amount of traffic, but we are interacting with the bulbs more than before. If you don't use HomeKit, it may be worth integrating your bulbs using Home Assistant's HomeKit Controller integration instead, as that uses local push, instead of polling the bulbs every 10 seconds.
If you do use HomeKit, you still can by connecting them to Home Assistant first, then exporting them to HomeKit from HASS.
Yes, I read that, and it is also on my to-do list: it should be a much better way of controlling the bulbs. Unfortunately I have some Z strips which are not HomeKit compliant, so for those I believe I will have to stick with the LIFX integration.
When you say "we are interacting more with the bulbs", what are you referring to in detail?
I also see a LOT more of this kind of error message since I started using the new integration. The LED strip becomes unavailable frequently and I have this in the logs:
2022-10-04 16:40:08.771 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data
2022-10-04 16:57:58.259 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data
2022-10-04 17:30:03.269 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data
2022-10-04 17:43:10.296 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data
2022-10-04 17:48:25.046 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data
2022-10-04 17:50:58.265 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data
2022-10-04 18:06:03.258 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data
2022-10-04 18:11:40.260 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data
2022-10-04 18:14:02.258 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data
2022-10-04 18:26:12.260 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data
2022-10-04 18:52:03.934 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data
2022-10-04 18:54:15.258 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data
2022-10-04 18:57:23.262 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data
2022-10-04 19:02:22.266 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data
2022-10-04 19:14:55.286 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data
2022-10-04 19:19:45.260 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data
In my case, the Wi-Fi coverage is weak in this room, but it was never an issue when using the LED strip. Before muting the integration in the logs, I am going to try the HomeKit integration. I just need to improve my Bluetooth coverage first. Thanks for the suggestion.
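To check whether timeouts like the ones quoted above follow a regular pattern, the gaps between them can be measured with a short script. This is only a sketch: it assumes every line starts with a 23-character timestamp in the format shown in the log excerpt, and the helper names are made up for illustration.

```python
from datetime import datetime
from typing import List

# First three lines of the log excerpt quoted above.
LOG_LINES = [
    "2022-10-04 16:40:08.771 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data",
    "2022-10-04 16:57:58.259 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data",
    "2022-10-04 17:30:03.269 ERROR (MainThread) [homeassistant.components.lifx] Timeout fetching Tête de lit (10.0.0.3) data",
]


def parse_timestamp(line: str) -> datetime:
    """Parse the leading 'YYYY-MM-DD HH:MM:SS.mmm' timestamp of a log line."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f")


def gaps_in_seconds(lines: List[str]) -> List[float]:
    """Return the time in seconds between each pair of consecutive lines."""
    stamps = [parse_timestamp(line) for line in lines]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


for gap in gaps_in_seconds(LOG_LINES):
    print(f"{gap / 60:.1f} minutes since previous timeout")
```

For these three lines the gaps are roughly 18 and 32 minutes, so the timeouts look sporadic rather than tied to a fixed interval.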
@bdraco I wonder if we shouldn't make the timeout a little less noisy? Perhaps only report if the device hasn't recovered after some amount of time? Most of my timeouts recover within the next 10 second window, for example. Usually much quicker.
Does increasing the timeout allow it to go through? If we suppress the log message, the device will still be marked unavailable and lead to questions about why.
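The idea of reporting a timeout only when the device has not recovered after some time could look roughly like the sketch below. `QuietTimeoutTracker` and its methods are hypothetical names, not part of the LIFX integration or Home Assistant, and the sketch only changes the log level: it does nothing about the entity being marked unavailable, which is the concern raised in the reply above.

```python
import logging
from typing import Optional

_LOGGER = logging.getLogger(__name__)


class QuietTimeoutTracker:
    """Hypothetical helper: log timeouts at ERROR only once they persist."""

    def __init__(self, grace_seconds: float) -> None:
        self.grace_seconds = grace_seconds
        self._first_failure: Optional[float] = None

    def record_failure(self, now: float) -> bool:
        """Record a timeout at time `now` (seconds).

        Returns True once the failure streak has lasted at least
        `grace_seconds`; shorter streaks are logged at DEBUG only.
        """
        if self._first_failure is None:
            self._first_failure = now
        failing_for = now - self._first_failure
        persistent = failing_for >= self.grace_seconds
        level = logging.ERROR if persistent else logging.DEBUG
        _LOGGER.log(level, "Timeout fetching data (failing for %.0f s)", failing_for)
        return persistent

    def record_success(self) -> None:
        """Reset the failure streak after a successful update."""
        self._first_failure = None


tracker = QuietTimeoutTracker(grace_seconds=30)
first = tracker.record_failure(now=0)    # a single blip stays at DEBUG
second = tracker.record_failure(now=10)  # still inside the grace period
third = tracker.record_failure(now=30)   # persisted long enough for ERROR
tracker.record_success()
after_recovery = tracker.record_failure(now=40)  # the streak starts over
```

A timeout that recovers within the next 10-second window would never reach ERROR with a 30-second grace period like the one assumed here.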
Let me test and get back to you on that.
I'm not getting any timeouts when running the latest dev that uses extended multizone messages and considering the issue is with a strip, I'd like to see if this is still an issue once that code is in a stable release.
TL;DR: this may already be fixed in dev via https://github.com/home-assistant/core/pull/79444
I'm having a similar issue... I only have Z / Z2 strips...

They only came back online after an HA hardware reboot; as a novice end user, I had no way to force some sort of manual check.
Is there a way to update temporarily to the dev build, or is it maybe not worth the effort? Is there an approximate ETA for when the potential fix will hit stable (are we talking weeks or months)?
One LIFX Z failed to come back in HA (it still worked without issues via Alexa and the LIFX app). I removed the device from the HA list, then tried looking for it again via discovery. It was discovered under a weird name (serial number/MAC?) rather than the configured one ('Bath' in this case). Even after adding it, not only does it not show up in the device list, but the LIFX integration no longer finds it... a hardware reboot did not bring it back. Advice would be appreciated.
I am having the same issue as well. Reloading the integration tends to fix the issue, but it often reoccurs a short time later
I ran various tests, changing things on my network to reduce multicast traffic as much as I can, but no luck. My aim was to make polling all my LIFX bulbs/strips feasible within 10 s, which I think is too short a turnaround time.
I will try to move some bulbs to HomeKit and see whether that improves the situation, which is quite annoying at the moment.
If you haven't disconnected your bulbs from the LIFX Cloud, that's another thing you should do to reduce the CPU load on the bulbs themselves. This assumes you don't use themes or schedules defined in the LIFX app, as those require cloud connectivity.
They are disconnected, all of them. Everything worked well with no hiccups for 2 years before the integration update.
I'm not denying there is something going on with the way the integration currently does discovery, but it's proving extremely difficult to isolate or reproduce in a controlled environment. Especially considering discovery is improved for most folks.
I know it is not; please don't take my comment badly. My feeling is that a bad mix of polling frequencies and retries leads to this.
I previously had a (probably) faulty LIFX bulb which disconnected and hung on every HA restart (with the previous version of the integration), I think due to the burst of multicast/polling traffic HA was sending. I believe LIFX bulbs have very limited bandwidth and go bananas when "flooded".
In my environment I don't see the same intermittent disconnections for all bulbs: I see more for the ones with a weaker signal (still decent though, around -70dB). So I think it is a mix of Wi-Fi radio environment, positioning, and number of bulbs.
This likely just shows that polling is not a robust way of communicating. I'm not sure whether anything can be done differently within the LIFX integration.
I have now migrated a few lights to HomeKit: let's see if it gets better.
I am having the same issue as well. After 2+ years of ~99.9% uptime, I'm now experiencing multiple long dropouts across my 20 bulbs every day.
Yeah, I have a hypothesis as to the cause of this, I just need some spare time to refactor things to see if it's valid or not. I'm hoping to get to it this weekend.
There is a thundering herd problem with the coordinators that causes all polling to be aligned at microsecond 0. That is fixed in 2022.12.x, which might help with this issue.
The thundering herd fix at 0 microseconds: #82233
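To illustrate the general idea of a thundering herd and one common way to mitigate it, here is a minimal Python sketch. It is not the code from the fix referenced above: `next_poll_time` and the per-poller microsecond offset are assumptions made for illustration only.

```python
import random
from datetime import datetime, timedelta


def next_poll_time(now: datetime, interval: timedelta, offset_us: int) -> datetime:
    """Return the next poll time, pinned to a per-poller microsecond offset."""
    return (now + interval).replace(microsecond=offset_us)


now = datetime(2022, 12, 1, 12, 0, 0)
interval = timedelta(seconds=10)
poller_count = 20

# Every poller pinned to microsecond 0: all of them fire at the same instant.
aligned = {next_poll_time(now, interval, 0) for _ in range(poller_count)}

# Each poller gets its own offset within the second, spreading the load.
rng = random.Random(0)  # seeded only to make the sketch repeatable
offsets = [rng.randrange(1_000_000) for _ in range(poller_count)]
staggered = {next_poll_time(now, interval, offset) for offset in offsets}

print(f"aligned pollers share {len(aligned)} firing time(s)")
print(f"staggered pollers use {len(staggered)} distinct firing time(s)")
```

With 20+ bulbs on one network, spreading the polls across the second means the requests no longer hit the Wi-Fi network and the event loop in a single burst.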
I've been trying to track this down for ages. I'm really glad you found the cause.
In case this helps: I migrated all my LIFX lights to the HomeKit Controller integration, and since then (3+ weeks ago) I have had zero disconnections.
Thank you all for your input. I will hold out for 2022.12, as this issue still persists. I will use @mspinolo's suggestion if the issue remains after the update 🙏
@melbs2 I have some stuff I'm testing on top of @bdraco's fix for the thundering herd that is showing a lot of promise. There is still an issue with very old devices (like Beams or Tiles) but otherwise, I'm quite happy with the way my flock of 60 devices is behaving.
Did you change something recently? I'm getting a ton of errors for all my LIFX Z v1 strips. I've unpowered and repowered them multiple times, to no avail. They work everywhere else (LIFX app, Alexa).
Seems to have happened after 2022.11.5
Logger: homeassistant.config_entries
Source: config_entries.py:1089
First occurred: December 2, 2022 at 7:22:14 PM (462 occurrences)
Last logged: 2:06:59 AM
Config entry 'Lower' for lifx integration not ready yet: No response from LIFX bulb; Retrying in background
Config entry 'Door' for lifx integration not ready yet: No response from LIFX bulb; Retrying in background
Config entry 'Window' for lifx integration not ready yet: No response from LIFX bulb; Retrying in background
Config entry 'Upper' for lifx integration not ready yet: No response from LIFX bulb; Retrying in background
Logger: homeassistant.helpers.service
Source: helpers/service.py:637
First occurred: December 3, 2022 at 12:07:21 AM (13 occurrences)
Last logged: December 3, 2022 at 7:00:00 PM
Unable to find referenced entities light.door or it is/they are currently not available
Unable to find referenced entities light.window or it is/they are currently not available
Unable to find referenced entities light.upper or it is/they are currently not available
Unable to find referenced entities light.lower or it is/they are currently not available
If you have HACS installed, you could try the LIFX Beta component: https://github.com/Djelibeybi/ha-lifx-beta/
Having the same issue.
I'm unable to use automations with the lights, such as a slow dim with a transition. The lights go unavailable after 1-2 minutes, which then prompts them to turn off once they are available again. The lights do not otherwise go unavailable, but triggering this dimming, which I believe involves rapid communication with the light (turning on from 0 to 50% brightness over 5 minutes), seems to trigger the unavailability consistently. It is as if the light is flooded by the communication and decides to give up for a few seconds.