TPlink Kasa devices constantly go offline in Home Assistant
The problem
TP-Link Kasa Wi-Fi switches (HS200, HS210, KS200), in-wall outlets (KP200), and power strips (KP303) show as offline in Home Assistant despite working in the Kasa app. These devices go online and offline seemingly at random.
What version of Home Assistant Core has the issue?
core-2024.10.1
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant OS
Integration causing the issue
TP-Link Smart Home
Link to integration documentation on our website
(https://www.home-assistant.io/integrations/tplink)
Diagnostics information
home-assistant_tplink_2024-10-09T16-52-23.108Z.log
Example YAML snippet
No response
Anything in the logs that might be useful for us?
No response
Additional information
No response
Hey there @rytilahti, @bdraco, @sdb9696, mind taking a look at this issue as it has been labeled with an integration (tplink) you are listed as a code owner for? Thanks!
Code owner commands
Code owners of tplink can trigger bot actions by commenting:
- @home-assistant close: Closes the issue.
- @home-assistant rename Awesome new title: Renames the issue.
- @home-assistant reopen: Reopen the issue.
- @home-assistant unassign tplink: Removes the current integration label and assignees on the issue, add the integration domain after the command.
- @home-assistant add-label needs-more-information: Add a label (needs-more-information, problem in dependency, problem in custom component) to the issue.
- @home-assistant remove-label needs-more-information: Remove a label (needs-more-information, problem in dependency, problem in custom component) on the issue.
(message by CodeOwnersMention)
tplink documentation tplink source (message by IssueLinks)
I have Kasa devices (HS100, HS110) and they have been offline for almost a year. I gave up on this integration, tbh.
I have 60 devices. 8 of them have just started going offline consistently (still good in the app), whereas the others are still fine. They are also all near each other in the house, strangely enough. If I reset a switch with its button, it reconnects and is good for a day or two. It's always the same 8 devices.
Just to be clear: my 5 smart plugs work perfectly with the app or via Amazon Alexa. There are zero problems, always online and my network is perfect (all devices with static IP addresses).
It's just this integration that always has problems, so I ended up removing it from HA. I have 80 other devices (plugs/lights/sensors/valves) with different protocols (Wi-Fi/Zigbee/Bluetooth) and the only problem is with the TP-Link HS100/110. All the time.
They drop off the network (only in HA; they are obviously still there) and the system keeps trying to reconnect every 5 min and failing. Since it's been almost 8 months since this problem started, I think either this integration isn't maintained anymore, or nobody cares about Kasa plugs.
Thank you for your perspectives, I thought I was going nuts with the trouble I was having.
To do a little independent investigation (maybe it was a hardware issue?) I wrote a Python script using python-kasa to:
- Discover all my Kasa devices to see if the number discovered matched the number expected;
- Get a 'feature' list from the devices; and
- Attempt to force a 'reboot' via Python, since manually pressing the reboot button on the switches will clear up the connection issues for a day or so (like rtech73 stated).
While all of the devices respond in the Kasa and Tapo apps on the iPhone (iOS 18), the Python script throws errors on 'device.update()' and 'device.reboot()' for the IPs of the Kasa plugs and switches that show offline in HA.
TLDR: Is it a python-kasa issue?
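A script along the lines described above could look like the following minimal sketch. It is not the script that was actually run. It assumes a recent python-kasa release that provides `Discover.discover()`, `Device.update()` and the `Device.features` mapping; the helper names `missing_hosts` and `survey` and the example addresses are hypothetical.

```python
import asyncio


def missing_hosts(expected, discovered):
    """Return the expected hosts that discovery did not find (sorted as strings)."""
    return sorted(set(expected) - set(discovered))


async def survey(expected):
    # Third-party dependency (pip install python-kasa), imported lazily so the
    # pure helper above stays usable without it.
    from kasa import Discover

    # Broadcast discovery on the local network; returns a mapping host -> device.
    found = await Discover.discover()
    print(f"expected {len(expected)} devices, discovered {len(found)}")

    for host, dev in found.items():
        try:
            await dev.update()  # refresh the device state
            print(host, dev.alias, dev.model, sorted(dev.features))
        except Exception as exc:  # broad on purpose: exception classes vary by release
            print(f"{host}: update failed: {exc!r}")

    for host in missing_hosts(expected, found):
        print(f"{host}: not discovered")


# Example call (hypothetical addresses):
# asyncio.run(survey(["10.0.0.21", "10.0.0.22"]))
```

Comparing the discovered hosts against a fixed list makes it easy to see whether a device is missing from discovery entirely or is discovered but fails on `update()`.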
Hi @AV0uu. Looking through your logs, at the times when your tplink devices go offline you also have multiple other device types going offline with different integrations. It appears there are either general network issues happening at the same time or network issues specifically with the device running HA.
There are a few reasons why the Kasa and Tapo apps appear to be working OK. HA connects locally to the devices, whereas if the devices have access to the internet the native apps tend to connect via the cloud. It could be that the HA box is experiencing the issues and they are not affecting the devices, which go directly through your router to the cloud. Also, the tplink integration reports unavailable as soon as the device becomes unavailable (within 5 seconds), whereas the native apps generally don't tell you that they can't connect until much later.
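One way to check local reachability from the machine running HA, independently of the cloud path the apps use, is a plain TCP connection attempt. The sketch below uses only the Python standard library. It assumes the device accepts local connections on TCP port 9999, the port that appears in the error messages reported in this issue; the function name `can_connect` is hypothetical. A successful connect only shows that the port is reachable, not that the device answers queries.

```python
import asyncio


async def can_connect(host, port=9999, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        _reader, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout
        )
    except (OSError, asyncio.TimeoutError):
        # Refused, unreachable, or no answer in time.
        return False
    writer.close()
    try:
        await writer.wait_closed()
    except OSError:
        pass  # the peer may already have dropped the connection
    return True


# Example call (address taken from the logs in this thread):
# print(asyncio.run(can_connect("10.1.11.19")))
```

Running this from the HA host while a device shows as unavailable helps separate "the HA box cannot reach the device" from "the device is reachable but the query fails".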
@Gherry777 please open a new issue and include some debug logs if you want assistance. This integration is well maintained and we are happy to help with issues when there are constructive contributions.
@rtech73 we have had some issues reported where they have turned out to be problems with certain access points on mesh networks. Tweaking the wireless protocols has been reported as sometimes fixing these issues.
I have 11 kasa KL125 bulbs and since 2024.10.2 they've been going offline in homeassistant. In the app they are always online and consistently immediately responsive. The fact that the app works instantly on all bulbs that show offline in home assistant tells me that this is definitely a home assistant thing.
One person in another thread mentioned changing IPs in dhcp, but I've since migrated all to static IPs.
This is what I see in home assistant. Again, all of these have solid signal to the access point, and are available in the kasa app and instantly respond to changes from kasa.
Some of the behavior I see when using Home Assistant:
When turning a group of 4 bulbs on or off:
- Not all will toggle, usually 2 or 3 out of 4.
- One may blink at full brightness for about a tenth of a second every 3 to 4 seconds.
- Sometimes they come on at different brightnesses.
Code for my toggle call
metadata: {}
data:
  brightness_pct: 100
target:
  area_id: office
  entity_id: light.office_lights
action: light.toggle
light.office_lights is a group entity containing office light 1-4
Adding log entries - I see a bunch of this over and over with all of my bulbs:
2024-10-18 12:13:17.806 ERROR (MainThread) [homeassistant.components.tplink.coordinator] Error fetching 10.1.11.19 data: Unable to query the device 10.1.11.19:9999:
2024-10-18 12:13:17.807 WARNING (MainThread) [homeassistant.components.group.sensor] Unable to use state. Only numerical states are supported, entity sensor.office_light_1_current_consumption with value unavailable excluded from calculation in sensor.lights_current_energy_usage
2024-10-18 12:13:17.807 WARNING (MainThread) [homeassistant.components.group.sensor] Unable to use state. Only numerical states are supported, entity sensor.office_light_1_today_s_consumption with value unavailable excluded from calculation in sensor.lights_today_s_usage
2024-10-18 12:13:18.005 ERROR (MainThread) [homeassistant.components.tplink.coordinator] Error fetching 10.1.11.10 data: Unable to query the device 10.1.11.10:9999:
2024-10-18 12:13:18.007 WARNING (MainThread) [homeassistant.components.group.sensor] Unable to use state. Only numerical states are supported, entity sensor.office_light_4_current_consumption with value unavailable excluded from calculation in sensor.lights_current_energy_usage
2024-10-18 12:13:18.007 WARNING (MainThread) [homeassistant.components.group.sensor] Unable to use state. Only numerical states are supported, entity sensor.office_light_4_today_s_consumption with value unavailable excluded from calculation in sensor.lights_today_s_usage
2024-10-18 12:13:27.946 ERROR (MainThread) [homeassistant.components.tplink.coordinator] Error fetching 10.1.11.11 data: Unable to query the device 10.1.11.11:9999:
2024-10-18 12:13:27.947 WARNING (MainThread) [homeassistant.components.group.sensor] Unable to use state. Only numerical states are supported, entity sensor.bedroom_light_2_current_consumption with value unavailable excluded from calculation in sensor.lights_current_energy_usage
2024-10-18 12:13:27.947 WARNING (MainThread) [homeassistant.components.group.sensor] Unable to use state. Only numerical states are supported, entity sensor.bedroom_light_2_today_s_consumption with value unavailable excluded from calculation in sensor.lights_today_s_usage
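To see which addresses fail most often, log lines like the ones above can be tallied per device. This is a small standard-library sketch, assuming the coordinator error lines keep the format shown in this log; the function name `count_fetch_errors` and the log file name in the usage comment are hypothetical.

```python
import re
from collections import Counter

# Matches the tplink coordinator error lines and captures the device address.
FETCH_ERROR = re.compile(
    r"\[homeassistant\.components\.tplink\.coordinator\] Error fetching (\S+) data"
)


def count_fetch_errors(lines):
    """Count tplink 'Error fetching <address> data' log lines per address."""
    counts = Counter()
    for line in lines:
        match = FETCH_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Example call (hypothetical file name):
# with open("home-assistant.log") as fh:
#     print(count_fetch_errors(fh).most_common())
```

The group sensor warnings are ignored on purpose, since they are a consequence of the entities becoming unavailable rather than a separate failure.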
> Hi @AV0uu. Looking through your logs, at the times when your tplink devices go offline you also have multiple other device types going offline with different integrations. It appears there are either general network issues happening at the same time or network issues specifically with the device running HA.
> There are a few reasons why the Kasa and Tapo apps appear to be working OK. HA connects locally to the devices, whereas if the devices have access to the internet the native apps tend to connect via the cloud. It could be that the HA box is experiencing the issues and they are not affecting the devices, which go directly through your router to the cloud. Also, the tplink integration reports unavailable as soon as the device becomes unavailable (within 5 seconds), whereas the native apps generally don't tell you that they can't connect until much later.
Thank you for taking the time to explain. I understand that the apps would be slower to show unavailability; perhaps the log I sent is not telling the whole story. The devices stay offline for hours or even a day in HA while I am able to operate them in the apps. For instance, right at this moment the 'Fireplace Top' and 'Fireplace Bottom' entities of my 'Fireplace Outlet' both show offline in HA, and have for hours, but I can operate the outlet via the Kasa app.
@welborn please open a new issue and include some debug logs.
> Thank you for taking the time to explain. I understand that the apps would be slower to show unavailability; perhaps the log I sent is not telling the whole story. The devices stay offline for hours or even a day in HA while I am able to operate them in the apps. For instance, right at this moment the 'Fireplace Top' and 'Fireplace Bottom' entities of my 'Fireplace Outlet' both show offline in HA, and have for hours, but I can operate the outlet via the Kasa app.

Yes, but as I said, there are many devices in your HA instance reporting as unavailable. You could enable debug logs for kasa, which would give us more detail, but I also think you should try to figure out whether any of your custom integrations are periodically hosing your HA instance.
> Thank you for taking the time to explain. I understand that the apps would be slower to show unavailability; perhaps the log I sent is not telling the whole story. The devices stay offline for hours or even a day in HA while I am able to operate them in the apps. For instance, right at this moment the 'Fireplace Top' and 'Fireplace Bottom' entities of my 'Fireplace Outlet' both show offline in HA, and have for hours, but I can operate the outlet via the Kasa app.
>
> Yes, but as I said, there are many devices in your HA instance reporting as unavailable. You could enable debug logs for kasa, which would give us more detail, but I also think you should try to figure out whether any of your custom integrations are periodically hosing your HA instance.
Will do!
home-assistant_tplink_2024-10-18T18-21-36.576Z.log
Debug log attached (I had to cut it down to fit the upload size).
Were the devices unavailable during this logging? I don't see any tplink errors.
Some Kasa devices were available, others were unavailable. I just went through and deleted or disabled a bunch of HACS integrations to see if that would help, but it appears to have had no effect.
home-assistant_tplink_2024-10-18T18-50-52.501Z.log
Four devices were offline during this debug log period (all of them KP200 outlets, which seem to suffer from the problem more than the switches).
At the risk of shouting into the dark here, I am following up a couple of weeks later, after attempting to mitigate some of the issues, for anyone searching for the same problem in the future (https://xkcd.com/979/).
I was able to improve device availability in HA by:
- Assigning a static IP address to each device.
- Removing the single-pole switches and replacing them with Z-Wave ones (fewer devices on Wi-Fi).
- Removing most of the HACS integrations and add-ons that I did not absolutely need.
- Writing a Python script that sends a 'reboot' to the Kasa devices on my network, which I run a couple of times per week.
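The periodic reboot in the last item could be sketched as follows. This is not the script that was actually used. It assumes a python-kasa release that provides `Discover.discover_single()` and `Device.reboot()`; the helper names `reboot_hosts` and `_kasa_connect` and the example addresses are hypothetical. The connection step is injectable so the loop can be exercised without real devices.

```python
import asyncio


async def _kasa_connect(host):
    # Third-party dependency (pip install python-kasa), imported lazily.
    from kasa import Discover

    return await Discover.discover_single(host)


async def reboot_hosts(hosts, connect=_kasa_connect, delay=1):
    """Ask each host to reboot; return a host -> outcome mapping."""
    results = {}
    for host in hosts:
        try:
            dev = await connect(host)
            await dev.reboot(delay)  # delay in seconds before the device restarts
            results[host] = "reboot requested"
        except Exception as exc:  # broad on purpose: keep going with the other hosts
            results[host] = f"failed: {exc!r}"
    return results


# Example call (hypothetical addresses):
# print(asyncio.run(reboot_hosts(["10.0.0.21", "10.0.0.22"])))
```

Collecting the outcome per host, instead of stopping at the first error, shows which devices could not even be reached for a reboot.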
I would estimate that I am now seeing 80% to 90% of the Kasa devices in HA fairly consistently, but rarely 100% all at once. The worst offenders are still the in-wall outlets (KP200s) as well as some of the older 3-way switches (which are the only ones that have a 2-traveler architecture).
One challenge was the TP-Link mesh router's (XE75) feature for identifying and controlling the Kasa switches. I think this was part of the issue, and it made assigning static IPs overly difficult (I had to type in the MAC addresses manually).
A reboot device button in HA would be very welcome!
Thank you all for taking the time to advise me about this issue, especially sdb9696.
TLDR: There was definitely something to the assertion that there may be network issues, but I still think something else is also causing issues.
Thanks for your insights, @AV0uu! There are many variables, and they also differ between device families, so finding the exact cause is rather complicated, as you have noticed.
I feel that one of the most common issues is an overly strict network configuration combined with device address changes, so I'm going to describe how this works a bit. In order to update the config entry on address changes, the integration relies on L2 connectivity:
- UDP broadcasts that are sent out on the network interfaces configured in Home Assistant, and
- DHCP communications, which require that there is a matching entry in the manifest file for the MAC address in use ~~(and notably for its host name, too!)~~
Both of these require that the Home Assistant instance is running in the same network, or at least has direct access to it and is configured to use the separate network adapters. The documentation could be improved to clarify this, so PRs are welcome!
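As a rough first check of the "same network" requirement, the device addresses can be compared against the subnet of the Home Assistant host's interface. The sketch below uses only the standard library; the function name `hosts_outside_network` and the interface address in the usage comment are hypothetical. Sharing an IP subnet is only a proxy for L2 connectivity, since VLANs or client isolation on an access point can still block broadcasts within one subnet.

```python
import ipaddress


def hosts_outside_network(ha_interface, device_ips):
    """Return the device IPs that are not in the subnet of the HA interface.

    ha_interface is the HA host address with its prefix length, e.g. "10.1.11.2/24".
    """
    network = ipaddress.ip_interface(ha_interface).network
    return [ip for ip in device_ips if ipaddress.ip_address(ip) not in network]


# Example call (hypothetical interface address; device addresses as seen in this thread):
# print(hosts_outside_network("10.1.11.2/24", ["10.1.11.19", "10.1.11.10"]))
```

Any address this returns sits on a different subnet, where broadcast-based discovery and DHCP-based updates from the HA host would not be expected to work without extra network configuration.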
Now, while I was writing this comment, I started to wonder whether our hostname-based filtering might be a cause of some of the woes. It is perhaps worth investigating whether it would be fine to move away from hostname-based matching and perform a connectivity check for all known MAC address prefixes. This would also relieve us, the maintainers, from trying to keep the list up-to-date.
On a perhaps related note about how you improved the availability, @sdb9696 noticed that some devices throttle discovery requests (https://github.com/python-kasa/python-kasa/pull/1207), so disabling other tapo/kasa integrations that send out requests might indeed help to alleviate the issue.
P.S. Your wish for a 'reboot' button has been answered, and there is now one you can enable in the 2024.11 release (see #127935) :-)
edit: I was correct OOB, that the hostname matching applies only to the initial discovery, as registered_devices in the manifest skips the hostname check for already known devices. Whether we should be less strict on the checks in general remains undecided.
Wow, thank you @rytilahti , for your additional explanation. I have upgraded to 2024.11 and enabled that reboot for all my kasa devices! You guys rock!
Closing this issue as network related and (mostly I think) resolved