TP-Link and Tapo devices failed setup with connect call failed errors since update to 2024.7

Open stuartford opened this issue 1 year ago • 23 comments

The problem

Since updating to 2024.7 the TP-Link and Tapo integrations are a mess. Most devices are working, but each integration has a hardcore few that fail.

Errors from each integration attached. HA continually tries to reload these devices, but they just end up back in the "Needs attention" list. Reloading the devices manually has the same result.

TP-Link devices make up the majority of my smart home estate, so this is a VM rollback incident for me.

Has anyone else had this issue, and, if so, how did you resolve it?

[Screenshots: 2024-08-05 at 11:27:55 and 2024-08-05 at 11:28:02]

What version of Home Assistant Core has the issue?

2024.7.4

What was the last working version of Home Assistant Core?

2024.6.2

What type of installation are you running?

Home Assistant OS

Integration causing the issue

TP-Link

Link to integration documentation on our website

https://www.home-assistant.io/integrations/tplink

Diagnostics information

No response

Example YAML snippet

No response

Anything in the logs that might be useful for us?

No response

Additional information

No response

stuartford avatar Aug 05 '24 10:08 stuartford

Hey there @rytilahti, @bdraco, @sdb9696, mind taking a look at this issue as it has been labeled with an integration (tplink) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of tplink can trigger bot actions by commenting:

  • @home-assistant close Closes the issue.
  • @home-assistant rename Awesome new title Renames the issue.
  • @home-assistant reopen Reopen the issue.
  • @home-assistant unassign tplink Removes the current integration label and assignees on the issue, add the integration domain after the command.
  • @home-assistant add-label needs-more-information Add a label (needs-more-information, problem in dependency, problem in custom component) to the issue.
  • @home-assistant remove-label needs-more-information Remove a label (needs-more-information, problem in dependency, problem in custom component) on the issue.

(message by CodeOwnersMention)


tplink documentation tplink source (message by IssueLinks)

home-assistant[bot] avatar Aug 05 '24 10:08 home-assistant[bot]

Update: Rolled back to 2024.6.2, but the problems persist, so the update is not the cause.

I am still left with a raft of non-functional devices, however, so I would appreciate some help with these errors!

stuartford avatar Aug 05 '24 10:08 stuartford

The devices may have changed IP address. Have you tried using the integration's discovery (leave the host field blank) to see whether new devices are found?

sdb9696 avatar Aug 05 '24 13:08 sdb9696

I solved this by power-cycling the devices. I don't know why it was required for roughly 1/3rd of the devices, but that's what I did and now they are working again.

stuartford avatar Aug 05 '24 14:08 stuartford

all my devices use static IP assignment.

i've found that since the big update - it takes a while for them to settle down, that smells like a rate-limit to me, somehow. other thing i would say is that i added some newer ep25 devices to my setup (auth required), while most of my other hardware is older (not required).

i have a fairly 'large' count of devices relative to most, i expect. (also a factor for my rate-limit theory, when they all try to shuffle in at the same time following a HA reboot). I have roughly 30 devices tied to the integration.

if it is a rate limit, then perhaps a staggered spin-up is in order? (powering down devices could also have the same timing-shift effect)
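
A rough sketch of what a staggered start could look like, in Python with asyncio. This is not how the integration behaves today; `connect_staggered`, `max_concurrent`, `max_jitter` and the `connect` callable are all made-up names for illustration, and whether staggering actually helps here is unconfirmed.

```python
import asyncio
import random


async def connect_staggered(hosts, connect, max_concurrent=4, max_jitter=2.0):
    """Call connect(host) for every host, a few at a time, with random delays.

    `connect` is a hypothetical async callable supplied by the caller.
    Returns a dict mapping each host to its result, or to the OSError raised.
    """
    gate = asyncio.Semaphore(max_concurrent)

    async def one(host):
        async with gate:
            # Random delay so the devices are not all contacted at the same instant.
            await asyncio.sleep(random.uniform(0, max_jitter))
            try:
                return host, await connect(host)
            except OSError as err:
                # Keep going: one unreachable device should not stop the rest.
                return host, err

    return dict(await asyncio.gather(*(one(h) for h in hosts)))
```

With roughly 30 devices, `max_concurrent=4` would mean at most four connection attempts are in flight at any moment; failed hosts come back as exception objects and could be retried later.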

speakxj7 avatar Aug 11 '24 23:08 speakxj7

I am also having this issue - I cannot add new devices manually, and the integration reports that there are no new devices when I try the discovery. I am also having the same issue as @stuartford . Please note screenshots below:

[Screenshots: IMG_5310, IMG_5309, IMG_5308, IMG_5307]

msmith000 avatar Aug 13 '24 12:08 msmith000

I'm having the same issue after updating to 2024.8.

Tried power cycling devices but most did not come back.

Daguse avatar Aug 15 '24 11:08 Daguse

i've found that since the big update - it takes a while for them to settle down, that smells like a rate-limit to me, somehow. other thing i would say is that i added some newer ep25 devices to my setup (auth required), while most of my other hardware is older (not required).

i have a fairly 'large' count of devices relative to most, i expect. (also a factor for my rate-limit theory, when they all try to shuffle in at the same time following a HA reboot). I have roughly 30 devices tied to the integration.

Are you having this problem also with the older devices, or does it just concern ep25s? We are not currently fetching the firmware update status for older devices, so if that's the case, it may very well have something to do with the cloud throttling the requests.

rytilahti avatar Aug 15 '24 12:08 rytilahti

I'm not sure how to tell if a device is an ep25. However, I believe they are older devices, as I don't remember having to do any auth requests.

Daguse avatar Aug 15 '24 13:08 Daguse

If you go to the device page, it will tell you the model. Alternatively, downloading the diagnostics info will also have more information (like whether authentication was used).

Anyway, it looks like this also affects non-auth models, given that communication is attempted over port 9999, which is only used for the older protocol. That said, this error only means that the device is not responding to the query for some reason. This might be due to network configuration (e.g., a firewall not allowing connections), the firmware having gotten stuck somehow, etc., and is rather hard to debug...
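
One way to check the "not responding" part outside of HA is to try opening a plain TCP connection to port 9999 from the same host. A minimal sketch using only the Python standard library; `tcp_port_open` is a name made up for this example, and a successful connection only shows that the port accepts connections, not that the device answers queries correctly.

```python
import socket


def tcp_port_open(host, port=9999, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, `tcp_port_open("192.0.2.10")` (a placeholder address) returning False while the device still answers pings would point at the device or a firewall rather than at Home Assistant.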

rytilahti avatar Aug 15 '24 13:08 rytilahti

i've found that since the big update - it takes a while for them to settle down, that smells like a rate-limit to me, somehow. other thing i would say is that i added some newer ep25 devices to my setup (auth required), while most of my other hardware is older (not required). i have a fairly 'large' count of devices relative to most, i expect. (also a factor for my rate-limit theory, when they all try to shuffle in at the same time following a HA reboot). I have roughly 30 devices tied to the integration.

Are you having this problem also with the older devices, or does it just concern ep25s? We are not currently fetching the firmware update status for older devices, so if that's the case, it may very well have something to do with the cloud throttling the requests.

i would say that it's statistically worse for the auth-required new ep25's (pretty much 100% flake out on cold start and also take the longest to settle down), but i definitely see the problem as a more general problem. as i speak (after rebooting my HA system for some updates) kl130, hs220, ep10, and old ep25 (no auth) - all in the connect call failed cycle. i expect in an hour or two that pretty much all my devices will be fine.

speakxj7 avatar Aug 15 '24 13:08 speakxj7

Could it be a network issue at your end? You could try installing python-kasa and running kasa discover to see if the devices are detected correctly. If that works reliably, we can possibly rule that out.

rytilahti avatar Aug 15 '24 13:08 rytilahti

So to add some flavor, I did a rollback to 2024.7 and still had the same issue. On top of that, I'm also having issues with Govee and IPP after updating. Something must have changed in the network.

Daguse avatar Aug 15 '24 14:08 Daguse

Same issue over and over again. I must power cycle my Tapo P100 and P110 every few days to keep them connected to HA.

DarthSonic avatar Aug 17 '24 11:08 DarthSonic

Mine have been like this too since I think 2024.7. They’re just so unreliable now, it’s just pot luck if they’re online or not. Anyone know of any alternative integrations that will bring the Tapo P100s into HA?

geecy84 avatar Aug 17 '24 17:08 geecy84

I've had this problem with TP-Link devices, old and new, switches, plugs, for nearly a year. Why can Alexa see these devices and work with them with no problem when HA can't? Note that only 10 of my 35 TP-Link devices are having this problem, and 2-3 are truly offline by design (unplugged plugs for Xmas use). I'm also having a similar problem with WLED, Reolink and Roku. In the meantime I just installed a new HS200 switch and the integration asks for user credentials. I used the ones I use to sign in to the Kasa app but they don't work. What credentials are they talking about?

rdperkins avatar Aug 31 '24 10:08 rdperkins

I've had this problem with TP-Link devices, old and new, switches, plugs, for nearly a year. Why can Alexa see these devices and work with them with no problem when HA can't?

If you ssh into your HA host, can you ping the devices?

I've just solved a similar problem with 10 KP-115 plugs. It was a combination of the plugs locking up (refusing connections, even though I could ping them) and some routing issue between HA and the devices.

Power cycling the plugs didn't help. I had to factory reset each plug before it would start accepting connections again, and then reboot the HA host to get the integration talking to them.

This tool was very helpful in interrogating the plugs outside of HA:

https://github.com/softScheck/tplink-smartplug
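
For reference, here is a minimal sketch of the kind of query such a tool sends, based on my understanding of the legacy protocol as documented by that project: JSON that is XOR-obfuscated with a starting key of 171 and sent over TCP port 9999 with a 4-byte length prefix. The function names are made up for this example, and devices that require authentication are not expected to answer this way.

```python
import json
import socket
import struct


def encrypt(plaintext: bytes) -> bytes:
    """XOR autokey: each byte is XORed with the previous ciphertext byte."""
    key = 171
    out = bytearray()
    for byte in plaintext:
        key ^= byte
        out.append(key)
    return bytes(out)


def decrypt(ciphertext: bytes) -> bytes:
    """Reverse of encrypt()."""
    key = 171
    out = bytearray()
    for byte in ciphertext:
        out.append(key ^ byte)
        key = byte
    return bytes(out)


def _recv_exact(sock, count):
    """Read exactly `count` bytes or raise if the peer closes early."""
    data = bytearray()
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("device closed the connection early")
        data.extend(chunk)
    return bytes(data)


def get_sysinfo(host, port=9999, timeout=3.0):
    """Ask a legacy device for its system info and return the decoded JSON."""
    payload = encrypt(json.dumps({"system": {"get_sysinfo": {}}}).encode())
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The TCP variant prefixes the payload with its big-endian length.
        sock.sendall(struct.pack(">I", len(payload)) + payload)
        (length,) = struct.unpack(">I", _recv_exact(sock, 4))
        body = _recv_exact(sock, length)
    return json.loads(decrypt(body))
```

If `get_sysinfo` raises a connection error for a plug that still answers pings, that matches the "locked up" state described above.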

carpii avatar Sep 01 '24 15:09 carpii

The upcoming release will disable the firmware update information (#124930) which should improve stability as the devices do not connect to the cloud anymore to fetch that information.

@rdperkins

Why can Alexa see these devices and work with them with no problem when HA can't?

Maybe Alexa controls these devices over the cloud, and by doing so is not affected by any issues in the local controls?

I'm also having a similar problem with WLED, Reolink and Roku.

This sounds like a deeper problem in your network, to be honest. These devices are controlled by completely different protocols and thus should have no effect on tplink devices.

In the meantime I just installed a new HS200 switch and the integration asks for user credentials. I used the ones I use signing on to the Kasa app but they don't work. What credentials are they talking about?

They are the credentials that were used when you provisioned the device using the app.

@carpii you can also use python-kasa directly to control these devices from the console. It is the very same library that is used by the integration.

rytilahti avatar Sep 01 '24 17:09 rytilahti

Problem remains with many of my 35 TP-Link devices, a couple of which are HS100 plugs, but most are HS200 switches or the 3-way versions. The problem is not just recent devices unless Kasa updated the firmware to require authentication. I've had most of these devices over 2 years. I powered down the entire house thinking that would work, but only 5 devices were discovered the first day, then two a day later, then two more two days later. I finally reinstalled HAOS and restored from a backup, so back to square zero. Can these devices be flashed to operate locally or do I need to go with Z-Wave devices and ditch Kasa? I've probably spent 50-60 hours minimum trying to get this solved. Has anyone got a working suggestion?

rdperkins avatar Oct 07 '24 23:10 rdperkins

I recently experienced the same problem as @msmith000 with my TP-Link devices losing connectivity. While checking my router's Wi-Fi settings, I found this:

[Screenshot: Screenshot_20241009_021422_ASUS Router]

After disabling 802.11ax mode, all devices started working as expected. While this is a workaround for me, I feel like I have downgraded my router's capabilities with this move.

MariusAPop avatar Oct 09 '24 09:10 MariusAPop

I removed 5 GHz access from my IoT SSID and it didn't help. Thanks anyway.

rdperkins avatar Oct 10 '24 19:10 rdperkins

I did finally get things working. My solution was to assign a static IP to each device on a 2.4 GHz network. Sometimes they get lost to HA when I restart it, but usually it's easily handled by also restarting the router. I would recommend keeping a list of the IPs and MAC addresses in a spreadsheet so you know which device is offending.
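
As an illustration of the spreadsheet idea, a small sketch that looks up a failing IP address in a CSV export. The column names (`name`, `ip`, `mac`) and both function names are assumptions made for this example, not anything the integration provides.

```python
import csv
import io


def load_inventory(csv_text):
    """Map each IP address to its inventory row (columns: name, ip, mac)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["ip"]: row for row in reader}


def describe(ip, inventory):
    """Return a readable label for an IP address seen in an error message."""
    row = inventory.get(ip)
    if row is None:
        return f"{ip}: not in inventory"
    return f"{ip}: {row['name']} ({row['mac']})"
```

Given the IP from a "connect call failed" error, `describe(ip, inventory)` tells you which physical device to go and power cycle.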

msmith000 avatar Oct 10 '24 20:10 msmith000

I did finally get things working. My solution was to assign a static IP to each device on a 2.4 GHz network. Sometimes they get lost to HA when I restart it, but usually it's easily handled by also restarting the router. I would recommend keeping a list of the IPs and MAC addresses in a spreadsheet so you know which device is offending.

msmith000 avatar Oct 26 '24 20:10 msmith000

I'm having the same issue with S505 light dimmer switches, all four of them, with the same connection error as reported above. I've rebooted the router and confirmed that all of the S505 switches are connected to it.

gobees avatar Dec 03 '24 17:12 gobees

Closing this issue as it's old and the OP's issue is solved, as are a number of the other reports. We're also updating the documentation with additional troubleshooting steps.

sdb9696 avatar Jan 10 '25 11:01 sdb9696