core icon indicating copy to clipboard operation
core copied to clipboard

aiowithings.exceptions.WithingsConnectionError: Timeout occurred while connecting to Withings

Open codyc1515 opened this issue 6 months ago • 11 comments

The problem

Logger: homeassistant.components.withings
Source: helpers/update_coordinator.py:300
Integration: Withings (documentation, issues)
First occurred: 09:37:26 (1 occurrences)
Last logged: 09:37:26

Unexpected error fetching Withings measurements data: Timeout occurred while connecting to Withings
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/aiowithings/withings.py", line 105, in _request
    response = await self.session.request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/client.py", line 574, in _request
    conn = await self._connector.connect(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/connector.py", line 544, in connect
    proto = await self._create_connection(req, traces, timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/connector.py", line 911, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/connector.py", line 1204, in _create_direct_connection
    transp, proto = await self._wrap_create_connection(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/connector.py", line 992, in _wrap_create_connection
    return await self._loop.create_connection(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/base_events.py", line 1112, in create_connection
    transport, protocol = await self._create_connection_transport(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/base_events.py", line 1145, in _create_connection_transport
    await waiter
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/aiowithings/withings.py", line 104, in _request
    async with asyncio.timeout(self.request_timeout):
  File "/usr/local/lib/python3.11/asyncio/timeouts.py", line 111, in __aexit__
    raise TimeoutError from exc_val
TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/update_coordinator.py", line 300, in _async_refresh
    self.data = await self._async_update_data()
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/withings/coordinator.py", line 70, in _async_update_data
    return await self._internal_update_data()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/withings/coordinator.py", line 100, in _internal_update_data
    measurements = await self._client.get_measurement_in_period(startdate, now)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiowithings/withings.py", line 190, in get_measurement_in_period
    return await self._get_measurements(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiowithings/withings.py", line 166, in _get_measurements
    response = await self._request("measure", data=data)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiowithings/withings.py", line 113, in _request
    raise WithingsConnectionError(msg) from exception
aiowithings.exceptions.WithingsConnectionError: Timeout occurred while connecting to Withings

What version of Home Assistant Core has the issue?

core-2024.1.2

What was the last working version of Home Assistant Core?

Around November or December 2023

What type of installation are you running?

Home Assistant OS

Integration causing the issue

Withings

Link to integration documentation on our website

https://www.home-assistant.io/integrations/withings/

Diagnostics information

No response

Example YAML snippet

No response

Anything in the logs that might be useful for us?

No response

Additional information

Happens every time I restart. The integration actually loads fine. I'm feeling like it's actually an issue on the Withings side or we are timing out the request prematurely. It looks like there was a change in aiowithings which touched timouts around the time I started to see issues.

codyc1515 avatar Jan 09 '24 20:01 codyc1515

Hey there @joostlek, mind taking a look at this issue as it has been labeled with an integration (withings) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of withings can trigger bot actions by commenting:

  • @home-assistant close Closes the issue.
  • @home-assistant rename Awesome new title Renames the issue.
  • @home-assistant reopen Reopen the issue.
  • @home-assistant unassign withings Removes the current integration label and assignees on the issue, add the integration domain after the command.
  • @home-assistant add-label needs-more-information Add a label (needs-more-information, problem in dependency, problem in custom component) to the issue.
  • @home-assistant remove-label needs-more-information Remove a label (needs-more-information, problem in dependency, problem in custom component) on the issue.

(message by CodeOwnersMention)


withings documentation withings source (message by IssueLinks)

home-assistant[bot] avatar Jan 09 '24 20:01 home-assistant[bot]

How many measurements do you have?

joostlek avatar Jan 09 '24 20:01 joostlek

I use it sparingly. Maybe once a week (if that). Just the thermometer.

codyc1515 avatar Jan 09 '24 23:01 codyc1515

Not sure if it’s related (if not I can file a separate issue) but I’ve been seeing similar timeouts with my sleep sensors:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/withings/__init__.py", line 245, in register_webhook
    await async_subscribe_webhooks(client, webhook_url)
  File "/usr/src/homeassistant/homeassistant/components/withings/__init__.py", line 309, in async_subscribe_webhooks
    await client.subscribe_notification(webhook_url, notification)
  File "/usr/local/lib/python3.11/site-packages/aiowithings/withings.py", line 349, in subscribe_notification
    await self._request(
  File "/usr/local/lib/python3.11/site-packages/aiowithings/withings.py", line 139, in _request
    raise WithingsConnectionError(error)
aiowithings.exceptions.WithingsConnectionError: Timeout

I reload the integration (after my automations fail since in bed never triggers) and everything is good. If I upgrade HA causing a reload this tends to reoccur.

dcmeglio avatar Jan 10 '24 04:01 dcmeglio

Yes, I also often faced an issue with timeouts when reloading YAML too.

codyc1515 avatar Jan 10 '24 04:01 codyc1515

This appears to be some sort of core issue on boot. Many integrations are facing timeout on first load. Any ideas for what may be causing this? It doesn't appear to be a Withings issue.

codyc1515 avatar Jan 11 '24 20:01 codyc1515

Could something be blocking network IO? Can you maybe try booting in safe mode?

joostlek avatar Jan 14 '24 08:01 joostlek

Could something be blocking network IO? Can you maybe try booting in safe mode?

I can try later but I'm not seeing what @codyc1515 says. Withings is the only integration experiencing timeouts (and seems there's no retry or anything that reports the integration failed to load so there is probably missing error handling here). Given that every other integration is working I'd be surprised if it's a network IO issue.

Also immediately reloading the integration always fixes it for me. I suspect if this integration implemented the built-in reload logic for this error it would auto-resolve.

dcmeglio avatar Jan 14 '24 14:01 dcmeglio

I agree. On checking, there’s a bunch of areas where exceptions from aiowithings are not caught. When this happens the integration fails to load but doesn’t show as such with a ConfigEntryError. Eventually as you say it seems to come right with a reload or similar.

codyc1515 avatar Jan 14 '24 17:01 codyc1515

Since I can consistently reproduce this one, I'm going to try to raise a PR later to catch and retry this error.

dcmeglio avatar Jan 14 '24 17:01 dcmeglio

Thank you. I took a stab at it but couldn't seem to get a handle on it. The issue is easily reproducible if you have a slow HA start-up. Incidentally, that does appear to be custom_integrations related as safe mode largely worked okay. There are some other errors that can occur in the Withings integration as well relating to web hooks that are unhandled, such as if you had an invalid / unreachable URL configured under Network in Settings.

codyc1515 avatar Jan 14 '24 23:01 codyc1515

The fix here is trickier than retries now that I see what's going on. The problem is that the error is occurring inside an asynchronous callback for cloud connection state changing. So we can't just raise a ConfigEntryNotReady as that won't be handled from within that code path. This would require implementing custom retry logic so it's a much bigger change. I'm trying to sort it out, but as far as I can tell there's almost no documentation on how the cloud webhook stuff works and looks like, understandably, @joostlek modeled this on the existing netatmo integration.

dcmeglio avatar Jan 20 '24 16:01 dcmeglio

I mean we can implement retry in aiowithings, that's no problem. I also just got another guy who has a bug and it just doesn't work at all

joostlek avatar Jan 20 '24 16:01 joostlek

Yeah that's where I'm headed but there could be weird race conditions that occur. E.g. while retries of a register are occurring, what if the cloud disconnects which would separately trigger an unregister (or the other way around) the order they execute would be non-deterministic since it's all async and we could end up in a weird state. The entire call to register, including all the retries, would need to be in a single lock.

dcmeglio avatar Jan 20 '24 16:01 dcmeglio

So I just am very curious why this fails in the first place. Does Withings take significantly longer at your requests?

joostlek avatar Jan 20 '24 16:01 joostlek

Oh, but the register and unregister is now only run one at a time, they won't run together anymore

joostlek avatar Jan 20 '24 16:01 joostlek

So I just am very curious why this fails in the first place. Does Withings take significantly longer at your requests?

I can't figure it out. It only happens on HA load. If I reload the integration, everything works near instantly, <1s. I have never gotten a timeout when I've manually reloaded the integration, it's only on HA startup.

dcmeglio avatar Jan 20 '24 16:01 dcmeglio

Do you have more config entries setup for Withings?

joostlek avatar Jan 20 '24 17:01 joostlek

Do you have more config entries setup for Withings?

I have 2 one for me and one for my wife.

dcmeglio avatar Jan 20 '24 17:01 dcmeglio

Can you maybe try disabling one and rebooting?

joostlek avatar Jan 20 '24 17:01 joostlek

I only have one and echo all the above (including the reloading). Note I’m also facing similar issues with the Netatmo integration.

codyc1515 avatar Jan 20 '24 18:01 codyc1515

Interesting, can you share the Netatmo error? Since the code in HA is so similar maybe we can spot a common problem.

dcmeglio avatar Jan 20 '24 19:01 dcmeglio

Since I can consistently reproduce this one, I'm going to try to raise a PR later to catch and retry this error.

Did you have a chance to look into that?

codyc1515 avatar Jan 28 '24 09:01 codyc1515

Yes, but as I said in the follow up replies, it's more complicated than that. This requires changes to the underlying aiowithings library. It's not a huge lift, but it's bigger than the 1 line change I originally thought and I havne't had time yet.

dcmeglio avatar Jan 28 '24 14:01 dcmeglio

Can you maybe elaborate a bit more on what you found?

joostlek avatar Jan 28 '24 14:01 joostlek

Just what we already discussed. It times out and a retry would probably resolve the issue. I was hoping @codyc1515 could provide the netatmo error I requested so we could try to find a pattern.

dcmeglio avatar Jan 28 '24 14:01 dcmeglio

I was chatting with Cody on discord today, and he had this error raised when subscribing webhooks. So when looking at the logs you would think it subscribed 5 webhooks (since one raised an error) but when manually calling the call to list all the webhooks, 6 were returned. So I think there's also a bit of Withings trickery involved.

Afterwards Cody created a new Withings application and it dissapeared

joostlek avatar Jan 28 '24 15:01 joostlek

That’s all correct. The Netatmo webhook issues I see relate to home_id values which are specific to Netatmo integration.

codyc1515 avatar Jan 28 '24 18:01 codyc1515

I am going to close this for now. I believe the issue was resolved by removing integration, deleting credentials for integration, creating new app in Withings portal, add new credentials to HA, install integration again. Thanks for your help with this.

codyc1515 avatar Jan 28 '24 22:01 codyc1515