Home Assistant loses internet connection every day

Open Nezz opened this issue 2 years ago • 24 comments

Describe the issue you are experiencing

Every day at some point during the day my home assistant loses internet connection. It cannot be reached from the network (including http://homeassistant.local:4357), nor can it reach any devices on the network (wifi devices become unavailable). It has happened every day since I updated to 2023.8.4, but it seems unlikely that this would be caused by updating core.

I run Home Assistant OS on VMware Workstation (bridged networking mode). The host machine is connected to the internet. Reconnecting the network adapter to the VM does not resolve the issue, and neither does network reload. Restarting the VM does.

What operating system image do you use?

generic-x86-64 (Generic UEFI capable x86-64 systems)

What version of Home Assistant Operating System is installed?

10.5

Did you upgrade the Operating System?

No

Steps to reproduce the issue

Unclear what causes this.

Anything in the Supervisor logs that might be useful for us?

Nothing relevant in supervisor logs

Anything in the Host logs that might be useful for us?

Nothing relevant in host logs

System information

No response

Additional information

[Attached images: Untitled, image, image]

Nezz avatar Aug 31 '23 19:08 Nezz

Let me know what kind of information I could grab to troubleshoot this. On a high level it seems to me that the ipv4 networking interface does not recover in HAOS.

Note that when the issue happens I can only access Home Assistant via the CLI.

Nezz avatar Aug 31 '23 19:08 Nezz

It seems that you are running without IPv4. Is that always the case?

agners avatar Aug 31 '23 20:08 agners

No, the VM normally gets 192.168.0.43 (that's when things work).

Nezz avatar Aug 31 '23 21:08 Nezz

Let me know if I should grab any further info. I'm about to restart the VM, so the issue will occur again in 12-24 hours.

Nezz avatar Sep 01 '23 19:09 Nezz

Running network update enp2s1 --ipv4-method auto brought my HA instance back online without restarting the VM.

Nezz avatar Sep 01 '23 19:09 Nezz

It disconnected again and running the command from the previous comment helped again.

Nezz avatar Sep 02 '23 20:09 Nezz

My instance (OS 10.5, core 2023.8.4) has dropped off the network two days in a row. I run the OS directly (not in a VM) on an old laptop (i7 processor from 2016 or so), connected over WiFi, with IPv4 and IPv6 enabled. It had been rock solid until two days ago. I will see if I can get something sensible from the logs when I get home (I obviously cannot reach it remotely).

uphillbattle avatar Sep 05 '23 06:09 uphillbattle

Running network update enp2s1 --ipv4-method auto brought my HA instance back online without restarting the VM.

Hm, that is weird. It sounds like NetworkManager has issues acquiring an IPv4 address then.

Can you share the output of ha host logs --identifier NetworkManager -n 10000? You can redirect it to /config e.g. using:

ha host logs --identifier NetworkManager -n 10000 > /config/networkmanager.log

agners avatar Sep 05 '23 20:09 agners

Here is the log: networkmanager.log. A disconnect happened on Sep 2, 22:56-23:29 (UTC+3). Another one happened on Sep 4, 19:18-23:55.

Nezz avatar Sep 06 '23 08:09 Nezz

Here is what my logs look like when things work as expected:

Sep 05 18:30:29 homeassistant NetworkManager[304]: <info>  [1693938629.9525] dhcp4 (enp2s1): activation: beginning transaction (timeout in 45 seconds)
Sep 05 18:30:29 homeassistant NetworkManager[304]: <info>  [1693938629.9565] dhcp4 (enp2s1): state changed no lease
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.0124] manager: NetworkManager state is now DISCONNECTED
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.0134] device (enp2s1): Activation: starting connection 'Supervisor enp2s1' (fd28114f-6784-4a46-913f-2277b1bbbf74)
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.0425] device (enp2s1): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.0436] manager: NetworkManager state is now CONNECTING
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.0444] device (enp2s1): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.0653] device (enp2s1): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.1086] dhcp4 (enp2s1): activation: beginning transaction (timeout in 45 seconds)
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.3323] dhcp4 (enp2s1): state changed new lease, address=192.168.0.58
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.3392] policy: set 'Supervisor enp2s1' (enp2s1) as default for IPv4 routing and DNS
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.4572] device (enp2s1): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.4857] device (enp2s1): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.4994] device (enp2s1): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.5221] manager: NetworkManager state is now CONNECTED_SITE
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.5276] device (enp2s1): Activation: successful, device activated.
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.7554] manager: NetworkManager state is now CONNECTED_GLOBAL

state changed no lease -> NetworkManager state is now DISCONNECTED -> recovery begins

Here is the first disconnect.

Sep 02 19:56:25 homeassistant NetworkManager[313]: <info>  [1693684585.1553] dhcp4 (enp2s1): activation: beginning transaction (timeout in 45 seconds)
Sep 02 19:56:25 homeassistant NetworkManager[313]: <info>  [1693684585.1555] dhcp4 (enp2s1): state changed no lease
Sep 02 20:06:25 homeassistant NetworkManager[313]: <info>  [1693685185.4444] manager: NetworkManager state is now CONNECTED_SITE
[ network update enp2s1 --ipv4-method auto is called ]
Sep 02 20:29:15 homeassistant NetworkManager[313]: <info>  [1693686555.5143] audit: op="connection-update" uuid="60e10c4d-470a-3fae-8f6f-e4ccd041ae60" name="Supervisor enp2s1" args="connection.timestamp" pid=969 uid=0 result="success"
Sep 02 20:29:15 homeassistant NetworkManager[313]: <info>  [1693686555.5202] device (enp2s1): state change: activated -> deactivating (reason 'new-activation', sys-iface-state: 'managed')
...
Sep 02 20:29:16 homeassistant NetworkManager[313]: <info>  [1693686556.3933] policy: set 'Supervisor enp2s1' (enp2s1) as default for IPv4 routing and DNS
Sep 02 20:29:16 homeassistant NetworkManager[313]: <info>  [1693686556.4126] device (enp2s1): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Sep 02 20:29:16 homeassistant NetworkManager[313]: <info>  [1693686556.4157] device (enp2s1): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Sep 02 20:29:16 homeassistant NetworkManager[313]: <info>  [1693686556.4163] device (enp2s1): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Sep 02 20:29:16 homeassistant NetworkManager[313]: <info>  [1693686556.4173] manager: NetworkManager state is now CONNECTED_SITE
Sep 02 20:29:16 homeassistant NetworkManager[313]: <info>  [1693686556.4195] device (enp2s1): Activation: successful, device activated.
Sep 02 20:29:16 homeassistant NetworkManager[313]: <info>  [1693686556.4657] manager: NetworkManager state is now CONNECTED_GLOBAL

It seems that NetworkManager sometimes does not pick up that the lease has expired. Instead of changing the state to DISCONNECTED, it gets stuck in CONNECTED_SITE.

The disconnect I had on the 4th of September is another example of that:

Sep 03 16:17:57 homeassistant NetworkManager[313]: <info>  [1693757877.2504] device (enp2s1): Activation: successful, device activated.
Sep 03 16:17:57 homeassistant NetworkManager[313]: <info>  [1693757877.3082] manager: NetworkManager state is now CONNECTED_GLOBAL
Sep 04 16:17:56 homeassistant NetworkManager[313]: <info>  [1693844276.9655] dhcp4 (enp2s1): activation: beginning transaction (timeout in 45 seconds)
Sep 04 16:17:56 homeassistant NetworkManager[313]: <info>  [1693844276.9657] dhcp4 (enp2s1): state changed no lease
Sep 04 16:17:57 homeassistant NetworkManager[313]: <info>  [1693844277.3220] manager: NetworkManager state is now CONNECTED_SITE

state changed no lease -> NetworkManager state is now CONNECTED_SITE -> no recovery happens
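The two patterns above can be checked mechanically in a saved networkmanager.log. This is only a rough sketch: it assumes the log lines look exactly like the excerpts in this comment, and the function name `classify_lease_losses` and the verdict labels are made up for illustration.

```python
import re
from typing import Iterable, List, Tuple

# Patterns taken from the log excerpts quoted in this thread.
NO_LEASE = re.compile(r"dhcp4 \([^)]+\): state changed no lease")
MANAGER_STATE = re.compile(r"manager: NetworkManager state is now (?P<state>\w+)")


def classify_lease_losses(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Pair each 'no lease' line with the next manager state that follows it.

    Returns (timestamp, verdict) tuples. The timestamp is the first 15
    characters of the 'no lease' line (e.g. "Sep 02 19:56:25"). The verdict is
    "recovering" if the next state is DISCONNECTED, "stuck" if it is
    CONNECTED_SITE, and "other:<STATE>" for anything else.
    """
    results: List[Tuple[str, str]] = []
    pending = None  # timestamp of a 'no lease' line still awaiting a state
    for line in lines:
        if NO_LEASE.search(line):
            pending = line[:15]
            continue
        match = MANAGER_STATE.search(line)
        if match and pending is not None:
            state = match.group("state")
            if state == "DISCONNECTED":
                verdict = "recovering"
            elif state == "CONNECTED_SITE":
                verdict = "stuck"
            else:
                verdict = "other:" + state
            results.append((pending, verdict))
            pending = None
    return results


if __name__ == "__main__":
    sample = [
        "Sep 05 18:30:29 homeassistant NetworkManager[304]: <info>  [1693938629.9565] dhcp4 (enp2s1): state changed no lease",
        "Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.0124] manager: NetworkManager state is now DISCONNECTED",
        "Sep 02 19:56:25 homeassistant NetworkManager[313]: <info>  [1693684585.1555] dhcp4 (enp2s1): state changed no lease",
        "Sep 02 20:06:25 homeassistant NetworkManager[313]: <info>  [1693685185.4444] manager: NetworkManager state is now CONNECTED_SITE",
    ]
    for timestamp, verdict in classify_lease_losses(sample):
        print(timestamp, verdict)
    # prints:
    # Sep 05 18:30:29 recovering
    # Sep 02 19:56:25 stuck
```

To run it over the full file, pass an open file object instead of the sample list, e.g. `classify_lease_losses(open("networkmanager.log"))`.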

Nezz avatar Sep 06 '23 08:09 Nezz

The same network drop happened again:

Sep 06 18:30:30 homeassistant NetworkManager[304]: <info>  [1694025030.1153] dhcp4 (enp2s1): activation: beginning transaction (timeout in 45 seconds)
Sep 06 18:30:30 homeassistant NetworkManager[304]: <info>  [1694025030.1157] dhcp4 (enp2s1): state changed no lease
Sep 06 18:40:30 homeassistant NetworkManager[304]: <info>  [1694025630.5754] manager: NetworkManager state is now CONNECTED_SITE

Nezz avatar Sep 06 '23 20:09 Nezz

Any tips on how to fix this? I need to run network update enp2s1 every day to bring HA back online. Can I schedule that somehow to work around this?
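One possible shape for such a scheduled workaround is a small watchdog that pings the router and only re-applies the IPv4 method when the ping fails. This is a sketch under several assumptions, not a tested fix: it assumes an environment where both Python 3 and the `ha` CLI are available, the gateway address 192.168.0.1 is a guess (only the VM's own address appears in this thread), and the names `check_and_recover` and `run_command` are hypothetical.

```python
import subprocess
from typing import Callable, Sequence

INTERFACE = "enp2s1"     # interface name used in this thread
GATEWAY = "192.168.0.1"  # ASSUMPTION: router address, not stated in the thread

Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Run a command quietly and return its exit code."""
    try:
        completed = subprocess.run(
            list(argv),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=60,
        )
        return completed.returncode
    except FileNotFoundError:
        return 127  # command not installed in this environment
    except subprocess.TimeoutExpired:
        return 124  # command did not finish in time


def check_and_recover(
    run: Runner = run_command,
    gateway: str = GATEWAY,
    interface: str = INTERFACE,
) -> str:
    """Ping the gateway once; if that fails, re-apply the IPv4 method.

    Returns "ok" when the gateway answered, "reapplied" when the update
    command was issued and exited with 0, and "failed" otherwise.
    """
    if run(["ping", "-c", "1", "-W", "2", gateway]) == 0:
        return "ok"
    # Same command that brought the instance back online earlier in the thread.
    exit_code = run(["ha", "network", "update", interface, "--ipv4-method", "auto"])
    return "reapplied" if exit_code == 0 else "failed"
```

Calling `check_and_recover()` every few minutes from whatever scheduler is available would automate the manual step. The `run` parameter exists so the logic can be exercised with a fake runner without touching the network.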

Nezz avatar Sep 08 '23 21:09 Nezz

Not sure if we have the exact same problem, but it seems at least we have had the same symptoms. I tried downgrading to 10.4 but that didn’t help. I then downgraded to 10.3, but at the same time did the following changes:

  • I disabled ipv6 in HA
  • Instead of setting up HA with a static IP, I changed the setting to Auto and instead reserved the IP address at the router

The network connection has been stable since (about 3 days and counting). I’ll give it another day or two before upgrading again to see if it was the network settings or the downgrading that did the trick.

Below is a picture of the last lines of the NetworkManager logs when network connection had dropped with OS v10.4 (how can I get the logs out in file format when the network is down?). IMG_2408

uphillbattle avatar Sep 09 '23 13:09 uphillbattle

What did not work:

  • Having a reserved IP for Home Assistant in my router and using automatic IPv4 in HA
  • Not having a reserved IP and using automatic IPv4 in HA

What worked:

  • Having a reserved IP configured in the router and setting the static IPv4 in HA

Nezz avatar Sep 13 '23 08:09 Nezz

Having a reserved IP for Home Assistant in my router and using automatic IPv4 in HA has been stable with OS 10.3 for several days. Yesterday afternoon, I upgraded the OS to 10.5 (but did not change the IP settings). It has not yet fallen off the network (after 17 hours) but it's too soon to draw any conclusion.

uphillbattle avatar Sep 13 '23 08:09 uphillbattle

Looking at the DHCP documentation, the lease should be renewed at half time (after 12 hours, assuming the standard 24-hour lease). However, this does not seem to happen, or at least NetworkManager does not log it.
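To make the expected timing concrete, here is a small sketch that computes the renewal points from the RFC 2131 defaults (T1 at 0.5 of the lease, T2 at 0.875). The 24-hour lease is the assumption stated above, the lease start is taken from the "device activated" line on Sep 03, and the function name `dhcp_timers` is made up. A DHCP server may override T1 and T2 (options 58 and 59), so the real values on a given network can differ.

```python
from datetime import datetime, timedelta
from typing import Tuple


def dhcp_timers(
    lease_start: datetime, lease_seconds: int = 86400
) -> Tuple[datetime, datetime, datetime]:
    """Return (renew, rebind, expiry) times using the RFC 2131 defaults.

    T1 (renew) defaults to 0.5 of the lease time and T2 (rebind) to 0.875.
    The 86400-second default is the assumed 24-hour lease.
    """
    lease = timedelta(seconds=lease_seconds)
    renew = lease_start + 0.5 * lease
    rebind = lease_start + 0.875 * lease
    expiry = lease_start + lease
    return renew, rebind, expiry


if __name__ == "__main__":
    # Lease assumed to start at the activation logged on Sep 03 16:17:57.
    renew, rebind, expiry = dhcp_timers(datetime(2023, 9, 3, 16, 17, 57))
    print("renew :", renew)   # renew : 2023-09-04 04:17:57
    print("rebind:", rebind)  # rebind: 2023-09-04 13:17:57
    print("expiry:", expiry)  # expiry: 2023-09-04 16:17:57
```

Under these assumptions the computed expiry lands one second after the "no lease" line logged at Sep 04 16:17:56, while the log excerpt shows nothing around the renew and rebind times.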

Nezz avatar Sep 13 '23 09:09 Nezz

For information: I have had no more problems since upgrading the OS to 10.5, so in my case it seems the IP-settings did the trick. Can’t explain why that should matter, so it’s just an observation.

uphillbattle avatar Sep 21 '23 19:09 uphillbattle

Same here. DHCP drops the connection every so often. Thanks for the workaround: a static IP in HA (and in the router, which was set anyway) did the trick. Another thing: changing the WiFi network drops the connection until DuckDNS renews the IP (5 min, maybe 10 min interval?). I'm talking about the onboard Raspberry Pi 4B WiFi on HA OS. In addition, I have another WiFi adapter plugged in via USB. I banged my head against this for 3 days. Crazy behaviour where I couldn't connect to 2.4 GHz WiFi in any way. I used nmcli about 100 times. The WiFi couldn't be found, the password wasn't delivered, and so on. The onboard WiFi (wlan0) or the SSID got blocked somehow.

Maybe this helps too: I had a bunch of lease files in var/lib/NetworkManager. Deleting them brought my WiFi up on boot in no time. Maybe this isn't related. It's late...

dingausmwald avatar Oct 04 '23 02:10 dingausmwald

FWIW, my instance started dropping the network connection again after a couple of weeks. The instance was on WiFi. I gave up and got a USB Ethernet adapter. The instance has been running without problems ever since (more than a month now). So it seems a wired network connection did the trick in my case.

uphillbattle avatar Nov 22 '23 20:11 uphillbattle

My reserved IP solution over WiFi is still working without problems. However, it'd be nice to fix this. The DHCP protocol turned 30 years old last month, and it'd be great if it worked reliably in HA.

Nezz avatar Nov 24 '23 22:11 Nezz

I have the same problem running HAOS on an x86 Gigabyte NUC GB-BXBT-207. Setting a static address plus a reserved IP in the DHCP server does not solve the problem, and the only way to get it online again is a reboot...

I am struggling to get more info to identify the problem...

According to nmcli it is still connected to the WiFi network, and ip still shows an IP address, but it cannot ping the router or 8.8.8.8, and the router does not detect it. In the router admin panel I can see the device's MAC address without an IP (it had one when freshly booted).

The problem occurs every day. What can I provide to help identify the cause of the disconnection?

dprslt avatar Jan 30 '24 20:01 dprslt

As mentioned above, my instance has not dropped out of the network since I ditched wifi and got a usb-ethernet adapter for wired network connection. More than 3 months now, without a network glitch.

Wifi is discouraged for stability reasons - the network drops may be the symptom that backs up the claim that HA OS on wifi is not sufficiently reliable.

I have no idea why the machine drops off the network when on wifi, so I can only contribute with the observation that in my case, going to wired network has solved the issue.

uphillbattle avatar Jan 31 '24 09:01 uphillbattle

Hello, I'm out of ideas. I'm having a similar issue, occasionally losing the HA network connection. The only thing that helps is switching the RPi 4 off and leaving it for about 10 minutes; when I start it again, HA comes up. Sometimes it stays up for 1 hour, sometimes 5 minutes, and sometimes 1 day. It is connected to the router over Ethernet. I changed from DHCP (auto) to a static IP, disabled IPv6, and changed the SD card. I'm running the latest version of HA. I also tried another power supply.

I ran this box for almost a year without these issues. I had this a few times a month ago, then it stopped without any changes, and now it has been back for about 2-3 weeks.

Here is the NetworkManager log (strangely, the log from 24 March also shows entries from 4 April): Image WhatsApp, 2024-03-24 at 19 35 38_64fd9a58

Datel01 avatar Mar 24 '24 21:03 Datel01

Hi @Datel01, I've had the same issue for 2-3 weeks on a Raspberry Pi 3B. HAOS seems to forget the WiFi network; however, when logging in over the LAN IP, it still knows the connection. On my phone I get an NSURL error (obviously, it can't find it), and it often says "wrong credentials". If you've found something, I'd appreciate a response; I will do the same.

l-marchesi avatar Apr 05 '24 21:04 l-marchesi

There hasn't been any activity on this issue recently. To keep our backlog manageable we have to clean old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant OS version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Jul 05 '24 05:07 github-actions[bot]