operating-system icon indicating copy to clipboard operation
operating-system copied to clipboard

Home Assistant OS fails to boot: Failed to start Docker Application Container Engine.

Open andreimoraru opened this issue 2 years ago • 9 comments

Describe the issue you are experiencing

Upon a reboot of the virtual machine running Home Assistant OS, the HA OS doesn't boot properly failing with the following log and a reboot.

Jun 02 03:33:24 homeassistant systemd[1]: Failed to start Docker Application Container Engine.

What operating system image do you use?

generic-x86-64 (Generic UEFI capable x86-64 systems)

What version of Home Assistant Operating System is installed?

8.1

Did you upgrade the Operating System.

No

Steps to reproduce the issue

  1. Reboot the virtual machine with Home Assistant OS on it
  2. Notice in the log on the screen the message
  3. homeassistant systemd[1]: Failed to start Docker Application Container Engine.
  4. The HA operating system doesn't complete the boot process and reboot.
  5. It does above steps endlessly ...

Anything in the Supervisor logs that might be useful for us?

not sure there are any useful supervisor logs as OS doesn't boot completely

Anything in the Host logs that might be useful for us?

I managed to get the full boot log by attaching the vmdk of the HA OS virtual machine to a Linux system, mounting the file system and grabbing the boot log using journalctl

-- Journal begins at Tue 2022-05-17 12:46:16 EEST, ends at Thu 2022-06-02 03:34:36 EEST. --
Jun 02 03:32:24 homeassistant kernel: Linux version 5.15.41 (builder@58360eda4381) (x86_64-buildroot-linux-gnu-gcc.br_real (Buildroot -g2eebdbad5) 9.4.0, GNU ld (GNU Binutils) 2.36.1) #1 SMP Fri May 20 00:38:27 UTC 2022
Jun 02 03:32:24 homeassistant kernel: Command line: BOOT_IMAGE=(hd0,gpt2)/bzImage root=PARTUUID=8d3d53e3-6d49-4c38-8349-aff6859e82fd rootwait zram.enabled=1 zram.num_devices=3 apparmor=1 security=apparmor systemd.machine_id=a122ee3f8d7a4adea0882248aeb31ed0 cgroup_enable=memory fsck.repair=yes console=tty1 rauc.slot=A
...
...
Jun 02 03:33:06 homeassistant systemd[1]: Reached target Timer Units.
Jun 02 03:33:06 homeassistant systemd[1]: Starting HassOS AppArmor...
Jun 02 03:33:07 homeassistant audit[450]: AVC apparmor="STATUS" operation="profile_load" profile="unconfined" name="hassio-supervisor" pid=450 comm="apparmor_parser"
Jun 02 03:33:07 homeassistant audit[450]: AVC apparmor="STATUS" operation="profile_load" profile="unconfined" name="hassio-supervisor///usr/bin/gdbus" pid=450 comm="apparmor_parser"
Jun 02 03:33:07 homeassistant audit[450]: AVC apparmor="STATUS" operation="profile_load" profile="unconfined" name="hassio-supervisor///usr/bin/git" pid=450 comm="apparmor_parser"
Jun 02 03:33:07 homeassistant kernel: audit: type=1400 audit(1654129987.181:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="hassio-supervisor" pid=450 comm="apparmor_parser"
Jun 02 03:33:07 homeassistant kernel: audit: type=1400 audit(1654129987.181:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="hassio-supervisor///usr/bin/gdbus" pid=450 comm="apparmor_parser"
Jun 02 03:33:07 homeassistant kernel: audit: type=1400 audit(1654129987.181:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="hassio-supervisor///usr/bin/git" pid=450 comm="apparmor_parser"
Jun 02 03:33:07 homeassistant kernel: audit: type=1300 audit(1654129987.181:8): arch=c000003e syscall=1 success=yes exit=38459 a0=6 a1=55dd0dde7c60 a2=963b a3=0 items=0 ppid=449 pid=450 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="apparmor_parser" exe="/usr/sbin/apparmor_parser" subj==unconfined key=(null)
Jun 02 03:33:07 homeassistant kernel: audit: type=1327 audit(1654129987.181:8): proctitle=61707061726D6F725F706172736572002D72002D57002D4C002F6D6E742F646174612F73757065727669736F722F61707061726D6F722F6361636865002F6D6E742F646174612F73757065727669736F722F61707061726D6F722F68617373696F2D73757065727669736F72
Jun 02 03:33:07 homeassistant audit[450]: SYSCALL arch=c000003e syscall=1 success=yes exit=38459 a0=6 a1=55dd0dde7c60 a2=963b a3=0 items=0 ppid=449 pid=450 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="apparmor_parser" exe="/usr/sbin/apparmor_parser" subj==unconfined key=(null)
Jun 02 03:33:07 homeassistant audit: PROCTITLE proctitle=61707061726D6F725F706172736572002D72002D57002D4C002F6D6E742F646174612F73757065727669736F722F61707061726D6F722F6361636865002F6D6E742F646174612F73757065727669736F722F61707061726D6F722F68617373696F2D73757065727669736F72
Jun 02 03:33:07 homeassistant systemd[1]: Finished HassOS AppArmor.
Jun 02 03:33:07 homeassistant systemd[1]: Starting Docker Application Container Engine...
Jun 02 03:33:09 homeassistant systemd[1]: systemd-hostnamed.service: Deactivated successfully.
Jun 02 03:33:09 homeassistant audit: BPF prog-id=0 op=UNLOAD
Jun 02 03:33:09 homeassistant kernel: audit: type=1334 audit(1654129989.296:9): prog-id=0 op=UNLOAD
Jun 02 03:33:09 homeassistant kernel: audit: type=1334 audit(1654129989.296:10): prog-id=0 op=UNLOAD
Jun 02 03:33:09 homeassistant audit: BPF prog-id=0 op=UNLOAD
Jun 02 03:33:09 homeassistant dockerd[451]: time="2022-06-02T00:33:09.800999783Z" level=info msg="Starting up"
Jun 02 03:33:09 homeassistant dockerd[451]: time="2022-06-02T00:33:09.877185015Z" level=info msg="libcontainerd: started new containerd process" pid=460
Jun 02 03:33:09 homeassistant dockerd[451]: time="2022-06-02T00:33:09.878284655Z" level=info msg="parsed scheme: \"unix\"" module=grpc
Jun 02 03:33:09 homeassistant dockerd[451]: time="2022-06-02T00:33:09.881938363Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Jun 02 03:33:09 homeassistant dockerd[451]: time="2022-06-02T00:33:09.882367021Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}" module=grpc
Jun 02 03:33:09 homeassistant dockerd[451]: time="2022-06-02T00:33:09.882904959Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Jun 02 03:33:10 homeassistant dockerd[451]: time="2022-06-02T00:33:10.888177291Z" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = \"transport: error while dialing: dial unix:///var/run/docker/containerd/containerd.sock: timeout\". Reconnecting..." module=grpc
Jun 02 03:33:11 homeassistant dockerd[460]: time="2022-06-02T00:33:11Z" level=warning msg="deprecated version : `1`, please switch to version `2`"
Jun 02 03:33:11 homeassistant dockerd[460]: time="2022-06-02T00:33:11.738606963Z" level=info msg="starting containerd" revision= version=1.5.11
Jun 02 03:33:11 homeassistant dockerd[460]: time="2022-06-02T00:33:11.976081765Z" level=info msg="loading plugin \"io.containerd.content.v1.content\"..." type=io.containerd.content.v1
Jun 02 03:33:12 homeassistant dockerd[460]: time="2022-06-02T00:33:12.013034612Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.aufs\"..." type=io.containerd.snapshotter.v1
Jun 02 03:33:12 homeassistant dockerd[460]: time="2022-06-02T00:33:12.106565736Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.aufs\"..." error="aufs is not supported (modprobe aufs failed: exit status 1 \"modprobe: FATAL: Module aufs not found in directory /lib/modules/5.15.41\\n\"): skip plugin" type=io.containerd.snapshotter.v1
Jun 02 03:33:12 homeassistant dockerd[460]: time="2022-06-02T00:33:12.126287966Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.devmapper\"..." type=io.containerd.snapshotter.v1
Jun 02 03:33:12 homeassistant dockerd[460]: time="2022-06-02T00:33:12.133892569Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.devmapper" error="devmapper not configured"
Jun 02 03:33:12 homeassistant dockerd[460]: time="2022-06-02T00:33:12.134447156Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.native\"..." type=io.containerd.snapshotter.v1
Jun 02 03:33:12 homeassistant dockerd[460]: time="2022-06-02T00:33:12.136546886Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.overlayfs\"..." type=io.containerd.snapshotter.v1
Jun 02 03:33:12 homeassistant dockerd[460]: time="2022-06-02T00:33:12.144675605Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.zfs\"..." type=io.containerd.snapshotter.v1
Jun 02 03:33:12 homeassistant dockerd[460]: time="2022-06-02T00:33:12.145168461Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.zfs\"..." error="path /mnt/data/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
Jun 02 03:33:12 homeassistant dockerd[460]: time="2022-06-02T00:33:12.145218771Z" level=info msg="loading plugin \"io.containerd.metadata.v1.bolt\"..." type=io.containerd.metadata.v1
Jun 02 03:33:12 homeassistant dockerd[460]: time="2022-06-02T00:33:12.145279620Z" level=warning msg="could not use snapshotter devmapper in metadata plugin" error="devmapper not configured"
Jun 02 03:33:12 homeassistant dockerd[460]: time="2022-06-02T00:33:12.145309855Z" level=info msg="metadata content store policy set" policy=shared
Jun 02 03:33:12 homeassistant dockerd[460]: panic: invalid freelist page: 41, page type is leaf
Jun 02 03:33:12 homeassistant dockerd[460]: goroutine 1 [running]:
Jun 02 03:33:12 homeassistant dockerd[460]: go.etcd.io/bbolt.(*freelist).read(0x1ca2b4d, 0x7fc03c504000)
Jun 02 03:33:12 homeassistant dockerd[460]:         go.etcd.io/[email protected]/freelist.go:266 +0x234
Jun 02 03:33:12 homeassistant dockerd[460]: go.etcd.io/bbolt.(*DB).loadFreelist.func1()
Jun 02 03:33:12 homeassistant dockerd[460]:         go.etcd.io/[email protected]/db.go:316 +0xab
Jun 02 03:33:12 homeassistant dockerd[460]: sync.(*Once).doSlow(0xc0001603c0, 0x10)
Jun 02 03:33:12 homeassistant dockerd[460]:         sync/once.go:68 +0xd2
Jun 02 03:33:12 homeassistant dockerd[460]: sync.(*Once).Do(...)
Jun 02 03:33:12 homeassistant dockerd[460]:         sync/once.go:59
Jun 02 03:33:12 homeassistant dockerd[460]: go.etcd.io/bbolt.(*DB).loadFreelist(0xc000160200)
Jun 02 03:33:12 homeassistant dockerd[460]:         go.etcd.io/[email protected]/db.go:309 +0x47
Jun 02 03:33:12 homeassistant dockerd[460]: go.etcd.io/bbolt.Open({0xc000174e60, 0x49}, 0x1ca212b, 0x0)
Jun 02 03:33:12 homeassistant dockerd[460]:         go.etcd.io/[email protected]/db.go:286 +0x46b
Jun 02 03:33:12 homeassistant dockerd[460]: github.com/containerd/containerd/services/server.LoadPlugins.func2(0xc0003ae380)
Jun 02 03:33:12 homeassistant dockerd[460]:         github.com/containerd/containerd/services/server/server.go:380 +0x828
Jun 02 03:33:12 homeassistant dockerd[460]: github.com/containerd/containerd/plugin.(*Registration).Init(0xc000082240, 0xc0003ae380)
Jun 02 03:33:12 homeassistant dockerd[460]:         github.com/containerd/containerd/plugin/plugin.go:110 +0x2e
Jun 02 03:33:12 homeassistant dockerd[460]: github.com/containerd/containerd/services/server.New({0x1ee5730, 0xc00011a000}, 0xc000001ba0)
Jun 02 03:33:12 homeassistant dockerd[460]:         github.com/containerd/containerd/services/server/server.go:168 +0x102b
Jun 02 03:33:12 homeassistant dockerd[460]: github.com/containerd/containerd/cmd/containerd/command.App.func1(0xc0000a8160)
Jun 02 03:33:12 homeassistant dockerd[460]:         github.com/containerd/containerd/cmd/containerd/command/main.go:179 +0x6d5
Jun 02 03:33:12 homeassistant dockerd[460]: github.com/urfave/cli.HandleAction({0x1a85840, 0x1d55450}, 0xc0001000b0)
Jun 02 03:33:12 homeassistant dockerd[460]:         github.com/urfave/[email protected]/app.go:523 +0xa8
Jun 02 03:33:12 homeassistant dockerd[460]: github.com/urfave/cli.(*App).Run(0xc0000c61c0, {0xc0001000a0, 0x5, 0x5})
Jun 02 03:33:12 homeassistant dockerd[460]:         github.com/urfave/[email protected]/app.go:285 +0x734
Jun 02 03:33:12 homeassistant dockerd[460]: main.main()
Jun 02 03:33:12 homeassistant dockerd[460]:         github.com/containerd/containerd/cmd/containerd/main.go:33 +0x37
Jun 02 03:33:12 homeassistant dockerd[451]: time="2022-06-02T00:33:12.165936012Z" level=error msg="containerd did not exit successfully" error="exit status 2" module=libcontainerd
Jun 02 03:33:13 homeassistant dockerd[451]: time="2022-06-02T00:33:13.527829710Z" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = \"transport: error while dialing: dial unix:///var/run/docker/containerd/containerd.sock: timeout\". Reconnecting..." module=grpc
Jun 02 03:33:17 homeassistant dockerd[451]: time="2022-06-02T00:33:17.574194360Z" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = \"transport: error while dialing: dial unix:///var/run/docker/containerd/containerd.sock: timeout\". Reconnecting..." module=grpc
Jun 02 03:33:23 homeassistant dockerd[451]: time="2022-06-02T00:33:23.163113215Z" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = \"transport: error while dialing: dial unix:///var/run/docker/containerd/containerd.sock: timeout\". Reconnecting..." module=grpc
Jun 02 03:33:24 homeassistant dockerd[451]: failed to start containerd: timeout waiting for containerd to start
Jun 02 03:33:24 homeassistant systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Jun 02 03:33:24 homeassistant systemd[1]: docker.service: Failed with result 'exit-code'.
Jun 02 03:33:24 homeassistant systemd[1]: Failed to start Docker Application Container Engine.
Jun 02 03:33:24 homeassistant systemd[1]: Dependency failed for Dropbear SSH daemon.
Jun 02 03:33:24 homeassistant systemd[1]: dropbear.service: Job dropbear.service/start failed with result 'dependency'.
Jun 02 03:33:24 homeassistant systemd[1]: Dependency failed for HassOS supervisor.
Jun 02 03:33:24 homeassistant systemd[1]: hassos-supervisor.service: Job hassos-supervisor.service/start failed with result 'dependency'.
Jun 02 03:33:24 homeassistant systemd[1]: Removed slice Slice /system/modprobe.
Jun 02 03:33:24 homeassistant systemd[1]: Stopped target Bluetooth Support.
Jun 02 03:33:24 homeassistant systemd[1]: Stopped target HassOS hardware targets.
Jun 02 03:33:24 homeassistant systemd[1]: Stopped target HassOS zram targets.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target Host and Network Name Lookups.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target Remote Encrypted Volumes.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target Remote Verity Protected Volumes.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target Timer Units.
Jun 02 03:33:25 homeassistant systemd[1]: fstrim.timer: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped Discard unused blocks once a week.
Jun 02 03:33:25 homeassistant systemd[1]: systemd-tmpfiles-clean.timer: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped Daily Cleanup of Temporary Directories.
Jun 02 03:33:25 homeassistant systemd[1]: systemd-coredump.socket: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Closed Process Core Dump Socket.
Jun 02 03:33:25 homeassistant bluetoothd[310]: Terminating
Jun 02 03:33:25 homeassistant systemd[1]: Stopping Bluetooth service...
Jun 02 03:33:25 homeassistant systemd[1]: Removed slice Slice /system/ha-cli.
Jun 02 03:33:25 homeassistant systemd[1]: hassos-apparmor.service: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped HassOS AppArmor.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target Network is Online.
Jun 02 03:33:25 homeassistant bluetoothd[310]: Stopping SDP server
Jun 02 03:33:25 homeassistant bluetoothd[310]: Exit
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target System Time Synchronized.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target System Time Set.
Jun 02 03:33:25 homeassistant systemd[1]: NetworkManager-wait-online.service: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped Network Manager Wait Online.
Jun 02 03:33:25 homeassistant systemd[1]: hassos-persists.service: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped HassOS persists data on bootstate.
Jun 02 03:33:25 homeassistant systemd[1]: Stopping Rauc Update Service...
Jun 02 03:33:25 homeassistant rauc[312]: stopping service
Jun 02 03:33:25 homeassistant systemd[1]: Stopping Hardware RNG Entropy Gatherer Daemon...
Jun 02 03:33:25 homeassistant rngd[314]: [rdrand]: Shutting down
Jun 02 03:33:25 homeassistant rngd[314]: [jitter]: Shutting down
Jun 02 03:33:25 homeassistant systemd[1]: Stopping User Login Management...
Jun 02 03:33:25 homeassistant systemd[1]: Stopping Load/Save Random Seed...
Jun 02 03:33:25 homeassistant systemd[1]: systemd-time-wait-sync.service: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped Wait Until Kernel Time Synchronized.
Jun 02 03:33:25 homeassistant systemd[1]: Stopping Permit User Sessions...
Jun 02 03:33:25 homeassistant systemd[1]: Stopping vmtoolsd for openvmtools...
Jun 02 03:33:25 homeassistant systemd[1]: bluetooth.service: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped Bluetooth service.
Jun 02 03:33:25 homeassistant systemd[1]: rauc.service: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped Rauc Update Service.
Jun 02 03:33:25 homeassistant systemd[1]: systemd-logind.service: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped User Login Management.
Jun 02 03:33:25 homeassistant systemd[1]: vmtoolsd.service: Main process exited, code=killed, status=9/KILL
Jun 02 03:33:25 homeassistant systemd[1]: vmtoolsd.service: Failed with result 'signal'.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped vmtoolsd for openvmtools.
Jun 02 03:33:25 homeassistant systemd[1]: rngd.service: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped Hardware RNG Entropy Gatherer Daemon.
Jun 02 03:33:25 homeassistant systemd[1]: systemd-random-seed.service: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped Load/Save Random Seed.
Jun 02 03:33:25 homeassistant systemd[1]: systemd-user-sessions.service: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped Permit User Sessions.
Jun 02 03:33:25 homeassistant systemd[1]: Requested transaction contradicts existing jobs: Transaction for docker.service/restart is destructive (etc-hosts.mount has 'stop' job queued, but 'start' is included in transaction).
Jun 02 03:33:25 homeassistant systemd[1]: docker.service: Failed to schedule restart job: Transaction for docker.service/restart is destructive (etc-hosts.mount has 'stop' job queued, but 'start' is included in transaction).
Jun 02 03:33:25 homeassistant systemd[1]: docker.service: Failed with result 'exit-code'.
Jun 02 03:33:25 homeassistant systemd[1]: Shutdown is already active. Skipping emergency action request reboot.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target Network.
Jun 02 03:33:25 homeassistant audit: BPF prog-id=0 op=UNLOAD
Jun 02 03:33:25 homeassistant audit: BPF prog-id=0 op=UNLOAD
Jun 02 03:33:25 homeassistant kernel: audit: type=1334 audit(1654130005.409:11): prog-id=0 op=UNLOAD
Jun 02 03:33:25 homeassistant kernel: audit: type=1334 audit(1654130005.410:12): prog-id=0 op=UNLOAD
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target Remote File Systems.
Jun 02 03:33:25 homeassistant NetworkManager[309]: <info>  [1654130005.4207] caught SIGTERM, shutting down normally.
Jun 02 03:33:25 homeassistant systemd[1]: Stopping Network Manager...
Jun 02 03:33:25 homeassistant systemd[1]: Stopping Network Name Resolution...
Jun 02 03:33:25 homeassistant systemd[1]: Stopping WPA supplicant...
Jun 02 03:33:25 homeassistant systemd[1]: systemd-resolved.service: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped Network Name Resolution.
Jun 02 03:33:25 homeassistant systemd[1]: wpa_supplicant.service: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped WPA supplicant.
Jun 02 03:33:25 homeassistant NetworkManager[309]: <info>  [1654130005.4708] dhcp4 (enp2s1): canceled DHCP transaction
Jun 02 03:33:25 homeassistant NetworkManager[309]: <info>  [1654130005.4708] dhcp4 (enp2s1): state changed bound -> terminated
Jun 02 03:33:25 homeassistant NetworkManager[309]: <info>  [1654130005.4879] manager: NetworkManager state is now CONNECTED_SITE
Jun 02 03:33:25 homeassistant dbus-daemon[104]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.8' (uid=0 pid=309 comm="/usr/sbin/NetworkManager --no-daemon ")
Jun 02 03:33:25 homeassistant systemd[1]: Requested transaction contradicts existing jobs: Transaction for NetworkManager-dispatcher.service/start is destructive (sysinit.target has 'stop' job queued, but 'start' is included in transaction).
Jun 02 03:33:25 homeassistant dbus-daemon[104]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.nm-dispatcher.service': Transaction for NetworkManager-dispatcher.service/start is destructive (sysinit.target has 'stop' job queued, but 'start' is included in transaction).
Jun 02 03:33:25 homeassistant NetworkManager[309]: <info>  [1654130005.5090] exiting (success)
Jun 02 03:33:25 homeassistant systemd[1]: NetworkManager.service: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped Network Manager.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target Basic System.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target Preparation for Network.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target Path Units.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target Slice Units.
Jun 02 03:33:25 homeassistant systemd[1]: Removed slice User and Session Slice.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target Socket Units.
Jun 02 03:33:25 homeassistant systemd[1]: docker.socket: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Closed Docker Socket for the API.
Jun 02 03:33:25 homeassistant systemd[1]: systemd-journal-gatewayd.socket: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Closed Journal Gateway Service Socket.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target System Initialization.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target Local Encrypted Volumes.
Jun 02 03:33:25 homeassistant systemd[1]: systemd-ask-password-console.path: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped Dispatch Password Requests to Console Directory Watch.
Jun 02 03:33:25 homeassistant systemd[1]: systemd-ask-password-wall.path: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped Forward Password Requests to Wall Directory Watch.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target HassOS overlay targets.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target Local Integrity Protected Volumes.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target Local Verity Protected Volumes.
Jun 02 03:33:25 homeassistant systemd[1]: systemd-sysctl.service: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped Apply Kernel Variables.
Jun 02 03:33:25 homeassistant systemd[1]: Stopping Network Time Synchronization...
Jun 02 03:33:25 homeassistant systemd[1]: systemd-timesyncd.service: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped Network Time Synchronization.
Jun 02 03:33:25 homeassistant systemd[1]: systemd-tmpfiles-setup.service: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped Create Volatile Files and Directories.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped target Local File Systems.
Jun 02 03:33:25 homeassistant systemd[1]: Unmounting NetworkManager persistent system connections...
Jun 02 03:33:25 homeassistant systemd[1]: Unmounting Docker persistent configuration...
Jun 02 03:33:25 homeassistant systemd[1]: Unmounting Dropbear persistent keys...
Jun 02 03:33:25 homeassistant systemd[1]: Unmounting Hostname persistent configuration...
Jun 02 03:33:25 homeassistant systemd[1]: Unmounting Hosts persistent configuration...
Jun 02 03:33:25 homeassistant systemd[1]: Unmounting /etc/machine-id...
Jun 02 03:33:25 homeassistant systemd[1]: Unmounting Kernel persistent modprobe.d...
Jun 02 03:33:25 homeassistant systemd[1]: Unmounting Kernel persistent modules-load.d...
Jun 02 03:33:25 homeassistant systemd[1]: Unmounting Timesyncd persistent configuration...
Jun 02 03:33:25 homeassistant systemd[1]: Unmounting Udev persistent rules.d...
Jun 02 03:33:25 homeassistant systemd[1]: Unmounting HassOS boot partition...
Jun 02 03:33:25 homeassistant systemd[1]: Unmounting Docker persistent user...
Jun 02 03:33:25 homeassistant systemd[1]: Unmounting Dropbear persistent user...
Jun 02 03:33:25 homeassistant systemd[1]: Unmounting Temporary Directory /tmp...
Jun 02 03:33:25 homeassistant systemd[1]: Unmounting NetworkManager persistent data...
Jun 02 03:33:25 homeassistant systemd[1]: Unmounting Bluetooth persistent state...
Jun 02 03:33:25 homeassistant systemd[1]: Unmounting Docker persistent data...
Jun 02 03:33:25 homeassistant systemd[1]: Unmounting Systemd persistent state...
Jun 02 03:33:25 homeassistant systemd[1]: [email protected]: Deactivated successfully.
Jun 02 03:33:25 homeassistant systemd[1]: Stopped Grow File System on /mnt/data.
Jun 02 03:33:25 homeassistant systemd[1]: Removed slice Slice /system/systemd-growfs.

System Health information

Can't get any logs as HA never completes the boot process

Additional information

I have access to the filesystems of the HA OS by attaching a copy of its VM's vmdk to another Linux VM and mounting the file system. Any information required can be retrieved

andreimoraru avatar Jun 02 '22 01:06 andreimoraru

Hm, that looks like Docker crashes hard:

Jun 02 03:33:12 homeassistant dockerd[460]: panic: invalid freelist page: 41, page type is leaf
Jun 02 03:33:12 homeassistant dockerd[460]: goroutine 1 [running]:
Jun 02 03:33:12 homeassistant dockerd[460]: go.etcd.io/bbolt.(*freelist).read(0x1ca2b4d, 0x7fc03c504000)
Jun 02 03:33:12 homeassistant dockerd[460]:         go.etcd.io/[email protected]/freelist.go:266 +0x234
Jun 02 03:33:12 homeassistant dockerd[460]: go.etcd.io/bbolt.(*DB).loadFreelist.func1()
Jun 02 03:33:12 homeassistant dockerd[460]:         go.etcd.io/[email protected]/db.go:316 +0xab
Jun 02 03:33:12 homeassistant dockerd[460]: sync.(*Once).doSlow(0xc0001603c0, 0x10)
Jun 02 03:33:12 homeassistant dockerd[460]:         sync/once.go:68 +0xd2
Jun 02 03:33:12 homeassistant dockerd[460]: sync.(*Once).Do(...)
Jun 02 03:33:12 homeassistant dockerd[460]:         sync/once.go:59
Jun 02 03:33:12 homeassistant dockerd[460]: go.etcd.io/bbolt.(*DB).loadFreelist(0xc000160200)
Jun 02 03:33:12 homeassistant dockerd[460]:         go.etcd.io/[email protected]/db.go:309 +0x47
Jun 02 03:33:12 homeassistant dockerd[460]: go.etcd.io/bbolt.Open({0xc000174e60, 0x49}, 0x1ca212b, 0x0)
Jun 02 03:33:12 homeassistant dockerd[460]:         go.etcd.io/[email protected]/db.go:286 +0x46b
Jun 02 03:33:12 homeassistant dockerd[460]: github.com/containerd/containerd/services/server.LoadPlugins.func2(0xc0003ae380)
Jun 02 03:33:12 homeassistant dockerd[460]:         github.com/containerd/containerd/services/server/server.go:380 +0x828
Jun 02 03:33:12 homeassistant dockerd[460]: github.com/containerd/containerd/plugin.(*Registration).Init(0xc000082240, 0xc0003ae380)
Jun 02 03:33:12 homeassistant dockerd[460]:         github.com/containerd/containerd/plugin/plugin.go:110 +0x2e
Jun 02 03:33:12 homeassistant dockerd[460]: github.com/containerd/containerd/services/server.New({0x1ee5730, 0xc00011a000}, 0xc000001ba0)
Jun 02 03:33:12 homeassistant dockerd[460]:         github.com/containerd/containerd/services/server/server.go:168 +0x102b
Jun 02 03:33:12 homeassistant dockerd[460]: github.com/containerd/containerd/cmd/containerd/command.App.func1(0xc0000a8160)
Jun 02 03:33:12 homeassistant dockerd[460]:         github.com/containerd/containerd/cmd/containerd/command/main.go:179 +0x6d5
Jun 02 03:33:12 homeassistant dockerd[460]: github.com/urfave/cli.HandleAction({0x1a85840, 0x1d55450}, 0xc0001000b0)
Jun 02 03:33:12 homeassistant dockerd[460]:         github.com/urfave/[email protected]/app.go:523 +0xa8
Jun 02 03:33:12 homeassistant dockerd[460]: github.com/urfave/cli.(*App).Run(0xc0000c61c0, {0xc0001000a0, 0x5, 0x5})
Jun 02 03:33:12 homeassistant dockerd[460]:         github.com/urfave/[email protected]/app.go:285 +0x734
Jun 02 03:33:12 homeassistant dockerd[460]: main.main()
Jun 02 03:33:12 homeassistant dockerd[460]:         github.com/containerd/containerd/cmd/containerd/main.go:33 +0x37

My best guess is that some data corruption occured. I don't have a good idea how to resolve this other than reinstalling your system :cry:

agners avatar Jun 02 '22 10:06 agners

This literally happened to me as well today. HA stopped working out of nowhere, I tried a power cycle but I've seen it bootlooping on my switch leds. Connected a display and it fails at the same error, then proceeds to unmount and stop processes and reboots. Does this forever. I was running low on space however yesterday I had 0.5GB still left, and when I'm now checking the card contents it still had 750mb free. I've seen this issue come up due to low space however even if I delete snapshots or backups, logs, and make 1,5GB free space on the card, it still boots the same. Any ideas?

beedee95 avatar Jun 02 '22 13:06 beedee95

How I fixed this

I have powered off the VM with HA OS, then copied the vmdk file in safe location to have the original

Next I have tried to work with GRUB command line of the VM running HA OS, but unfortunately could mount the file systems. The purpose was to find the boot logs and get more details about the quite generic error: Failed to start Docker application container engine.

I gave up the GRUB way and decided to attach a copy of the vmdk file of VM running HA OS to an existing virtual machine in VMware Workstation running some Debian 8.x operating system. I did that, mounted filesystem the /dev/sdb8 device on the HA OS VM vmdk in the Debian 8.x VM operating system. I used /mnt as mount point. Next was to understand where the boot log files are located on the filesystem on /dev/sdb8 device

The boot log files were located under logs/ path on the filesystem. Remember the mount point was /mnt

root@debian:~# ls /mnt/
docker/ .hassos-supervisor-version logs/ lost+found/ rauc.db supervisor/
root@debian:~# ls /mnt/logs/
27ef8442d2f44e578edc859de09ab4c8/ a122ee3f8d7a4adea0882248aeb31ed0

I learned that journalctl utility can read logs from directory (-D ) or file (--file <path to .journal file>) I tried to use journalctl utility with -D key but there were tons of logs since beginning of May. All right, let's narrow the search I learned that in the journal files there logs are somehow grouped in sequences for every boot. To say in other words, you can retrieve the logs for specific boot of the HA OS. This was useful to narrow the logs. To see the list of boot events of HA OS use the following:

root@debian:~# journalctl -D /mnt/logs/a122ee3f8d7a4adea0882248aeb31ed0/ --list-boots

But the journalctl returned an error about unkown protocol. So i supposed the version of the journald that was used to collect the logs on HA OS virtual machine is not supported by the older version journalctl on my Debian 8.x VM. So I went and built another VM with latest version of Debian then attached the copy of vmdk and mounted /dev/sdb8 device. Now the journalctl utility returned below:

root@debian:~# journalctl -D /mnt/logs/a122ee3f8d7a4adea0882248aeb31ed0/ --list-boots
-37 9dff024fcdf74a178ead9e5cb2410dd6 Tue 2022-05-17 12:46:16 EEST—Tue 2022-05-24 08:24:07 EEST
-36 87dd47a0164e4695b079664a8c05e157 Tue 2022-05-24 08:24:14 EEST—Wed 2022-06-01 23:28:14 EEST
...
-12 68095474bf9542628d590fd7d069ffcc Thu 2022-06-02 00:38:10 EEST—Thu 2022-06-02 00:38:59 EEST
-11 5f05dd94b9644c5a85842909abdc5c8b Thu 2022-06-02 00:39:04 EEST—Thu 2022-06-02 00:39:53 EEST
-10 ce803b0659b542539db81a2e1160130e Thu 2022-06-02 00:39:58 EEST—Thu 2022-06-02 00:40:48 EEST
-9 c8b836a0b5234271b9e69e18be5df66c Thu 2022-06-02 00:40:53 EEST—Thu 2022-06-02 00:41:42 EEST
-8 243a7a5e49374beda0817d8cf8f1c0fa Thu 2022-06-02 00:41:47 EEST—Thu 2022-06-02 00:42:36 EEST
-7 a5fa3a278ebc4d5c813d233b4d906548 Thu 2022-06-02 00:42:40 EEST—Thu 2022-06-02 00:43:30 EEST
-6 e0f74533396349f884d2df336241f438 Thu 2022-06-02 00:43:35 EEST—Thu 2022-06-02 00:44:24 EEST
...
-2 07f0f1b6290749b4bbece4d672f7b0c0 Thu 2022-06-02 00:47:12 EEST—Thu 2022-06-02 00:48:01 EEST
-1 30a8c20c94264d1795794335f504d399 Thu 2022-06-02 01:10:17 EEST—Thu 2022-06-02 01:11:08 EEST
0 c55a09c569db4849ac31113a36550244 Thu 2022-06-02 01:21:19 EEST—Thu 2022-06-02 01:21:48

I truncated the output above for readability purpose Next, I want to know the boot logs for the boot attempt before previous attempt. I use again the journalctl utility by passing the tag of boot attempt to -b key and I redirect the output to a file (because we need evidence when we open a bug for HA OS, right? https://github.com/home.../operating-system/issues/1960 )

journalctl -D /mnt/logs/a122ee3f8d7a4adea0882248aeb31ed0/ -b d6739fdb3ee14714b5c27b6d0501da1c > failed_boot_03-33.txt

So now we have the failed boot attempt logs. Let's inspect it We have the record of error seen in the HA OS console briefly

Jun 02 03:33:24 homeassistant systemd[1]: Failed to start Docker Application Container Engine.

Let's see above that line, we have a docker related panic

Jun 02 03:33:12 homeassistant dockerd[460]: panic: invalid freelist page: 41, page type is leaf

Hmm....😕

We focus on few lines Jun 02 03:33:24 homeassistant dockerd[451]: failed to start containerd: timeout waiting for containerd to start Containerd failed to start. Why?

Jun 02 03:33:24 homeassistant systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Jun 02 03:33:24 homeassistant systemd[1]: docker.service: Failed with result 'exit-code'.

Docker service failed to start due to containerd startup failure. Ok

After docker service fails to start the HA OS begin to reboot. But why? The answer is on the /dev/sdb5 filesystem in /mnt/etc/systemd/system/docker.service.d/failure.conf file

root@debian:~# cat /mnt/etc/systemd/system/docker.service.d/failure.conf
[Unit]
FailureAction=reboot

All right, now it makes sense why the boot loop happens. I tried to change the file content to: FailureAction=none but the filesystem on /dev/sdb5 is read-only. I gave up on searching solutions to make it read-write in order to adjust for failure.conf file

root@debian:~# mount
...
/dev/sdb5 on /mnt type squashfs (ro,relatime)

I went back and looked one more time at the error.

Jun 02 03:33:12 homeassistant dockerd[460]: panic: invalid freelist page: 41, page type is leaf
a couple of lines earlier in the log, there are the following lines
Jun 02 03:33:12 homeassistant dockerd[460]: time="2022-06-02T00:33:12.145218771Z" level=info msg="loading plugin \"io.containerd.metadata.v1.bolt\"..." type=io.containerd.metadata.v1
Jun 02 03:33:12 homeassistant dockerd[460]: time="2022-06-02T00:33:12.145279620Z" level=warning msg="could not use snapshotter devmapper in metadata plugin" error="devmapper not configured"
Jun 02 03:33:12 homeassistant dockerd[460]: time="2022-06-02T00:33:12.145309855Z" level=info msg="metadata content store policy set" policy=shared
So it seems there something wrong with io.containerd.metadata.v1 directory and/or files in that directory
This could be issue why containerd service failed to start, thus failing my Docker application service and causing the boot loop
Jun 02 03:33:12 homeassistant dockerd[460]: [github.com/containerd/containerd/cmd/containerd/main.go:33](https://github.com/containerd/containerd/cmd/containerd/main.go:33
Jun 02 03:33:12 homeassistant dockerd[451]: time="2022-06-02T00:33:12.165936012Z" level=error msg="containerd did not exit successfully" error="exit status 2" module=libcontainerd

I found the meta.db file inside the path on /dev/sdb8 device filesystem

root@debian:~# ls -la /mnt/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db
-rw-r--r-- 1 root root 524288 Jun 1 23:30 /mnt/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db

There are some information here about issues with meta.db https://github.com/containerd/containerd/issues/3347 So I needed Go and the bbolt utility to check the meta.db

I went and installed Go

https://www.itzgeek.com/how-tos/linux/debian/how-to-install-go-lang-on-debian-11-debian-10.html

I made a copy ot meta.db and started playing

root@debian:~# cp /mnt/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db .
root@debian:~# /root/go/bin/bbolt check meta.db
panic: invalid freelist page: 41, page type is leaf
goroutine 1 [running]:
[go.etcd.io/bbolt.(*freelist).read](http://go.etcd.io/bbolt.(*freelist).read?fbclid=IwAR00eghnyXYytbJ6eNkSxyrraEyNwUbEwCYwfID4Hcy2UCFeyqUY7rRAzYs)(0x505ac7, 0x7ff3baa8d000)
/root/go/pkg/mod/go.etcd.io/[email protected]/freelist.go:266 +0x234
[go.etcd.io/bbolt.(*DB).loadFreelist.func1()](http://go.etcd.io/bbolt.(*DB).loadFreelist.func1()?fbclid=IwAR0gdMvEs7exFasSh9V2UAR8u3hIYBGb4tc0Kapq8iFxNE0DrlQalAPxmrU)
/root/go/pkg/mod/go.etcd.io/[email protected]/db.go:323 +0xae
sync.(*Once).doSlow(0xc0000801c8, 0x10)
/usr/local/go/src/sync/once.go:68 +0xd2
sync.(*Once).Do(...)
/usr/local/go/src/sync/once.go:59
[go.etcd.io/bbolt.(*DB).loadFreelist(0xc000080000)](http://go.etcd.io/bbolt.(*DB).loadFreelist(0xc000080000)?fbclid=IwAR1zFc3kHN4LsPidcUVNMS0F_h6qXARC41Pgb9JJExYrP4jtoBcXXKn8ITY)
/root/go/pkg/mod/go.etcd.io/[email protected]/db.go:316 +0x47
[go.etcd.io/bbolt.Open](http://go.etcd.io/bbolt.Open?fbclid=IwAR063oD_biOg4bzjuE1OyXizChF010l9SlzpNm5OETxD6rT_DQZ9KxIPRx0)({0x7fffd2514eee, 0x7}, 0x1b6, 0x0)
/root/go/pkg/mod/go.etcd.io/[email protected]/db.go:293 +0x46b
main.(*CheckCommand).Run(0xc00008de58, {0xc0000101a0, 0x1, 0x1})
/root/go/pkg/mod/go.etcd.io/[email protected]/cmd/bbolt/main.go:202 +0x1a5
main.(*Main).Run(0xc00005cf40, {0xc000010190, 0x2, 0x2})
/root/go/pkg/mod/go.etcd.io/[email protected]/cmd/bbolt/main.go:112 +0x993
main.main()
/root/go/pkg/mod/go.etcd.io/[email protected]/cmd/bbolt/main.go:70 +0xb4

Above output is very similar if not even the same with the one in the journalctl boot attempt log

All right. This is the issue. The meta.db file is corrupted. How do i fix it?

I tried

root@debian:~# /root/go/bin/bbolt compact -o fixed.db meta.db
panic: invalid freelist page: 41, page type is leaf

goroutine 1 [running]:
go.etcd.io/bbolt.(*freelist).read(0x505ac7, 0x7fa99d224000)
        /root/go/pkg/mod/go.etcd.io/[email protected]/freelist.go:266 +0x234
go.etcd.io/bbolt.(*DB).loadFreelist.func1()
        /root/go/pkg/mod/go.etcd.io/[email protected]/db.go:323 +0xae
sync.(*Once).doSlow(0xc0000801c8, 0x10)
        /usr/local/go/src/sync/once.go:68 +0xd2
sync.(*Once).Do(...)
        /usr/local/go/src/sync/once.go:59
go.etcd.io/bbolt.(*DB).loadFreelist(0xc000080000)
        /root/go/pkg/mod/go.etcd.io/[email protected]/db.go:316 +0x47
go.etcd.io/bbolt.Open({0x7ffebab96eee, 0x7}, 0x124, 0x0)
        /root/go/pkg/mod/go.etcd.io/[email protected]/db.go:293 +0x46b
main.(*CompactCommand).Run(0xc0000441e0, {0xc0000100c0, 0x3, 0x3})
        /root/go/pkg/mod/go.etcd.io/[email protected]/cmd/bbolt/main.go:1982 +0x2e5
main.(*Main).Run(0xc00005af40, {0xc0000100b0, 0x4, 0x4})
        /root/go/pkg/mod/go.etcd.io/[email protected]/cmd/bbolt/main.go:114 +0x6f6
main.main()
        /root/go/pkg/mod/go.etcd.io/[email protected]/cmd/bbolt/main.go:70 +0xb4

All right. What should I do? I saved the original meta.db to another location on the filesystem on Debian virtual machine, then removed it from original location.

root@debian:~# rm  /mnt/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db

I read that Docker will recreate it, however was not sure

I powered off the Debian virtual machine, attached the vmdk to HA OS virtual machine, powered up and ... HA OS booted succcessfully

The issue I discovered after HA OS boot was that no add-ons were listed in the Add-ons in Home Assistance web ui

andreimoraru avatar Jun 02 '22 14:06 andreimoraru

I restored Add-ons from most up to date backup, then updated them. Although I can see them in the web page of Home Assistant, in the companion app on the phone I get an error when I open the add-on

andreimoraru avatar Jun 02 '22 14:06 andreimoraru

This literally happened to me as well today. HA stopped working out of nowhere, I tried a power cycle but I've seen it bootlooping on my switch leds. Connected a display and it fails at the same error, then proceeds to unmount and stop processes and reboots. Does this forever. I was running low on space however yesterday I had 0.5GB still left, and when I'm now checking the card contents it still had 750mb free. I've seen this issue come up due to low space however even if I delete snapshots or backups, logs, and make 1,5GB free space on the card, it still boots the same. Any ideas?

please see my above reply, maybe will be useful

andreimoraru avatar Jun 02 '22 14:06 andreimoraru

Hello, thanks for the suggestion, however it looks kinda complicated to me + I'm not on a VM, this is a rpi3 with an sd card. Does that change anything?

beedee95 avatar Jun 02 '22 14:06 beedee95

I think you can pull up the SD card from RPI and grab a copy of it using whatever tool you're comfortable with. Then on your laptop, attach the copy to a VM with Linux or if you're using Linux or Mac you can mount the filesystem from that SD copy and play along. See what you can do. If you find the solution, then you can make the changes back to SD and test.

andreimoraru avatar Jun 02 '22 20:06 andreimoraru

I am having a similar issue however I am running on Virtualbox VM. Getting stuck with:

A start job is running for Docker Application Container Engine

Boot stalls and continues endlessly without a timeout with this message. I had this issue with 8.0 as well, rechecked VM settings, and eventually, after multiple manual reboots, booted and worked fine from that point on. Wondering if the database improvements with 2022.6 are causing some confusion/corruption on VM systems?

sjmotew avatar Jun 03 '22 15:06 sjmotew

I believe I'm having the same issue, or at least the same error message. This has happened at least one other time because I had to reinstall from scratch and restore from a backup. In both cases, this was during bad storms where we get intermittent power outages, so the VM was shutdown improperly and shutdown during the next attempt at booting up. In both cases I got the error "Failed to start Wait Until Kernel Time Synchronized", and then after a few more successful tasks, the hold up at "A Start job is running for Docker Applicaction Container Engine.

image

This is on a Windows 10 Pro machine, and it's the same error on my current version of VirtualBox [Version 6.1.34 r150636 (Qt5.6.2)] as on the previous version before I updated minutes ago...I want to say 6.04 something? Regardless, it was more than a few versions ago.

drphungky avatar Jul 12 '22 23:07 drphungky

There hasn't been any activity on this issue recently. To keep our backlog manageable we have to clean old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant OS version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Oct 11 '22 00:10 github-actions[bot]

I'm persistently having this problem even with a clean image of Home Assistant in VirtualBox. It is beyond frustrating.

Liqueseous avatar Oct 16 '22 21:10 Liqueseous

Having the same issue fairly consistently, also using W10 Pro with a virtualbox VM - Almost always have the issue every few weeks, to which my solution is usually just completely fresh HASS, but that's not exactly practical. Tends to follow some kind of crash of the host, or other interruption to the VM execution. Don't know whether it's due to DACE leaving itself in a unsafe state or smthn.

NeonCoding avatar Nov 23 '22 09:11 NeonCoding

@NeonCoding what kind of virtual disk are you using?

agners avatar Nov 23 '22 13:11 agners

@NeonCoding what kind of virtual disk are you using?

VDI, downloaded from the "Windows" page of installation guides. It was a backed-up version, but again the VM runs fine up until some kind of crash or kernel panic causes the loop.

NeonCoding avatar Nov 23 '22 13:11 NeonCoding

Same, and when I have to reset everything after a crash, I often update to whatever the latest image is on the home assistant website (not convinced that's updated very often though - the documentation sometimes get left behind).

drphungky avatar Nov 23 '22 13:11 drphungky

Did you find any solution, third time I need to recreate it... It's becoming a pain. Running Virtualbox on Windows 10, I just stopped the VM, updated windows, rebooted and now stuck on a start job is running for docker... image

lbouriez avatar Dec 01 '22 16:12 lbouriez

This issue for me was caused by the VM not finishing its proper shutdown procedures via an ACPI Shutdown. Power off does not shutdown the system properly.

The ACPI shutdown is similar to using the shutdown button in Windows where the OS goes through proper procedures as Poweroff is more like pulling the plug.

My only "solution" for this issue occurring on a host shutdown and I use "solution" loosely was to use a program called VBoxHeadless Tray which can run a VirtualBox instance in headless mode and will safely suspend a virtual machine if the host goes through shutdown procedures.

This helped alleviate the issues when the host system would restart or shutdown. This will not help with improper shutdowns of the host system such as loss of power.

There was another issue I encountered and haven't overcome yet and that was that after a few days Home Assistant was no longer accessible via the browser and I had to either suspend and unsuspend the VM to reset the network connection or restart the VM to get it working again.

Liqueseous avatar Dec 01 '22 21:12 Liqueseous

Thanks for your answer, I finished by restore again a backup. A poweroff, even if not good shouldnt break HA, it's a real issue :/ Imagine if windows die every times we power off by pressing the button on the laptop...

lbouriez avatar Dec 01 '22 21:12 lbouriez

I'd also suggest using the Google Drive backup as a mitigation strategy. It's annoying to have to reinstall everything "hardware" wise, but at least the image makes restoring fast. I've used mine a bunch.

drphungky avatar Dec 01 '22 21:12 drphungky

I think the issue issue came to an end on my side. The VirtualBox machine running Home Assistant was on top of Intel NUC with two DDR3L memory cards of 8GB each. The operating system on the Intel NUC itself behaved strangely for a while (12+ months) - random BSODs, power offs etc.

I suspected one or both DDR3L memory cards to be corrupt and the Memtest86 check showed that one out of two memory cards had failed addresses. See attached screenshot testing one out of two DDR3L memory chips

I replaced the failing RAM memory chip with another one and the none of aforementioned issues ever occurred.

F8E77EC6-0246-4F4A-B506-032966C00E3E

andreimoraru avatar Dec 02 '22 12:12 andreimoraru

Still same problem in Windos11 Virtual box brand new installation. 圖片

moonstarsky37 avatar Dec 16 '22 08:12 moonstarsky37

There hasn't been any activity on this issue recently. To keep our backlog manageable we have to clean old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant OS version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Mar 16 '23 08:03 github-actions[bot]