Home Assistant 8.3 and 8.4 broken on Virtualbox and Hyper-V

Open scarolan opened this issue 2 years ago • 4 comments

Describe the issue you are experiencing

See here for background: https://community.home-assistant.io/t/supervisor-wont-start-in-virtualbox-vm/447644/5

I attempted to run Home Assistant in both VirtualBox and Hyper-V. The homeassistant container never starts with versions 8.3 and 8.4, but it starts up fine with version 8.2. It appears that a regression was introduced somewhere between 8.2 and 8.3.

What operating system image do you use?

ova (for Virtual Machines)

What version of Home Assistant Operating System is installed?

8.2

Did you upgrade the Operating System?

No

Steps to reproduce the issue

  1. Install VirtualBox or Hyper-V.
  2. Follow the installation instructions here: https://www.home-assistant.io/installation/windows
  3. Start the VM. Observe how the 'homeassistant' container never starts, and nothing is listening on port 8123.
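
The "nothing is listening on port 8123" check in step 3 can be sketched from the host machine as a plain TCP connect test. This is a minimal sketch, not part of the original report: the function name `is_port_open` is hypothetical, and the hostname `homeassistant.local` in the usage note is an assumption (substitute the VM's actual IP address).

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: a refused connection, a timeout, or a failed
    name lookup all count as "nothing is listening".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # ConnectionRefusedError, socket.timeout and socket.gaierror
        # are all subclasses of OSError.
        return False
```

For example, `is_port_open("homeassistant.local", 8123)` returning `False` several minutes after the VM has booted would match the symptom described here.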

Anything in the Supervisor logs that might be useful for us?

See community forum thread

Anything in the Host logs that might be useful for us?

See community forum thread

System Health information

No response

Additional information

No response

scarolan, Aug 07 '22 22:08

I just tried v8.2, v8.4 and v8.5 and all run fine on Hyper-V (Windows 10 Pro) for me. I didn't try VirtualBox.

jant90, Aug 22 '22 14:08

Maybe someone can test this on Windows 11. I suspect something might have changed between Win10 and Win11.

scarolan, Aug 22 '22 14:08

Upgrading a VirtualBox VM running Hass OS from 8.1 to 8.4 now causes it not to boot on a Windows 10 host.

It now gets stuck at "Starting Docker Application Container Engine"; the console says a start job has been running for 20 minutes.

I can't access the Supervisor (port 4357), SSH, etc.

I can get a prompt on the second virtual console (Alt+F2), but running docker ps hangs.

I'm still running VirtualBox 6.1.30, because the Hass OS VM wouldn't boot with 6.1.32 or 6.1.34 due to a VirtualBox bug.

I'm currently trying to figure out how to recover this VM (roll back to 8.1) and how to get any useful debug info off the VM.

I was able to catch the kernel messages about an "Oops" when booting Hass OS 8.4:

[image: screenshot of the kernel Oops messages]

rct, Sep 10 '22 16:09

When I see the problem where HassOS fails to complete booting (because dockerd/containerd never finishes starting up), there is a kernel Oops. I copied the dmesg output so I could post it here:

[    8.350354] BUG: unable to handle page fault for address: fffffe0000001004
[    8.353844] #PF: supervisor write access in kernel mode
[    8.356642] #PF: error_code(0x0003) - permissions violation
[    8.356644] PGD 11ffec067 P4D 11ffec067 PUD 11ffea067 PMD 11ffe9067 PTE 800000011bc0b161
[    8.356647] Oops: 0003 [#1] SMP PTI
[    8.356649] CPU: 0 PID: 366 Comm: modprobe Not tainted 5.15.55 #1
[    8.356651] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[    8.356652] RIP: 0010:do_sync_core+0x1b/0x20
[    8.356656] Code: ef fe ff ff e8 66 4d d2 00 66 0f 1f 44 00 00 eb 07 0f 1f 00 0f 01 e8 c3 8c d0 50 54 48 83 04 24 08 9c 8c c8 50 68 6d ac 43 87 <48> cf c3 66 90 41 57 41 56 41 55 41 54 55 53 48 83 ec 38 65 48 8b
[    8.356658] RSP: 0000:ffffc313c14d3af0 EFLAGS: 00010086
[    8.356659] RAX: 0000000000000010 RBX: 0000000000000001 RCX: 0000000000000000
[    8.356660] RDX: 000000000000000f RSI: ffffa0f35bc2ce88 RDI: 0000000000000000
[    8.356661] RBP: 0000000000000246 R08: 0000000000000040 R09: 0000000000000000
[    8.356662] R10: ffffa0f35bc2bf40 R11: 0000000000000462 R12: 0000000000000000
[    8.356663] R13: ffffa0f35bc2ce80 R14: ffffa0f35bc2ce80 R15: ffffa0f35bc2ce88
[    8.356664] FS:  00007fc98fc9ac00(0000) GS:ffffa0f35bc00000(0000) knlGS:0000000000000000
[    8.356665] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    8.356666] CR2: fffffe0000001004 CR3: 000000010804a003 CR4: 00000000000306f0
[    8.356668] Call Trace:
[    8.356669]  <TASK>
[    8.356670]  ? do_sync_core+0x1d/0x20
[    8.356672]  smp_call_function_many_cond+0xd5/0x2a0
[    8.356676]  ? optimize_nops.isra.0+0x210/0x210
[    8.356677]  on_each_cpu_cond_mask+0x19/0x20
[    8.356680]  text_poke_bp_batch+0x239/0x280
[    8.356681]  ? nl80211_del_tx_ts+0xc6/0x110 [cfg80211]
[    8.356715]  ? __traceiter_rdev_add_tx_ts+0x80/0x80 [cfg80211]
[    8.447525]  ? nl80211_del_tx_ts+0xc6/0x110 [cfg80211]
[    8.447579]  text_poke_bp+0x3f/0x60
[    8.447584]  arch_static_call_transform+0x6e/0x80
[    8.447589]  __static_call_init.part.0+0x156/0x200
[    8.447593]  static_call_module_notify+0x64/0x180
[    8.447596]  notifier_call_chain_robust+0x55/0xb0
[    8.447600]  blocking_notifier_call_chain_robust+0x38/0x50
[    8.447603]  load_module+0x1ded/0x26b0
[    8.447607]  ? __do_sys_finit_module+0xa0/0xe0
[    8.447609]  __do_sys_finit_module+0xa0/0xe0
[    8.447612]  do_syscall_64+0x3b/0x90
[    8.447614]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[    8.447617] RIP: 0033:0x7fc98fda3d49
[    8.483534] Code: 08 44 89 e0 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 9f f0 0e 00 f7 d8 64 89 01 48
[    8.483538] RSP: 002b:00007ffc5a799548 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    8.483540] RAX: ffffffffffffffda RBX: 0000556386416ab0 RCX: 00007fc98fda3d49
[    8.483541] RDX: 0000000000000000 RSI: 0000556385d19368 RDI: 0000000000000000
[    8.483542] RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
[    8.483542] R10: 0000000000000000 R11: 0000000000000246 R12: 0000556385d19368
[    8.483543] R13: 0000000000000000 R14: 00005563864168c0 R15: 0000556386416ab0
[    8.483545]  </TASK>
[    8.483546] Modules linked in: cfg80211(+)
[    8.483549] CR2: fffffe0000001004
[    8.483551] ---[ end trace 1849852808246020 ]---
[    8.483551] RIP: 0010:do_sync_core+0x1b/0x20
[    8.483555] Code: ef fe ff ff e8 66 4d d2 00 66 0f 1f 44 00 00 eb 07 0f 1f 00 0f 01 e8 c3 8c d0 50 54 48 83 04 24 08 9c 8c c8 50 68 6d ac 43 87 <48> cf c3 66 90 41 57 41 56 41 55 41 54 55 53 48 83 ec 38 65 48 8b
[    8.483556] RSP: 0000:ffffc313c14d3af0 EFLAGS: 00010086
[    8.483557] RAX: 0000000000000010 RBX: 0000000000000001 RCX: 0000000000000000
[    8.483558] RDX: 000000000000000f RSI: ffffa0f35bc2ce88 RDI: 0000000000000000
[    8.483559] RBP: 0000000000000246 R08: 0000000000000040 R09: 0000000000000000
[    8.483560] R10: ffffa0f35bc2bf40 R11: 0000000000000462 R12: 0000000000000000
[    8.483560] R13: ffffa0f35bc2ce80 R14: ffffa0f35bc2ce80 R15: ffffa0f35bc2ce88
[    8.483561] FS:  00007fc98fc9ac00(0000) GS:ffffa0f35bc00000(0000) knlGS:0000000000000000
[    8.483563] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    8.483563] CR2: fffffe0000001004 CR3: 000000010804a003 CR4: 00000000000306f0
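
The key fields of the trace above can be pulled out with a few regular expressions. This is a rough sketch for reading logs like this one, not a general Oops parser: the names `summarize_oops`, `decode_pf_error_code` and the dictionary keys are hypothetical, and only the low three bits of the x86 page-fault error code are decoded. The sample lines are copied from the dmesg output above.

```python
import re

# Low three bits of the x86 page-fault error code:
# (mask, meaning when set, meaning when clear)
PF_BITS = (
    (0x1, "protection violation", "page not present"),
    (0x2, "write access", "read access"),
    (0x4, "user mode", "kernel (supervisor) mode"),
)

PATTERNS = {
    "fault_address": r"page fault for address: ([0-9a-f]+)",
    "error_code": r"error_code\((0x[0-9a-f]+)\)",
    "comm": r"Comm: (\S+)",
    "kernel": r"Comm: .* (\d+\.\d+\S*) #\d+",
    # Matches only a symbolic RIP (e.g. do_sync_core+0x1b/0x20),
    # not the raw user-space address that appears later in the trace.
    "rip_symbol": r"RIP: [0-9a-f]{4}:([A-Za-z_][\w.]*)\+0x",
}


def decode_pf_error_code(code):
    """Describe the low three bits of a page-fault error code."""
    return [if_set if code & mask else if_clear
            for mask, if_set, if_clear in PF_BITS]


def summarize_oops(text):
    """Extract a few headline fields from a kernel Oops log."""
    info = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        info[key] = match.group(1) if match else None
    mods = re.search(r"Modules linked in: (.*)", text)
    names = mods.group(1).split() if mods else []
    # A "(+)" suffix marks a module that was still being loaded.
    info["modules_loading"] = [n[:-3] for n in names if n.endswith("(+)")]
    if info["error_code"]:
        info["error_meaning"] = decode_pf_error_code(int(info["error_code"], 16))
    return info


SAMPLE_OOPS = """\
[    8.350354] BUG: unable to handle page fault for address: fffffe0000001004
[    8.356642] #PF: error_code(0x0003) - permissions violation
[    8.356649] CPU: 0 PID: 366 Comm: modprobe Not tainted 5.15.55 #1
[    8.356652] RIP: 0010:do_sync_core+0x1b/0x20
[    8.483546] Modules linked in: cfg80211(+)
"""
```

Calling `summarize_oops(SAMPLE_OOPS)` on these lines picks out `do_sync_core` as the faulting function, `modprobe` as the running command, and `cfg80211` as the module that was being loaded when the write fault occurred in kernel mode.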

rct, Sep 11 '22 16:09

There hasn't been any activity on this issue recently. To keep our backlog manageable we have to clean old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant OS version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot], Dec 10 '22 17:12