Apple Virtual machine keeps setting itself as read-only

Open Git-North opened this issue 1 year ago • 95 comments

UTM version: 4.1.2 (Beta). Ubuntu version: 23.04 (Lunar Lobster). Apple Virtualization with Rosetta 2 enabled.

None of the disks are set as "read only" inside UTM. The VM sometimes works for seconds, sometimes for minutes, but the problem always happens. The errors usually contain something like "error: read-only file system", but the exact message varies from command to command.
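
A quick way to confirm from inside the guest that the root filesystem really has been remounted read-only, and to capture the first kernel error behind it (a minimal sketch; exact log strings and levels vary by distro):

findmnt -no OPTIONS /                                   # look for "ro" among the mount options
sudo dmesg --level=err,crit,alert,emerg | head -n 50    # first kernel errors since boot
sudo journalctl -k -b | grep -iE 'remount|read-only|i/o error' | head -n 20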

Git-North avatar Dec 24 '22 20:12 Git-North

Do you cleanly shut down the VM every time?

ktprograms avatar Dec 25 '22 10:12 ktprograms

I always use the "shutdown now" command in the terminal

Git-North avatar Dec 27 '22 09:12 Git-North

What kind of commands do you run that cause it?

ktprograms avatar Dec 28 '22 01:12 ktprograms

It happens after I finish the setup process (installing from the ISO and rebooting). I first got the error while using nala, a package manager for Ubuntu. I got it while using pacstall as well, on a different VM. Finally I experimented: after setting up a fresh VM and waiting a few minutes, I got the error while running "mkdir", having run no commands prior.

Git-North avatar Jan 02 '23 16:01 Git-North

Is there anything interesting/weird in the dmesg?

ktprograms avatar Jan 03 '23 01:01 ktprograms

My VM no longer seems to boot. I will share the results after I create one again (I will most likely get the same results, since this is my 7th VM at this point).

Git-North avatar Jan 03 '23 08:01 Git-North

I had / still have the same problem. The kernel log shows that the kernel detects an incorrect inode checksum on /dev/vda2 (the ext4 system partition) and therefore remounts the file system as read-only. Shortly after that the kernel oopses (Debian with kernel 5.10.158):

Jan 12 12:03:57 debian kernel: Internal error: Oops - BUG: 0 [#1] SMP
Jan 12 12:03:57 debian kernel: Modules linked in: uinput rfkill joydev binfmt_misc nls_ascii nls_cp437 vfat fat aes_ce_blk crypto_simd hid_generic cryptd aes_ce_>
Jan 12 12:03:57 debian kernel: CPU: 5 PID: 244 Comm: hwrng Not tainted 5.10.0-20-arm64 #1 Debian 5.10.158-2
Jan 12 12:03:57 debian kernel: Hardware name: Apple Inc. Apple Virtualization Generic Platform, BIOS 1916.60.2.0.0 11/04/2022
Jan 12 12:03:57 debian kernel: pstate: 00400005 (nzcv daif +PAN -UAO -TCO BTYPE=--)
Jan 12 12:03:57 debian kernel: pc : do_undefinstr+0x2e0/0x2f0
Jan 12 12:03:57 debian kernel: lr : do_undefinstr+0x180/0x2f0
Jan 12 12:03:57 debian kernel: sp : ffff800013143c40
Jan 12 12:03:57 debian kernel: x29: ffff800013143c40 x28: ffff0000c0a59e80 
Jan 12 12:03:57 debian kernel: x27: ffff0000c105bd48 x26: ffff800011a7dea8 
Jan 12 12:03:57 debian kernel: x25: 0000000000000000 x24: ffff800008db8320 
Jan 12 12:03:57 debian kernel: x23: 0000000060c00005 x22: ffff800008db64b4 
Jan 12 12:03:57 debian kernel: x21: ffff800013143e10 x20: 0000000000000000 
Jan 12 12:03:57 debian kernel: x19: ffff800013143cc0 x18: 0000000000000000 
Jan 12 12:03:57 debian kernel: x17: 0000000000000000 x16: 0000000000000000 
Jan 12 12:03:57 debian kernel: x15: ffff800008db64b4 x14: 0000000000000000 
Jan 12 12:03:57 debian kernel: x13: 0000000000000000 x12: 0000000000000000 
Jan 12 12:03:57 debian kernel: x11: 0000000000000000 x10: fcb411eb11aa9ff1 
Jan 12 12:03:57 debian kernel: x9 : ffff8000103949e0 x8 : ffff0000c0a5ad98 
Jan 12 12:03:57 debian kernel: x7 : ffff80021da4e000 x6 : 0000000000000000 
Jan 12 12:03:57 debian kernel: x5 : ffff800011825f80 x4 : 00000000d503403f 
Jan 12 12:03:57 debian kernel: x3 : 0000000000000000 x2 : ffff800011a740f0 
Jan 12 12:03:57 debian kernel: x1 : 0000000000000000 x0 : 0000000060c00005 
Jan 12 12:03:57 debian kernel: Call trace:
Jan 12 12:03:57 debian kernel:  do_undefinstr+0x2e0/0x2f0
Jan 12 12:03:57 debian kernel:  el1_undef+0x2c/0x4c
Jan 12 12:03:57 debian kernel:  el1_sync_handler+0x8c/0xd0
Jan 12 12:03:57 debian kernel:  el1_sync+0x88/0x140
Jan 12 12:03:57 debian kernel:  hwrng_fillfn+0x130/0x1e0 [rng_core]
Jan 12 12:03:57 debian kernel:  kthread+0x12c/0x130
Jan 12 12:03:57 debian kernel:  ret_from_fork+0x10/0x30
Jan 12 12:03:57 debian kernel: Code: 33103e80 2a0003f4 17ffffa6 f90013f5 (d

Another time the kernel produced this oops:

Jan 12 12:03:57 debian kernel: WARNING: CPU: 5 PID: 0 at kernel/rcu/tree.c:624 rcu_eqs_enter.constprop.0+0x74/0x7c
Jan 12 12:03:57 debian kernel: Modules linked in: uinput rfkill joydev binfmt_misc nls_ascii nls_cp437 vfat fat aes_ce_blk crypto_simd hid_generic cryptd aes_ce_>
Jan 12 12:03:57 debian kernel: CPU: 5 PID: 0 Comm: swapper/5 Tainted: G      D           5.10.0-20-arm64 #1 Debian 5.10.158-2
Jan 12 12:03:57 debian kernel: Hardware name: Apple Inc. Apple Virtualization Generic Platform, BIOS 1916.60.2.0.0 11/04/2022
Jan 12 12:03:57 debian kernel: pstate: 20c003c5 (nzCv DAIF +PAN +UAO -TCO BTYPE=--)
Jan 12 12:03:57 debian kernel: pc : rcu_eqs_enter.constprop.0+0x74/0x7c
Jan 12 12:03:57 debian kernel: lr : rcu_idle_enter+0x18/0x24
Jan 12 12:03:57 debian kernel: sp : ffff800011bc3f20
Jan 12 12:03:57 debian kernel: x29: ffff800011bc3f20 x28: 0000000000000000 
Jan 12 12:03:57 debian kernel: x27: 0000000000000000 x26: ffff0000c028bd00 
Jan 12 12:03:57 debian kernel: x25: 0000000000000000 x24: 0000000000000000 
Jan 12 12:03:57 debian kernel: x23: ffff80001181a1bc x22: ffff800011426bb0 
Jan 12 12:03:57 debian kernel: x21: ffff80001181a180 x20: 0000000000000005 
Jan 12 12:03:57 debian kernel: x19: ffff800011412008 x18: 00000000fffffff5 
Jan 12 12:03:57 debian kernel: x17: 0000000000000308 x16: 0000000000000040 
Jan 12 12:03:57 debian kernel: x15: 0000000000000000 x14: 0000000000000000 
Jan 12 12:03:57 debian kernel: x13: 0000000000000001 x12: 0000000000000040 
Jan 12 12:03:57 debian kernel: x11: ffff0000c0402238 x10: 593e9d5df37fe0d6 
Jan 12 12:03:57 debian kernel: x9 : ffff800010bb50c0 x8 : ffff0000c028cc18 
Jan 12 12:03:57 debian kernel: x7 : ffff000229b2eac0 x6 : 000000010d7b1d55 
Jan 12 12:03:57 debian kernel: x5 : 00ffffffffffffff x4 : ffff80021da4e000 
Jan 12 12:03:57 debian kernel: x3 : 4000000000000002 x2 : 4000000000000000 
Jan 12 12:03:57 debian kernel: x1 : ffff800011428f80 x0 : ffff00022ee76f80 
Jan 12 12:03:57 debian kernel: Call trace:
Jan 12 12:03:57 debian kernel:  rcu_eqs_enter.constprop.0+0x74/0x7c
Jan 12 12:03:57 debian kernel:  rcu_idle_enter+0x18/0x24
Jan 12 12:03:57 debian kernel:  default_idle_call+0x40/0x178
Jan 12 12:03:57 debian kernel:  do_idle+0x238/0x2b0
Jan 12 12:03:57 debian kernel:  cpu_startup_entry+0x2c/0x9c
Jan 12 12:03:57 debian kernel:  secondary_start_kernel+0x144/0x180
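
As a side note, for ext4 guests the superblock records whether errors were seen and what the kernel is configured to do when that happens; something like the following (a sketch, assuming /dev/vda2 is the ext4 root as above) shows the recorded state and error counters:

sudo tune2fs -l /dev/vda2 | grep -iE 'filesystem state|errors behavior|error count|first error|last error'
# "Errors behavior: remount-ro" (the usual Debian/Ubuntu setting) is why the FS
# flips to read-only instead of continuing or panicking.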

I then installed Fedora 37 with kernel 6.0.7, which also gave me a read-only file system and this log:

[  325.566567] SELinux:  Context system_u:object_r:cert_t:s0 is not valid (left unmapped).
[  335.408511] BTRFS error (device vda3): parent transid verify failed on logical 254394368 mirror 1 wanted 20 found 0
[  335.413248] BTRFS info (device vda3): read error corrected: ino 0 off 254394368 (dev /dev/vda3 sector 513248)
[  335.418330] BTRFS info (device vda3): read error corrected: ino 0 off 254398464 (dev /dev/vda3 sector 513256)
[  335.418375] BTRFS info (device vda3): read error corrected: ino 0 off 254402560 (dev /dev/vda3 sector 513264)
[  335.418419] BTRFS info (device vda3): read error corrected: ino 0 off 254406656 (dev /dev/vda3 sector 513272)
[  335.924646] BTRFS error (device vda3): parent transid verify failed on logical 60178432 mirror 1 wanted 13 found 0
[  335.930552] BTRFS info (device vda3): read error corrected: ino 0 off 60178432 (dev /dev/vda3 sector 133920)
[  335.935140] BTRFS info (device vda3): read error corrected: ino 0 off 60182528 (dev /dev/vda3 sector 133928)
[  335.935332] BTRFS info (device vda3): read error corrected: ino 0 off 60186624 (dev /dev/vda3 sector 133936)
[  335.935486] BTRFS info (device vda3): read error corrected: ino 0 off 60190720 (dev /dev/vda3 sector 133944)
[  337.205193] BTRFS warning (device vda3): csum failed root 257 ino 34956 off 0 csum 0xf5f4f143 expected csum 0x00000000 mirror 1
[  337.205209] BTRFS error (device vda3): bdev /dev/vda3 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
[  337.210375] BTRFS warning (device vda3): csum failed root 257 ino 34956 off 0 csum 0xf5f4f143 expected csum 0x00000000 mirror 1
[  337.210379] BTRFS error (device vda3): bdev /dev/vda3 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
[  337.393847] SELinux:  Context system_u:object_r:file_context_t:s0 is not valid (left unmapped).
[  337.845752] SELinux:  Context system_u:object_r:var_lib_nfs_t:s0 is not valid (left unmapped).
[  373.307836] BTRFS warning (device vda3): csum failed root 257 ino 34956 off 0 csum 0xf5f4f143 expected csum 0x00000000 mirror 1
[  373.307846] BTRFS error (device vda3): bdev /dev/vda3 errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
[  373.769193] BTRFS error (device vda3): parent transid verify failed on logical 64929792 mirror 1 wanted 13 found 0
[  373.769820] BTRFS info (device vda3): read error corrected: ino 0 off 64929792 (dev /dev/vda3 sector 143200)
[  373.769863] BTRFS info (device vda3): read error corrected: ino 0 off 64933888 (dev /dev/vda3 sector 143208)
[  373.769897] BTRFS info (device vda3): read error corrected: ino 0 off 64937984 (dev /dev/vda3 sector 143216)
[  373.769931] BTRFS info (device vda3): read error corrected: ino 0 off 64942080 (dev /dev/vda3 sector 143224)
[  373.935038] BTRFS warning (device vda3): csum failed root 257 ino 33540 off 0 csum 0xdd812b50 expected csum 0x00000000 mirror 1
[  373.935045] BTRFS error (device vda3): bdev /dev/vda3 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
[  373.935168] BTRFS warning (device vda3): csum failed root 257 ino 33540 off 0 csum 0xdd812b50 expected csum 0x00000000 mirror 1
[  373.935170] BTRFS error (device vda3): bdev /dev/vda3 errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
[  373.935265] BTRFS warning (device vda3): csum failed root 257 ino 33540 off 0 csum 0xdd812b50 expected csum 0x00000000 mirror 1
[  373.935266] BTRFS error (device vda3): bdev /dev/vda3 errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
[  378.938443] BTRFS critical (device vda3): leaf free space ret -4225, leaf data size 16283, used 20508 nritems 169
[  378.938506] BTRFS critical (device vda3): leaf free space ret -4225, leaf data size 16283, used 20508 nritems 169
[  378.938509] BTRFS critical (device vda3): leaf free space ret -4225, leaf data size 16283, used 20508 nritems 169
[  378.938510] BTRFS critical (device vda3): leaf free space ret -4225, leaf data size 16283, used 20508 nritems 169
[  379.073738] BTRFS warning (device vda3): csum failed root 257 ino 33540 off 0 csum 0xdd812b50 expected csum 0x00000000 mirror 1
[  379.073748] BTRFS error (device vda3): bdev /dev/vda3 errs: wr 0, rd 0, flush 0, corrupt 7, gen 0
[  379.073917] BTRFS warning (device vda3): csum failed root 257 ino 33540 off 0 csum 0xdd812b50 expected csum 0x00000000 mirror 1
[  379.073920] BTRFS error (device vda3): bdev /dev/vda3 errs: wr 0, rd 0, flush 0, corrupt 8, gen 0
[  379.297718] systemd[1]: systemd 251.7-611.fc37 running in system mode (+PAM +AUDIT +SELINUX -APPARMOR +IMA +SMACK +SECCOMP -GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN -IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 +PWQUALITY +P11KIT +QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD +BPF_FRAMEWORK +XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified)
[  379.297873] systemd[1]: Detected virtualization apple.
[  379.297876] systemd[1]: Detected architecture arm64.
[  379.391880] systemd[1]: bpf-lsm: Failed to link program; assuming BPF LSM is not available
[  379.422898] systemd-sysv-generator[4243]: SysV service '/etc/rc.d/init.d/livesys' lacks a native systemd unit file. Automatically generating a unit file for compatibility. Please update package to include a native systemd unit file, in order to make it more safe and robust.
[  379.422920] systemd-sysv-generator[4243]: SysV service '/etc/rc.d/init.d/livesys-late' lacks a native systemd unit file. Automatically generating a unit file for compatibility. Please update package to include a native systemd unit file, in order to make it more safe and robust.
[  379.429646] systemd-gpt-auto-generator[4234]: Failed to dissect: Permission denied
[  379.431593] systemd[4220]: /usr/lib/systemd/system-generators/systemd-gpt-auto-generator failed with exit status 1.
[  379.794222] ------------[ cut here ]------------
[  379.794226] BTRFS: Transaction aborted (error -2)
[  379.794268] WARNING: CPU: 7 PID: 3504 at fs/btrfs/inode.c:9568 btrfs_rename+0x810/0x8d0
[  379.794293] Modules linked in: tls snd_seq_dummy snd_hrtimer uinput nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink qrtr sunrpc vfat fat virtio_snd snd_seq snd_seq_device joydev snd_pcm snd_timer virtio_balloon snd virtiofs soundcore virtio_console zram crct10dif_ce polyval_ce polyval_generic ghash_ce sha3_ce virtio_net sha512_ce sha512_arm64 virtio_gpu net_failover failover virtio_blk virtio_dma_buf apple_mfi_fastcharge ip6_tables ip_tables fuse
[  379.794422] CPU: 7 PID: 3504 Comm: dnf Not tainted 6.0.7-301.fc37.aarch64 #1
[  379.794424] Hardware name: Apple Inc. Apple Virtualization Generic Platform, BIOS 1916.60.2.0.0 11/04/2022
[  379.794425] pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  379.794427] pc : btrfs_rename+0x810/0x8d0
[  379.794429] lr : btrfs_rename+0x810/0x8d0
[  379.794430] sp : ffff80000f6fb910
[  379.794431] x29: ffff80000f6fb910 x28: ffff0000c840a300 x27: ffff0000c840a0d0
[  379.794432] x26: ffff0000861635c0 x25: ffff0000c8c1f760 x24: ffff0000c871ca90
[  379.794434] x23: ffff0000c8c1f760 x22: 0000000000028ff2 x21: ffff0000c8c1f530
[  379.794435] x20: ffff00008613a780 x19: ffff0000c8585600 x18: 00000000fffffffe
[  379.794437] x17: 00a16c4317540000 x16: 0000000009c40000 x15: ffff80000f6fb518
[  379.794438] x14: 0000000000000001 x13: 29322d20726f7272 x12: 652820646574726f
[  379.794439] x11: 00000000ffffdfff x10: ffff80000aaf0590 x9 : ffff8000082e56e0
[  379.794452] x8 : 000000000002ffe8 x7 : c0000000ffffdfff x6 : 00000000000affa8
[  379.794453] x5 : 0000000000001fff x4 : 0000000000000001 x3 : ffff80000a2a6008
[  379.794455] x2 : 0000000000000001 x1 : ffff00013559c400 x0 : 0000000000000025
[  379.794456] Call trace:
[  379.794457]  btrfs_rename+0x810/0x8d0
[  379.794459]  btrfs_rename2+0x30/0x80
[  379.794460]  vfs_rename+0x338/0x8a0
[  379.794468]  do_renameat2+0x42c/0x484
[  379.794469]  __arm64_sys_renameat+0x60/0x80
[  379.794471]  invoke_syscall+0x78/0x100
[  379.794477]  el0_svc_common.constprop.0+0x4c/0xf4
[  379.794478]  do_el0_svc+0x34/0x4c
[  379.794479]  el0_svc+0x34/0x10c
[  379.794513]  el0t_64_sync_handler+0xf4/0x120
[  379.794514]  el0t_64_sync+0x190/0x194
[  379.794520] ---[ end trace 0000000000000000 ]---

I then rebooted the host system and installed Debian. I was also able to upgrade to kernel Linux debian 6.1.0-1-arm64 #1 SMP Debian 6.1.4-1 (2023-01-07) aarch64 GNU/Linux without a problem (this was not possible before the reboot). I do get some freezes now and then which require a hard exit of the VM and UTM. But keep in mind that I am using a kernel from the unstable channel, so this might not have anything to do with UTM or the Virtualization.framework at all. The kernel log is completely clean. I attached some logs from these crashes. Overall a very frustrating experience. But as far as I can see, UTM is really only a lightweight frontend for the Virtualization.framework, so I guess there isn't much you can do about this? I really like the UTM project though, keep up the good work!

com.apple.Virtualization.VirtualMachine_2023-01-13-113856_MacBook-Pro-von-Tim.log com.apple.Virtualization.VirtualMachine_2023-01-13-114201_MacBook-Pro-von-Tim.wakeups_resource.log UTM_2023-01-13-114630_MacBook-Pro-von-Tim.cpu_resource.log

timnoack avatar Jan 13 '23 11:01 timnoack

Any progress on this?

Git-North avatar Feb 17 '23 22:02 Git-North

I don't think this has anything to do with UTM. I see the same problems with VirtIO FS corruption in everything that uses the Apple Virtualization Framework (e.g. Docker). I opened a ticket in Feedback Assistant, but Apple asks for a reliable way to reproduce the issue, which I have not found yet.

I don't know how many people use UTM with the Apple Virtualization Framework enabled, but I know that a ton of people use Docker on macOS, which sporadically crashes for me for the same reason. As no one else seems to have this problem there, my current guess is that it's either a faulty macOS installation or faulty hardware. I have no third-party kernel extensions loaded, and programs running in EL0 should not be able to interfere with the hypervisor.

Did you try reinstalling macOS?

timnoack avatar Feb 19 '23 23:02 timnoack

I have the same issue with a Fedora 38 arm guest: it works for some time, then the fs becomes read-only and I have to restart it. I will try to get more logs later.

One type of message I see looks like this:

> git commit -m "message"
An unexpected error has occurred: OSError: [Errno 5] Input/output error

# and in dmesg:
[ 2006.845686] BTRFS critical (device vda3): corrupted leaf, root=7 block=0 owner mismatch, have 0 expect 7

maximvl avatar May 29 '23 10:05 maximvl

Here is one more case where the FS becomes read-only:

[  829.279825] BTRFS critical (device vda3): corrupted leaf, root=257 block=0 owner mismatch, have 0 expect [256, 18446744073709551360]
[  829.725624] BTRFS critical (device vda3): corrupted leaf, root=257 block=0 owner mismatch, have 0 expect [256, 18446744073709551360]
[  829.725806] BTRFS critical (device vda3): corrupted leaf, root=257 block=0 owner mismatch, have 0 expect [256, 18446744073709551360]
[  829.725838] BTRFS critical (device vda3): corrupted leaf, root=257 block=0 owner mismatch, have 0 expect [256, 18446744073709551360]
[  829.725853] BTRFS critical (device vda3): corrupted leaf, root=257 block=0 owner mismatch, have 0 expect [256, 18446744073709551360]
[  829.725864] BTRFS critical (device vda3): corrupted leaf, root=257 block=0 owner mismatch, have 0 expect [256, 18446744073709551360]
[  829.725875] BTRFS critical (device vda3): corrupted leaf, root=257 block=0 owner mismatch, have 0 expect [256, 18446744073709551360]
[  829.725899] BTRFS critical (device vda3): corrupted leaf, root=257 block=0 owner mismatch, have 0 expect [256, 18446744073709551360]
[  829.725915] BTRFS critical (device vda3): corrupted leaf, root=257 block=0 owner mismatch, have 0 expect [256, 18446744073709551360]
[  829.725931] BTRFS critical (device vda3): corrupted leaf, root=257 block=0 owner mismatch, have 0 expect [256, 18446744073709551360]
[  829.725943] BTRFS critical (device vda3): corrupted leaf, root=257 block=0 owner mismatch, have 0 expect [256, 18446744073709551360]
[  829.725953] BTRFS critical (device vda3): corrupted leaf, root=257 block=0 owner mismatch, have 0 expect [256, 18446744073709551360]
[  829.725968] BTRFS critical (device vda3): corrupted leaf, root=257 block=0 owner mismatch, have 0 expect [256, 18446744073709551360]
[  829.725983] BTRFS critical (device vda3): corrupted leaf, root=257 block=0 owner mismatch, have 0 expect [256, 18446744073709551360]
[  829.725998] BTRFS critical (device vda3): corrupted leaf, root=257 block=0 owner mismatch, have 0 expect [256, 18446744073709551360]
[  856.928118] BTRFS critical (device vda3): corrupted leaf, root=257 block=0 owner mismatch, have 0 expect [256, 18446744073709551360]
[  856.928181] BTRFS critical (device vda3): corrupted leaf, root=257 block=0 owner mismatch, have 0 expect [256, 18446744073709551360]
[  863.839380] BTRFS warning (device vda3): csum failed root 257 ino 162865 off 0 csum 0xc06d8a81 expected csum 0x00000000 mirror 1
[  863.839386] BTRFS error (device vda3): bdev /dev/vda3 errs: wr 0, rd 0, flush 0, corrupt 399, gen 0
[  863.839393] BTRFS warning (device vda3): csum failed root 257 ino 162865 off 4096 csum 0x8e9c436a expected csum 0x00000000 mirror 1
[  863.839394] BTRFS error (device vda3): bdev /dev/vda3 errs: wr 0, rd 0, flush 0, corrupt 400, gen 0
[  863.839544] BTRFS warning (device vda3): csum failed root 257 ino 162865 off 0 csum 0xc06d8a81 expected csum 0x00000000 mirror 1
[  863.839546] BTRFS error (device vda3): bdev /dev/vda3 errs: wr 0, rd 0, flush 0, corrupt 401, gen 0
[  863.839942] BTRFS warning (device vda3): csum failed root 257 ino 162880 off 0 csum 0x3dfb4247 expected csum 0x00000000 mirror 1
[  863.839944] BTRFS error (device vda3): bdev /dev/vda3 errs: wr 0, rd 0, flush 0, corrupt 402, gen 0
[  863.839950] BTRFS warning (device vda3): csum failed root 257 ino 162880 off 4096 csum 0x800bb04e expected csum 0x00000000 mirror 1
[  863.839951] BTRFS error (device vda3): bdev /dev/vda3 errs: wr 0, rd 0, flush 0, corrupt 403, gen 0
[  863.840049] BTRFS warning (device vda3): csum failed root 257 ino 162880 off 0 csum 0x3dfb4247 expected csum 0x00000000 mirror 1
[  863.840053] BTRFS error (device vda3): bdev /dev/vda3 errs: wr 0, rd 0, flush 0, corrupt 404, gen 0
[  863.841823] BTRFS warning (device vda3): csum failed root 257 ino 162879 off 0 csum 0x9379162b expected csum 0x00000000 mirror 1
[  863.841825] BTRFS error (device vda3): bdev /dev/vda3 errs: wr 0, rd 0, flush 0, corrupt 405, gen 0
[  863.841831] BTRFS warning (device vda3): csum failed root 257 ino 162879 off 4096 csum 0x6a9e7a52 expected csum 0x00000000 mirror 1
[  863.841832] BTRFS error (device vda3): bdev /dev/vda3 errs: wr 0, rd 0, flush 0, corrupt 406, gen 0
[  863.841913] BTRFS warning (device vda3): csum failed root 257 ino 162879 off 0 csum 0x9379162b expected csum 0x00000000 mirror 1
[  863.841916] BTRFS error (device vda3): bdev /dev/vda3 errs: wr 0, rd 0, flush 0, corrupt 407, gen 0
[  863.843813] BTRFS warning (device vda3): csum failed root 257 ino 162893 off 0 csum 0x07fad04f expected csum 0x00000000 mirror 1
[  863.843816] BTRFS error (device vda3): bdev /dev/vda3 errs: wr 0, rd 0, flush 0, corrupt 408, gen 0
[  865.182127] BTRFS critical (device vda3): corrupted leaf, root=7 block=0 owner mismatch, have 0 expect 7
[  865.182135] ------------[ cut here ]------------
[  865.182135] BTRFS: Transaction aborted (error -117)
[  865.182191] WARNING: CPU: 3 PID: 470 at fs/btrfs/inode.c:3343 btrfs_finish_ordered_io+0x9b8/0x9c0
[  865.182257] Modules linked in: snd_seq_dummy snd_hrtimer uinput xt_conntrack xt_MASQUERADE nf_conntrack_netlink xt_addrtype nft_compat br_netfilter bridge stp llc nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill overlay ip_set nf_tables nfnetlink qrtr sunrpc binfmt_misc vfat fat virtio_snd snd_seq snd_seq_device snd_pcm snd_timer virtio_console snd soundcore virtio_balloon virtiofs joydev loop crct10dif_ce polyval_ce polyval_generic ghash_ce sha3_ce virtio_net sha512_ce net_failover sha512_arm64 failover virtio_gpu virtio_blk virtio_dma_buf apple_mfi_fastcharge ip6_tables ip_tables fuse
[  865.182449] CPU: 3 PID: 470 Comm: kworker/u12:6 Not tainted 6.2.15-300.fc38.aarch64 #1
[  865.182456] Hardware name: Apple Inc. Apple Virtualization Generic Platform, BIOS 1916.80.2.0.0 12/19/2022
[  865.182457] Workqueue: btrfs-endio-write btrfs_work_helper
[  865.182460] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[  865.182467] pc : btrfs_finish_ordered_io+0x9b8/0x9c0
[  865.182469] lr : btrfs_finish_ordered_io+0x9b8/0x9c0
[  865.182470] sp : ffff80000d01bc60
[  865.182471] x29: ffff80000d01bc60 x28: 00000000ffffff8b x27: ffff0000c9802000
[  865.182472] x26: ffff0000c1ba5ea0 x25: ffff000109db6c80 x24: ffff0000c9802800
[  865.182474] x23: 0000000000001000 x22: 0000000000000000 x21: ffff0002f1c0bae8
[  865.182475] x20: 0000000000000fff x19: ffff00042cb74a50 x18: 00000000fffffffe
[  865.182476] x17: 6e776f20303d6b63 x16: 6f6c6220373d746f x15: ffff80000d01b830
[  865.182477] x14: 0000000000000001 x13: 293731312d20726f x12: 7272652820646574
[  865.182478] x11: 00000000ffffdfff x10: ffff80000aa502a0 x9 : ffff800008137b40
[  865.182479] x8 : 000000000002ffe8 x7 : c0000000ffffdfff x6 : 00000000000affa8
[  865.182480] x5 : 0000000000001fff x4 : 0000000000000002 x3 : ffff80000a1f3008
[  865.182482] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000c86d0000
[  865.182483] Call trace:
[  865.182483]  btrfs_finish_ordered_io+0x9b8/0x9c0
[  865.182485]  finish_ordered_fn+0x1c/0x30
[  865.182487]  btrfs_work_helper+0xe0/0x270
[  865.182488]  process_one_work+0x1e4/0x480
[  865.182507]  worker_thread+0x74/0x40c
[  865.182508]  kthread+0xe8/0xf4
[  865.182509]  ret_from_fork+0x10/0x20
[  865.182511] ---[ end trace 0000000000000000 ]---
[  865.182512] BTRFS: error (device vda3: state A) in btrfs_finish_ordered_io:3343: errno=-117 Filesystem corrupted
[  865.182514] BTRFS info (device vda3: state EA): forced readonly

And I can't even shutdown properly at this point:

~ ⟩ shutdown now                                                                                                            
exec: Failed to execute process '/usr/sbin/shutdown', unknown error number 117
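
If it helps anyone in the same situation, the btrfs tooling can at least tell you what the filesystem thinks happened. A rough checklist (assuming /dev/vda3 is the btrfs root, as in the logs above; the offline check must be run from a rescue/live environment with the filesystem unmounted):

sudo btrfs device stats /      # per-device error counters (while still mounted)
sudo btrfs scrub start -Bd /   # foreground checksum verification, per-device stats
                               # (may refuse to run once the FS has been forced read-only)
# from a live ISO, with the filesystem unmounted:
sudo btrfs check /dev/vda3     # consistency check, read-only by default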

maximvl avatar May 29 '23 11:05 maximvl

Same issue here with a Debian Testing aarch64 guest on Apple HV. Consistently triggers multiple times a day during regular use. Sometimes I end up with a corrupted disk and need to fsck from initramfs shell before being able to continue booting where it will then resolve various filesystem inconsistencies.

athre0z avatar May 30 '23 15:05 athre0z

> Same issue here with a Debian Testing aarch64 guest on Apple HV. Consistently triggers multiple times a day during regular use. Sometimes I end up with a corrupted disk and need to fsck from initramfs shell before being able to continue booting where it will then resolve various filesystem inconsistencies.

@athre0z I got my Fedora FS corrupted to the point where it couldn't boot the graphical interface or install packages. I was able to recover my data through the terminal and a shared directory, so be careful.

maximvl avatar Jun 06 '23 10:06 maximvl

@pisker is probably right; I just saw the same btrfs error in a Lima VM using vz on Ventura 13.4.

vlad-rw avatar Jun 14 '23 15:06 vlad-rw

FWIW I'm under the impression that it got a lot better with 13.4: I was easily running into ~4 crashes on any given workday previously, whereas now it's more like one crash every two days. That being said, with this kind of spurious bug it's also perfectly possible that it's just chance or the result of a slightly altered workload.

Personally I don't really care about FS corruption: everything of value is on a share anyway and if my VM dies I can spin up a fresh one in 30 minutes. I still prefer the crashy Apple HV with the lightning fast virtfs share over the qemu HV with the horrible 9p network share.

athre0z avatar Jun 15 '23 14:06 athre0z

I also encountered this using the Apple Virtualization framework. I'm using the VM as a homelab server (basically a container runner) with a bridged interface. I chose the Apple Virtualization framework over QEMU because it has better vNIC performance on my 10G NIC. I'm running Rocky Linux 9 and the default filesystem is XFS. When this bug happens, the console shows a log line saying Corruption of in-memory data detected. Shutting down filesystem. Unlike other filesystems, which switch to read-only mode, XFS is forced to shut down the filesystem, which is equivalent to unmounting the root partition. The good part is that the corruption seems to be only in memory and not on disk: I ran xfs_repair after the fs shutdown and no errors were found.
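
In case it is useful to other XFS users, that check can be repeated along these lines (a sketch; /dev/vda2 as the XFS root is an assumption, and xfs_repair must only be run on an unmounted filesystem, e.g. from a rescue environment):

# after a reboot, mounting the FS once replays the XFS log; then, unmounted:
sudo xfs_repair -n /dev/vda2   # -n = no-modify mode, only reports problems
# only if the log cannot be replayed and you accept possible data loss:
# sudo xfs_repair -L /dev/vda2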

The frequency of this bug is low for me, perhaps 1-2 times a month, but it is still annoying to manually reboot the VM once I notice my containers are down. So my workaround is a simple Rust program that checks whether the root filesystem is available every 10 s and force-reboots the machine if it is not. Here's the code:

use std::fs;
use std::io::Result;
use std::io::Write;
use std::thread;
use std::time::Duration;

/// Immediately reboot the machine via the magic SysRq interface.
/// Writing "b" to /proc/sysrq-trigger reboots without syncing or
/// unmounting filesystems (requires root).
/// Reference: https://www.kernel.org/doc/html/latest/admin-guide/sysrq.html
pub fn force_reboot() -> Result<()> {
    let mut file = fs::File::create("/proc/sysrq-trigger")?;
    file.write_all(b"b")?;
    Ok(())
}

fn main() {
    loop {
        // If the root filesystem can no longer be read (e.g. it was forced
        // read-only or shut down), trigger a hard reboot.
        if fs::read_dir("/").is_err() {
            // Ignore the result: if the write fails, there is nothing more
            // we can do from here anyway.
            let _ = force_reboot();
        }
        thread::sleep(Duration::from_secs(10));
    }
}
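
A note on using it: the program needs to run as root to write to /proc/sysrq-trigger, and SysRq "b" reboots immediately without syncing, so anything not yet flushed to disk is lost (arguably acceptable once the root FS is already gone). One possible way to build and install it (the binary name and paths below are my own assumptions, not from the original comment):

cargo build --release
sudo install -m 0755 target/release/rootfs-watchdog /usr/local/bin/
sudo /usr/local/bin/rootfs-watchdog &   # or wrap it in a service so it starts at boot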

Hope this helps someone who also runs a server and has the same problem.

gnattu avatar Jul 23 '23 22:07 gnattu

Similar here,

This time with a Fedora 39 beta guest. Using Apple Virtualization I get these memory corruption failures (and then other I/O failures). Fedora (albeit 38) runs fine under QEMU with UTM.

I'd also previously hit an exception constantly with a RHEL 9.x guest - again, only with Apple Virtualization.

planetf1 avatar Sep 24 '23 06:09 planetf1

Here's an example:

Screenshot 2023-09-24 at 07 29 14

planetf1 avatar Sep 24 '23 06:09 planetf1

I'm experiencing the same issue. This time trying to install Kali 2023.3 on Apple Virtualization Framework:

Screenshot 2023-10-04 at 10 56 10

However, I have Kali installed inside Docker, running on the Apple Virtualization Framework, with no problems at all - albeit with no GUI...

kurgannet avatar Oct 04 '23 09:10 kurgannet

Another one while installing Debian 12.1. In fact, I cannot even install a Linux distro; I always get a kernel panic.

Screenshot 2023-10-04 at 18 20 49

kurgannet avatar Oct 04 '23 16:10 kurgannet

I think by now there are enough reports to assume that there's really a bug somewhere, and that all kinds of distros are affected (NixOS here ;)), so adding yet another report seems to have diminishing returns in terms of information gained.

From the reports here, it seems the oldest kernel explicitly mentioned was 5.10.158; I have myself encountered it multiple times on 6.1.*, currently 6.1.54 (using both XFS and ext4).

phaer avatar Oct 04 '23 17:10 phaer

I have tried the same with a Mac mini M2 running Sonoma and have no issues so far. I have tested Debian 12.1 and Kali 2023.3: no kernel panics in 1 day.

On a MacBook Pro 16 (M1 Pro, 16 GB) I cannot even install those guest OSes; I get these errors constantly, within minutes. I mean, I ALWAYS get these errors and am unable to complete an install (which should take less than half an hour). The last one is:

Screenshot 2023-10-05 at 12 31 08

I am not sure if this is related to disk access. The first error reads "Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020".

Any clues?

kurgannet avatar Oct 05 '23 10:10 kurgannet

I can report that on macOS 13.6 (Mac mini with M2 Pro, 16 GB) the filesystem error occurs frequently.

rainwoodman avatar Oct 08 '23 05:10 rainwoodman

My experience was the same until I upgraded the Linux kernel after reading this: https://www.techradar.com/news/linux-kernel-62-is-here-and-it-now-has-mainline-support-for-apple-m1-chips. On an M1 running Sonoma, using Ubuntu 23.10 (Mantic) with Linux kernel 6.5, it is reasonably stable. Unfortunately, once there is one disk corruption it is difficult to know whether subsequent issues are totally fresh or a consequence of the first one. I regularly boot into an attached ISO and run fsck while the main VM disk is unmounted. journalctl reports NULL pointer and ext4 issues from time to time, but it is usable. You can get the latest Ubuntu from https://cdimage.ubuntu.com
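
For reference, that offline check boils down to something like this (a sketch; the device name is an assumption and depends on the guest's partition layout, and the root filesystem must not be mounted):

# boot the VM from an attached live/installer ISO, then:
lsblk -f                   # identify the root partition and its filesystem type
sudo e2fsck -f /dev/vda2   # ext4: force a full check even if the FS looks clean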

wrmack avatar Oct 08 '23 16:10 wrmack

Is it just me, or does deactivating ballooning solve the problem? I deactivated it two weeks ago, and I've had no problems since on my side.
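
For anyone who wants to test the same theory: balloon support shows up inside the guest as the virtio_balloon module (it appears in the module lists of the oopses above). A way to check for it, and to rule it out from the guest side if the frontend offers no switch (a sketch; the modprobe.d filename is my own choice, and the initramfs rebuild command depends on the distro):

lsmod | grep virtio_balloon                 # is the balloon driver loaded?
echo 'blacklist virtio_balloon' | sudo tee /etc/modprobe.d/no-balloon.conf
sudo update-initramfs -u                    # Debian/Ubuntu; on Fedora: sudo dracut -f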

lfdla avatar Oct 16 '23 13:10 lfdla

I came across this bug while searching for a BTRFS error that I was encountering in one of my VMs. I'm not running UTM at the moment, but rather Parallels; however, the underlying Apple Virtualization Framework is being used, just as UTM can use it.

I have a hunch that this isn't anything BTRFS-specific, though, but rather that BTRFS is the filesystem most likely to notice and warn you at the point where corruption happens, compared to some others. Unfortunately, I haven't been able to find a way to trigger the bug consistently. And unlike what was suggested earlier, the corruption wasn't simply in memory, as the filesystem corruption was still there after a reboot.

If there is indeed a silent data corruption bug in the Apple Virtualization Framework, that sounds quite bad, as it would affect everything that uses it. On the other hand, if there's a Linux bug with the virtual hardware provided by the framework, well, they've got some work to do. I've tested a 6.1.57 VM, and the corruption still happened.

Screenshot 2023-10-18 at 4 59 59 PM

wdormann avatar Oct 18 '23 21:10 wdormann

In my experience this issue is mostly related to “something” that leads to a kernel oops and filesystem error. ext4 is also affected.

I have tried “every” virtualization platform that relies on AVF, and the bug is always there. Docker seems quite stable because of the Linux kernel it uses (a 5.* version, not 6.*). It seems to me that they've tried and tested kernels until they found a stable one.

In the end, it looks like it's a combination of AVF + Linux kernel version, so the solution may be on Apple's or Linux's side… or both!

kurgannet avatar Oct 19 '23 07:10 kurgannet

For what it's worth, I've recreated the Linux filesystem corruption in 3 different platform configurations:

  1. As seen in the screenshot above, a Gentoo Linux VM where the disk is presented as sda, so something SCSI/SATA I believe.
  2. Ubuntu 23.10 stock with QEMU virtualization and NVMe storage: Screenshot 2023-10-24 at 12 18 23 AM
  3. Ubuntu 23.10 with Apple virtualization, the latest kernel from the mainline kernel project, and virtio storage: Screenshot 2023-10-24 at 12 17 20 AM

All three of these scenarios were triggered under load (repeatedly compiling a large project, Qt 6).

Reproduction under a synthetic CPU/RAM/disk benchmark does not seem as readily possible.
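
For anyone else trying to reproduce this, the pattern above suggests a sustained parallel compile is the most reliable trigger. A stand-in for the Qt 6 build that is easier to set up is looping a kernel build (a sketch, assuming the usual kernel build dependencies and a few GB of free disk; any comparably large project should do):

git clone --depth 1 https://github.com/torvalds/linux.git
cd linux
# rebuild in a loop; the loop stops on the first failure (e.g. when the FS goes read-only)
while make defconfig && make -j"$(nproc)" && make clean; do :; done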

wdormann avatar Oct 24 '23 04:10 wdormann

Just in case you haven't come across it, Asahi Linux is working on the Linux kernel specifically for Apple silicon. The specific features they are working on are listed here. Their work finds its way into the mainline kernel. They also provide a downloadable dual-boot solution.

wrmack avatar Oct 24 '23 05:10 wrmack

For what it's worth, I tried a similar exercise of compiling Qt 6 in Linux, Windows, and macOS VMs. The Windows and macOS VMs went a full 24 hours without any corruption. Any Linux VM I've tried could go maybe as far as 10 minutes before corruption occurred.

I can only test within VMs, so I cannot eliminate the hypervisor layer at this point. But as far as I can tell, aarch64 Linux is where the problems show up.

windows_qt6_build macos_qt6_build

wdormann avatar Oct 25 '23 18:10 wdormann