Immutable data corruption(?) after hitting #13709
System information
| Type | Version/Name |
|---|---|
| Distribution Name | Gentoo |
| Distribution Version | ? |
| Kernel Version | 6.1-rc4 |
| Architecture | x86_64 |
| OpenZFS Version | zfs-kmod-2.1.99-1 (945b407486a0072ec7dd117a0bde2f72d52eb445) (matching userspace) |
Describe the problem you're observing
It might be of note that I first started experiencing these problems around the same time as #13709, & similarly only after I upgraded ZFS. I'm not sure when the bug was introduced because it's not easy to reproduce; I've only encountered it on one system, despite having several other systems running the exact same configuration that are totally fine (for the record, they did not get bit by #13709 either, & yes, I use native encryption on all of them). Those systems are also, confusingly, much heavier users of their FS than the system I got this error on. & yes, those systems run the same versions of ZFS & the kernel.
Put simply, shortly after #13709 hit this pool, random files started becoming inaccessible & immutable on it, failing with `-EIO` or `No error information`. Some were detected by ZFS (classified as permanent errors), some not. The checksum error count on this disk keeps climbing, though. Scrubs do nothing (not even 2 of them), & `zpool clear` is a no-op, which is somewhat unsurprising considering I wasn't able to delete the files with `rm` either. Overwriting corrupted files also errors in the same fashion.
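To keep track of which files are affected (& whether the set changes across scrubs), a sweep along these lines works; `ROOT`, `OUT`, & the `/lon` default are placeholders for the affected pool's mountpoint & an output file:

```shell
# Sketch: sweep a mountpoint & record every file that fails to read, so the
# set of "corrupted" files can be diffed before/after a scrub.
# ROOT & OUT are hypothetical; point ROOT at the affected pool's mountpoint.
ROOT="${ROOT:-/lon}"
OUT="${OUT:-eio-files.txt}"
: > "$OUT"
# dd hits the same -EIO the affected files return; each failing path is logged
find "$ROOT" -xdev -type f -exec sh -c '
    dd if="$1" of=/dev/null bs=1M status=none 2>/dev/null ||
        printf "%s\n" "$1" >> "$2"
' _ {} "$OUT" \; 2>/dev/null || true
```

Diffing two runs of `$OUT` (sorted) before & after a scrub would show exactly which files the scrub "corrupted".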
It also spreads to metadata, as some symlinks got "corrupted" (again, inaccessible & immutable) in the same manner. It doesn't seem to be triggered by anything specific other than just doing I/O on the filesystem, & what gets "corrupted" seems to be entirely random, save for a slight bias in files made after the bug started going on a rampage. There seems to be no preference for specific datasets; the bug hits all over the pool.
Also, the question mark in the title is because I can't tell whether any data actually got corrupted or ZFS is just having an oopsie; if it wasn't clear enough, I cannot actually read the affected files to know for sure.
The one disk in the pool is an NVMe SSD & very healthy according to SMART. It's never seen a nonzero error count on ZFS up until this very moment (I scrub often).
The bug does not trigger any warnings in the ZFS module, as far as I can tell. dmesg is completely silent on these matters.
Also, I use BLAKE3 checksums & zstd-4 compression, if it matters.
NB: #13709 came & went for me. I used the snapshot trick, & after that it hasn't affected my system. This bug still affects my system, though.
Include any warning/errors/backtraces from the system logs
```
# zpool status -v
  pool: lon
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 00:19:46 with 21 errors on Thu Nov 10 01:56:56 2022
config:

        NAME         STATE     READ WRITE CKSUM
        lon          ONLINE       0     0     0
          nvme0n1p3  ONLINE       0     0   225

errors: List of errors unavailable: no such pool or dataset
```
The list of errors sometimes displays, & sometimes it does not. I have no idea what influences this, but I can tell you that spamming the command over & over doesn't change anything if it's already screwed up at that moment. The list mostly consists of files, plus a couple of bits of metadata.
Not sure if this is related, but here's a panic trace from the system that suffers from this. The same stack trace happens very occasionally during very ordinary use of the filesystem (I hadn't done a scrub since I reported this bug), but this time I actually remembered to enable netconsole beforehand.
```
4,973,36963886645,-,caller=T191137;general protection fault, probably for non-canonical address 0x7e34d12d31e056b2: 0000 [#1] PREEMPT SMP PTI
4,974,36963886675,-,caller=T191137;CPU: 1 PID: 191137 Comm: z_wr_iss Tainted: G T 6.1.0-rc6 #1
4,975,36963886690,-,caller=T191137;Hardware name: LENOVO 20L5CTO1WW/20L5CTO1WW, BIOS N24ET61W (1.36 ) 10/13/2020
4,976,36963886700,-,caller=T191137;RIP: 0010:hasher_push_cv+0x12c/0x240
4,977,36963886715,-,caller=T191137;Code: 89 8c 24 98 00 00 00 48 8b 4b 10 48 89 8c 24 90 00 00 00 48 8b 0b 48 8b 53 08 48 89 94 24 88 00 00 00 48 89 8c 24 80 00 00 00 <4c> 8b 18 4c 89 e7 4c 89 fe ba 40 00 00 00 31 c9 41 ff d3 66 90 8b
4,978,36963886728,-,caller=T191137;RSP: 0018:ffffb2bd0b857590 EFLAGS: 00010206
4,979,36963886742,-,caller=T191137;RAX: 7e34d12d31e056b2 RBX: ffff946e49cd5800 RCX: 78b8b2a6d494cbe5
4,980,36963886751,-,caller=T191137;RDX: 8c755fa979d07c01 RSI: 0000000000000000 RDI: 0000000000000000
4,981,36963886762,-,caller=T191137;RBP: ffffb2bd0b857670 R08: 000000000000003f R09: 0000000000000000
4,982,36963886772,-,caller=T191137;R10: ffffb2bd0b857720 R11: ffffffffb32efa40 R12: ffffb2bd0b857610
4,983,36963886781,-,caller=T191137;R13: 0000000000000000 R14: 0000000000000700 R15: ffffb2bd0b8575c8
4,984,36963886791,-,caller=T191137;FS: 0000000000000000(0000) GS:ffff946faa880000(0000) knlGS:0000000000000000
4,985,36963886803,-,caller=T191137;CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
4,986,36963886813,-,caller=T191137;CR2: 00006c2ad5340100 CR3: 0000000422936002 CR4: 00000000003706e0
4,987,36963886824,-,caller=T191137;Call Trace:
4,988,36963886835,-,caller=T191137; <TASK>
4,989,36963886848,-,caller=T191137; Blake3_Update+0x850/0xb60
4,990,36963886865,-,caller=T191137; ? sha2_mac_update+0x31/0x60
4,991,36963886878,-,caller=T191137; ? zio_crypt_bp_do_hmac_updates+0x159/0x1e0
4,992,36963886892,-,caller=T191137; ? zio_crypt_do_dnode_hmac_updates+0x1c5/0x240
4,993,36963886908,-,caller=T191137; ? SHA2Update+0x102/0x180
4,994,36963886920,-,caller=T191137; ? __kmem_cache_free+0x242/0x440
4,995,36963886933,-,caller=T191137; ? sha2_mac_final+0x174/0x1c0
4,996,36963886947,-,caller=T191137; ? kmem_cache_free+0x2d2/0x580
4,997,36963886958,-,caller=T191137; ? spl_kmem_cache_free+0xfc/0x2a0
4,998,36963886971,-,caller=T191137; ? percpu_counter_add_batch+0xd2/0x120
4,999,36963886983,-,caller=T191137; ? crypto_mac_final+0x68/0x80
4,1000,36963887127,-,caller=T191137; ? zio_crypt_do_objset_hmacs+0x56f/0x5c0
4,1001,36963887143,-,caller=T191137; blake3_incremental+0x16/0x40
4,1002,36963887156,-,caller=T191137; abd_iterate_func+0x259/0x4e0
4,1003,36963887168,-,caller=T191137; ? abd_checksum_blake3_native+0x80/0x80
4,1004,36963887183,-,caller=T191137; abd_checksum_blake3_native+0x54/0x80
4,1005,36963887196,-,caller=T191137; zio_checksum_compute+0x17d/0x480
4,1006,36963887213,-,caller=T191137; ? spl_kmem_cache_free+0xfc/0x2a0
4,1007,36963887226,-,caller=T191137; ? zio_encrypt+0x19f/0x800
4,1008,36963887239,-,caller=T191137; ? percpu_counter_add_batch+0xd2/0x120
4,1009,36963887251,-,caller=T191137; ? zio_write_compress+0x355/0x8e0
4,1010,36963887264,-,caller=T191137; zio_checksum_generate+0x69/0x80
4,1011,36963887277,-,caller=T191137; zio_execute+0x5f/0x300
4,1012,36963887291,-,caller=T191137; taskq_thread+0x41b/0x840
4,1013,36963887307,-,caller=T191137; ? migrate_disable+0x100/0x100
4,1014,36963887323,-,caller=T191137; ? zio_nowait+0x460/0x460
4,1015,36963887337,-,caller=T191137; kthread+0x22c/0x280
4,1016,36963887350,-,caller=T191137; ? taskq_thread_create+0x200/0x200
4,1017,36963887364,-,caller=T191137; ? kthreadd+0x7a0/0x7a0
4,1018,36963887377,-,caller=T191137; ret_from_fork+0x1f/0x30
4,1019,36963887393,-,caller=T191137; </TASK>
4,1020,36963887402,-,caller=T191137;Modules linked in: netconsole fuse intel_xhci_usb_role_switch iwlmvm snd_hda_codec_hdmi mac80211 libarc4 snd_ctl_led snd_hda_codec_realtek iwlwifi snd_hda_codec_generic intel_pmc_core_pltdrv intel_pmc_core coretemp snd_hda_intel snd_intel_dspcfg snd_hda_codec cfg80211 snd_hda_core input_leds ucsi_acpi think_lmi intel_wmi_thunderbolt iosm wmi_bmof firmware_attributes_class ee1004 wwan typec_ucsi typec roles ext4 mbcache crc16 jbd2
4,1021,36963887705,-,caller=T191137;---[ end trace 0000000000000000 ]---
4,1022,36964186735,-,caller=T191137;RIP: 0010:hasher_push_cv+0x12c/0x240
4,1023,36964186759,-,caller=T191137;Code: 89 8c 24 98 00 00 00 48 8b 4b 10 48 89 8c 24 90 00 00 00 48 8b 0b 48 8b 53 08 48 89 94 24 88 00 00 00 48 89 8c 24 80 00 00 00 <4c> 8b 18 4c 89 e7 4c 89 fe ba 40 00 00 00 31 c9 41 ff d3 66 90 8b
4,1024,36964186784,-,caller=T191137;RSP: 0018:ffffb2bd0b857590 EFLAGS: 00010206
4,1025,36964186804,-,caller=T191137;RAX: 7e34d12d31e056b2 RBX: ffff946e49cd5800 RCX: 78b8b2a6d494cbe5
4,1026,36964186810,-,caller=T191137;RDX: 8c755fa979d07c01 RSI: 0000000000000000 RDI: 0000000000000000
4,1027,36964186824,-,caller=T191137;RBP: ffffb2bd0b857670 R08: 000000000000003f R09: 0000000000000000
4,1028,36964186829,-,caller=T191137;R10: ffffb2bd0b857720 R11: ffffffffb32efa40 R12: ffffb2bd0b857610
4,1029,36964186843,-,caller=T191137;R13: 0000000000000000 R14: 0000000000000700 R15: ffffb2bd0b8575c8
4,1030,36964186848,-,caller=T191137;FS: 0000000000000000(0000) GS:ffff946faa880000(0000) knlGS:0000000000000000
4,1031,36964186866,-,caller=T191137;CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
4,1032,36964186872,-,caller=T191137;CR2: 00006c2ad5340100 CR3: 0000000422936002 CR4: 00000000003706e0
0,1033,36964186890,-,caller=T191137;Kernel panic - not syncing: Fatal exception
0,1034,36964186901,-,caller=T191137;Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
```
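(As an aside, the `level,seq,timestamp,flags,caller=TID;` prefixes above are just the kernel's extended printk record format as forwarded by netconsole; something like the following turns a capture into ordinary dmesg-style text. The function name is mine.)

```shell
# strip_printk_prefix: drop the "level,seq,timestamp,flags,caller=TID;"
# printk record prefix that netconsole forwards, leaving dmesg-style text.
# The sed expression simply cuts everything up to the first ';' on each line.
strip_printk_prefix() {
    sed 's/^[^;]*;//'
}

# Usage (trace.log being a hypothetical capture from the receiving host):
# strip_printk_prefix < trace.log > trace-clean.log
```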
Is this a sign of anything, like extra "corruption" somewhere? That I don't know, since, again, where it happens is essentially random. I also have no idea what might be causing it: whenever it happens, all I've been doing is what I've been doing the whole time, i.e. watching a random video stored on NFS. From now on I'll be capturing blktraces to the same NFS (now that I've figured out you can disable blktrace buffering) to see if I can find some patterns. Let me know if blktrace isn't the most appropriate tool, though.
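For reference, this is roughly the capture invocation I mean; the device, NFS path, & run length are placeholders, & the small `-b`/`-n` values are meant to keep as little as possible sitting in trace buffers if the box panics mid-run:

```shell
# capture_blktrace: sketch of a bounded blktrace run writing straight to an
# NFS-backed directory, with small buffers (-b = buffer size in KiB, -n =
# number of buffers) so little data is lost on a panic mid-capture.
# All arguments are placeholders.
capture_blktrace() {
    dev="$1"; outdir="$2"; secs="${3:-600}"
    blktrace -d "$dev" -D "$outdir" -w "$secs" -b 64 -n 2
}

# e.g. capture_blktrace /dev/nvme0n1 /mnt/nfs/blktraces 600
# & later, to render a run: blkparse -D /mnt/nfs/blktraces -i nvme0n1
```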
Also, like all the other symptoms, I've never seen this happen on any of my other machines, which run the same ZFS version on the same kernel version, do not suffer from any of the other symptoms of this bug, also use native encryption, & make much heavier use of their FS than this machine.
Just snapshotted my machine over to a virtual machine to see what a scrub would do, & by pure luck I can definitely note that scrubs are prone to triggering this bug, at least once it has already hit the pool. Random files that were readable right before the scrub become immutable & unreadable after it. Which I suppose is not surprising, given that it's highly unlikely to be real data corruption being detected (especially since the VM copy of the disk lives on completely different, known-good disks). The aforementioned stack trace makes me think it's a random bug in the checksum code that causes ZFS to lock me out of files by writing out bad checksums. But IDK, I'm not a ZFS developer; I'd certainly like to hear from one, though!
Also, scrubs are a great way to force stack traces like the one above to appear. Normally I can go several weeks of uptime without ever facing such a panic, which is just great (bugs that only appear once in a blue moon for no discernible reason are very fun). But during the scrub I did, I got 3 of them. I just rebooted the VM each time, & the scrub resumed itself automatically. I'm not gonna post the stack traces since they're all almost identical to the one I already posted.
I also used this opportunity to repeat this test (first rolling back to the initial snapshot before I booted it in a VM) with a kernel built with Clang (llvm-project commit bbfbec94b10699d473c106d85d5a48ff5d69e721, no LTO) instead of GCC 12.2.1, just to see if perhaps it was a compiler bug. Everything described above remains consistent, so that doesn't seem to be the case.