Incrementally receiving an update to a zvol with key inheritance results in an inaccessible device (while keystatus=available); deadlock when change-key is attempted
### System information
| Type | Version/Name |
|---|---|
| Distribution Name | Arch Linux |
| Distribution Version | Rolling (up to date as of this report) |
| Kernel Version | 6.12.31-1-lts |
| Architecture | x86_64 |
| OpenZFS Version | 2.3.2 |
### Describe the problem you're observing
An incrementally received, natively encrypted zvol with key inheritance set on the receiver cannot be accessed after the first incremental receive.
Creating and receiving an initial zvol "theZvol" (full path: localZpool/data/theZvol@initial) works. The zvol inherits its key from the parent localZpool/data on the source machine, and `zfs load-key` can be used on the remote machine to begin using it.
Using `zfs change-key -i remoteZpool/data/theZvol` on the remote machine works too (and I assume it would unlock theZvol automatically in the future when the parent, which happens to use the same passphrase, is unlocked).
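For reference (this verification step is hypothetical and wasn't part of my original session), the effect of `change-key -i` should be checkable with `zfs get`:

```sh
# Assumed verification (not from my original session): after `zfs change-key -i`,
# the zvol should report the parent as its encryption root, with the key loaded.
zfs get -o name,property,value encryptionroot,keystatus remoteZpool/data/theZvol
# Expected output (assumption):
# remoteZpool/data/theZvol  encryptionroot  remoteZpool/data
# remoteZpool/data/theZvol  keystatus       available
```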
Taking another snapshot on the source (localZpool/data/theZvol@secondSnap) and incrementally sending it to the remote also works. But trying to access theZvol's block device on the remote now throws a permission denied error:
```
$ cat /dev/zvol/remoteZpool/data/theZvol
cat: /dev/zvol/remoteZpool/data/theZvol: Permission denied
$ fdisk -l /dev/zvol/remoteZpool/data/theZvol
fdisk: cannot open /dev/zvol/remoteZpool/data/theZvol: Permission denied
```
Its keystatus still says available on the remote machine.
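A sanity check one could run here (my addition, hypothetical): reading the zvol as root should rule out ordinary device-node permissions. If the EACCES persists, the error is coming from the ZFS layer itself.

```sh
# Assumed sanity check (not from my original session): read the zvol as root.
# If this still fails with "Permission denied", the EACCES is produced by ZFS,
# not by the Unix permissions on the device node.
sudo dd if=/dev/zvol/remoteZpool/data/theZvol of=/dev/null bs=4k count=1
```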
Trying to load or unload the key throws errors which don't make sense in this context:
```
$ sudo zfs load-key remoteZpool/data/theZvol
Key load error: Keys must be loaded for encryption root of 'remoteZpool/data/theZvol' (remoteZpool/data)
$ zfs unload-key remoteZpool/data/theZvol
Key unload error: Keys must be unloaded for encryption root of 'remoteZpool/data/theZvol' (remoteZpool/data).
```
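These messages suggest ZFS now treats remoteZpool/data as the encryption root, consistent with the earlier `change-key -i`. A recursive property dump (my addition, hypothetical) would show the crypto state of the whole subtree:

```sh
# Assumed diagnostic (not from my original session): dump the crypto-related
# properties of every dataset under remoteZpool/data to see which dataset
# ZFS considers the encryption root.
zfs get -r -t filesystem,volume encryption,encryptionroot,keystatus,keylocation remoteZpool/data
```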
Unmounting remoteZpool/data and all child datasets, then trying to unload its own key, throws:
```
$ umount /data   # remoteZpool/data mountpoint
$ zfs unload-key remoteZpool/data
Key unload error: 'remoteZpool/data' is busy.
```
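My assumption about the "busy" error: a dataset's key typically cannot be unloaded while descendants still have keys loaded or while the zvol's device node is held open. Both can be checked (hypothetical commands, not from my original session):

```sh
# Assumed diagnostic (not from my original session): look for loaded child
# keys, and for processes holding the zvol device node open.
zfs list -r -o name,keystatus remoteZpool/data
sudo fuser -v /dev/zvol/remoteZpool/data/theZvol
```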
Desperately trying change-key on the remote causes a lockup and produces some output in dmesg:
```
$ zfs change-key remoteZpool/data/theZvol -o keyformat=passphrase
Enter new passphrase for 'remoteZpool/data/theZvol':
Re-enter new passphrase for 'remoteZpool/data/theZvol':
<Hung>
```
See further below for the dmesg logs; the VERIFY0 failure there reports error 13 (EACCES), matching the permission errors above.
At this point, running any zfs/zpool command, or interacting with this specific zvol's device file, hangs the process.
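If it helps triage (my addition, a generic technique rather than something I ran in the original session), the stacks of the blocked tasks can be dumped to the kernel log via sysrq:

```sh
# Assumed debugging step (not from my original session): dump kernel stacks
# of all blocked (state D) tasks, then read them back from the kernel log.
# Requires sysrq to be enabled (kernel.sysrq).
echo w | sudo tee /proc/sysrq-trigger
sudo dmesg | tail -n 200
```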
### Describe how to reproduce the problem
```
# On source
zfs create -s -V 50G localZpool/data/theZvol
dd if=/dev/urandom of=/dev/zvol/localZpool/data/theZvol bs=1M count=100 status=progress
zfs snapshot localZpool/data/theZvol@initial

# On remote
syncoid --compress=none --sendoptions="pw" --recvoptions="u" localServer:localZpool/data/theZvol remoteServer:remoteZpool/data/theZvol
sudo zfs load-key remoteZpool/data/theZvol      # Works, accessible
sudo zfs change-key -i remoteZpool/data/theZvol # Works, for future unlocks

# On source (again)
dd if=/dev/urandom of=/dev/zvol/localZpool/data/theZvol bs=1M count=100 status=progress
zfs snapshot localZpool/data/theZvol@secondSnap

# On remote (again)
syncoid --compress=none --sendoptions="pw" --recvoptions="u" localServer:localZpool/data/theZvol remoteServer:remoteZpool/data/theZvol
```
The key is still loaded on the remote, but theZvol's block device now returns permission denied.
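One more check worth running here (my addition, hypothetical): confirming that the incremental stream itself arrived, so the failure is in the crypto state rather than in replication.

```sh
# Assumed verification (not from my original session): both snapshots should
# be present on the receiver after the second syncoid run.
zfs list -t snapshot -r remoteZpool/data/theZvol
```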
### Include any warning/errors/backtraces from the system logs
```
[ +5.185250] VERIFY0(spa_keystore_dsl_key_hold_dd(dp->dp_spa, dd, FTAG, &dck)) failed (0 == 13)
[ +0.000004] PANIC at dsl_crypt.c:1490:spa_keystore_change_key_sync_impl()
[ +0.000002] Showing stack for process 996
[ +0.000002] CPU: 4 UID: 0 PID: 996 Comm: txg_sync Tainted: P U OE 6.12.31-1-lts #1 e7c67df8a8ca221e76be609de09b679bff2a455a
[ +0.000003] Tainted: [P]=PROPRIETARY_MODULE, [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ +0.000001] Hardware name: Dell Inc. Latitude 5430/03G0RF, BIOS 1.15.0 07/12/2023
[ +0.000001] Call Trace:
[ +0.000002] <TASK>
[ +0.000002] dump_stack_lvl+0x5d/0x80
[ +0.000008] spl_panic+0xf6/0x10d [spl c5fb4eee9ec0cc5fd64adb47887293ac91c1d6d0]
[ +0.000011] ? spa_keystore_dsl_key_hold_dd.isra.0+0xec/0x210 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000183] ? dsl_crypto_key_open.constprop.0+0x18e/0x340 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000118] spa_keystore_change_key_sync_impl+0x428/0x440 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000119] spa_keystore_change_key_sync+0x251/0x4a0 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000116] dsl_sync_task_sync+0xa8/0xf0 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000137] dsl_pool_sync+0x411/0x520 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000134] spa_sync+0x587/0x1080 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000136] ? spa_txg_history_init_io+0x10f/0x120 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000127] txg_sync_thread+0x20b/0x3b0 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000123] ? __pfx_txg_sync_thread+0x10/0x10 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000118] ? __pfx_thread_generic_wrapper+0x10/0x10 [spl c5fb4eee9ec0cc5fd64adb47887293ac91c1d6d0]
[ +0.000007] thread_generic_wrapper+0x5a/0x70 [spl c5fb4eee9ec0cc5fd64adb47887293ac91c1d6d0]
[ +0.000006] kthread+0xcf/0x100
[ +0.000003] ? __pfx_kthread+0x10/0x10
[ +0.000001] ret_from_fork+0x31/0x50
[ +0.000002] ? __pfx_kthread+0x10/0x10
[ +0.000001] ret_from_fork_asm+0x1a/0x30
[ +0.000004] </TASK>
[ +33.864393] INFO: task txg_sync:996 blocked for more than 122 seconds.
[ +0.000005] Tainted: P U OE 6.12.31-1-lts #1
[ +0.000002] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ +0.000001] task:txg_sync state:D stack:0 pid:996 tgid:996 ppid:2 flags:0x00004000
[ +0.000003] Call Trace:
[ +0.000002] <TASK>
[ +0.000005] __schedule+0x3c7/0x12f0
[ +0.000008] schedule+0x27/0xf0
[ +0.000003] spl_panic+0x10b/0x10d [spl c5fb4eee9ec0cc5fd64adb47887293ac91c1d6d0]
[ +0.000010] ? spa_keystore_dsl_key_hold_dd.isra.0+0xec/0x210 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000153] ? dsl_crypto_key_open.constprop.0+0x18e/0x340 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000113] spa_keystore_change_key_sync_impl+0x428/0x440 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000320] spa_keystore_change_key_sync+0x251/0x4a0 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000345] dsl_sync_task_sync+0xa8/0xf0 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000414] dsl_pool_sync+0x411/0x520 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000366] spa_sync+0x587/0x1080 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000370] ? spa_txg_history_init_io+0x10f/0x120 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000333] txg_sync_thread+0x20b/0x3b0 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000311] ? __pfx_txg_sync_thread+0x10/0x10 [zfs f514231b8142b437743036d71e8a8110bf81d3ff]
[ +0.000301] ? __pfx_thread_generic_wrapper+0x10/0x10 [spl c5fb4eee9ec0cc5fd64adb47887293ac91c1d6d0]
[ +0.000018] thread_generic_wrapper+0x5a/0x70 [spl c5fb4eee9ec0cc5fd64adb47887293ac91c1d6d0]
[ +0.000015] kthread+0xcf/0x100
[ +0.000006] ? __pfx_kthread+0x10/0x10
[ +0.000004] ret_from_fork+0x31/0x50
[ +0.000006] ? __pfx_kthread+0x10/0x10
[ +0.000003] ret_from_fork_asm+0x1a/0x30
[ +0.000008] </TASK>
```