permanent errors after upgrading ZFS

Open clhedrick opened this issue 2 years ago • 0 comments

System information

Type                  Version/Name
Distribution Name     Ubuntu
Distribution Version  22.04
Kernel Version        5.15.0-43-generic
Architecture          x86_64
OpenZFS Version       zfs-2.1.4-0ubuntu0.1

Describe the problem you're observing

After upgrading from Ubuntu 20 to 22, zpool status shows 143 permanent errors. I've never had an issue with devices. No device errors were shown then or after a scrub.

This is a backup system. I back up to it by send | receive. Originally one of the systems backed up was encrypted. After a crash I reconstructed it unencrypted, but I didn't reconstruct the backup system, as it had no errors. I did create unencrypted versions of the file systems on the backup system, but kept some of the encrypted ones around. They caused no problems under Ubuntu 20. But under 22, I got failures to mount, and 143 permanent errors. zpool status -v showed file names that were all in encrypted file systems.

I destroyed the encrypted file systems and then ran a scrub. Now I've got 2 permanent errors: <0x1c336>:<0x0> <0x2b49c>:<0x0>

Based on other reports I'll do a second scrub this weekend.

Note that the root file system is encrypted. It has no data, not even mount points. It's not mounted, although it will mount.

It would be useful to be able to clear the errors. We have monitoring scripts that check for problems with our ZFS file systems. This shows as a problem. We can ignore it, but that would hide any new errors that might occur.
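One way a monitoring script could keep alerting on new errors while tolerating the ones already known is to compare the entries listed by zpool status -v against an acknowledged baseline. The following is only a minimal sketch, assuming the output format quoted in this thread ("errors: Permanent errors have been detected in the following files:" followed by indented entries). The function names and the baseline set are hypothetical, and the parsing has not been checked against every OpenZFS version.

```python
import re

# Header line that precedes the list of affected objects in `zpool status -v`.
_HEADER = re.compile(r"^errors: Permanent errors have been detected")


def permanent_errors(status_text):
    """Return the entries listed under the 'Permanent errors' header."""
    entries = []
    in_list = False
    for line in status_text.splitlines():
        if _HEADER.match(line):
            in_list = True
            continue
        stripped = line.strip()
        # Assumption: a new "pool:" line starts the next pool's report.
        if stripped.startswith("pool:"):
            in_list = False
            continue
        if in_list and stripped:
            entries.append(stripped)
    return entries


def new_errors(status_text, known):
    """Return entries that are not in the acknowledged baseline (hypothetical helper)."""
    return [entry for entry in permanent_errors(status_text) if entry not in known]


# Sample text built from the two entries reported above.
SAMPLE = """\
errors: Permanent errors have been detected in the following files:

        <0x1c336>:<0x0>
        <0x2b49c>:<0x0>
"""

if __name__ == "__main__":
    baseline = {"<0x1c336>:<0x0>", "<0x2b49c>:<0x0>"}
    print(new_errors(SAMPLE, baseline))
```

In a real check, the text would come from running zpool status -v (for example through subprocess.run with capture_output=True and text=True), and the script would alert only when new_errors returns a non-empty list.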

Describe how to reproduce the problem

Include any warning/errors/backtraces from the system logs

clhedrick avatar Aug 10 '22 13:08 clhedrick

I have similar issues... I could clear the errors by running a scrub twice. The scrubs did not have to complete to 100%: canceling them shortly after starting did the trick.
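The two-scrub workaround could be scripted roughly as follows. This is a sketch under assumptions, not a verified fix: the pool name "tank" is hypothetical, the helper names are made up for illustration, and it waits for each scrub to finish with zpool wait -t scrub (available in OpenZFS 2.0 and later) rather than canceling early as described above.

```python
import subprocess


def double_scrub_commands(pool):
    """Build the command sequence: scrub, wait, scrub again, wait, show status."""
    return [
        ["zpool", "scrub", pool],
        ["zpool", "wait", "-t", "scrub", pool],
        ["zpool", "scrub", pool],
        ["zpool", "wait", "-t", "scrub", pool],
        ["zpool", "status", "-v", pool],
    ]


def run_double_scrub(pool):
    """Run the sequence; requires root and an imported pool (not executed here)."""
    for argv in double_scrub_commands(pool):
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # "tank" is a placeholder pool name; only print the commands.
    for argv in double_scrub_commands("tank"):
        print(" ".join(argv))
```

Building the argument lists separately from running them makes it possible to review or log the commands before anything touches the pool.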

I created a new encrypted dataset under zfs 2.1.5 (syncoid) and destroyed the former encrypted dataset (initially created under zfs 0.8.3 with syncoid)

So far so good :-)... no permanent errors anymore

ofthesun9 avatar Aug 16 '22 14:08 ofthesun9

Same problems here after upgrade from ubuntu 20.04 to 22.04

versus167 avatar Aug 21 '22 17:08 versus167

A new flavour of this problem: on another machine, after the update, I got hangs and these log entries:

Aug 21 20:44:08 backup kernel: [ 4841.628971] VERIFY3(0 == zap_add(mos, dsl_dir_phys(pds)->dd_child_dir_zapobj, name, sizeof (uint64_t), 1, &ddobj, tx)) failed (0 == 17)
Aug 21 20:44:08 backup kernel: [ 4841.629271] PANIC at dsl_dir.c:951:dsl_dir_create_sync()
Aug 21 20:44:08 backup kernel: [ 4841.629338] Showing stack for process 675
Aug 21 20:44:08 backup kernel: [ 4841.629340] CPU: 0 PID: 675 Comm: txg_sync Tainted: P O 5.15.0-46-generic #49-Ubuntu
Aug 21 20:44:08 backup kernel: [ 4841.629344] Hardware name: Gigabyte Technology Co., Ltd. GA-A55M-S2V/GA-A55M-S2V, BIOS F6 11/18/2011
Aug 21 20:44:08 backup kernel: [ 4841.629346] Call Trace:
Aug 21 20:44:08 backup kernel: [ 4841.629349] <TASK>
Aug 21 20:44:08 backup kernel: [ 4841.629352] show_stack+0x52/0x5c
Aug 21 20:44:08 backup kernel: [ 4841.629357] dump_stack_lvl+0x4a/0x63
Aug 21 20:44:08 backup kernel: [ 4841.629363] dump_stack+0x10/0x16
Aug 21 20:44:08 backup kernel: [ 4841.629366] spl_dumpstack+0x29/0x2f [spl]
Aug 21 20:44:08 backup kernel: [ 4841.629382] spl_panic+0xd1/0xe9 [spl]
Aug 21 20:44:08 backup kernel: [ 4841.629394] ? dmu_buf_rele+0xe/0x20 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.629598] ? zap_unlockdir+0x46/0x60 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.629777] ? zap_add_impl+0x96/0x160 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.629957] ? zap_add+0x7b/0xb0 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.630138] dsl_dir_create_sync+0x1ff/0x280 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.630306] ? spl_kmem_free_impl+0x29/0x40 [spl]
Aug 21 20:44:08 backup kernel: [ 4841.630319] dsl_dataset_create_sync+0x52/0x380 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.630498] dmu_recv_begin_sync+0x374/0xa00 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.630659] ? spa_get_slop_space+0x6e/0xc0 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.630833] ? __cond_resched+0x1a/0x50
Aug 21 20:44:08 backup kernel: [ 4841.630838] dsl_sync_task_sync+0xb9/0x110 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.631010] dsl_pool_sync+0x369/0x400 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.631177] spa_sync_iterate_to_convergence+0xe0/0x1f0 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.631353] spa_sync+0x2dc/0x5b0 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.631526] txg_sync_thread+0x266/0x2f0 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.631703] ? txg_dispatch_callbacks+0x100/0x100 [zfs]
Aug 21 20:44:08 backup kernel: [ 4841.631883] thread_generic_wrapper+0x64/0x80 [spl]
Aug 21 20:44:08 backup kernel: [ 4841.631896] ? __thread_exit+0x20/0x20 [spl]
Aug 21 20:44:08 backup kernel: [ 4841.631907] kthread+0x12a/0x150
Aug 21 20:44:08 backup kernel: [ 4841.631912] ? set_kthread_struct+0x50/0x50
Aug 21 20:44:08 backup kernel: [ 4841.631914] ret_from_fork+0x22/0x30
Aug 21 20:44:08 backup kernel: [ 4841.631919] </TASK>
Aug 21 20:48:03 backup kernel: [ 5076.258637] INFO: task txg_sync:675 blocked for more than 120 seconds.
Aug 21 20:48:03 backup kernel: [ 5076.258829] Tainted: P O 5.15.0-46-generic #49-Ubuntu
Aug 21 20:48:03 backup kernel: [ 5076.259007] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 21 20:48:03 backup kernel: [ 5076.259149] task:txg_sync state:D stack: 0 pid: 675 ppid: 2 flags:0x00004000
Aug 21 20:48:03 backup kernel: [ 5076.259163] Call Trace:
Aug 21 20:48:03 backup kernel: [ 5076.259169] <TASK>
Aug 21 20:48:03 backup kernel: [ 5076.259176] __schedule+0x23d/0x590
Aug 21 20:48:03 backup kernel: [ 5076.259197] schedule+0x4e/0xc0
Aug 21 20:48:03 backup kernel: [ 5076.259206] spl_panic+0xe7/0xe9 [spl]
Aug 21 20:48:03 backup kernel: [ 5076.259254] ? dmu_buf_rele+0xe/0x20 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.259710] ? zap_unlockdir+0x46/0x60 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.260216] ? zap_add_impl+0x96/0x160 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.260722] ? zap_add+0x7b/0xb0 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.261229] dsl_dir_create_sync+0x1ff/0x280 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.261690] ? spl_kmem_free_impl+0x29/0x40 [spl]
Aug 21 20:48:03 backup kernel: [ 5076.261728] dsl_dataset_create_sync+0x52/0x380 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.262192] dmu_recv_begin_sync+0x374/0xa00 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.262696] ? spa_get_slop_space+0x6e/0xc0 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.263289] ? __cond_resched+0x1a/0x50
Aug 21 20:48:03 backup kernel: [ 5076.263303] dsl_sync_task_sync+0xb9/0x110 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.263773] dsl_pool_sync+0x369/0x400 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.264239] spa_sync_iterate_to_convergence+0xe0/0x1f0 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.264726] spa_sync+0x2dc/0x5b0 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.265213] txg_sync_thread+0x266/0x2f0 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.265712] ? txg_dispatch_callbacks+0x100/0x100 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.266207] thread_generic_wrapper+0x64/0x80 [spl]
Aug 21 20:48:03 backup kernel: [ 5076.266246] ? __thread_exit+0x20/0x20 [spl]
Aug 21 20:48:03 backup kernel: [ 5076.266284] kthread+0x12a/0x150
Aug 21 20:48:03 backup kernel: [ 5076.266295] ? set_kthread_struct+0x50/0x50
Aug 21 20:48:03 backup kernel: [ 5076.266305] ret_from_fork+0x22/0x30
Aug 21 20:48:03 backup kernel: [ 5076.266318] </TASK>
Aug 21 20:48:03 backup kernel: [ 5076.266351] INFO: task zfs:1782 blocked for more than 120 seconds.
Aug 21 20:48:03 backup kernel: [ 5076.266561] Tainted: P O 5.15.0-46-generic #49-Ubuntu
Aug 21 20:48:03 backup kernel: [ 5076.266714] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 21 20:48:03 backup kernel: [ 5076.266857] task:zfs state:D stack: 0 pid: 1782 ppid: 1781 flags:0x00004002
Aug 21 20:48:03 backup kernel: [ 5076.266870] Call Trace:
Aug 21 20:48:03 backup kernel: [ 5076.266874] <TASK>
Aug 21 20:48:03 backup kernel: [ 5076.266878] __schedule+0x23d/0x590
Aug 21 20:48:03 backup kernel: [ 5076.266887] ? autoremove_wake_function+0x12/0x40
Aug 21 20:48:03 backup kernel: [ 5076.266897] schedule+0x4e/0xc0
Aug 21 20:48:03 backup kernel: [ 5076.266905] io_schedule+0x46/0x80
Aug 21 20:48:03 backup kernel: [ 5076.266913] cv_wait_common+0xab/0x130 [spl]
Aug 21 20:48:03 backup kernel: [ 5076.266953] ? wait_woken+0x70/0x70
Aug 21 20:48:03 backup kernel: [ 5076.266962] __cv_wait_io+0x18/0x20 [spl]
Aug 21 20:48:03 backup kernel: [ 5076.267002] txg_wait_synced_impl+0x9b/0x120 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.267520] txg_wait_synced+0x10/0x50 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.268016] dsl_sync_task_common+0x1c6/0x2a0 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.268486] ? recv_begin_check_existing_impl+0x590/0x590 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.268924] ? recv_check_large_blocks+0x60/0x60 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.269365] ? recv_begin_check_existing_impl+0x590/0x590 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.269804] ? recv_check_large_blocks+0x60/0x60 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.270242] dsl_sync_task+0x1a/0x20 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.270754] dmu_recv_begin+0x1e2/0x390 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.271292] zfs_ioc_recv_impl.constprop.0+0x106/0xb20 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.271898] zfs_ioc_recv_new+0x310/0x3b0 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.272498] ? spl_kmem_alloc_impl+0xbe/0xd0 [spl]
Aug 21 20:48:03 backup kernel: [ 5076.272542] ? spl_vmem_alloc+0x19/0x20 [spl]
Aug 21 20:48:03 backup kernel: [ 5076.272586] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Aug 21 20:48:03 backup kernel: [ 5076.272629] ? nv_mem_zalloc+0x33/0x50 [znvpair]
Aug 21 20:48:03 backup kernel: [ 5076.272668] ? nvlist_xalloc+0x51/0xa0 [znvpair]
Aug 21 20:48:03 backup kernel: [ 5076.272707] ? nvlist_alloc+0x28/0x40 [znvpair]
Aug 21 20:48:03 backup kernel: [ 5076.272747] zfsdev_ioctl_common+0x285/0x740 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.273270] ? _copy_from_user+0x2e/0x70
Aug 21 20:48:03 backup kernel: [ 5076.273281] zfsdev_ioctl+0x57/0xf0 [zfs]
Aug 21 20:48:03 backup kernel: [ 5076.273790] __x64_sys_ioctl+0x95/0xd0
Aug 21 20:48:03 backup kernel: [ 5076.273803] do_syscall_64+0x5c/0xc0
Aug 21 20:48:03 backup kernel: [ 5076.273812] ? do_user_addr_fault+0x1e7/0x670
Aug 21 20:48:03 backup kernel: [ 5076.273821] ? do_syscall_64+0x69/0xc0
Aug 21 20:48:03 backup kernel: [ 5076.273828] ? exit_to_user_mode_prepare+0x37/0xb0
Aug 21 20:48:03 backup kernel: [ 5076.273838] ? irqentry_exit_to_user_mode+0x9/0x20
Aug 21 20:48:03 backup kernel: [ 5076.273847] ? irqentry_exit+0x1d/0x30
Aug 21 20:48:03 backup kernel: [ 5076.273856] ? exc_page_fault+0x89/0x170
Aug 21 20:48:03 backup kernel: [ 5076.273865] entry_SYSCALL_64_after_hwframe+0x61/0xcb
Aug 21 20:48:03 backup kernel: [ 5076.273876] RIP: 0033:0x7faa82a99aff
Aug 21 20:48:03 backup kernel: [ 5076.273884] RSP: 002b:00007ffcd73c4bb0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Aug 21 20:48:03 backup kernel: [ 5076.273893] RAX: ffffffffffffffda RBX: 00007ffcd73c8280 RCX: 00007faa82a99aff
Aug 21 20:48:03 backup kernel: [ 5076.273899] RDX: 00007ffcd73c4c30 RSI: 0000000000005a46 RDI: 0000000000000005
Aug 21 20:48:03 backup kernel: [ 5076.273904] RBP: 00007ffcd73c8220 R08: 0000000000000000 R09: 0000555b46c32d70
Aug 21 20:48:03 backup kernel: [ 5076.273909] R10: 00007faa82b98da0 R11: 0000000000000246 R12: 0000000000005a46
Aug 21 20:48:03 backup kernel: [ 5076.273914] R13: 00007ffcd73c4c30 R14: 0000000000005a46 R15: 0000555b46c0f7a0
Aug 21 20:48:03 backup kernel: [ 5076.273923] </TASK>

versus167 avatar Aug 21 '22 19:08 versus167

Are you sure this is the same problem?

clhedrick avatar Aug 22 '22 12:08 clhedrick

In the meantime, I no longer believe that. The only connection is that the upgrade from Ubuntu 20 to 22 took place just on Saturday and there were problems with zfs send/receive. But no more errors are reported, so it is probably not the same problem.

versus167 avatar Aug 22 '22 12:08 versus167

If you're talking about send / receive of encrypted data, those could be known issues that were also present in 20.04. It's unsafe to send or receive from or into an encrypted file system. It's unclear whether it is safe to use encryption without using send / receive. I'm currently skeptical.

clhedrick avatar Aug 22 '22 14:08 clhedrick

I'm talking about "send from unencrypted to encrypted dataset". I have used this setup for about a year now without any problems. And now, after the upgrade, I run into this problem...

versus167 avatar Aug 22 '22 14:08 versus167

I just had the same experience as @clhedrick - I upgraded to Ubuntu 22, and lost a few ZFS datasets in my ZFS pool! Towards the end of the "zpool status -v" output I get:

errors: Permanent errors have been detected in the following files:

        IWPro/home:<0x0>

(and a couple of other datasets)... My pool contains both encrypted and unencrypted datasets. Only encrypted datasets are affected, but not all of them. The encrypted datasets affected ARE datasets I also "zfs send" to an offsite ZFS pool (datasets are encrypted at both ends, so perhaps somewhat similar to @versus167), but I don't really see how a simple "zfs send" is able to corrupt the datasets...

In my case, the root (/) filesystem is OK, whereas e.g. the /home/ filesystem is not, although both reside in the same pool. I suspect this could be the same issue as #13709.

jonryk avatar Aug 30 '22 09:08 jonryk