ZFS hangs on kernel error: VERIFY3(0 == dmu_bonus_hold_by_dnode
System information
| Type | Version/Name |
|---|---|
| Distribution Name | Debian |
| Distribution Version | Buster |
| Linux Kernel | 5.10.0-0.bpo.4-amd64 / 4.19.0-16 |
| Architecture | amd64 |
| ZFS Version | 2.0.3-1~bpo10+1 |
| SPL Version | 2.0.3-1~bpo10+1 |
Zpool properties:
```
-o ashift=12
-O checksum=sha256
-O compression=lz4
-O atime=off
-O normalization=formD
-O xattr=sa
-O relatime=on
```
Example test machine properties:
- CPU model name: AMD Ryzen Threadripper 2990WX 32-Core Processor
- 6 HDDs
- Motherboard: Gigabyte Technology Co., Ltd. X399 AORUS PRO/X399 AORUS PRO-CF, BIOS F2g 05/08/2019
Describe the problem you're observing
During an incremental send using `zfs send -I | zfs receive -F`, ZFS hangs. Any subsequent call to ZFS, or e.g. opening any file on the ZFS filesystem, hangs indefinitely.
Note that we also experienced: https://github.com/openzfs/zfs/issues/12014 - perhaps it is related.
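For clarity, the failing replication step is essentially an incremental stream piped into a forced receive. A minimal sketch of such a pipeline follows; the dataset and snapshot names are placeholders, not our actual layout:

```sh
# Hypothetical dataset/snapshot names; the real jobs replicate encrypted
# datasets from the source machines to the backup pool "data".
SRC=tank/data                   # source dataset (placeholder)
DST=data/backups/host1          # destination on the backup pool (placeholder)
PREV=daily-2021-05-03           # last snapshot already on the destination
CUR=daily-2021-05-04            # newest snapshot to transfer

# Incremental stream of all snapshots between PREV and CUR, received with -F so
# the destination is rolled back to its most recent snapshot before receiving.
zfs send -I "${SRC}@${PREV}" "${SRC}@${CUR}" | zfs receive -F "${DST}"
```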
Describe how to reproduce the problem
This happens frequently for us, on multiple machines, all using HDDs. The machines act as backup targets, so most activity is receiving snapshots all day. Other ZFS-related activity on the machines: snapshots are taken and pruned multiple times per day, and a scrub runs once a week. We use encrypted filesystems and a single zpool. Historically the error has triggered for us between 1 and 48 hours.
Total data usage is 2.12T, with `zpool iostat` output showing:
```
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
data        4.37T  10.2T     17    172   967K  5.46M
```
Current mitigation
Downgrading to ZFS version 0.8.4-1~bpo10+1 seems to have resolved the issue - at the time of writing we have had about a ~~week~~ month without ZFS hanging.
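For anyone wanting to try the same mitigation on Buster, a rough sketch of how the downgrade and pin could be done; the package list is illustrative, so adjust it to whatever ZFS packages are actually installed:

```sh
# Downgrade the ZFS packages to the buster-backports 0.8.4 build and hold them
# so a later upgrade does not pull 2.0.x back in (illustrative package list).
sudo apt install --allow-downgrades \
    zfs-dkms=0.8.4-1~bpo10+1 zfsutils-linux=0.8.4-1~bpo10+1 \
    libzfs2linux=0.8.4-1~bpo10+1 libzpool2linux=0.8.4-1~bpo10+1
sudo apt-mark hold zfs-dkms zfsutils-linux libzfs2linux libzpool2linux
```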
Include any warning/errors/backtraces from the system logs
The first error
```
May 4 17:11:31 <host> kernel: [96617.381743] VERIFY3(0 == dmu_bonus_hold_by_dnode(dn, FTAG, &db, flags)) failed (0 == 5)
May 4 17:11:31 <host> kernel: [96617.381882] PANIC at dmu_recv.c:1811:receive_object()
May 4 17:11:31 <host> kernel: [96617.381990] Showing stack for process 97766
May 4 17:11:31 <host> kernel: [96617.382002] CPU: 55 PID: 97766 Comm: receive_writer Tainted: P OE 4.19.0-16-amd64 #1 Debian 4.19.181-1
May 4 17:11:31 <host> kernel: [96617.382006] Hardware name: Gigabyte Technology Co., Ltd. X399 AORUS PRO/X399 AORUS PRO-CF, BIOS F2g 05/08/2019
May 4 17:11:31 <host> kernel: [96617.382008] Call Trace:
May 4 17:11:31 <host> kernel: [96617.382058] dump_stack+0x66/0x81
May 4 17:11:31 <host> kernel: [96617.382098] spl_panic+0xd3/0xfb [spl]
May 4 17:11:31 <host> kernel: [96617.382454] ? dbuf_rele_and_unlock+0x30f/0x660 [zfs]
May 4 17:11:31 <host> kernel: [96617.382562] ? dbuf_read+0x1ca/0x520 [zfs]
May 4 17:11:31 <host> kernel: [96617.382642] ? dnode_hold_impl+0x350/0xc20 [zfs]
May 4 17:11:31 <host> kernel: [96617.382652] ? spl_kmem_cache_free+0xec/0x1c0 [spl]
May 4 17:11:31 <host> kernel: [96617.382728] receive_object+0x923/0xc70 [zfs]
May 4 17:11:31 <host> kernel: [96617.382803] receive_writer_thread+0x279/0xa00 [zfs]
May 4 17:11:31 <host> kernel: [96617.382817] ? set_curr_task_fair+0x26/0x50
May 4 17:11:31 <host> kernel: [96617.382894] ? receive_process_write_record+0x190/0x190 [zfs]
May 4 17:11:31 <host> kernel: [96617.382901] ? __thread_exit+0x20/0x20 [spl]
May 4 17:11:31 <host> kernel: [96617.382905] ? thread_generic_wrapper+0x6f/0x80 [spl]
May 4 17:11:31 <host> kernel: [96617.382910] thread_generic_wrapper+0x6f/0x80 [spl]
May 4 17:11:31 <host> kernel: [96617.382920] kthread+0x112/0x130
May 4 17:11:31 <host> kernel: [96617.382926] ? kthread_bind+0x30/0x30
May 4 17:11:31 <host> kernel: [96617.382935] ret_from_fork+0x22/0x40
```
Subsequent errors
```
May 4 17:16:05 <host> kernel: [96890.895937] hrtimer: interrupt took 74060 ns
May 4 17:16:20 <host> kernel: [96906.140617] INFO: task txg_quiesce:1694 blocked for more than 120 seconds.
May 4 17:16:20 <host> kernel: [96906.140714] Tainted: P OE 4.19.0-16-amd64 #1 Debian 4.19.181-1
May 4 17:16:20 <host> kernel: [96906.140799] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 4 17:16:20 <host> kernel: [96906.140884] txg_quiesce D 0 1694 2 0x80000000
May 4 17:16:20 <host> kernel: [96906.140889] Call Trace:
May 4 17:16:20 <host> kernel: [96906.140903] __schedule+0x29f/0x840
May 4 17:16:20 <host> kernel: [96906.140908] schedule+0x28/0x80
May 4 17:16:20 <host> kernel: [96906.140924] cv_wait_common+0xfb/0x130 [spl]
May 4 17:16:20 <host> kernel: [96906.140935] ? finish_wait+0x80/0x80
May 4 17:16:20 <host> kernel: [96906.141077] txg_quiesce_thread+0x2a9/0x3a0 [zfs]
May 4 17:16:20 <host> kernel: [96906.141169] ? txg_sync_thread+0x480/0x480 [zfs]
May 4 17:16:20 <host> kernel: [96906.141176] ? __thread_exit+0x20/0x20 [spl]
May 4 17:16:20 <host> kernel: [96906.141182] thread_generic_wrapper+0x6f/0x80 [spl]
May 4 17:16:20 <host> kernel: [96906.141187] kthread+0x112/0x130
May 4 17:16:20 <host> kernel: [96906.141190] ? kthread_bind+0x30/0x30
May 4 17:16:20 <host> kernel: [96906.141193] ret_from_fork+0x22/0x40
May 4 17:16:20 <host> kernel: [96906.141783] INFO: task receive_writer:97766 blocked for more than 120 seconds.
May 4 17:16:20 <host> kernel: [96906.141865] Tainted: P OE 4.19.0-16-amd64 #1 Debian 4.19.181-1
May 4 17:16:20 <host> kernel: [96906.141963] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 4 17:16:20 <host> kernel: [96906.142065] receive_writer D 0 97766 2 0x80000000
May 4 17:16:20 <host> kernel: [96906.142068] Call Trace:
May 4 17:16:20 <host> kernel: [96906.142074] __schedule+0x29f/0x840
May 4 17:16:20 <host> kernel: [96906.142078] ? ret_from_fork+0x21/0x40
May 4 17:16:20 <host> kernel: [96906.142083] schedule+0x28/0x80
May 4 17:16:20 <host> kernel: [96906.142091] spl_panic+0xf9/0xfb [spl]
May 4 17:16:20 <host> kernel: [96906.142192] ? dbuf_rele_and_unlock+0x30f/0x660 [zfs]
May 4 17:16:20 <host> kernel: [96906.142239] ? dbuf_read+0x1ca/0x520 [zfs]
May 4 17:16:20 <host> kernel: [96906.142289] ? dnode_hold_impl+0x350/0xc20 [zfs]
May 4 17:16:20 <host> kernel: [96906.142295] ? spl_kmem_cache_free+0xec/0x1c0 [spl]
May 4 17:16:20 <host> kernel: [96906.142343] receive_object+0x923/0xc70 [zfs]
May 4 17:16:20 <host> kernel: [96906.142394] receive_writer_thread+0x279/0xa00 [zfs]
May 4 17:16:20 <host> kernel: [96906.142399] ? set_curr_task_fair+0x26/0x50
May 4 17:16:20 <host> kernel: [96906.142448] ? receive_process_write_record+0x190/0x190 [zfs]
May 4 17:16:20 <host> kernel: [96906.142455] ? __thread_exit+0x20/0x20 [spl]
May 4 17:16:20 <host> kernel: [96906.142460] ? thread_generic_wrapper+0x6f/0x80 [spl]
May 4 17:16:20 <host> kernel: [96906.142465] thread_generic_wrapper+0x6f/0x80 [spl]
May 4 17:16:20 <host> kernel: [96906.142469] kthread+0x112/0x130
May 4 17:16:20 <host> kernel: [96906.142471] ? kthread_bind+0x30/0x30
May 4 17:16:20 <host> kernel: [96906.142473] ret_from_fork+0x22/0x40
```
EDIT May 11th 2021: downgrading ZFS to 0.8.4-1~bpo10+1 resolved the issue.
EDIT June 14th 2021: the issue has not recurred on the downgraded ZFS version.
EDIT December 20th 2021: the issue is still present, also with zfs-2.0.3-9 on 5.10.0-8-amd64.
Same issue here, ran into this while copying many snapshots over from one pool to another. Stack trace is similar, but a little different on ZFS 2.1.1:
```
Nov 17 23:53:57 localhost kernel: VERIFY3(0 == dmu_bonus_hold_by_dnode(dn, FTAG, &db, flags)) failed (0 == 5)
Nov 17 23:53:57 localhost kernel: PANIC at dmu_recv.c:1798:receive_object()
Nov 17 23:53:57 localhost kernel: Showing stack for process 1363685
Nov 17 23:53:57 localhost kernel: CPU: 2 PID: 1363685 Comm: receive_writer Tainted: P OE 5.13.12-200.fc34.x86_64 #1
Nov 17 23:53:57 localhost kernel: Hardware name: Supermicro X10SLH-F/X10SLM+-F/X10SLH-F/X10SLM+-F, BIOS 3.2a 05/31/2019
Nov 17 23:53:57 localhost kernel: Call Trace:
Nov 17 23:53:57 localhost kernel: dump_stack+0x76/0x94
Nov 17 23:53:57 localhost kernel: spl_panic+0xd4/0xfc [spl]
Nov 17 23:53:57 localhost kernel: ? dbuf_rele_and_unlock+0x487/0x640 [zfs]
Nov 17 23:53:57 localhost kernel: ? dbuf_create+0x5c8/0x5f0 [zfs]
Nov 17 23:53:57 localhost kernel: ? __raw_callee_save___native_queued_spin_unlock+0x11/0x1e
Nov 17 23:53:57 localhost kernel: ? dnode_rele_and_unlock+0x64/0xc0 [zfs]
Nov 17 23:53:57 localhost kernel: receive_object+0xa59/0xc50 [zfs]
Nov 17 23:53:57 localhost kernel: receive_writer_thread+0x1ba/0xac0 [zfs]
Nov 17 23:53:57 localhost kernel: ? kfree+0x403/0x430
Nov 17 23:53:57 localhost kernel: ? thread_generic_wrapper+0x62/0x80 [spl]
Nov 17 23:53:57 localhost kernel: ? receive_read_payload_and_next_header+0x2f0/0x2f0 [zfs]
Nov 17 23:53:57 localhost kernel: ? thread_generic_wrapper+0x6f/0x80 [spl]
Nov 17 23:53:57 localhost kernel: thread_generic_wrapper+0x6f/0x80 [spl]
Nov 17 23:53:57 localhost kernel: ? __thread_exit+0x20/0x20 [spl]
Nov 17 23:53:57 localhost kernel: kthread+0x127/0x150
Nov 17 23:53:57 localhost kernel: ? set_kthread_struct+0x40/0x40
Nov 17 23:53:57 localhost kernel: ret_from_fork+0x22/0x30
```
It does seem random: after a reboot I re-ran the exact same `zfs send | zfs recv` (with the resume token) and it completed with no issues.
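For reference, resuming after the reboot looks roughly like the sketch below (pool/dataset names are placeholders); the interrupted receive leaves a receive_resume_token property on the target, which `zfs send -t` can continue from:

```sh
# Read the resume token left on the partially received destination (placeholder name).
TOKEN=$(zfs get -H -o value receive_resume_token newpool/dataset)

# Resume the interrupted stream; -s on the receiving side saves resumable state
# again in case this attempt is also interrupted.
zfs send -t "$TOKEN" | zfs receive -s newpool/dataset
```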
Any updates for this issue or #12785?
Some debugging output after a recent crash
The issue occurred again on January 6 at 14:33:34 - I base this on the default log output, which I think is in UTC+1. This would translate to timestamp 1641476014.
zfs_get_all.txt zfs_dbgmsg.txt zpool_history.txt
EDIT: logs were set to UTC + 1
One more set of logs of the same event, happening a day later at Jan 7 20:20:22 (1641583222):
dbgmsg_after_reboot.txt zpool_history.txt dbgmsg.txt
With the following `zpool status -v` output:
```
  pool: data
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 12:50:05 with 0 errors on Tue Jan 4 00:12:53 2022
config:

        NAME          STATE     READ WRITE CKSUM
        data          ONLINE       0     0     0
          raidz2-0    ONLINE       0     0     0
            sda       ONLINE       0     0     0
            sdb       ONLINE       0     0     0
            sdd       ONLINE       0     0     0
            sde       ONLINE       0     0     0
        spares
          sdc         AVAIL

errors: Permanent errors have been detected in the following files:

        data/private/private/private/private/private/%recv:<0x0>
```
I am pretty sure the panic happens because dbuf_read() returns an EIO error in the codepath: receive_object() -> dmu_bonus_hold_by_dnode() -> dbuf_read().
Anyone who can replicate the bug:
- set `echo 512 | sudo tee /sys/module/zfs/parameters/zfs_flags`
- trigger the bug
- post a `cat /proc/spl/kstat/zfs/dbgmsg` (a consolidated sketch of these steps follows below)
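A consolidated sketch of those steps, assuming the log is copied to a path that is not on the affected pool so it survives the hang (512 should correspond to ZFS_DEBUG_SET_ERROR, i.e. logging SET_ERROR call sites):

```sh
# Enable extra ZFS debug logging and enlarge the in-kernel ring buffer so the
# interesting entries are not rotated out before they can be captured.
echo 512 | sudo tee /sys/module/zfs/parameters/zfs_flags
echo $((128*1024*1024)) | sudo tee /sys/module/zfs/parameters/zfs_dbgmsg_maxsize

# ... let the usual zfs send | zfs receive jobs run until the panic triggers ...

# Immediately after the panic, save the debug log outside the affected pool.
sudo sh -c 'cat /proc/spl/kstat/zfs/dbgmsg > /root/dbgmsg-$(date +%s).txt'
```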
System information
| Type | Version/Name |
|---|---|
| Distribution Name | Proxmox |
| Distribution Version | 7.1.12 |
| Linux Kernel | 5.13.19-6-pve |
| Architecture | amd64 |
| ZFS Version | 2.1.4-pve1 |
@malfret thank you for the logs. Based on the time the panic occurred I went to look for events in dbgmsg.txt around 1648983726. However, the events logged in dbgmsg.txt start about 10 minutes after the panic (at 1648984381).
If you can reproduce this, would you mind posting a dbgmsg (with echo 512 > /sys/module/zfs/parameters/zfs_flags set beforehand) right after the panic occurs?
At the time of panic, the file system was not working properly and the shell script was unable to save the log. The log was saved manually.
How do I increase the log size so that the data can be saved?
If I change zfs_dbgmsg_maxsize, will it increase the size of the dbgmsg log?
`echo $((128*1024*1024)) >> /sys/module/zfs/parameters/zfs_dbgmsg_maxsize`
> If I change zfs_dbgmsg_maxsize, will it increase the size of the dbgmsg log?
> `echo $((128*1024*1024)) >> /sys/module/zfs/parameters/zfs_dbgmsg_maxsize`
Right, that may help.
The panic seems to happen because arc_untransform() returns an EIO error in the codepath: receive_object() -> dmu_bonus_hold_by_dnode() -> dbuf_read() -> arc_untransform().
And that happens because arc_buf_fill() returns an ECKSUM error (EBADE, or 52 on Linux).
@gamanakis We are still experiencing this bug on our production systems; is there any chance this will be solved any time soon? We are running the latest version from Debian stable, 2.0.3-9. Could this be escalated, maybe?
Any help is appreciated!
Also happened with 2.1.5-1~bpo11+1. Nothing logged to syslog.
```
kernel:[9838382.870025] VERIFY3(0 == dmu_bonus_hold_by_dnode(dn, FTAG, &db, flags)) failed (0 == 5)
kernel:[9838382.870083] PANIC at dmu_recv.c:1811:receive_object()
```
Also occurred with 2.1.7 (with https://github.com/openzfs/zfs/commit/c8d2ab05e1e6bf21185723fabdebbe5bb3374381 reverted, otherwise I would run into #14252).
```
Dec 23 12:30:18 jonathan-desktop kernel: VERIFY3(0 == dmu_bonus_hold_by_dnode(dn, FTAG, &db, flags)) failed (0 == 5)
Dec 23 12:30:18 jonathan-desktop kernel: PANIC at dmu_recv.c:1806:receive_object()
Dec 23 12:30:18 jonathan-desktop kernel: Showing stack for process 3273208
Dec 23 12:30:18 jonathan-desktop kernel: CPU: 5 PID: 3273208 Comm: receive_writer Tainted: P OE 6.1.1-x64v3-xanmod1 #0~20221222.0006b63
Dec 23 12:30:18 jonathan-desktop kernel: Hardware name: To Be Filled By O.E.M. X570 Taichi/X570 Taichi, BIOS P5.00 10/19/2022
Dec 23 12:30:18 jonathan-desktop kernel: Call Trace:
Dec 23 12:30:18 jonathan-desktop kernel: <TASK>
Dec 23 12:30:18 jonathan-desktop kernel: dump_stack_lvl+0x44/0x5c
Dec 23 12:30:18 jonathan-desktop kernel: spl_panic+0xcb/0xe3 [spl]
Dec 23 12:30:18 jonathan-desktop kernel: ? dbuf_rele_and_unlock+0x5ad/0x730 [zfs]
Dec 23 12:30:18 jonathan-desktop kernel: ? dbuf_cons+0xa2/0xc0 [zfs]
Dec 23 12:30:18 jonathan-desktop kernel: ? spl_kmem_cache_alloc+0x9f/0x740 [spl]
Dec 23 12:30:18 jonathan-desktop kernel: ? dbuf_read+0x10a/0x5e0 [zfs]
Dec 23 12:30:18 jonathan-desktop kernel: ? aggsum_add+0x16a/0x180 [zfs]
Dec 23 12:30:18 jonathan-desktop kernel: ? dnode_rele_and_unlock+0x54/0xe0 [zfs]
Dec 23 12:30:18 jonathan-desktop kernel: receive_object+0xb06/0xd30 [zfs]
Dec 23 12:30:18 jonathan-desktop kernel: ? dnode_rele_and_unlock+0x54/0xe0 [zfs]
Dec 23 12:30:18 jonathan-desktop kernel: ? dmu_object_info+0x4f/0x80 [zfs]
Dec 23 12:30:18 jonathan-desktop kernel: receive_writer_thread+0x1b2/0xa90 [zfs]
Dec 23 12:30:18 jonathan-desktop kernel: ? set_next_task_fair+0x66/0x90
Dec 23 12:30:18 jonathan-desktop kernel: ? start_cfs_bandwidth.part.0+0x50/0x50
Dec 23 12:30:18 jonathan-desktop kernel: ? set_user_nice.part.0+0x11b/0x330
Dec 23 12:30:18 jonathan-desktop kernel: ? receive_process_write_record+0x160/0x160 [zfs]
Dec 23 12:30:18 jonathan-desktop kernel: ? __thread_exit+0x10/0x10 [spl]
Dec 23 12:30:18 jonathan-desktop kernel: thread_generic_wrapper+0x55/0x60 [spl]
Dec 23 12:30:18 jonathan-desktop kernel: kthread+0xd5/0x100
Dec 23 12:30:18 jonathan-desktop kernel: ? kthread_complete_and_exit+0x20/0x20
Dec 23 12:30:18 jonathan-desktop kernel: ret_from_fork+0x22/0x30
Dec 23 12:30:18 jonathan-desktop kernel: </TASK>
```
```
[1687794.507453] VERIFY3(0 == dmu_bonus_hold_by_dnode(dn, FTAG, &db, flags)) failed (0 == 5)
[1687794.507479] PANIC at dmu_recv.c:1806:receive_object()
[1687794.507493] Showing stack for process 2882666
[1687794.507505] CPU: 5 PID: 2882666 Comm: receive_writer Tainted: P D OE 5.19.0-38-generic #39~22.04.1-Ubuntu
[1687794.507529] Hardware name: Intel(R) Client Systems NUC8i7BEH/NUC8BEB, BIOS BECFL357.86A.0090.2022.0916.1942 09/16/2022
[1687794.507547] Call Trace:
[1687794.507557] <TASK>
[1687794.507571] show_stack+0x52/0x69
[1687794.507598] dump_stack_lvl+0x49/0x6d
[1687794.507629] dump_stack+0x10/0x18
[1687794.507647] spl_dumpstack+0x29/0x35 [spl]
[1687794.507691] spl_panic+0xd1/0xe9 [spl]
[1687794.507732] ? arc_space_consume+0x54/0x130 [zfs]
[1687794.508027] ? dbuf_create+0x5c1/0x5f0 [zfs]
[1687794.508306] ? dbuf_read+0x11b/0x630 [zfs]
[1687794.508587] ? dmu_bonus_hold_by_dnode+0x15b/0x1b0 [zfs]
[1687794.508860] receive_object+0xae1/0xd40 [zfs]
[1687794.509186] ? __slab_free+0x31/0x340
[1687794.509211] ? spl_kmem_free+0x32/0x40 [spl]
[1687794.509244] ? kfree+0x30f/0x330
[1687794.509259] ? mutex_lock+0x13/0x50
[1687794.509277] receive_writer_thread+0x1ce/0xb50 [zfs]
[1687794.509560] ? set_next_task_fair+0x70/0xb0
[1687794.509578] ? receive_process_write_record+0x1a0/0x1a0 [zfs]
[1687794.509861] ? spl_kmem_free+0x32/0x40 [spl]
[1687794.509894] ? kfree+0x30f/0x330
[1687794.509909] ? receive_process_write_record+0x1a0/0x1a0 [zfs]
[1687794.510191] ? __thread_exit+0x20/0x20 [spl]
[1687794.510225] thread_generic_wrapper+0x61/0x80 [spl]
[1687794.510256] ? thread_generic_wrapper+0x61/0x80 [spl]
[1687794.510288] kthread+0xeb/0x120
[1687794.510303] ? kthread_complete_and_exit+0x20/0x20
[1687794.510321] ret_from_fork+0x1f/0x30
[1687794.510343] </TASK>
```
You might want to update to the latest ZFS release. jonathonf's PPA hasn't been updated in quite a while now.
What I did was that I grabbed the zfs-linux source tarball from the lunar repo and compiled it with dpkg-buildpackage.
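Roughly, that build process looks like this (a sketch only; it needs a deb-src line for the newer release in sources.list and the usual build dependencies):

```sh
# Fetch and build the newer zfs-linux source package locally, then install the
# resulting .debs (package names vary between releases; adjust the globs).
sudo apt build-dep zfs-linux
apt source zfs-linux
cd zfs-linux-*
dpkg-buildpackage -b -us -uc
sudo dpkg -i ../zfsutils-linux_*.deb ../zfs-dkms_*.deb ../libzfs*_*.deb ../libzpool*_*.deb
```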
Using 2.2.0 now
[ 46.799495] ZFS: Loaded module v2.2.0-0ubuntu1~23.10, ZFS pool version 5000, ZFS filesystem version 5
The error seems to be the same:
```
[984462.595691] VERIFY0(0 == dmu_bonus_hold_by_dnode(dn, FTAG, &db, flags)) failed (0 == 5)
[984462.595700] PANIC at dmu_recv.c:2093:receive_object()
[984462.595703] Showing stack for process 44582
[984462.595704] CPU: 4 PID: 44582 Comm: receive_writer Tainted: P O 6.5.0-14-generic #14-Ubuntu
[984462.595706] Hardware name: Intel(R) Client Systems NUC8i7BEH/NUC8BEB, BIOS BECFL357.86A.0090.2022.0916.1942 09/16/2022
[984462.595708] Call Trace:
[984462.595710] <TASK>
[984462.595713] dump_stack_lvl+0x48/0x70
[984462.595722] dump_stack+0x10/0x20
[984462.595724] spl_dumpstack+0x29/0x40 [spl]
[984462.595742] spl_panic+0xfc/0x120 [spl]
[984462.595757] receive_object+0x9ce/0xaa0 [zfs]
[984462.596131] ? __list_del_entry+0x1d/0x30 [zfs]
[984462.596427] ? list_del+0xd/0x30 [zfs]
[984462.596674] receive_process_record+0x84/0x2b0 [zfs]
[984462.596963] receive_writer_thread+0xb7/0x1a0 [zfs]
[984462.597246] ? __pfx_receive_writer_thread+0x10/0x10 [zfs]
[984462.597543] ? __pfx_thread_generic_wrapper+0x10/0x10 [spl]
[984462.597563] thread_generic_wrapper+0x5c/0x70 [spl]
[984462.597583] kthread+0xef/0x120
[984462.597587] ? __pfx_kthread+0x10/0x10
[984462.597590] ret_from_fork+0x44/0x70
[984462.597594] ? __pfx_kthread+0x10/0x10
[984462.597597] ret_from_fork_asm+0x1b/0x30
[984462.597601] </TASK>
```