L2ARC VERIFY failure in ZTS
System information
| Type | Version/Name |
|---|---|
| Distribution Name | Debian |
| Distribution Version | 11.5 |
| Kernel Version | 5.10.0-17-amd64 |
| Architecture | amd64 |
| OpenZFS Version | 6f8602a |
Describe the problem you're observing
While trying to see whether #13897 would repro on x86, I found it did not seem to obliterate my stack, but it did hang a number of times on L2ARC tests that wait indefinitely for certain stats to change.
After about 6 runs, with the infinite ksh loops patched so that they eventually abort, I got a VERIFY failure while in zpool_split_vdevs.ksh.
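For illustration, making such a loop abort could look roughly like the sketch below. This is not the change actually made to the tests, which is not shown in this report. The helper name `wait_until_changed` is hypothetical and is not a ZTS function; it only shows a polling loop with a deadline.

```shell
# Hypothetical helper (not part of ZTS): rerun a command until its output
# differs from the first sample, giving up after a deadline instead of
# looping forever.
# Usage: wait_until_changed <timeout_secs> <command> [args...]
# Returns 0 if the output changed, 1 if the deadline passed first.
wait_until_changed() {
    timeout_secs=$1
    shift
    initial=$("$@")          # first sample to compare against
    waited=0
    while [ "$waited" -lt "$timeout_secs" ]; do
        current=$("$@")
        if [ "$current" != "$initial" ]; then
            return 0         # the stat moved
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1                 # timed out; the caller can fail the test
}
```

A test could then call it with whatever command prints the stat it is waiting on, and fail when the helper returns 1 rather than hanging the whole run.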
Describe how to reproduce the problem
- Run zfs-tests.sh -T functional several times, I guess?
Include any warning/errors/backtraces from the system logs
```
[191672.041083] VERIFY3(dev->l2ad_hand + distance < dev->l2ad_end) failed (88080896 < 66584576)
[191672.043180] PANIC at arc.c:9303:l2arc_evict()
[191672.045225] Showing stack for process 2285770
[191672.047153] CPU: 1 PID: 2285770 Comm: l2arc_feed Tainted: P OE 5.10.0-17-amd64 #1 Debian 5.10.136-1
[191672.049145] Hardware name: Supermicro Super Server/X12SDV-16C-SPT8F, BIOS 1.0 05/04/2022
[191672.051253] Call Trace:
[191672.053346] dump_stack+0x6b/0x83
[191672.055435] spl_panic+0xd4/0xfc [spl]
[191672.057504] ? vcmn_err.cold+0x27/0x9a [spl]
[191672.059747] l2arc_evict+0x8e4/0x920 [zfs]
[191672.061774] ? add_wait_queue_exclusive+0x70/0x70
[191672.063990] l2arc_feed_thread+0x3f5/0x2890 [zfs]
[191672.066018] ? dequeue_entity+0xc6/0x450
[191672.068028] ? newidle_balance+0x282/0x3d0
[191672.070011] ? kfree+0x410/0x490
[191672.072141] ? l2arc_remove_vdev+0x330/0x330 [zfs]
[191672.074120] thread_generic_wrapper+0x75/0xc0 [spl]
[191672.076082] ? spl_taskq_fini+0x80/0x80 [spl]
[191672.078026] kthread+0x118/0x140
[191672.079943] ? __kthread_bind_mask+0x60/0x60
[191672.081848] ret_from_fork+0x1f/0x30
```
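To make the numbers in the VERIFY3 message easier to read, here is a small arithmetic sketch. It only uses the two values printed in the log and assumes the usual VERIFY3 convention that the first is the left-hand side (`dev->l2ad_hand + distance`) and the second is the right-hand side (`dev->l2ad_end`). The variable names are made up for the example.

```shell
# Values copied from the VERIFY3 line in the log.
lhs=88080896    # dev->l2ad_hand + distance
end=66584576    # dev->l2ad_end

# How far past the end of the device the evict range would reach.
overshoot=$((lhs - end))
echo "overshoot: $overshoot bytes"

# The same comparison the assertion makes.
if [ "$lhs" -lt "$end" ]; then
    echo "assertion holds"
else
    echo "assertion fails"
fi
```

The computed range ends about 20.5 MiB past `l2ad_end`, which itself is 63.5 MiB.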
@gamanakis may have some ideas here.
@rincebrain Thank you for filing this. I cannot reproduce it, but I took a careful look and found a subtle bug. Would you mind giving #14828 a try?
Just to note, I didn't overlook this request. The testbed I originally did this on kept crashing with other failures when I tried running ZTS in a loop, so I've been trying to figure out whether there's some common cause. We'll see. Either way, I'll chime in here again if I can no longer reproduce it.