zfs
zfs copied to clipboard
extra zap_put_leaf in an unlikely code path
The problem is found by a code review of the latest code, so no OS / version details. But the code has been unchanged for many years.
fzap_add
and fzap_update
may need to call zap_expand_leaf
if there is not enough room for a new entry.
zap_expand_leaf
may need to call zap_grow_ptrtbl
if the current pointer table is not large enough.
In the very unlikely -- but not impossible -- event that the zap_grow_ptrtbl
call fails, zap_expand_leaf
would return with the original leaf, l
, released by a call to zap_put_leaf
and with the output leaf pointer, lp
, keeping it original value.
Both fzap_add
and fzap_update
call zap_expand_leaf
like this:
err = zap_expand_leaf(zn, l, tag, tx, &l);
So, in the described scenario, l
would still point to the original leaf, but the leaf would be released.
In the epilogue parts of the functions there is a call to zap_put_leaf_maybe_grow_ptrtbl
that would zap_put_leaf
the leaf again.
One potential consequence, in non-debug builds, is that a dirty hold would get released for the leaf's underlying dbuf. Thus, the dbuf would both be free for a re-use and still be referenced from its dirty record.
@ahrens , when you have some time, could you please double-check my observations? Thank you!
The same kind of problem can also happen if the zap_deref_leaf
call in zap_expand_leaf
fails.
In both cases zn->zn_zap
is not NULL, the leaf is not referenced and lp
points to the leaf.
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
I believe that this is a valid bug, although quite hard to hit in practice.
We have probably hit this just now?
VERIFY3(NULL == dmu_buf_set_user(l->l_dbuf, &l->l_dbu)) failed (0000000000000000 == ffff91dfeb60e200)
PANIC at zap.c:441:zap_create_leaf()
Showing stack for process 2470499
CPU: 10 PID: 2470499 Comm: python3 Tainted: G OE 6.8.8 #1-vpsAdminOS
Hardware name: Dell Inc. PowerEdge R7525/0PYVT1, BIOS 2.5.6 10/06/2021
In memory cgroup /osctl/pool.tank/group.default/user.2739/ct.19681/user-owned/lxc.payload.19681/docker/3bd6008a375b00b30cb2fccbf7dcd2e32c081094425b4c8603c16fa09743d08a
Call Trace:
<TASK>
dump_stack_lvl+0x53/0x70
spl_panic+0x100/0x120 [spl]
zap_expand_leaf+0x4ec/0x520 [zfs]
fzap_add_cd+0xf9/0x230 [zfs]
fzap_add+0x47/0xd0 [zfs]
zap_add_impl+0x15e/0x330 [zfs]
? srso_return_thunk+0x5/0x5f
? zap_add+0x5c/0xd0 [zfs]
zfs_link_create+0x189/0x500 [zfs]
zfs_create+0x75f/0xa00 [zfs]
zpl_create+0xd0/0x1e0 [zfs]
path_openat+0xe9c/0x1180
do_filp_open+0xb3/0x160
do_sys_openat2+0xab/0xe0
__x64_sys_openat+0x6e/0xa0
do_syscall_64+0xab/0x1b0
entry_SYSCALL_64_after_hwframe+0x79/0x81
RIP: 0033:0x7f9df8ef9f01
Code: 75 57 89 f0 25 00 00 41 00 3d 00 00 41 00 74 49 80 3d ea 26 0e 00 00 74 6d 89 da 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 93 00 00 00 48 8b 54 24 28 64 48 2b 14 25
RSP: 002b:00007ffe2a11f4d0 EFLAGS: 00000202 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 0000000000080241 RCX: 00007f9df8ef9f01
RDX: 0000000000080241 RSI: 00007f9df8a61d00 RDI: 00000000ffffff9c
RBP: 00007f9df8a61d00 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000000001b6 R11: 0000000000000202 R12: 00007f9df8e00fc0
R13: 00000000ffffffff R14: 00007f9df8acfaf0 R15: 0000000000000001
</TASK>
ping @amotin - just in case it might click for you faster, do you see any possible relation here to your zap related optimizations, please?
It seems unlikely that this is the same issue tho, if the zap_grow_ptrtbl
function would return an error, it wouldn't get to zap_create_leaf
. Maybe it could rather be related to 80cc51629?
dunno, I'll create a new issue so that I don't spam this one...