Leak's back
I'm having some trouble again with kernel_task growing to enormous sizes after the last manual build I installed several days ago. After letting my system sit for a couple of days over the weekend I found that kernel_task had grown to 103 GB, and then after only about 14 hours I found kernel_task had grown to roughly 33 GB, even after setting these parameters in /etc/zfs/zsysctl.conf:
kstat.zfs.darwin.tunable.zfs_arc_max=4294967296
kstat.zfs.darwin.tunable.zfs_arc_meta_limit=3221225472
kstat.zfs.darwin.tunable.zfs_arc_min=1610612736
kstat.zfs.darwin.tunable.zfs_arc_meta_min=1342177280
kstat.zfs.darwin.tunable.zfs_dirty_data_max=536870912
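As a sanity check (a sketch; the kstat names are simply the ones from the settings above), the values can be read back after boot to confirm the conf file was actually applied:

sysctl kstat.zfs.darwin.tunable.zfs_arc_max
sysctl kstat.zfs.darwin.tunable.zfs_arc_meta_limit
sysctl kstat.zfs.darwin.tunable.zfs_arc_min
sysctl kstat.zfs.darwin.tunable.zfs_arc_meta_min
sysctl kstat.zfs.darwin.tunable.zfs_dirty_data_max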
Are there any issues currently known that could be causing this, or do we need to track this down? I'd found that memory consumption within kernel_task had been very reasonable for perhaps about 10 months, until this last build.
Yes, we've been looking at that leak lately, so the question is how recent your build is; in particular, does it include these fixes (a quick way to check is sketched after the list):
- avoid zero alloc in vdev_compact_children()
- mutex leak in dsl_dataset_hold_obj()
- add missing dmu_zfetch_fini() in dnode_move_impl()
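One way to check (a sketch; run from the source checkout the kexts were built from, grepping the log for the commit subjects listed above):

git log --oneline | egrep -i 'vdev_compact_children|dsl_dataset_hold_obj|dmu_zfetch_fini'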
Is there anything that can or should be done on my end as far as collecting some metrics? If you'd explain what you'd like me to do, I can most certainly work at collecting some data at the very least. I'm essentially having to reboot at least daily at this point to release the memory.
> last manual build I installed several days ago.
Can you confirm you are running a version that includes the three commits I listed?
Oh, I'm sorry, I didn't realize you were asking me to verify that these commits are present. The build I'm running had commits up to:
commit c1412d71ea3d1ae17599db9e8512ea9753de41f6 (origin/trim)
Author: Brian Behlendorf <[email protected]>
Date: Thu Mar 21 10:58:24 2019 -0700
Then you are missing two of them, I think.
Gotcha. I've built the new kexts and hopefully will have some positive news later today/tomorrow!
Hmm. kernel_task grew to over 28 GB real memory usage overnight under no appreciable usage, except perhaps Spotlight indexing, and I see backupd-helper & backupd did a good bit of disk reading.
Ah yep, I was suspecting there was another leak too. I've had a hell of a time trying to track it down though. Can you see a size that is always growing with "sysctl kstat | grep inuse"? That could indicate which size is leaking. Also, if you watch the kernel logs when unloading spl, it will tell you how much it thinks it leaked. The final line with "(os_mem_alloc: %llu)\n" should be zero.
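A minimal way to watch for that (a sketch; the interval and log path are arbitrary) is to sample the inuse counters periodically and compare them over time:

while true; do
    date
    sysctl kstat | grep inuse
    sleep 300
done >> /tmp/zfs_inuse.log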
Interesting, I had to be out of the office most of the day yesterday and found it still hovering at roughly 32 GB this morning. Just for reference here are those values, but I don't have any frame of reference yet.
Where are the kernel logs stored now in Mojave?
(standard input):kstat.vmem.vmem.bucket_16384.mem_inuse: 311533568
(standard input):kstat.vmem.vmem.bucket_32768.mem_inuse: 12904857600
(standard input):kstat.vmem.vmem.bucket_65536.mem_inuse: 7973576704
(standard input):kstat.vmem.vmem.bucket_131072.mem_inuse: 4724477952
(standard input):kstat.vmem.vmem.bucket_262144.mem_inuse: 45240320
So, like 32768 has 12G, and the others 8G, 4G, 3G. Certainly makes it look like 32768. In good situations these inuse values should go up and down, but if one strictly increases, it is suspicious.
Logs are annoying after 10.11; you can run
log stream --source --predicate 'senderImagePath CONTAINS "zfs" OR senderImagePath CONTAINS "spl"'
To watch it live, or
log show --source --start 'YYYY-MM-DD HH:MM' | egrep -i 'zfs|spl'
To fetch stored logs
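To capture the unload summary mentioned earlier, a rough sequence like this could work (assuming no pools are in use; the net.lundman.* bundle IDs are an assumption based on the O3X kexts):

sudo zpool export -a
sudo kextunload -b net.lundman.zfs
sudo kextunload -b net.lundman.spl
log show --last 10m --predicate 'senderImagePath CONTAINS "spl"' | grep os_mem_alloc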
Yeah, I've been really frustrated by logging now. So, on the sizes, how did you come up with those figures out of the 32 GB?
12904857600 = 12,904,857,600
about 12G
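Spelled out, it's just the raw byte count divided down by 1024^3 (shell arithmetic as a trivial check):

echo $((12904857600 / 1024 / 1024 / 1024))
# prints 12, i.e. roughly 12 GiB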
I gotcha, so you're just rounding. The 3G is what really threw me off. So any of these vmem buckets with what looks like a numeric ID assigned are, or were (when 0), in use by SPL, right?
"inuse" is actually in used right now - there are other fields for max, etc. So once pools are exported, and zfs unloaded, all inuse should be 0.
I guess the question is why there are so many fields like kstat.unix.kmem_cache.zio_buf_94208.buf_inuse: 0? Are they still there for potential future use? I was figuring that they're old and defunct now, but I guess not...
zio_buf is a different alloc path, for IO data, so you have zero 94208-sized data buffers in flight. That's all good. zio wouldn't have many active at any time, since they are released once the IO is done.
OK - I managed to get zero leaks in master under the "async_zget2" branches, which we are currently exploring as the way forward. It still uses an ugly while loop, but it works as a proof of concept.
I think you got it. I had some time to build it again from current master last night, and the system has now run for about 12 hours with a max kernel_task memory usage of 8.91 GB. Previously it was holding at around 32 GB (which was itself still better than the earlier 100 GB). So I'm super optimistic at this point. The real test is going to be doing some heavy file manipulation (especially ownership / metadata changes) on the zpools to see what happens.
Running the tests from slog_001_pos to sysctl_002_pos results in some leaks:
memleak 0xffffff90a1fa0200:64 from zfs_acl.c:471
memleak 0xffffff90ae9e85a8:88 from zfs_acl.c:455
memleak 0xffffff90aaf5a3c8:48 from zfs_acl.c:473
Which is interesting, as it is directly connected to chmod and chown, which users have reported before. Need to cook up some code to track zfs_acl_node_alloc() / zfs_acl_node_free(); a rough way to count them from userland is sketched below.
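For example (a sketch; assumes the DTrace fbt provider can see these kext symbols, which on Mojave may require relaxing SIP's dtrace restriction), counting the two calls during a chmod/chown workload would show whether they diverge:

sudo dtrace -n '
    fbt::zfs_acl_node_alloc:entry { @calls["alloc"] = count(); }
    fbt::zfs_acl_node_free:entry  { @calls["free"]  = count(); }
'
# run the chmod/chown workload, then Ctrl-C; if the two counts drift apart,
# the nodes allocated in zfs_acl_chmod() are not all being freed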
acl leak: line 1424 size 24
zfs_acl_chmod(umode_t umode, uint64_t mode, boolean_t split, boolean_t trim,
zfs_acl_t *aclp)
newnode = zfs_acl_node_alloc((abstract_size * 6) + aclp->z_acl_bytes);
net.lundman.zfs : _zfs_acl_node_alloc + 0x9c
net.lundman.zfs : _zfs_acl_chmod + 0x5b
net.lundman.zfs : _zfs_acl_ids_create + 0x905
net.lundman.zfs : _zfs_obtain_xattr + 0x150
net.lundman.zfs : _zfs_vnop_setxattr + 0x38f
mach_kernel : _VNOP_SETXATTR + 0x51
277080f
is the leak gone? :D