
Leak's back

Open ylluminate opened this issue 5 years ago • 23 comments

I'm having some trouble again with kernel_task growing to enormous sizes after the last manual build I installed several days ago. After letting my system sit for a couple of days over the weekend I found that kernel_task had grown to 103 GB, and then, about 14 hours after setting these parameters in /etc/zfs/zsysctl.conf, I found it had grown to roughly 33 GB:

kstat.zfs.darwin.tunable.zfs_arc_max=4294967296
kstat.zfs.darwin.tunable.zfs_arc_meta_limit=3221225472

kstat.zfs.darwin.tunable.zfs_arc_min=1610612736
kstat.zfs.darwin.tunable.zfs_arc_meta_min=1342177280
kstat.zfs.darwin.tunable.zfs_dirty_data_max=536870912
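
For reference, a quick way to confirm the tunables actually took effect is to read them back with sysctl (same names as above), for example:

sysctl kstat.zfs.darwin.tunable | egrep 'arc_max|arc_meta_limit|arc_min|arc_meta_min|dirty_data_max'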

Are there any known issues that could be causing this, or do we need to track it down? Memory consumption within kernel_task had been very reasonable for perhaps ten months, until this last build.

ylluminate avatar Apr 02 '19 05:04 ylluminate

Yes, we've been looking at that leak lately, so the question is how recent your build is; in particular, does it include:

  • avoid zero alloc in vdev_compact_children()
  • mutex leak in dsl_dataset_hold_obj()
  • add missing dmu_zfetch_fini() in dnode_move_impl()

lundman avatar Apr 02 '19 06:04 lundman

Is there anything that can or should be done on my end as far as collecting some metrics? If you explain what you'd like me to do, I can certainly work on collecting some data at the very least. I'm essentially having to reboot at least daily at this point to release the memory.

ylluminate avatar Apr 03 '19 06:04 ylluminate

last manual build I installed several days ago.

Can you confirm you are running a version that includes the three commits I listed?
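
One quick way to check is to grep the build's source tree for the commit subjects, for example (the path and patterns here are just illustrative):

cd /path/to/zfs-source        # the checkout the installed kexts were built from
git log --oneline | egrep -i 'vdev_compact_children|dsl_dataset_hold_obj|dmu_zfetch_fini'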

lundman avatar Apr 03 '19 06:04 lundman

Oh, I'm sorry, I didn't realize you were asking me to verify that these commits are present. The build I'm running had commits up to:

commit c1412d71ea3d1ae17599db9e8512ea9753de41f6 (origin/trim)
Author: Brian Behlendorf <[email protected]>
Date:   Thu Mar 21 10:58:24 2019 -0700

ylluminate avatar Apr 03 '19 06:04 ylluminate

Then you are missing two of them, I think.

lundman avatar Apr 03 '19 07:04 lundman

Gotcha. I've built the new kexts and hopefully will have some positive news later today/tomorrow!

ylluminate avatar Apr 03 '19 07:04 ylluminate

Hmm. kernel_task grew to over 28 GB of real memory usage overnight under no appreciable load, except perhaps Spotlight indexing; I also see that backupd-helper and backupd did a good bit of disk reading.

ylluminate avatar Apr 03 '19 12:04 ylluminate

Ah yep, I was suspecting there was another leak too. I've had a hell of a time trying to track it down, though. Can you see a size that is always growing with "sysctl kstat | grep inuse"? That could indicate which size is leaking. Also, if you watch the kernel logs when unloading spl, it will tell you how much it thinks it leaked; the final line, with "(os_mem_alloc: %llu)\n", should be zero.
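
For example, one rough way to narrow it down is to log a timestamped snapshot of the inuse counters at intervals and see which one only ever grows (just a sketch):

# append a timestamped snapshot of every *inuse* kstat every 10 minutes;
# a counter that only ever grows between snapshots is the suspect
while true; do
    date
    sysctl kstat | grep inuse
    sleep 600
done >> /tmp/inuse.log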

lundman avatar Apr 04 '19 05:04 lundman

Interesting. I had to be out of the office most of the day yesterday and found it still hovering at roughly 32 GB this morning. Just for reference, here are those values, but I don't have any frame of reference yet.

Where are the kernel logs stored now in Mojave?

ylluminate avatar Apr 04 '19 10:04 ylluminate

kstat.vmem.vmem.bucket_16384.mem_inuse: 311533568
kstat.vmem.vmem.bucket_32768.mem_inuse: 12904857600
kstat.vmem.vmem.bucket_65536.mem_inuse: 7973576704
kstat.vmem.vmem.bucket_131072.mem_inuse: 4724477952
kstat.vmem.vmem.bucket_262144.mem_inuse: 45240320

So bucket 32768 has about 12 GB, and the others 8 GB, 4 GB, 3 GB. That certainly makes 32768 look like the culprit. In a healthy situation these inuse values should go up and down; if one strictly increases, it is suspicious.
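
A quick way to rank the buckets and convert the byte counts to GiB, for example (just an awk sketch over the same sysctl output):

sysctl kstat | grep mem_inuse | \
    awk -F': ' '{ printf "%8.2f GiB  %s\n", $2/1073741824, $1 }' | sort -rn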

lundman avatar Apr 04 '19 10:04 lundman

Logs are annoying after 10.11. To watch live, you can run:

log stream --source --predicate 'senderImagePath CONTAINS "zfs" OR senderImagePath CONTAINS "spl"'

To fetch stored logs:

log show --source --start '1999-99-99 23:59' | egrep -i 'zfs|spl'

lundman avatar Apr 04 '19 10:04 lundman

Yeah, I've been really frustrated by the logging these days. So, on the sizes: where do you come up with those figures out of the 32 GB?

ylluminate avatar Apr 04 '19 10:04 ylluminate

12904857600 = 12,904,857,600

about 12G
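
That is, dividing by 1024³ bytes per GiB, for example:

echo '12904857600 / (1024 * 1024 * 1024)' | bc -l
# prints 12.018..., i.e. about 12 GiB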

lundman avatar Apr 04 '19 10:04 lundman

I gotcha, so you're just rounding. The 3 GB is what really threw me off. So any of these vmem buckets with a numeric id assigned are, or were (when 0), in use by SPL, right?

ylluminate avatar Apr 04 '19 10:04 ylluminate

"inuse" is actually in used right now - there are other fields for max, etc. So once pools are exported, and zfs unloaded, all inuse should be 0.

lundman avatar Apr 04 '19 10:04 lundman

I guess the question is: why are there so many fields like kstat.unix.kmem_cache.zio_buf_94208.buf_inuse: 0? Are they still there for potential use? I was figuring they were old and defunct now, but I guess not...

ylluminate avatar Apr 04 '19 11:04 ylluminate

zio_buf is a different alloc path, used for IO data - so you have zero 94208-sized data buffers in flight. That's all good. zio wouldn't have many active at any time, since they are released once the IO is done.
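
For example, the zio data-buffer caches can be listed the same way, and on an idle system they should all read zero or close to it:

sysctl kstat | grep 'zio_buf.*buf_inuse'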

lundman avatar Apr 04 '19 11:04 lundman

OK - I managed to get zero leaks in master under the "async_zget2" branches, which we are currently exploring as the way forward. It still uses an ugly while loop, but it works as a proof of concept.

lundman avatar Apr 07 '19 23:04 lundman

I think you got it. I had some time to build it again from current master last night, and the system has now run for about 12 hours with a max kernel_task memory usage of 8.91 GB. Previously it was holding at around 32 GB (which was itself still better than the earlier 100 GB). So I'm super optimistic at this point. The real test is going to be doing some heavy file manipulation (especially ownership / metadata changes) on the zpools to see what happens.

ylluminate avatar Apr 11 '19 15:04 ylluminate

Running the tests from slog_001_pos to sysctl_002_pos results in some leaks:

memleak 0xffffff90a1fa0200:64 from zfs_acl.c:471
memleak 0xffffff90ae9e85a8:88 from zfs_acl.c:455
memleak 0xffffff90aaf5a3c8:48 from zfs_acl.c:473

Which is interesting, as it is directly connected to chmod and chown, which users have reported problems with before.

I need to cook up some code to track zfs_acl_node_alloc() / zfs_acl_node_free().
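
In the meantime, a rough way to watch the pairing from userland would be DTrace fbt probes on those two functions, assuming SIP allows dtrace and the symbols are visible to fbt (just a sketch, not the planned tracking code):

# count alloc/free calls and print the running totals every 10 seconds
sudo dtrace -n '
    fbt::zfs_acl_node_alloc:entry { @counts["alloc"] = count(); }
    fbt::zfs_acl_node_free:entry  { @counts["free"]  = count(); }
    tick-10s                      { printa(@counts); }'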

lundman avatar Apr 13 '19 04:04 lundman

acl leak: line 1424 size 24

zfs_acl_chmod(umode_t umode, uint64_t mode, boolean_t split, boolean_t trim,
	zfs_acl_t *aclp)

	newnode = zfs_acl_node_alloc((abstract_size * 6) + aclp->z_acl_bytes);

net.lundman.zfs : _zfs_acl_node_alloc + 0x9c
net.lundman.zfs : _zfs_acl_chmod + 0x5b
net.lundman.zfs : _zfs_acl_ids_create + 0x905
net.lundman.zfs : _zfs_obtain_xattr + 0x150
net.lundman.zfs : _zfs_vnop_setxattr + 0x38f
mach_kernel : _VNOP_SETXATTR + 0x51

lundman avatar Apr 15 '19 01:04 lundman

277080f

lundman avatar Apr 15 '19 02:04 lundman

is the leak gone? :D

JMoVS avatar Aug 02 '19 09:08 JMoVS