high CPU for NFS
System information
Type | Version/Name
--- | ---
Distribution Name | Ubuntu
Distribution Version | 22.04
Kernel Version | 6.5
Architecture | x86_64
OpenZFS Version | 2.2.4
Describe the problem you're observing
We see periodic episodes of very high CPU usage, with many nfsd threads at 100%.
During these episodes, NFS operation rates are moderate, around 10K/sec. However, arcstat shows about 2M ARC reads/sec, half demand metadata and half prefetch.
A backtrace shows:
```
Jun  7 10:55:43 eternal.lcsr.rutgers.edu kernel:
 [351663.953866] spl_kmem_free+0x31/0x40 [spl]
 [351663.953879] dbuf_issue_final_prefetch_done+0x49/0x60 [zfs]
 [351663.954001] arc_read+0xdfa/0x1790 [zfs]
 [351663.954114] ? __pfx_dbuf_issue_final_prefetch_done+0x10/0x10 [zfs]
 [351663.954225] dbuf_issue_final_prefetch+0xa7/0x100 [zfs]
 [351663.954327] dbuf_prefetch_impl+0x779/0xa70 [zfs]
 [351663.954437] dbuf_prefetch+0x13/0x30 [zfs]
 [351663.954541] dmu_prefetch_dnode.part.0+0x47/0xa0 [zfs]
 [351663.954646] dmu_prefetch_dnode+0x30/0x40 [zfs]
 [351663.954753] zfs_readdir+0x369/0x560 [zfs]
 [351663.954876] zpl_iterate+0x54/0x90 [zfs]
 [351663.954983] iterate_dir+0xa9/0x180
 [351663.954988] get_name+0x15e/0x1d0
 [351663.954994] ? __pfx_filldir_one+0x10/0x10
 [351663.955000] reconnect_one+0x242/0x280
 [351663.955002] reconnect_path+0xfa/0x120
 [351663.955005] ? __pfx_nfsd_acceptable+0x10/0x10 [nfsd]
 [351663.955034] exportfs_decode_fh_raw+0x12e/0x340
 [351663.955043] nfsd_set_fh_dentry+0x2d5/0x490 [nfsd]
```
The nfsd threads are all in reconnect_one, doing dbuf work. The number of ARC reads seems unreasonable for the NFS load.
Our experience is that it will eventually calm down, but then start up again. Basically, after a few weeks of uptime, usage peaks get higher and higher, and low-usage periods less and less common, until we reboot. I've been trying different kernels, but I'm not sure whether this is related to the kernel or to ZFS. (However, the uptime element may not be real: I recently rebooted while the problem was occurring and it immediately started again. I now think it's triggered by a specific pattern of accesses from a client.)
Chris Siebenmann of Toronto has kindly looked at it and suggested a cause.
nfsd has to validate the file handles it receives from clients. For directories, this often involves walking up the directory tree: at each level it scans every entry until it finds the one that leads back down to the level below. In zfs_readdir, every entry examined has its metadata prefetched, even if it's already in the ARC. The prefetch does no I/O when the data is cached, but it still executes a fair amount of code. (We have 3/4 TB of memory and very high cache-hit rates. Even when I/O is needed, our metadata is on an NVMe-based special vdev.)
He has proposed a patch that lets us disable this prefetching. We'll be testing it over the next week. However, a better solution would be to skip the prefetch only in the specific situation of nfsd's reconnect_path, if that is possible to detect. (I believe a ZFS-specific version of get_name could be used.)
I'll report here the results of disabling the prefetch, but it may take a few weeks to assess.
Describe how to reproduce the problem.
Not reliably reproducible on demand; it appears to depend on a specific pattern of client accesses.