linux
Kernel 6.6 has a memory leak in NFS
Describe the bug
Platform: Raspberry Pi 5 8GB with kernel 6.6.20+rpt-rpi-2712
I found the source of the memory leak: it comes from NFS. My Raspberry Pi is used as an NFS server with these parameters:
nfsd.conf :
[nfsd]
threads=16
udp=yes
tcp=yes
vers2=no
vers3=yes
vers4=yes
vers4.0=yes
vers4.1=yes
vers4.2=yes
/etc/exports :
/export/cluster-data 172.31.31.160/32(fsid=663f02fb-a2eb-4c16-b809-29da7f5d24c5,rw,async,insecure,all_squash,anonuid=1000,anongid=100,no_subtree_check,rw) 172.31.31.159/32(fsid=5cd4a8cf-642b-4be1-abca-df0532aba469,rw,async,insecure,all_squash,anonuid=1000,anongid=100,no_subtree_check,rw)
/export 172.31.31.160/32(ro,fsid=0,root_squash,no_subtree_check)
/export 172.31.31.159/32(ro,fsid=0,root_squash,no_subtree_check)
On the 2 clients (/etc/mtab):
172.31.31.248:/cluster-data /mnt/pve/cluster-data-pistorage nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.31.31.160,local_lock=none,addr=172.31.31.248 0 0
Steps to reproduce the behaviour
As soon as the clients are connected (Zone 1 and Zone 3 in the image), memory slowly begins to leak, even when no data is transferred over NFS. Once the clients are disconnected (Zone 2 in the image), memory usage stays at the same level. R in the image marks a reboot.
The nfsd module stays at the same memory size in lsmod. The problem is in the kernel's NFS code.
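For anyone who wants to confirm the same slow growth without a graph, here is a minimal Python sketch that samples /proc/meminfo at a fixed interval. It assumes the leak shows up as falling MemAvailable and possibly growing Slab/SUnreclaim, which this report does not confirm. The function names, the chosen fields and the 10-minute interval are illustrative, not something used in this report.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most values are in kB; a few counters (e.g. HugePages_Total) have no unit.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def sample(path="/proc/meminfo", fields=("MemAvailable", "Slab", "SUnreclaim")):
    """Read the selected fields once; missing fields come back as None."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return {field: info.get(field) for field in fields}


def watch(interval_s=600, count=6):
    """Print a timestamped sample every interval_s seconds, count times."""
    for _ in range(count):
        print(time.strftime("%Y-%m-%d %H:%M:%S"), sample())
        time.sleep(interval_s)
```

Calling watch() on the server while the clients are mounted, and again after they disconnect, should show whether available memory keeps dropping only while they are connected.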
Device (s)
Raspberry Pi 5
System
cat /etc/rpi-issue
Raspberry Pi reference 2024-03-15
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, f19ee211ddafcae300827f953d143de92a5c6624, stage2

vcgencmd version
2024/02/16 15:28:41
Copyright (c) 2012 Broadcom
version 4c845bd3 (release) (embedded)

uname -a
Linux pistorage 6.6.20+rpt-rpi-2712 #1 SMP PREEMPT Debian 1:6.6.20-1+rpt1 (2024-03-07) aarch64 GNU/Linux
Logs
No response
Additional context
No response
After 10 days, when the memory is nearly full, the Pi restarts by itself after an OOM. You can find more details starting here: https://forums.raspberrypi.com/viewtopic.php?t=361116&start=75#p2202411 This problem doesn't exist on the previous official 6.1 kernel.
There are no nfs related downstream commits to the kernel, so this is likely to be an upstream kernel issue. We don't have any special knowledge of kernel nfs code, so this may be tricky to track down.
Some possible approaches: It seems likely that the leak is also present on other platforms, so trying to find similar reports may be useful. Reporting it to upstream kernel devs may be useful. Any nfs related commits present in 6.6 but not present in 6.1 could be reverted in test builds to try to track it down. Testing 6.2, 6.3, 6.4 and 6.5 kernels would help narrow it down further.
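The last suggestion amounts to a bisection over kernel series. A minimal Python sketch of that search follows, assuming 6.1 is good and 6.6 is bad (as reported above) and that once the leak appears it stays in every later series. The function name and the is_bad callback are hypothetical; in practice is_bad means booting that kernel and watching memory for long enough to see the leak.

```python
def first_bad(versions, is_bad):
    """Return the earliest bad version by bisection.

    versions: ordered oldest to newest, with versions[0] known good
              and versions[-1] known bad.
    is_bad:   callable taking a version and returning True if it leaks
              (hypothetical; stands in for a manual test run).
    """
    lo, hi = 0, len(versions) - 1  # invariant: versions[lo] good, versions[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]
```

With the six series from 6.1 to 6.6, this needs at most three test boots rather than four.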
Hmm... It seems to have been reported here: https://lore.kernel.org/lkml/[email protected]/ https://bugzilla.kernel.org/show_bug.cgi?id=218671
And it's been backported to 6.6.26. I've just bumped rpi-update to that, so if you run sudo rpi-update you should have the fix.
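To check whether a given machine is already on a 6.6 kernel at or past that backport, a small Python sketch can compare the release string (the third field of uname -a) against 6.6.26. The function names are illustrative, and the check only makes sense for the 6.6.x series, since this thread says nothing about which releases of other series carry the fix.

```python
import re

# Per this thread, the fix was backported to 6.6.26.
FIXED_IN = (6, 6, 26)


def release_tuple(release):
    """Extract (major, minor, patch) from a release string such as
    '6.6.20+rpt-rpi-2712' or '6.6.26-v8-16k+'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def has_backported_fix(release):
    """True if a 6.6.x release is at or after 6.6.26.

    Raises ValueError for other series, because the 6.6.26 threshold
    says nothing about them.
    """
    version = release_tuple(release)
    if version[:2] != FIXED_IN[:2]:
        raise ValueError("this check only applies to 6.6.x kernels")
    return version >= FIXED_IN
```

On the running system the release string can be obtained with platform.release() from the standard library, for example has_backported_fix(platform.release()).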
OK, I just updated my Pi:
Linux pistorage 6.6.26-v8-16k+ #1754 SMP PREEMPT Thu Apr 11 14:51:20 BST 2024 aarch64 GNU/Linux
I will come back to you to give you the result.
The memory leak is gone. I tested for 16 hours and memory usage is stable. When do you think it will be pushed to the official rpi stable kernel? Thanks a lot.
When do you think it will be pushed to the official rpi stable kernel?
I'll flag it up as an important bug fix, and let you know when.
The latest apt kernel is 6.6.31-1+rpt1 (2024-05-29), so it contains this fix.