mergerfs

Kernel panics and freezing when adding to lru cache

Open Abbotta4 opened this issue 4 years ago • 12 comments

Describe the bug

My machine panics and/or freezes often with errors related to paging or the lru_cache. Pictures of the panic text:

  • paging: https://i.imgur.com/HT4F7p7.jpg https://i.imgur.com/pTb4Miu.jpg https://i.imgur.com/IsNlvQV.jpeg
  • lru cache: https://i.imgur.com/aWnvtWE.jpg

To Reproduce

/etc/fstab entry: /mnt/w1:/mnt/w2:/mnt/b1 /mnt/pool mergerfs allow_other,minfreespace=20G,moveonenospc=true,use_ino,dropcacheonclose=true,ignorepponrename=true,category.create=lus,cache.files=partial,xattr=nosys,func.getattr=newest,fsname=/mnt/pool,nofail 0 0

I power up the machine and it eventually panics and/or freezes, anywhere from half an hour to a couple of hours after boot.

Expected behavior

There should be no panics or freezing.

System information:

  • OS, kernel version: Arch Linux and tried both 5.10.27-1 lts and 5.11.11 mainline
  • mergerfs version: 2.32.3
  • mergerfs settings: see the /etc/fstab entry above
  • List of drives, filesystems, & sizes:
drew@leo% df -h      
Filesystem      Size  Used Avail Use% Mounted on
dev             7.7G     0  7.7G   0% /dev
run             7.7G  1.1M  7.7G   1% /run
/dev/nvme0n1p2  101G   90G  6.4G  94% /
tmpfs           7.7G  624K  7.7G   1% /dev/shm
tmpfs           7.7G     0  7.7G   0% /tmp
/mnt/pool        26T  6.9T   18T  29% /mnt/pool
/dev/nvme0n1p1  511M  104M  408M  21% /boot
/dev/sdb1        11T  2.2T  8.1T  22% /mnt/w2
/dev/sda1        11T  2.2T  8.2T  21% /mnt/w1
/dev/sdc1       3.6T  2.6T  912G  74% /mnt/b1
tmpfs           1.6G     0  1.6G   0% /run/user/1000
drew@leo% lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda           8:0    0  10.9T  0 disk 
└─sda1        8:1    0  10.9T  0 part /mnt/w1
sdb           8:16   0  10.9T  0 disk 
└─sdb1        8:17   0  10.9T  0 part /mnt/w2
sdc           8:32   0   3.6T  0 disk 
└─sdc1        8:33   0   3.6T  0 part /mnt/b1
sdd           8:48   1  28.6G  0 disk 
├─sdd1        8:49   1   685M  0 part 
├─sdd2        8:50   1    65M  0 part 
└─sdd3        8:51   1   300K  0 part 
nvme0n1     259:0    0 119.2G  0 disk 
├─nvme0n1p1 259:1    0   512M  0 part /boot
├─nvme0n1p2 259:2    0 102.7G  0 part /
└─nvme0n1p3 259:3    0    16G  0 part [SWAP]

Additional context

I'm not sure whether the panics and errors are caused by specific programs, but these are the programs that do most of the work on this box: Plex, Sonarr, Radarr, Deluge, rtorrent-ps, znc, openvpn

Abbotta4 • Apr 04 '21 04:04

This is a kernel issue, not a mergerfs one, but I can forward this to the FUSE maintainers for feedback.

trapexit • Apr 04 '21 10:04

That works for me, thank you

Abbotta4 • Apr 05 '21 15:04

No response yet. I'll leave this open till we hear something.

trapexit • Apr 05 '21 15:04

The box froze overnight again and I can see fuse functions in the call traces while deluge is doing many writes. I have uploaded journalctl -b -1 -k here as a gzip'ed log (original was 11.3M) in case it will be useful for the upstream case. Thanks again.
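For anyone who wants to pull the FUSE-related lines out of a compressed kernel log like this one, here is a minimal Python sketch. It is only a plain substring scan, not a call-trace parser, and the function name `lines_mentioning` and the file name used in the note below are made up for illustration.

```python
import gzip


def lines_mentioning(path: str, needle: str = "fuse") -> list[str]:
    """Return every line of a gzip-compressed text log that contains needle.

    The match is a case-insensitive substring test, so it will also catch
    unrelated lines that merely contain the same letters.
    """
    wanted = needle.lower()
    matches = []
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with gzip.open(path, "rt", errors="replace") as log:
        for line in log:
            if wanted in line.lower():
                matches.append(line.rstrip("\n"))
    return matches
```

Calling `lines_mentioning("kernel.log.gz")` (a hypothetical file name) returns the matching lines in the order they appear, which should be enough to see how often fuse functions show up around each panic.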

Abbotta4 • Apr 05 '21 16:04

Have you tried turning off page caching to see if that has any impact? cache.files=off?
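As a rough illustration of that change, the option string from the /etc/fstab entry in the report can be rewritten so that `cache.files` is set to `off`. This is only a sketch: the helper name `set_mount_option` is hypothetical, and it does nothing beyond string handling. The edited fstab line would still only take effect once the pool is remounted.

```python
def set_mount_option(options: str, key: str, value: str) -> str:
    """Return a comma-separated mount option string with key set to value.

    An existing "key=..." entry is replaced in place; if the key is absent,
    "key=value" is appended at the end.
    """
    updated = []
    replaced = False
    for opt in options.split(","):
        name, _, _ = opt.partition("=")
        if name == key:
            updated.append(f"{key}={value}")
            replaced = True
        else:
            updated.append(opt)
    if not replaced:
        updated.append(f"{key}={value}")
    return ",".join(updated)


# Option string copied from the /etc/fstab entry in the original report.
opts = (
    "allow_other,minfreespace=20G,moveonenospc=true,use_ino,"
    "dropcacheonclose=true,ignorepponrename=true,category.create=lus,"
    "cache.files=partial,xattr=nosys,func.getattr=newest,"
    "fsname=/mnt/pool,nofail"
)

print(set_mount_option(opts, "cache.files", "off"))
```

All other options are left untouched and keep their original order, so the result can be pasted back into the fourth field of the fstab line.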

trapexit • Apr 05 '21 17:04

I just tried changing cache.files to off but still got a panic: https://i.imgur.com/9sGmVFp.jpg

Abbotta4 • Apr 05 '21 18:04

Hi @trapexit, did you receive any response? Would it be helpful if I opened an issue with libfuse?

Abbotta4 • Apr 12 '21 15:04

libfuse will say and do as I did: tell you it's a kernel issue.

The only response was in effect to try a newer kernel. Do you have the background to do that?

trapexit • Apr 12 '21 16:04

Yes, I can upgrade the kernel. The newest Linux kernel on Arch is only two patch versions higher than what I have been testing on, though: https://archlinux.org/packages/core/x86_64/linux/ I can still upgrade and test for thoroughness. If (and frankly, when) the system crashes again, would it be worth filing an issue with Linux?

Abbotta4 • Apr 12 '21 16:04

Sorry, I wasn't clear. Testing new kernels from your distro is not what I'm talking about. I'm referring to the latest patches for possibly unreleased kernels: the development releases or patches. Can you build your own kernel? Have you tried older LTS kernels?

I've already reached out to the maintainer. The next step would be telling them whether or not the latest version has the issue.

trapexit • Apr 12 '21 17:04

Oh, I see. Yes, I can try some patches or other releases. I have some experience with that. Are there specific patches or versions of the upstream kernel I should try, or just the latest HEAD? I have tried a number of older releases from around 5.9 to 5.11, including LTS, and I can try any suggested version. How far back should I go?

Abbotta4 • Apr 12 '21 17:04

https://sourceforge.net/p/fuse/mailman/message/37256225/

https://www.kernel.org/

The longterm and stable versions.

trapexit • Apr 12 '21 18:04