mergerfs Mergerfs mount not responsive intermittently

issue description

I am having an intermittent issue where all attempts to read/access the /data/media/extd1 mountpoint hang. This includes running command line tools on the host like ls and df which try to read the mount and hang.
This server runs several docker containers including plex which read/write to this mount. When this hangs they can't read the mount either.
Mergerfs is installed natively on the host.
extd1 contains media files only, no docker or application configuration files or databases.
While the issue is occuring, the underlying disks (encd1, encd2) can still be read
/dev/sdc,sdd are in an external usb enclosure
The issue seemingly happens randomly and have not been able to reproduce it. Nothing obvious in dmesg indicating what the issue is.
Rebooting the system resolves the issue and waiting an amount of time it seems to correct itself.
Seems to happen once a day, and there is usually moderate disk r/w activity; file downloads and media playback occurring when it happens

system info

Strace was run while mount was unresponsive, started the mergerfs trace and then a strace on 'df':

app.strace.txt
mergerfs.strace.txt (this is quite large was not sure when to trim)

$ sudo mergerfs --version
mergerfs v2.40.2

$ uname -a
Linux nuc 6.5.0-41-generic #41~22.04.2-Ubuntu SMP PREEMPT_DYNAMIC Mon Jun  3 11:32:55 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.4 LTS
Release:	22.04
Codename:	jammy

$ df -h
Filesystem          Size  Used Avail Use% Mounted on
tmpfs               3.1G  3.2M  3.1G   1% /run
/dev/nvme0n1p2       91G   29G   58G  34% /
tmpfs                16G     0   16G   0% /dev/shm
tmpfs               5.0M  4.0K  5.0M   1% /run/lock
efivarfs            192K  118K   70K  63% /sys/firmware/efi/efivars
encd1:encd2:ssd4tb   33T   17T   17T  50% /data/media/extd1
/dev/sdb            916G  2.8M  870G   1% /data/unsorted
/dev/nvme0n1p1      511M  6.1M  505M   2% /boot/efi
/dev/nvme0n1p3      3.6T  397G  3.2T  11% /data/media/ssd4tb
/dev/sdc             15T   15T  474G  97% /data/media/encd1
/dev/sdd             15T  1.8T   13T  13% /data/media/encd2
tmpfs               3.1G   72K  3.1G   1% /run/user/127
tmpfs               3.1G   60K  3.1G   1% /run/user/1000

$ ls -la /data/media/
total 16
drwxrwxr-x 8 nuc nuc 4096 Jun 14 22:42 .
drwxr-xr-x 4 nuc nuc 4096 Jan  4 19:19 ..
drwxrwxr-x 5 nuc nuc   59 Jan 12 15:11 encd1
drwxrwxr-x 5 nuc nuc   46 Jun 25 09:13 encd2
drwxrwxr-x 5 nuc nuc   59 Jan 12 15:11 extd1
drwxrwxr-x 2 nuc nuc 4096 Jan  3 20:30 samples
drwxr-xr-x 6 nuc nuc   79 Jun 18 21:32 ssd4tb
drwxr-xr-x 3 nuc nuc 4096 Jan  4 19:39 unsorted

$ cat /etc/fstab 

UUID=d39c17db-0794-4efd-aebf-215dda148e4f /               ext4    errors=remount-ro 0       1
UUID=0139-3011  /boot/efi       vfat    umask=0077      0       1
/swapfile                                 none            swap    sw              0       0

UUID=d239f2c1-8e04-46f3-987d-41469ad23e47	/data/media/encd1	xfs	defaults,nofail	0	0

UUID=dd6ba9fc-b9aa-4136-94d9-9595ea5216e7 /data/unsorted	ext4	defaults,nofail	0	0

UUID=42cdb2ca-50a3-49aa-8c70-83b52399af4f	/data/media/ssd4tb	xfs	defaults	0	0
UUID=d673b99a-6efe-48dd-982d-77be75f9e512	/data/media/encd2	xfs	defaults,nofail	0	0

/data/media/encd1:/data/media/encd2:/data/media/ssd4tb /data/media/extd1 mergerfs cache.files=partial,dropcacheonclose=true,category.create=mfs,allow_other 0 0

$ lsblk -f
NAME        FSTYPE   FSVER LABEL UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
loop0                                                                       0   100% /snap/bare/5
loop1                                                                       0   100% /snap/canonical-livepatch/282
loop2                                                                       0   100% /snap/core18/2823
loop3                                                                       0   100% /snap/core18/2829
loop4                                                                       0   100% /snap/core20/2264
loop5                                                                       0   100% /snap/core20/2318
loop6                                                                       0   100% /snap/core22/1122
loop7                                                                       0   100% /snap/core22/1380
loop8       squashfs 4.0                                                    0   100% /snap/gnome-3-38-2004/140
loop9       squashfs 4.0                                                    0   100% /snap/gnome-3-38-2004/143
loop10      squashfs 4.0                                                    0   100% /snap/gnome-42-2204/172
loop11      squashfs 4.0                                                    0   100% /snap/gnome-42-2204/176
loop12      squashfs 4.0                                                    0   100% /snap/gtk-common-themes/1535
loop13      squashfs 4.0                                                    0   100% /snap/rclone/495
loop14      squashfs 4.0                                                    0   100% /snap/rclone/504
loop15      squashfs 4.0                                                    0   100% /snap/snap-store/1113
loop16      squashfs 4.0                                                    0   100% /snap/snap-store/959
loop17      squashfs 4.0                                                    0   100% /snap/snapd/21465
loop18      squashfs 4.0                                                    0   100% /snap/snapd/21759
loop19      squashfs 4.0                                                    0   100% /snap/snapd-desktop-integration/83
loop20      squashfs 4.0                                                    0   100% /snap/snapd-desktop-integration/157
sda         xfs                  8f58a793-94d2-4daf-a95a-db407648fef2                
sdb         ext4     1.0         dd6ba9fc-b9aa-4136-94d9-9595ea5216e7  869.2G     0% /data/unsorted
sdc         xfs                  d239f2c1-8e04-46f3-987d-41469ad23e47  473.8G    97% /data/media/encd1
sdd         xfs                  d673b99a-6efe-48dd-982d-77be75f9e512   12.8T    12% /data/media/encd2
nvme0n1                                                                              
├─nvme0n1p1 vfat     FAT32       0139-3011                             504.9M     1% /boot/efi
├─nvme0n1p2 ext4     1.0         d39c17db-0794-4efd-aebf-215dda148e4f   57.4G    31% /
└─nvme0n1p3 xfs                  42cdb2ca-50a3-49aa-8c70-83b52399af4f    3.2T    11% /data/media/ssd4tb

$ sudo lshw -class disk -class storage
  *-nvme                    
       description: NVMe device
       product: CT4000P3SSD8
       vendor: Micron/Crucial Technology
       physical id: 0
       bus info: pci@0000:01:00.0
       logical name: /dev/nvme0
       version: P9CR30A
       serial: 2308E6B1EC7D
       width: 64 bits
       clock: 33MHz
       capabilities: nvme pciexpress msix msi pm nvm_express bus_master cap_list
       configuration: driver=nvme latency=0 nqn=nqn.2014.08.org.nvmexpress:c0a9c0a92308E6B1EC7D        CT4000P3SSD8 state=live
       resources: irq:16 memory:6a400000-6a403fff
     *-namespace:0
          description: NVMe disk
          physical id: 0
          logical name: hwmon1
     *-namespace:1
          description: NVMe disk
          physical id: 2
          logical name: /dev/ng0n1
     *-namespace:2
          description: NVMe disk
          physical id: 1
          bus info: nvme@0:1
          logical name: /dev/nvme0n1
          size: 3726GiB (4TB)
          capabilities: gpt-1.00 partitioned partitioned:gpt
          configuration: guid=0c0465f1-c24b-4482-83d3-3613120847c6 logicalsectorsize=512 sectorsize=512 wwid=nvme.c0a9-323330384536423145433744-435434303030503353534438-00000001
  *-usb
       description: Mass storage device
       product: RTL9210B-CG
       vendor: Realtek
       physical id: 2
       bus info: usb@2:2
       logical name: scsi3
       version: 20.01
       serial: 012345679039
       capabilities: usb-3.20 scsi
       configuration: driver=uas maxpower=896mA speed=10000Mbit/s
  *-usb
       description: Mass storage device
       product: USB to ATA/ATAPI Bridge
       vendor: JMicron
       physical id: 3
       bus info: usb@4:3
       logical name: scsi2
       version: 2.05
       serial: 152D00539000
       capabilities: usb-3.00 scsi emulated scsi-host
       configuration: driver=usb-storage maxpower=8mA speed=5000Mbit/s
     *-disk:0
          description: SCSI Disk
          product: M001G-2KK103
          vendor: ST16000N
          physical id: 0.0.0
          bus info: scsi@2:0.0.0
          logical name: /dev/sdc
          logical name: /data/media/encd1
          version: 0125
          size: 14TiB (16TB)
          configuration: ansiversion=6 logicalsectorsize=512 mount.fstype=xfs mount.options=rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota sectorsize=4096 state=mounted
     *-disk:1
          description: SCSI Disk
          product: M000J-2TW103
          vendor: ST16000N
          physical id: 0.0.1
          bus info: scsi@2:0.0.1
          logical name: /dev/sdd
          logical name: /data/media/encd2
          version: 0125
          size: 14TiB (16TB)
          configuration: ansiversion=6 logicalsectorsize=512 mount.fstype=xfs mount.options=rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota sectorsize=4096 state=mounted
  *-sata
       description: SATA controller
       product: Intel Corporation
       vendor: Intel Corporation
       physical id: 17
       bus info: pci@0000:00:17.0
       logical name: scsi1
       version: 20
       width: 32 bits
       clock: 66MHz
       capabilities: sata msi pm ahci_1.0 bus_master cap_list emulated
       configuration: driver=ahci latency=0
       resources: irq:150 memory:6a500000-6a501fff memory:6a503000-6a5030ff ioport:3090(size=8) ioport:3080(size=4) ioport:3060(size=32) memory:6a502000-6a5027ff
     *-disk
          description: ATA Disk
          product: NT-2TB 2242
          physical id: 0.0.0
          bus info: scsi@1:0.0.0
          logical name: /dev/sda
          version: 109
          serial: 0006946000117
          size: 1907GiB (2048GB)
          configuration: ansiversion=5 logicalsectorsize=512 sectorsize=512

Jul 24 '24 10:07 Hark0q

I need a trace from an app experiencing the issue. Your trace is of df and it looks to exit fine and the query of mergerfs took 0.0096 seconds to run.

And what is going on on the system? Have you run free? iotop? What's the loadavg? Etc.

Jul 24 '24 19:07 trapexit

Thanks for the reply. I think the strace shows the df command not completing for ~58 seconds, no activity from 19:47:00 - 19:47:58

I thought df/ls on the host would be a good test rather than the actual apps (which are dockerised).

I can give the output of those other commands next time it happens.

Jul 24 '24 20:07 Hark0q

Using basic tooling is what I suggest using in the docs but a filesystem is a complex beast with many types of functions so getting more than one function testing is useful in debugging. statfs is different from stat is different from file creation, etc. That's why simple tools like "touch", "ls", "rm", "ln", etc. are useful for debugging. They don't do a whole lot. In any case I didn't notice that as later in the log it shows interactions with mergerfs working fine so it isn't entirely held up.

Jul 24 '24 20:07 trapexit

There mergerfs trace shows nothing but a gap in time around then which df had issued a stat request. Without logs upto the event or external information there isn't much there to comment on.

After that point things look pretty normal. stat'ing files mostly.

Jul 24 '24 20:07 trapexit

Do you have any advice on how I can capture traces for this given it’s intermittent?

I was thinking about setting up a rolling trace buffer on the key processes and using ls/df as the test, when it takes >20 seconds to complete to write out the trace buffer to record the trace before the issue starts. Would this work and help to diagnose the issue?

Aug 01 '24 01:08 Hark0q

Having traces will only show who is being slow. It won't necessarily diagnose anything. There is very little going on in mergerfs that can cause slowdown. It's primarily just a proxy. Mostly, if there is slowdown it's because the underlying devices are overwhelmed or there is buffer bloat and swapping going on. This is why general system measurements can be important here. Load avg, IO waits, etc.

I'm not aware of any built in sampling behavior in strace. The problem is if you have some external loop checking for delay and it is buffer bloat related or similar then by the time you trace it could be done flushing.

The easiest thing would be to disable all the caches and ensure flush on close is enabled and compare behavior.

Aug 01 '24 14:08 trapexit

Hi @trapexit, I am not sure if my comment is in the right place here, but I'll try anyway. I was experiencing a similar issue just now. In my case I am merging a remote directory on onedrive (via rclone) and a local directory together. Also in my case, mergerfs was not at fault - clearly the remote directory on onedrive was.

But I was thinking if there was a feature to configure timeouts so that mergerfs does not wait for an answer from the underlying file system if a certain timeout was reached? Of course, I will check that for rclone too. But I also had similar issues in the past where the SAN storage I merged was experiencing too high load so that read requests were running in a timeout. That's where I can't rely on rclone either.

tl;dr: is there a way to configure a timeout for reads/writes?

Sep 08 '25 21:09 d-Rickyy-b

I would prefer moving such discussion to the discussions page rather than overloading an existing ticket.

But to respond: there is no such thing as a timeout in filesystem APIs. They block. They can block forever and that is "valid." You can't tell the kernel "try to write this data but if it takes more than X nanoseconds abort." That's simply not a thing. The best mergerfs could do is, under the assumption that the call will return eventually, time some specific request to the kernel and if it takes a long time put the branch in "timeout" or in a "bad" state which would result it in it being ignored going forward. But it would probably need to be "takes X nanoseconds + returns some error condition" which might catch certain network filesystem issues. But then the question is... then what? Does it stay in the bad state till someone manually changes it? Is the expectation that mergerfs would try to watch for it to "get better"? This is a pretty non-trivial issue.

Sep 08 '25 21:09 trapexit

I would prefer moving such discussion to the discussions page

Alright, thanks for your response though!

This is a pretty non-trivial issue.

Understood, thank you for the explanation. I guess, I'll have to deal with it differently (maybe have some script periodically check the response times and kill the rclone connection, as the two only parameters I was able to find, --contimeout and --timeout, don't seem to fix my issue with rclone).

Sep 08 '25 22:09 d-Rickyy-b

mergerfs mergerfs copied to clipboard

Mergerfs mount not responsive intermittently

mergerfs
mergerfs copied to clipboard