mergerfs
mergerfs copied to clipboard
Renaming or creation of file fails (Telegram won't start)
Describe the bug
I have recently changed from nfs4 to virtiofs for some mounts (including /home) in a VM. Since I changed the mount on the home folder, I get a weird error when starting Telegram in this VM. It complains something about not being able to open a file:
[2021.12.22 18:20:15] Launched version: 3003000, install beta: [FALSE], alpha: 0, debug mode: [FALSE]
[2021.12.22 18:20:15] Executable dir: /usr/bin/, name: telegram-desktop
[2021.12.22 18:20:15] Initial working dir: /home/raziel/
[2021.12.22 18:20:15] Working dir: /home/raziel/.local/share/TelegramDesktop/
[2021.12.22 18:20:15] Command line: telegram-desktop --
[2021.12.22 18:20:15] Executable path before check: /usr/bin/telegram-desktop
[2021.12.22 18:20:15] Logs started
[2021.12.22 18:20:15] Launcher filename: telegramdesktop.desktop
[2021.12.22 18:20:15] Connecting local socket to /run/user/1000/a0b1cff8440a846e992153710f78f8a0-{87A94AB0-E370-4cde-98D3-ACC110C5967D}...
[2021.12.22 18:20:15] Socket connect error 0, starting server and app...
[2021.12.22 18:20:15] Could not open '/home/raziel/.local/share/TelegramDesktop/log.txt' file to start new logging: No such file or directory
[2021.12.22 18:20:15] Executable path before check: /usr/bin/telegram-desktop
[2021.12.22 18:20:15] FATAL: Could not move logging to '/home/raziel/.local/share/TelegramDesktop/log.txt'!
It looks like it has something to do with mergerfs, because if I change the mount to the non-mergerfs-source, everything works.
It might be related to #989 because it sounds a bit familiar. Although this runs on an Arch Linux Server and client.
I have added an strace of the app and mergerfs, done at the same time. I hope you can do something with them, because I am a bit lost there. As far as I understand app.strace, the log file gets deleted and another one is moved in its place. But I don't understand the mergerfs.strace.
I can delete (rm
) the mentioned file without any issue, and it even gets recreated when starting Telegram. But Telegram seems to see this differently.
To Reproduce
- Bind home as mount with virtiofs
- Start telegram
I could not reproduce it with any default tools like rm, mv, ln or touch
Expected behavior
The file should get copied / moved without errors
System information:
- OS, kernel version: Linux sec 5.15.10-arch1-1-vfio #1 SMP PREEMPT Mon, 20 Dec 2021 17:55:46 +0000 x86_64 GNU/Linux
- mergerfs version: 2.33.3
- mergerfs settings
/mnt/disk1:/mnt/disk2 /mnt/userHdd/ fuse.mergerfs allow_other,noforget,inodecalc=path-hash,nfsopenhack=all,cache.files=partial,dropcacheonclose=true,category.create=mfs,ignorepponrename=true,fsname=diskHdd 0 0
/mnt/cache:/mnt/userHdd /mnt/user/ fuse.mergerfs allow_other,noforget,inodecalc=path-hash,nfsopenhack=all,cache.files=partial,dropcacheonclose=true,category.create=epff,func.mkdir=epall,moveonenospc=true,minfreespace=100G,ignorepponrename=true,fsname=disk* 0 0
- List of drives, filesystems, & sizes:
-
df -h
:
-
diskHdd 15T 7.9T 5.9T 58% /mnt/userHdd
disk* 17T 8.8T 6.7T 57% /mnt/user
/dev/sdb1 7.3T 4.0T 3.0T 58% /mnt/disk1
/dev/sdc1 7.3T 4.0T 3.0T 58% /mnt/disk2
/dev/sda1 7.3T 3.8T 3.2T 55% /mnt/parity0
/dev/md127p1 1.8T 905G 835G 53% /mnt/disk
-
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 1 7.3T 0 disk
└─sda1 8:1 1 7.3T 0 part /mnt/parity0
sdb 8:16 1 7.3T 0 disk
└─sdb1 8:17 1 7.3T 0 part /mnt/disk1
sdc 8:32 1 7.3T 0 disk
└─sdc1 8:33 1 7.3T 0 part /mnt/disk2
nvme2n1 259:0 0 931.5G 0 disk
└─nvme2n1p1 259:1 0 931.5G 0 part
└─md127 9:127 0 1.8T 0 raid0
└─md127p1 259:7 0 1.8T 0 part /mnt/cache
/mnt/disk0
nvme0n1 259:2 0 931.5G 0 disk
└─nvme0n1p1 259:3 0 931.5G 0 part
└─md127 9:127 0 1.8T 0 raid0
└─md127p1 259:7 0 1.8T 0 part /mnt/cache
/mnt/disk0
-
A strace of the application having a problem: app.strace.txt.gz (sorry for zipping it, github didn't allow me to upload another txt file somehow...)
-
strace of mergerfs while app tried to do it's thing: mergerfs.strace.txt
The mergerfs trace doesn't show interacting with the log files. Either the temp file or log.txt.
Are you able to reproduce this with Telegram (or better yet simple tooling) without all that setup?
BTW... you need to use use_ino
I had a bunch of other issues with use_ino
together with nfs. But since I am moving away from it, I can give it another try.
I now disabled the FS caching of virtiofs (server and client), because I thought maybe I get a better trace then...and instead it works now. This leaves me very confused on how virtiofs and mergerfs interact, but I am also happy that it works. I would create an issue at virtiofs...but honestly I am not sure on how to describe the issue anymore.
As the docs mention use_ino
is required for inodecalc to make any sense. Otherwise libfuse is just giving out random inode values.
caching could explain some of the situation but I don't know enough about virtiofs to comment without more investigation (which btw is being changed in the kernel to use basically the same codepath as FUSE filesystems)
Caching shouldn't be keeping calls from reaching mergerfs generally. Especially not create calls. Would need more investigation.
I will finish the migration and change use_ino
as soon as possible. Maybe it helps.
I will also play around with caching a bit and make a few more traces. Thank you so far :-)
Turned out I got the wrong pid when doing the strace of mergerfs...There are two instances and I got the wrong one :-(
These traces now both show log.txt. Traces done when the error occured (so cache back to normal) traces.zip
Oh, I also added use_ino
, which no visible change so far
use_ino impacts the inode generation. You set inodecalc
and that has no meaning/impact unless you enable use_ino
.
It looks like the traces are from different times.
You're right. I'll have a look at it tomorrow, looks like I am too tired to manage it right now.
So this is the one. Picked the wrong file yesterday, sorry for this. Now both of them are from the same time and say something about log.txt.
After the unlink
there is no further evidence that mergerfs is being requested to do anything with log.txt
. You said if you turn off some caching it works? Can you trace that?
It's calling renameat2
which mergerfs doesn't yet implement but that shouldn't matter as the kernel errors and the software falls back to a link + unlink behavior.
Everything looks the same between the two until after the temporary file for log.txt is unlinked. After that the client tries to chmod the linked log.txt file. When caching is off it succeeds and when on it fails. It fails before it even gets to mergerfs. Interestingly, after the chmod, even in the case where the chmod fails, it tries to open log.txt to append to it. With caching it fails with ENOENT implying part of the path doesn't exist and I'm not seeing anything obvious in mergerfs causing it (or that it even gets called).