mergerfs
Permission denied through mergerfs, allowed through disk mount
Describe the bug
I have a bunch of disks merged together through mergerfs. A folder is owned by a different user but a group my user is in. The group has the rw permissions on the folder. When trying to create a new file or edit an existing one inside this folder, I get a permission denied error. If I try to create or edit the same file, but accessing the disk directly instead of going through mergerfs it succeeds. It also works if the folder is owned by my user, but I don't see why accessing it through the group shouldn't work here.
To Reproduce
- sudo mkdir /mnt/mergerfs/test
- sudo chgrp users /mnt/mergerfs/test
- sudo chmod g+rw /mnt/mergerfs/test
- touch /mnt/mergerfs/test/foo <- permission denied
- touch /mnt/disk2/test/foo <- allowed
goz3rr@cheyenne:/mnt/mergerfs$ ls -la test/
total 8
drwxrwsr-x 2 root users 4096 Mar 2 20:49 .
drwxr-sr-x 3 root users 4096 Jan 6 10:45 ..
-rw-r--r-- 1 goz3rr users 0 Mar 2 20:49 foo
Expected behavior
Permissions through mergerfs are the same as the underlying filesystem.
System information:
- OS, kernel version:
Linux cheyenne 4.19.0-13-amd64 #1 SMP Debian 4.19.160-2 (2020-11-28) x86_64 GNU/Linux
- mergerfs version:
mergerfs version: 2.24.2
FUSE library version: 2.9.8-mergerfs
fusermount version: 2.9.9
using FUSE kernel interface version 7.19
- mergerfs settings: mounted through fstab:
/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi2 /mnt/disk1 ext4 defaults 0 0
/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi3 /mnt/disk2 ext4 defaults 0 0
/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi4 /mnt/disk3 ext4 defaults 0 0
/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi5 /mnt/parity1 ext4 defaults 0 0
/mnt/disk1:/mnt/disk2:/mnt/disk3 /mnt/chungus fuse.mergerfs defaults,allow_other,direct_io,use_ino,minfreespace=20G,fsname=mergerfs 0 0
- List of drives, filesystems, & sizes:
Filesystem Size Used Avail Use% Mounted on
udev 16G 0 16G 0% /dev
tmpfs 3.2G 12M 3.2G 1% /run
/dev/mapper/cheyenne--vg-root 47G 14G 32G 31% /
tmpfs 16G 0 16G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 16G 0 16G 0% /sys/fs/cgroup
mergerfs 22T 7.9T 14T 37% /mnt/mergerfs
/dev/sdf1 63G 37G 24G 61% /mnt/appdata
/dev/sda1 236M 85M 140M 38% /boot
/dev/sde 7.3T 5.4T 2.0T 74% /mnt/disk1
/dev/sdb 7.3T 5.7T 1.6T 78% /mnt/parity1
/dev/sdd 7.3T 614G 6.7T 9% /mnt/disk2
/dev/sdc 7.3T 2.0T 5.3T 28% /mnt/disk3
tmpfs 3.2G 0 3.2G 0% /run/user/1000
- strace of mergerfs while app tried to do its thing: mergerfs.strace.txt
- Please use an up to date version. 2.24.2 was released Mar 24, 2018.
- It works fine / as expected for me performing the instructions above. What groups is the user a member of? Did you change the group membership while mergerfs was running?
- This is the latest available version for Debian 10 (stable), but I'll try to manually update it
- The user is a member of the goz3rr cdrom floppy sudo audio dip video plugdev users netdev groups. The user was added to the users group while mergerfs was running. It seems that the problem does not happen with a user that was already part of the users group during last boot.
I also noticed that while the main mergerfs process is running as root, some of the threads are running under the goz3rr account (which was added to the users group after boot). Is that normal and possibly the cause?
I've tried reloading mergerfs with sudo mount -a but this doesn't seem to be supported, and having to restart the entire system every time groups are modified doesn't sound like a great solution.
- Debian isn't a rolling release OS. Versions are relatively fixed once released. I've no control over the version they distribute but their release schedule is very slow so no matter what software you use that comes from Debian (or other non-rolling release OSes) you should investigate the versions you're using vs upstream. This is why it's mentioned in the support section to check the most recent release.
- https://github.com/trapexit/mergerfs#supplemental-user-groups If the user group was added after mergerfs was started and the user's group info queried then this is expected behavior. Yes it is normal for threads to run as the user issuing the request. That's a necessary behavior and has nothing to do with this situation.
"mount -a"? You mean "mount -oremount"? "mount -a" just mounts everything in fstab which has nothing to do with the mount itself and mergerfs will do just fine. But no FUSE filesystem supports remount. It can't. Not a supported behavior by the kernel. As for not picking up new subgroups... it's not ideal but it was also not easy to support in a way that wasn't costly. It also doesn't affect people 99.9999% of the time so effort was spent elsewhere. There isn't a practical way to know when group membership changes and the lookup is very expensive. The best I could do is offer a command to force a clear of the cache and/or a time or call count based clear.
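The distinction that matters here is between the groups a process carries and the groups the account database reports for a user right now. A minimal shell sketch, assuming GNU `id`, that prints both for the current user; `sorted_gids` is a helper name made up for this example. Neither list reveals what mergerfs has cached, but a mismatch shows that membership changed after the process (or its login session) started:

```shell
#!/bin/sh
# Sketch only: compare the calling process's groups with a fresh lookup
# in the group database. The helper name sorted_gids is made up here.

sorted_gids() {
    # Print the given numeric gids one per line, sorted, without duplicates.
    printf '%s\n' "$@" | sort -n -u
}

# Fall back to the numeric uid if the account has no name.
user=$(id -un 2>/dev/null) || user=$(id -u)

# `id -G` with no operand reports the credentials of this process.
proc_gids=$(sorted_gids $(id -G))
# `id -G USER` resolves the user's groups from the account database.
db_gids=$(sorted_gids $(id -G "$user" 2>/dev/null))

if [ "$proc_gids" = "$db_gids" ]; then
    echo "groups of this process match the database for $user"
else
    echo "groups of this process differ from the database for $user"
    echo "process : $(echo $proc_gids)"
    echo "database: $(echo $db_gids)"
fi
```

If the two lists differ, logging out and back in refreshes the shell, but per the explanation above mergerfs keeps its own cached answer until it is restarted.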
I believe I have pretty much the same issue. First I discovered it in a docker container, now I reproduced it in the host system.
OS Info
OS: Arch Linux
Mergerfs Version: 2.32.3
Kernel: 5.10.15
Fuse: 3.10.2
Mount arguments: allow_other,use_ino,noforget,inodecalc=path-hash,nfsopenhack=all,cache.files=partial,dropcacheonclose=true,category.create=epff,func.mkdir=epall,moveonenospc=true,minfreespace=100G,ignorepponrename=true
Steps to reproduce
- Create a directory with a certain group and rw access for this group
drwxrwxr-x 2 http http 4096 Mar 16 21:27 .
drwxr-xr-x 14 root root 4096 Mar 16 21:26 ..
-rw-rw-r-- 1 http http 0 Mar 16 21:27 test
- Add this group to a user
$ groups raziel
wheel http raziel
- try to add a new file after changing to this user
$ touch test2
touch: cannot touch 'test2': Permission denied
I have tried exactly the same scenario in a non-mergerfs directory, where it works perfectly.
If you need any more info just ask, I'm happy to provide it. I can also offer to compile and install a new version if you'd like to test or debug something.
Was the group membership changed after mergerfs was up and running and you had accessed something as that user? If so then the previously explained reason is why it happens. mergerfs has to cache the credentials due to the high cost of querying them. I could add timeouts based on time or number of queries. Maybe occasionally look to see if the /etc/group file changed but that is pretty hacky and won't work in all cases. There simply isn't a mechanism I'm aware of to know such a thing has changed. Even if I had timeouts it'd just lead to weird behaviors where it fails to work for a while and then starts working seemingly without reason.
As for usage in Docker. That can be a whole other can of worms that is also not something mergerfs can do much about. Inside a container, unless you share your users and groups somehow, you will have different values in and out of the container. That's how it's supposed to be. And it only gets more complicated from there if you decide to use user namespaces.
BTW the docs talk about all this: https://github.com/trapexit/mergerfs#supplemental-user-groups
The part about C++11 is outdated now and I could rewrite it to make it less static but that wouldn't change anything with regard to this concern. I'm happy to revisit this but there are no good solutions. Only compromises.
I have missed that part. I have not restarted mergerfs after adding the group. On the host side it indeed works after a restart. Having a command to flush this cache would be really nice though, since I basically need to restart the machine to restart mergerfs. Just too many dependencies to easily umount and remount.
In the docker container, it still does not work. There I first encountered this issue, but I can't do anything about it since the dockerized application resets the permissions all the time, so only the group has access to this file. If I change the config directory to be on a non-mergerfs, the issue disappears. Maybe I can add the needed user / group config to the host, but this will get messy with many containers.
re: containers
I'm not sure I understand. How do you have things set up? Is mergerfs in its own container or in the default namespaces? Are you using user namespaces or not? What exactly is the setup? mergerfs is given the uid and gid from the kernel. If you're using user namespaces they won't be the same as the default namespace. And if you are using secondary groups then you have to share them. That isn't unique to mergerfs.
Mergerfs is running on the host, not inside docker. mergerfs mounts to /mnt/user/. I am not aware of any namespaces, so I suppose I don't have anything special configured.
The docker-compose file looks like the following (I've removed network and other stuff to keep it short. If you need it, I can attach the full file):
version: "3"
services:
  pihole:
    image: pihole/pihole:latest
    volumes:
      - '/mnt/user/appData/docker/pihole/config/:/etc/pihole/'
      - '/mnt/user/appData/docker/pihole/dnsmasq.d/:/etc/dnsmasq.d/'
      #- '/root/pihole/config/:/etc/pihole/'
      #- '/root/pihole/dnsmasq.d/:/etc/dnsmasq.d/'
    restart: unless-stopped
If I switch the volume paths from /mnt/user/... to /root/..., it suddenly works. I also checked if the permissions might be different, but they are exactly the same.
What are the errors you see exactly?
Have you entered the container's namespace and tried "touch"ing a file or similar? Lots of people, including myself, use mergerfs with docker (with containers running as root).
docker run --rm -it -v /mergerfs-mount:/mnt ubuntu:20.04
I use the pihole container all the time, though not with mergerfs, but I just set it up and it appears fine to me.
I can't really reproduce it easily with the ubuntu image. The point is: In pihole, the webserver does not run as root, but as the user www-data. This user is also in the group pihole:
$ groups www-data
www-data : www-data pihole
This way, it should have enough access to /etc/pihole/
So when testing like this:
docker run --rm -it -v /mnt/user/pi-test:/etc/pihole pihole/pihole:latest bash
su www-data -s /bin/bash
touch /etc/pihole/test
I get
touch: cannot touch '/etc/pihole/test': Permission denied
Pihole runs fine on first sight. Just when trying to modify the DB (for example enabling / disabling a group or adding a blocklist in the webinterface), it will show errors, because the gravity DB then seems to be readonly to the webserver.
I also tested if it has to do with the DB itself, because some SQLite stuff is mentioned in the mergerfs readme. But as soon as I chmod it to 777, it works. Sadly pihole will change the permissions back shortly.
If I run the exact same thing, just with changing the volume to be stored outside mergerfs, the example above works perfectly.
And what are your supplemental groups for 999 on the host? I'm guessing there aren't any. Hence the perm issues. You have to remember what you are doing / what is happening here.
- mergerfs is running in the default namespaces / env. Supplemental groups are queried via getgroups(2).
- Pihole is running in a container. The image has its own /etc/passwd and /etc/group, entirely separate from your "host" / normal environment. You aren't using NIS or LDAP, or bind mounting said files, to make the two environments share the same values.
- The pihole container apps are running as uid 0, 33, and 999.
- When the kernel sends requests to mergerfs it provides it with uid and gid values exactly. The kernel doesn't understand human strings. It understands only integers.
- The kernel doesn't provide process supplemental groups like it does the primary. They must be expressly setup via setgroups(2).
- So 33 or 999 or 0 make a request, it goes to the kernel, the kernel tells mergerfs the uid/gid, mergerfs looks up the supplemental groups for the uid on the "host" and caches them for use. Which for 33 and 999 are likely nothing.
- Since there are no supplemental groups to set when the underlying request is made it fails due to the group perms.
This is basically the same thing as if you had two completely different hosts with different groups and expected the same authz check to work. It won't. They are separate envs.
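The host-side lookup described in the list above can be approximated from a shell on the host. A rough sketch, assuming a system that provides `getent` and GNU `id`; `host_groups_for_uid` is a hypothetical helper, and it only shows what a fresh lookup on the host returns, not what mergerfs currently has cached:

```shell
#!/bin/sh
# Sketch only: show which supplemental groups the *host* resolves for a
# numeric uid, e.g. one used by a process inside a container.

host_groups_for_uid() {
    uid=$1
    # getent looks the uid up in the host's passwd database.
    entry=$(getent passwd "$uid") || {
        echo "uid $uid: no passwd entry on this host" >&2
        return 1
    }
    name=${entry%%:*}
    # id -G prints every gid the host associates with that account.
    echo "uid $uid ($name): $(id -G "$name")"
}

# uid 0 is used as a safe default; try 33 or 999 as in the thread.
host_groups_for_uid 0
```

If the uid used inside the container has no entry on the host, or its entry lacks the group that owns the files, the group-based access discussed above has nothing to work with.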
It works when you aren't using mergerfs because the supplemental groups are picked up from within the container.
The supplemental group query interface is done by libc. I can't just plug into it to grab something different. The best I could possibly do is (optionally) use the container's /etc/group file as an alternative to the host one but that's a can of worms. Then, rather than just keeping a cache based on uid, I have to create a cache based on process ID and userID. But it's way worse than that because I don't know when a process exits so every single request will require validation to be safe. I won't know the difference between UID 1000 in the container and 1000 on the host without confirming the process has a different root mount. Otherwise it'd be a security risk.
Or... you could add the relevant entries into your /etc/group file.
I suppose there are other solutions such as forcing mergerfs to add gid's to a particular uid or something but that's just a different vector for the /etc/group data.
I plan on rewriting the cache and will add the ability to clear it but it won't come probably till after 3.x release. I could add uid,gid and subgroup manipulation too but that might be a can of worms and would touch a lot of stuff and introduces security risks that need to be carefully considered.
I've been troubleshooting a very similar issue for the past 2 days with no success, but even the disk mount is throwing a "permission denied" error. This is 100% my own fault as I have a limited understanding of mergerfs and have not read the provided documents, and have been trying to reduce the mergerfs/snapraid pool from 5 drives (3 internal, 2 USB) to 3.
There is nothing fancy going on. mergerfs simply returns back entitlement metadata (IE permissions and uid/gid) to the kernel and the kernel enforces it. That's it. Simply change the permissions either through mergerfs or on the underlying branches and unless you're messing with container user namespaces or non POSIX filesystems you should be good.
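A quick way to check that the pool and the branches report the same metadata is to stat the same relative path under each. A minimal sketch, assuming GNU `stat`; `compare_meta` is a hypothetical helper, and the roots in the example call are the paths from the first report:

```shell
#!/bin/sh
# Sketch only: print owner, group and mode of one relative path under
# several roots (the mergerfs mount and each branch).

compare_meta() {
    rel=$1
    shift
    for root in "$@"; do
        # %u:%g = numeric owner and group, %a = octal mode, %n = name
        stat -c '%u:%g %a %n' "$root/$rel" 2>/dev/null ||
            echo "missing $root/$rel"
    done
}

# Example roots taken from the original report; adjust to your layout.
compare_meta test /mnt/mergerfs /mnt/disk1 /mnt/disk2 /mnt/disk3
```

Running it once as root and once as the affected user (for example with `sudo -u`) shows whether the two see the same thing.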
Sorry if this is resurrecting an old issue, but I have the same kind of issue and I just don't understand how I should fix permissions in this case. I just want to pool my drives with mergerfs and use the pooled mount as data location for nextcloud-aio docker setup.
After a week of not finding a solution to the permission issues I reinstalled Ubuntu 22.04 with ext4 filesystem, installed docker from docker repository, setup mergerfs (v2.34.1 ) again. Mergerfs line in /etc/fstab is:
/mnt/disk* /mnt/storage fuse.mergerfs defaults,nonempty,allow_other,use_ino,cache.files=partial,moveonenospc=true,dropcacheonclose=true,minfreespace=10G,fsname=mergerfs 0 0
The disks in /mnt/disk* are gpt and ext4 formatted.
I also added override for docker service to start only after the mount has completed.
# /etc/systemd/system/docker.service.d/override.conf
[Unit]
After=mnt-storage.mount
Requires=mnt-storage.mount
I duplicated the nextcloud-aio-nextcloud container with portainer to look into it. As the root user everything looked all right, but as the www-data user things look messed up. As root I ran
stat -c "%u:%g %a" "$NEXTCLOUD_DATA_DIR"
33:0 750
Those are the permissions the container expects. Then I ran the same as the www-data user. /mnt/ncdata is the container location for the mergerfs folder /mnt/storage/nextcloud-data.
sudo -u www-data stat -c "%u:%g %a" "$NEXTCLOUD_DATA_DIR"
stat: cannot stat '/mnt/ncdata': No such file or directory
Then I ran ls -la as www-data in the container
sudo -u www-data ls -la /mnt
ls: cannot access '/mnt/ncdata': No such file or directory
total 8
drwxr-xr-x 1 root root 4096 Feb 17 11:36 .
drwxr-xr-x 1 root root 4096 Feb 25 11:17 ..
d????????? ? ? ? ? ? ncdata
Looks like it sees the folder per se, but doesn't understand it? What the www-data user looks like in the container:
id www-data
uid=33(www-data) gid=33(www-data) groups=33(www-data),33(www-data)
What the www-data user looks like on the host:
id www-data
uid=33(www-data) gid=33(www-data) groups=33(www-data)
Check the user in /etc/passwd in the container.
cat /etc/passwd | grep www-data
www-data:x:33:33:Linux User,,,:/home/www-data:/sbin/nologin
The same thing on the host:
cat /etc/passwd | grep www-data
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
The /etc/group in the container
cat /etc/group | grep www-data
www-data:x:33:www-data
The /etc/group on the host
cat /etc/group | grep www-data
www-data:x:33:
Any ideas would be welcome.
I also tested a different image you mentioned earlier, ubuntu:20.04, and it looks the same.
docker run --rm -it -v /mnt/storage/nextcloud-data/:/mnt/data -u 33 ubuntu:20.04
www-data@c09e46564e05:/$ ls -la /mnt/
ls: cannot access '/mnt/data': No such file or directory
total 8
drwxr-xr-x 1 root root 4096 Feb 25 14:56 .
drwxr-xr-x 1 root root 4096 Feb 25 14:56 ..
d????????? ? ? ? ? ? data
mounting the underlying disk1 works
docker run --rm -it -v /mnt/disk1/nextcloud-data/:/mnt/data -u 33 ubuntu:20.04
www-data@611025aa59cf:/$ ls -la /mnt/
total 12
drwxr-xr-x 1 root root 4096 Feb 25 15:00 .
drwxr-xr-x 1 root root 4096 Feb 25 15:00 ..
drwxr-x--- 2 www-data root 4096 Feb 25 01:12 data
1. truncate -s 1G /tmp/disk.img
2. mkfs.ext4 -m0 /tmp/disk.img
3. mkdir /tmp/disk /tmp/mergerfs
4. sudo mount /tmp/disk.img /tmp/disk
5. sudo chmod 1777 /tmp/disk
6. mount -f -o allow_other,use_ino,cache.files=partial,moveonenospc=true,dropcacheonclose=true,minfreespace=0 /tmp/disk /tmp/mergerfs
7. stat /tmp/disk /tmp/mergerfs
8. sudo -u www-data stat /tmp/disk /tmp/mergerfs
9. docker run -u www-data:www-data --rm -it -v /tmp/mergerfs:/tmp/mergerfs ubuntu:20.04 stat /tmp/mergerfs
I do that and everything works as expected. Please replicate and tell me where it doesn't work for you.
Right, all of that works without errors.
But my case was to test a directory with uid:gid 33:0 and 750 permissions inside that /tmp/mergerfs mount ie. /tmp/mergerfs/data.
When I try to continue from where your exercise left me and run mkdir /tmp/disk/data, that doesn't even show up in /tmp/mergerfs/. No file or permission changes replicate to /tmp/mergerfs. So now I really don't know what's going on.
@twaananen is one of your pooled drives NTFS? That was my issue. I copied everything off of it, reformatted the drive to ext0 (I think) and copied everything back, stuff works now
@th0mcat No, all of the pool drives are ext4.
Add to my example exactly what you did that doesn't work. I just added a directory through mergerfs and directly to /tmp/disk and it works as expected.
Right, so
10. mkdir /tmp/disk/data
11. ls -la /tmp/disk/
12. ls -la /tmp/mergerfs/
Step 11 shows new directory, step 12 does not.
As what uid:gid?
As my default user 1000:1000. /tmp/disk and /tmp/mergerfs are also owned by 1000:1000
Then it's not exactly the same as what I had above. I need to be able to replicate exactly, literally the same thing to properly test anything.
Regardless... no problems when I test this.
Right, I literally copy pasted the commands
- mount -f -o allow_other,use_ino,cache.files=partial,moveonenospc=true,dropcacheonclose=true,minfreespace=0 /tmp/disk /tmp/mergerfs
As per your picture this would be mergerfs instead of mount. Now with that, I can create the directory and everything seems to work with step 9. The main difference I see is the 1777 permission set on the underlying disk /tmp/disk, which would be /mnt/disk1 for me.
So now I'll make a comparison between this test setup and my current setup:
TEST:
1. docker run -u www-data:www-data --rm -it -v /tmp/mergerfs/data:/mnt/data ubuntu:20.04 ls -la /mnt/
total 12
drwxr-xr-x 1 root root 4096 Feb 26 23:06 .
drwxr-xr-x 1 root root 4096 Feb 26 23:06 ..
drwxr-x--- 2 www-data root 4096 Feb 26 22:10 data
2. docker run -u www-data:www-data --rm -it -v /tmp/mergerfs/data:/mnt/data ubuntu:20.04
www-data@82c6a63d0bd4:/$ ls -la /mnt/
total 12
drwxr-xr-x 1 root root 4096 Feb 26 23:05 .
drwxr-xr-x 1 root root 4096 Feb 26 23:05 ..
drwxr-x--- 2 www-data root 4096 Feb 26 22:10 data
CURRENT:
1. docker run -u www-data:www-data --rm -it -v /mnt/storage/nextcloud-data/:/mnt/data ubuntu:20.04 ls -la /mnt/
ls: /mnt/data: No such file or directory
total 12
drwxr-xr-x 1 root root 4096 Feb 26 23:08 .
drwxr-xr-x 1 root root 4096 Feb 26 23:08 ..
drwxr-x--- 2 www-data root 4096 Feb 26 22:48 data
2. docker run -u www-data:www-data --rm -it -v /mnt/storage/nextcloud-data/:/mnt/data ubuntu:20.04
www-data@e4b8605b47fe:/$ ls -la /mnt/
ls: cannot access '/mnt/data': No such file or directory
total 8
drwxr-xr-x 1 root root 4096 Feb 26 23:08 .
drwxr-xr-x 1 root root 4096 Feb 26 23:08 ..
d????????? ? ? ? ? ? data
Interestingly 1. doesn't show d????????? but still says no such file or directory. So is the solution to set my disks 1-4 permissions to 1777?
ls -la /mnt on the host
drwx------ 5 tommi tommi 4096 helmi 26 21:16 disk1
drwx------ 6 tommi tommi 4096 helmi 27 00:48 disk2
drwx------ 4 tommi tommi 4096 helmi 14 01:04 disk3
drwx------ 4 tommi tommi 4096 helmi 14 01:01 disk4
drwxr-xr-x 3 root root 4096 helmi 13 15:17 parity1
drwx------ 5 tommi tommi 4096 helmi 26 21:16 storage
compared to ls -la /tmp on the host
drwxrwxrwt 4 tommi tommi 4096 helmi 27 00:10 disk
-rw-rw-r-- 1 tommi tommi 1073741824 helmi 27 00:39 disk.img
drwxrwxrwt 4 tommi tommi 4096 helmi 27 00:10 mergerfs
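Reaching a directory requires search permission on every directory above it, and the thread notes that mergerfs threads run as the user issuing the request, so the ancestors of a path on each branch may matter as much as the final directory. That is an assumption worth testing here, not a confirmed diagnosis. A small sketch, assuming GNU `stat`, that prints the whole chain for a path; `show_path_perms` is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch only: print mode and numeric owner for a path and each of its
# ancestors, so a directory lacking search (x) permission stands out.

show_path_perms() {
    p=$1
    while :; do
        # %A = symbolic mode, %u:%g = numeric owner and group, %n = name
        stat -c '%A %u:%g %n' "$p"
        case $p in
            /|.) break ;;
        esac
        p=$(dirname "$p")
    done
}

# Defaults to /tmp so the script runs anywhere; pass a real path instead.
show_path_perms "${1:-/tmp}"
```

Calling it for /mnt/disk1/nextcloud-data and for /tmp/disk/data puts the two chains side by side, which makes a 700 versus 1777 difference at the branch root easy to spot.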
As I show in the photo the mount was not 1777. I replicated what I thought you said your setup is. Please just create me a script or specific list of calls just like I did above to replicate this. There is no reason why we can't have literally the exact same setup.
Steps 9 and 10 in another terminal. Edit: sorry almost forgot the important part, difference between 14-17 and 22-25.
1. truncate -s 1G /tmp/disk1.img
2. truncate -s 1G /tmp/disk2.img
3. mkfs.ext4 -m0 /tmp/disk1.img
4. mkfs.ext4 -m0 /tmp/disk2.img
5. mkdir /tmp/disk1 /tmp/disk2 /tmp/mergerfs
6. sudo mount /tmp/disk1.img /tmp/disk1
7. sudo mount /tmp/disk2.img /tmp/disk2
8. chmod 700 /tmp/disk1 /tmp/disk2
9. sudo su
10. mergerfs -f -o allow_other,use_ino,cache.files=partial,moveonenospc=true,dropcacheonclose=true,minfreespace=0 /tmp/disk1:/tmp/disk2 /tmp/mergerfs
11. mkdir /tmp/disk1/nextcloud-data
12. chmod 750 /tmp/disk1/nextcloud-data
13. sudo chown 33:0 /tmp/disk1/nextcloud-data
14. docker run -u www-data:www-data --rm -it -v /tmp/mergerfs/nextcloud-data:/mnt/data ubuntu:20.04 ls -la /mnt/
15. docker run -u www-data:www-data --rm -it -v /tmp/mergerfs/nextcloud-data:/mnt/data ubuntu:20.04
16. ls -la /mnt/
17. exit
18. docker run -u www-data:www-data --rm -it -v /tmp/mergerfs/nextcloud-data:/mnt/data ubuntu:20.04 stat /mnt/data
19. docker run -u www-data:www-data --rm -it -v /tmp/mergerfs/nextcloud-data:/mnt/data ubuntu:20.04
20. stat /mnt/data
21. exit
22. docker run -u www-data:www-data --rm -it -v /tmp/disk1/nextcloud-data:/mnt/data ubuntu:20.04 ls -la /mnt/
23. docker run -u www-data:www-data --rm -it -v /tmp/disk1/nextcloud-data:/mnt/data ubuntu:20.04
24. ls -la /mnt/
25. exit