unionfs-fuse

cat: /dev/null: Permission denied

Open lvml opened this issue 7 years ago • 21 comments

[I just realized the problem is much easier to reproduce, so here is my edited report:]

When any kind of unionfs is created on top of the root file system, device files such as /dev/null or /dev/zero cannot be accessed, even if the user has permission to access them:

> unionfs -o use_ino,suid,dev,nonempty,allow_other  -o relaxed_permissions /=RW  union_root
>
> cat union_root/dev/null
cat: union_root/dev/null: Permission denied

I also tried with/without -o relaxed_permissions, with -o nodev and -o dev etc. - but that does not change the symptom.

When I look into the output when using unionfs with "-d", I find the following references to /dev/null in it:

LOOKUP /dev/null
getattr /dev/null
   NODEID: 59
   unique: 341, success, outsize: 144

An "strace" on the unionfs process does not reveal any good hint, either - the process will "lstat" /dev/null successfully, and no system call involving /dev/null fails with EACCES or EPERM.

lvml avatar Jun 06 '17 22:06 lvml

hmmm, this seems to be an issue deeper in fuse itself. the passthrough example from the libfuse examples behaves the same...

rpodgorny avatar Jun 15 '17 14:06 rpodgorny

Should we submit an issue report to the libfuse maintainers?

lvml avatar Jun 15 '17 16:06 lvml

not sure, yet. maybe it's some kind of internal limitation that fuse is unable to work with character devices (or maybe no special files at all?)... or maybe i'm just doing something wrong.

but i guess it won't hurt to open an issue - the worst thing that can happen is for it to be closed as invalid, so feel free to go ahead... ;-)

rpodgorny avatar Jun 15 '17 19:06 rpodgorny

Does this only happen with /dev/null or all character devices?

On a practical side though: Most systems nowadays use a tmpfs or devtmpfs for /dev and you could use

mount --move /dev /union_root/dev

and keep /dev out of the unionfs.

mrvn avatar Jun 23 '17 15:06 mrvn

On 06/23/2017 05:00 PM, Goswin von Brederlow wrote:

> Does this only happen with /dev/null or all character devices?

It happens with all character devices I tried.

> On a practical side though: Most systems nowadays use a tmpfs or devtmpfs for /dev and you could use mount --move /dev /union_root/dev and keep /dev out of the unionfs.

That would not work without root privileges.

The main reason to use unionfs-fuse for this use case is that it can be used without root privileges.

lvml avatar Jun 23 '17 17:06 lvml

unionfs as / without root? You are not going to be able to set that up.

The normal way to do this is to make /sbin/init a script that starts unionfs and then does pivot_root + chroot. You are very much root there.

You might be able to fake it using fakechroot but that would be a special use case.

mrvn avatar Jun 27 '17 14:06 mrvn

Using a unionfs-fuse mounted filesystem as / works very well without root privileges, as long as user namespaces are supported by the kernel - see https://github.com/AppImage/AppImageKit/issues/406 for details. You can just use unshare -r chroot ... to use it.

lvml avatar Jun 27 '17 17:06 lvml

In case you want to try the use-case I described: Find my "makeaoi" software here:

https://github.com/lvml/makeaoi

lvml avatar Jun 29 '17 18:06 lvml

I have my doubts about that makeaoi idea. Why is the overlay needed in the first place? Either the "bundle" really contains everything the application needs (in which case you could just chroot into it), or it does not - but in that case whatever functionality is pulled from the underlying filesystem will be mightily confused by the environment that it finds.

Seems like a much easier solution would be chroot, or an LD_PRELOAD that adds a prefix to each open() and exec() call.

Nikratio avatar Jul 13 '17 18:07 Nikratio

On 07/13/2017 08:17 PM, Nikolaus Rath wrote:

> I have my doubts about that makeaoi idea.

It actually does work pretty well already - I really enjoyed using an up-to-date "RawTherapee" on an oldish CentOS...

> Why is the overlay needed in the first place?

Because you want your application to be able to work with data from your /home/.... directory, use temporary directories, communicate with services that use unix domain sockets or FIFOs, access /proc/* and /sys/* etc.

> LD_PRELOAD that adds a prefix to each open() and exec() call.

I thought about this, but this does not really work - there are many pieces of software that make system calls directly instead of going through shared-library functions.

lvml avatar Jul 13 '17 18:07 lvml

Good news: I was able to implement example ioctl() support (starting with TCGETS and TCSETS, as those are used by bash and many other programs indirectly).

See https://github.com/lvml/unionfs-fuse/tree/fake_dev for a version of unionfs-fuse that supports a new option "-o fake_devices", which exposes devices as if they were regular files - but still allows the client application to perform ioctl()s on them.

Another feature enabled by "-o fake_devices" is support for accessing files under /proc/, which unionfs-fuse so far could not read or write.

This feature is now used in my https://github.com/lvml/makeaoi tool.

If anyone wants to test how this works: Compile the following into an executable named "test_ioctl":

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <termios.h>

/* Toggles the ECHO flag of the terminal on stdin, using raw ioctl()s. */
int main(void) {

        struct termios t;
        memset(&t, 0, sizeof(t));

        fprintf(stderr, "doing TCGETS\n");

        if (0 != ioctl(0, TCGETS, &t)) {
                fprintf(stderr, "ioctl failed: %s\n", strerror(errno));
                return 1;
        }

        /* tcflag_t is an unsigned type, so print it with %u */
        fprintf(stderr, "t.c_lflag = %u \n", (unsigned)t.c_lflag);
        fprintf(stderr, "t.c_lflag & ECHO = %u \n", (unsigned)(t.c_lflag & ECHO));

        /* flip echoing on/off */
        t.c_lflag ^= ECHO;

        fprintf(stderr, "doing TCSETS\n");

        if (0 != ioctl(0, TCSETS, &t)) {
                fprintf(stderr, "ioctl failed: %s\n", strerror(errno));
                return 1;
        }

        /* the libc wrapper should work on the same descriptor, too */
        fprintf(stderr, "doing tcgetattr\n");
        if (0 != tcgetattr(0, &t)) {
                fprintf(stderr, "tcgetattr() failed: %s\n", strerror(errno));
                return 1;
        }

        return 0;
}

Then call it via this script:

#!/bin/bash

test_ioctl </dev/tty

This just toggles the echoing of characters to the terminal you started the script from.

(I'm intending to add support for more ioctl()s when I see them used by other applications I would like to pack into "application overlay images".)

lvml avatar Jul 16 '17 21:07 lvml

On Jul 13 2017, Lutz Vieweg wrote:

> On 07/13/2017 08:17 PM, Nikolaus Rath wrote:

>> I have my doubts about that makeaoi idea.

> It actually does work pretty well already - I really enjoyed using an up-to-date "RawTherapee" on an oldish CentOS...

"I found an instance where it works" is not the same thing as "This is a reliable method" :-).

>> Why is the overlay needed in the first place?

> Because you want your application to be able to work with data from your /home/.... directory, use temporary directories, communicate with services that use unix domain sockets or FIFOs, access /proc/* and /sys/* etc.

In that case you're much better off bind-mounting those directories into your chroot.

As you've seen, you can't passthrough /dev at all, and you will run into trouble with /proc as well (think about eg /proc/self).

Best, -Nikolaus

-- GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

         »Time flies like an arrow, fruit flies like a Banana.«

Nikratio avatar Jul 17 '17 12:07 Nikratio

On 07/17/2017 02:58 PM, Nikolaus Rath wrote:

> In that case you're much better off bind-mounting those directories into your chroot.

mount --bind is not allowed to non-root users, therefore I cannot use it to allow ordinary users to run applications.

> As you've seen, you can't passthrough /dev at all, and you will run into trouble with /proc as well (think about eg /proc/self).

I already implemented a solution to the /proc/self issue, see https://github.com/lvml/unionfs-fuse/blob/fake_dev/src/fuse_ops.c#L175 which works well for me.

And the successful passing through of TCGETS and TCSETS gives me hope the passthrough of the most relevant /dev/* cases will be possible.

lvml avatar Jul 17 '17 15:07 lvml

On Jul 17 2017, Lutz Vieweg wrote:

> On 07/17/2017 02:58 PM, Nikolaus Rath wrote:

>> In that case you're much better off bind-mounting those directories into your chroot.

> mount --bind is not allowed to non-root users,

That's right. The same holds for mounting of fuse filesystems.

> therefore I cannot use it to allow ordinary users to run applications.

But that's not necessarily the right conclusion - otherwise you wouldn't be able to use FUSE either.

>> As you've seen, you can't passthrough /dev at all, and you will run into trouble with /proc as well (think about eg /proc/self).

> I already implemented a solution to the /proc/self issue, see https://github.com/lvml/unionfs-fuse/blob/fake_dev/src/fuse_ops.c#L175 which works well for me.

> And the successful passing through of TCGETS and TCSETS gives me hope the passthrough of the most relevant /dev/* cases will be possible.

I am not doubting that this can be made to work for specific cases. I am doubting that this can be used as a general purpose solution. You will forever be chasing corner cases.

Best, -Nikolaus

Nikratio avatar Jul 18 '17 12:07 Nikratio

On 07/18/2017 02:02 PM, Nikolaus Rath wrote:

>> mount --bind is not allowed to non-root users,

> That's right. The same holds for mounting of fuse filesystems.

While both the "mount" and the "fusermount" executables are usually installed with SUID to root there is a big difference:

"mount --bind /some/path /some/path" tells users: mount: only root can use "--bind" option

"fusermount /some/path /some/path" just works when invoked by non-root users.

If there was a better method to allow a user to prepare a "chroot"-like environment for arbitrary executables, I would of course love to use it.

>> And the successful passing through of TCGETS and TCSETS gives me hope the passthrough of the most relevant /dev/* cases will be possible.

> I am not doubting that this can be made to work for specific cases. I am doubting that this can be used as a general purpose solution. You will forever be chasing corner cases.

I totally agree that this is not "a general purpose solution" in the sense that you cannot expect 100% of all possible applications to run anywhere using "makeaoi".

But it works for many applications, might even work for most, and it only needs to work for those few applications where "compiling a new binary for the target system, solving whatever incompatibilities arise" is exceptionally cumbersome.

I would certainly not advise using "makeaoi" as some sort of "standard format to ship every application" one wants to use.

And all other options have their drawbacks, too:

  • virtual machines come with lots of extra weight, inefficiency, and don't integrate well with the working environment on the host

  • containers still isolate much more than required or reasonable for running one application for a user on the host

  • Snap/Flatpak require specifically prepared/compiled software executables compatible with the target OS - plus root rights there

I am not trying to make any of the above redundant, "makeaoi" is just meant to close a gap of use cases for which no other tool seems appropriate to me at this time.

lvml avatar Jul 18 '17 13:07 lvml

nikolaus, what would be needed to implement proper special device handling in fuse (so it could be implemented without the "as regular" hacks)? some kernel side changes or is it only a libfuse problem?

R.

rpodgorny avatar Jul 18 '17 13:07 rpodgorny

On Jul 18 2017, Lutz Vieweg wrote:

> On 07/18/2017 02:02 PM, Nikolaus Rath wrote:

>>> mount --bind is not allowed to non-root users,

>> That's right. The same holds for mounting of fuse filesystems.

> While both the "mount" and the "fusermount" executables are usually installed with SUID to root there is a big difference:

> "mount --bind /some/path /some/path" tells users: mount: only root can use "--bind" option

> "fusermount /some/path /some/path" just works when invoked by non-root users.

> If there was a better method to allow a user to prepare a "chroot"-like environment for arbitrary executables, I would of course love to use it.

Well, you just gave the answer: provide a setuid mount helper for bind mounts, just like libfuse provides one for fuse mounts.

>   • containers still isolate much more than required or reasonable for running one application for a user on the host

Most people would say that your solution actually is a container. And running one application per container is quite popular ("cloud native").

>   • Snap/Flatpak require specifically prepared/compiled software executables compatible with the target OS - plus root rights there

I think you are fooling yourself here. You are depending on a setuid fusermount helper program, and a special fuse system. Both of them have to be specifically prepared/compiled on the target OS before you can use makeaoi.

I don't think I'll convince you though, so I will shut up at this point.

Cheers, -Nikolaus

Nikratio avatar Jul 18 '17 14:07 Nikratio

On Jul 18 2017, Radek Podgorny wrote:

> nikolaus, what would be needed to implement proper special device handling in fuse (so it could be implemented without the "as regular" hacks)? some kernel side changes or is it only a libfuse problem?

fuse provides proper devices. It just happens that access to "proper devices" does not go through userspace. FUSE stands for filesystem in userspace, not device in userspace :-).

What you want is a way to pass access to arbitrary devices through a userspace helper. This would be a new kernel feature. Something like that currently exists only for some specific devices (e.g. nbd or, indirectly, loopback devices).

Best, -Nikolaus

Nikratio avatar Jul 18 '17 14:07 Nikratio

On 07/18/2017 04:16 PM, Nikolaus Rath wrote:

> Well, you just gave the answer: provide a setuid mount helper for bind mounts, just like libfuse provides one for fuse mounts.

It is certainly conceivable to offer an optional, alternative starting method in the "AppRun" script that first requires the user to install such a "setuid mount helper tool", or to use the ordinary "mount --bind" if the script is started as "root". (Since makeaoi already supports a "start-as-root" last-resort option, I should probably make it the default to use "mount --bind" for /proc, /dev, /sys, then.)

But I would not like to reduce the audience of those who could possibly run the "application overlay image" to those that have root access on the target system.

>>   • containers still isolate much more than required or reasonable for running one application for a user on the host

> Most people would say that your solution actually is a container.

Providing "/" of the host as writeable filesystem to the application keeps me from calling it a "container" :-)

>>   • Snap/Flatpak require specifically prepared/compiled software executables compatible with the target OS - plus root rights there

> I think you are fooling yourself here. You are depending on a setuid fusermount helper program,...

Which is present by default on every distribution I came across so far...

> and a special fuse system.

... which comes as part of the "makeaoi"-generated directory.

lvml avatar Jul 18 '17 15:07 lvml

On 07/18/2017 03:34 PM, Radek Podgorny wrote:

> nikolaus, what would be needed to implement proper special device handling in fuse (so it could be implemented without the "as regular" hacks)? some kernel side changes or is it only a libfuse problem?

If the existing CUSE support in the FUSE kernel module allowed non-root users to present device nodes to applications, with those nodes implemented in user space (possibly by forwarding relevant operations to pre-existing devices the userspace process has access to), that could work.

But for reasons not comprehensively explained in the FUSE and CUSE documentation, their maintainers assume it would allow privilege escalation if non-root users were allowed to present device nodes.

Of course, if some pre-existing device 8:0 had an associated device node like brw-rw---- 1 root disk 8, 0 /dev/sda, and if a user could present a device node like brw-rw-rw- 1 root disk 8, 0 /home/user/mydev/sda, and if that allowed a user process to access the actual 8:0 device, that would be a huge security hole.

However, my initial understanding of CUSE was that all accesses to /home/user/mydev/sda would go to the userspace process that implements the CUSE device, and thus only whatever this userspace process can do is possible to do via /home/user/mydev/sda - so if the userspace process cannot access the real /dev/sda device, it cannot disclose any data to the process accessing /home/user/mydev/sda.

It might be I just did not yet recognize the actual reason why allowing CUSE to non-root users is considered harmful.

lvml avatar Jul 18 '17 15:07 lvml

BTW: I succeeded in packaging "subsurface" using "makeaoi" such that subsurface can communicate with dive computers attached to serial ports - Proof of Concept example: https://transfer.sh/2tLny/subsurface-4.6.4-2-x86_64.AppImage

In addition to some ioctl() support I also had to implement poll() for unionfs-fuse - see https://github.com/lvml/unionfs-fuse/blob/fake_dev/src/fuse_ops.c#L665 for the bulk of relevant changes.

@rpodgorny: Maybe we can talk about whether/how to contribute my unionfs-fuse changes upstream - I guess others could possibly use the poll() support, too, and the "-o fake_devices" option should not disturb those who do not need it...

lvml avatar Jul 30 '17 21:07 lvml