bubblewrap icon indicating copy to clipboard operation
bubblewrap copied to clipboard

Bind mounting over symlinks attempts to create their targets and bind mount over them

Open robryk opened this issue 5 years ago • 16 comments
trafficstars

What I observe:

$ touch a_file

$ ln -s a_file rel_symlink
$ bwrap --bind / / --bind a_file $PWD/rel_symlink /usr/bin/env ls -l $PWD/rel_symlink
lrwxrwxrwx 1 robryk users 6 Sep 14 23:55 /home/robryk/zero-k/rel_symlink -> a_file

$ ln -s $PWD/a_file abs_symlink
$ bwrap --bind / / --bind a_file $PWD/abs_symlink /usr/bin/env ls -l $PWD/abs_symlink
bwrap: Can't create file at /home/robryk/zero-k/abs_symlink: No such file or directory

$ ls -l nonexistent
ls: cannot access 'nonexistent': No such file or directory
$ ln -s nonexistent broken_symlink
$ bwrap --bind / / --bind a_file $PWD/broken_symlink /usr/bin/env ls -l $PWD/broken_symlink
lrwxrwxrwx 1 robryk users 11 Sep 14 23:55 /home/robryk/zero-k/broken_symlink -> nonexistent
$ ls -l nonexistent 
-rw-rw-rw- 1 robryk users 0 Sep 15 00:04 nonexistent

$ ln -s ../../../nonexistent another_broken_symlink
$ bwrap --bind / / --bind a_file $PWD/another_broken_symlink /usr/bin/env ls -l $PWD/another_broken_symlink
bwrap: Can't create file at /home/robryk/zero-k/another_broken_symlink: Permission denied

It seems that:

  1. Mounting over an unbroken symlink caused bwrap to mount over the symlink's target.
  2. Mounting over an unbroken, absolute symlink caused bwrap to fail.
  3. Mounting over a broken symlink caused bwrap to create the symlink target and mount over it.
  4. Mounting over a broken symlink that points into a directory that is not writeable caused bwrap to fail.

What I expect:

Mounting over any kind of symlink to cause the resulting mount namespace to have the symlink replaced with the source of the mount. This did not happen in any of the four cases.

If doing that is impossible (quick skim of man mount(2) didn't yield an equivalent of O_NOFOLLOW, even though umount2 does have the UMOUNT_NOFOLLOW flag), I expect bwrap to exit with an error. I also expect it not to mount anything over the symlink's target. This failed to happen in cases 1 and 3.

It would be nice if the "File not found" error was more descriptive. It's not obvious which file is "missing" (my first intuition was that the problem was with one of the directories in the target path).

robryk avatar Sep 14 '20 22:09 robryk

Bind mounting over symlinks attempts to create their targets and bind mount over them

Sorry, this is how mount(2) works. We cannot mount anything on a symlink: mount(2) follows symlinks, whether we want it to or not.

smcv avatar Mar 31 '21 17:03 smcv

It's still confusing to actually do that then, instead of failing. I would wager that this is ~never the desired behaviour. I would prefer if (see the last paragraphs of the original report):

  • bwrap errored out in that case ~~(I understand that it's impossible to do that reliably, but doing it in a fashion that admits races is better than nothing)~~(since Linux 5.10 you can do this reliably),
  • bwrap didn't create targets of broken symlinks,
  • any "File not found" errors were accompanied by the path to the file that was unexpectedly not present.

robryk avatar Mar 31 '21 18:03 robryk

Bind mounting over symlinks attempts to create their targets and bind mount over them

Sorry, this is how mount(2) works. We cannot mount anything on a symlink: mount(2) follows symlinks, whether we want it to or not.

My mount (2) manpage says:

MS_NOSYMFOLLOW (since Linux 5.10)

Do not follow symbolic links when resolving paths. Symbolic links can still be created, and readlink(1), readlink(2), realpath(1), and realpath(3) all still work properly.

Is this the feature that could make it work?

apteryks avatar Feb 22 '22 04:02 apteryks

Is this the feature that could make it work?

It depends exactly how it works. Perhaps you could try it and see what happens?

smcv avatar Feb 22 '22 10:02 smcv

Hm. I can't see how the behavior changes, adding this option or leaving it out. I guess it won't help here.

apteryks avatar Feb 22 '22 19:02 apteryks

Note that MS_NOSYMFOLLOW seems to be an option that affects the created mount (i.e. symlinks that are in that filesystem won't be followed) and not how mount() resolves its target path.

robryk avatar Aug 31 '23 20:08 robryk

Can the new mounting API be leveraged for this?

https://brauner.io/2023/02/28/mounting-into-mount-namespaces.html

It looks like the new syscall open_tree accepts the AT_SYMLINK_NOFOLLOW:

https://github.com/torvalds/linux/blob/968f35f4ab1c0966ceb39af3c89f2e24afedf878/tools/testing/selftests/mount_setattr/mount_setattr_test.c#L1138

The move_mount also has the symlink following flag:

https://github.com/torvalds/linux/blob/968f35f4ab1c0966ceb39af3c89f2e24afedf878/include/uapi/linux/mount.h#L73

igo95862 avatar Dec 03 '23 08:12 igo95862

Can the new mounting API be leveraged for this?

I don't know, can it? Or would the syscall just fail with ELOOP if the target is a symlink? (Those are the two reasonable interpretations of AT_SYMLINK_NOFOLLOW.)

If the answer is that a new mounting API can mount over the top of an existing symlink without dereferencing the symlink, I'd be happy to review a pull request. It would still need to be able to fall back to the old mounting API to support older kernels.

smcv avatar Dec 05 '23 14:12 smcv

I'd be happy to review a pull request. It would still need to be able to fall back to the old mounting API to support older kernels.

I will try to experiment with the new API. One issue is that it is barely documented. I guess I will read to the kernel source code to figure it out.

igo95862 avatar Dec 05 '23 17:12 igo95862

Actually the mount from util-linux can already use new mount API and has an option to switch between them.

https://github.com/util-linux/util-linux/commit/fd6b4d94ff013dc7ed680d4e864610da5b9751f1

So lets put it to test:

1). Create a non-existent symlink:

>>> ln -s /nonexistent ./test

2). Unshare the mount namespace:

>>> unshare --user --map-root-user --mount

3). Check that it is still symlink.

>>> /usr/bin/ls -l ./
total 0
lrwxrwxrwx 1 igo95862 igo95862 12 Dec  5 23:10 test -> /nonexistent

4). Do a mount over a symlink forcing new api. Mount some existing file.

env LIBMOUNT_FORCE_MOUNT2=never mount --bind /etc/resolv.conf ./test

5). Check what happens.

>>> /usr/bin/ls -l ./
total 4
-rw-r--r-- 1 nobody nobody 73 Dec  3 17:31 test
>>> cat ./test
nameserver 127.0.0.1
nameserver ::1

So it does become a regular file that we mounted on top.

Lets try to use the old mout API:

>>> env LIBMOUNT_FORCE_MOUNT2=always mount --bind /etc/resolv.conf ./test 
mount: ./test: mount point is a symbolic link to nowhere.
       dmesg(1) may have more information after failed mount system call.

So the new API definitely fixes this issue.

Some strace logs.

Using new API:

open_tree(AT_FDCWD, "/etc/resolv.conf", OPEN_TREE_CLONE|OPEN_TREE_CLOEXEC) = 3
move_mount(3, "", AT_FDCWD, "./test", MOVE_MOUNT_F_EMPTY_PATH) = 0

Using old API:

mount("/etc/resolv.conf", "./test", 0x55e0e3605670, MS_BIND, NULL) = -1 ENOENT (No such file or directory)

igo95862 avatar Dec 05 '23 17:12 igo95862

One issue is that it is barely documented.

You can grab the manpage drafts from the kernel mailing list to get a better starting point.

rusty-snake avatar Dec 05 '23 17:12 rusty-snake

You can grab the manpage drafts from the kernel mailing list to get a better starting point.

Could you please give a link. I am not good with mailing lists.

igo95862 avatar Dec 05 '23 17:12 igo95862

https://lore.kernel.org/all/[email protected]/

https://duckduckgo.com/?q=site%3Akernel.org+%22Add+manpage+for%22+move_mount&ia=web

rusty-snake avatar Dec 05 '23 18:12 rusty-snake

https://lore.kernel.org/all/[email protected]/

https://duckduckgo.com/?q=site%3Akernel.org+%22Add+manpage+for%22+move_mount&ia=web

Thank you.

Weird that those man pages patches were around since 2018 but never merged.

igo95862 avatar Dec 05 '23 18:12 igo95862

I've rendered the man pages to human readable format:

open_tree

move_mount

Looking at man pages it seems the new API has a few more advantages. For example, it has option of not triggering auto-mount.

If you look at this part of man page looks like new mount API can mount over mount point that does not exist.

igo95862 avatar Dec 06 '23 16:12 igo95862

FWIW: https://github.com/sunfishcode/linux-mount-api-documentation (@igo95862)


Actually you can do this with the old mount api as well using /proc/self/fd.

diff --git a/bind-mount.c b/bind-mount.c
index 57b4236..d034c51 100644
--- a/bind-mount.c
+++ b/bind-mount.c
@@ -395,7 +395,9 @@ bind_mount (int           proc_fd,
 
   if (src)
     {
-      if (mount (src, dest, NULL, MS_SILENT | MS_BIND | (recursive ? MS_REC : 0), NULL) != 0)
+      dest_fd = openat(AT_FDCWD, dest, O_CLOEXEC | O_PATH | O_NOFOLLOW);
+      dest_proc = get_oldroot_path (xasprintf ("/proc/self/fd/%d", dest_fd));
+      if (mount (src, dest_proc, NULL, MS_SILENT | MS_BIND | (recursive ? MS_REC : 0), NULL) != 0)
         return BIND_MOUNT_ERROR_MOUNT;
     }
$ ln -s ENOENT hello1.txt
$ echo hello > hello2.txt
$ stat -c '%F' hello1.txt
symbolic link
$ cat hello1.txt
cat: hello1.txt: No such file or directory
$ ./_builddir/bwrap --dev-bind / / --bind $PWD/hello2.txt $PWD/hello1.txt stat -c '%F' hello1.txt
regular file
$ ./_builddir/bwrap --dev-bind / / --bind $PWD/hello2.txt $PWD/hello1.txt cat hello1.txt
hello

[!CAUTION] The PoC diff from above does not

  • handle errors (return code of openat, ...).
  • cares about cleanup (free, close).
  • cares about preserving other flags done below via mountinfo.
  • stop bwrap from creating the target file.

rusty-snake avatar Dec 23 '23 10:12 rusty-snake