
Execution issue when rootfs is mounted twice "Can't bind mount /oldroot/dev/null on /newroot/dev/null: No such file or directory"

Open TristanCacqueray opened this issue 7 years ago • 6 comments

Greetings, our CI jobs sometimes fail because bwrap fails to start and dies with this message: "bwrap: Can't bind mount /oldroot/dev/null on /newroot/dev/null: No such file or directory"

It seems like this happens when the rootfs is mounted twice:

$ cat /proc/mounts
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,seclabel,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,seclabel,nosuid,size=3984364k,nr_inodes=996091,mode=755 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,seclabel,nosuid,nodev 0 0
devpts /dev/pts devpts rw,seclabel,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,seclabel,nosuid,nodev,mode=755 0 0
tmpfs /sys/fs/cgroup tmpfs ro,seclabel,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_prio,net_cls 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
configfs /sys/kernel/config configfs rw,relatime 0 0
/dev/vda1 / xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
selinuxfs /sys/fs/selinux selinuxfs rw,relatime 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=28,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=13028 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
mqueue /dev/mqueue mqueue rw,seclabel,relatime 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,seclabel,relatime 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
tmpfs /run/user/0 tmpfs rw,seclabel,nosuid,nodev,relatime,size=801012k,mode=700 0 0
/dev/vda1 / xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0

$ bwrap --dir /tmp --tmpfs /tmp --chdir /tmp/ --dir /var --dir /var/tmp --dir /run/user/974 --ro-bind /usr /usr --ro-bind /lib /lib  --ro-bind /lib64 /lib64 --ro-bind /bin /bin --ro-bind /sbin /sbin --ro-bind /etc/resolv.conf /etc/resolv.conf --ro-bind /etc/hosts /etc/hosts  --proc /proc --dev /dev      --unshare-all --share-net --die-with-parent  /bin/echo ok
bwrap: Can't bind mount /oldroot/dev/null on /newroot/dev/null: No such file or directory

I'm still looking into how this double "/dev/vda1 / xfs" mount happens. This is an OpenStack cloud instance, so perhaps the cloud-init growroot process does the remount? There is also a service that bind mounts / to /srv/host-rootfs to run runC containers, which may be causing the double mount.

This may be expected behavior, as bwrap's initial pivot_root may not have the SLAVE flag on the right mount point, or something similar. Any help debugging this would be appreciated :-)
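
For anyone checking whether a machine is in this state, here is a minimal sketch that lists mount points appearing more than once in a mounts table. The function name dup_mount_points is hypothetical and not part of bwrap or util-linux; it only counts the second field of each line, so it cannot tell why a mount point is stacked.

```shell
# Hypothetical helper (not part of bwrap): print "count mountpoint" for every
# mount point listed more than once in a mounts table.
# Argument 1 is the table to read; it defaults to /proc/mounts.
dup_mount_points() {
    awk '{ seen[$2]++ }
         END { for (m in seen) if (seen[m] > 1) print seen[m], m }' \
        "${1:-/proc/mounts}" | sort -k2
}
```

Running dup_mount_points with no argument reads /proc/mounts. On the listing above, / would be counted three times (the rootfs entry plus the two /dev/vda1 entries).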

TristanCacqueray avatar Jun 24 '18 03:06 TristanCacqueray

Here is how to reproduce this issue:

# mkdir -p /srv/host-rootfs
# echo "/ /srv/host-rootfs none bind,ro,private 0 0" >> /etc/fstab
# for i in $(seq 100); do mount /srv/host-rootfs/ & done
# cat /proc/mounts
...
/dev/vda1 /srv/host-rootfs xfs ro,seclabel,relatime,attr2,inode64,noquota 0 0
/dev/vda1 /srv/host-rootfs xfs ro,seclabel,relatime,attr2,inode64,noquota 0 0
...
/dev/vda1 / xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
/dev/vda1 /srv/host-rootfs xfs ro,seclabel,relatime,attr2,inode64,noquota 0 0
/dev/vda1 / xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
/dev/vda1 / xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
/dev/vda1 /srv/host-rootfs xfs ro,seclabel,relatime,attr2,inode64,noquota 0 0
/dev/vda1 /srv/host-rootfs xfs ro,seclabel,relatime,attr2,inode64,noquota 0 0
...
# uname -a
Linux managesf.sftests.com 3.10.0-862.2.3.el7.x86_64 #1 SMP Wed May 9 18:05:47 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

When this happens, bwrap fails to use anything that was mounted on the first rootfs, e.g. using /dev or /tmp results in "bwrap: Can't bind mount" or "bwrap: Can't get type of source".

To fix the system, do:

while true; do umount -l /srv/host-rootfs || break; done;

TristanCacqueray avatar Jun 27 '18 22:06 TristanCacqueray

Hmm. First, there shouldn't be a need to modify /etc/fstab; you can just do:

for i in $(seq 100); do mount --bind / /srv/host-rootfs/ & done

That fails pretty fast for me, somehow mount exits with ENOSPC? Ah I see this uncommented count_mounts() function in linux/fs/namespace.c...

Anyways, surely you aren't stacking mounts like this (why would you do that?), just trying to illustrate a race condition? So the reproducer instead would be like:

while true; do mount --bind / /srv/host-rootfs/ && umount /srv/host-rootfs; done

Then yep, this quickly fails for me:

while bwrap --ro-bind / / true; do :; done

Will look.
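
The two loops above can be wrapped into a small harness, sketched below. The function names churn_mounts and run_until_failure are made up for illustration, and nothing here is part of bwrap. The churn loop needs root and should only be run on a disposable machine.

```shell
# Sketch only; churn_mounts and run_until_failure are hypothetical names.

# Repeatedly bind mount / on the given directory and unmount it again.
# Needs root. Run it in the background: churn_mounts /srv/host-rootfs &
churn_mounts() {
    while true; do
        mount --bind / "$1" && umount "$1"
    done
}

# Run the given command until it fails, then report how many runs succeeded.
# Example: run_until_failure bwrap --ro-bind / / true
run_until_failure() {
    n=0
    while "$@"; do
        n=$((n + 1))
    done
    echo "failed after $n successful runs"
}
```

Counting the successful runs gives a rough idea of how easily the race triggers on a given machine, which can help when comparing a patched bwrap against an unpatched one.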

cgwalters avatar Jun 28 '18 10:06 cgwalters

The "for i in $(seq 100)" loop is just a reproducer that happens to trigger the same bug that was making our CI flaky. The culprit was a badly written container driver that was doing simultaneous bind mounts; it has been fixed by using "mount -a" instead.

TristanCacqueray avatar Jun 28 '18 11:06 TristanCacqueray

I think the problem here is that the mount points are changing underneath us while we're executing the pivot, since we're using MS_SLAVE. Perhaps what we could do is use MS_PRIVATE during the setup, then change to MS_SLAVE after?
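
To check which propagation mode each mount has while debugging this, a rough sketch is to read the optional fields of /proc/self/mountinfo, where "shared:N" marks a shared mount and "master:N" marks a slave. The helper name mount_propagation is hypothetical. It reports what the kernel currently exposes and says nothing about what bwrap does internally.

```shell
# Hypothetical helper: print "mountpoint propagation" for each entry of a
# mountinfo-format file (default: /proc/self/mountinfo).
mount_propagation() {
    awk '{
        prop = ""
        # Optional fields start at field 7 and end at the "-" separator.
        for (i = 7; i <= NF && $i != "-"; i++) {
            if ($i ~ /^shared:/)         tag = "shared"
            else if ($i ~ /^master:/)    tag = "slave"
            else if ($i == "unbindable") tag = "unbindable"
            else continue
            prop = prop (prop ? "," : "") tag
        }
        # No propagation tag at all means the mount is private.
        print $5, (prop ? prop : "private")
    }' "${1:-/proc/self/mountinfo}"
}
```

Running this inside and outside the sandbox makes it easier to see whether a given mount point still receives propagation events from the host.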

cgwalters avatar Jul 11 '18 14:07 cgwalters

Crosspost from https://github.com/flatpak/flatpak/issues/3470

A few days ago I added this to my fstab:

sun.local:/volume1/ /var/mnt/sun nfs rw,soft,timeo=120,x-systemd.automount,x-systemd.idle-timeout=180 0 0

I am now on a network where sun.local does not resolve, so the mount should fail gracefully. However, flatpak apparently sees that there is an extra mount and now fails to set it up:

flatpak run --branch=stable --arch=x86_64 --command=bash --file-forwarding org.kde.kontact
bwrap: Can't bind mount /oldroot/var on /newroot/var: Unable to apply mount flags: remount "/newroot/var/mnt/sun": No such device
error: Failed to sync with dbus proxy

rriemann avatar Sep 17 '25 08:09 rriemann

@rriemann, you are encountering a different failure mode with a different error message. It is caused by the use of an automount (see https://github.com/flatpak/flatpak/issues/5112#issuecomment-2337927538, https://github.com/flatpak/flatpak/issues/5112#issuecomment-2338185961).

smcv avatar Sep 17 '25 10:09 smcv