
rshared submounts are not cleanly unmounted on the host

Open gxxxh opened this issue 4 months ago • 2 comments

Description

After a bidirectional (rshared) mount is set up for a container, every submount created under it is propagated to the host, but these mounts are not unmounted after the container is removed.

Steps to reproduce the issue

  1. I create a pod and a container using crictl. sandbox.json is:
{
    "metadata": {
        "name": "looper-sandbox",
        "namespace": "default",
        "attempt": 1,
        "uid": "c6318cce-89b2-4f02-a702-7ba243a2fbd1"
    },
    "log_directory": "/home/guo.gh/",
    "linux": {
        "security_context": {
            "privileged": true,
            "namespace_options": {
                "network": 2,
                "pid": 2,
                "ipc": 2
            }
        }
    }
}
  1. container.json looks like this. It bind mounts the host path /test to the container's /root/test as a bidirectional (rshared) bind mount, which means submounts will be propagated to the host:
{
    "metadata": {
        "name": "looper"
    },
    "log_path": "loop.log",
    "image": {
        "image": "busybox:latest"
    },
    "command": [
        "/bin/sh",
        "-c",
        "i=0; while true; do t=$(date); echo $t -- $i; i=$(expr $i + 1); sleep 1; done"
    ],
    "mounts": [
        {
            "container_path": "/test",
            "host_path": "/root/test",
            "readonly": false,
            "propagation": 2
        }
    ],
    "linux": {
        "security_context": {
            "privileged": true,
            "namespace_options": {
                "network": 2,
                "pid": 1,
                "ipc": 1
            }
        }
    }
}
  1. Start the sandbox and the container:
pod=`sudo crictl runp sandbox.json`
cnt=`sudo crictl create $pod container.json sandbox.json`
crictl start $cnt
  1. Exec into the container and bind mount /root/test/a onto /root/test/b:
crictl exec -it $cnt /bin/sh
mkdir /root/test/a
mkdir /root/test/b
mount --bind /root/test/a /root/test/b
  1. The mount is then visible on the host:
mount | grep test
  1. Remove the container and sandbox. The mount propagated to the host still exists and is not unmounted, and the number of these mounts grows each time I start and remove a container, leaking mounts on the host. My runc version is 1.1.8, and I think runc should make sure that all mounts propagated from inside the container are unmounted.
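
To check which mounts under the host path are actually shared, and so can leak this way, the propagation state can be read from /proc/self/mountinfo. The following is a minimal sketch, not part of runc or crictl: `list_propagation` is a hypothetical helper name, and it prints mount points exactly as they appear in mountinfo (whitespace in paths stays octal-escaped).

```shell
# Print "<mount point> <propagation>" for every mount at or below $1.
# $2 is the mountinfo file to read (defaults to /proc/self/mountinfo).
# list_propagation is a hypothetical helper name, not a runc or crictl command.
list_propagation() {
    lp_prefix=${1%/}
    lp_info=${2:-/proc/self/mountinfo}
    awk -v p="$lp_prefix" '
        $5 == p || index($5, p "/") == 1 {
            state = "private"
            # Optional fields start at field 7 and end at the "-" separator.
            for (i = 7; i <= NF && $i != "-"; i++) {
                if ($i ~ /^shared:/) state = "shared"
                else if ($i ~ /^master:/ && state != "shared") state = "slave"
            }
            print $5, state
        }
    ' "$lp_info"
}
```

Calling `list_propagation` with the host path from container.json, before and after removing the container, shows whether a propagated submount is still present on the host. A mount that is both shared and a slave is reported as shared.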

Describe the results you received and expected

Mounts should be cleanly unmounted on the host.

What version of runc are you using?

runc version 1.1.8
commit: v1.1.8-10-g85d13e5c
spec: 1.0.2-dev
go: go1.18.5
libseccomp: 2.5.2

Host OS information

NAME="Alibaba Cloud Linux"
VERSION="3 (Soaring Falcon)"
ID="alinux"
ID_LIKE="rhel fedora centos anolis"
VERSION_ID="3"
UPDATE_ID="10"
PLATFORM_ID="platform:al8"
PRETTY_NAME="Alibaba Cloud Linux 3 (Soaring Falcon)"
ANSI_COLOR="0;31"
HOME_URL="https://www.aliyun.com/"

Host kernel information

Linux k39c03413.sqa.eu95 5.10.134-007.ali5000.al8.x86_64 #1 SMP Fri Mar 3 18:41:24 CST 2023 x86_64 x86_64 x86_64 GNU/Linux

gxxxh, Aug 11 '25 12:08

I found a similar issue in Docker (docker issue), where the conclusion is that the runtime should not unmount these propagated submounts and that the container process should do it itself.

gxxxh, Aug 11 '25 15:08

This is one of many pitfalls of shared mount propagation, and as such I would strongly suggest not using it unless it's really necessary.

Unlike regular unmounts, the mount namespace being destroyed (when the container dies) does not trigger unmounts to propagate to the host. In addition, runc has no way of knowing what mounts have been propagated (much less whether the user would actually want us to unmount anything). Also, in most use cases (within Docker/containerd/Kubernetes), runc goes away after the container has been configured, so there is no runc program that could even do the unmount if we wanted to.

cyphar, Aug 14 '25 08:08
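
Since runc cannot do this cleanup itself, one workaround is to unmount leftover submounts from the host after the container is removed. This is a minimal sketch under stated assumptions: both function names are hypothetical, everything mounted strictly below the given host path is assumed to have been leaked by the container, and mount points are assumed to contain no whitespace or backslashes.

```shell
# Print every mount point strictly below $1, deepest first, so that
# children are listed (and can be unmounted) before their parents.
# $2 is the mountinfo file to read (defaults to /proc/self/mountinfo).
# Both function names here are hypothetical helpers, not runc features.
submounts_deepest_first() {
    sd_prefix=${1%/}
    sd_info=${2:-/proc/self/mountinfo}
    awk -v p="$sd_prefix/" 'index($5, p) == 1 { print $5 }' "$sd_info" |
        LC_ALL=C sort -r
}

# Unmount everything below $1. Must run as root on the host, and only
# after the container and sandbox have been removed.
# Assumption: nothing else on the host is legitimately mounted below $1.
cleanup_submounts() {
    submounts_deepest_first "$1" | while IFS= read -r mp; do
        umount "$mp" || echo "failed to unmount $mp" >&2
    done
}
```

Running `submounts_deepest_first` on its own first is a cheap way to review what `cleanup_submounts` would unmount. Reverse byte-wise sorting is enough for ordering because a child mount point always has its parent's path as a prefix.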