
Buildah fails when building from images defining different volumes for the same symlinked mountpoint

an-toine opened this issue 7 months ago • 18 comments


BUG REPORT INFORMATION

Description

We are leveraging Buildah in our CI/CD chain to build and push container images.

Image builds are performed in pods running on OpenShift 4.12.36 clusters. These build pods run with the anyuid SCC, the SETFCAP capability, and the io.kubernetes.cri-o.Devices: /dev/fuse annotation (we use a setup similar to https://github.com/openshift/enhancements/issues/362#issuecomment-1040664446).

For a specific image, users have reported that the containerized build process failed with this error:

error running subprocess: error creating mountpoint "/var/tmp/buildah403245100/mnt/rootfs/usr/local/demo_root/app" in mount namespace: mkdir /var/tmp/buildah403245100/mnt/rootfs/usr/local/demo_root: file exists

For these users, the build process runs smoothly on their laptops but fails with Buildah in a pod.

Steps to reproduce the issue:

I managed to pinpoint and reproduce the issue with the following steps.

Containerfile of the first root image:

FROM registry.access.redhat.com/ubi8/ubi:8.9-1028

# Create directory tree and symlink
RUN mkdir -p /usr/share/demo/app && \
    mkdir -p /usr/share/demo/conf && \
    ln -s /usr/share/demo /usr/local/demo_root

# Declare symlink target as volume
VOLUME /usr/share/demo/app

RUN echo "Hello World"

Build the image:

buildah build -t reproducer:step1 -f Containerfile_step1 .

Containerfile of the second image:

# Inherit from the first root image
FROM localhost/reproducer:step1

# Declare the same directory as a volume, this time via the symlink
VOLUME /usr/local/demo_root/app

Build the image:

buildah build -t reproducer:step2 -f Containerfile_step2 .

Now, to trigger the bug, build this image:

FROM localhost/reproducer:step2

RUN echo "Hello World!"

Describe the results you received: When built with Buildah from inside the pod, the build fails with this error:

STEP 2/2: RUN echo "Hello World!"
error running subprocess: error creating mountpoint "/var/tmp/buildah403245100/mnt/rootfs/usr/local/demo_root/app" in mount namespace: mkdir /var/tmp/buildah403245100/mnt/rootfs/usr/local/demo_root: file exists
error building at STEP "RUN echo "Hello World!"": exit status 1

When building from a Linux workstation, the build succeeds and two volumes are declared:

buildah inspect reproducer:step3 | jq .OCIv1.config.Volumes
{
  "/usr/local/demo_root/app": {},
  "/usr/share/demo/app": {}
}

Describe the results you expected: I would have expected the build to complete successfully, as it does with a simpler setup on a Linux laptop.

Alternatively, I would have expected Buildah to identify that two volumes had been set for the same mountpoint and to error out at step 2. This is what happens when the same directory is declared twice, via both paths, in a single VOLUME instruction:

STEP 3/4: VOLUME /usr/share/demo/app /usr/local/demo_root/app
Error: building at STEP "VOLUME /usr/share/demo/app /usr/local/demo_root/app": adding "/usr/share/demo/app" to the volume cache
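
For illustration, here is a minimal sketch of what such duplicate detection could look like: canonicalize each VOLUME path against the rootfs before comparing, so a symlinked path and its target resolve to the same key. This is hypothetical code, not Buildah's actual volume-cache logic; securejoin refers to github.com/cyphar/filepath-securejoin, whose SecureJoin resolves symlinks relative to a given root.

// Hypothetical sketch of duplicate-volume detection, not Buildah's
// actual volume-cache code.
package main

import (
	"fmt"

	securejoin "github.com/cyphar/filepath-securejoin"
)

func main() {
	rootfs := "/var/tmp/example-rootfs" // hypothetical mounted rootfs
	volumes := []string{"/usr/share/demo/app", "/usr/local/demo_root/app"}

	seen := map[string]string{}
	for _, volume := range volumes {
		// Resolve symlinks relative to the rootfs, so that
		// /usr/local/demo_root/app and /usr/share/demo/app both
		// canonicalize to the same path.
		resolved, err := securejoin.SecureJoin(rootfs, volume)
		if err != nil {
			fmt.Printf("resolving %q: %v\n", volume, err)
			continue
		}
		if previous, duplicate := seen[resolved]; duplicate {
			fmt.Printf("volume %q duplicates %q (both resolve to %q)\n", volume, previous, resolved)
			continue
		}
		seen[resolved] = volume
	}
}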

Output of rpm -q buildah or apt list buildah:

sh-4.4# rpm -q buildah
buildah-1.27.3-1.module+el8.7.0+17498+a7f63b89.x86_64

Output of buildah version:

sh-4.4# buildah -v
buildah version 1.27.3 (image-spec 1.0.2-dev, runtime-spec 1.0.2-dev)

Output of cat /etc/*release:

sh-4.4# cat /etc/redhat-release
Red Hat Enterprise Linux release 8.8 (Ootpa)

Output of uname -a:

sh-4.4# uname -a
Linux pod-builder 4.18.0-372.73.1.el8_6.x86_64 #1 SMP Fri Sep 8 13:16:27 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux

Output of cat /etc/containers/storage.conf:

sh-4.4# cat /etc/containers/storage.conf
# This file is the configuration file for all tools
# that use the containers/storage library. The storage.conf file
# overrides all other storage.conf files. Container engines using the
# container/storage library do not inherit fields from other storage.conf
# files.
#
#  Note: The storage.conf file overrides other storage.conf files based on this precedence:
#      /usr/containers/storage.conf
#      /etc/containers/storage.conf
#      $HOME/.config/containers/storage.conf
#      $XDG_CONFIG_HOME/containers/storage.conf (If XDG_CONFIG_HOME is set)
# See man 5 containers-storage.conf for more information
# The "container storage" table contains all of the server options.
[storage]

# Default Storage Driver, Must be set for proper operation.
driver = "overlay"

# Temporary storage location
runroot = "/run/containers/storage"

# Primary Read/Write location of container storage
# When changing the graphroot location on an SELINUX system, you must
# ensure  the labeling matches the default locations labels with the
# following commands:
# semanage fcontext -a -e /var/lib/containers/storage /NEWSTORAGEPATH
# restorecon -R -v /NEWSTORAGEPATH
graphroot = "/var/lib/containers/storage"


# Storage path for rootless users
#
# rootless_storage_path = "$HOME/.local/share/containers/storage"

[storage.options]
# Storage options to be passed to underlying storage drivers

# AdditionalImageStores is used to pass paths to additional Read/Only image stores
# Must be comma separated list.
additionalimagestores = [
]

# Remap-UIDs/GIDs is the mapping from UIDs/GIDs as they should appear inside of
# a container, to the UIDs/GIDs as they should appear outside of the container,
# and the length of the range of UIDs/GIDs.  Additional mapped sets can be
# listed and will be heeded by libraries, but there are limits to the number of
# mappings which the kernel will allow when you later attempt to run a
# container.
#
# remap-uids = 0:1668442479:65536
# remap-gids = 0:1668442479:65536

# Remap-User/Group is a user name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid or /etc/subgid file.  Mappings are set up starting
# with an in-container ID of 0 and then a host-level ID taken from the lowest
# range that matches the specified name, and using the length of that range.
# Additional ranges are then assigned, using the ranges which specify the
# lowest host-level IDs first, to the lowest not-yet-mapped in-container ID,
# until all of the entries have been used for maps.
#
# remap-user = "containers"
# remap-group = "containers"

# Root-auto-userns-user is a user name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid and /etc/subgid file.  These ranges will be partitioned
# to containers configured to create automatically a user namespace.  Containers
# configured to automatically create a user namespace can still overlap with containers
# having an explicit mapping set.
# This setting is ignored when running as rootless.
# root-auto-userns-user = "storage"
#
# Auto-userns-min-size is the minimum size for a user namespace created automatically.
# auto-userns-min-size=1024
#
# Auto-userns-max-size is the maximum size for a user namespace created automatically.
# auto-userns-max-size=65536

[storage.options.overlay]
# ignore_chown_errors can be set to allow a non privileged user running with
# a single UID within a user namespace to run containers. The user can pull
# and use any image even those with multiple uids.  Note multiple UIDs will be
# squashed down to the default uid in the container.  These images will have no
# separation between the users in the container. Only supported for the overlay
# and vfs drivers.
#ignore_chown_errors = "false"

# Inodes is used to set a maximum inodes of the container image.
# inodes = ""

# Path to a helper program to use for mounting the file system instead of mounting it
# directly.
mount_program = "/usr/bin/fuse-overlayfs"

# mountopt specifies comma separated list of extra mount options
mountopt = "nodev,metacopy=on"

# Set to skip a PRIVATE bind mount on the storage home directory.
# skip_mount_home = "false"

# Size is used to set a maximum size of the container image.
# size = ""

# ForceMask specifies the permissions mask that is used for new files and
# directories.
#
# The values "shared" and "private" are accepted.
# Octal permission masks are also accepted.
#
#  "": No value specified.
#     All files/directories, get set with the permissions identified within the
#     image.
#  "private": it is equivalent to 0700.
#     All files/directories get set with 0700 permissions.  The owner has rwx
#     access to the files. No other users on the system can access the files.
#     This setting could be used with networked based homedirs.
#  "shared": it is equivalent to 0755.
#     The owner has rwx access to the files and everyone else can read, access
#     and execute them. This setting is useful for sharing containers storage
#     with other users.  For instance have a storage owned by root but shared
#     to rootless users as an additional store.
#     NOTE:  All files within the image are made readable and executable by any
#     user on the system. Even /etc/shadow within your image is now readable by
#     any user.
#
#   OCTAL: Users can experiment with other OCTAL Permissions.
#
#  Note: The force_mask Flag is an experimental feature, it could change in the
#  future.  When "force_mask" is set the original permission mask is stored in
#  the "user.containers.override_stat" xattr and the "mount_program" option must
#  be specified. Mount programs like "/usr/bin/fuse-overlayfs" present the
# extended attribute permissions to processes within containers rather than the
#  "force_mask"  permissions.
#
# force_mask = ""

[storage.options.thinpool]
# Storage Options for thinpool

# autoextend_percent determines the amount by which pool needs to be
# grown. This is specified in terms of % of pool size. So a value of 20 means
# that when threshold is hit, pool will be grown by 20% of existing
# pool size.
# autoextend_percent = "20"

# autoextend_threshold determines the pool extension threshold in terms
# of percentage of pool size. For example, if threshold is 60, that means when
# pool is 60% full, threshold has been hit.
# autoextend_threshold = "80"

# basesize specifies the size to use when creating the base device, which
# limits the size of images and containers.
# basesize = "10G"

# blocksize specifies a custom blocksize to use for the thin pool.
# blocksize="64k"

# directlvm_device specifies a custom block storage device to use for the
# thin pool. Required if you setup devicemapper.
# directlvm_device = ""

# directlvm_device_force wipes device even if device already has a filesystem.
# directlvm_device_force = "True"

# fs specifies the filesystem type to use for the base device.
# fs="xfs"

# log_level sets the log level of devicemapper.
# 0: LogLevelSuppress 0 (Default)
# 2: LogLevelFatal
# 3: LogLevelErr
# 4: LogLevelWarn
# 5: LogLevelNotice
# 6: LogLevelInfo
# 7: LogLevelDebug
# log_level = "7"

# min_free_space specifies the min free space percent in a thin pool required for
# new device creation to succeed. Valid values are from 0% - 99%.
# Value 0% disables
# min_free_space = "10%"

# mkfsarg specifies extra mkfs arguments to be used when creating the base
# device.
# mkfsarg = ""

# metadata_size is used to set the `pvcreate --metadatasize` options when
# creating thin devices. Default is 128k
# metadata_size = ""

# Size is used to set a maximum size of the container image.
# size = ""

# use_deferred_removal marks devicemapper block device for deferred removal.
# If the thinpool is in use when the driver attempts to remove it, the driver
# tells the kernel to remove it as soon as possible. Note this does not free
# up the disk space, use deferred deletion to fully remove the thinpool.
# use_deferred_removal = "True"

# use_deferred_deletion marks thinpool device for deferred deletion.
# If the device is busy when the driver attempts to delete it, the driver
# will attempt to delete device every 30 seconds until successful.
# If the program using the driver exits, the driver will continue attempting
# to cleanup the next time the driver is used. Deferred deletion permanently
# deletes the device and all data stored in device will be lost.
# use_deferred_deletion = "True"

# xfs_nospace_max_retries specifies the maximum number of retries XFS should
# attempt to complete IO when ENOSPC (no space) error is returned by
# underlying storage device.
# xfs_nospace_max_retries = "0"

Feel free to request any additional details or clarification that would help debug this issue.

Antoine

an-toine avatar Dec 21 '23 14:12 an-toine

Associated strace: buildah_strace.txt

an-toine avatar Dec 21 '23 15:12 an-toine

Have you tried this with Docker to see what happens there?

Interested in opening a PR to fix the problem?

rhatdan avatar Dec 21 '23 15:12 rhatdan

I just tried with Docker 23.0.5 on my workstation. The build succeeds, and two volumes are configured on the resulting image:

            "Volumes": {
                "/usr/local/demo_root/app": {},
                "/usr/share/demo/app": {}
            },

I'm not familiar enough with the code structure and the underlying machinery to open a PR though 😕

an-toine avatar Dec 21 '23 16:12 an-toine

So the mkdir error should probably be ignored, or turned into a warning, and the build allowed to continue.

rhatdan avatar Dec 21 '23 16:12 rhatdan

I just tried your reproducer with buildah-1.33.2-1.fc39.x86_64 and it worked fine.

buildah (mount) $ buildah bud -t reproducer:step1 -f tests/bud/symlinkMounpoints/Containerfile.step1 tests/bud/symlinkMounpoints
STEP 1/4: FROM alpine
STEP 2/4: RUN mkdir -p /usr/share/demo/app &&     mkdir -p /usr/share/demo/conf &&     ln -s /usr/share/demo /usr/local/demo_root
STEP 3/4: VOLUME /usr/share/demo/app
STEP 4/4: RUN echo "Hello World"
Hello World
COMMIT reproducer:step1
Getting image source signatures
Copying blob 5af4f8f59b76 skipped: already exists  
Copying blob 29a355bdbdd7 done   | 
Copying config 7008eb0365 done   | 
Writing manifest to image destination
--> 7008eb036568
Successfully tagged localhost/reproducer:step1
7008eb036568a1a230706635f1597c06d8646982187363592a1ca7996c36f9d1
buildah (mount) $ buildah bud -t reproducer:step2 -f tests/bud/symlinkMounpoints/Containerfile.step2 tests/bud/symlinkMounpoints
STEP 1/2: FROM localhost/reproducer:step1
STEP 2/2: VOLUME /usr/local/demo_root/app
COMMIT reproducer:step2
Getting image source signatures
Copying blob 5af4f8f59b76 skipped: already exists  
Copying blob 29a355bdbdd7 skipped: already exists  
Copying blob 5f70bf18a086 skipped: already exists  
Copying config ef96e6bf55 done   | 
Writing manifest to image destination
--> ef96e6bf55e1
Successfully tagged localhost/reproducer:step2
ef96e6bf55e1c948c66ec3ac9dad85eed8ddc0d2256dba6f8a7eeff46d2ef889
buildah (mount) $ buildah bud -t reproducer:step3 -f tests/bud/symlinkMounpoints/Containerfile.step3 tests/bud/symlinkMounpoints
STEP 1/2: FROM localhost/reproducer:step2
STEP 2/2: RUN echo "Hello World!"
Hello World!
COMMIT reproducer:step3
Getting image source signatures
Copying blob 5af4f8f59b76 skipped: already exists  
Copying blob 29a355bdbdd7 skipped: already exists  
Copying blob 5f70bf18a086 skipped: already exists  
Copying blob f28e87411ddb done   | 
Copying config cacfbfe4b8 done   | 
Writing manifest to image destination
--> cacfbfe4b821
Successfully tagged localhost/reproducer:step3
cacfbfe4b821c7e08f0913b0732dc9a6b6a794be04a5470eaf266f13948e8c9e
buildah (mount) $ rpm -q buildah
buildah-1.33.2-1.fc39.x86_64

rhatdan avatar Dec 21 '23 17:12 rhatdan

Could you try with the latest buildah?

rhatdan avatar Dec 21 '23 17:12 rhatdan

I've created a new builder container image inheriting from quay.io/fedora/fedora:39.

FROM quay.io/fedora/fedora:39

USER root

ENV BUILDAH_ISOLATION chroot

RUN dnf install -y buildah containers-common

COPY storage.conf /etc/containers/

ENTRYPOINT ["/bin/bash", "-l", "-c"]

Once deployed to OpenShift, I still face the issue when building the third reproducer image:

sh-5.2# buildah build -t reproducer:step1 -f Containerfile_step1 .
STEP 1/4: FROM registry.access.redhat.com/ubi8/ubi:8.9-1028
Trying to pull registry.access.redhat.com/ubi8/ubi:8.9-1028...
Getting image source signatures
Checking if image destination supports signatures
Copying blob b4e744f5f131 done   |
Copying config 86b358a425 done   |
Writing manifest to image destination
Storing signatures
STEP 2/4: RUN mkdir -p /usr/share/demo/app &&     mkdir -p /usr/share/demo/conf &&     ln -s /usr/share/demo /usr/local/demo_root
STEP 3/4: VOLUME /usr/share/demo/app
STEP 4/4: RUN echo "Hello World"
Hello World
COMMIT reproducer:step1
Getting image source signatures
Copying blob d93817448019 skipped: already exists
Copying blob 41eb9f386737 done   |
Copying config 6a6149cfcb done   |
Writing manifest to image destination
--> 6a6149cfcb8c
Successfully tagged localhost/reproducer:step1
6a6149cfcb8cc99046be9c5c2570ac4f0f9511971ceeeb3774c7cab0a8465269
sh-5.2# buildah build -t reproducer:step2 -f Containerfile_step2 .
STEP 1/2: FROM localhost/reproducer:step1
STEP 2/2: VOLUME /usr/local/demo_root/app
COMMIT reproducer:step2
Getting image source signatures
Copying blob d93817448019 skipped: already exists
Copying blob 41eb9f386737 skipped: already exists
Copying blob 5f70bf18a086 done   |
Copying config 6534be2886 done   |
Writing manifest to image destination
--> 6534be28866a
Successfully tagged localhost/reproducer:step2
6534be28866adacd387811616dcd1aceb53596729e9942ff3335bc2bdd094cd8
sh-5.2# vi Containerfile_step3
sh-5.2# buildah build -t reproducer:step3 -f Containerfile_step3 .
STEP 1/2: FROM localhost/reproducer:step2
STEP 2/2: RUN echo "Hello World!"
error running subprocess: creating mountpoint "/var/tmp/buildah3005694999/mnt/rootfs/usr/local/demo_root/app" in mount namespace: mkdir /var/tmp/buildah3005694999/mnt/rootfs/usr/local/demo_root: file exists
Error: building at STEP "RUN echo "Hello World!"": exit status 1

Some facts about this new builder image:

sh-5.2# cat /etc/redhat-release
Fedora release 39 (Thirty Nine)
sh-5.2# buildah -v
buildah version 1.33.2 (image-spec 1.1.0-rc.5, runtime-spec 1.1.0)
sh-5.2# rpm -q buildah
buildah-1.33.2-1.fc39.x86_64
sh-5.2# rpm -q fuse-overlayfs
fuse-overlayfs-1.12-2.fc39.x86_64
sh-5.2# buildah info
{
    "host": {
        "CgroupVersion": "v1",
        "Distribution": {
            "distribution": "fedora",
            "version": "39"
        },
        "MemFree": 2072223744,
        "MemTotal": 33708908544,
        "OCIRuntime": "crun",
        "SwapFree": 0,
        "SwapTotal": 0,
        "arch": "amd64",
        "cpus": 4,
        "hostname": "pod-builder",
        "kernel": "4.18.0-372.73.1.el8_6.x86_64",
        "os": "linux",
        "rootless": true,
        "uptime": "1193h 16m 37.47s (Approximately 49.71 days)",
        "variant": ""
    },
    "store": {
        "ContainerStore": {
            "number": 0
        },
        "GraphDriverName": "overlay",
        "GraphOptions": [
            "overlay.mount_program=/usr/bin/fuse-overlayfs",
            "overlay.mountopt=nodev,metacopy=on"
        ],
        "GraphRoot": "/var/lib/containers/storage",
        "GraphStatus": {
            "Backing Filesystem": "overlayfs",
            "Native Overlay Diff": "false",
            "Supports d_type": "true",
            "Supports shifting": "true",
            "Supports volatile": "true",
            "Using metacopy": "false"
        },
        "ImageStore": {
            "number": 3
        },
        "RunRoot": "/run/containers/storage"
    }
}

an-toine avatar Dec 22 '23 09:12 an-toine

If you run this test locally, outside of OpenShift, does it happen?

rhatdan avatar Dec 22 '23 12:12 rhatdan

When running the Fedora 39 builder image with Podman on my workstation, I face the same error:

➜  ~ podman run -ti --rm --entrypoint=/bin/bash --device=/dev/fuse localhost/fedora-builder:latest
[root@ed61fd5becbb /]# ls /dev/
console  core  fd  full  fuse  mqueue  null  ptmx  pts  random  shm  stderr  stdin  stdout  tty  urandom  zero
[root@ed61fd5becbb /]# buildah info
{
    "host": {
        "CgroupVersion": "v1",
        "Distribution": {
            "distribution": "fedora",
            "version": "39"
        },
        "MemFree": 11418132480,
        "MemTotal": 13064253440,
        "OCIRuntime": "crun",
        "SwapFree": 4294967296,
        "SwapTotal": 4294967296,
        "arch": "amd64",
        "cpus": 8,
        "hostname": "ed61fd5becbb",
        "kernel": "5.10.102.1-microsoft-standard-WSL2",
        "os": "linux",
        "rootless": true,
        "uptime": "5h 6m 7.42s (Approximately 0.21 days)",
        "variant": ""
    },
    "store": {
        "ContainerStore": {
            "number": 0
        },
        "GraphDriverName": "overlay",
        "GraphOptions": [
            "overlay.mount_program=/usr/bin/fuse-overlayfs",
            "overlay.mountopt=nodev,metacopy=on"
        ],
        "GraphRoot": "/var/lib/containers/storage",
        "GraphStatus": {
            "Backing Filesystem": "overlayfs",
            "Native Overlay Diff": "false",
            "Supports d_type": "true",
            "Supports shifting": "true",
            "Supports volatile": "true",
            "Using metacopy": "false"
        },
        "ImageStore": {
            "number": 0
        },
        "RunRoot": "/run/containers/storage"
    }
}
[root@ed61fd5becbb tmp]# buildah build -t reproducer:step1 -f Containerfile_step1 .
STEP 1/4: FROM registry.access.redhat.com/ubi8/ubi:8.9-1028
Trying to pull registry.access.redhat.com/ubi8/ubi:8.9-1028...
Getting image source signatures
Checking if image destination supports signatures
Copying blob b4e744f5f131 done   |
Copying config 86b358a425 done   |
Writing manifest to image destination
Storing signatures
STEP 2/4: RUN mkdir -p /usr/share/demo/app &&     mkdir -p /usr/share/demo/conf &&     ln -s /usr/share/demo /usr/local/demo_root
STEP 3/4: VOLUME /usr/share/demo/app
STEP 4/4: RUN echo "Hello World"
Hello World
COMMIT reproducer:step1
Getting image source signatures
Copying blob d93817448019 skipped: already exists
Copying blob f7895495fbc2 done   |
Copying config ee25089c30 done   |
Writing manifest to image destination
--> ee25089c3027
Successfully tagged localhost/reproducer:step1
ee25089c30270e5d76beeb46ab5b84bdf6b318c628faf23fd032424ab3597770
[root@ed61fd5becbb tmp]# buildah build -t reproducer:step2 -f Containerfile_step2 .
STEP 1/2: FROM localhost/reproducer:step1
STEP 2/2: VOLUME /usr/local/demo_root/app
COMMIT reproducer:step2
Getting image source signatures
Copying blob d93817448019 skipped: already exists
Copying blob f7895495fbc2 skipped: already exists
Copying blob 5f70bf18a086 done   |
Copying config 4f37a6cd82 done   |
Writing manifest to image destination
--> 4f37a6cd826c
Successfully tagged localhost/reproducer:step2
4f37a6cd826cd43af751a157714d8899c9c79ae1a2b8a26b132f6f015d47f3de
[root@ed61fd5becbb tmp]# buildah build -t reproducer:step3 -f Containerfile_step3 .
STEP 1/2: FROM localhost/reproducer:step2
STEP 2/2: RUN echo "Hello World!"
error running subprocess: creating mountpoint "/var/tmp/buildah221732845/mnt/rootfs/usr/local/demo_root/app" in mount namespace: mkdir /var/tmp/buildah221732845/mnt/rootfs/usr/local/demo_root: file exists
Error: building at STEP "RUN echo "Hello World!"": exit status 1

However, when running the builder container with the --privileged flag and unsetting the BUILDAH_ISOLATION=chroot environment variable, I can get the image to build:

➜  ~ podman run -ti --rm --entrypoint=/bin/bash --privileged --cap-add=sys_admin,mknod localhost/fedora-builder:latest
[root@ad1ba37e9983 tmp]# buildah build -t reproducer:step1 -f Containerfile_step1 .
STEP 1/4: FROM registry.access.redhat.com/ubi8/ubi:8.9-1028
STEP 2/4: RUN mkdir -p /usr/share/demo/app &&     mkdir -p /usr/share/demo/conf &&     ln -s /usr/share/demo /usr/local/demo_root
STEP 3/4: VOLUME /usr/share/demo/app
STEP 4/4: RUN echo "Hello World"
Hello World
COMMIT reproducer:step1
Getting image source signatures
Copying blob d93817448019 skipped: already exists
Copying blob d43c26173a37 done   |
Copying config 1af2c06af9 done   |
Writing manifest to image destination
--> 1af2c06af9e5
Successfully tagged localhost/reproducer:step1
1af2c06af9e527b84bcce5abc615babcdc00c8d1f6775094f399de3dbab73c23
[root@ad1ba37e9983 tmp]# buildah build -t reproducer:step2 -f Containerfile_step2 .
STEP 1/2: FROM localhost/reproducer:step1
STEP 2/2: VOLUME /usr/local/demo_root/app
COMMIT reproducer:step2
Getting image source signatures
Copying blob d93817448019 skipped: already exists
Copying blob d43c26173a37 skipped: already exists
Copying blob 5f70bf18a086 skipped: already exists
Copying config 6cd5ae8793 done   |
Writing manifest to image destination
--> 6cd5ae879385
Successfully tagged localhost/reproducer:step2
6cd5ae87938552dd4b85040073703fe905080e7f42bc74cde1ab0a0f0c283661
[root@ad1ba37e9983 tmp]# buildah build -t reproducer:step3 -f Containerfile_step3 .
STEP 1/2: FROM localhost/reproducer:step2
STEP 2/2: RUN echo "Hello World!"
error running subprocess: creating mountpoint "/var/tmp/buildah4168678633/mnt/rootfs/usr/local/demo_root/app" in mount namespace: mkdir /var/tmp/buildah4168678633/mnt/rootfs/usr/local/demo_root: file exists
Error: building at STEP "RUN echo "Hello World!"": exit status 1
[root@ad1ba37e9983 tmp]# unset BUILDAH_ISOLATION
[root@ad1ba37e9983 tmp]# buildah build -t reproducer:step3 -f Containerfile_step3 .
STEP 1/2: FROM localhost/reproducer:step2
STEP 2/2: RUN echo "Hello World!"
Hello World!
COMMIT reproducer:step3
Getting image source signatures
Copying blob d93817448019 skipped: already exists
Copying blob d43c26173a37 skipped: already exists
Copying blob 5f70bf18a086 skipped: already exists
Copying blob c87c478deb65 done   |
Copying config cc88727c3d done   |
Writing manifest to image destination
--> cc88727c3d54
Successfully tagged localhost/reproducer:step3
cc88727c3d541c7658ec465720bcc3e972f7ee742611038ed14b03e2f1d903ce

So could this issue be related in some way to the chroot isolation method we adopted to run Buildah in unprivileged containers on the cluster?

an-toine avatar Dec 22 '23 14:12 an-toine

If you are running the container as root, it needs CAP_SYS_ADMIN to build the container. If you run as a non-root user with /etc/subuid and /etc/subgid set up within the container, then you need CAP_SETUID and CAP_SETGID.

rhatdan avatar Dec 22 '23 18:12 rhatdan

Actually, I'm always running podman as root via an alias:

➜  ~ alias podman
podman='sudo podman'

Without the --privileged flag but with the SYS_ADMIN capability, I can reproduce the error with BUILDAH_ISOLATION set to chroot:

➜  ~ sudo podman run -ti --rm --entrypoint=/bin/bash --device=/dev/fuse --cap-add=sys_admin localhost/fedora-builder
[root@9b05ed302535 tmp]# buildah build -t reproducer:step1 -f Containerfile_step1 .
STEP 1/4: FROM registry.access.redhat.com/ubi8/ubi:8.9-1028
Trying to pull registry.access.redhat.com/ubi8/ubi:8.9-1028...
Getting image source signatures
Checking if image destination supports signatures
Copying blob b4e744f5f131 done   |
Copying config 86b358a425 done   |
Writing manifest to image destination
Storing signatures
STEP 2/4: RUN mkdir -p /usr/share/demo/app &&     mkdir -p /usr/share/demo/conf &&     ln -s /usr/share/demo /usr/local/demo_root
STEP 3/4: VOLUME /usr/share/demo/app
STEP 4/4: RUN echo "Hello World"
Hello World
COMMIT reproducer:step1
Getting image source signatures
Copying blob d93817448019 skipped: already exists
Copying blob b0395c38f74a done   |
Copying config e3fc615ccd done   |
Writing manifest to image destination
--> e3fc615ccd2b
Successfully tagged localhost/reproducer:step1
e3fc615ccd2b4c733234b5fbbc1ee53a1605f999c491981cc8a2eac0a3208f2e
[root@9b05ed302535 tmp]# buildah build -t reproducer:step2 -f Containerfile_step2 .
STEP 1/2: FROM localhost/reproducer:step1
STEP 2/2: VOLUME /usr/local/demo_root/app
COMMIT reproducer:step2
Getting image source signatures
Copying blob d93817448019 skipped: already exists
Copying blob b0395c38f74a skipped: already exists
Copying blob 5f70bf18a086 done   |
Copying config 5c6b46bec2 done   |
Writing manifest to image destination
--> 5c6b46bec245
Successfully tagged localhost/reproducer:step2
5c6b46bec245c738343a9337673e76f3f0a9c450ce8d3c9bc2358773198ab8e8
[root@9b05ed302535 tmp]# buildah build -t reproducer:step3 -f Containerfile_step3 .
STEP 1/2: FROM localhost/reproducer:step2
STEP 2/2: RUN echo "Hello World!"
error running subprocess: creating mountpoint "/var/tmp/buildah3038695144/mnt/rootfs/usr/local/demo_root/app" in mount namespace: mkdir /var/tmp/buildah3038695144/mnt/rootfs/usr/local/demo_root: file exists
Error: building at STEP "RUN echo "Hello World!"": exit status 1

When BUILDAH_ISOLATION is unset, the error is similar to https://github.com/containers/podman/issues/17176, where switching to chroot isolation had been advised.

[root@9b05ed302535 tmp]# unset BUILDAH_ISOLATION
[root@9b05ed302535 tmp]# buildah build -t reproducer:step3 -f Containerfile_step3 .
STEP 1/2: FROM localhost/reproducer:step2
STEP 2/2: RUN echo "Hello World!"
error running container: from /usr/bin/crun creating container for [/bin/sh -c echo "Hello World!"]: creating cgroup directory `/sys/fs/cgroup/rdma/buildah-buildah1930637118`: No such file or directory
: exit status 1
ERRO[0000] did not get container create message from subprocess: EOF
Error: building at STEP "RUN echo "Hello World!"": while running runtime: exit status 1

With the --privileged flag, Buildah outputs the same error about existing files:

sudo podman run -ti --rm --entrypoint=/bin/bash --device=/dev/fuse --privileged localhost/fedora-builder
[root@03f7906eccc6 tmp]# buildah build -t reproducer:step1 -f Containerfile_step1 .
STEP 1/4: FROM registry.access.redhat.com/ubi8/ubi:8.9-1028
Trying to pull registry.access.redhat.com/ubi8/ubi:8.9-1028...
Getting image source signatures
Checking if image destination supports signatures
Copying blob b4e744f5f131 done   |
Copying config 86b358a425 done   |
Writing manifest to image destination
Storing signatures
STEP 2/4: RUN mkdir -p /usr/share/demo/app &&     mkdir -p /usr/share/demo/conf &&     ln -s /usr/share/demo /usr/local/demo_root
STEP 3/4: VOLUME /usr/share/demo/app
STEP 4/4: RUN echo "Hello World"
Hello World
COMMIT reproducer:step1
Getting image source signatures
Copying blob d93817448019 skipped: already exists
Copying blob 8e0fa2feb55e done   |
Copying config bd8bae0546 done   |
Writing manifest to image destination
--> bd8bae054616
Successfully tagged localhost/reproducer:step1
bd8bae0546163cad71157813b5e6300df83505549d17722b1fb83f99815bd3b9
[root@03f7906eccc6 tmp]# buildah build -t reproducer:step2 -f Containerfile_step2 .
STEP 1/2: FROM localhost/reproducer:step1
STEP 2/2: VOLUME /usr/local/demo_root/app
COMMIT reproducer:step2
Getting image source signatures
Copying blob d93817448019 skipped: already exists
Copying blob 8e0fa2feb55e skipped: already exists
Copying blob 5f70bf18a086 done   |
Copying config b5e781438a done   |
Writing manifest to image destination
--> b5e781438a3c
Successfully tagged localhost/reproducer:step2
b5e781438a3c3498078b9e59df720de35d8496f65bccdb89aac880eb0e1a8d33
[root@03f7906eccc6 tmp]# buildah build -t reproducer:step3 -f Containerfile_step3 .
STEP 1/2: FROM localhost/reproducer:step2
STEP 2/2: RUN echo "Hello World!"
error running subprocess: creating mountpoint "/var/tmp/buildah129897199/mnt/rootfs/usr/local/demo_root/app" in mount namespace: mkdir /var/tmp/buildah129897199/mnt/rootfs/usr/local/demo_root: file exists
Error: building at STEP "RUN echo "Hello World!"": exit status 1

This time though, as soon as BUILDAH_ISOLATION is unset, the third Containerfile builds successfully:

[root@03f7906eccc6 tmp]# unset BUILDAH_ISOLATION
[root@03f7906eccc6 tmp]# buildah build -t reproducer:step3 -f Containerfile_step3 .
STEP 1/2: FROM localhost/reproducer:step2
STEP 2/2: RUN echo "Hello World!"
Hello World!
COMMIT reproducer:step3
Getting image source signatures
Copying blob d93817448019 skipped: already exists
Copying blob 8e0fa2feb55e skipped: already exists
Copying blob 5f70bf18a086 skipped: already exists
Copying blob 3e734402ef82 done   |
Copying config 7d5f6acd50 done   |
Writing manifest to image destination
--> 7d5f6acd5017
Successfully tagged localhost/reproducer:step3
7d5f6acd5017bdc8bcb66f8ac46a4b92f82a90a7c75f8e2fcb76b0c0c5e93853

an-toine avatar Dec 28 '23 10:12 an-toine

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot] avatar Jan 28 '24 00:01 github-actions[bot]

Any news on this bug?

Could we solve it by checking the error kind in https://github.com/containers/buildah/blob/main/chroot/run_common.go#L446C1-L450C3 and ignoring, or warning on, the "file exists" error?
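
For what it's worth, here is a minimal standalone sketch of that idea: treat "file exists" from MkdirAll as a warning rather than a fatal error. The names are illustrative, not the actual run_common.go code, and whether simply ignoring EEXIST is safe here is exactly the open question.

package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// ensureMountpoint is a hypothetical stand-in for the mountpoint
// creation in chroot/run_common.go: it tolerates the case where
// something (such as a symlink) already occupies the target path.
func ensureMountpoint(target string) error {
	if err := os.MkdirAll(target, 0o755); err != nil {
		if !errors.Is(err, fs.ErrExist) {
			return fmt.Errorf("creating mountpoint %q in mount namespace: %w", target, err)
		}
		// EEXIST: warn and continue instead of failing the build.
		fmt.Fprintf(os.Stderr, "warning: mountpoint %q already exists, continuing\n", target)
	}
	return nil
}

func main() {
	if err := ensureMountpoint("/tmp/demo-mountpoint"); err != nil {
		fmt.Println(err)
	}
}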

Antoine

an-toine avatar Feb 16 '24 09:02 an-toine

I just looked at this and it seems impossible.

rhatdan avatar Feb 20 '24 20:02 rhatdan

The error message comes from here: https://github.com/containers/buildah/blob/main/chroot/run_linux.go#L435-L447

And according to the Go documentation, an error saying the file exists should never be returned from there.

rhatdan avatar Feb 20 '24 20:02 rhatdan

The error there on the OpenFile call should be differentiated from the MkdirAll call above.

rhatdan avatar Feb 20 '24 20:02 rhatdan

The error there on the OpenFile call should be differentiated from the MkdirAll call above.

To differentiate the two calls, I quickly rebuilt a Buildah binary that prefixes the error messages:

      if srcinfo.IsDir() {
        if err = os.MkdirAll(target, 0755); err != nil {
          return undoBinds, fmt.Errorf("mkdirall : creating mountpoint %q in mount namespace: %w", target, err)
        }
      } else {
        if err = os.MkdirAll(filepath.Dir(target), 0755); err != nil {
          return undoBinds, fmt.Errorf("ensuring parent of mountpoint %q (%q) is present in mount namespace: %w", target, filepath.Dir(target), err)
        }
        var file *os.File
        if file, err = os.OpenFile(target, os.O_WRONLY|os.O_CREATE, 0755); err != nil {
          return undoBinds, fmt.Errorf("openfile : creating mountpoint %q in mount namespace: %w", target, err)
        }

This code results in the following error:

STEP 1/2: FROM localhost/reproducer:step2
STEP 2/2: RUN echo "Hello World!"
error running subprocess: mkdirall : creating mountpoint "/var/tmp/buildah1298460001/mnt/rootfs/usr/local/demo_root/app" in mount namespace: mkdir /var/tmp/buildah1298460001/mnt/rootfs/usr/local/demo_root: file exists
Error: building at STEP "RUN echo "Hello World!"": exit status 1

It looks like the error is raised by the MkdirAll call, which is confusing: no error is expected when the path already exists.
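
A plausible explanation: os.MkdirAll treats an existing path as success only when Stat reports a directory, and Stat follows symlinks. If /usr/local/demo_root is a symlink to the absolute path /usr/share/demo, that target does not resolve while the rootfs is mounted under /var/tmp/buildah…/mnt/rootfs rather than at /, so Stat fails with ENOENT and the fallback Mkdir returns EEXIST. A minimal standalone sketch reproducing this (illustrative paths, not Buildah code):

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	dir, err := os.MkdirTemp("", "mkdirall-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// Mimic rootfs/usr/local/demo_root -> /usr/share/demo: a symlink
	// whose absolute target does not resolve from the current root.
	link := filepath.Join(dir, "demo_root")
	if err := os.Symlink("/does-not-exist/demo", link); err != nil {
		panic(err)
	}

	// Stat on the dangling symlink fails with ENOENT, so MkdirAll
	// falls through to Mkdir, which returns EEXIST because the
	// symlink itself occupies the path.
	err = os.MkdirAll(filepath.Join(link, "app"), 0o755)
	fmt.Println(err) // mkdir .../demo_root: file exists
}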

an-toine avatar Feb 21 '24 16:02 an-toine

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot] avatar Mar 23 '24 00:03 github-actions[bot]