
buildah 1.40.0 fails when running inside podman: masking non-directory: permission denied

Open mh21 opened this issue 7 months ago • 19 comments

Issue Description

  • working: buildah version 1.39.3 (image-spec 1.1.0, runtime-spec 1.2.0)
  • broken: buildah version 1.40.0 (image-spec 1.1.1, runtime-spec 1.2.1)

error running subprocess: masking non-directory "/var/tmp/buildah1757143299/mnt/rootfs/proc/interrupts" in mount namespace: permission denied

This happens on the podman-based gitlab-runners on the internal RH GitLab instance.

As far as I can tell, the job runs with:

STORAGE_DRIVER: vfs
BUILDAH_ISOLATION: chroot

For more details, contact me (mh21) on RH internal Slack in team-kernel-cki.

Steps to reproduce the issue

Run buildah build on the internal RH GitLab runners.

Describe the results you received

error running subprocess: masking non-directory "/var/tmp/buildah1757143299/mnt/rootfs/proc/interrupts" in mount namespace: permission denied

Describe the results you expected

no error

buildah version output

buildah version 1.40.0 (image-spec 1.1.1, runtime-spec 1.2.1)

buildah info output

runs via a pipeline based on quay.io/cki/buildah:g-1800472385

Provide your storage.conf

(as shipped by Fedora Rawhide/43)

# This file is the configuration file for all tools
# that use the containers/storage library. The storage.conf file
# overrides all other storage.conf files. Container engines using the
# container/storage library do not inherit fields from other storage.conf
# files.
#
#  Note: The storage.conf file overrides other storage.conf files based on this precedence:
#      /usr/containers/storage.conf
#      /etc/containers/storage.conf
#      $HOME/.config/containers/storage.conf
#      $XDG_CONFIG_HOME/containers/storage.conf (If XDG_CONFIG_HOME is set)
# See man 5 containers-storage.conf for more information
# The "container storage" table contains all of the server options.
[storage]

# Default Storage Driver, Must be set for proper operation.
driver = "overlay"

# Temporary storage location
runroot = "/run/containers/storage"

# Primary Read/Write location of container storage
# When changing the graphroot location on an SELINUX system, you must
# ensure  the labeling matches the default locations labels with the
# following commands:
# semanage fcontext -a -e /var/lib/containers/storage /NEWSTORAGEPATH
# restorecon -R -v /NEWSTORAGEPATH
graphroot = "/var/lib/containers/storage"

# Optional alternate location of image store if a location separate from the
# container store is required. If set, it must be different than graphroot.
# imagestore = ""


# Storage path for rootless users
#
# rootless_storage_path = "$HOME/.local/share/containers/storage"

# Transient store mode makes all container metadata be saved in temporary storage
# (i.e. runroot above). This is faster, but doesn't persist across reboots.
# Additional garbage collection must also be performed at boot-time, so this
# option should remain disabled in most configurations.
# transient_store = true

[storage.options]
# Storage options to be passed to underlying storage drivers

# AdditionalImageStores is used to pass paths to additional Read/Only image stores
# Must be comma separated list.
additionalimagestores = [
"/usr/lib/containers/storage",
]

# Allows specification of how storage is populated when pulling images. This
# option can speed the pulling process of images compressed with format
# zstd:chunked. Containers/storage looks for files within images that are being
# pulled from a container registry that were previously pulled to the host.  It
# can copy or create a hard link to the existing file when it finds them,
# eliminating the need to pull them from the container registry. These options
# can deduplicate pulling of content, disk storage of content and can allow the
# kernel to use less memory when running containers.

# containers/storage supports four keys
#   * enable_partial_images="true" | "false"
#     Tells containers/storage to look for files previously pulled in storage
#     rather then always pulling them from the container registry.
#   * use_hard_links = "false" | "true"
#     Tells containers/storage to use hard links rather then create new files in
#     the image, if an identical file already existed in storage.
#   * ostree_repos = ""
#     Tells containers/storage where an ostree repository exists that might have
#     previously pulled content which can be used when attempting to avoid
#     pulling content from the container registry
#   * convert_images = "false" | "true"
#     If set to true, containers/storage will convert images to a
#     format compatible with partial pulls in order to take advantage
#     of local deduplication and hard linking.  It is an expensive
#     operation so it is not enabled by default.
pull_options = {enable_partial_images = "true", use_hard_links = "false", ostree_repos=""}

# Remap-UIDs/GIDs is the mapping from UIDs/GIDs as they should appear inside of
# a container, to the UIDs/GIDs as they should appear outside of the container,
# and the length of the range of UIDs/GIDs.  Additional mapped sets can be
# listed and will be heeded by libraries, but there are limits to the number of
# mappings which the kernel will allow when you later attempt to run a
# container.
#
# remap-uids = "0:1668442479:65536"
# remap-gids = "0:1668442479:65536"

# Remap-User/Group is a user name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid or /etc/subgid file.  Mappings are set up starting
# with an in-container ID of 0 and then a host-level ID taken from the lowest
# range that matches the specified name, and using the length of that range.
# Additional ranges are then assigned, using the ranges which specify the
# lowest host-level IDs first, to the lowest not-yet-mapped in-container ID,
# until all of the entries have been used for maps. This setting overrides the
# Remap-UIDs/GIDs setting.
#
# remap-user = "containers"
# remap-group = "containers"

# Root-auto-userns-user is a user name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid and /etc/subgid file.  These ranges will be partitioned
# to containers configured to create automatically a user namespace.  Containers
# configured to automatically create a user namespace can still overlap with containers
# having an explicit mapping set.
# This setting is ignored when running as rootless.
# root-auto-userns-user = "storage"
#
# Auto-userns-min-size is the minimum size for a user namespace created automatically.
# auto-userns-min-size=1024
#
# Auto-userns-max-size is the maximum size for a user namespace created automatically.
# auto-userns-max-size=65536

[storage.options.overlay]
# ignore_chown_errors can be set to allow a non privileged user running with
# a single UID within a user namespace to run containers. The user can pull
# and use any image even those with multiple uids.  Note multiple UIDs will be
# squashed down to the default uid in the container.  These images will have no
# separation between the users in the container. Only supported for the overlay
# and vfs drivers.
#ignore_chown_errors = "false"

# Inodes is used to set a maximum inodes of the container image.
# inodes = ""

# Path to an helper program to use for mounting the file system instead of mounting it
# directly.
#mount_program = "/usr/bin/fuse-overlayfs"

# mountopt specifies comma separated list of extra mount options
mountopt = "nodev,metacopy=on"

# Set to skip a PRIVATE bind mount on the storage home directory.
# skip_mount_home = "false"

# Set to use composefs to mount data layers with overlay.
# use_composefs = "false"

# Size is used to set a maximum size of the container image.
# size = ""

# ForceMask specifies the permissions mask that is used for new files and
# directories.
#
# The values "shared" and "private" are accepted.
# Octal permission masks are also accepted.
#
#  "": No value specified.
#     All files/directories, get set with the permissions identified within the
#     image.
#  "private": it is equivalent to 0700.
#     All files/directories get set with 0700 permissions.  The owner has rwx
#     access to the files. No other users on the system can access the files.
#     This setting could be used with networked based homedirs.
#  "shared": it is equivalent to 0755.
#     The owner has rwx access to the files and everyone else can read, access
#     and execute them. This setting is useful for sharing containers storage
#     with other users.  For instance have a storage owned by root but shared
#     to rootless users as an additional store.
#     NOTE:  All files within the image are made readable and executable by any
#     user on the system. Even /etc/shadow within your image is now readable by
#     any user.
#
#   OCTAL: Users can experiment with other OCTAL Permissions.
#
#  Note: The force_mask Flag is an experimental feature, it could change in the
#  future.  When "force_mask" is set the original permission mask is stored in
#  the "user.containers.override_stat" xattr and the "mount_program" option must
#  be specified. Mount programs like "/usr/bin/fuse-overlayfs" present the
#  extended attribute permissions to processes within containers rather than the
#  "force_mask"  permissions.
#
# force_mask = ""

Upstream Latest Release

Yes

Additional environment details

non-privileged podman gitlab-runner job

Additional information

No response

May 06 '25 07:05 mh21

For reference, the CKI downstream issue: https://gitlab.com/cki-project/infrastructure/-/issues/673

May 06 '25 07:05 mh21

@nalind I am not entirely sure, but were you looking at a similar issue?

May 07 '25 15:05 flouthoc

Yes, this looks like what https://github.com/containers/container-selinux/pull/367 was addressing. Checking the audit log for an SELinux policy denial should allow us to confirm if that's the case.

May 12 '25 13:05 nalind
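
To check for the SELinux denial mentioned above, something like the following could be run on the node itself (not inside the build container). It is only a minimal sketch: the log path assumes a default auditd setup, and avc_denials is a hypothetical helper name, not part of any tool.

```shell
# Print SELinux AVC denial records from an audit log.
# The default path assumes a standard auditd setup; reading it usually needs root.
avc_denials() {
  grep 'type=AVC' "${1:-/var/log/audit/audit.log}" | grep 'denied'
}

# Where the audit tools are installed, ausearch gives similar information:
#   ausearch -m avc -ts recent
if [ -r /var/log/audit/audit.log ]; then
  avc_denials || echo "no AVC denials found"
else
  echo "audit log not readable; run as root on the node"
fi
```

If a denial shows up for the buildah process around the time of the failed build, that would support the SELinux policy explanation.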

@nalind does this fix apply to podman or to buildah? While we have control of buildah, the podman version used on the hosts is controlled by IT.

May 12 '25 14:05 mh21

The updated policy would need to be applied on the node itself.

May 12 '25 15:05 nalind

@nalind so in that case, should I file a RHEL issue so the fixed version makes its way into RHEL/CS, or will this happen automatically anyway?

May 12 '25 21:05 mh21

The change was part of the container-selinux 2.237 release, which I think is currently a candidate for 10.1. If you need it in a release before then, then yes, I think an issue would help it along.

May 13 '25 14:05 nalind
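
To see whether a node already has that release, a version check along these lines might help. This is a sketch that assumes an RPM-based node and GNU sort; version_ge is a hypothetical helper, and a distribution could also backport the fix into a package with a lower version number, so treat the result as a hint only.

```shell
# Succeed if version $1 is at least version $2 (relies on GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# 2.237 is the release named in the comment above.
required=2.237
if command -v rpm >/dev/null 2>&1 &&
   installed=$(rpm -q --qf '%{VERSION}' container-selinux 2>/dev/null); then
  if version_ge "$installed" "$required"; then
    echo "container-selinux $installed is at least $required"
  else
    echo "container-selinux $installed is older than $required"
  fi
else
  echo "container-selinux not found via rpm"
fi
```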

@nalind filed as https://issues.redhat.com/browse/RHEL-92000

May 16 '25 06:05 mh21

Hi, we see the same error about /proc/interrupts when running the quay.io buildah image 1.40.0, but not 1.39.3, in a Tekton task hosted in a local OpenShift cluster running on RHEL. I understand there's a fix in container-selinux.
Could you explain how that fix is delivered to us end users: next buildah image, RHEL update, OpenShift update, or something else? Thanks!

Jun 02 '25 10:06 ekonijn

> Hi, we see the same error about /proc/interrupts when running the quay.io buildah image 1.40.0, but not 1.39.3, in a Tekton task hosted in a local OpenShift cluster running on RHEL. I understand there's a fix in container-selinux. Could you explain how that fix is delivered to us end users: next buildah image, RHEL update, OpenShift update, or something else? Thanks!

According to the RH JIRA issue, the fix will be delivered in a RHEL update.

Those of us on OKD will just have to hope it makes it to SCOS at some point.

Jun 02 '25 10:06 nate-duke

A friendly reminder that this issue had no activity for 30 days.

Jul 03 '25 00:07 github-actions[bot]

Is there anything on the user side that I can do to work around this issue? I am running buildah on OpenShift, and I'm stuck at v1.39. I'm not sure when or if our cluster admins are going to upgrade the nodes.

Sep 05 '25 14:09 jcox10

> Is there anything on the user side that I can do to work around this issue? I am running buildah on OpenShift, and I'm stuck at v1.39. I'm not sure when or if our cluster admins are going to upgrade the nodes.

I'm sorry to just throw this in without testing (I don't have the environment right now), but would using --security-opt unmask=/proc/interrupts with buildah bud work?

Sep 05 '25 15:09 ver4a

> I'm sorry to just throw this in without testing (I don't have the environment right now), but would using --security-opt unmask=/proc/interrupts with buildah bud work?

~That worked for me... and apparently so does the stable:latest tag somehow.~

EDIT: Upon further testing, stable:latest only worked because of cached artifacts from runs with 1.39.3. With the unmask argument, the failure is the same:

error running subprocess: masking non-directory "/var/tmp/buildah1046798299/mnt/rootfs/proc/interrupts" in mount namespace: permission denied

Sep 05 '25 16:09 nate-duke

@ver4a Yes, that does work on 1.41.3, thank you!

Sep 08 '25 13:09 jcox10

> @ver4a Yes, that does work on 1.41.3, thank you!

How'd you make it work, @jcox10? My runners using the quay.io/buildah/stable:latest image on OKD 4.17 still failed with those arguments. We're hoping to get this sorted before Red Hat kills the v1.39.3 tag.

Sep 08 '25 15:09 nate-duke

> How'd you make it work, @jcox10? My runners using the quay.io/buildah/stable:latest image on OKD 4.17 still failed with those arguments. We're hoping to get this sorted before Red Hat kills the v1.39.3 tag.

I just added the arguments to the build and it worked, using the v1.41.3 tag of buildah. I'm not sure what OKD version the cluster is running.

buildah build --security-opt unmask=/proc/interrupts .

Our buildah image is modified slightly, following GitLab's tutorial. I'm not sure if there is anything in there that is enabling this to work.

Sep 08 '25 15:09 jcox10

Thanks @jcox10. For whatever reason, that seems to work for me today as well, but it didn't on Friday. I double-checked the commits to my CI definition as well. I'll test a bunch more to make sure before I tell everybody they can unpin buildah in their pipelines!

FWIW, the uid/gidmap stuff in those instructions from GitLab is in the buildah image already. We just use the upstream buildah image from Quay, set STORAGE_DRIVER=vfs, and run buildah bud with the --isolation=chroot argument. Modifying it like that does make it a little more hands-free, though.

Sep 08 '25 16:09 nate-duke
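
Putting the pieces from the last few comments together, the build step could look roughly like this. The settings (vfs storage driver, chroot isolation, the unmask option) are the ones reported in this thread; build_with_unmask is a hypothetical helper name added only so the sketch can be dry-run. Results in the thread were mixed, so this may not work on every cluster.

```shell
# Settings reported in this thread; adjust to your runner.
STORAGE_DRIVER=vfs
export STORAGE_DRIVER

# Usage: build_with_unmask CONTEXT [PREFIX...]
# Runns nothing extra by default; any PREFIX words are placed before the
# buildah command, so passing "echo" turns the call into a dry run.
build_with_unmask() {
  context=$1; shift
  "$@" buildah build --isolation=chroot \
    --security-opt unmask=/proc/interrupts "$context"
}

# Dry run: only prints the command line.
build_with_unmask . echo
```

Calling build_with_unmask . without the trailing echo would run the actual build in the current directory.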

The workaround with --security-opt unmask does not work when using the --sbom flag.

I get the following error at the end of the build, when the SBOM should be generated:

COMMIT <registry>/spring-petclinic/test:1.27.22
error running subprocess: masking non-directory "/var/tmp/buildah3314428120/mnt/rootfs/proc/interrupts" in mount namespace: permission denied
Error: committing container for step {Env:[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin container=oci] Command:entrypoint Args:[/usr/bin/java] Flags:[] Attrs:map[json:true] Message:ENTRYPOINT /usr/bin/java Heredocs:[] Original:ENTRYPOINT [ "/usr/bin/java" ]}: scanning rootfs to generate SBOM for container "cfa4a1f6ef61b59ff20a87fe89df45df1761364574dead37d869e683240735e1": running scanning command [syft scan dir:/.rootfs --output cyclonedx-json=/.scans/scan0.json]: exit status 1

Is there also a workaround for this?

Nov 18 '25 14:11 alexalbr