Missing files in buildah, but works in podman build
Description
Running a container build with podman build works as expected; running the same build with buildah bud (which is what Red Hat's GitHub action uses) fails due to missing files.
Steps to reproduce the issue:
- Clone git@github.com:trustification/trustification.git at rev 2281c30e93fed7e1c3a9668699ebce582249a722
- Run buildah bud --platform linux/amd64 -f spog/ui/Containerfile --build-arg tag=latest --format docker --tls-verify=true -t spog-ui:latest . from the root of the repository
- Boom
Describe the results you received:
The COPY command (the 3rd instruction of the first stage) is supposed to copy files into the container for building. It does not; the files are missing.
Describe the results you expected:
The files should be available, as they are with podman build.
Output of rpm -q buildah or apt list buildah:
buildah-1.30.0-1.fc38.x86_64
Output of buildah version:
Version: 1.30.0
Go Version: go1.20.2
Image Spec: 1.0.2-dev
Runtime Spec: 1.1.0-rc.1
CNI Spec: 1.0.0
libcni Version: v1.1.2
image Version: 5.25.0
Git Commit:
Built: Mon Apr 10 09:26:00 2023
OS/Arch: linux/amd64
BuildPlatform: linux/amd64
Output of podman version if reporting a podman build issue:
Client: Podman Engine
Version: 4.5.0
API Version: 4.5.0
Go Version: go1.20.2
Built: Fri Apr 14 17:42:22 2023
OS/Arch: linux/amd64
Output of cat /etc/*release:
Fedora release 38 (Thirty Eight)
NAME="Fedora Linux"
VERSION="38 (KDE Plasma)"
ID=fedora
VERSION_ID=38
VERSION_CODENAME=""
PLATFORM_ID="platform:f38"
PRETTY_NAME="Fedora Linux 38 (KDE Plasma)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:38"
DEFAULT_HOSTNAME="fedora"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f38/system-administrators-guide/"
SUPPORT_URL="https://ask.fedoraproject.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=38
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=38
SUPPORT_END=2024-05-14
VARIANT="KDE Plasma"
VARIANT_ID=kde
Fedora release 38 (Thirty Eight)
Fedora release 38 (Thirty Eight)
Output of uname -a:
Linux brocken 6.3.5-200.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Tue May 30 15:44:17 UTC 2023 x86_64 GNU/Linux
Output of cat /etc/containers/storage.conf:
# This file is is the configuration file for all tools
# that use the containers/storage library. The storage.conf file
# overrides all other storage.conf files. Container engines using the
# container/storage library do not inherit fields from other storage.conf
# files.
#
# Note: The storage.conf file overrides other storage.conf files based on this precedence:
# /usr/containers/storage.conf
# /etc/containers/storage.conf
# $HOME/.config/containers/storage.conf
# $XDG_CONFIG_HOME/containers/storage.conf (If XDG_CONFIG_HOME is set)
# See man 5 containers-storage.conf for more information
# The "container storage" table contains all of the server options.
[storage]
# Default Storage Driver, Must be set for proper operation.
driver = "overlay"
# Temporary storage location
runroot = "/run/containers/storage"
# Primary Read/Write location of container storage
# When changing the graphroot location on an SELINUX system, you must
# ensure the labeling matches the default locations labels with the
# following commands:
# semanage fcontext -a -e /var/lib/containers/storage /NEWSTORAGEPATH
# restorecon -R -v /NEWSTORAGEPATH
graphroot = "/var/lib/containers/storage"
# Storage path for rootless users
#
# rootless_storage_path = "$HOME/.local/share/containers/storage"
[storage.options]
# Storage options to be passed to underlying storage drivers
# AdditionalImageStores is used to pass paths to additional Read/Only image stores
# Must be comma separated list.
additionalimagestores = [
]
# Remap-UIDs/GIDs is the mapping from UIDs/GIDs as they should appear inside of
# a container, to the UIDs/GIDs as they should appear outside of the container,
# and the length of the range of UIDs/GIDs. Additional mapped sets can be
# listed and will be heeded by libraries, but there are limits to the number of
# mappings which the kernel will allow when you later attempt to run a
# container.
#
# remap-uids = 0:1668442479:65536
# remap-gids = 0:1668442479:65536
# Remap-User/Group is a user name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid or /etc/subgid file. Mappings are set up starting
# with an in-container ID of 0 and then a host-level ID taken from the lowest
# range that matches the specified name, and using the length of that range.
# Additional ranges are then assigned, using the ranges which specify the
# lowest host-level IDs first, to the lowest not-yet-mapped in-container ID,
# until all of the entries have been used for maps.
#
# remap-user = "containers"
# remap-group = "containers"
# Root-auto-userns-user is a user name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid and /etc/subgid file. These ranges will be partitioned
# to containers configured to create automatically a user namespace. Containers
# configured to automatically create a user namespace can still overlap with containers
# having an explicit mapping set.
# This setting is ignored when running as rootless.
# root-auto-userns-user = "storage"
#
# Auto-userns-min-size is the minimum size for a user namespace created automatically.
# auto-userns-min-size=1024
#
# Auto-userns-max-size is the minimum size for a user namespace created automatically.
# auto-userns-max-size=65536
[storage.options.overlay]
# ignore_chown_errors can be set to allow a non privileged user running with
# a single UID within a user namespace to run containers. The user can pull
# and use any image even those with multiple uids. Note multiple UIDs will be
# squashed down to the default uid in the container. These images will have no
# separation between the users in the container. Only supported for the overlay
# and vfs drivers.
#ignore_chown_errors = "false"
# Inodes is used to set a maximum inodes of the container image.
# inodes = ""
# Path to an helper program to use for mounting the file system instead of mounting it
# directly.
#mount_program = "/usr/bin/fuse-overlayfs"
# mountopt specifies comma separated list of extra mount options
mountopt = "nodev,metacopy=on"
# Set to skip a PRIVATE bind mount on the storage home directory.
# skip_mount_home = "false"
# Size is used to set a maximum size of the container image.
# size = ""
# ForceMask specifies the permissions mask that is used for new files and
# directories.
#
# The values "shared" and "private" are accepted.
# Octal permission masks are also accepted.
#
# "": No value specified.
# All files/directories, get set with the permissions identified within the
# image.
# "private": it is equivalent to 0700.
# All files/directories get set with 0700 permissions. The owner has rwx
# access to the files. No other users on the system can access the files.
# This setting could be used with networked based homedirs.
# "shared": it is equivalent to 0755.
# The owner has rwx access to the files and everyone else can read, access
# and execute them. This setting is useful for sharing containers storage
# with other users. For instance have a storage owned by root but shared
# to rootless users as an additional store.
# NOTE: All files within the image are made readable and executable by any
# user on the system. Even /etc/shadow within your image is now readable by
# any user.
#
# OCTAL: Users can experiment with other OCTAL Permissions.
#
# Note: The force_mask Flag is an experimental feature, it could change in the
# future. When "force_mask" is set the original permission mask is stored in
# the "user.containers.override_stat" xattr and the "mount_program" option must
# be specified. Mount programs like "/usr/bin/fuse-overlayfs" present the
# extended attribute permissions to processes within containers rather then the
# "force_mask" permissions.
#
# force_mask = ""
[storage.options.thinpool]
# Storage Options for thinpool
# autoextend_percent determines the amount by which pool needs to be
# grown. This is specified in terms of % of pool size. So a value of 20 means
# that when threshold is hit, pool will be grown by 20% of existing
# pool size.
# autoextend_percent = "20"
# autoextend_threshold determines the pool extension threshold in terms
# of percentage of pool size. For example, if threshold is 60, that means when
# pool is 60% full, threshold has been hit.
# autoextend_threshold = "80"
# basesize specifies the size to use when creating the base device, which
# limits the size of images and containers.
# basesize = "10G"
# blocksize specifies a custom blocksize to use for the thin pool.
# blocksize="64k"
# directlvm_device specifies a custom block storage device to use for the
# thin pool. Required if you setup devicemapper.
# directlvm_device = ""
# directlvm_device_force wipes device even if device already has a filesystem.
# directlvm_device_force = "True"
# fs specifies the filesystem type to use for the base device.
# fs="xfs"
# log_level sets the log level of devicemapper.
# 0: LogLevelSuppress 0 (Default)
# 2: LogLevelFatal
# 3: LogLevelErr
# 4: LogLevelWarn
# 5: LogLevelNotice
# 6: LogLevelInfo
# 7: LogLevelDebug
# log_level = "7"
# min_free_space specifies the min free space percent in a thin pool require for
# new device creation to succeed. Valid values are from 0% - 99%.
# Value 0% disables
# min_free_space = "10%"
# mkfsarg specifies extra mkfs arguments to be used when creating the base
# device.
# mkfsarg = ""
# metadata_size is used to set the `pvcreate --metadatasize` options when
# creating thin devices. Default is 128k
# metadata_size = ""
# Size is used to set a maximum size of the container image.
# size = ""
# use_deferred_removal marks devicemapper block device for deferred removal.
# If the thinpool is in use when the driver attempts to remove it, the driver
# tells the kernel to remove it as soon as possible. Note this does not free
# up the disk space, use deferred deletion to fully remove the thinpool.
# use_deferred_removal = "True"
# use_deferred_deletion marks thinpool device for deferred deletion.
# If the device is busy when the driver attempts to delete it, the driver
# will attempt to delete device every 30 seconds until successful.
# If the program using the driver exits, the driver will continue attempting
# to cleanup the next time the driver is used. Deferred deletion permanently
# deletes the device and all data stored in device will be lost.
# use_deferred_deletion = "True"
# xfs_nospace_max_retries specifies the maximum number of retries XFS should
# attempt to complete IO when ENOSPC (no space) error is returned by
# underlying storage device.
# xfs_nospace_max_retries = "0"
Not sure this is related, but it fails differently when actually running on a GitHub Actions runner. In that case I get:
[1/2] STEP 5/5: RUN true && npm ci && rustup target add wasm32-unknown-unknown && trunk build --release --dist /public
error running container: error from crun creating container for [/bin/sh -c true && npm ci && rustup target add wasm32-unknown-unknown && trunk build --release --dist /public]: chdir: No such file or directory
: exit status 1
That also seems to be missing something, though maybe something different. Locally I can see npm ci fail due to a missing file (package-lock.json), while on GitHub I get the error above.
OK, it looks like this is due to a VOLUME statement for /usr/src in the base image. As soon as I move this to a different location, the copy works as expected.
This feels like a bug in buildah, as it works with both podman build and docker build.
Can you generate a simple reproducer?
Took me a bit: https://github.com/ctron/buildah-repro-4845
When the VOLUME instruction is in the base image, it fails. When the VOLUME instruction goes into the same Containerfile, it works.
With podman build it always works.
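For reference, the reproducer boils down to roughly the following (a sketch only; the image and file names here are illustrative, the actual files are in the repository linked above).

Containerfile.base:
FROM registry.fedoraproject.org/fedora:38
VOLUME /usr/src

Containerfile:
FROM localhost/repro-base:latest
COPY hello.txt /usr/src/hello.txt
RUN ls -l /usr/src/hello.txt

buildah bud -f Containerfile.base -t repro-base:latest .
buildah bud -f Containerfile -t repro:latest .

As described above, with buildah bud the RUN step does not find the copied file, while with podman build (and docker build) it is present.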
A friendly reminder that this issue had no activity for 30 days.
I'll take a look at this.
@flouthoc If you haven't started already, mind if I play with this?
@danishprakash Sure, thanks.
A friendly reminder that this issue had no activity for 30 days.
@danishprakash Any progress?
Sorry, I had some other items to look into. I'll try to get an update on both this and #4910 by the end of this week.
This is caused by the differing default value of --layers[1] between Podman and Buildah: Podman sets it to true by default, while Buildah defaults to false. The man page doesn't have much to say about --layers, especially for a multi-stage build, where in this case it seems to be a necessity.
So, although the fix might be "enabling" it in Buildah, I'm trying to understand the rationale behind this; I'm surely missing something here. The flag understandably allows or disallows[2] intermediate layer caching, but shouldn't it be enabled by default for multi-stage builds, especially if there's a direct dependency between the layers?
cc/ @umohnani8 @nalind
[1] https://github.com/containers/buildah/pull/784
[2] https://github.com/containers/buildah/commit/8ceda286a764e07248e3753b2a919802d5241b6f#diff-d96798032784de488b331cd60b9420f3263b46da4e7b47f8968569a0e6f629beR1505-R1510
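To make the difference concrete, the behaviour can be compared directly on the command line (a sketch, reusing the reproducer above):

podman build -f Containerfile -t repro:latest .           # --layers defaults to true
buildah bud -f Containerfile -t repro:latest .            # --layers defaults to false
buildah bud --layers -f Containerfile -t repro:latest .   # force layer caching; per the analysis above this should match podman

The default can also be changed through the BUILDAH_LAYERS environment variable (see the buildah-bud man page).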
A friendly reminder that this issue had no activity for 30 days.
@flouthoc any comment on this?
A friendly reminder that this issue had no activity for 30 days.
Activity …
Can you just change the default for buildah to --layers=true?
This should not happen even with --layers=false; I'll check this (sorry for the delay).
A friendly reminder that this issue had no activity for 30 days.
Activity …
@ctron have you tried with --layers=true?
No, but we rely on a Red Hat GitHub action that makes use of buildah, so that might not be a solution for us.
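For the GitHub Actions case, a possible workaround until this is resolved might look like the following. This is a sketch only: it assumes buildah's BUILDAH_LAYERS environment variable is picked up by the action's buildah invocation, and the input names are those of redhat-actions/buildah-build@v2 as I understand them, so please check the action's documentation.

- name: Build image
  uses: redhat-actions/buildah-build@v2
  env:
    BUILDAH_LAYERS: "true"   # assumed to flip buildah's --layers default for this step
  with:
    image: spog-ui
    tags: latest
    containerfiles: spog/ui/Containerfile
    build-args: tag=latest

Alternatively, if the action forwards extra CLI arguments (e.g. via an extra-args input), --layers could be passed there instead.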
A friendly reminder that this issue had no activity for 30 days.
Activity ...