for-mac icon indicating copy to clipboard operation
for-mac copied to clipboard

Running linux/amd64 image only succeedes the first time

Open tomaszklak opened this issue 1 year ago • 5 comments

Description

Running the same command twice (or more) results in all runs after the first with a exec format error error. This does not happen for arm containers and not always for amd64 ones. But for the example command in this ticket it happens 100% of the time.

Reproduce

Use this Dockerfile:

FROM ubuntu:jammy
RUN apt update && apt -y install systemd
ENTRYPOINT ["/lib/systemd/systemd"]

Run it once:

$ docker build . -t mac_issue --platform linux/amd64 && docker run --name foo --privileged --rm -it --platform linux/amd64 mac_issue

This will result in various logs emitted by systemd. After a few seconds kill the container (from a different shell) with:

$ docker kill foo

Now, run the same command again:

$ docker build . -t mac_issue --platform linux/amd64 && docker run --name foo --privileged --rm -it --platform linux/amd64 mac_issue

This results in a failure with just one log:

exec /lib/systemd/systemd: exec format error

Expected behavior

The second run behaves in a very similar or identical way as the first one.

docker version

Client:
 Cloud integration: v1.0.35+desktop.10
 Version:           25.0.1
 API version:       1.44
 Go version:        go1.21.6
 Git commit:        29cf629
 Built:             Tue Jan 23 23:06:12 2024
 OS/Arch:           darwin/arm64
 Context:           desktop-linux

Server: Docker Desktop 4.27.0 (135262)
 Engine:
  Version:          25.0.1
  API version:      1.44 (minimum version 1.24)
  Go version:       go1.21.6
  Git commit:       71fa3ab
  Built:            Tue Jan 23 23:09:35 2024
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.6.27
  GitCommit:        a1496014c916f9e62104b33d1bb5bd03b0858e59
 runc:
  Version:          1.1.11
  GitCommit:        v1.1.11-0-g4bccb38
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Client:
 Version:    25.0.1
 Context:    desktop-linux
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.12.1-desktop.4
    Path:     /Users/tomaszklak/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.24.3-desktop.1
    Path:     /Users/tomaszklak/.docker/cli-plugins/docker-compose
  debug: Get a shell into any image or container. (Docker Inc.)
    Version:  0.0.22
    Path:     /Users/tomaszklak/.docker/cli-plugins/docker-debug
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.1.0
    Path:     /Users/tomaszklak/.docker/cli-plugins/docker-dev
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.21
    Path:     /Users/tomaszklak/.docker/cli-plugins/docker-extension
  feedback: Provide feedback, right in your terminal! (Docker Inc.)
    Version:  v1.0.4
    Path:     /Users/tomaszklak/.docker/cli-plugins/docker-feedback
  init: Creates Docker-related starter files for your project (Docker Inc.)
    Version:  v1.0.0
    Path:     /Users/tomaszklak/.docker/cli-plugins/docker-init
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     /Users/tomaszklak/.docker/cli-plugins/docker-sbom
  scout: Docker Scout (Docker Inc.)
    Version:  v1.3.0
    Path:     /Users/tomaszklak/.docker/cli-plugins/docker-scout

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 14
 Server Version: 25.0.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: a1496014c916f9e62104b33d1bb5bd03b0858e59
 runc version: v1.1.11-0-g4bccb38
 init version: de40ad0
 Security Options:
  seccomp
   Profile: unconfined
  cgroupns
 Kernel Version: 6.6.12-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: aarch64
 CPUs: 10
 Total Memory: 7.657GiB
 Name: docker-desktop
 ID: 418872b1-e1a8-464c-b2bd-d6b5ac76a278
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Experimental: false
 Insecure Registries:
  hubproxy.docker.internal:5555
  127.0.0.0/8
 Live Restore Enabled: false

Diagnostics ID

E74238F5-93A8-4B54-877E-E73B72A23F20/20240522105917

Additional Info

This works fine on docker desktop for mac 4.26.1, and seems to be broken since 4.27.0 - I can reproduce the problem on both 4.27.0 and 4.30.0.

tomaszklak avatar May 22 '24 11:05 tomaszklak

@tomaszklak The above appears to work for me. For example,

➜ docker build --platform linux/amd64 -t mac_issue .
➜ docker run --name foo --privileged --rm -it mac_issue                       
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
systemd 249.11-0ubuntu3.12 running in system mode (+PAM +AUDIT +SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY -P11KIT -QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified)
Detected virtualization docker.
Detected architecture x86-64.

Welcome to Ubuntu 22.04.4 LTS!

Queued start job for default target Graphical Interface.
[  OK  ] Created slice Slice /system/getty.
[  OK  ] Created slice Slice /system/modprobe.
[  OK  ] Created slice User and Session Slice.
[  OK  ] Started Dispatch Password Requests to Console Directory Watch.
[  OK  ] Started Forward Password Requests to Wall Directory Watch.
[  OK  ] Set up automount Arbitrary Executable File Formats File System Automount Point.
[  OK  ] Reached target Local Encrypted Volumes.
[  OK  ] Reached target Path Units.
[  OK  ] Reached target Remote File Systems.
[  OK  ] Reached target Slice Units.
[  OK  ] Reached target Swaps.
[  OK  ] Reached target Local Verity Protected Volumes.
[  OK  ] Listening on initctl Compatibility Named Pipe.
[  OK  ] Listening on Journal Audit Socket.
[  OK  ] Listening on Journal Socket (/dev/log).
[  OK  ] Listening on Journal Socket.
[  OK  ] Reached target Socket Units.
         Mounting Huge Pages File System...
         Mounting Kernel Debug File System...
         Mounting Kernel Trace File System...
         Starting Journal Service...
         Starting Load Kernel Module configfs...
         Starting Load Kernel Module drm...
         Starting Load Kernel Module efi_pstore...
         Starting Load Kernel Module fuse...
         Starting Remount Root and Kernel File Systems...
         Starting Apply Kernel Variables...
modprobe@efi_pstore.service: Deactivated successfully.
[  OK  ] Finished Load Kernel Module efi_pstore.
[  OK  ] Mounted Huge Pages File System.
[  OK  ] Mounted Kernel Trace File System.
[email protected]: Deactivated successfully.
[  OK  ] Finished Load Kernel Module configfs.
[email protected]: Deactivated successfully.
[  OK  ] Finished Load Kernel Module drm.
[email protected]: Deactivated successfully.
[  OK  ] Finished Load Kernel Module fuse.
         Mounting FUSE Control File System...
[  OK  ] Mounted Kernel Debug File System.
[  OK  ] Finished Remount Root and Kernel File Systems.
         Starting Create System Users...
[  OK  ] Finished Apply Kernel Variables.
[  OK  ] Mounted FUSE Control File System.
[  OK  ] Started Journal Service.
         Starting Flush Journal to Persistent Storage...
[  OK  ] Finished Create System Users.
         Starting Create Static Device Nodes in /dev...
[  OK  ] Finished Create Static Device Nodes in /dev.
[  OK  ] Reached target Preparation for Local File Systems.
[  OK  ] Reached target Local File Systems.
         Starting Set Up Additional Binary Formats...
[  OK  ] Finished Flush Journal to Persistent Storage.
         Starting Create Volatile Files and Directories...
         Mounting Arbitrary Executable File Formats File System...
[  OK  ] Finished Create Volatile Files and Directories.
         Starting Network Name Resolution...
[  OK  ] Reached target System Time Set.
         Starting Record System Boot/Shutdown in UTMP...
[  OK  ] Mounted Arbitrary Executable File Formats File System.
[  OK  ] Finished Set Up Additional Binary Formats.
[FAILED] Failed to start Network Name Resolution.
See 'systemctl status systemd-resolved.service' for details.
[  OK  ] Finished Record System Boot/Shutdown in UTMP.
[  OK  ] Reached target System Initialization.
[  OK  ] Started Daily apt download activities.
[  OK  ] Started Daily apt upgrade and clean activities.
[  OK  ] Started Daily dpkg database backup timer.
[  OK  ] Started Periodic ext4 Online Metadata Check for All Filesystems.
[  OK  ] Started Message of the Day.
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Reached target Basic System.
[  OK  ] Reached target Timer Units.
[  OK  ] Listening on D-Bus System Message Bus Socket.
[  OK  ] Started D-Bus System Message Bus.
         Starting Remove Stale Online ext4 Metadata Check Snapshots...
         Starting Dispatcher daemon for systemd-networkd...
         Starting User Login Management...
[  OK  ] Stopped Network Name Resolution.
         Starting Network Name Resolution...
         Starting Permit User Sessions...
[FAILED] Failed to start Dispatcher daemon for systemd-networkd.
See 'systemctl status networkd-dispatcher.service' for details.
[  OK  ] Started D-Bus System Message Bus.
[FAILED] Failed to start Permit User Sessions.
See 'systemctl status systemd-user-sessions.service' for details.
[  OK  ] Started Console Getty.
[  OK  ] Started Getty on tty1.
[  OK  ] Reached target Login Prompts.
[  OK  ] Started D-Bus System Message Bus.
[FAILED] Failed to start Remove Stale Online ext4 Metadata Check Snapshots.
See 'systemctl status e2scrub_reap.service' for details.
[FAILED] Failed to start User Login Management.
See 'systemctl status systemd-logind.service' for details.
[  OK  ] Started D-Bus System Message Bus.
[  OK  ] Stopped User Login Management.
         Starting Load Kernel Module drm...
[  OK  ] Finished Load Kernel Module drm.
[  OK  ] Started D-Bus System Message Bus.
         Starting User Login Management...
[FAILED] Failed to start Network Name Resolution.
See 'systemctl status systemd-resolved.service' for details.
[  OK  ] Stopped Network Name Resolution.
         Starting Network Name Resolution...
[FAILED] Failed to start D-Bus System Message Bus.
See 'systemctl status dbus.service' for details.
[FAILED] Failed to start User Login Management.
See 'systemctl status systemd-logind.service' for details.
[  OK  ] Stopped User Login Management.
[  OK  ] Listening on D-Bus System Message Bus Socket.
         Starting Load Kernel Module drm...
[FAILED] Failed to start Network Name Resolution.
See 'systemctl status systemd-resolved.service' for details.
[  OK  ] Stopped Network Name Resolution.
         Starting Network Name Resolution...
[  OK  ] Finished Load Kernel Module drm.
         Starting User Login Management...
[FAILED] Failed to start User Login Management.
See 'systemctl status systemd-logind.service' for details.
[  OK  ] Stopped User Login Management.
         Starting Load Kernel Module drm...
[FAILED] Failed to start Network Name Resolution.
See 'systemctl status systemd-resolved.service' for details.
[  OK  ] Stopped Network Name Resolution.
         Starting Network Name Resolution...
[  OK  ] Finished Load Kernel Module drm.
         Starting User Login Management...
[FAILED] Failed to start User Login Management.
See 'systemctl status systemd-logind.service' for details.
[  OK  ] Stopped User Login Management.
         Starting Load Kernel Module drm...
[FAILED] Failed to start Network Name Resolution.
See 'systemctl status systemd-resolved.service' for details.
[  OK  ] Stopped Network Name Resolution.
[FAILED] Failed to start Network Name Resolution.
See 'systemctl status systemd-resolved.service' for details.
[  OK  ] Reached target Host and Network Name Lookups.
[  OK  ] Finished Load Kernel Module drm.
         Starting User Login Management...
[FAILED] Failed to start User Login Management.
See 'systemctl status systemd-logind.service' for details.
[  OK  ] Stopped User Login Management.
         Starting Load Kernel Module drm...
[  OK  ] Finished Load Kernel Module drm.
[FAILED] Failed to start User Login Management.
See 'systemctl status systemd-logind.service' for details.
[  OK  ] Reached target Multi-User System.
[  OK  ] Reached target Graphical Interface.
         Starting Record Runlevel Change in UTMP...
[FAILED] Failed to start Record Runlevel Change in UTMP.
See 'systemctl status systemd-update-utmp-runlevel.service' for details.

conradwt avatar May 27 '24 01:05 conradwt

@tomaszklak The above appears to work for me. For example,

➜ docker build --platform linux/amd64 -t mac_issue .
➜ docker run --name foo --privileged --rm -it mac_issue                       
...

You need to run it the second time:

➜ docker build --platform linux/amd64 -t mac_issue .
➜ docker run --name foo --privileged --rm -it mac_issue
# here it will work. After a few seconds kill the 'foo' container from a different shell
➜ docker run --name foo --privileged --rm -it mac_issue
# here it will fail

edit: you can also notice that the systemd startup with 4.26.1 has no failures, but that's probably just another side effect of whatever was broken after 4.26.1

tomaszklak avatar May 27 '24 07:05 tomaszklak

@tomaszklak Yes, I can reproduce the issue.

conradwt avatar May 28 '24 00:05 conradwt

Hi @conradwt @tomaszklak, what's happening here is that systemd modifies the bin_fmt config that is required to run emulated processes. It removes all of our config and registers python3.10 as an emulator. Not sure why it does that and also not sure yet how it didn't happen before.

dgageot avatar Jun 24 '24 08:06 dgageot

By comparing the logs between 4.26 and 4.31, I assume that what's happening is that:

  • on 4.26, systemd wasn't able to start "Arbitrary Executable File Formats File System":
[UNSUPP] Starting of Arbitrary Executable File Formats File System Automount Point unsupported.
  • on 4.31, it can.
[  OK  ] Set up automount Arbitrary Executable File Formats File System Automount Point.

So later in the logs, it changes the bin_fmt configuration:

[  OK  ] Finished Set Up Additional Binary Formats.

Which, for some reason, removes all the existing emu/rosetta config, making the execution of new amd64 process always fail, no matter if it's in this container or another container.

dgageot avatar Jun 24 '24 09:06 dgageot