[BUG] compose 2.20.1 broke image building on remote Engine (still not fixed in 2.26.0)
Description
My use case is building/deploying from a Windows client on/to remote Linux Engine instances. I have been using compose 2.16.0 successfully for a long time until I decided to upgrade to the latest 2.23.0. I have found the breaking point to be version 2.20.1.
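For anyone trying to narrow a regression like this down, the search for the breaking release can be sketched as a bisection over an ordered list of versions. This is only an illustration: `first_bad` and `REPORTED_BAD` are hypothetical names, the table simply encodes the results reported in this thread, and a real run would swap the compose binary and execute the build for each probe.

```python
from typing import Callable, Optional, Sequence


def first_bad(versions: Sequence[str], is_bad: Callable[[str], bool]) -> Optional[str]:
    """Return the earliest version for which is_bad() is true, or None.

    Assumes versions are ordered oldest to newest and that every release
    after the first bad one is also bad.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid          # the breaking point is at mid or earlier
        else:
            lo = mid + 1      # the breaking point is after mid
    return versions[lo] if lo < len(versions) else None


# Results as reported in this thread (True = build fails).
# In practice this lookup would be replaced by an actual build attempt.
REPORTED_BAD = {
    "2.16.0": False,
    "2.20.0": False,
    "2.20.1": True,
    "2.23.0": True,
    "2.23.3": True,
    "2.26.0": True,
}
VERSIONS = list(REPORTED_BAD)  # dicts preserve insertion order

breaking = first_bad(VERSIONS, lambda v: REPORTED_BAD[v])
print(breaking)
```

Bisection only needs a handful of probes, which matters here because each failing attempt takes several minutes.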
Environment variables:
COMPOSE_FILE=C:\Users\<redacted>\AppData\Local\Temp\tmp9EA8.tmp
COMPOSE_PROJECT_NAME=bm
COMPOSE_CONVERT_WINDOWS_PATHS=1
COMPOSE_PARALLEL_LIMIT=1
DOCKER_BUILDKIT=1
DOCKER_REGISTRY=<redacted>.azurecr.io/
DOCKER_HOST=ssh://<redacted>
Docker info:
PS> docker info
Client:
 Context: default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.10.0-docker)
  compose: Docker Compose (Docker Inc., v2.20.1)
Server:
 Containers: 8
  Running: 8
  Paused: 0
  Stopped: 0
 Images: 56
 Server Version: 20.10.14
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6
 runc version:
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.10.162-1.ph4
 Operating System: VMware Photon OS/Linux
 OSType: linux
 Architecture: x86_64
 CPUs: 6
 Total Memory: 7.773GiB
 Name: photon-9662f57d592b
 ID: 37P5:MBLZ:TGUE:HROE:PPZN:56OC:RT7J:ODLZ:B4PZ:URHC:3OGI:EQJJ
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine
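Since the behavior depends on which client plugin versions are in use, it can help to extract them from the docker info text so that reports are easy to compare. The sketch below is a hypothetical helper (`plugin_versions` is not part of any Docker tooling) and assumes the single-line plugin format shown in this report; other CLI versions may lay these lines out differently.

```python
import re
from typing import Dict

# Matches client plugin lines such as
#   "buildx: Docker Buildx (Docker Inc., v0.10.0-docker)"
# Group 1 is the plugin name, group 2 the version string.
PLUGIN_LINE = re.compile(r"^\s*([\w-]+):\s+.+\(.+,\s*(v[^)\s]+)\)\s*$")


def plugin_versions(info_text: str) -> Dict[str, str]:
    """Map plugin name to version for every plugin line found in info_text."""
    versions = {}
    for line in info_text.splitlines():
        match = PLUGIN_LINE.match(line)
        if match:
            versions[match.group(1)] = match.group(2)
    return versions


# Lines copied from the docker info output in this report.
SAMPLE = """\
Client:
 Context: default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.10.0-docker)
  compose: Docker Compose (Docker Inc., v2.20.1)
Server:
 Containers: 8
"""

found = plugin_versions(SAMPLE)
print(found)
```

Lines without a parenthesised vendor and version, such as Context or Containers, are ignored.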
Not working with 2.20.1:
PS> docker compose build --pull <redacted>
[+] Building 0.0s (0/0)
listing workers for Build: failed to list workers: Unavailable: connection error: desc = "transport: failed to write client preface: write |1: file already closed"
More than 3 minutes pass between [+] Building 0.0s (0/0) appearing and the command erroring out, even though the remote Engine host is just a Hyper-V VM running Photon OS on the same machine, reached through a virtual network switch.
This behavior is consistent across attempts.
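To put numbers on the delay and confirm that it is consistent, the command can be wrapped in a small timing helper. This is a minimal sketch: `run_timed` and `repeat` are hypothetical names, and the command to measure is taken from the script's arguments rather than hard-coded.

```python
import subprocess
import sys
import time
from typing import List, Tuple


def run_timed(cmd: List[str], timeout: float = 600.0) -> Tuple[int, float, str]:
    """Run cmd and return (exit code, elapsed seconds, combined output).

    Raises subprocess.TimeoutExpired if the command runs longer than timeout.
    """
    start = time.monotonic()
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so error messages are kept in order
        text=True,
        timeout=timeout,
    )
    return proc.returncode, time.monotonic() - start, proc.stdout


def repeat(cmd: List[str], attempts: int = 3) -> List[Tuple[int, float]]:
    """Run cmd several times, keeping only (exit code, elapsed seconds)."""
    return [run_timed(cmd)[:2] for _ in range(attempts)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python timed.py docker compose build --pull
    for code, elapsed in repeat(sys.argv[1:]):
        print(f"exit={code} elapsed={elapsed:.1f}s")
```

Recording the exit code alongside the elapsed time makes it easier to tell a slow success from a slow failure when comparing versions.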
Working with 2.20.0 (exact same environment, only difference is the compose exe):
(Note the http2 error: it appeared only once and was never reported in any other run, and it did not break the build in this instance.)
PS> docker compose build --pull <redacted>
2023/11/07 13:31:30 http2: server: error reading preface from client dummy-1: read |0: file already closed
[+] Building 1.4s (17/17) FINISHED
=> [<redacted> internal] load build definition from Dockerfile 0.2s
=> => transferring dockerfile: 32B 0.0s
=> [<redacted> internal] load .dockerignore 0.2s
=> => transferring context: 2B 0.0s
=> [<redacted> internal] load metadata for docker.io/library/ubuntu:20.04 1.1s
=> [<redacted> internal] load build context 0.0s
=> => transferring context: 728B 0.0s
=> [<redacted> 1/12] FROM docker.io/library/ubuntu:20.04@sha256:ed4a42283d9943135ed87d4ee34e542f7f5ad9ecf2f244870e23122f703f91c2 0.0s
=> CACHED [<redacted> 2/12] COPY 99fixbadproxy /etc/apt/apt.conf.d/ 0.0s
=> CACHED [<redacted> 3/12] COPY ca.pem /usr/local/share/ca-certificates/<redacted>.crt 0.0s
=> CACHED [<redacted> 4/12] RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends apt-utils && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends apt-transport-https ca-certifi 0.0s
=> CACHED [<redacted> 5/12] RUN curl -sS https://packages.microsoft.com/keys/microsoft.asc | apt-key add - && curl -sS -o /etc/apt/sources.list.d/msprod.list https://packages.microsoft.com/config/ubuntu/20.04/prod.list && apt-get update && 0.0s
=> CACHED [<redacted> 6/12] RUN curl -sS https://packages.elastic.co/GPG-KEY-elasticsearch | apt-key add - && echo 'deb [arch=amd64] https://packages.elastic.co/curator/5/debian9 stable main' >/etc/apt/sources.list.d/curator.list && apt-get 0.0s
=> CACHED [<redacted> 7/12] RUN curl -o /usr/local/bin/mc https://dl.min.io/client/mc/release/linux-amd64/mc && chmod +x /usr/local/bin/mc 0.0s
=> CACHED [<redacted> 8/12] COPY scripts/* /usr/local/bin/ 0.0s
=> CACHED [<redacted> 9/12] COPY crontab /etc/cron.d/ 0.0s
=> CACHED [<redacted> 10/12] COPY curator/* /root/.curator/ 0.0s
=> CACHED [<redacted> 11/12] COPY sqlserver/* /root/.sqlserver/ 0.0s
=> CACHED [<redacted> 12/12] RUN dos2unix /usr/local/bin/* /etc/cron.d/crontab && chmod +x /usr/local/bin/* && chmod -R 644 /etc/cron.d/crontab /root/.curator/ /root/.sqlserver/ 0.0s
=> [<redacted>] exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:58fc911450050f9e4c1fdced511c41a3348af897df2714556a8351c91b5e84a0 0.0s
=> => naming to <redacted>.azurecr.io/<redacted>
The problem seems related to BuildKit/buildx. I have also tried upgrading compose to 2.23.0, the Docker CLI to 24.0.6 and the buildx plugin to 0.11.2, but the result is exactly the same.
Can you successfully build the same image using docker buildx build ...?
With CLI v20.10.23 and its included buildx v0.10.0-docker, docker buildx build . successfully builds the image.
With CLI v24.0.6 and buildx plugin v0.11.2, it also successfully builds the image.
Release v2.23.3 includes an updated version of buildx. Would you have a chance to give it a try?
Thank you for the update. Unfortunately, the behavior is exactly the same, with the same timings:
[+] Building 0.0s (0/0) docker:default
listing workers for Build: failed to list workers: Unavailable: connection error: desc = "transport: failed to write client preface: write |1: file already closed"
(exit code is 17)
Is there any way to enable "debug" mode for compose like I can do for docker CLI (with --debug and --log-level=debug)?
Same experience here. What is the latest working version?
@d4rkmen I am using 2.20.0, which works.
Any news on this? Our clients have to downgrade to Docker Desktop 4.21.0 to be able to build Docker images on remotes, which is very inconvenient.
I rolled back to 4.20.1 and can confirm the issue does not occur on that version.
@ndeloof Is there any possibility of resolving this issue? 2.26.0 (with Docker CLI 26.0.0) has the exact same behavior. We can't upgrade from 2.20.0 because any newer version breaks building from a Windows CLI to a Linux Engine.
I have no idea what's wrong here. Compose fully delegates the build to buildx (vendored inside the binary) and has no control over the communication between the client and the (remote) builder.
Oh that makes sense.
I tried manually building via buildx 0.13.1 with docker buildx build . and it now behaves exactly like compose does. That's not what I expected, but I probably missed something last time.
I will open an issue in the buildx repository, thank you.
Edit: for anybody following this issue, I have opened https://github.com/docker/buildx/issues/2356. I also tried buildx 0.11.2 again, and it does actually work.