[BUG] Docker Compose V2 build stuck forever on Windows 10
Hi
Original issue is here: https://github.com/docker/compose/issues/10229
We have a very simple Docker Compose configuration that built successfully until the last upgrade to Docker 4.16.3. Since then, the Docker Compose build gets stuck forever, so we have to restart the computer and sometimes remove an empty meta.json file before Docker Compose starts working properly again. The last lines of the docker-compose build output are:
exporting to docker image format
sending tarball
Here are the logs from \docker-desktop-data\data\docker\containers\:
{"log":"time=\"2023-02-02T09:40:22Z\" level=info msg=\"auto snapshotter: using overlayfs\"\n","stream":"stderr","time":"2023-02-02T09:40:22.9837146Z"}
{"log":"time=\"2023-02-02T09:40:22Z\" level=warning msg=\"using host network as the default\"\n","stream":"stderr","time":"2023-02-02T09:40:22.9840142Z"}
{"log":"time=\"2023-02-02T09:40:23Z\" level=info msg=\"found worker \\\"6j5qk61a44tlxwbtilho9kq7k\\\", labels=map[org.mobyproject.buildkit.worker.executor:oci org.mobyproject.buildkit.worker.hostname:1a1de79a9a18 org.mobyproject.buildkit.worker.network:host org.mobyproject.buildkit.worker.oci.process-mode:sandbox org.mobyproject.buildkit.worker.selinux.enabled:false org.mobyproject.buildkit.worker.snapshotter:overlayfs], platforms=[linux/amd64 linux/amd64/v2 linux/amd64/v3 linux/arm64 linux/riscv64 linux/ppc64le linux/s390x linux/386 linux/mips64le linux/mips64 linux/arm/v7 linux/arm/v6]\"\n","stream":"stderr","time":"2023-02-02T09:40:23.0053227Z"}
{"log":"time=\"2023-02-02T09:40:23Z\" level=warning msg=\"skipping containerd worker, as \\\"/run/containerd/containerd.sock\\\" does not exist\"\n","stream":"stderr","time":"2023-02-02T09:40:23.0230857Z"}
{"log":"time=\"2023-02-02T09:40:23Z\" level=info msg=\"found 1 workers, default=\\\"6j5qk61a44tlxwbtilho9kq7k\\\"\"\n","stream":"stderr","time":"2023-02-02T09:40:23.023096Z"}
{"log":"time=\"2023-02-02T09:40:23Z\" level=warning msg=\"currently, only the default worker can be used.\"\n","stream":"stderr","time":"2023-02-02T09:40:23.0230991Z"}
{"log":"time=\"2023-02-02T09:40:23Z\" level=info msg=\"running server on /run/buildkit/buildkitd.sock\"\n","stream":"stderr","time":"2023-02-02T09:40:23.0272842Z"}
{"log":"time=\"2023-02-02T09:40:58Z\" level=warning msg=\"healthcheck failed\" actualDuration=30.0016861s spanID=75bd469047542c05 timeout=30s traceID=52e529792d9014728d75ee03a49545f2\n","stream":"stderr","time":"2023-02-02T09:40:58.2677984Z"}
{"log":"time=\"2023-02-02T09:41:43Z\" level=error msg=\"healthcheck failed fatally\" spanID=75bd469047542c05 traceID=52e529792d9014728d75ee03a49545f2\n","stream":"stderr","time":"2023-02-02T09:41:43.2710605Z"}
{"log":"time=\"2023-02-02T09:41:43Z\" level=error msg=\"/moby.buildkit.v1.Control/Solve returned error: rpc error: code = Canceled desc = failed to copy to tar: rpc error: code = Canceled desc = grpc: the client connection is closing\"\n","stream":"stderr","time":"2023-02-02T09:41:43.3244673Z"}
{"log":"time=\"2023-02-02T09:49:32Z\" level=info msg=\"auto snapshotter: using overlayfs\"\n","stream":"stderr","time":"2023-02-02T09:49:32.1314603Z"}
{"log":"time=\"2023-02-02T09:49:32Z\" level=warning msg=\"using host network as the default\"\n","stream":"stderr","time":"2023-02-02T09:49:32.1317857Z"}
{"log":"time=\"2023-02-02T09:49:32Z\" level=info msg=\"found worker \\\"6j5qk61a44tlxwbtilho9kq7k\\\", labels=map[org.mobyproject.buildkit.worker.executor:oci org.mobyproject.buildkit.worker.hostname:1a1de79a9a18 org.mobyproject.buildkit.worker.network:host org.mobyproject.buildkit.worker.oci.process-mode:sandbox org.mobyproject.buildkit.worker.selinux.enabled:false org.mobyproject.buildkit.worker.snapshotter:overlayfs], platforms=[linux/amd64 linux/amd64/v2 linux/amd64/v3 linux/arm64 linux/riscv64 linux/ppc64le linux/s390x linux/386 linux/mips64le linux/mips64 linux/arm/v7 linux/arm/v6]\"\n","stream":"stderr","time":"2023-02-02T09:49:32.1665478Z"}
{"log":"time=\"2023-02-02T09:49:32Z\" level=warning msg=\"skipping containerd worker, as \\\"/run/containerd/containerd.sock\\\" does not exist\"\n","stream":"stderr","time":"2023-02-02T09:49:32.1850306Z"}
{"log":"time=\"2023-02-02T09:49:32Z\" level=info msg=\"found 1 workers, default=\\\"6j5qk61a44tlxwbtilho9kq7k\\\"\"\n","stream":"stderr","time":"2023-02-02T09:49:32.1850409Z"}
{"log":"time=\"2023-02-02T09:49:32Z\" level=warning msg=\"currently, only the default worker can be used.\"\n","stream":"stderr","time":"2023-02-02T09:49:32.1850431Z"}
{"log":"time=\"2023-02-02T09:49:32Z\" level=info msg=\"running server on /run/buildkit/buildkitd.sock\"\n","stream":"stderr","time":"2023-02-02T09:49:32.190224Z"}
{"log":"time=\"2023-02-02T09:50:52Z\" level=warning msg=\"healthcheck failed\" actualDuration=30.0118699s spanID=691b5971458139c1 timeout=30s traceID=01eb75694fd74c3cb829c0ad33782082\n","stream":"stderr","time":"2023-02-02T09:50:52.1106941Z"}
{"log":"time=\"2023-02-02T09:51:37Z\" level=error msg=\"healthcheck failed fatally\" spanID=691b5971458139c1 traceID=01eb75694fd74c3cb829c0ad33782082\n","stream":"stderr","time":"2023-02-02T09:51:37.1293957Z"}
{"log":"time=\"2023-02-02T09:51:37Z\" level=error msg=\"/moby.buildkit.v1.Control/Solve returned error: rpc error: code = Canceled desc = failed to copy to tar: rpc error: code = Canceled desc = grpc: the client connection is closing\"\n","stream":"stderr","time":"2023-02-02T09:51:37.1719046Z"}
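For anyone who wants to condense logs like the ones above, here is a minimal Python sketch that keeps only the warning and error entries. It assumes the files use Docker's json-file format with a logrus-style `level=... msg="..."` text inside the `log` field, as in the excerpt. The function names `parse_entry` and `problems` are made up for this example and are not part of any Docker tooling.

```python
import json
import re

# Patterns for the logrus-style text carried inside the "log" field.
LEVEL_RE = re.compile(r'level=(\w+)')
MSG_RE = re.compile(r'msg="((?:[^"\\]|\\.)*)"')


def parse_entry(raw_line):
    """Decode one json-file record and extract its timestamp, level and message."""
    record = json.loads(raw_line)
    text = record["log"]
    level = LEVEL_RE.search(text)
    msg = MSG_RE.search(text)
    return {
        "time": record["time"],
        "level": level.group(1) if level else "unknown",
        # Unescape quotes that were escaped inside the msg="..." value.
        "msg": msg.group(1).replace('\\"', '"') if msg else text.strip(),
    }


def problems(lines, levels=("warning", "error")):
    """Return only the entries whose level is in `levels`, skipping blank lines."""
    entries = (parse_entry(line) for line in lines if line.strip())
    return [entry for entry in entries if entry["level"] in levels]
```

Calling `problems(open(path))` on one of the container log files (with `path` pointing at whichever file you are inspecting) should make the sequence easier to see: a "healthcheck failed" warning, then "healthcheck failed fatally", then the canceled Solve call.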
Steps To Reproduce
Here's docker-compose.yml:
---
version: '3.8'
services:
  kafka:
    build:
      context: .
And Dockerfile:
FROM confluentinc/cp-kafka:7.3.1
USER root
RUN echo "kafka-storage format --ignore-formatted -t $(kafka-storage random-uuid) -c /etc/kafka/kafka.properties" >> /etc/confluent/docker/ensure
We run the command "docker-compose build" or "docker compose build". Interestingly, if we remove the "USER root" line, the build succeeds. Also, if we just execute "docker build .", the build succeeds as well.
Compose Version
2.15.1
Docker Environment
Client:
 Context: default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc., v0.10.0)
  compose: Docker Compose (Docker Inc., v2.15.1)
  dev: Docker Dev Environments (Docker Inc., v0.0.5)
  extension: Manages Docker extensions (Docker Inc., v0.2.17)
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc., 0.6.0)
  scan: Docker Scan (Docker Inc., v0.23.0)
Server:
 Containers: 10
  Running: 1
  Paused: 0
  Stopped: 9
 Images: 16
 Server Version: 20.10.22
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 9ba4b250366a5ddde94bb7c9d1def331423aa323
 runc version: v1.1.4-0-g5fd4c4d
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.4.72-microsoft-standard-WSL2
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 16
 Total Memory: 25GiB
 Name: docker-desktop
 ID: IIT3:Q5TA:JKXM:A4R3:E7AA:TCQW:PMEA:QYVB:BG57:VXSS:Q6FT:PII4
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  hubproxy.docker.internal:5000
  127.0.0.0/8
 Live Restore Enabled: false
Hi,
I think this is not only related to Windows. I have a docker-compose project on a Linux system that fails to build as well. In my case it seems to be because I had configured a buildx builder earlier (i.e. for Docker versions <23.0.0) and used it with the docker buildx build command. With the latest Docker release (23.0.0), buildx became the default builder, and this apparently affects compose as well:
$ docker compose build
[+] Building 0.0s (0/0)
no valid drivers found: error during connect: Get "http://docker.example.com/v1.24/info": command [ssh -- 192.168.40.8 docker system dial-stdio] has exited with exit status 255, please make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=ssh: connect to host 192.168.40.8 port 22: Connection timed out
192.168.40.8 is one of my Docker VMs used for multi-arch builds with buildx, and it was powered off at the time.
With the VM up and running, the container image does get built, but the build happens on the remote machine and the result is transferred back to the system where docker compose build was called.
I can restore the old behaviour by simply setting DOCKER_BUILDKIT=0, but maybe there is some other, cleaner way.
~~Please advise if this should be reported as a separate case (or if it should be considered a 'bug' at all).~~
Edit: It was enough to switch buildx back to the default builder, something that I did not need to do in the past:
docker buildx use default
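To check whether a machine is in the same situation before changing anything, a rough Python sketch like the one below can report which builder buildx currently has selected. It assumes that `docker buildx ls` marks the selected builder with a `*` after its name at the start of an unindented row. The exact column layout differs between buildx versions, so treat the parsing as an assumption. `current_builder` and `selected_builder` are names invented for this example.

```python
import re
import subprocess

# A top-level (unindented) row whose name is followed by "*",
# e.g. "default *" or "default*". Indented node rows are ignored.
CURRENT_RE = re.compile(r'^([^\s*]+)[ \t]*\*', re.MULTILINE)


def current_builder(ls_output):
    """Return the builder name marked with '*' in `docker buildx ls` output, or None."""
    match = CURRENT_RE.search(ls_output)
    return match.group(1) if match else None


def selected_builder():
    """Run `docker buildx ls` and return the currently selected builder name."""
    result = subprocess.run(
        ["docker", "buildx", "ls"],
        capture_output=True,
        text=True,
        check=True,
    )
    return current_builder(result.stdout)
```

If `selected_builder()` returns something other than `default`, compose builds will probably go through that builder, and the `docker buildx use default` command above switches back.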
The same issue was faced after a clean Ubuntu setup on two different devices at the same time.
Same issue using docker-compose with Podman on Windows/WSL2 when DOCKER_BUILDKIT=1 (buildx).
Same issue here on Windows and multiple machines