compose up creates duplicated image ids when using buildkit
Description
I have a compose file with several similar-looking dockerfiles to build:
compose.yaml:
services:
  data:
    build:
      context: .
      dockerfile: ./internal/data/Dockerfile
  auth:
    build:
      context: .
      dockerfile: ./internal/auth/Dockerfile
  ...
any Dockerfile (e.g. ./internal/data/Dockerfile):
FROM golang:1.18.3-alpine
WORKDIR /go/src/app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN go build -o app ./internal/data/
CMD ["./app"]
Running docker-compose up with this configuration occasionally produces images with duplicate image IDs, causing containers to run with the wrong image. For example, docker image ls produces:
REPOSITORY   TAG      IMAGE ID       CREATED          SIZE
api_auth     latest   93139699b227   23 minutes ago   644MB
api_data     latest   93139699b227   23 minutes ago   644MB
...
This does not seem to be a problem if I turn off buildkit (i.e. DOCKER_BUILDKIT=0 docker-compose up). For reference, the call site can be found here.
Output of docker compose version:
Docker Compose version 2.6.0
Output of docker info:
Client:
 Context: default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc., v0.8.2-docker)
  compose: Docker Compose (Docker Inc., 2.6.0)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 20.10.17
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: false
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1.m
 runc version:
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.18.6-arch1-1
 Operating System: Arch Linux
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.3GiB
 Name: pzpz
 ID: ABGC:NYKQ:46TD:XN62:7BCX:ZAJI:PQBX:IHOU:MIYJ:QWIN:PYTY:L4MJ
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
I have also seen this behavior, but it was hard to reproduce. Disabling buildkit for the project fixed the issue for me. Notably, I was using docker-compose v1, not v2. Seems like a buildkit issue.
Also seeing the same behavior, but as the others have said, it only seems to happen intermittently. We have this happening in a CI pipeline where the environment should be identical from run to run, and when it happens we can usually fix it just by rerunning the pipeline with zero changes.
The workaround I came up with before finding this thread was to split the CI script into a separate docker compose build [service] step for each service in the compose file.
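Roughly, the CI script ends up looking like the sketch below (the service names data and auth are the ones from the compose file above; substitute your own, and --no-build is just there to stop up from re-triggering the builder):

# build each service in its own invocation instead of one combined build
docker compose build data
docker compose build auth
# ...one build per remaining service...
# then start everything without rebuilding
docker compose up -d --no-build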
Worth highlighting that, as far as I can tell, we are NOT using buildkit, so it seems to be an issue with docker compose build itself. (Unless the documentation is wrong and buildkit has recently been made the default despite no DOCKER_BUILDKIT env var and no features.buildkit in daemon.json. Is there some way for me to tell from the build command's output whether buildkit is being used?)
EDIT: Disregard this; buildkit is the default, as @ndeloof says below, and DOCKER_BUILDKIT=0 fixes the issue.
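For anyone else wondering, the build output is the quickest way to tell which builder ran; a rough sketch, using the data service from the compose file above:

# BuildKit output typically starts with "[+] Building ..." and numbered "#N ..." steps,
# while the classic builder prints "Sending build context to Docker daemon" and "Step N/M" lines.
docker compose build data

# force the classic builder for a single run, as mentioned above
DOCKER_BUILDKIT=0 docker compose build data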
buildkit has been made the default builder (see https://github.com/docker/docs/pull/12505)
The IMAGE ID reported by docker image ls is actually truncated to 12 characters, which may let you think those are the same IDs; you would actually need to run docker image inspect alpine -f '{{.ID}}' to get the full image ID and confirm a collision.
This was definitely not the issue - we noticed this was happening because one of our frontend apps was running the container image of a different app. (I.e., as described in the original report above, containers are running with the wrong image.)
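For completeness, comparing the full IDs and the image each running container points at rules out truncation; a quick sketch using the image names from the original report (api_auth, api_data):

# full, untruncated image IDs - a match here is a real collision
docker image inspect api_auth api_data -f '{{.RepoTags}} {{.ID}}'

# which image each compose container was actually created from
docker compose ps -q | xargs docker inspect -f '{{.Name}} {{.Image}}'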