compose up creates duplicated image ids when using buildkit

Open pohzipohzi opened this issue 3 years ago • 3 comments

Description

I have a compose file with several similar-looking dockerfiles to build:

compose.yaml:

services:
  data:
    build:
      context: .
      dockerfile: ./internal/data/Dockerfile
  auth:
    build:
      context: .
      dockerfile: ./internal/auth/Dockerfile
...

Any Dockerfile (e.g. ./internal/data/Dockerfile):

FROM golang:1.18.3-alpine

WORKDIR /go/src/app

COPY go.mod go.sum ./
RUN go mod download

COPY . .

RUN go build -o app ./internal/data/

CMD ["./app"]

Running docker-compose up with this configuration occasionally produces images with duplicate image ids, causing containers to run with the wrong image. For example, docker image ls produces:

REPOSITORY   TAG       IMAGE ID       CREATED          SIZE
api_auth     latest    93139699b227   23 minutes ago   644MB
api_data     latest    93139699b227   23 minutes ago   644MB
...

This does not seem to be a problem if I turn off BuildKit (i.e. DOCKER_BUILDKIT=0 docker-compose up). For reference, the callsite can be found here.

Output of docker compose version:

Docker Compose version 2.6.0

Output of docker info:

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc., v0.8.2-docker)
  compose: Docker Compose (Docker Inc., 2.6.0)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 20.10.17
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: false
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1.m
 runc version: 
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.18.6-arch1-1
 Operating System: Arch Linux
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.3GiB
 Name: pzpz
 ID: ABGC:NYKQ:46TD:XN62:7BCX:ZAJI:PQBX:IHOU:MIYJ:QWIN:PYTY:L4MJ
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

pohzipohzi, Jun 25 '22 20:06

I have also seen this behavior, but it was hard to reproduce. Disabling buildkit for the project fixed the issue for me. Notably I was using docker-compose v1, not v2. Seems like a buildkit issue.

mshade, Jun 27 '22 15:06

Also seeing the same behavior, but as the others have said it seems to only happen intermittently/occasionally. We have this happening in a CI pipeline where the environment should be the same from run to run, and when it happens we seem to be able to fix it by just rerunning the pipeline with zero changes.

The workaround I came up with before finding this thread was to split the CI script into a separate docker compose build [service] invocation for each service in the compose file.

jon-v, Aug 05 '22 15:08
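
The per-service workaround described above could be scripted roughly as follows. This is a minimal sketch, not the commenter's actual CI script: the function name build_each_service is hypothetical, and it assumes service names contain no whitespace. It relies on docker compose config --services to list the services in the compose file.

```shell
# Hypothetical helper: build every service in its own
# `docker compose build` invocation instead of one combined build.
build_each_service() {
  # List the service names defined in the compose file, one per line.
  services=$(docker compose config --services) || return 1
  for svc in $services; do
    echo "building $svc"
    # Stop at the first failing build so CI reports the error.
    docker compose build "$svc" || return 1
  done
}
```

Calling build_each_service from the directory containing compose.yaml, followed by docker compose up, would then start the containers from the separately built images.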

Worth highlighting that as far as I can tell we are NOT using buildkit, so it seems to be an issue with docker compose build itself. (Unless the documentation is wrong and buildkit has recently been made the default despite no DOCKER_BUILDKIT env var and no features.buildkit in daemon.json. Is there some way for me to tell from the build command's output whether buildkit is being used?)

EDIT: Disregard this, buildkit is default as @ndeloof says below and DOCKER_BUILDKIT=0 fixes the issue.

jon-v, Aug 05 '22 15:08
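
On the question of telling from the build output which builder ran: one rough heuristic is that the legacy builder typically prints lines such as "Step 1/7 : FROM ...", while BuildKit output typically contains "[internal] load build definition". The sketch below classifies captured output on that basis. The function name detect_builder is hypothetical, and the patterns are assumptions about typical output that may differ between versions and progress modes.

```shell
# Hypothetical heuristic: read captured build output on stdin and print
# "legacy", "buildkit", or "unknown".
detect_builder() {
  log=$(cat)
  if printf '%s\n' "$log" | grep -Eq 'Step [0-9]+/[0-9]+ : '; then
    # The legacy builder numbers each Dockerfile instruction.
    echo legacy
  elif printf '%s\n' "$log" | grep -q 'internal\] load build definition'; then
    # BuildKit reports internal steps such as loading the build definition.
    echo buildkit
  else
    echo unknown
  fi
}

# Possible usage (plain progress output is easier to match):
#   docker compose build --progress plain 2>&1 | detect_builder
```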

buildkit has been made the default builder (see https://github.com/docker/docs/pull/12505)

ndeloof, Nov 29 '22 08:11

The IMAGE ID reported by docker image ls is truncated to 12 characters, which may lead you to think the IDs are identical. To confirm a collision you would need the full image ID, e.g. by running docker image inspect alpine -f '{{.ID}}'.

ndeloof, Nov 29 '22 08:11
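
A quick way to compare full, untruncated IDs across all local images is sketched below. The function name find_shared_image_ids is hypothetical; it assumes input lines of the form "repository:tag id", which docker image ls can produce with --no-trunc and --format. Note that a single image carrying several tags legitimately shares one ID, so a match is only suspicious when the names belong to different services.

```shell
# Hypothetical helper: read "name id" pairs on stdin and print every ID
# that is shared by more than one name, followed by those names.
find_shared_image_ids() {
  awk '
    {
      # Collect the names seen for each ID, in input order.
      names[$2] = (names[$2] == "" ? $1 : names[$2] " " $1)
      count[$2]++
    }
    END {
      for (id in count)
        if (count[id] > 1) print id ": " names[id]
    }
  ' | sort
}

# Possible usage:
#   docker image ls --no-trunc --format '{{.Repository}}:{{.Tag}} {{.ID}}' \
#     | find_shared_image_ids
```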

The IMAGE ID reported by docker image ls is truncated to 12 characters, which may lead you to think the IDs are identical. To confirm a collision you would need the full image ID, e.g. by running docker image inspect alpine -f '{{.ID}}'.

This was definitely not the issue. We noticed the problem because one of our frontend apps was running the container image of a different app (i.e. as described in the original report above, containers were running with the wrong image).

jon-v, Nov 29 '22 08:11