buildkit
Building with cache-from and BUILDKIT_INLINE_CACHE args breaks reproducible docker builds
When we build with --cache-from and the BUILDKIT_INLINE_CACHE=1 build arg, our Docker builds are no longer reproducible: the image ID changes with every build even when there are no content changes. If the BUILDKIT_INLINE_CACHE=1 build arg is removed, builds are reproducible again and the image ID remains constant. A reproduction script is included below. Slack thread here
✗ docker version
Client: Docker Engine - Community
Version: 19.03.8
API version: 1.40
Go version: go1.12.17
Git commit: afacb8b
Built: Wed Mar 11 01:21:11 2020
OS/Arch: darwin/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.8
API version: 1.40 (minimum version 1.12)
Go version: go1.12.17
Git commit: afacb8b
Built: Wed Mar 11 01:29:16 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: v1.2.13
GitCommit: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc:
Version: 1.0.0-rc10
GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
docker-init:
Version: 0.18.0
GitCommit: fec3683
The reproduction bash script below takes a remote registry repo as its argument.
#!/bin/bash
set -exv
[ -z "$1" ] && echo "No remote registry and repository argument specified, exiting" && exit 1
REGISTRY_REPO=$1
length=${#REGISTRY_REPO}
last_char=${REGISTRY_REPO:length-1:1}
[[ $last_char == "/" ]] && REGISTRY_REPO=${REGISTRY_REPO:0:length-1}
echo "Using remote registry and repo:" $REGISTRY_REPO
# Quote the delimiter so $PATH is written literally instead of being expanded by the host shell
cat > /tmp/Dockerfile.test <<'EOF'
FROM python:3.8.2-buster
WORKDIR /opt/bin
ENV PATH "/opt/bin:$PATH"
WORKDIR /
RUN echo "deb http://deb.debian.org/debian buster-backports main" >> /etc/apt/sources.list
RUN apt-get update && \
apt-get -t buster-backports install -y --no-install-recommends etcd-client bison flex graphviz graphviz-dev protobuf-compiler libprotobuf-dev libprotoc-dev golang-go && \
rm -rf /var/lib/apt/lists/*
RUN go get github.com/golang/protobuf/protoc-gen-go
RUN go get google.golang.org/grpc/cmd/protoc-gen-go-grpc
RUN pip install -U pip
RUN pip install 'poetry==1.1.2'
WORKDIR /src
EOF
cd /tmp/
DOCKER_BUILDKIT=1 docker build --build-arg BUILDKIT_INLINE_CACHE=1 -t ${REGISTRY_REPO}/bug-repro-test:latest -f Dockerfile.test .
docker push ${REGISTRY_REPO}/bug-repro-test:latest
# Now delete the image we built for our test
docker rmi ${REGISTRY_REPO}/bug-repro-test:latest
# Let's build using cache-from and verify whether we have reproducible builds
DOCKER_BUILDKIT=1 docker build --cache-from=${REGISTRY_REPO}/bug-repro-test:latest -t ${REGISTRY_REPO}/bug-repro-test:check -f Dockerfile.test .
imageid1="$(docker images --format "{{.ID}}" "${REGISTRY_REPO}/bug-repro-test:check")"
DOCKER_BUILDKIT=1 docker build --cache-from=${REGISTRY_REPO}/bug-repro-test:latest -t ${REGISTRY_REPO}/bug-repro-test:check -f Dockerfile.test .
imageid2="$(docker images --format "{{.ID}}" "${REGISTRY_REPO}/bug-repro-test:check")"
if [[ "$imageid1" != "$imageid2" ]];
then
echo "Image IDs don't match, builds using cache-from are not reproducible"
fi
# Let's build using cache-from with BUILDKIT_INLINE_CACHE=1 and verify whether we have reproducible builds
DOCKER_BUILDKIT=1 docker build --build-arg BUILDKIT_INLINE_CACHE=1 --cache-from=${REGISTRY_REPO}/bug-repro-test:latest -t ${REGISTRY_REPO}/bug-repro-test:check -f Dockerfile.test .
imageid1="$(docker images --format "{{.ID}}" "${REGISTRY_REPO}/bug-repro-test:check")"
DOCKER_BUILDKIT=1 docker build --build-arg BUILDKIT_INLINE_CACHE=1 --cache-from=${REGISTRY_REPO}/bug-repro-test:latest -t ${REGISTRY_REPO}/bug-repro-test:check -f Dockerfile.test .
imageid2="$(docker images --format "{{.ID}}" "${REGISTRY_REPO}/bug-repro-test:check")"
if [[ "$imageid1" != "$imageid2" ]];
then
echo "Image IDs don't match, builds using cache-from and BUILDKIT_INLINE_CACHE flag are not reproducible"
fi
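The two build-and-compare steps in the script can be generalized to more than two runs. The helper below is only a sketch: the name count_distinct_outputs is hypothetical, and it assumes the command passed to it prints exactly one image ID per run.

```shell
# count_distinct_outputs RUNS CMD [ARGS...]
# Runs CMD the given number of times and prints how many distinct outputs it
# produced. A result of 1 means every run printed the same value.
count_distinct_outputs() {
  local runs=$1
  shift
  local i
  for ((i = 0; i < runs; i++)); do
    "$@"
  done | sort -u | wc -l | tr -d ' '
}
```

With a wrapper function that runs one of the docker build commands above and then prints the ID via docker images --format "{{.ID}}", a result greater than 1 would indicate the same non-reproducibility that the script reports.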
We're seeing this as well and would love a fix!
Yes, including the BUILDKIT_INLINE_CACHE=1 flag produces a different docker image hash on each build (even when there are no content changes).
Works as expected without this flag enabled.
I did some more investigation on this and found the following. The only config change between two builds with BUILDKIT_INLINE_CACHE=1 is to the moby.buildkit.cache.v0 key.
When I decode that key (base64) and format it with jq -S on both images, I find that the only differences are small ones like this:
First image cache:
[
  {
    "digest": "sha256:807b3e8722c68572736f54d83fd741a5b1d69ccbef218fae52b4149a56d232ce",
    "inputs": [
      [
        {
          "link": 3
        }
      ]
    ]
  },
  {
    "digest": "sha256:807b3e8722c68572736f54d83fd741a5b1d69ccbef218fae52b4149a56d232ce",
    "inputs": [
      [
        {
          "link": 4
        }
      ]
    ]
  }
]
Second image cache:
[
  {
    "digest": "sha256:807b3e8722c68572736f54d83fd741a5b1d69ccbef218fae52b4149a56d232ce",
    "inputs": [
      [
        {
          "link": 4
        }
      ]
    ]
  },
  {
    "digest": "sha256:807b3e8722c68572736f54d83fd741a5b1d69ccbef218fae52b4149a56d232ce",
    "inputs": [
      [
        {
          "link": 3
        }
      ]
    ]
  }
]
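For anyone who wants to repeat the comparison, the decoding step can be sketched roughly as follows. This assumes the image config JSON has already been extracted to a file (for example from a docker save archive) and that moby.buildkit.cache.v0 is stored there as a top-level base64 string; the function name decode_cache_key is hypothetical.

```shell
# decode_cache_key CONFIG_FILE
# Prints the decoded moby.buildkit.cache.v0 value with object keys sorted.
# Assumption: the key is a top-level base64 string in the image config JSON.
decode_cache_key() {
  local config_file=$1
  jq -r '."moby.buildkit.cache.v0"' "$config_file" | base64 --decode | jq -S .
}
```

Something like diff <(decode_cache_key first-config.json) <(decode_cache_key second-config.json) should then show fragments like the ones above; both file names are placeholders.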
I notice there is a method called sortConfig which is supposed to ensure a deterministic sort of these caches (probably to avoid this type of problem): https://github.com/moby/buildkit/blob/46c8b9ee45d0f91aa935d69c53d8b25ed07fcf97/cache/remotecache/v1/utils.go#L16
However, I don't know enough about the file format to know what the right fix is. There are some entries in the files that look like this:
{
  "digest": "sha256:c963489980ecadec4c2b06eb21b9d6d981669cb00c825aaf97831a79c5a4a5b5",
  "inputs": [
    [
      {
        "link": 1
      },
      {
        "link": 2
      }
    ]
  ]
}
Is it valid to normalize those entries into a single object?
Attaching both buildkit cache files in their entirety in case this helps.
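One way to check whether two decoded caches differ only in the order of their entries, as in the first two fragments above, is to sort the entries before diffing. The sketch below is a diagnostic only, not a proposed fix: it assumes each file holds a JSON array of entries with digest and inputs fields, the name canonical_records is hypothetical, and if the link values are positional references into the array, then the reordered output is not itself a valid cache.

```shell
# canonical_records FILE
# Prints the JSON array in FILE with object keys sorted and with entries
# ordered by digest, then by the serialized inputs field.
# Diagnostic only: reordering may invalidate positional references.
canonical_records() {
  jq -S 'sort_by([.digest, (.inputs | tojson)])' "$1"
}
```

If diff <(canonical_records first.json) <(canonical_records second.json) prints nothing while a plain diff of the two files does, the caches contain the same entries in a different order. Both file names are placeholders.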