buildkit
Reproducibility is broken when re-building the exact same image multiple times because sometimes the `moby.buildkit.cache.v0` entry changes
The problem is that after a few rebuilds the base64-encoded inline cache value changes.
Nothing else changes except that the `chains` object is removed.
I can't explain why, but it's very easy to reproduce (on both the latest release and master of BuildKit):
`test/Dockerfile`, which just copies a single file:
FROM busybox@sha256:98de1ad411c6d08e50f26f392f3bc6cd65f686469b7c22a85c7b5fb1b820c154
# do some stupid copy
COPY --link --from=alpine@sha256:9b2a28eb47540823042a2ba401386845089bb7b62a9637d55816132c4c3c36eb /bin/ls /bin/ls
`reproduce.sh`:
img="ghcr.io/skirsten/tmp:test-loop-4" # Change or remove this image between tests
for j in {1..10}; do
  # Start every iteration from an empty local cache
  buildctl prune >/dev/null
  echo "building..."
  buildctl build --frontend dockerfile.v0 --local context=test --local dockerfile=test \
    --import-cache type=registry,ref="$img" \
    --export-cache type=inline \
    --output type=image,name="$img",push=true \
    --metadata-file metadata.json 2>/dev/null
  digest=$(jq -r '."containerimage.digest"' metadata.json)
  config_digest=$(jq -r '."containerimage.config.digest"' metadata.json)
  echo "digest: $digest"
  echo "config_digest: $config_digest"
  # Decode the inline cache entry and compare it with the previous build's
  crane config "$img@$digest" | jq -r '."moby.buildkit.cache.v0"' | base64 -d | jq . >new.json
  diff old.json new.json || true
  mv new.json old.json
  echo
  sleep 1
done
Output:
building...
digest: sha256:630b2de48a2e51079cec38c002f0c8bc3820b859557963c9fbc79e0d4697ecb1
config_digest: sha256:7231e8e691bad124e4d170998c4b566487a1c61deeffa4a74c5c441138ca640f
building...
digest: sha256:630b2de48a2e51079cec38c002f0c8bc3820b859557963c9fbc79e0d4697ecb1
config_digest: sha256:7231e8e691bad124e4d170998c4b566487a1c61deeffa4a74c5c441138ca640f
building...
digest: sha256:9c703d1e490f1e6dfbfde75e6be0a1ee50ea9de7072e10b422c331820b1c50fe
config_digest: sha256:d915739f055f10545f797706423c52f88b5cddd429cb40640b45a03c0f33b8e5
33,40d32
< "chains": [
< {
< "layers": [
< 1
< ],
< "createdAt": "0001-01-01T00:00:00Z"
< }
< ],
building...
digest: sha256:9c703d1e490f1e6dfbfde75e6be0a1ee50ea9de7072e10b422c331820b1c50fe
config_digest: sha256:d915739f055f10545f797706423c52f88b5cddd429cb40640b45a03c0f33b8e5
building...
digest: sha256:9c703d1e490f1e6dfbfde75e6be0a1ee50ea9de7072e10b422c331820b1c50fe
config_digest: sha256:d915739f055f10545f797706423c52f88b5cddd429cb40640b45a03c0f33b8e5
building...
digest: sha256:9c703d1e490f1e6dfbfde75e6be0a1ee50ea9de7072e10b422c331820b1c50fe
config_digest: sha256:d915739f055f10545f797706423c52f88b5cddd429cb40640b45a03c0f33b8e5
building...
digest: sha256:9c703d1e490f1e6dfbfde75e6be0a1ee50ea9de7072e10b422c331820b1c50fe
config_digest: sha256:d915739f055f10545f797706423c52f88b5cddd429cb40640b45a03c0f33b8e5
building...
digest: sha256:9c703d1e490f1e6dfbfde75e6be0a1ee50ea9de7072e10b422c331820b1c50fe
config_digest: sha256:d915739f055f10545f797706423c52f88b5cddd429cb40640b45a03c0f33b8e5
building...
digest: sha256:9c703d1e490f1e6dfbfde75e6be0a1ee50ea9de7072e10b422c331820b1c50fe
config_digest: sha256:d915739f055f10545f797706423c52f88b5cddd429cb40640b45a03c0f33b8e5
building...
digest: sha256:9c703d1e490f1e6dfbfde75e6be0a1ee50ea9de7072e10b422c331820b1c50fe
config_digest: sha256:d915739f055f10545f797706423c52f88b5cddd429cb40640b45a03c0f33b8e5
As can be seen, the third build removes the `chains` object from the base64-encoded cache entry. This changes the config digest, which in turn changes the image digest. The build number at which this happens seems random to me. Either way, it breaks reproducibility.
Pretty sure this is a bug. Any help is appreciated, thanks :)
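To illustrate why a change confined to the cache entry still ends up in the image digest, here is a minimal sketch of the content-addressing chain. The JSON blobs are hypothetical, heavily simplified stand-ins (not real BuildKit output), and the helper names `digest`, `make_config` and `make_manifest` are made up for this example. It only assumes `sha256sum` and `base64` are available.

```shell
set -eu

# sha256 of a string, formatted like an OCI digest.
digest() {
  printf 'sha256:%s' "$(printf '%s' "$1" | sha256sum | cut -d' ' -f1)"
}

# Hypothetical cache entries: identical except for the "chains" object.
cache_with_chains='{"records":[{"digest":"sha256:aaa","chains":[{"layers":[1]}]}]}'
cache_without_chains='{"records":[{"digest":"sha256:aaa"}]}'

# Toy image config that embeds the base64-encoded cache entry.
make_config() {
  printf '{"moby.buildkit.cache.v0":"%s"}' "$(printf '%s' "$1" | base64 | tr -d '\n')"
}

# Toy manifest that references the config by its digest.
make_manifest() {
  printf '{"config":{"digest":"%s"}}' "$1"
}

config_digest_a=$(digest "$(make_config "$cache_with_chains")")
config_digest_b=$(digest "$(make_config "$cache_without_chains")")

# The image digest is the digest of the manifest, which contains the config digest.
image_digest_a=$(digest "$(make_manifest "$config_digest_a")")
image_digest_b=$(digest "$(make_manifest "$config_digest_b")")

echo "config A: $config_digest_a"
echo "config B: $config_digest_b"
echo "image  A: $image_digest_a"
echo "image  B: $image_digest_b"
```

Because the manifest embeds the config digest, any byte-level change to the config, including one inside the cache entry, propagates to the image digest.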
We're running into a similar issue. The images are identical except for changes to the `moby.buildkit.cache.v0` entry. Here's the diff of the base64-decoded entry from both images:
https://gist.github.com/pkwarren/a3f20b1f409d52e66b26ee68011f990c
I updated the title, as this seems to be more generic and affects fields other than `chains` in the cache entry.
Given the random nature of this bug, I hope it's simply some kind of race condition (or missing sorting of async outputs) that can be fixed easily.
I unfortunately do not have the time to dig into the code and try to find it myself so any help is appreciated!
I may have a related (or the same) issue:
The image build for v2 using `--cache-from` v1 (previously built with inline cache) does not reuse the cache. The next build, v3, using v2 as the cache source, does reuse the cache. This repeats with every second image.
I found the difference in `moby.buildkit.cache.v0`:
The image that works as `--cache-from` has:
{
"layers": [
{
"layer": 12,
"createdAt": "2022-09-28T08:30:41.439375309Z"
}
],
"digest": "sha256:13c946961fa6795c59a5ec1f3ce23075ea8bd73056c4b837765a536fa85c7a92",
"inputs": [
[
{
"link": 6
}
],
[
{
"link": 19
}
]
]
},
...
The next image, which does not work as `--cache-from`, is missing the `layers` entries and has different numbers in the `link` values:
{
"digest": "sha256:13c946961fa6795c59a5ec1f3ce23075ea8bd73056c4b837765a536fa85c7a92",
"inputs": [
[
{
"link": 5
}
],
[
{
"link": 18
}
]
]
},
Diff:
< "layers": [
< {
< "layer": 12,
< "createdAt": "2022-09-28T08:30:41.439375309Z"
< }
< ],
13c7
< "link": 6
---
> "link": 5
18c12
< "link": 19
---
> "link": 18
28c22
< "link": 5
---
> "link": 4
33c27
< "link": 23
---
> "link": 22
39,44d32
< "layers": [
< {
< "layer": 5,
< "createdAt": "2022-09-28T08:30:33.097788126Z"
< }
< ],
While trying to reproduce it, I found that it only happens when the build args change. When the build args are the same, the cache is reused correctly.
Your image reproducibility shouldn't depend on whether you are using a cache or not. Instead, you should aim for reproducible images with `--no-cache`. Once you have that, it no longer matters whether later builds use a cache, because the resulting image will be reproducible either way. BuildKit now supports `SOURCE_DATE_EPOCH`. By combining it with multi-stage builds, `COPY --link`, `RUN find /dir/to/be/copied/into/image -print0 | xargs -0 touch --no-dereference --date="@${SOURCE_DATE_EPOCH}"`, and pinned apt/dnf/apk dependencies, it should be possible to create fully reproducible images that have the same image digest on every build.
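The timestamp normalization step mentioned above can be tried outside a build. The following is a minimal sketch, assuming GNU coreutils and findutils (`touch --no-dereference`, `stat -c`). The epoch value and the scratch directory (standing in for `/dir/to/be/copied/into/image`) are assumptions chosen for illustration.

```shell
set -eu

# Assumed value for illustration; in a real build it would usually be derived
# from something stable, such as the last commit time.
SOURCE_DATE_EPOCH=1700000000

# Scratch directory standing in for /dir/to/be/copied/into/image.
stage_dir=$(mktemp -d)
mkdir -p "$stage_dir/bin"
echo "hello" > "$stage_dir/bin/tool"
ln -s tool "$stage_dir/bin/tool-link"

# Set the mtime of every entry (symlinks themselves included) to the epoch.
find "$stage_dir" -print0 | xargs -0 touch --no-dereference --date="@${SOURCE_DATE_EPOCH}"

# Collect the distinct mtimes; a normalized tree has exactly one.
mtimes=$(find "$stage_dir" -exec stat -c '%Y' {} + | sort -u)
echo "distinct mtimes: $mtimes"

rm -rf "$stage_dir"
```

Inside a Dockerfile the same `find ... | xargs -0 touch ...` pipeline would run in a `RUN` step of the stage whose files are later copied into the final image, with `SOURCE_DATE_EPOCH` passed in as a build arg.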