Reproducible builds broken in 1.8.0
Actual behavior
Consider the Go program main.go and its corresponding Dockerfile (both listed below). Using kaniko version 1.7.0, two subsequent reproducible builds using the command listed below result, as expected, in two identical Docker images. With version 1.8.0, however, two subsequent builds are no longer identical.
Expected behavior
I expect two subsequent reproducible builds to result in identical images.
To Reproduce
Steps to reproduce the behavior:
- Build an image by running:
$ docker run -v $(pwd):/src --network=host gcr.io/kaniko-project/executor:v1.8.0 --reproducible --dockerfile /src/Dockerfile --no-push --tarPath /src/image-file-main-00.tar --destination main:00 --cache=false --context dir:///src/
- Build a second image by running:
$ docker run -v $(pwd):/src --network=host gcr.io/kaniko-project/executor:v1.8.0 --reproducible --dockerfile /src/Dockerfile --no-push --tarPath /src/image-file-main-01.tar --destination main:01 --cache=false --context dir:///src/
- Import both images by running:
$ cat image-file-main-00.tar | docker load
$ cat image-file-main-01.tar | docker load
- Compare the image IDs:
$ docker image ls main
REPOSITORY TAG IMAGE ID CREATED SIZE
main 00 e65d80240143 N/A 1.75MB
main 01 77fc4150ed91 N/A 1.75MB
The Go program is identical in both builds but the surrounding tar archive isn't. I compared the hexdump of the tar archive of both builds and noticed that there are atime and ctime fields that contain a Unix timestamp, which is the reason why the builds differ. Could this regression have been caused by ee95be1e?
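To make the comparison easier to repeat, here is a minimal Go sketch (my own helper, not part of kaniko) that walks a layer tar and prints each entry's header format, all three timestamps, and the raw PAX records; running it against the corresponding layer from each build shows where the archives diverge:

package main

import (
	"archive/tar"
	"fmt"
	"io"
	"log"
	"os"
)

func main() {
	// Open a layer tar extracted from the image tarball.
	f, err := os.Open(os.Args[1])
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	tr := tar.NewReader(f)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			log.Fatal(err)
		}
		// Format reveals whether the writer fell back to PAX; atime and
		// ctime are only representable in PAX or GNU headers.
		fmt.Printf("%s format=%v mtime=%v atime=%v ctime=%v pax=%v\n",
			hdr.Name, hdr.Format, hdr.ModTime, hdr.AccessTime, hdr.ChangeTime, hdr.PAXRecords)
	}
}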
Additional Information
- Dockerfile
FROM golang:1.18 as builder
WORKDIR /src
COPY main.go ./
RUN CGO_ENABLED=0 GO111MODULE=off go build -trimpath -o main
FROM scratch as artifact
COPY --from=builder /src/main /bin/
CMD [ "/bin/main" ]
- Build Context (main.go)
package main
import "fmt"
func main() {
fmt.Println("Hello!")
}
Triage Notes for the Maintainers
Description | Yes/No
---|---
Please check if this is a new feature you are proposing |
Please check if the build works in docker but not in kaniko |
Please check if this error is seen when you use the --cache flag |
Please check if your dockerfile is a multistage dockerfile |
Just to check, do you get reproducible builds if your builder is golang:1.17 or even :1.16? There were some build stamping changes in 1.18 that may be causing reproducibility to suffer.
No, the problem remains with a 1.17 builder.
My colleague and I stumbled on this exact issue and can now confirm: downgrading to kaniko v1.7.0 solved the problem, and the produced images had the same digest! So it's indeed a regression.
FWIW, I'll add another anecdata here. I'm testing kaniko at my workplace and had the exact same issue building a Golang 1.17 project. I could not figure out why kaniko's reproducible builds were not working, even though I verified that my Go binary was the same between builds. Downgrading to kaniko v1.7.0 fixed the issue. I could try to add some data here if that would be helpful, but my situation was pretty much the one described in the issue description.
Just encountered this today. We use kaniko --reproducible for our base images. Downgrading to kaniko 1.7.0 worked for this project.
I think #1809 causes this issue. I compared the layer tars produced by 1.7.0 and 1.8.1 (because I am an idiot) and the tar format (PAX header) changed between these versions: 1.8.1 contains ctime and atime, as @NullHypothesis noted, while the old version does not. Maybe --reproducible should switch back to the old format?
Narrowing down @suicide's comment: the --reproducible flag broke between v1.7.0 and v1.8.0, and the only change I saw in v1.8.0 was the atime/ctime change @NullHypothesis mentioned.
I can confirm that building 1.9.1 but with https://github.com/GoogleContainerTools/kaniko/pull/1809 reverted results in reproducible image builds working as expected.
I believe that the most sensible place to resolve this would be in the go-containerregistry project; I've opened an issue there.
Ran into this defect yesterday with 1.9.1 -- any known workarounds for now, short of downgrading to 1.7.0?
I've submitted a pull request to go-containerregistry based on @zx96's initial investigation.
I've confirmed locally that if I build kaniko against that commit, the bug vanishes. (Note that I also had to bump the version of cloud.google.com/go/storage to at least v1.27.0, not just the automatically resolved v1.21.1, before kaniko would compile again; this probably comes from jumping five minor versions in go-containerregistry.)
As soon as the PR is merged into the upstream project, I'll create a PR for kaniko.
That's fantastic, thanks @BronzeDeer and kudos to @zx96 as well.
Given that change, could we then avoid using --reproducible in kaniko, because PAX headers would be corrected in a way that consistently produces identical SHAs?
The reason I ask is that I'm trying to optimize our build process for thousands of Spring Boot apps that use a multi-stage Dockerfile to separate the app layer from the third-party jar layer, etc.
My concern with --reproducible is described very clearly in https://github.com/zx96/kaniko/commit/bad2f9433e766b009e5d70d86f0c3eaa8eddb3be and my goals are the same:
...issues with memory consumption (https://github.com/GoogleContainerTools/kaniko/issues/862) and build time (https://github.com/GoogleContainerTools/kaniko/issues/1960). Additionally, I didn't really care about reproducibility - I mainly cared about the layers having identical contents so Docker could skip pulling and storing redundant layers from a registry.
The bug fix makes --reproducible work as intended: it makes the layers identical by pinning the timestamps in the tar headers to a static time. The bug stemmed from the fact that if the PAX or GNU format was used for the underlying tars, the code needed to change not only the modified timestamp but also the access time and change time in the header. PAX tars are not reproducible by default; quite the opposite, they include extra time information that varies between builds, which broke the existing code's naive assumption that there is only one timestamp in a tar header.
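A minimal sketch of the idea behind the fix (the function name and details are mine, not the actual go-containerregistry change): pin mtime and zero out atime/ctime so the writer never needs to emit the PAX time records that varied between builds.

package repro

import (
	"archive/tar"
	"time"
)

// normalizeHeader rewrites a tar header so that only a single, static
// timestamp survives and the entry can be written as plain ustar.
func normalizeHeader(hdr *tar.Header) {
	hdr.ModTime = time.Unix(0, 0) // the one field the old code already reset
	hdr.AccessTime = time.Time{}  // zero value: no PAX atime record is written
	hdr.ChangeTime = time.Time{}  // zero value: no PAX ctime record is written
	hdr.Format = tar.FormatUSTAR  // plain ustar carries only mtime
}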
TL;DR: --reproducible will start producing layers with identical SHAs again, but you need to use the flag; without it you will get different SHAs even for the same content.
P.S. on caching: kaniko's own build-time caching seems mostly unaffected by the bug, since it caches based on the content of the Dockerfile rather than raw digests; varying digests only hinder external tools (which, sadly, also includes container runtimes pulling images).
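If you want to check your own builds, here is a small standalone sketch (again my own helper, not part of kaniko) that prints the SHA-256 of every entry in an image tarball; run it on both tarballs and diff the output:

package main

import (
	"archive/tar"
	"crypto/sha256"
	"fmt"
	"io"
	"log"
	"os"
)

func main() {
	f, err := os.Open(os.Args[1]) // e.g. image-file-main-00.tar
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	tr := tar.NewReader(f)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			log.Fatal(err)
		}
		// Hash each entry (config, manifest, layer tars) individually.
		h := sha256.New()
		if _, err := io.Copy(h, tr); err != nil {
			log.Fatal(err)
		}
		fmt.Printf("%x  %s\n", h.Sum(nil), hdr.Name)
	}
}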
Thanks for the clarity on --reproducible; looking forward to giving it a go after the fix.
I suppose the performance issues with this feature can be explored via https://github.com/GoogleContainerTools/kaniko/issues/862 and https://github.com/GoogleContainerTools/kaniko/issues/1960. I see a comment in #862 concerning the possible use of --compressed-caching=false?
On caching -- I've also struggled to get kaniko to generate matching SHAs for the cached COPY layers. I'm using the remote --cache-repo option, and I see the cache layers are published with just their contents (as you noted), but the SHAs change and I get misses at build time.