buildah
buildah copied to clipboard
Reduce --label arguments to a single layer
What type of PR is this?
/kind api-change
What this PR does / why we need it:
#3517 fixed #3370 by adding a step/layer for every --label
argument which matches the behavior of Docker build. Buildkit, however, does not add any layer for --label
arguments.
Rather than clutter an image with empty layers, we can combine label arguments in a single layer.
How to verify it
Compare different tools using the following Dockerfile and arguments.
FROM alpine
LABEL \
file1=one \
file2=two
# Without this PR
$ buildah build --layers --tag cbandy:latest --label arg1=one --label arg2=two .
STEP 1/4: FROM alpine
STEP 2/4: LABEL file1=one file2=two
--> 031d0e5ee40
STEP 3/4: LABEL "arg1"="one"
--> c35ed1cd582
STEP 4/4: LABEL "arg2"="two"
COMMIT cbandy:latest
--> 8dc4f003578
Successfully tagged localhost/cbandy:latest
8dc4f0035785abfe7d75ac683b0ce21a1b1f020bc58951dff2b21472ee78a907
$ podman history cbandy:latest
ID CREATED CREATED BY SIZE COMMENT
e66264b98777 3 seconds ago /bin/sh -c #(nop) LABEL "arg2"="two" 0 B FROM c35ed1cd5823
<missing> 3 seconds ago /bin/sh -c #(nop) LABEL "arg1"="one" 0 B FROM 031d0e5ee405
<missing> 3 seconds ago /bin/sh -c #(nop) LABEL file1=one file... 0 B FROM docker.io/library/alpine:latest
<missing> 5 days ago /bin/sh -c #(nop) CMD ["/bin/sh"] 0 B
<missing> 5 days ago /bin/sh -c #(nop) ADD file:8e81116368669ed... 5.81 MB
# With this PR
$ buildah build --layers --tag cbandy:latest --label arg1=one --label arg2=two .
STEP 1/3: FROM alpine
STEP 2/3: LABEL file1=one file2=two
--> 56fc28a1e56
STEP 3/3: LABEL "arg1"="one" "arg2"="two"
COMMIT cbandy:latest
--> d2418033201
Successfully tagged localhost/cbandy:latest
d2418033201681ccc30b7a3bce2c2f70a6e7816c5867c8d38f7ce602f7bda20e
$ podman history cbandy:latest
ID CREATED CREATED BY SIZE COMMENT
e66264b98777 4 seconds ago /bin/sh -c #(nop) LABEL "arg1"="one" "arg2... 0 B FROM 56fc28a1e566
<missing> 4 seconds ago /bin/sh -c #(nop) LABEL file1=one file... 0 B FROM docker.io/library/alpine:latest
<missing> 5 days ago /bin/sh -c #(nop) CMD ["/bin/sh"] 0 B
<missing> 5 days ago /bin/sh -c #(nop) ADD file:8e81116368669ed... 5.81 MB
$ docker buildx build --tag cbandy:latest --label arg1=one --label arg2=two .
…
$ docker history cbandy:latest
IMAGE CREATED CREATED BY SIZE COMMENT
ef9cbb8f3175 5 days ago LABEL file1=one file2=two 0B buildkit.dockerfile.v0
<missing> 5 days ago /bin/sh -c #(nop) CMD ["/bin/sh"] 0B
<missing> 5 days ago /bin/sh -c #(nop) ADD file:8e81116368669ed3d… 5.52MB
Which issue(s) this PR fixes:
None
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: cbandy
To complete the pull request process, please assign flouthoc after the PR has been reviewed.
You can assign the PR to them by writing /assign @flouthoc
in a comment when ready.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve
in a comment
Approvers can cancel approval by writing /approve cancel
in a comment
@nalind WDYT?
Labels are unordered in the image config, so I don't really have a problem with us running sort.Strings()
on options.Labels
before iterating through it to put them into a predictable order on the line that we synthesize.
I don't know that we've ever tested with more than one --label
option on the buildah build
command line, or that the build cache currently still gets a "hit" if the order of multiple --label
options on the buildah build
command line is changed, so if the tests don't complain, I don't think it breaks anything.
Could you let us know the issues caused by the existing behavior.
I like that --label
arguments allow me to annotate images without having to go through --build-arg
and ARG
and LABEL
instructions. I like that BuildKit and buildah v1.23.0 do so without adding to the history.
I was bitten by #3370 a few times and appreciate that was fixed in buildah v1.24.0. But now the history of my images is full of LABEL
instructions.
I think one problem is that we will miss cache hit if order of label is changed
If I understand correctly, this is true both before and after this PR. Note the args and missing cache messages in the third invocation:
# Without this PR
$ buildah build --layers --tag cbandy:latest --label arg1=one --label arg2=two .
…
$ buildah build --layers --tag cbandy:latest --label arg1=one --label arg2=two .
STEP 1/4: FROM alpine
STEP 2/4: LABEL file1=one file2=two
--> Using cache 56fc28a1e5669b7629007a01bd794fffa1fc1382143c0c852868dbb171bef50a
--> 56fc28a1e56
STEP 3/4: LABEL "arg1"="one"
--> Using cache 8c8cd32262c81cfccc989fea54f330817f08e2d2cc37bd583e3aaccf083afe30
--> 8c8cd32262c
STEP 4/4: LABEL "arg2"="two"
--> Using cache c4932f6355b9378abe448ca8d8a02cf972903baeb22d1763df7a3193fabb83da
COMMIT cbandy:latest
--> c4932f6355b
Successfully tagged localhost/cbandy:latest
c4932f6355b9378abe448ca8d8a02cf972903baeb22d1763df7a3193fabb83da
$ buildah build --layers --tag cbandy:latest --label arg2=two --label arg1=one .
STEP 1/4: FROM alpine
STEP 2/4: LABEL file1=one file2=two
--> Using cache 56fc28a1e5669b7629007a01bd794fffa1fc1382143c0c852868dbb171bef50a
--> 56fc28a1e56
STEP 3/4: LABEL "arg2"="two"
--> dc36ade8ba4
STEP 4/4: LABEL "arg1"="one"
COMMIT cbandy:latest
--> 821ecad1fde
Successfully tagged localhost/cbandy:latest
821ecad1fde81ae39f46c70ee826805060d6489c3b21db906f72e42b1dcf05f3
with this PR the behavior of buildah differs from both buildkit and docker.
Agreed.
Looking more closely at the output of docker build
, I see buildah build
still differs with regard to caching. Regardless of the argument order docker build
prints:
Step 3/4 : LABEL arg1=one
---> Using cache
---> a37a055bdea2
Step 4/4 : LABEL arg2=two
---> Using cache
---> 316a2a04c383
Labels are unordered in the image config, so I don't really have a problem with us running
sort.Strings()
onoptions.Labels
before iterating through it to put them into a predictable order
Should the code in this PR sort options.Labels
in-place, or make a copy then sort, or should something else sort the slice long before here?
I don't know that we've ever tested with more than one
--label
option on thebuildah build
command line, or that the build cache currently still gets a "hit" if the order of multiple--label
options on thebuildah build
command line is changed
I'm not familiar with the test suite, but I also don't see any in bud.bats
. 🧐 It looks like bud-multistage-relabel
could change a bit to cover this.
Labels are unordered in the image config, so I don't really have a problem with us running
sort.Strings()
onoptions.Labels
before iterating through it to put them into a predictable orderShould the code in this PR sort
options.Labels
in-place, or make a copy then sort, or should something else sort the slice long before here?
I lean toward making a copy in the same place, and sorting it there. We copy the slice out of the options
struct in newExecutor()
, and that copy is what the build cache has on-hand when it's doing its thing, so I'd prefer it if they didn't end up in different orders.
I don't know that we've ever tested with more than one
--label
option on thebuildah build
command line, or that the build cache currently still gets a "hit" if the order of multiple--label
options on thebuildah build
command line is changedI'm not familiar with the test suite, but I also don't see any in
bud.bats
. monocle_face It looks likebud-multistage-relabel
could change a bit to cover this.
Agreed, that looks like a good place to do it.
We copy the slice out of the
options
struct innewExecutor()
, and that copy is what the build cache has on-hand when it's doing its thing, so I'd prefer it if they didn't end up in different orders.
I see where you're talking about, but I'm getting a bit out of my depth. I see other slices that I suspect are also key/value pairs. Would annotations
and envs
need the same treatment? Perhaps someone else is better equipped to address caching separately from this PR about history?
https://github.com/containers/buildah/blob/fb28761ceb01e3e9050d3567bac721dabdb5953d/imagebuildah/executor.go#L251-L252
https://github.com/containers/buildah/blob/fb28761ceb01e3e9050d3567bac721dabdb5953d/imagebuildah/executor.go#L284-L288
A friendly reminder that this PR had no activity for 30 days.
@flouthoc any chance you can work on this?
@cbandy are you still working on this ? If you are fine I can take over this PR and include your commit.
@cbandy are you still working on this ? If you are fine I can take over this PR and include your commit.
@flouthoc I have nothing new for this. Please feel free.
@cbandy: PR needs rebase.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
A friendly reminder that this PR had no activity for 30 days.
@flouthoc any time to work on this?
Yes I'll cherry-pick the commits.
A friendly reminder that this PR had no activity for 30 days.
@flouthoc Did you ever get a chance to look at this?
A friendly reminder that this PR had no activity for 30 days.
A friendly reminder that this PR had no activity for 30 days.
I think @nalind has already implemented this here https://github.com/containers/buildah/pull/4511, hence closing this please re-open if i am mistaken.