Proposal for building and pushing multiple targets in parallel

Open rittneje opened this issue 3 years ago • 2 comments

I'm sure this has been requested before, but I couldn't find anything from a cursory look through the issue history.

Currently there is no way to build and push multiple targets in a single buildctl invocation. For example, consider the following Dockerfile:

FROM a AS base
...

FROM base AS image1
...

FROM base AS image2

If I want to build and push both image1 and image2, either I must manually invoke buildctl twice (or thrice), or I have to rely on some other orchestration tool such as docker buildx bake. Neither of these options is attractive.

  1. For buildctl, I have to deal with the image cache for the base image, which wastes time. I also have to pass all the requisite build args for the base image just so we have a cache hit, even if the sub-stages do not use them.
  2. For docker buildx bake, this adds a dependency on yet another tool that we have no interest in using. It also presumably means that the tool has to build the same dependency graph as BuildKit itself in order to optimize the parallelism.

Instead, I propose the following. First, the --opt target and --metadata-file flags would be deprecated. Next, the --output flag would be allowed to appear multiple times. It would also gain two new parameters in its CSV: target and metadata-file. So for example, we could do the following:

buildctl build \
    --output type=image,target=image1,metadata-file=image1.json,name=my-first-image:latest,push=true \
    --output type=image,target=image2,metadata-file=image2.json,name=my-second-image:latest,push=true \
    ...

If there is no target for a given output, then it will default to the legacy --opt target, or the last stage if that isn't provided. If there is no metadata-file for a given output, then it will default to the legacy --metadata-file, or nothing if that isn't provided. Consequently, combining multiple outputs with the legacy --metadata-file would not be supported. (I'm not sure what exactly would happen today.)
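The defaulting rules above can be sketched roughly as follows. This is only an illustration of the proposed behavior, not BuildKit code: the names `parse_output`, `resolve_outputs`, and `ResolvedOutput` are hypothetical, and rejecting the legacy --metadata-file whenever more than one output is given is one possible reading of "not supported".

```python
import csv
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResolvedOutput:
    # Hypothetical container for one resolved --output entry.
    attrs: dict                    # remaining exporter attributes (type, name, push, ...)
    target: Optional[str]          # None means "use the last stage"
    metadata_file: Optional[str]   # None means "write no metadata"


def parse_output(spec: str) -> dict:
    """Parse one --output value such as 'type=image,target=image1,push=true'."""
    attrs = {}
    for field in next(csv.reader([spec])):
        key, sep, value = field.partition("=")
        if not sep:
            raise ValueError(f"invalid field {field!r} in --output {spec!r}")
        attrs[key] = value
    return attrs


def resolve_outputs(output_specs, legacy_target=None, legacy_metadata_file=None):
    """Apply the proposed defaults to every --output value.

    legacy_target stands for the value of --opt target, and
    legacy_metadata_file for the value of --metadata-file.
    """
    # Assumption: the legacy flag is rejected as soon as there are several outputs.
    if legacy_metadata_file is not None and len(output_specs) > 1:
        raise ValueError("--metadata-file cannot be combined with multiple --output flags")

    resolved = []
    for spec in output_specs:
        attrs = parse_output(spec)
        # A per-output value wins; otherwise fall back to the legacy flag, then to None.
        target = attrs.pop("target", legacy_target)
        metadata_file = attrs.pop("metadata-file", legacy_metadata_file)
        resolved.append(ResolvedOutput(attrs, target, metadata_file))
    return resolved
```

With the two --output values from the example command, the first would resolve to target image1 with metadata written to image1.json, and the second to target image2 with image2.json. An output without its own target would fall back to --opt target, or to the last stage if that is also absent.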

With this change, only buildkitd needs to care about the dependency graph and can properly optimize it. Consequently there may not be any need for docker buildx bake or similar tools anymore.

There is one more situation to consider. Today I can build without any --output but with a stage target. (The --metadata-file flag is not useful in this context as there would never be anything to write to it.) To preserve this functionality, either the --opt target flag should not be fully deprecated, or we may need a new output type such as none, which is admittedly confusing.

rittneje avatar Feb 13 '22 19:02 rittneje

Given the 'output none' case, perhaps flip it around, and allow --opt target to appear multiple times, and include the output spec in that?

That said, this is looking at the buildctl CLI, which shouldn't stray too far from the buildkitd API structure, so it'll probably depend somewhat on how this would be expressed in the API. The API doesn't currently know about target as anything structurally special, so rolling --opt target into --output seems like the simpler change of the two ideas so far.

TBBle avatar Mar 03 '22 10:03 TBBle

I also have a use case where I need to generate multiple images that share a base image with many layers. When I build the images independently in CI, the layers are recreated by each CI job instead of being reused. I would also like to fix that, but I don't see how I can do so except by building one image first, then building the others after pulling the first one in order to get the cache. Is there an easy workaround, or should I wait for this feature to be implemented?

sprat avatar Jun 17 '22 14:06 sprat