buildx icon indicating copy to clipboard operation
buildx copied to clipboard

--load doesn't always export the Docker image

Open bensalilijames opened this issue 3 years ago • 10 comments

This issue is more comprehensively described in the below linked issue, but since it appears to be an issue in buildx itself, I've opened an issue here to track. Hope that's ok!

Original ref: https://github.com/docker/build-push-action/issues/321

The gist of it is that building an image with buildx doesn't always export the image to Docker:

> /usr/bin/docker buildx build \
  --tag ml-intents:latest \
  --iidfile /tmp/docker-build-push-195EVz/iidfile \
  --cache-from type=local,src=/tmp/.buildx-cache \
  --cache-to type=local,dest=/tmp/.buildx-cache-new \
  --file intents/Dockerfile \
  --load \
  .

Running docker image ls immediately after this does not list the exported image, despite the build step apparently succeeding:

#24 importing to docker
#24 sha256:843ff23c8630500017a56b2c0a0318a104520fc9b96e14ea16c25bf6bb5edc27
#24 DONE 32.5s

I've only managed to reproduce this on 1 out of 3 different images (as described in original issue) which is pretty weird. The only real difference between them is their size: the failing one is much bigger than the other two.

Let me know if you need any more info! 🙂

bensalilijames avatar Apr 12 '21 16:04 bensalilijames

Need reproduction steps.

tonistiigi avatar Apr 12 '21 20:04 tonistiigi

Hi @tonistiigi, I've created a repo which reproduces the problem: https://github.com/benhjames/buildx-bug

You can see the issue in action here: https://github.com/benhjames/buildx-bug/runs/2328156905

All you need to do is run the workflow to observe the bug. At the end of the "Build image" stage, you can see that it runs "importing to docker" and reports this as done. However, when subsequently running docker image ls in the "Push image to GitHub Container Registry" phase, it does not exist.

bensalilijames avatar Apr 13 '21 08:04 bensalilijames

Turns out that the runner was running out of disk space, see https://github.com/docker/build-push-action/issues/321 for details.

If it would be possible to make buildx not silently fail loading to Docker when the disk is full that would be ideal. Otherwise feel free to close. Thanks!

bensalilijames avatar Apr 15 '21 09:04 bensalilijames

I ran into the same issue. It seems like a bug if the build command exits 0 while the load fails.

cep21 avatar Apr 22 '21 19:04 cep21

Looks like an upstream issue: moby/moby#34416. cc @thaJeztah.

crazy-max avatar Apr 22 '21 23:04 crazy-max

2017 :-0

At this point, is it possible for buildx to do the right thing (maybe verify somehow) independently, since it seems correct from the buildx side to exit non zero if --load does not load the image?

cep21 avatar Apr 23 '21 00:04 cep21

since it seems correct from the buildx side to exit non zero if --load does not load the image?

This seems reasonable for me.

YevheniiSemendiak avatar Oct 25 '21 09:10 YevheniiSemendiak

Looks like an upstream issue: moby/moby#34416. cc @thaJeztah.

Just left a comment on that issue, but that behaviour is expected, as the endpoint only returns the status code based on the initial request received; the actual success/fail must be determined based on the progress response.

thaJeztah avatar Oct 25 '21 11:10 thaJeztah

I expect a user friendly CLI to behave safely by default: --load should by default make the command block until it knows if the operation was successful or not, and reflect this in its exit code.

docker buildx blocks anyway until the export is complete, so there’s no reason for it to not check if it was completed successfully.

The @moby issue implies that only the HTTP status code is 200, but the actual JSON payload returned in response contains the keys "error" and "errorDetail" which seems to be something easy to check for. Am I missing something?

flying-sheep avatar Jun 27 '22 09:06 flying-sheep

Figured it out:

set -e

docker buildx build ... --output="type=docker,push=false,name=$REPOSITORY:$TAG,dest=/tmp/img.tar" .
docker load --input /tmp/img.tar
rm -f /tmp/img.tar

flying-sheep avatar Jul 01 '22 13:07 flying-sheep

Note that if you output to tar it will not be loaded and this is expected behaviour, see https://github.com/docker/build-push-action/issues/413 ("multiple exporters are not supported"). If you want your image both loaded in docker and exported, you have to docker load the tarball after the build.

(I know this is an unrelated issue, just leaving a note here in case people land here that are trying to do the above)

plu5 avatar Oct 31 '22 01:10 plu5

That’s why my code block contains docker load --input /tmp/img.tar, yes :wink:

But as said, my code is a workaround, not a fix. A fix means that if the user specified “I want to load this image” and the action failed to load this image, the action should fail with an informative error message.

And this is also not an upstream error. As said in https://github.com/moby/moby/issues/34416#issuecomment-950846261, once a header has been sent, it cannot be changed. Buildx therefore needs to read and interpret the JSON payload instead of going by the HTTP status alone here:

That streaming endpoint responds (eventually) with a JSON payload that has a success or error message. Buildx should wait for the final response and succeed or fail accordingly:

  • a success (and more explanation) from https://github.com/moby/moby/issues/34416#issuecomment-950846261:

    responses:
    - code: 100  # Continue
    - code: 200  # OK
      payload: {"stream":"Loaded image: myimage:latest\n"}
    
  • a failure (in that issue’s initial comment):

    responses:
    - code: 100  # Continue
    - code: 200  # OK (!)
      payload: {"errorDetail":{"message":"Error processing tar file(exit status 1): archive/tar: invalid tar header"},
                "error":"Error processing tar file(exit status 1): archive/tar: invalid tar header"}
    

flying-sheep avatar Nov 03 '22 11:11 flying-sheep

So this is simple. The bug is here: https://github.com/docker/buildx/blob/ac85f590ba7d4511f16684cc853a67538bf97ed9/vendor/github.com/docker/docker/client/image_load.go#L21-L24

buildx treats that endpoint as a normal one, not a streaming one. cli.postRaw doesn’t handle streaming endpoints.

What it needs to do here is looping over all responses and bailing once it receives an error, even when that one comes with HTTP 200.

flying-sheep avatar Nov 03 '22 11:11 flying-sheep