multistage dockerfile fails to load from one stage to another

Open deitch opened this issue 4 years ago • 15 comments

Behaviour

A multistage build - in this case on an alternate runner - is failing to copy a file from one stage to another. The error is:

------
 > [linux/arm64 runner 1/2] COPY --from=certs /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt:
------
Dockerfile:26
--------------------
  24 |     ARG OS=linux
  25 |     
  26 | >>> COPY --from=certs /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
  27 |     COPY --from=build /go/src/app/dist/bin/${BINARY}-${OS}-${TARGETARCH} ${BINARY}
  28 |     
--------------------
error: failed to solve: rpc error: code = Unknown desc = failed to compute cache key: "/etc/ssl/certs/ca-certificates.crt" not found: not found

Results should be available here

The YAML is here; relevant parts below:

    - name: Build
      id: docker_build
      uses: docker/build-push-action@v2
      with:
        platforms: linux/amd64,linux/arm64
        push: true
        tags: |
          ${{ format('{0}:{1}', env.IMAGE_NAME, steps.tagname.outputs.tag) }}
          ${{ format('{0}:{1}', env.IMAGE_NAME, 'latest') }}

Dockerfile is pretty straightforward here, relevant parts below:

FROM alpine:3.11 as certs

RUN apk --update add ca-certificates

# builder
# lots of other stuff

# Create Docker image of just the binary
FROM scratch as runner

COPY --from=certs /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt

The stage it is copying from is basically just getting certs, and then copying it to a scratch image.

Steps to reproduce this issue

I have not yet been able to reproduce it outside of the above repo. So far it has happened only on the linux/arm64 build.

Expected behaviour

It should build.

Actual behaviour

It fails

Configuration

  • Repository URL (if public): https://github.com/equinix/cloud-provider-equinix-metal/
  • Build URL (if public): https://github.com/equinix/cloud-provider-equinix-metal/runs/1886356502?check_suite_focus=true

The YAML is here; relevant part below.

    - name: Build
      id: docker_build
      uses: docker/build-push-action@v2
      with:
        platforms: linux/amd64,linux/arm64
        push: true
        tags: |
          ${{ format('{0}:{1}', env.IMAGE_NAME, steps.tagname.outputs.tag) }}
          ${{ format('{0}:{1}', env.IMAGE_NAME, 'latest') }}

Logs

logs_219.zip

I am rerunning the jobs, in case it is transient, but even if it is, we should know.

deitch avatar Feb 12 '21 10:02 deitch

@deitch Don't see any issue in your pipeline. Have you solved this?

crazy-max avatar Feb 14 '21 21:02 crazy-max

I reran the pipeline and it went away. Did the above link not go to the failed build?

It must be transient, but still would be nice to know how and why.

deitch avatar Feb 14 '21 21:02 deitch

@deitch Looks similar to docker/buildx#472. WDYT @tonistiigi?

crazy-max avatar Feb 14 '21 22:02 crazy-max

It does look similar.

deitch avatar Feb 14 '21 22:02 deitch

Look at https://github.com/equinix/cloud-provider-equinix-metal/runs/1886356502?check_suite_focus=true#step:7:118 . Seems that there was an error and certificates did not get installed. But the process did not fail. Not sure if error on reporting exit code or apk issue in container.

tonistiigi avatar Feb 14 '21 22:02 tonistiigi

@tonistiigi

#21 1.459 Executing busybox-1.31.1-r9.trigger
#21 1.463 ERROR: busybox-1.31.1-r9.trigger: script exited with error 1
#21 1.464 Executing ca-certificates-20191127-r2.trigger
#21 1.476 /bin/sh: can't open 'trigger': No such file or directory
#21 1.478 ERROR: ca-certificates-20191127-r2.trigger: script exited with error 2
#21 1.482 OK: 6 MiB in 15 packages

Good catch. I have also encountered this kind of issue with old releases of Alpine.

crazy-max avatar Feb 14 '21 22:02 crazy-max
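The log excerpt above ends with `OK:` even though two triggers reported `ERROR:`. As a minimal sketch of one possible workaround (not something proposed in this thread), a build step could scan the captured apk output for `ERROR:` lines instead of trusting the exit code. The function name `apk_output_has_no_errors` is hypothetical, and the sketch assumes that apk prefixes failures with `ERROR:` as it does in this log.

```shell
#!/bin/sh
# Hypothetical helper (not part of apk): reads apk-style output on stdin
# and returns non-zero if any line contains "ERROR:".
apk_output_has_no_errors() {
    # grep -q exits 0 when it finds a match, so a match means failure here.
    if grep -q 'ERROR:'; then
        return 1
    fi
    return 0
}
```

One way to use it is to redirect the command's combined output into a file (`apk --update add ca-certificates > /tmp/apk.log 2>&1`) and then run `apk_output_has_no_errors < /tmp/apk.log`. Merging stderr matters because the errors are reportedly written there.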

Huh you're right. Why didn't it fail out at the earlier stage?

deitch avatar Feb 14 '21 22:02 deitch

3.11 isn't that old. But I'm happy to bump it and keep an eye out.

deitch avatar Feb 14 '21 22:02 deitch

@deitch https://gitlab.alpinelinux.org/alpine/aports/-/issues/11942

crazy-max avatar Feb 14 '21 22:02 crazy-max

@ncopa Does the OK: 6 MiB in 15 packages at the end of the log https://github.com/docker/build-push-action/issues/294#issuecomment-778852135 mean apk finished with a zero exit code even though there was an error? Is this expected?

tonistiigi avatar Feb 14 '21 22:02 tonistiigi

That does not look correct. I would expect it to exit with failure.

ncopa avatar Feb 16 '21 11:02 ncopa

It does look like an apk issue to me. Afaics that line is printed in https://github.com/alpinelinux/apk-tools/blob/361eb063c6bd97751f48e10908e6beaa383ad82f/src/commit.c#L358-L362 only if errors is false there, and the same variable is later returned as the exit code in https://github.com/alpinelinux/apk-tools/blob/361eb063c6bd97751f48e10908e6beaa383ad82f/src/apk.c#L513-L532 . So if ca-certificates-20191127-r2.trigger failed, errors should have been set, but it wasn't.

tonistiigi avatar Feb 26 '21 00:02 tonistiigi

Any idea @kaniini? :)

crazy-max avatar Apr 25 '21 13:04 crazy-max

Can somebody summarize the problem? I can try to create a testcase and check it in apk-tools.

kaniini avatar Apr 27 '21 16:04 kaniini

When installing an apk package, if an error occurs (apparently while executing a trigger), the error is printed to stderr, but the apk command still prints OK: at the end and exits with exit code 0. Because the package was not actually installed properly, the build fails later with a missing file.

tonistiigi avatar Apr 27 '21 16:04 tonistiigi
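Since the summary above says the install can exit 0 without producing the expected file, one defensive option is to check for the file in the same stage that installs it, so the build fails there rather than at the later COPY. The following is a minimal sketch under that assumption; `require_nonempty_files` is a hypothetical helper, not something from apk or BuildKit.

```shell
#!/bin/sh
# Hypothetical guard: succeeds only if every given path exists and is
# non-empty; reports the first offending path on stderr.
require_nonempty_files() {
    for path in "$@"; do
        if [ ! -s "$path" ]; then
            echo "missing or empty: $path" >&2
            return 1
        fi
    done
    return 0
}
```

In the certs stage of the Dockerfile from this issue, the same check could be written inline as `RUN apk --update add ca-certificates && test -s /etc/ssl/certs/ca-certificates.crt`, which would make that stage fail immediately if the certificate bundle was not written.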