docker-image-resource

Resource tries to pull image with incorrect digest

Open GJKrupa opened this issue 5 years ago • 8 comments

We're seeing a repeat of issue #33 on Concourse CI 6.5.1. We have two tasks in our pipeline that use the same Docker image (same tag). One of them is pulling the correct SHA and the other is failing because it's trying to pull a completely different non-existent SHA. The resource_type definition for the task looks like:

- name: node
  source:
    ca_certs:
    - cert: |
        ((shared.ca-cert))
      domain: '*.our-domain.local'
    repository: harbor.our-domain.local/tools/node
    tag: 13.14.0-ubuntu
  type: docker-image

The first stage succeeds:

Pulling harbor.our-domain.local/tools/node@sha256:5d8f1447da011d2a9215ee312f61ebf49dc00f1173ff40c0e6ba1b9a95819002...
sha256:5d8f1447da011d2a9215ee312f61ebf49dc00f1173ff40c0e6ba1b9a95819002: Pulling from tools/node
3ff22d22a855: Pulling fs layer
...
2d4b3890a2e6: Pull complete
Digest: sha256:5d8f1447da011d2a9215ee312f61ebf49dc00f1173ff40c0e6ba1b9a95819002
Status: Downloaded newer image for harbor.our-domain.local/tools/node@sha256:5d8f1447da011d2a9215ee312f61ebf49dc00f1173ff40c0e6ba1b9a95819002
harbor.our-domain.local/tools/node@sha256:5d8f1447da011d2a9215ee312f61ebf49dc00f1173ff40c0e6ba1b9a95819002

The second fails as follows:

waiting for docker to come up...
Pulling harbor.our-domain.local/tools/node@sha256:6298b1ee476e73f76c06ff99105eedb1290355f09a367ab06d5fd5d334ad8b6a...
Error response from daemon: manifest for harbor.our-domain.local/tools/node@sha256:6298b1ee476e73f76c06ff99105eedb1290355f09a367ab06d5fd5d334ad8b6a not found: manifest unknown: manifest unknown

Pulling harbor.our-domain.local/tools/node@sha256:6298b1ee476e73f76c06ff99105eedb1290355f09a367ab06d5fd5d334ad8b6a (attempt 2 of 3)...
Error response from daemon: manifest for harbor.our-domain.local/tools/node@sha256:6298b1ee476e73f76c06ff99105eedb1290355f09a367ab06d5fd5d334ad8b6a not found: manifest unknown: manifest unknown

Pulling harbor.our-domain.local/tools/node@sha256:6298b1ee476e73f76c06ff99105eedb1290355f09a367ab06d5fd5d334ad8b6a (attempt 3 of 3)...
Error response from daemon: manifest for harbor.our-domain.local/tools/node@sha256:6298b1ee476e73f76c06ff99105eedb1290355f09a367ab06d5fd5d334ad8b6a not found: manifest unknown: manifest unknown
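
To compare the digests requested by the two steps side by side, a small shell helper along these lines could be used. This is only a sketch: the function name `extract_digests` and the log file names in the comments are hypothetical, and it assumes the build output has been saved to plain-text files.

```shell
# Hypothetical helper: print each distinct sha256 digest found on stdin.
# Assumes digests appear as "sha256:" followed by 64 lowercase hex characters,
# as they do in the pull output above.
extract_digests() {
  grep -o 'sha256:[0-9a-f]\{64\}' | sort -u
}

# Example usage (file names are assumptions, not taken from the issue):
#   extract_digests < first-task.log
#   extract_digests < second-task.log
```

If the two outputs differ for the same tag, as they do here, one of the steps is using a stale or wrong version of the image.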

GJKrupa avatar Oct 30 '20 11:10 GJKrupa

Destroying and re-adding the pipeline seems to have resolved it, so it looks like some kind of caching issue. Before that, the error persisted through a full scale-down and scale-up of both the web and worker nodes.

GJKrupa avatar Oct 30 '20 13:10 GJKrupa

I had the same thing happen and the full delete/rebuild of the pipeline resolved it. Bummer though, we had a lot of build history in the old one.

For some context, I was switching our public Docker images to a private registry mirror. At the same time I was switching from using the docker-image resource and instead using the registry-image one. I updated ~20 pipelines and they all worked except for 1, which was pulling some mysteriously wrong sha256 from the new registry. Re-creating that pipeline resolved it.

mattdodge avatar Nov 06 '20 17:11 mattdodge

We are experiencing the same issue on Concourse CI 7.0.0

pjmpsu avatar Mar 23 '21 15:03 pjmpsu

https://github.com/concourse/concourse/issues/6521 might be related. The mysterious sha256 could come from a version of the old resource type. @pjmpsu, I wonder whether running fly check-resource-type for the problematic resource type would solve the issue?
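
For reference, the check could be wrapped roughly as below. This is a sketch under assumptions: the function name `recheck_resource_type` is hypothetical, and the target, pipeline, and resource type names are placeholders to be replaced with your own.

```shell
# Hypothetical wrapper around fly check-resource-type.
# $1 = fly target, $2 = pipeline name, $3 = resource type name (all placeholders).
recheck_resource_type() {
  target="$1"
  pipeline="$2"
  rtype="$3"
  # Ask Concourse to re-check the resource type so a fresh version is discovered.
  fly -t "$target" check-resource-type --resource-type "$pipeline/$rtype"
}

# Example usage (names are assumptions):
#   recheck_resource_type ci my-pipeline node
```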

xtremerui avatar Mar 23 '21 15:03 xtremerui

Running fly check-resource-type resolved the issue for me.

tluimes avatar Mar 02 '22 14:03 tluimes

Can confirm we get this also. Have tried checking resources, checking resource types and all sorts and it just doesn't work.

What's even more bizarre is that the PUT step will sometimes pull the correct image but then go on to fail during the GET step after it, when it tries to pull an older image that doesn't exist. At other times the PUT step doesn't even pull the right image. I have tried check-resource-type, check-resource, and clear-resource-cache. I have even completely deleted the pipeline, recreated it with a new name, and renamed the resource types and resources, and it still fails with the same behaviour.

Not sure if it is a breaking bug introduced in 7.9.0?

ChrisJBurns avatar Jun 20 '23 16:06 ChrisJBurns

Experiencing the same issue, very heavily since upgrading to 7.9.1. Re-creating the pipeline did not work. Neither did check-resource or check-resource-type (both commands run successfully).

I also removed the PVC of the workers and debugged with a single worker. I created an issue for registry-image as well: https://github.com/concourse/registry-image-resource/issues/343

mreiche avatar Jun 27 '23 12:06 mreiche