Support Copying a Multi-arch Image Tag From One Repository to Another With the Digest ID Intact

Open aggo15 opened this issue 2 years ago • 19 comments

Many of my users prefer to keep the digest ID of an image tag when moving/migrating an image from one repository to another, for the purpose of tracking where the original image came from.

For a single-architecture image tag, it is easy to do so with the commands below:

docker image tag <registry>/<src_repo>:<source_image_tag> <registry>/<dest_repo>:<dest_image_tag>
docker push <registry>/<dest_repo>:<dest_image_tag>

However, for a multi-arch image, this only works within the same repository. Example of a working command:

docker buildx imagetools create -t <registry>/<src_repo>:<dest_image_tag> <registry>/<src_repo>:<source_image_tag>

If I change the destination repository, it returns 400 Bad Request.

docker buildx imagetools create -t <registry>/<dest_repo>:<dest_image_tag> <registry>/<src_repo>:<source_image_tag>

I'm able to work around this issue by manually copying over each manifest by digest ID to the destination repository, which is a hassle to deal with. I don't see this workaround documented anywhere either. Workaround commands below (assuming the multi-arch image tag consists of armv6, armv7, and arm64 architectures):

# Copy ARMV6 digest
docker buildx imagetools create -t <registry>/<dest_repo>@<dest_armv6_digest> <registry>/<src_repo>@<source_armv6_digest>

# Copy ARMV7 digest
docker buildx imagetools create -t <registry>/<dest_repo>@<dest_armv7_digest> <registry>/<src_repo>@<source_armv7_digest> 

# Copy ARM64 digest
docker buildx imagetools create -t <registry>/<dest_repo>@<dest_arm64_digest> <registry>/<src_repo>@<source_arm64_digest> 

# Copy source tag digest
docker buildx imagetools create -t <registry>/<dest_repo>@<dest_tag_digest> <registry>/<src_repo>@<source_tag_digest>

# Assign a name to destination tag
docker buildx imagetools create -t <registry>/<dest_repo>:<dest_image_tag> <registry>/<dest_repo>@<dest_tag_digest>
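
Since manifests are content-addressed, each <dest_*_digest> above should come out identical to the corresponding <source_*_digest>. The per-architecture digests these commands need can be listed by inspecting the source manifest list; a small sketch (the jq invocation here is illustrative, not part of the original post):

# List each platform manifest's digest alongside its platform.
docker buildx imagetools inspect --raw <registry>/<src_repo>:<source_image_tag> \
  | jq -r '.manifests[] | "\(.digest)  \(.platform.os)/\(.platform.architecture) \(.platform.variant // "")"'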

It would be nice if the docker buildx imagetools create command handled copying an image tag from one repository to another, since it already supports copying an image tag within the same repository.

Here is my current docker buildx version. I'm on Windows 10 btw.

docker buildx version
github.com/docker/buildx v0.10.0 876462897612d36679153c3414f7689626251501

docker version
Client:
 Cloud integration: v1.0.29
 Version:           20.10.22
 API version:       1.41
 Go version:        go1.18.9
 Git commit:        3a2c30b
 Built:             Thu Dec 15 22:36:18 2022
 OS/Arch:           windows/amd64
 Context:           desktop-linux
 Experimental:      true

Server: Docker Desktop 4.16.3 (96739)
 Engine:
  Version:          20.10.22
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.18.9
  Git commit:       42c8b31
  Built:            Thu Dec 15 22:26:14 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.14
  GitCommit:        9ba4b250366a5ddde94bb7c9d1def331423aa323
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

aggo15 avatar Mar 03 '23 03:03 aggo15

Hm, I'm not able to reproduce this; with that version of buildx, I can easily copy a multi-architecture image (like moby/buildkit:latest) to another repository.

I think it's possible your command line flags are the wrong way round - it should be:

docker buildx imagetools create -t <registry>/<dest_repo>:<dest_image_tag> <registry>/<src_repo>:<src_image_tag>

Notice how the dest_repo comes immediately after the -t flag.

If that doesn't resolve the issue, could you please share what registry you're using (just what software is being used to host it), as well as the output of docker buildx imagetools inspect --raw <registry>/<src_repo>:<source_image_tag>?

jedevc avatar Mar 03 '23 08:03 jedevc

@jedevc Hi, sorry, I think I made a mistake in my original post. Yes, the dest_repo should be immediately after -t. I'll edit my original post later. Still, I'm not able to copy the tag to another repo as you described.

Edit: Original post edited with correct example. Pardon my copy and paste =)

aggo15 avatar Mar 03 '23 09:03 aggo15

@jedevc Just realized I forgot to include the info you requested. Here is the error I see when copying the tag to another repo. I'm currently using Mirantis Secure Registry (v2.9.5), hosted in my work environment.

docker buildx imagetools create -t registry/ns/repo-new:tag registry/ns/repo:tag
[+] Building 11.4s (1/1) FINISHED
 => ERROR [internal] pushing registry/ns/repo-new:tag   11.4s
------
 > [internal] pushing registry/ns/repo-new:tag:
#0 0.001 copying sha256:<digest_id> from registry/ns/repo:tag to registry/ns/repo-new:tag
------
ERROR: unexpected status: 400 Bad Request

No matter what I change (the namespace, repository, tag name), it ends up with this error. It only works if the namespace and repository are the same for both source and destination.

aggo15 avatar Mar 04 '23 03:03 aggo15

Can you also share the results of docker buildx imagetools inspect --raw <registry>/<src_repo>:<source_image_tag> as mentioned previously?

jedevc avatar Mar 08 '23 16:03 jedevc

@jedevc here you go.

{
  "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "digest": "sha256:fcc<rest_of_long_chars>",
      "size": 2271,
      "platform": {
        "architecture": "arm",
        "os": "linux",
        "variant": "v6"
      }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "digest": "sha256:255<rest_of_long_chars>",
      "size": 2271,
      "platform": {
        "architecture": "arm",
        "os": "linux",
        "variant": "v7"
      }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "digest": "sha256:a0f<rest_of_long_chars>",
      "size": 2271,
      "platform": {
        "architecture": "arm64",
        "os": "linux"
      }
    }
  ]
}

aggo15 avatar Mar 09 '23 02:03 aggo15

Did you ever find a solution/workaround for this problem? I'm facing the same issue.

phab-cpr avatar Aug 22 '23 13:08 phab-cpr

Did you ever find a solution/workaround for this problem? I'm facing the same issue.

I did mention it in my first post: the workaround is to manually copy the source digest for each architecture, including the source fat-manifest digest, and finally assign a name to the destination fat-manifest digest. You should end up with the same digest ID as the source by doing that.

aggo15 avatar Aug 22 '23 14:08 aggo15

Thanks - I have that working already, but was looking for a better single-line workaround. Thanks for documenting your workaround though as that was helpful in at least getting something to work.

phab-cpr avatar Aug 22 '23 14:08 phab-cpr

I don't think there is a one-liner that fixes the issue, at least not that I'm aware of.

Some bonus info for you: I recently used docker buildx imagetools create against a replicated registry on Azure, and it caused an issue where the Docker client downloads the wrong architecture's image from one side of the replicated registry. So be careful when using that command, especially when retagging an image to a different repository. The workaround method has been reliable for me so far.

aggo15 avatar Aug 22 '23 14:08 aggo15

Hm, @phab-cpr @aggo15, what registries are you using (or at least what software is hosting them)? In https://github.com/docker/buildx/pull/2013 I've added a test for these cases, and they appear to work perfectly as far as I can see, so I wonder if it's maybe something with the registry you're using.

jedevc avatar Aug 22 '23 16:08 jedevc

We're using Nexus. We had to enable redeploy to get the workaround to work (which is a bit ugly for prod repos).

phab-cpr avatar Aug 22 '23 21:08 phab-cpr

@jedevc I'm using Azure Container Registry.

The problem I mentioned appears only if we use replication across regions. For example, if I upload an image to the instance in Asia using the docker buildx imagetools create command, a client connected to the US instance will download the image with the wrong architecture, while a client in Asia will download the image with the correct architecture. If we upload to the US instance, then the Asia instance encounters the problem.

We found out that the Azure Container Registry API /v2/<namespace>/<reponame>/manifests/<digest> returns different results for different regions. Whichever instance you upload to returns the correct image manifest; however, the other instance returns only one of the architecture manifests.
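
Roughly how this can be checked (a sketch; the regional endpoints and $TOKEN are placeholders, not our real values):

# HEAD the same tag against each regional endpoint and compare the
# Docker-Content-Digest headers; a healthy replica pair returns the same digest.
for endpoint in <asia_endpoint> <us_endpoint>; do
  curl -sI -H "Authorization: Bearer $TOKEN" \
       -H "Accept: application/vnd.docker.distribution.manifest.list.v2+json" \
       "https://$endpoint/v2/<namespace>/<reponame>/manifests/<tag>" \
    | grep -i docker-content-digest
done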

Naturally, we suspected an issue with the registry. However, if I use the workaround I mentioned in my original post, the problem goes away.

The buildx dev team might want to look into this problem. It would be nice if I could just use docker buildx imagetools create without worrying about it. For now I have to constantly educate my users about this issue when they build multi-arch images.

aggo15 avatar Aug 23 '23 00:08 aggo15

We found out that the Azure Container Registry API /v2/<namespace>/<reponame>/manifests/<digest> returns different results for different regions. Whichever instance you upload to returns the correct image manifest; however, the other instance returns only one of the architecture manifests.

what.

The way that this is described sounds like a horrible bug in the registry - not anything to do with buildx. Can you share the results of comparing docker buildx imagetools inspect on both the US and the Asia instance? I suspect they are giving different digests.

Reading the docs on geo-replication in Azure: https://learn.microsoft.com/en-us/azure/container-registry/container-registry-geo-replication:

After you push an image or tag update to the closest region, it takes some time for Azure Container Registry to replicate the manifests and layers to the remaining regions you opted into. Larger images take longer to replicate than smaller ones. Images and tags are synchronized across the replication regions with an eventual consistency model.

Could it be possible that the reason the digests are different is some racy behavior in your pipeline?

I'm able to work around this issue by manually copying over each manifest by digest ID to the destination repository, which is a hassle to deal with.

I genuinely have no idea why this workaround would work in the first place - my best guess would be something to do with cross-repo mounts: https://github.com/opencontainers/distribution-spec/blob/main/spec.md#mounting-a-blob-from-another-repository.
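
For context, a cross-repo mount lets a registry link an existing blob from one repository into another without re-uploading it; a minimal sketch of that request per the spec (all names are placeholders):

# POST a mount request; on success the registry returns 201 Created and the
# blob becomes available in <dest_repo> without an upload.
curl -X POST -H "Authorization: Bearer $TOKEN" \
  "https://<registry>/v2/<dest_repo>/blobs/uploads/?mount=<digest>&from=<src_repo>"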

We're using Nexus.

@phab-cpr are you also using some replication scheme? If not, I imagine that this is a different issue. Can you share both the source and the destination results of docker buildx imagetools inspect --raw?

jedevc avatar Aug 24 '23 10:08 jedevc

Can you share the results of comparing docker buildx imagetools inspect on both the US and the Asia instance? I suspect they are giving different digests.

@jedevc Here you go, the inspect results from both sides.

From the US region

$ docker buildx imagetools inspect registry/namespace/reponame:tagname
Name:      registry/namespace/reponame:tagname
MediaType: application/vnd.docker.distribution.manifest.v2+json
Digest:    sha256:57048XXX

From the Asia region

$ docker buildx imagetools inspect registry/namespace/reponame:tagname
Name:      registry/namespace/reponame:tagname
MediaType: application/vnd.docker.distribution.manifest.list.v2+json
Digest:    sha256:7fc94XXX

Manifests:
  Name:      registry/namespace/reponame:tagname@sha256:57048XXX
  MediaType: application/vnd.docker.distribution.manifest.v2+json
  Platform:  linux/arm/v7

  Name:      registry/namespace/reponame:tagname@sha256:93a64XXX
  MediaType: application/vnd.docker.distribution.manifest.v2+json
  Platform:  linux/arm64

In the US region, the client keeps downloading the image with digest 57048XXX regardless of the client's architecture. In Asia, everything works fine.

Could it be possible that the reasons digests are different is because of some racy behavior in your pipeline?

I looked at my dev pipeline and it seems the original image works fine on both the US and Asia replicas (both regions get the same result from the inspect command). FYI, we use the normal docker buildx build --push command to push the original image. The issue starts to appear when the prod pipeline retags the image built by the dev pipeline using the docker buildx imagetools create command.

The issue can be easily reproduced by retagging an existing image to another repository using the command below. This is the same command we use in our prod pipeline.

docker buildx imagetools create -t registry/namespace/repoB:newtag registry/namespace/repoA:existingtag

I genuinely have no idea why this workaround would work in the first place - my best guess, would be something to do with cross-repo mounts: https://github.com/opencontainers/distribution-spec/blob/main/spec.md#mounting-a-blob-from-another-repository.

I have no idea which one is the culprit. Perhaps someone else with an environment similar to mine can help replicate my findings here and hopefully figure out the root cause of this issue? My colleague raised a ticket with MS a few days ago; hopefully we'll get some clues from them.

aggo15 avatar Aug 24 '23 11:08 aggo15

For some reason, it looks like only a single architecture is getting replicated to the US region :anguished:

Note the media types. In the Asia region, we have a manifest list application/vnd.docker.distribution.manifest.list.v2+json, while in the US, we only have a manifest. A manifest list can contain multiple manifests - this is how multi-arch images are made.

For some reason, the replication seems to be turning the multi-arch manifest list into a single-arch manifest. I wonder what process Azure is using for replication.
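
A quick client-side check for this condition (a sketch; the image name is a placeholder): if the top-level mediaType is a plain manifest rather than a manifest list, that replica is serving a single architecture.

# Print the top-level media type plus any referenced platforms (the platforms
# list comes out empty for a single-arch manifest, which has no .manifests array).
docker buildx imagetools inspect --raw registry/namespace/reponame:tagname \
  | jq '{mediaType, platforms: [.manifests[]?.platform]}'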

jedevc avatar Aug 24 '23 11:08 jedevc

For some reason, the replication seems to be turning the multi-arch manifest list into a single-arch manifest. I wonder what process Azure is using for replication.

Is there anything different between docker buildx build --push and docker buildx imagetools create in how they talk to the registry in the background?

If Azure replication were having an issue, it would be natural to assume all multi-arch images on the US side would be impacted, right? However, we checked the US side and not all multi-arch images there are impacted; only the retagged images are. :confused:

Based on this observation, I don't think it is fair to point all the fingers at Azure. ¯\_(ツ)_/¯

aggo15 avatar Aug 24 '23 11:08 aggo15

It seems MS was able to replicate the problem on their side, according to their reply. They seem to agree that there is a problem when replicating an image to another region after retagging with the docker buildx imagetools command. No commitment from them to fix the problem for now. I guess on my side we will use the workaround method for now.

aggo15 avatar Aug 29 '23 07:08 aggo15

@phab-cpr are you also using some replication scheme? If not, I imagine that this is a different issue. Can you share both the source and the destination results of docker buildx imagetools inspect --raw?

@jedevc No - we have a dev and a release repo, and we simply want to copy a multi-arch container from our dev repo to our release repo at the point of release (so we don't have to rebuild/retest).

Info added below.

Dev repo:

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:3be65aee3ec948300f05d5da560a445c57c681e6436d580dea7f6f3abb5d31ed",
      "size": 2006,
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:57334494c3b33b286cfcb6e23f3952b39409cc277f3d9381d6abf63633b2b405",
      "size": 2006,
      "platform": {
        "architecture": "arm64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:935d98c296f4163fb9700a3b7f1b386226ebefc58a0fc8b3c63d716d86d578fc",
      "size": 566,
      "annotations": {
        "vnd.docker.reference.digest": "sha256:3be65aee3ec948300f05d5da560a445c57c681e6436d580dea7f6f3abb5d31ed",
        "vnd.docker.reference.type": "attestation-manifest"
      },
      "platform": {
        "architecture": "unknown",
        "os": "unknown"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:dda690fa0abccb9110a8591c00c38ea9f5342be5b3a8b37a099e5d4e37025fc7",
      "size": 566,
      "annotations": {
        "vnd.docker.reference.digest": "sha256:57334494c3b33b286cfcb6e23f3952b39409cc277f3d9381d6abf63633b2b405",
        "vnd.docker.reference.type": "attestation-manifest"
      },
      "platform": {
        "architecture": "unknown",
        "os": "unknown"
      }
    }
  ]
}

Release repo:

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:3be65aee3ec948300f05d5da560a445c57c681e6436d580dea7f6f3abb5d31ed",
      "size": 2006,
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:57334494c3b33b286cfcb6e23f3952b39409cc277f3d9381d6abf63633b2b405",
      "size": 2006,
      "platform": {
        "architecture": "arm64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:c6f28ad93c9324b95b677b6eaea4d448f5e98b3418ba654b937b39dc9a27635f",
      "size": 566,
      "annotations": {
        "vnd.docker.reference.digest": "sha256:3be65aee3ec948300f05d5da560a445c57c681e6436d580dea7f6f3abb5d31ed",
        "vnd.docker.reference.type": "attestation-manifest"
      },
      "platform": {
        "architecture": "unknown",
        "os": "unknown"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:8bfc3943339ee074c323f53ab8047d532f0c36e6cc95d02e0376b72891e6c40b",
      "size": 566,
      "annotations": {
        "vnd.docker.reference.digest": "sha256:57334494c3b33b286cfcb6e23f3952b39409cc277f3d9381d6abf63633b2b405",
        "vnd.docker.reference.type": "attestation-manifest"
      },
      "platform": {
        "architecture": "unknown",
        "os": "unknown"
      }
    }
  ]
}

The contents of the release repo were populated via the following (where FROM is the version in my dev repo, and TO is the version I want to populate in my release repo):

DETAILS=$(docker buildx imagetools inspect --raw "${FROM}")
readarray -t digests < <(echo "$DETAILS" | jq -r '.manifests[] | .digest')

# For each platform, copy across the artifacts by digest
for digest in "${digests[@]}"
do
    set -x
    docker buildx imagetools create -t "${TO}@${digest}" "${FROM}@${digest}"
done

# Copy the manifest digest
manifest_digest=$(docker buildx imagetools inspect "${FROM}" | grep -E "^Digest:" | awk -F ' ' '{print $2}')
docker buildx imagetools create -t "${TO}@${manifest_digest}" "${FROM}@${manifest_digest}"

Interestingly, I couldn't get the following one-liner to fail again, so I wonder if it's because "allow redeployment" wasn't enabled on Nexus (I thought I'd tried that, so perhaps I'm missing something - i.e. something was already cached/pushed): docker buildx imagetools create -t nexus.rnd:5000/dockerimage:releasetag nexus.rnd:5001/dockerimage:devtag

With redeployment disabled, I get this error:

docker buildx imagetools create -t nexus.rnd:5000/myproject:myprodrelease nexus.rnd:5001/myproject:mydevversion
[+] Building 0.1s (1/1) FINISHED                                                                                                                                                                                                  
 => ERROR [internal] pushing nexus.rnd:5000/myproject:myprodrelease                                                                                                                   0.1s
------
 > [internal] pushing nexus.rnd:5000/myproject:myprodrelease:
0.000 copying sha256:8b9cbe70508773db632521d9496202cb64f72b65b967fd8ae48009e1868b478f from nexus.rnd:5001/myproject:mydevversion to nexus.rnd:5000/myproject:myprodrelease
------
ERROR: failed commit on ref "manifest-sha256:ad878d1cc0ce5c7cf4ec38d77d60d1e2efd30b3d0d540d508c36eb41ee96237a": unexpected status from PUT request to https://nexus.rnd:5000/v2/myproject/manifests/myprodrelease: 400 Bad Request

Would you expect redeployment to need to be enabled for docker buildx imagetools create to work? Turning this on for prod repos is uncomfortable, as if something goes wrong, our release assets could be accidentally overwritten.
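
One way to avoid re-pushing manifests the destination already has (and so avoid needing redeploy enabled for repeated runs) might be a HEAD pre-check before each per-digest copy; a sketch built on the loop above (auth omitted, names reused from the example):

# HEAD /v2/<name>/manifests/<reference> returns 200 if the manifest exists;
# skip the copy in that case instead of re-PUTting it.
if curl -sfI \
     -H "Accept: application/vnd.oci.image.manifest.v1+json" \
     -H "Accept: application/vnd.oci.image.index.v1+json" \
     "https://nexus.rnd:5000/v2/myproject/manifests/${digest}" >/dev/null; then
  echo "manifest ${digest} already present, skipping"
else
  docker buildx imagetools create -t "nexus.rnd:5000/myproject@${digest}" \
    "nexus.rnd:5001/myproject@${digest}"
fi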

phab-cpr avatar Aug 29 '23 08:08 phab-cpr

Hi @jedevc , @tonistiigi

We are facing the same issue when copying a multi-arch image using imagetools from one private AWS ECR to another private ECR with tag immutability enabled. Sample command:

docker buildx imagetools create --tag destination_ecr:tag source_ecr:tag

We are getting this error:

unexpected status from PUT request to https://account.dkr.ecr.region.amazonaws.com/v2/ecr_repo/manifest/tag: 400 Bad Request 

We can see a single architecture in the destination repo, and we get the above error during the second architecture's push.

When we disable tag immutability in the destination AWS ECR, the copy is successful. Here is the exact issue in detail: https://github.com/aws/containers-roadmap/issues/1612#issuecomment-2010567931

Can the fix be included for imagetools as well? Ref: https://github.com/docker/buildx/issues/663#issuecomment-872624233

Or please let me know if there is any workaround to make this work using imagetools. Thanks in advance! :)

abhishekaj avatar Jul 16 '24 15:07 abhishekaj