nydus
nydus copied to clipboard
nydus.remoteimage.v1 in image index causes push to fail with ECR
Hello folks, I was trying to fix #1691 when I stumbled on a similar issue for multi-arch images
Basically, when attempting to convert a multi-arch image, I can't push it to ECR and it seems that this part of the manifest is causing issue because when I remove it, it just works:
"os.features": [
"nydus.remoteimage.v1"
],
Reproducer:
nydusify convert --source nginx:latest --target my-ecr/my-image:nginx --platform "linux/amd64,linux/arm64"
INFO[2025-05-16T18:42:22+02:00] pulling image docker.io/library/nginx:latest module=converter
INFO[2025-05-16T18:42:40+02:00] pulled image docker.io/library/nginx:latest , elapse 17.986182294s module=converter
INFO[2025-05-16T18:42:40+02:00] converting image docker.io/library/nginx:latest module=converter
INFO[2025-05-16T18:42:42+02:00] converted image my-ecr/my-image:nginx , elapse 2.482347474s module=converter
INFO[2025-05-16T18:42:42+02:00] pushing image my-ecr/my-image:nginx module=converter
FATA[2025-05-16T18:42:45+02:00] push image: failed commit on ref "index-sha256:74b970f44b893767940d6a292b5eacf981fd011eea4ce9e5dcfcb973859d8098": unexpected status from PUT request to https://my-ecr/v2/my-image/manifests/nginx: 405Method Not Allowed
Or, if you want to get the manifest manually:
nydusify convert --source nginx:latest --target localhost:5000/oci:latest --platform "linux/amd64,linux/arm64"
crane manifest localhost:5000/oci:latest | jq > nydus_manifest.json
aws ecr put-image --repository-name my-repository --image-tag nginx --image-manifest file://nydus_manifest.json
content of nydus_manifest.json
{ "schemaVersion": 2, "mediaType": "application/vnd.oci.image.index.v1+json", "manifests": [ { "mediaType": "application/vnd.oci.image.manifest.v1+json", "digest": "sha256:fd2da0d77212b6acca1a6abfdfe3dbae1c4bdc7e176a5caabc1c5aa41a7d569b", "size": 4443, "annotations": { "com.docker.official-images.bashbrew.arch": "amd64", "org.opencontainers.image.base.digest": "sha256:5accafaaf0f2c0a3ee5f2dcd9a5f2ef7ed3089fe4ac6a9fc9b1cf16396571322", "org.opencontainers.image.base.name": "debian:bookworm-slim", "org.opencontainers.image.created": "2025-04-28T21:42:34Z", "org.opencontainers.image.revision": "eaf8875a1967d24cea6ed8b37109075e39ed9e43", "org.opencontainers.image.source": "https://github.com/nginx/docker-nginx.git#eaf8875a1967d24cea6ed8b37109075e39ed9e43:mainline/debian", "org.opencontainers.image.url": "https://hub.docker.com/_/nginx", "org.opencontainers.image.version": "1.27.5" }, "platform": { "architecture": "amd64", "os": "linux", "os.features": [ "nydus.remoteimage.v1" ] } }, { "mediaType": "application/vnd.oci.image.manifest.v1+json", "digest": "sha256:79b251ad8717a8fd7c70ab46b0f6f4369da385549078578513c46da5062ac626", "size": 4447, "annotations": { "com.docker.official-images.bashbrew.arch": "arm64v8", "org.opencontainers.image.base.digest": "sha256:f098d2b31ad6f5cbfa8fc974eb69f9f18dcff506c3984eae54deb8b89731bc00", "org.opencontainers.image.base.name": "debian:bookworm-slim", "org.opencontainers.image.created": "2025-04-28T22:38:21Z", "org.opencontainers.image.revision": "eaf8875a1967d24cea6ed8b37109075e39ed9e43", "org.opencontainers.image.source": "https://github.com/nginx/docker-nginx.git#eaf8875a1967d24cea6ed8b37109075e39ed9e43:mainline/debian", "org.opencontainers.image.url": "https://hub.docker.com/_/nginx", "org.opencontainers.image.version": "1.27.5" }, "platform": { "architecture": "arm64", "os": "linux", "os.features": [ "nydus.remoteimage.v1" ], "variant": "v8" } } ] }When looking a bit, I found that this field was introduced in this PR. I'm wondering if that is really needed for index manifests and if it can be omitted?
Hi @Fricounet, the nydus.remoteimage.v1 field is used differentiate to OCI v1 and nydus image, the reason for merging OCI v1 and nydus manifests into a single manifest index is historical, the feature requires support from image runtimes like containerd or podman. In practice, this field is currently unused, to ensure compatibility with ECR, I think it can be removed.
The related code: https://github.com/containerd/nydus-snapshotter/blob/88a72d945ce529053463e1095651402bd4b3822b/pkg/converter/convert_unix.go#L952 (nydusify imported the nydus-snapshotter/pkg/converter package).
@imeoer thanks for the answer, I'll work on this then too.
the feature requires support from image runtimes like containerd or podman
If I understand correctly, the goal would be to have containerd automatically chose the nydus image inside the index if the it has the nydus snapshotter enabled right? Would be quite neat indeed.
I just wonder, is it even possible currently to build a manifest like the one you linked? Using nydusify or buildkit, I don't quite see how you can end up with both in the same manifest? (as nydusify creates a new manifest in --target but with only the nydus image and buildkit can only have 1 output specified, so only nydus or targz for instance)
@Fricounet Try the nydusify convert --merge-platform --source $OCI_IMAGE --target $NYDUS_IMAGE command. :)
Ah I see ! Thank you !
@Fricounet I think we should enable the nydus.remoteimage.v1 flag only for --merge-platform, instead of index conversion.
Yes! That was my thoughts too. I'll work on doing this
This issue is stale because it has been open 60 days with no activity.
Hey @imeoer I wanted to get your opinion on something you mentioned higher in the conversation.
Namely that the os.features: nydus.remoteimage.v1 is basically useless for now because it need the container runtime to implement something to understand this. I've seen in the soci snapshotter manifest that they do a similar thing as your --merge-platform where they put both the regular OCI image and the soci one in the same index manifest:
crane manifest localhost:5000/test:soci | jq
{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.index.v1+json",
"manifests": [
{
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"digest": "sha256:04888253b77b5aff33606bb1338ef1e05661e94dd9b3bc8d5f2ba61a7831461f",
"size": 2015,
"annotations": {
"com.amazon.soci.index-digest": "sha256:c1df1fa989c00fa503613dcd0f19a24a2a2fbbf67d33b94cd035a2df43d6e286",
"target": "prod"
},
"platform": {
"architecture": "amd64",
"os": "linux"
}
},
{
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"digest": "sha256:c1df1fa989c00fa503613dcd0f19a24a2a2fbbf67d33b94cd035a2df43d6e286",
"size": 2110,
"annotations": {
"com.amazon.soci.image-manifest-digest": "sha256:04888253b77b5aff33606bb1338ef1e05661e94dd9b3bc8d5f2ba61a7831461f"
},
"platform": {
"architecture": "amd64",
"os": "linux"
},
"artifactType": "application/vnd.amazon.soci.index.v2+json"
}
]
}
and then they use the annotations present in the manifest to be able to find the soci image and leverage lazy pulling this way. Have you considered this approach? I think that's an interesting approach because this way you can have a single manifest with both images and still be able to pull efficiently if the local snapshotter supports nydus while still being able to pull the image normally on devices which don't have nydus installed (since containerd will just pull the regular image instead). All of this without needing to implement anything in the container runtime. What's do you think?
Hey @imeoer @BraveY, regarding my last comment, I did implement the feature in my fork to auto-detect the nydus manifest in the image index. Would you like me to submit a PR for it? https://github.com/DataDog/nydus-snapshotter/pull/3
@Fricounet Thanks, the PR looks good to me, feel free to submit it upstream, maybe we can only add some usage doc. :)
@Fricounet If a node has the nydus snapshotter installed and configured, it will run the nydus image, otherwise it will run the OCI v1 image, is that correct?
@imeoer
If a node has the nydus snapshotter installed and configured, it will run the nydus image, otherwise it will run the OCI v1 image, is that correct?
Yes exactly, that's based on the fact that the OCI images comes first in the list so containerd will pick it up first. When the image goes through a regular snapshotter, the OCI image is handled as usual but when it goes through the nydus-snapshotter, the snapshotter will discover the other image in the manifest and prepare it instead. Just like what the current referrer_detect feature does.
feel free to submit it upstream, maybe we can only add some usage doc. :)
Will do with some usage docs added
However, I just realized that the initial reason for this issue (ecr not supporting the os.features: nydus.remoteimage.v1 field and rejecting push events) is still a thing. Which makes this feature impossible to use there 🤦 I'll try to open an issue with them for this. Unless you have some ideas on how we could go around this limitation?
@Fricounet How about making it compatible like:
{
"schemaVersion": 2,
"manifests": [
{
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"digest": "sha256:4e4bc990609ed865e07afc8427c30ffdddca5153fd4e82c20d8f0783a291e241",
"size": 943,
"platform": {
"architecture": "amd64",
"os": "linux"
}
},
{
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"digest": "sha256:db30bb2870067ed3e0e73c7448d9f0b529169da8295b5b5155b417624d861d81",
"size": 1367,
"platform": {
"architecture": "amd64",
"os": "linux"
},
"artifactType": "application/vnd.nydus.image.manifest.v1+json"
}
]
}
Note: the newly added "artifactType": "application/vnd.nydus.image.manifest.v1+json" field.
@imeoer yes I think that would work. That's actually what soci is doing on their end too and from my tests, this manifest does seem to pass ECR validation
{
"schemaVersion": 2,
"manifests": [
{
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"digest": "sha256:a63dfddecc661e4ad05896418b2f3774022269c3bf5b7e01baaa6e851a3a4a23",
"size": 2320,
"platform": {
"architecture": "amd64",
"os": "linux"
}
},
{
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"digest": "sha256:cb559143b6929cc3fbaf0dfb0e5b166ac5c7f3421fe71c4ded2499b38512e4ce",
"size": 2320,
"platform": {
"architecture": "arm64",
"os": "linux"
}
},
{
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"digest": "sha256:7fca51156823f77b71e9d5e78c69aac7cd2d3154c6be8d6101bb4483ea383184",
"size": 3251,
"platform": {
"architecture": "amd64",
"os": "linux"
},
"artifactType": "application/vnd.nydus.image.manifest.v1+json"
},
{
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"digest": "sha256:dd72e2d01ac83c559243ef1ec8c1d96b912e66a8a3dbafacb608a7065c886475",
"size": 3251,
"platform": {
"architecture": "arm64",
"os": "linux"
},
"artifactType": "application/vnd.nydus.image.manifest.v1+json"
}
]
}
This would require a fix in the acceleration-service I believe since the converter package is located there. And then we'd need to bump the package there but we'll encounter the same issue I had trying to bump the snapshotter in #1699. We really need to try to bump nydusify for containerdv2 to support this then.
How would you like to see this implemented?
- Do you prefer to completely get rid of the
os.features: nydus.remoteimage.v1logic in the acceleration service and add theartefactTypeinstead? - or add a separate flag chain in nydusify and the acceleration service to add this artifactType and keep the existing
os.features: nydus.remoteimage.v1logic as it is right now
@Fricounet Currently no runtime (snapshotter) supports os.features: nydus.remoteimage.v1 except for build tools (e.g., nydusify), so I suggest changing it to artifactType.
Sounds good, I can do this