Cannot partially mirror fat manifests / manifest lists

Open Jamstah opened this issue 3 years ago • 7 comments

This proposal is heavily copied from https://github.com/distribution/distribution/issues/3628. I believe this proposal depends on that one.

Is your feature request related to a problem? Please describe.

This issue covers an image size concern based on the use of image indexes.

There are three things that combine here:

  • Image indexes provide platform architecture portability with low friction for clients. They list references by digest, which means you can be sure you're getting the expected content.
  • Signing enables trust in images, and signing an index is a good way to say "these are definitely the platform images you want". However, once you sign the image index, you can't change the references without invalidating the signature.
  • The distribution code validates image indexes on push to ensure that the referenced platform-specific image manifests (and therefore their blobs) exist in the registry.

Putting these three things together, there is no way to copy a subset of the architectures in an index to a mirror without losing the signature, changing the digest of the index, or losing the index and having to pull platform images directly. Forcing the user to copy all architectures, regardless of which ones they will run in their environment, makes the mirror process longer, uses more storage and network bandwidth, and increases the load on vulnerability scanning within the organisation, especially when hundreds of images are involved.
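To illustrate why trimming an index breaks the signature, here is a minimal Python sketch. It assumes a made-up two-platform index with placeholder digests, and it serialises with `json.dumps` only for demonstration; a real registry hashes the exact bytes that were pushed. The point is just that any change to the list of references yields a different index digest, so a signature made over the original digest no longer applies.

```python
import hashlib
import json


def digest(payload: bytes) -> str:
    """Return a sha256 content digest in the usual 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(payload).hexdigest()


# Hypothetical image index; the manifest digests and sizes are placeholders.
index = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {
            "mediaType": "application/vnd.oci.image.manifest.v1+json",
            "digest": "sha256:" + "a" * 64,
            "size": 1234,
            "platform": {"architecture": "amd64", "os": "linux"},
        },
        {
            "mediaType": "application/vnd.oci.image.manifest.v1+json",
            "digest": "sha256:" + "b" * 64,
            "size": 1234,
            "platform": {"architecture": "s390x", "os": "linux"},
        },
    ],
}

original_bytes = json.dumps(index, sort_keys=True).encode()

# Keep only the amd64 entry, as a mirror that "rewrites" the index would.
trimmed = dict(
    index,
    manifests=[
        m for m in index["manifests"] if m["platform"]["architecture"] == "amd64"
    ],
)
trimmed_bytes = json.dumps(trimmed, sort_keys=True).encode()

# The two digests differ, so a signature over the original index digest
# cannot be reused for the trimmed index.
print(digest(original_bytes))
print(digest(trimmed_bytes))
```

The only way to keep the signed digest is to push the original index bytes unchanged, which is exactly what the registry's reference validation prevents when some platforms are missing.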

Describe the solution you'd like

I'd like to avoid these pitfalls by making it possible to push an index even if its references are missing, if the registry admin configures it that way.
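As a rough sketch of the requested behaviour (not the actual distribution or Harbor code), the push-time check could be gated on an admin setting. The `Registry` class, the `validate_index_references` option, and the `put_index` method below are all hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Registry:
    # Hypothetical admin option; True mimics today's strict validation.
    validate_index_references: bool = True
    # Digests of manifests already stored in this toy registry.
    manifests: Set[str] = field(default_factory=set)

    def put_index(self, index_digest: str, referenced: List[str]) -> List[str]:
        """Store an index and return the referenced digests that are missing."""
        missing = [d for d in referenced if d not in self.manifests]
        if missing and self.validate_index_references:
            # Strict mode: reject the push, as described in the issue.
            raise ValueError(f"unknown manifests referenced by index: {missing}")
        # Relaxed mode: accept the index unchanged, so its digest and
        # signature stay valid even though some platforms are absent.
        self.manifests.add(index_digest)
        return missing


# Only the amd64 manifest has been mirrored.
relaxed = Registry(validate_index_references=False, manifests={"sha256:amd64"})
print(relaxed.put_index("sha256:index", ["sha256:amd64", "sha256:s390x"]))
```

With the option left at its default, the same push raises an error, which matches the current behaviour; a client pulling a missing platform from the relaxed registry would simply get a not-found response for that manifest.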

My vested interest in this is that I work for IBM developing cloud paks. Our customers use multiple different architectures, but customers don't want to have to mirror every architecture to get the images they want into their restricted network environments. As developers, we want to use image indexes to simplify deployments, support multi architecture k8s clusters, and sign everything to secure deployments, so would prefer a way for customers to mirror partial image indexes over having to not use image indexes at all.

Describe the main design/architecture of your solution

I have submitted a PR to distribution/distribution, see: https://github.com/distribution/distribution/issues/3628

Describe the development plan you've considered

I have submitted a PR to distribution/distribution. The work for Harbor would be to update to a newer level of distribution and add the configuration options, along with documentation.

Additional context

This is related to a discussion I started on the opencontainers list. The result of that discussion was to make it clearer in the spec that it is perfectly valid behaviour for registries not to validate the existence of referenced platform-specific images:

  • https://groups.google.com/a/opencontainers.org/g/dev/c/Uw8xdBOr444
  • https://github.com/opencontainers/distribution-spec/commit/3c2316eb9ec71117a222f0274fc9e716dd3f892b

I have contributed changes to skopeo to enable mirroring of image indexes without mirroring the underlying platforms:

  • https://github.com/containers/skopeo/pull/1511

I have contributed changes to containers/image to improve error messages where image indexes are missing platform specific images:

  • https://github.com/containers/image/pull/1550

Jamstah • May 18 '22 12:05

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] • Jul 05 '22 09:07

Quay have this on their backlog too: https://issues.redhat.com/browse/PROJQUAY-3114

Jamstah • Jul 12 '22 17:07

I have put this on the agenda for the next community meeting to find out whether other people have the same requirements.

https://github.com/goharbor/community/blob/main/MEETING_SCHEDULE.md

Jamstah • Jul 12 '22 17:07

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] • Sep 11 '22 09:09

We still care about this

Jamstah • Sep 11 '22 11:09

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] • Nov 12 '22 09:11

We definitely still care about this issue

Jamstah • Nov 12 '22 10:11

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] • Jan 12 '23 09:01

This issue was closed because it has been stalled for 30 days with no activity. If this issue is still relevant, please re-open a new issue.

github-actions[bot] • Feb 12 '23 09:02