image-spec Artifacts clarification

Closes https://github.com/opencontainers/image-spec/issues/1131
Relates to https://github.com/opencontainers/image-spec/pull/1137
Follow-up to #999

Attempt to better clarify the conceptual differences between artifacts and images (which I tie to the config as described in this spec), and provide guidance on how to detect artifacts from code.

Also attempt to clarify how descriptors interact with artifacts in the aftermath of #999.

Oct 17 '23 20:10 neersighted

I have a high level concern that we're trying to document the mixture of runnable Images and Artifacts in the same Index. Is that something we want to standardize in OCI, or would runtimes prefer that they are only asked to run an entry from an Index of only Images?

Yes, this is a realistic use-case with inline metadata today, e.g. what BuildKit does.

I'm wondering if we want to differentiate between a runtime unsure how to run the requested content vs a runtime not finding any Image content in an Index.

I think this is behavior defined by the runtime; what I want to make clear is that a runtime can interpret new artifact types if it has explicit support for them. But the default should be "ignore any content I do not know how to interpret" to obviate the need for platform: unknown/unknown and similar hacks with regard to platform selection.

Oct 27 '23 05:10 neersighted

I have a high level concern that we're trying to document the mixture of runnable Images and Artifacts in the same Index. Is that something we want to standardize in OCI, or would runtimes prefer that they are only asked to run an entry from an Index of only Images?

Yes, this is a realistic use-case with inline metadata today, e.g. what BuildKit does.

I want to be careful that we don't say "buildkit implemented it, therefore OCI needs to recommend it". It's not unlike how cosign pushed manifests that looked like an OCI image with a different layer type to get around registries that filtered unknown config mediaType values. Even though the behavior works and doesn't violate the spec, I'm not incline to recommend that as a method to push artifacts.

Before I weigh in on the mixed content in an Index scenario, I'd like to hear from the impacted users that would be parsing that content (runtime maintainers).

Oct 27 '23 14:10 sudo-bmitch

cc @dmcgowan re: runtime behavior with mixed indexes.

Nov 08 '23 14:11 neersighted

Given the various scenarios we're seeing, and knowing there will be more scenarios in the future, perhaps instead of being prescriptive in exactly what fields are defined and we should keep it at a higher level. Something like "container images are any content pushed to the registry designed to be executed by a container runtime, artifacts are everything else."

There will be artifacts that want to have a valid platform defined, and other artifacts that want to use the image config media type. We may be able to avoid the latter with guidance in the spec, but that value isn't visible in the Index. And expecting runtimes to avoid trying to run an artifact image based on a new field in the descriptor (i.e. artifactType) is going to be error prone for existing runtimes that aren't aware of new fields.

The two options I can think of are either to ensure artifacts are listed after any container images in an index, so that the selector for container runtimes selects the first match, or OCI recommends against mixing container images and artifacts in the same index for the best support of older runtimes.

Nov 20 '23 19:11 sudo-bmitch

Why would we prohibit container runtimes, and their fat clients, from executing/processing artifacts?

My phrasing is that container runtimes trying to execute/process it makes it a container image. It's not restricting what runtimes can do, it's defining whether it's a container image or artifact based on whether runtimes could execute/process it. It leaves flexibility for runtimes to extend to process more content in the future without us needing to adjust the image-spec for every runtime change.

Nov 20 '23 21:11 sudo-bmitch

Given the various scenarios we're seeing, and knowing there will be more scenarios in the future, perhaps instead of being prescriptive in exactly what fields are defined and we should keep it at a higher level. Something like "container images are any content pushed to the registry designed to be executed by a container runtime, artifacts are everything else."

There will be artifacts that want to have a valid platform defined, and other artifacts that want to use the image config media type. We may be able to avoid the latter with guidance in the spec, but that value isn't visible in the Index. And expecting runtimes to avoid trying to run an artifact image based on a new field in the descriptor (i.e. artifactType) is going to be error prone for existing runtimes that aren't aware of new fields.

The two options I can think of are either to ensure artifacts are listed after any container images in an index, so that the selector for container runtimes selects the first match, or OCI recommends against mixing container images and artifacts in the same index for the best support of older runtimes.

IMO it is not a good idea to redefine (is/are language) what a container image is inside the artifacts guidance documentation.

Let's not leave out the index may or may not include platform filters, and "implementations SHOULD support" "application/vnd.oci.image.index.v1+json (nested index)" and "an encountered mediaType that is unknown to the implementation MUST NOT generate an error."

Nov 20 '23 22:11 mikebrow

My phrasing is that container runtimes trying to execute/process it makes it a container image. It's not restricting what runtimes can do, it's defining whether it's a container image or artifact based on whether runtimes could execute/process it. It leaves flexibility for runtimes to extend to process more content in the future without us needing to adjust the image-spec for every runtime change.

I think it's an oci container image if it's in a format conforming to the image spec for storage or transport, and then at runtime, by conversion and augmentation via a container runtime, if it is determined by a runc compatible runtime to then adhere to the runtime spec config and rootfs requirements of a runtime bundle for a platform.

Nov 20 '23 22:11 mikebrow

artifacts are a grey space wrt. whether they can be considered a part of or augmentation to layers/config/index, guidance, filtering mechanisms, additional image config, even possibly a key store, alg, or content for a new digest type... runnable or not is probably up to the iana type or we possibly break with existing spec restrictions.

100% agree we need to avoid a client reading an artifact, via matching, as a default platform image if/when that is not the intended result.

Nov 20 '23 22:11 mikebrow

I like the high level definition here: https://github.com/opencontainers/image-spec/blob/main/spec.md#overview

At a high level the image manifest contains metadata about the contents and dependencies of the image including the content-addressable identity of one or more filesystem layer changeset archives that will be unpacked to make up the final runnable filesystem. The image configuration includes information such as application arguments, environments, etc. The image index is a higher-level manifest which points to a list of manifests and descriptors. Typically, these manifests may provide different implementations of the image, possibly varying by platform or other attributes.

Nov 20 '23 22:11 mikebrow

Let's not leave out the index may or may not include platform filters, and "implementations SHOULD support" "application/vnd.oci.image.index.v1+json (nested index)" and "an encountered mediaType that is unknown to the implementation MUST NOT generate an error."

An issue attempting to be solved in this PR is runtimes picking container images from a mixed index with artifacts included in the manifest list. The artifact entries could have matching platform values, and known media type values. My comments above are looking at what options, if any, we have to resolve that. I don't believe "MUST NOT generate an error" helps in this scenario since the values would be known.

Nov 20 '23 22:11 sudo-bmitch

An issue attempting to be solved in this PR is runtimes picking container images from a mixed index with artifacts included in the manifest list. The artifact entries could have matching platform values, and known media type values. My comments above are looking at what options, if any, we have to resolve that. I don't believe "MUST NOT generate an error" helps in this scenario since the values would be known.

absolutely... 100% agree we need to avoid a client reading an artifact, via matching, as a default platform image if/when that is not the intended result.

my point in including the additional cites was as a reminder:

that said artifacts might be in a nested index vs.. the platform index..
if the manifest type is unknown it must not generate an error...

and those current rules may be useful in finding a resolution..

For example, a) push said artifacts into an index under the platform index... (and check if clients actually recurse today for nested indexes... suggesting a solution here where platform (optional) "index" is for container image manifests and nested indexes not for artifact type manifests irregardless of the desire to filter artifacts by platform b) add a new manifest type for an artifact index, allowing platform (optional) "index" to have nested platform (optional) "indexes" and platform (optional) "artifact indexes", and the new "artifact index" would be a child of another index and contain further artifact indexes and image manifest with an artifact type.

Nov 20 '23 23:11 mikebrow