distribution-spec Proposal: add (optional) section on content-negotiation

The distribution spec currently does not describe active content negotiation, leaving it up to individual implementations whether (and how) content negotiation should be performed.

This proposal is still a draft, and there are many "blanks" to fill, but I thought I'd open it to allow a discussion on this topic.

1. "Problem" statement

Currently, the client is responsible for picking the right variant for multi-manifest ("multi arch") repositories. While this works well, it requires multiple steps, and multiple requests to get the image manifest;

(GET /v2)
(optional) check if cache can be used: HEAD /v2/<name>/manifests/<reference>
fetch manifest-list: GET /v2/<name>/manifests/<reference>
(parse manifest-list and resolve digest for best match)
fetch image manifest: GET /v2/<name>/manifests/<DIGEST>
fetch blob: GET /v2/<name>/blobs/<digest> (repeat for each blob)
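
The client-side selection step above (parse the manifest list and resolve the digest for the best match) could look roughly like the sketch below. This is a minimal illustration in Python, not taken from any existing client; the `select_digest` helper and the sample digests are hypothetical, and real clients apply more elaborate matching (variant fallbacks, OS versions).

```python
from typing import Optional


def select_digest(manifest_list: dict, os: str, architecture: str,
                  variant: Optional[str] = None) -> Optional[str]:
    """Return the digest of the first entry matching the requested platform."""
    for entry in manifest_list.get("manifests", []):
        platform = entry.get("platform", {})
        if platform.get("os") != os:
            continue
        if platform.get("architecture") != architecture:
            continue
        # Only compare the variant when the caller asked for a specific one.
        if variant is not None and platform.get("variant") != variant:
            continue
        return entry["digest"]
    return None


# Hypothetical manifest list; the digests are shortened placeholders.
example_list = {
    "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
    "manifests": [
        {"digest": "sha256:aaaa",
         "platform": {"os": "linux", "architecture": "amd64"}},
        {"digest": "sha256:bbbb",
         "platform": {"os": "linux", "architecture": "arm", "variant": "v7"}},
    ],
}

print(select_digest(example_list, "linux", "arm", "v7"))
```

The returned digest is what the client would then use for the GET /v2/<name>/manifests/<DIGEST> request.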

Content-negotiation can avoid multiple requests, which could bring some performance improvements and (if implemented by the registry) simplify logic in the client.

In addition, content-negotiation can assist in providing backward-compatibility (as outlined in the next section).

2. Existing uses of content-negotiation

Content negotiation is already in use by some registry implementations, such as Docker Hub (mostly for backward-compatibility). Here are some tests against Docker Hub;

Note: for formatting on GitHub, I removed application/vnd.docker.distribution prefixes for the content-types in the table below.

| `Accept` | Multi? | Result | Notes |
|---|---|---|---|
| `<not present>` | :white_check_mark: | `manifest.v1+prettyjws` | `v1`, `linux/amd64`, for backward compatibility with old, non-v2 clients |
| `*/*` | :white_check_mark: | `manifest.v1+prettyjws` | `v1`, `linux/amd64`, for backward compatibility with old, non-v2 clients (same as above) |
| `application/json` | :white_check_mark: | `manifest.v1+prettyjws` | `v1`, `linux/amd64`, for backward compatibility with old, non-v2 clients (same as above) |
| `manifest.v1+json` | :white_check_mark: | `manifest.v1+prettyjws` | `v1`, matching `Accept` header (`linux/amd64` for backward compatibility) |
| `manifest.v2+json` | :white_check_mark: | `manifest.v2+json` | `v2`, matching `Accept` header (`linux/amd64` for backward compatibility) |
| `manifest.list.v2+json` | :white_check_mark: | `manifest.list.v2+json` | `v2 manifest list`, matching `Accept` header |
| `manifest.list.v2+json`, `manifest.v2+json`, `manifest.v1+json` | :white_check_mark: | `manifest.list.v2+json` | `v2 manifest list`, matching first `Accept` header |
| `manifest.v2+json`, `manifest.list.v2+json`, `manifest.v1+json` | :white_check_mark: | `manifest.list.v2+json` | :warning: prefers manifest list over manifest (ignoring order?) |
| `<not present>` | - | `manifest.v1+prettyjws` | same as multi-manifest repo |
| `*/*` | - | `manifest.v1+prettyjws` | same as multi-manifest repo |
| `application/json` | - | `manifest.v1+prettyjws` | same as multi-manifest repo |
| `manifest.v1+json` | - | `manifest.v1+prettyjws` | same as multi-manifest repo |
| `manifest.v2+json` | - | `manifest.v2+json` | same as multi-manifest repo |
| `manifest.list.v2+json` | - | `manifest.v1+prettyjws` | :warning: somewhat unexpected to return a v1 manifest for a v2-capable client |
| `manifest.list.v2+json`, `manifest.v2+json`, `manifest.v1+json` | - | `manifest.v2+json` | `v2 manifest`, matching first "acceptable" `Accept` header (by lack of a manifest-list) |
| `manifest.v2+json`, `manifest.list.v2+json`, `manifest.v1+json` | - | `manifest.v2+json` | `v2 manifest`, matching first "acceptable" `Accept` header (by lack of a manifest-list) |

To try these (use repo library/hello-world for multi-manifest, and armhf/hello-world for single manifest);

export repo=library/hello-world

export token="$(curl -fsSL "https://auth.docker.io/token?service=registry.docker.io&scope=repository:${repo}:pull" | jq --raw-output '.token')";

curl -v -X HEAD -I -fsSL -H "Authorization: Bearer $token" \
    -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
    -H 'Accept: application/vnd.docker.distribution.manifest.list.v2+json' \
    -H 'Accept: application/vnd.docker.distribution.manifest.v1+json' \
    "https://registry-1.docker.io/v2/${repo}/manifests/latest"

2.1. Omissions in current implementations

Besides some (edge) cases highlighted (:warning:) in the previous section, current implementations appear to have some omissions;

No Vary header is returned (rfc7231, section 7.1.4)
No 300 or 406 statuses are used

3. Proposal: server-side content-negotiation as optional feature

I suggest this feature to be OPTIONAL to keep backward compatibility with existing registries, and to facilitate static registries (which would not be able to perform server-side content negotiation).

Registries that do not perform content-negotiation would return one of;

a manifest-list for multi-arch repositories (allowing the client to select the best-matching variant)
a v2 manifest for v2 repositories
a v1 manifest for v1 repositories

3.1. Approaches to content-negotiation

Looking for recommendations, and "correct" approaches, I did some reading-up on the HTTP RFCs. From those, there are multiple alternatives, see (rfc7231, section 5.3 - "Content Negotiation", rfc7231, section 3.4.1 - "Proactive Negotiation", and rfc7231, section 3.4.2 - "Reactive Negotiation"), each with their respective "pros" and "cons".

3.2. "Proactive Negotiation" rfc7231, section 3.4.1

With proactive negotiation, the registry picks the most suitable variant, and in case no acceptable variant can be selected, can return a 406 Not Acceptable response, allowing the client to select the variant;

Proactive negotiation is advantageous when the algorithm for
selecting from among the available representations is difficult to
describe to a user agent, or when the server desires to send its
"best guess" to the user agent along with the first response (hoping
to avoid the round trip delay of a subsequent request if the "best
guess" is good enough for the user).  In order to improve the
server's guess, a user agent MAY send request header fields that
describe its preferences.

The RFC does come with some warnings (outlined under "Proactive negotiation has serious disadvantages" in the RFC), although not all of those would apply to the distribution-spec.

If no acceptable match is found, a 406 Not Acceptable (rfc7231, section 6.5.6) can be returned, as outlined in rfc7231, section 5.3.2

(...) If the header field is present in a request and none of the available
representations for the response have a media type that is listed as acceptable,
the origin server can either honor the header field by sending a 406 (Not
Acceptable) response or disregard the header field by treating the
response as if it is not subject to content negotiation.

Note: current implementations that handle content negotiation do not return a 406 Not Acceptable (rfc7231, section 6.5.6) status if no acceptable alternative is found. Neither do they return a Vary header to indicate which request headers were considered during negotiation.

Despite the downsides mentioned, this variant;

addresses the "multiple requests"; a client can specify what variants are acceptable, and get the best match directly from the registry.
largely corresponds with current implementations that perform content-negotiations; the registry selects the variant to serve.
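
As a rough illustration of proactive negotiation on the registry side, the sketch below picks a media type from the `Accept` header, returns a 406 when nothing is acceptable, and always sets `Vary: Accept`. It is a simplified Python sketch, not how any existing registry works: the `negotiate` and `parse_accept` helpers are hypothetical, only exact media types and `*/*` are handled, `available` is assumed to be non-empty and in server preference order, and breaking ties by the order in which entries were sent is an assumption.

```python
from typing import Dict, List, Optional, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media_type, q) pairs, keeping header order."""
    result = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        if not media_type:
            continue
        q = 1.0  # default weight when no q parameter is present
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        result.append((media_type, q))
    return result


def negotiate(accept: Optional[str],
              available: List[str]) -> Tuple[int, Optional[str], Dict[str, str]]:
    """Return (status, media_type, headers) for the given Accept header."""
    headers = {"Vary": "Accept"}
    if not accept:
        # No Accept header: any media type is acceptable to the client.
        return 200, available[0], headers
    best, best_q = None, 0.0
    for media_type, q in parse_accept(accept):
        if media_type == "*/*":
            candidate = available[0]
        elif media_type in available:
            candidate = media_type
        else:
            continue
        # Strictly greater: on equal q, the earlier Accept entry wins.
        if q > best_q:
            best, best_q = candidate, q
    if best is None:
        return 406, None, headers
    return 200, best, headers
```

A registry that prefers to "disregard the header field" instead of sending a 406 would return `available[0]` in the last branch.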

3.3. "Reactive Negotiation" rfc7231, section 3.4.2

This variant mostly matches the behavior of current registries that do not perform content negotiation, but offers slightly more "assistance":

With reactive negotiation (a.k.a., agent-driven negotiation),
selection of the best response representation (regardless of the
status code) is performed by the user agent after receiving an
initial response from the origin server that contains a list of
resources for alternative representations. (...)

The above roughly corresponds with the registry returning a manifest-list, although the spec seems to indicate the response can be both the actual content ("best manifest"), and a list of alternatives;

(...) If the user agent is not satisfied by the initial response representation,
it can perform a GET request on one or more of the alternative resources (...)

Reactive negotiation can return a 406 Not Acceptable (rfc7231, section 6.5.6) or 300 Multiple Choices (rfc7231, section 6.4.1) response, the latter including a Location header with the "best alternative" as determined by the registry, and which the client can respect. In both cases, the response body SHOULD return a list of alternatives. The spec is unclear what format this list should have, but in case of the registry spec, I think a "manifest-list" would fit this bill.

Reactive negotiation suffers from the disadvantages of transmitting a
list of alternatives to the user agent, which degrades user-perceived
latency if transmitted in the header section, and needing a second
request to obtain an alternate representation.  Furthermore, this
specification does not define a mechanism for supporting automatic
selection, though it does not prevent such a mechanism from being
developed as an extension.

Although this approach somewhat corresponds with the current distribution spec, there are differences, and points of attention;

Current implementations do not return 406 Not Acceptable or 300 Multiple Choices responses
While the 300 Multiple Choices response has a Location header that the client MAY follow, there is a likely possibility that clients automatically follow the redirect.
If a client decides to accept the Location redirect, doing so may (should) strip authentication, making this not a good option.
As mentioned, this alternative does not bring the benefits of "Proactive Negotiation" (no reduction of requests).

4. Proposal/example: automatic platform selection (os, os-version, arch, variant)

Note: this part is written with images distributed through the registry in mind. With other content types becoming more common for distribution through OCI registries, similar proposals/approaches could be taken for other content.

A client can specify multiple variants that are acceptable, optionally with a "Quality" (q) parameter to indicate relative weight (see rfc7231, section 5.3.1 and rfc7231, section 5.3.2);

Note: rfc7231, section 5.3.2 mentions the q value not being "optional" in some cases. I had trouble grasping that section, so parameter-order needs some double-checking against the specs.
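
To make the weighting concrete, here is a minimal Python sketch of how a registry might order the entries of such a request. The `platform` parameter is the one proposed in this section (not an existing registry feature), `rank_accept` is a hypothetical helper, and keeping the sent order for entries with equal weight is an assumption of this sketch. Entries without a `q` parameter get the default weight of 1 (rfc7231, section 5.3.1).

```python
from typing import List, Optional, Tuple


def rank_accept(values: List[str]) -> List[Tuple[str, Optional[str], float]]:
    """Rank Accept entries by q (highest first), then by the order given.

    `values` holds one string per Accept header line; each line may itself
    contain several comma-separated media ranges.
    """
    entries = []
    for value in values:
        for item in value.split(","):
            parts = [p.strip() for p in item.split(";") if p.strip()]
            if not parts:
                continue
            media_type, platform, q = parts[0], None, 1.0
            for param in parts[1:]:
                name, _, val = param.partition("=")
                name = name.strip().lower()
                if name == "q":
                    q = float(val)
                elif name == "platform":
                    # Values may be quoted; strip the quotes.
                    platform = val.strip().strip('"')
            entries.append((media_type, platform, q))
    # sorted() is stable, so entries with equal q keep their original order.
    return sorted(entries, key=lambda e: -e[2])


V2 = "application/vnd.docker.distribution.manifest.v2+json"
ranked = rank_accept([
    V2 + ";platform=linux/arm/v5;q=0.5",
    V2 + ";platform=linux/arm/v8",
    V2 + ";platform=linux/arm/v7;q=0.7",
])
for media_type, platform, q in ranked:
    print(platform, q)
```

One consequence worth double-checking: with a default weight of 1, an un-parameterized fallback entry without `q` would rank above a platform-specific entry with `q=0.7`, so fallback entries may need an explicit lower `q`.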

curl -X HEAD -I -fsSL -H "Authorization: Bearer $token" \
    -H 'Accept: application/vnd.docker.distribution.manifest.v2+json;platform=linux/arm/v8' \
    -H 'Accept: application/vnd.docker.distribution.manifest.v2+json;platform=linux/arm/v7;q=0.7' \
    -H 'Accept: application/vnd.docker.distribution.manifest.v2+json;platform=linux/arm/v5;q=0.5' \
    -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
    -H 'Accept: application/vnd.docker.distribution.manifest.list.v2+json' \
    "https://registry-1.docker.io/v2/${repo}/manifests/latest"

If no q parameter is passed, variants should be weighted in the order specified;

curl -X HEAD -I -fsSL -H "Authorization: Bearer $token" \
    -H 'Accept: application/vnd.docker.distribution.manifest.v2+json;platform=linux/arm/v8' \
    -H 'Accept: application/vnd.docker.distribution.manifest.v2+json;platform=linux/arm/v7' \
    -H 'Accept: application/vnd.docker.distribution.manifest.v2+json;platform=linux/arm/v5' \
    -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
    -H 'Accept: application/vnd.docker.distribution.manifest.list.v2+json' \
    "https://registry-1.docker.io/v2/${repo}/manifests/latest"

Wildcards should be acceptable, and equivalent to omitting a property. For example platform=linux/arm/* and platform=linux/arm are equivalent ("any linux arm image").
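
The wildcard rule could be implemented along these lines. This is a minimal Python sketch assuming the `os/architecture/variant` string form used in the examples; `parse_platform` and `platform_matches` are hypothetical helpers, not part of any spec or library.

```python
from typing import Dict, Optional


def parse_platform(value: str) -> Dict[str, Optional[str]]:
    """Split "os/architecture/variant"; "*" or a missing part becomes None."""
    parts = value.split("/")
    # Pad so that "linux/arm" is treated like "linux/arm/*".
    parts += [None] * (3 - len(parts))
    keys = ("os", "architecture", "variant")
    return {k: (None if p in (None, "*", "") else p) for k, p in zip(keys, parts)}


def platform_matches(pattern: str, platform: Dict[str, str]) -> bool:
    """True if every property set in `pattern` equals the one in `platform`."""
    wanted = parse_platform(pattern)
    return all(v is None or platform.get(k) == v for k, v in wanted.items())


armv7 = {"os": "linux", "architecture": "arm", "variant": "v7"}
print(platform_matches("linux/arm/*", armv7))
print(platform_matches("linux/arm", armv7))
print(platform_matches("linux/arm/v8", armv7))
```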

Wildcards can also be useful for OS-versions, particularly for Windows images, which (lacking a stable Windows kernel API) come in variants specific to Kernel versions. Kernel stability has improved with recent versions of Windows, however, and should not be a problem when using "Hyper-V" isolation, in which case any kernel version could be acceptable for clients.

However, using the suggested string representation may require additional changes:

NOTE

While the image-spec defines the Platform descriptor, I don't think it currently specifies how the platform should be represented as a string. Currently, platforms are represented as a JSON struct/map/object (pick your preferred name for this);

Windows image:
{
  "platform": {
    "architecture": "amd64",
    "os": "windows",
    "os.version": "10.0.17763.1577"
  }
}
Linux, ARM v5
{
  "platform": {
    "architecture": "arm",
    "os": "linux",
    "variant": "v5"
  }
}

As an alternative to the string representation, separate parameters could be provided for each property, for example:

Accept: application/vnd.docker.distribution.manifest.v2+json;platform.os=linux;platform.architecture=arm;platform.variant=v8

Omitting a property is equivalent to * (e.g. platform.variant=*)
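
Both representations can be derived from the JSON platform object. The sketch below is a hypothetical Python illustration: neither helper exists in the image-spec, the `platform.<name>` parameter names are assumptions, and parameters are joined with `;` because that is the parameter separator within an `Accept` media range (a `,` would start a new media range).

```python
def platform_to_string(platform: dict) -> str:
    """Join os, architecture and (if set) variant with "/"."""
    parts = [platform["os"], platform["architecture"]]
    if platform.get("variant"):
        parts.append(platform["variant"])
    return "/".join(parts)


def platform_to_params(platform: dict) -> str:
    """Render each property as a separate "platform.<name>=<value>" parameter."""
    return ";".join(f"platform.{name}={value}" for name, value in platform.items())


linux_armv5 = {"architecture": "arm", "os": "linux", "variant": "v5"}
windows = {"architecture": "amd64", "os": "windows", "os.version": "10.0.17763.1577"}

print(platform_to_string(linux_armv5))
print(platform_to_params(windows))
```

Note that the string form has no obvious place for `os.version`, which is one argument for separate parameters.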

5. Backward compatibility

(I think) This proposal would be backward-compatible with existing implementations that perform content-negotiation;

clients that want to perform client-side content-negotiation can explicitly request the manifest list (Accept: application/vnd.docker.distribution.manifest.list.v2+json), or put the manifest-list before the (v2) manifest in the list of Accept headers.
if a client does not send an Accept header, registries continue to select the "most appropriate" response (see "Backward compatibility (pre-v2 clients)" below)

5.1. Backward compatibility (v2 clients)

For backward-compatibility, clients MUST include an application/vnd.docker.distribution.manifest.list.v2+json and/or an application/vnd.docker.distribution.manifest.v2+json (without platform specification or q parameter) in the list, in order of preference. The platform and q parameters must be omitted for compatibility with current implementations, which may not expect/handle these.
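
A client could assemble such a header as in the following Python sketch. The `build_accept` helper is hypothetical, the `platform` parameter is the one proposed in this issue, and placing the plain fallbacks last is an assumption about how a client would express its preference.

```python
from typing import List

MANIFEST_V2 = "application/vnd.docker.distribution.manifest.v2+json"
MANIFEST_LIST_V2 = "application/vnd.docker.distribution.manifest.list.v2+json"


def build_accept(platforms: List[str], prefer_list: bool = False) -> str:
    """Build an Accept value: platform-specific entries, then plain fallbacks."""
    entries = [f"{MANIFEST_V2};platform={p}" for p in platforms]
    # Fallbacks carry no platform or q parameter, for existing registries.
    fallbacks = [MANIFEST_V2, MANIFEST_LIST_V2]
    if prefer_list:
        fallbacks.reverse()
    return ", ".join(entries + fallbacks)


print(build_accept(["linux/arm/v8", "linux/arm/v7"]))
```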

5.2. Backward compatibility (pre-v2 clients)

In case a registry supports both v1 and v2 manifests, and a client does not send an Accept header (or sends an Accept: */*), which would be the case for pre-v2 clients, the registry SHOULD return the oldest manifest type it supports (v1).

Or, as outlined in rfc7231, section 5.3.2

(...) A request without any Accept header field implies that the user agent
will accept any media type in response. (...)

and

(...) If the header field is present in a request and none of the available
representations for the response have a media type that is listed as acceptable,
the origin server can either honor the header field by sending a 406 (Not
Acceptable) response or disregard the header field by treating the
response as if it is not subject to content negotiation.

Nov 12 '20 13:11 thaJeztah

Hmm, I am not sure I see any reason for conneg in a content addressable store. It kind of makes sense for the old manifest formats, although arguably we should have used a different endpoint. A modern registry should only return what was uploaded, so there is no choice to negotiate.

Nov 12 '20 14:11 justincormack

@thaJeztah - I think this information is useful, but does not actually change the API defined in the spec (correct me if I'm wrong). It provides a useful framework of how one might leverage the API effectively.

Do you take any issue with scheduling this to be included in a 1.1.0 release?

Nov 12 '20 17:11 jdolitsky

@jdolitsky this is essentially my first point from https://github.com/opencontainers/distribution-spec/issues/211 described way more completely than I ever could have imagined.

As it stands, it's impossible to read this spec and write a client that works with Docker Hub (really, most registries), which seems pretty important.

Even if we don't want to do any content negotiation in this spec, we at least need to document something about this "legacy" (in quotes because AFAIK this is in prod for all major registries) content negotiation for 1.0.

Nov 12 '20 17:11 jonjohnsonjr

Hmm, I am not sure I see any reason for conneg in a content addressable store

I agree that for /v2/<name>/manifests/<DIGEST> it doesn't make sense (that's content-addressable), but /v2/<name>/manifests/<TAG> is not content-addressable, and (I think) could do content-negotiation.

I think this information is useful, but does not actually change the API defined in the spec (correct me if I'm wrong). It provides a useful framework of how one might leverage the API effectively.

I see it as an optional extension to the API, but it would help to improve interoperability, and define how registries can handle content negotiation (what dimensions (Accept headers, including parameters) should/can be used), and what fallbacks to implement for older clients (or how to respond to clients not sending Accept headers if they do implement content-negotiation).

Nov 12 '20 17:11 thaJeztah

I agree that for /v2/<name>/manifests/<DIGEST> it doesn't make sense (that's content-addressable), but /v2/<name>/manifests/<TAG> is not content-addressable, and (I think) could do content-negotiation.

I think this is a useful distinction to make. The original spec didn't speak to this at all, and I'm curious about the behavior for other registries. GCR will just 404 if you ask for a manifest by digest without the appropriate accept headers, which has led to some confusion. It would be nice to settle on an expected behavior here.

It also seems like not mentioning the Accept headers at all puts registries in a really weird spot:

A client author reading this spec will see no indication that they need to supply Accept headers, so they will be missing on manifest fetches. A registry that expects Accept headers so that it can support older clients will be non-conforming by returning a schema 1 manifest, and the client should (rightly, per the spec) complain about the registry. What is a registry operator to do? Do you drop support for older clients or not support newer clients?

It should be possible for a registry to implement both the docker spec and the OCI spec, and they are currently mutually exclusive (or the OCI spec is incomplete), per my reading.

Nov 12 '20 23:11 jonjohnsonjr

While manifests are not strictly content addressable, they in effect are. Registries must not change them, there may be external signatures (eg Notary) that depend on the bits. Everything needs to be round tripped for this to work correctly, although we have not made this 100% clear in the spec (@stevvooe often says it though!). There is no sane way to support a tag that can Vary based on Accept now, this is purely legacy behaviour. It's actually really weird how some registries return you garbage generated manifests if you don't put the right Accept headers, and we need to phase this out.

Nov 13 '20 11:11 justincormack

So I had a brief chat with @tonistiigi yesterday, and of course he was able to punch a big gap in my idea w.r.t. content-negotiation for multi-arch (I hate it when he does that :joy:);

While it can be useful to get just the image/arch you're interested in for a specific request; performing content-negotiation on the registry has the downside that the "root" (manifest-list) is skipped. So while it would still be possible to verify (the digest of) that image-manifest, when consuming multiple architectures, it would not be possible to verify that those images are part of the same manifest-list. For example (simplified, trying to describe what I think the problem is with that);

pull image foo:latest (Accept: platform=linux/amd64)
- content-negotiation -> resolves digest for linux/amd64 image manifest
(meanwhile) foo:latest manifest is updated; new versions for linux/amd64 and linux/armv7 are pushed
pull image foo:latest (Accept: platform=linux/armv7)
- content-negotiation -> resolves digest for linux/armv7 image manifest

Now, while both images are valid, they were never part of the same manifest-list; in other words; they contain different versions of foo:latest, and because the manifest-list itself (the "root" of the tree) is skipped, it's not possible to detect that situation.
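
The check that becomes impossible when the manifest list is skipped can be sketched as follows. This is a minimal Python illustration with made-up manifest bytes; `manifest_digest` and `is_listed` are hypothetical helpers, and the digest is computed over the manifest exactly as served.

```python
import hashlib


def manifest_digest(raw_manifest: bytes) -> str:
    """Digest of the manifest bytes exactly as served (no re-serialization)."""
    return "sha256:" + hashlib.sha256(raw_manifest).hexdigest()


def is_listed(raw_manifest: bytes, manifest_list: dict) -> bool:
    """True if the manifest's digest appears in the given manifest list."""
    digest = manifest_digest(raw_manifest)
    return any(m.get("digest") == digest
               for m in manifest_list.get("manifests", []))


# Made-up manifests standing in for two pushes of foo:latest.
amd64_old = b'{"schemaVersion": 2, "note": "amd64, first push"}'
armv7_old = b'{"schemaVersion": 2, "note": "armv7, first push"}'
armv7_new = b'{"schemaVersion": 2, "note": "armv7, second push"}'

first_list = {"manifests": [{"digest": manifest_digest(amd64_old)},
                            {"digest": manifest_digest(armv7_old)}]}

print(is_listed(amd64_old, first_list))  # pulled before the update
print(is_listed(armv7_new, first_list))  # pulled after the update
```

With server-side negotiation the client never sees `first_list`, so it has nothing to run this comparison against.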

Of course this may not be an issue if you're only interested in a single arch, but (as mentioned), could also be an issue if the manifest-list itself is expected to be signed.

content-negotiation could still be interesting to store different artifact types in a repository (e.g., store helm-charts and images in the same namespace), although those could of course all be stored in a single manifest-list, and using content-negotiation to pull either one or the other would no longer make this an optional feature (and therefore exclude static registries).

While manifests are not strictly content addressable, they in effect are. Registries must not change them, there may be external signatures (eg Notary) that depend on the bits

Do you mean GET /v2/<name>/manifests/<TAG> should always return the same (effectively: immutable tags, so latest can never be updated)? Or do you mean "should not be rewritten" based on "accept" headers (or "user-agent")?

It's actually really weird how some registries return you garbage generated manifests if you don't put the right Accept headers, and we need to phase this out.

Which would mean: registries MUST NOT perform content-negotiation, which could be a valid outcome of this proposal/ticket as well

(possibly with an addendum for backward-compat? Not sure how/when it can be phased out).

Nov 13 '20 14:11 thaJeztah

Regarding Accept headers and general content-negotiation - can somebody summarize this in 200-300 words or less and suggest which section of the spec to put it in?

Alternatively, we can create some external content-negotiation.md and link to it from spec.md.

We are presented with a lot of (good) information here with no suggestion for how to proceed.

Nov 13 '20 15:11 jdolitsky

There is no sane way to support a tag that can Vary based on Accept now, this is purely legacy behaviour. It's actually really weird how some registries return you garbage generated manifests if you don't put the right Accept headers, and we need to phase this out.

Doesn't Docker Hub? And docker/distribution? GCR does this. I would expect that most registries do this.

I agree with you that we should phase this out. In my experience, this is the biggest stumbling point for people trying to interact with registries by far.

At the same time, not including this in the spec would dramatically reduce the usefulness of the spec until it really is phased out. Registries will probably continue to return schema 1 images if you don't supply the right Accept header unless we specifically require something in this spec about NOT doing content negotiation for OCI media types (or until every customer using an old version of docker has migrated).

I see that schema 1 was deprecated about a year ago in https://github.com/docker/distribution/pull/3000. Is that sufficient warning to break any clients relying on this behavior? With my client and registry maintainer hat on, I'm 100% in favor of no longer supporting schema 1 anywhere; but, if I put on my customer support hat, that would feel irresponsible.

I think I would be okay with this:

Acknowledge current behavior

Someone using this spec to write a client will be quite likely to encounter a docker v2 schema 2 image (you don't really know what's on the other side of a GET /v2/.../manifests/<tag>). Registries use only Accept headers to determine what they should do here. If the client doesn't supply the Accept header, it will receive a v2 schema 1 image. A client based on this spec will have no idea what to do with that, but would be able to handle either a v2 schema 2 image or a manifest list just fine.

At the very least, we should mention this in the GET manifest section and link to a backwards compatibility doc or the legacy spec.

Decide and document the behavior we want

Registries will be doing this content negotiation already. We should be explicit about what they should be doing for OCI compliance. For example, GCR will not do this conversion for artifacts uploaded with OCI media types; however, we do expect clients to provide these media types in their Accept headers, and will 404 if they are not present.

What is the correct behavior here?

Nov 13 '20 19:11 jonjohnsonjr

I don't believe any new registry (ie one you create now, or indeed created in the last few years) needs to do any conneg at all, and should just be round tripping blobs as uploaded, regardless of any Accept headers. Existing registries should be able to start phasing this behaviour out I would think. It is mostly old Docker engines that would have issues, although some API clients that are not sending Accept headers now might get confused, looking at the stats on Docker Hub there are still small numbers of old clients but I rather suspect other registries have none, most are retrieving public images.

Nov 16 '20 12:11 justincormack

@justincormack just to confirm; in your view all clients should be required to understand manifest lists? (I think that's fair, I just want to be explicit).

Nov 16 '20 12:11 amouat

@amouat I think the current practise of redirecting (well actually it's not a redirect, that would be better) to the linux/amd64 version is pretty questionable yes, again there is a backward compatibility issue but it's not something that should be encouraged. If your client is not manifest list aware, you can still point it at the correct location via a different tag/sha. I don't know how many clients break on this (which versions?) but they will already not be working on Arm machines for example, so I don't see how this can be justified...

Nov 16 '20 13:11 justincormack

I've opened https://github.com/opencontainers/distribution-spec/pull/218, adding a new "content-negotiation.md" linking to this issue. I think this doc can be updated over time.

Dec 09 '20 22:12 jdolitsky

does this proposal have to be resolved before a 1.0.0?

Jan 28 '21 18:01 vbatts

Related JFrog Artifactory issue: https://www.jfrog.com/jira/browse/RTFACT-25069

May 20 '21 07:05 denysvitali

Looks like I LGTM'd https://github.com/opencontainers/distribution-spec/blob/main/content-negotiation.md but we might want to make this more "OCI specific". the "manifest.v1" refers to the docker schema1 manifest and that's not entirely clear.

How realistic is it to deprecate schema1 at this point? From there, pulls by tag can down convert if a manifest is requested by tag.

May 03 '22 03:05 stevvooe
