distribution-spec WWW-Authenticate & discovery authentication server

The current version of the specification does not say how a client should locate the authorization server.

The source Docker documentation (https://docs.docker.com/registry/spec/auth/token/) discusses using the realm parameter of WWW-Authenticate for that purpose:

Www-Authenticate: Bearer realm="https://auth.docker.io/token",service="registry.docker.io",scope="repository:samalba/my-app:pull,push"
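As an illustration only, a client could pull the parameters out of a challenge like the one above with a small parser. This is a minimal sketch, not part of any specification: the function name `parse_bearer_challenge` is hypothetical, and the regular expression assumes every parameter is a double-quoted string without escaped quotes.

```python
import re


def parse_bearer_challenge(header_value: str) -> dict:
    """Split a 'Bearer key="value",...' challenge into a dict of parameters."""
    scheme, _, rest = header_value.partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError(f"unsupported auth scheme: {scheme!r}")
    # Assumption: values are quoted strings, so a comma inside a value
    # (as in the scope below) does not split the parameter.
    return dict(re.findall(r'(\w+)="([^"]*)"', rest))


challenge = parse_bearer_challenge(
    'Bearer realm="https://auth.docker.io/token",'
    'service="registry.docker.io",'
    'scope="repository:samalba/my-app:pull,push"'
)
print(challenge["realm"])
print(challenge["service"])
print(challenge["scope"])
```

Here the realm doubles as the token endpoint URL, which is the behavior the Docker documentation describes.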

I would like to draw attention to the work carried out under draft-hardt-oauth-distributed-01, where the following ideas were proposed for this problem:

Authorization Server Discovery

Figure 1, step (A)

To access a protected resource, the client needs to learn the authorization servers or issuers that can issue access tokens that are acceptable to the protected resource. There may be one or more issuers that can issue access tokens for the protected resource. To discover the issuers, the client attempts to make a call to the protected resource URI as defined in [RFC6750] section 2.1, except with an invalid access token or no HTTP "Authorization" request header field. The client notes the hostname of the protected resource that was confirmed by the TLS connection, and saves it as the "host" attribute.

Figure 1, step (B)

The resource server responds with the "WWW-Authenticate" HTTP header that includes the "error" attribute with a value of "invalid_token" and MAY also include the "scope" and "realm" attribute per [RFC6750] section 3, and a "Link" HTTP Header per [RFC8288] that MUST include one link of relation type "resource_uri" and one or more links of type "oauth_server_metadata_uri".

For example (with extra spaces and line breaks for display purposes only):
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer realm="example_realm",
                         scope="example_scope",
                         error="invalid_token"
Link: <https://api.example.com/resource>; rel="resource_uri",
      <https://as.example.com/.well-known/oauth-authorization-server>; rel="oauth_server_metadata_uri"
The client MUST confirm the host portion of the resource URI, as specified in the "resource_uri" link, contains the "host" attribute obtained from the TLS connection in step (A). The client MUST confirm the resource URI is contained in the protected resource URI where access was attempted. The client then retrieves one or more of the OAuth Server Metadata URIs to learn how to interact with the associated authorization server per [OASM] and create a list of one or more authorization server token endpoint URLs.
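The client-side checks described in the draft could look roughly like the sketch below. It rests on several assumptions: the function names are hypothetical, "contains the host attribute" is read as an exact hostname match, and "contained in the protected resource URI" is read as a simple prefix match. The draft may intend something stricter.

```python
import re
from urllib.parse import urlsplit


def parse_links(link_header: str) -> list:
    """Return (uri, rel) pairs from a Link header value."""
    return re.findall(r'<([^>]*)>\s*;\s*rel="([^"]*)"', link_header)


def discover_metadata_uris(link_header: str, requested_url: str, tls_host: str) -> list:
    """Validate the resource_uri link and return the metadata URIs."""
    links = parse_links(link_header)
    resource_uris = [uri for uri, rel in links if rel == "resource_uri"]
    if len(resource_uris) != 1:
        raise ValueError("expected exactly one resource_uri link")
    resource_uri = resource_uris[0]
    # Assumption: the host must equal the hostname confirmed by TLS in step (A).
    if urlsplit(resource_uri).hostname != tls_host:
        raise ValueError("resource_uri host does not match the TLS host")
    # Assumption: "contained in" is interpreted as a prefix match.
    if not requested_url.startswith(resource_uri):
        raise ValueError("requested URL is not under resource_uri")
    metadata = [uri for uri, rel in links if rel == "oauth_server_metadata_uri"]
    if not metadata:
        raise ValueError("no oauth_server_metadata_uri link present")
    return metadata


link_header = (
    '<https://api.example.com/resource>; rel="resource_uri", '
    '<https://as.example.com/.well-known/oauth-authorization-server>; '
    'rel="oauth_server_metadata_uri"'
)
metadata_uris = discover_metadata_uris(
    link_header, "https://api.example.com/resource/items", "api.example.com"
)
print(metadata_uris)
```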

It seems to me that everyone would benefit if work in this area were coordinated to unify the way authorization servers are discovered. I have noticed several groups converging on this problem with different solutions, hence I am creating this issue to facilitate cooperation across the wider Internet community.

Mar 05 '20 02:03 ad-m

ping! +1 on this issue, I'm looking for the "right way" to implement the authentication. Is there a formal guide that the distribution-spec can point to?

Oct 14 '20 21:10 vsoch

Also, I'd be interested to know what subset of endpoints (other than obvious push/pull) require authentication. For example, can I list tags without it?

Oct 14 '20 21:10 vsoch

The spec isn't touching on anything auth at the moment. Most registries are using a "docker-style" version of OAuth 2.0.

Here is the OAuth 2.0 document: https://tools.ietf.org/html/rfc6749#section-3.3

and here is a Docker page regarding a "Token Authentication Specification": https://docs.docker.com/registry/spec/auth/token/

The part that is unique-ish to Docker and containers is the scope piece of the auth challenge, and the format of the JWTs that are returned. Example decoded JWT payload (see the access section):

{
  "iss": "auth.docker.com",
  "sub": "jlhawn",
  "aud": "registry.docker.com",
  "exp": 1415387315,
  "nbf": 1415387015,
  "iat": 1415387015,
  "jti": "tYJCO1c6cnyy7kAn0c7rKPgbV1H1bFws",
  "access": [
    {
      "type": "repository",
      "name": "samalba/my-app",
      "actions": [
        "push"
      ]
    }
  ]
}
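To show how a registry might use the access section, here is a minimal sketch that checks whether a decoded payload grants a given action. The helper name `is_allowed` is hypothetical, and the sketch assumes the payload has already been verified and decoded elsewhere.

```python
def is_allowed(payload: dict, resource_type: str, name: str, action: str) -> bool:
    """Return True if any access entry grants `action` on the named resource."""
    for entry in payload.get("access", []):
        if (
            entry.get("type") == resource_type
            and entry.get("name") == name
            and action in entry.get("actions", [])
        ):
            return True
    return False


# Access section copied from the example payload above.
payload = {
    "access": [
        {"type": "repository", "name": "samalba/my-app", "actions": ["push"]}
    ]
}
print(is_allowed(payload, "repository", "samalba/my-app", "push"))
print(is_allowed(payload, "repository", "samalba/my-app", "pull"))
```

With this payload the token grants push but not pull on samalba/my-app.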

As far as which endpoints are protected by this - there is no rule. If you make a request and a 401 is returned, inspect the WWW-Authenticate header, which will contain information on how to retrieve a token. Then retry the original request, adding an Authorization header in the form Authorization: Bearer <token>. This is purely OAuth2. The scope can also be arbitrary, such as scope=abc123.
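The request, 401, token, retry sequence can be sketched as follows. Everything named here is an assumption for illustration: `send` and `fetch_token` are hypothetical callables standing in for a real HTTP client, the token request is assumed to be a GET to the realm with service and scope as query parameters (as the Docker token documentation describes), and header lookup is assumed to use the exact name WWW-Authenticate.

```python
import re
from urllib.parse import urlencode


def token_request_url(www_authenticate: str) -> str:
    """Build the token URL from a Bearer challenge (realm + service/scope)."""
    params = dict(re.findall(r'(\w+)="([^"]*)"', www_authenticate))
    realm = params.pop("realm")
    query = urlencode({k: v for k, v in params.items() if k in ("service", "scope")})
    return f"{realm}?{query}" if query else realm


def get_with_auth(url, send, fetch_token):
    """send(url, headers) -> (status, headers, body); fetch_token(url) -> str."""
    status, headers, body = send(url, {})
    if status != 401:
        return status, body
    token = fetch_token(token_request_url(headers["WWW-Authenticate"]))
    # Retry the original request with the token as a Bearer credential.
    status, _, body = send(url, {"Authorization": f"Bearer {token}"})
    return status, body


CHALLENGE = (
    'Bearer realm="https://auth.docker.io/token",'
    'service="registry.docker.io",'
    'scope="repository:samalba/my-app:pull,push"'
)


def fake_send(url, headers):
    # Stub registry: accepts only the token issued by fake_fetch_token.
    if headers.get("Authorization") == "Bearer abc123":
        return 200, {}, "ok"
    return 401, {"WWW-Authenticate": CHALLENGE}, ""


def fake_fetch_token(token_url):
    # Stub authorization server: a real client would GET token_url here.
    return "abc123"


result = get_with_auth(
    "https://registry.example.com/v2/samalba/my-app/tags/list",
    fake_send,
    fake_fetch_token,
)
print(token_request_url(CHALLENGE))
print(result)
```

The stubs keep the sketch runnable offline; swapping in a real HTTP client changes only the two callables.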

It may be nice if the spec had some discovery mechanism and known token type, so that clients could ultimately fetch a token without first triggering a 401.

Oct 14 '20 22:10 jdolitsky

Heyo! I'm well into the auth implementation but I have one quick question. What are best practices for the secret used to decode? Should they be generated for one time use, on the level of the repository, or the server? If the purpose is only for the server to generate (encode) and then decode, it seems like it doesn't add much, security wise, and even could just be one value. I ask because I was looking over quay's implementation and they use some kind of public key? https://github.com/quay/quay/blob/38be6d05d08bc72cc13a89073bb5364b8adf6c04/util/security/registry_jwt.py#L100 And they pass it via some kind of kid header?

    kid = headers.get("kid", None)
    if kid is None:
        logger.error("Missing kid header on encoded JWT: %s", bearer_token)
        raise InvalidBearerTokenException("Missing kid header")

How does the header get passed around, and is the kid equivalent to the jti? Thanks!

Oct 26 '20 06:10 vsoch

As a rule (as per the RFC standards on OAuth), the Bearer token is an opaque value that is passed between the authorization servers and the resource provider via the client. In practice, this value is often a JWT token.

The "kid" value is a JWT header parameter (defined by the JWS standard, RFC 7515, and widely used with OpenID Connect), not a claim like "jti". Together with an appropriate "audience", it allows for distributed verification: a resource provider can select the right key and verify tokens coming from many authorization servers. The current registry protocol does not make it possible to use multiple authorization servers (due to the lack of proper discovery of authorization servers), but the details of the JWT format are often adopted at the organization-wide level and used throughout the organization.
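To make the role of "kid" concrete: it travels in the first (header) segment of the JWT, separate from payload claims such as "jti", and the verifier uses it to pick a key. The sketch below only demonstrates key selection and does not verify any signature. The key IDs, the key placeholders, and the helper names are all hypothetical.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    # JWT segments drop base64 padding, so restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def jwt_header(token: str) -> dict:
    """Decode the first segment of a compact JWT (the JOSE header)."""
    return json.loads(b64url_decode(token.split(".")[0]))


def select_key(token: str, keys_by_kid: dict):
    """Pick the verification key named by the token's kid header."""
    kid = jwt_header(token).get("kid")
    if kid is None:
        raise ValueError("missing kid header")
    return keys_by_kid[kid]


header = {"alg": "RS256", "kid": "key-2020-10"}
payload = {"jti": "tYJCO1c6cnyy7kAn0c7rKPgbV1H1bFws"}
token = ".".join(
    [
        b64url_encode(json.dumps(header).encode()),
        b64url_encode(json.dumps(payload).encode()),
        "signature-placeholder",  # not a real signature
    ]
)
keys_by_kid = {"key-2020-10": "<public key A>", "key-2020-11": "<public key B>"}
print(select_key(token, keys_by_kid))
```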

Oct 26 '20 11:10 ad-m

Okay, so given a JWT token in context of OAuth - to return to my question - you are saying that it doesn't really matter then how it's encoded and decoded? There is no best practice for the secret?

Oct 26 '20 14:10 vsoch

I believe this matter should not be in the scope of the Registry standard. The value should be issued by the authorization server, passed along by the client as is, and then understood and trusted by the resource provider. Different environments may approach this in completely different ways. I don't see a security problem with using symmetric cryptography (a standard mechanism in JWT) if distribution of the secret is limited, e.g. within one server that acts as both authorization server and resource provider. As the environment grows, symmetric cryptography becomes cumbersome, and key distribution mechanisms can then help. Nothing even prevents the value from being random, in the style of a classic stateful session identifier. The issue requires an analysis of the needs and risks of the specific environment.
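For the single-server case mentioned above, symmetric signing can be sketched with the standard library alone. This is a simplified illustration of HS256 (HMAC-SHA256), not a complete JWT implementation: the function names and the secret are hypothetical, and the verifier below does not inspect the alg header or any claims, which a real deployment would need to do.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(payload: dict, secret: bytes) -> str:
    """Produce a compact JWT signed with a shared secret."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(signature)}"


def verify_hs256(token: str, secret: bytes) -> bool:
    """Recompute the HMAC over header.payload and compare in constant time."""
    signing_input, _, signature = token.rpartition(".")
    expected = b64url(
        hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    )
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


secret = b"server-wide-secret"  # hypothetical; one value shared by issuer and verifier
token = sign_hs256({"sub": "jlhawn", "aud": "registry.docker.com"}, secret)
print(verify_hs256(token, secret))
print(verify_hs256(token, b"some-other-secret"))
```

Because the same secret both signs and verifies, every party that can verify a token can also mint one, which is why this approach stops scaling once more than one trusted server is involved.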

Oct 26 '20 15:10 ad-m

Thanks @ad-m! @jdolitsky what are your thoughts?

Oct 26 '20 16:10 vsoch

JWTs are not encrypted, but base64 encoded and signed, and can be used by any client that obtains them to access the target resource(s). The added security comes from the fact that these tokens expire after a short period of time. Additionally, the auth server can rotate its private key at any time, which invalidates previously issued tokens.
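Both points can be seen in a short sketch: the payload of a signed JWT can be read without any key, and validity is bounded by the exp claim. The helper names are hypothetical, the signature segment is a placeholder, and the timestamps are the ones from the example payload earlier in the thread (a 300-second lifetime).

```python
import base64
import json
import time


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwt_payload(token: str) -> dict:
    """Read the payload segment without verifying anything (no key needed)."""
    segment = token.split(".")[1]
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def is_expired(payload: dict, now=None) -> bool:
    """Compare the exp claim (seconds since the epoch) with the current time."""
    now = time.time() if now is None else now
    return now >= payload["exp"]


claims = {"sub": "jlhawn", "nbf": 1415387015, "exp": 1415387315}
token = ".".join(
    [
        b64url_encode(json.dumps({"alg": "RS256"}).encode()),
        b64url_encode(json.dumps(claims).encode()),
        "signature-placeholder",  # not a real signature
    ]
)
decoded = jwt_payload(token)
print(decoded["sub"])
print(is_expired(decoded, now=1415387100))
print(is_expired(decoded))
```

Reading the payload this way proves nothing about authenticity; a resource provider still has to check the signature before trusting any claim.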

I agree w/ @ad-m - I'm not so sure this should be part of the distribution spec. It is purely OAuth. There are some aspects, however, that seem to be unique to registries. This is that JWTs are almost always used, and contain an access section, which has a unique schema related to push/pull/delete access to specific resources.

The auth dance between the client and auth server isn't something that should be defined by the spec, but maybe described. If anything, the unique format of JWT tokens and scope params could become part of the spec.

Oct 27 '20 13:10 jdolitsky

> JWTs are not encrypted

A JWT might be encrypted, e.g. using JWE. I have the impression that some hyper-scalers encrypt their tokens, but I haven't paid much attention to it.

> This is that JWTs are almost always used, and contain an access section, which has a unique schema related to push/pull/delete access to specific resources.

For me, this is an implementation detail that should not be described in the distribution specification, because the client should treat the token as an opaque value. Such elements may be presented in supporting, educational, non-normative documentation that shares experience from actual implementations.

Oct 27 '20 14:10 ad-m

Thanks to you both! I didn’t mean to suggest this should be part of the spec, I was just generally asking for advice.

Oct 27 '20 15:10 vsoch

Thanks again for all the good discussion, I think we're good https://vsoch.github.io/django-oci/docs/getting-started/auth.

Oct 27 '20 20:10 vsoch

@ad-m - just to clarify your original issue - are you suggesting any change to the spec, or just sharing information on the distributed OAuth stuff? (Link header etc.)

Oct 27 '20 21:10 jdolitsky

At this stage, I am sharing information for possible use and drawing attention to it. I have no specific expectations regarding changes to the specification.

Oct 27 '20 21:10 ad-m

@ad-m we have tried to avoid the need to include authorization in the specification. From an implementation standpoint though, it is hard to be completely opaque, and attempting to do OAuth based only on the results of WWW-Authenticate can be tricky. Some discovery system like @jdolitsky mentioned would make that better. Things to consider...

The OAuth server is not known, so the realm is used... this is non-standard behavior
The scope is used directly from the registry... however, most clients add a scope themselves to avoid additional requests, making the format not opaque
The client ID is static and not provided by the service... this makes things like authorization code grant types hard (along with other missing configuration)

Oct 28 '20 05:10 dmcgowan