iceberg
iceberg copied to clipboard
REST: assume issued token type is access token
The REST client wrongly assumes that the issued_token_type field is present in all OAuth responses, but that isn't true: e.g. in the client_credentials flow, this field is undefined. See RFC 6749, section 4.4.3:
https://datatracker.ietf.org/doc/html/rfc6749#section-4.4.3
This causes the client to crash when creating a tokens exchange request, since the issued token type becomes the request's subject token type, which is mandatory.
This has been verified against a Keycloak 24.0 server.
This change fixes this issue by assuming that the issued token type is an access token, if the response did not specify any token type.
This change also fixes RESTCatalogAdapter: it was incorrectly including the issued_token_type field in client_credentials responses, thus masking many test failures, e.g. in testCatalogTokenRefresh.
Hi @nastra any updates on this PR?
FYI here is an example of error from a Spark SQL session connected to REST catalog + external auth server:
24/05/26 21:39:19 WARN Tasks: Retrying task after failure: Invalid token type: null
java.lang.IllegalArgumentException: Invalid token type: null
at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkArgument(Preconditions.java:218)
at org.apache.iceberg.rest.auth.OAuth2Util.tokenExchangeRequest(OAuth2Util.java:244)
at org.apache.iceberg.rest.auth.OAuth2Util.tokenExchangeRequest(OAuth2Util.java:235)
at org.apache.iceberg.rest.auth.OAuth2Util.refreshToken(OAuth2Util.java:140)
@adutra I'm OOO this week, maybe @amogh-jahagirdar gets a chance to look at this PR
@nastra or @amogh-jahagirdar is it possible for you to have a another look here please? Thanks 🙏
@nastra @amogh-jahagirdar This PR has been rebased. Could I get a review please? 🙏
@adutra IMO https://github.com/apache/iceberg/pull/10314/files#r1601220154 still needs to be reverted. It's ok to add a workaround for Keycloak but I don't see a good enough reason to adjust what the server (aka RESTCatalogAdapter is sending back)
@adutra IMO https://github.com/apache/iceberg/pull/10314/files#r1601220154 still needs to be reverted. It's ok to add a workaround for Keycloak but I don't see a good enough reason to adjust what the server (aka
RESTCatalogAdapteris sending back)
@nastra I'm sorry but I really don't agree. This has nothing to do with Keycloak. This is not a workaround. This has to do with complying with the OAuth 2.0 spec.
Have you read the 5.1 section of RFC 6749? Quoting here so that we are on the same page:
The authorization server issues an access token and optional refresh
token, and constructs the response by adding the following parameters
to the entity-body of the HTTP response with a 200 (OK) status code:
access_token
REQUIRED. The access token issued by the authorization server.
token_type
REQUIRED. The type of the token issued as described in
[Section 7.1](https://datatracker.ietf.org/doc/html/rfc6749#section-7.1). Value is case insensitive.
expires_in
RECOMMENDED. The lifetime in seconds of the access token. For
example, the value "3600" denotes that the access token will
expire in one hour from the time the response was generated.
If omitted, the authorization server SHOULD provide the
expiration time via other means or document the default value.
refresh_token
OPTIONAL. The refresh token, which can be used to obtain new
access tokens using the same authorization grant as described
in [Section 6](https://datatracker.ietf.org/doc/html/rfc6749#section-6).
scope
OPTIONAL, if identical to the scope requested by the client;
otherwise, REQUIRED. The scope of the access token as
described by [Section 3.3](https://datatracker.ietf.org/doc/html/rfc6749#section-3.3).
As you can see there is no issued_token_type field. We therefore must comply with that, and be prepared to receive responses that do NOT contain that field.
A client that breaks because that field is missing is not a OAuth2-compliant client.
By returning an unspecified field for a client_credentials grant, RESTCatalogAdapter is also infringing the OAuth 2.0 spec.
@adutra I've read RFC 6749 but as I mentioned in https://github.com/apache/iceberg/pull/10314#discussion_r1601201951 I'm assuming that the token exchange is following RFC 8693 (and my assumption can be wrong). It would be good to get confirmation on that though.
@adutra I've read RFC 6749 but as I mentioned in #10314 (comment) I'm assuming that the token exchange is following RFC 8693 (and my assumption can be wrong). It would be good to get confirmation on that though.
Sure, but RFC 8693 is not applicable here. All depends on the grant type being used. If the grant type is urn:ietf:params:oauth:token-type:access_token then yes, RFC 8693 applies, and yes, there is an issued_token_type field in the response, and yes it is a required field.
But if the grant type is client_credentials then RFC 6749 applies instead, and the field is not part of it at all.
@rdblue since you originally authored this part (https://github.com/apache/iceberg/pull/4771#discussion_r874114263) would you mind having a look as well? Thank you!
@nastra Alex is 100% right. I do not understand why it takes so long to review this fix for an OAuth spec-compliance.
I think the application of extensions referenced in RFC 8693 are a little ambiguous due to the following:
RFC 6749 section 4.1 references the response described in section 5.1
RFC 8693 describe the extensions to what is defined in the response to that same section to expand upon the format of what the access_token field contains. It states in the section 2.2.1:
The identifier access_token is used for historical reasons and the issued token need not be an OAuth access token.
I believe that the intent includes that a client credential exchange could return any of the enumerated token types defined in section 3 and applies.
I don't think it's explicitly clear either way, but I would interpret it as the latter.
Either way, I don't think it's a huge issue to default to ensure RFC 6749 compatibility, but we should just add a note as to why we're defaulting (e.g. "defaulting issued_token_type to access_token for compatibility with RFC 6749 where the issued type is omitted).
@nastra requested to revert @adutra 's fix for the client-credentials flow, only RFC 6749 applies here. This part of the code has nothing to do with RFC 8693 (token-exchange). The request to revert is IMO not justified, because the flow in question does not specify it.
I believe that the intent includes that a client credential exchange could return any of the enumerated token types defined in section 3 and applies.
RFC 8693 builds on top of RFC 6749, but does not modify any of its structs. And how could it be otherwise? An RFC cannot modify another one's structs without officially superseding it. Imho it is wrong to read section 2.2.1 of RFC 8693 as a general rewrite of RFC 6749 section 5.1, valid for all grants. The correct reading is: section 2.2.1 expands section 5.1 by adding extra context, in the scope of a token exchange grant only.
And anyways, that's how all public OAuth 2.0 servers interpret it: none of them include the field issued_token_type in a client_credentials grant response, even if they support RFC 8693.
TLDR is: if you want Iceberg REST to be interoperable with any public OAuth 2.0 server, this PR needs to be in.
@adutra
An RFC cannot modify another one's structs without officially superseding it. . . . RFC 8693 as a general rewrite of RFC 6749 section 5.1
I'm not suggesting that this is a general rewrite, but was more interpreting that RFC 8693 as an extension that can apply to any access_token grant type response.
. . . in the scope of a token exchange grant only.
I think this is the best argument and I agree it makes any argument for issued_token_type for other grant flows to be weak.
And anyways, that's how all public OAuth 2.0 servers interpret it: none of them include the field issued_token_type in a client_credentials grant response, even if they support RFC 8693.
Maybe you misinterpreted my message, but I actually support adding this. I couldn't find any examples and I also understand that RFC 8693 isn't widely adopted, so we may run into a number of issues with existing implementations out there.
Thank you to all involved! I'm very glad that we could reach a consensus here 🙏