go-oidc
go-oidc copied to clipboard
Retry RemoteKeySet#updateKeys on error
https://github.com/coreos/go-oidc/blob/921a42b4b63d5b618d17d7e7a36a4acb59e643c6/oidc/jwks.go#L221-L248
It looks like this function does not attempt to retry on errors. This function performs a GET, so most errors should be retry-friendly. A single retry would significantly[1] help recover from most errors.
[1]: source: none :)
Are you actually hitting issues or is this theoretical?
Retry strategies are usually documented as part of your API contract (e.g. https://google.aip.dev/194). To the best of my knowledge, there's nothing in the spec, and I wouldn't want to make assumptions about what responses indicate a request is or isn't retryable. Is this about adding Retry-After support?
- https://openid.net/specs/openid-connect-core-1_0.html
- https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Retry-After
I'm inclined to omit retry logic since its simpler and easier to reason about this way.
For some additional context, this library used to implement Cache-Control header parsing. It ended up causing more headaches than it solved, and was ripped out eventually.
https://github.com/coreos/go-oidc/pull/259
Thanks for your response! I am indeed experiencing issues due to unreliable network connections, which led me to digging this code path.
I am not familiar with the OpenID Connect Core spec, and indeed do not see any mentions of error handling / retrying. I'll refer to the HTTP spec (RFC9110), which says (roughly): the GET method is considered safe, idempotent, and cacheable, which makes it suitable for retries.
Some requests can be automatically retried by a client in the event of an underlying connection failure, as described in Section 9.2.2. [..] Idempotent methods are distinguished because the request can be repeated automatically if a communication failure occurs before the client is able to read the server's response.
AIP194 follows the same logic:
Clients should automatically retry requests for which repeated runs would not cause unintended state changes, which are non-transactional, and which are unary.
Even today, go-oidc will send multiple GET requests to the same URL, over time.
I'd recommend adding retry logic for network errors, such as connection or read timeouts, as well as the following HTTP status codes:
- 408 (Request Timeout),
- 500 (Internal Server Error),
- 502 (Bad Gateway),
- 503 (Service Unavailable),
- 504 (Gateway Timeout).
Supporting the Retry-After header would be a good follow-up, but requires extra logic indeed.
I also wanted to mention that it seems like errors from this go-oidc code path might not be easily handled by library users (from what I could see).
Sounds good! I think my main concern is if we hit a net deadline (e.g. timing out after 30 seconds and not getting an HTTP response). Retrying that operation might result in the code blocking for some unreasonable amount of time.
There is also: https://github.com/coreos/go-oidc/issues/248
Just to unblock you, we could start returning wrapped HTTP errors when we hit non-2XX HTTP responses from all of our APIs. Similar to https://pkg.go.dev/golang.org/x/oauth2#RetrieveError
You'd be able to do the retry in your call to Verify(), then we can figure out details for retry-by-default later.