Feature Request: Support projected bound service account tokens when run in-cluster
Kubernetes has a beta feature to increase service account token security that is planned to go to General Availability and will become the default at some point.
The general idea is that rather than using the automounted service account token at /var/run/secrets/kubernetes.io/serviceaccount/token, you can project a volume that contains a pod-scoped token that is valid for the lifetime of the pod using it. The token can also be auto-rotated based on some TTL. This means accidental token leakage doesn't require manual key rotation, etc.
See https://github.com/kubernetes/community/blob/master/contributors/design-proposals/auth/bound-service-account-tokens.md and https://github.com/mikedanese/community/blob/2bf41bd80a9a50b544731c74c7d956c041ec71eb/contributors/design-proposals/storage/svcacct-token-volume-source.md
It would be great if the kubernetes-client could handle the SA token being updated on the filesystem. Currently it stores the value of the token in the io.fabric8.kubernetes.client.Config object's oauthToken property, and AFAIK this never gets reloaded, so downstream consumers of the client need to rebuild their client objects to refresh the config when a token rotates. Whilst this is doable, it feels like it should be taken care of by the kubernetes-client.
It's also not possible (as far as I could see) to configure the SA token path, which could be projected somewhere other than the default location of /var/run/secrets/kubernetes.io/serviceaccount/token.
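For illustration, this is roughly what downstream consumers have to do today to pick up a rotated token: rebuild the whole client from a freshly auto-configured Config. This is only a minimal sketch assuming the standard in-cluster auto-configuration; the namespace and the listing call are placeholders.

```java
import io.fabric8.kubernetes.client.Config;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;

public class ReloadOnRotation {
    public static void main(String[] args) {
        // autoConfigure() reads the in-cluster configuration, including the token at
        // /var/run/secrets/kubernetes.io/serviceaccount/token, exactly once.
        Config config = Config.autoConfigure(null);
        System.out.println("Token captured in Config: " + config.getOauthToken());

        // Because the token is never re-read, picking up a rotated token currently
        // means rebuilding the client from a fresh Config like the one above.
        try (KubernetesClient client = new DefaultKubernetesClient(config)) {
            client.pods().inNamespace("default").list();
        }
    }
}
```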
This issue has been automatically marked as stale because it has not had any activity in the last 90 days. It will be closed if no further activity occurs within 7 days. Thank you for your contributions!
The BoundServiceAccountTokenVolume feature has been promoted to beta, and enabled by default in Kubernetes 1.21 https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#changelog-since-v1200
I think this feature request is more relevant now that BoundServiceAccountTokenVolume is enabled by default in latest Kubernetes.
@titisan: Sorry, looks like I missed this while upgrading the kubernetes model to v1.21.0. Do you know in which package BoundServiceAccountTokenVolume exists? Would it be possible for you to contribute a PR for this? We'll be happy to provide code pointers.
I was checking the code and I wonder if the Interceptor for handling expired OIDC tokens (TokenRefreshInterceptor) will also refresh the service account token when it expires.
https://github.com/fabric8io/kubernetes-client/blob/a2b67bf6f36fc21c42812055a2c4183643b62c09/kubernetes-client/src/main/java/io/fabric8/kubernetes/client/utils/TokenRefreshInterceptor.java#L40
I think the "else" will be executed in case of authenticating with the service account token. The autoconfigure() will reload the token.
According to the Kubernetes 1.21 release notes: "Clients should reload the token from disk periodically (once per minute is recommended) to ensure they continue to use a valid token." With the current implementation, TokenRefreshInterceptor will only refresh (reload) the token when the API server rejects it with HTTP status code 401.
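For reference, a 401-driven refresh in an OkHttp interceptor looks roughly like the sketch below. This is only a hedged illustration to frame the discussion, not the actual TokenRefreshInterceptor; the class name and details are made up.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

import okhttp3.Interceptor;
import okhttp3.Request;
import okhttp3.Response;

// Sketch of a purely reactive refresh: the token is only re-read from disk
// after the API server has already rejected a request with 401.
public class ReactiveTokenInterceptor implements Interceptor {

    private static final String TOKEN_PATH =
            "/var/run/secrets/kubernetes.io/serviceaccount/token";

    @Override
    public Response intercept(Chain chain) throws IOException {
        Response response = chain.proceed(chain.request());
        if (response.code() != 401) {
            return response;
        }
        // Token was rejected: reload it from the projected volume and retry once.
        response.close();
        String freshToken = new String(
                Files.readAllBytes(Paths.get(TOKEN_PATH)), StandardCharsets.UTF_8).trim();
        Request retried = chain.request().newBuilder()
                .header("Authorization", "Bearer " + freshToken)
                .build();
        return chain.proceed(retried);
    }
}
```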
I think #3105 took care of this.
> With the current implementation, TokenRefreshInterceptor will only refresh (reload) the token when the API server rejects it with HTTP status code 401.
IMHO it's cheaper to do this (and it's backwards compatible) than having a periodic job that reloads the token from disk (i.e. the token is reloaded only when it's actually needed instead of once per minute even when it's not).
Thanks @manusa, I do agree it is a cheaper solution than reloading the token from disk periodically.
Looking forward to having #3105 in the next release.
It is cheaper but incorrect. If I read the docs right, you will not get the 401. Not for a year!
As part of the transition to time-limited tokens, the initial token is good for a year, but after a time D (between 1 and 3 hours) it will be refreshed. The old one remains valid, but the whole point is to generate Prometheus metrics to find outdated clients (such as this one), and this client (well, its users) will be flagged as "stale".
See https://github.com/kubernetes/enhancements/tree/master/keps/sig-auth/1205-bound-service-account-tokens
A 401 is therefore necessary but insufficient, and will make it impossible for consumers using this library to know if it is working properly (and in fact in a year it will stop working).
From the 1.21 readme:
The BoundServiceAccountTokenVolume feature has been promoted to beta, and enabled by default.
- This changes the tokens provided to containers at /var/run/secrets/kubernetes.io/serviceaccount/token to be time-limited, auto-refreshed, and invalidated when the containing pod is deleted.
- Clients should reload the token from disk periodically (once per minute is recommended) to ensure they continue to use a valid token. k8s.io/client-go version v11.0.0+ and v0.15.0+ reload tokens automatically.
- By default, injected tokens are given an extended lifetime so they remain valid even after a new refreshed token is provided. The metric serviceaccount_stale_tokens_total can be used to monitor for workloads that are depending on the extended lifetime and are continuing to use tokens even after a refreshed token is provided to the container. If that metric indicates no existing workloads are depending on extended lifetimes, injected token lifetime can be shortened to 1 hour by starting kube-apiserver with --service-account-extend-token-expiration=false. (#95667, @zshihang) [SIG API Machinery, Auth, Cluster Lifecycle and Testing]
Obviously a hack could be used (a fragile one) wherein everyone that gets a Client instance gets it from a Supplier class, and "pinky-promises" to use it locally only (not stored per instance). Then the Supplier could reload the Client in a background thread and replace an AtomicReference. That seems hacky and kind of expensive.
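Roughly, such a hack would look like the sketch below. The class name is hypothetical and not part of the client; closing the old client immediately while requests may still be in flight is part of what makes it fragile and expensive.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

import io.fabric8.kubernetes.client.Config;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;

// Sketch of the "Supplier + AtomicReference" workaround: a background thread
// rebuilds the client once per minute so it always carries a freshly read
// token; callers must go through get() every time instead of caching a client.
public class RefreshingClientSupplier implements Supplier<KubernetesClient>, AutoCloseable {

    private final AtomicReference<KubernetesClient> ref =
            new AtomicReference<>(new DefaultKubernetesClient(Config.autoConfigure(null)));
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public RefreshingClientSupplier() {
        scheduler.scheduleAtFixedRate(() -> {
            KubernetesClient old = ref.getAndSet(
                    new DefaultKubernetesClient(Config.autoConfigure(null)));
            // Closing the old client here can break in-flight requests, which is
            // part of why this approach is fragile.
            old.close();
        }, 1, 1, TimeUnit.MINUTES);
    }

    @Override
    public KubernetesClient get() {
        return ref.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
        ref.get().close();
    }
}
```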
@manusa I don't see any way for us to plug in alternative interceptors (conveniently at least). And have you seen the above? It's an issue IMO.
We could add a configuration option that would mark the token stale after a configurable period n.
Then a few options (the second one is sketched after this list):
- A scheduled thread that reloads the token file
- A staleness check on each operation that reloads the token file if needed
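A rough sketch of the second option follows; all names here are hypothetical and not part of the actual client, and the caller (e.g. an interceptor) is assumed to ask for the current token before every request.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.time.Duration;
import java.time.Instant;

// The token is considered stale after a configurable period and is only
// re-read from disk when a request actually needs it.
public class StalenessCheckingTokenSource {

    private final String tokenPath;
    private final Duration refreshPeriod;
    private String token;
    private Instant lastRead = Instant.MIN;

    public StalenessCheckingTokenSource(String tokenPath, Duration refreshPeriod) {
        this.tokenPath = tokenPath;
        this.refreshPeriod = refreshPeriod;
    }

    // Called before every request, e.g. from an interceptor.
    public synchronized String currentToken() throws IOException {
        if (Instant.now().isAfter(lastRead.plus(refreshPeriod))) {
            token = new String(Files.readAllBytes(Paths.get(tokenPath)),
                    StandardCharsets.UTF_8).trim();
            lastRead = Instant.now();
        }
        return token;
    }
}
```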
This also affects informers, which I believe are not as easy to hack as basic usage.
Hi,
As AWS is rolling out Kubernetes 1.21 these days, I was wondering: what is the status of support for this feature?
Has it been released already?
Cheers!
This needs more visibility, because as far as I understand, AWS EKS disabled the default 1-year extended period and configured a 90-day period instead.
I am working on implementing this
EDIT: Actually I am not sure how, but somehow I was confused and thought this repo was related to the k8s Ruby plugin for fluentd: https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter/pull/337
Change is here: https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter/pull/337
It is already released in the plugin version 2.11.1
I am actually still doing some final testing, but as far as we can tell it works. I will post an update if it turns out it doesn't.
(It works don't worry)
@PettitWesley @manusa Any update on this? Are your changes merged into kubernetes-client?
@vgaddavcg Changes are merged here: https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter/pull/337
And released in plugin version 2.11.1 from what I saw. Anything else beyond that is up to you folks to handle
@PettitWesley I'm a bit confused: how does an issue in the Java-based Kubernetes client relate to a PR in a Ruby-based Fluentd plugin?
@scholzj @vgaddavcg Sorry, yeah, it seems that I got confused and thought this repo was also related to Fluentd... apologies, I have not fixed anything in this repo.
Hello, any update on this? Critical parts of our operations are impacted and the 90-day grace period allowed by EKS to ensure compatibility is running out. Thanks in advance.
@AbdelrhmanHamouda I believe it was fixed in https://github.com/fabric8io/kubernetes-client/pull/4264 and shipped with 6.1.0.
thanks @victornoel, this is a great help.
Thanks Victor, I'll close the issue seeing as it has been fixed and merged.
I think the provided solution is more of a brute-force approach that refreshes the service account token every minute. Could we test for the recommended expires_in before preemptively refreshing the token, or just let it fail and then refresh the access token and retry?
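For illustration, something along these lines could work: the projected service account token is a JWT, so its "exp" claim can be inspected and the token only re-read from disk shortly before it actually expires. This is only a hedged sketch; Jackson is assumed to be available, and the names and the one-minute margin are made up.

```java
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.util.Base64;

import com.fasterxml.jackson.databind.ObjectMapper;

// Decide whether the token needs a refresh based on its own expiry claim
// instead of reloading it on a fixed schedule.
public class TokenExpiryCheck {

    private static final ObjectMapper MAPPER = new ObjectMapper();

    public static boolean needsRefresh(String jwtToken, Instant now) throws Exception {
        // A JWT is header.payload.signature; the payload is base64url-encoded JSON.
        String payload = jwtToken.split("\\.")[1];
        byte[] json = Base64.getUrlDecoder().decode(payload);
        long exp = MAPPER.readTree(new String(json, StandardCharsets.UTF_8))
                .get("exp").asLong();
        // Refresh a minute before the token expires rather than once per minute.
        return now.isAfter(Instant.ofEpochSecond(exp).minusSeconds(60));
    }
}
```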
@apiwoni I haven't looked at the current implementation but your question does make sense to me. Would you be able to get a PR together for this improvement?
I believe that will break the metrics. Please read the KEP. There may be extended tokens for a year, but not re-reading and re-applying the token every minute violates the KEP.
It is very similar to the original "let's wait for a 401" approach, already demonstrated to be incorrect.
@mikebell90 Does any of this really apply in cases where the client provides the access token without using KUBECONFIG, but instead obtains it via a custom API? I know the duration of my access tokens. Microsoft Azure AD does not provide refresh tokens for the OAuth client credentials flow, so I need to generate a new token when it expires. It does not seem that I can use OpenIDConnectionUtils#resolveOIDCTokenFromAuthConfig to generate a new access token when refresh is not supported and the token has expired.