
Support for the expiration of the token

Open egor-ryashin opened this issue 5 years ago • 10 comments

Right now the client starts to fail once the token expiration time is reached. For example, if the client is created in an environment with the following .kube/config:

user:
  auth-provider:
    config:
      access-token: ...
      cmd-args: config config-helper --format=json
      cmd-path: /.../google-cloud-sdk/bin/gcloud
      expiry: "2020-04-04T21:52:48Z"
      expiry-key: '{.credential.token_expiry}'
      token-key: '{.credential.access_token}'
    name: gcp

once the time 2020-04-04T21:52:48Z passes, the client throws an exception:

Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: ... Message: Unauthorized! Token may have expired! Please log-in again. Unauthorized.
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:510) ~[kubernetes-client-4.6.1.jar:na]

The application then requires a token refresh (by running a kubectl command) and the creation of a new client instance (which reads the new token).
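One way to see this coming, before the client starts throwing, is to compare the expiry field against the current UTC time. A minimal shell sketch; the timestamp is hard-coded from the config above, whereas a real check would parse it out of .kube/config:

```shell
#!/bin/sh
# Hard-coded expiry from the example config above; a real check would
# read it out of .kube/config instead.
expiry="2020-04-04T21:52:48Z"
now=$(date -u +%Y-%m-%dT%H:%M:%SZ)

# ISO-8601 UTC timestamps in this fixed format compare correctly as strings.
if expr "$now" ">" "$expiry" >/dev/null; then
  echo "token expired, refresh needed"
else
  echo "token still valid"
fi
```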

egor-ryashin avatar Apr 05 '20 16:04 egor-ryashin

This seems somehow related to #2111

rohanKanojia avatar Apr 05 '20 17:04 rohanKanojia

This issue has been automatically marked as stale because it has had no activity for 90 days. It will be closed if no further activity occurs within 7 days. Thank you for your contributions!

stale[bot] avatar Jul 04 '20 17:07 stale[bot]

This is quite an annoying issue. To work around it, I created a utility function that runs a kubectl version call against each context I need to interact with:

private fun tryK8sTokenRefresh(contexts: List<String?>) = runBlocking(Dispatchers.IO) {
    val kubeConfig = KubeConfigUtils.parseConfig(File(Config.getKubeconfigFilename()))
    kubeConfig
        .contexts
        .filter { nc -> contexts.isEmpty() || contexts.contains(nc.name) }
        // pmap, runCommand and verbose are project-local helpers:
        // pmap maps in parallel, runCommand execs a command and returns
        // a (exitCode, output) pair, verbose logs a message.
        .toList().pmap {
            // Make a request on each cluster to refresh the access tokens
            val versionCmd = "kubectl version --context=${it.name}".runCommand()
            if (versionCmd.first != 0) {
                verbose(
                    "An error occurred connecting to ${it.name}. Check your k8s configuration.\n${versionCmd.second}",
                    err = true
                )
                true
            } else {
                verbose("Refreshed token for ${it.name}")
                false
            }
        }
}

fabriziofortino avatar Jul 02 '21 08:07 fabriziofortino

Any update on this issue?

subnetmarco avatar Dec 27 '21 19:12 subnetmarco

Is this still happening? I recall we added automatic token refresh mechanism for a few scenarios.

If it is, is anyone with a GCP cluster willing to contribute a fix?

manusa avatar Jan 07 '22 10:01 manusa

The specific circumstance as described -- using GCP/GKE -- is still a "problem", but AFAICT from digging into the code and some previously closed issues, this problem is not due to a bug in this repo...

(NOTE: I am not an expert on OAuth, k8s, this repo, or GCP/GKE -- what follows is just my layman understanding based on issue spelunking)

The user config (example) listed by the original issue reporter is what you get in your .kube/config if you use the GCP/GKE recommended `gcloud container clusters get-credentials ...` command AND THEN run at least one kubectl command requiring remote authentication -- which is what adds the access-token and expiry entries (`gcloud container clusters get-credentials` does not initialize those).

Subsequent kubectl invocations will use this access-token until the expiry is reached, at which point the cmd-path will be re-invoked and the .kube/config rewritten on the fly to update the access-token and expiry.

This auth-provider syntax can work with the fabric8io/kubernetes-client if and only if kubectl has been run at least once to populate the access-token AND the expiry has not been reached. Code in fabric8io/kubernetes-client for "reloading" the config in the event of an OAuth failure (see issue #2111) does not help in this situation, because nothing in the config loading (or reloading) code knows/tries to exec the configured cmd-path.

As noted in pull #1348 this entire auth config syntax that gcloud container clusters get-credentials ... provides doesn't match the "official" documented support in k8s.io/client-go for exec'ing an external process to get authentication credentials -- what is officially documented does in fact seem to be fully supported by fabric8io/kubernetes-client ...

  user:
    exec:
      apiVersion: "client.authentication.k8s.io/v1beta1"
      command: "/some/command/to/run.sh"

NOTE however that while the GCP/GKE-specific user config example noted in pull #1348 (repeated below) DOES work with kubectl, it does not work with fabric8io/kubernetes-client (at least as of v5.10.1, which I have installed locally), evidently because of something broken in how the Config code invokes the ProcessBuilder (what exactly I haven't dug into; I suspect it's related to the way it tries to force a subshell), causing the command to produce garbage output, which then confuses Serialization.unmarshal into thinking the output must be YAML...

  user:
    exec:
      apiVersion: "client.authentication.k8s.io/v1beta1"
      command: "sh"
      args:
        - "-c"
        - |
            gcloud config config-helper --format=json | jq '{"apiVersion": "client.authentication.k8s.io/v1beta1", "kind": "ExecCredential", "status": {"token": .credential.access_token, "expirationTimestamp": .credential.token_expiry}}'

    [ERROR] Failed to parse the kubeconfig.
    unacceptable code point '' (0x1B) special characters are not allowed
    in "'reader'", position 49
        at org.yaml.snakeyaml.reader.StreamReader.update(StreamReader.java:211)
        at org.yaml.snakeyaml.reader.StreamReader.ensureEnoughData(StreamReader.java:176)
        at org.yaml.snakeyaml.reader.StreamReader.ensureEnoughData(StreamReader.java:171)
        at org.yaml.snakeyaml.reader.StreamReader.peek(StreamReader.java:126)
        at org.yaml.snakeyaml.scanner.ScannerImpl.scanToNextToken(ScannerImpl.java:1198)
        at org.yaml.snakeyaml.scanner.ScannerImpl.fetchMoreTokens(ScannerImpl.java:308)
        at org.yaml.snakeyaml.scanner.ScannerImpl.checkToken(ScannerImpl.java:248)
        at org.yaml.snakeyaml.parser.ParserImpl$ParseImplicitDocumentStart.produce(ParserImpl.java:213)
        at org.yaml.snakeyaml.parser.ParserImpl.peekEvent(ParserImpl.java:165)
        at org.yaml.snakeyaml.parser.ParserImpl.checkEvent(ParserImpl.java:155)
        at org.yaml.snakeyaml.composer.Composer.getSingleNode(Composer.java:140)
        at org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(BaseConstructor.java:151)
        at org.yaml.snakeyaml.Yaml.loadFromReader(Yaml.java:490)
        at org.yaml.snakeyaml.Yaml.load(Yaml.java:429)
        at io.fabric8.kubernetes.client.utils.Serialization.unmarshalYaml(Serialization.java:369)
        at io.fabric8.kubernetes.client.utils.Serialization.unmarshal(Serialization.java:310)
        at io.fabric8.kubernetes.client.utils.Serialization.unmarshal(Serialization.java:235)
        at io.fabric8.kubernetes.client.utils.Serialization.unmarshal(Serialization.java:221)
        at io.fabric8.kubernetes.client.Config.getExecCredentialFromExecConfig(Config.java:678)
        at io.fabric8.kubernetes.client.Config.loadFromKubeconfig(Config.java:638)
        at io.fabric8.kubernetes.client.Config.tryKubeConfig(Config.java:557)
        at io.fabric8.kubernetes.client.Config.autoConfigure(Config.java:279)
        at io.fabric8.kubernetes.client.Config.<init>(Config.java:245)
        at io.fabric8.kubernetes.client.Config.<init>(Config.java:239)
        at io.fabric8.kubernetes.client.ConfigBuilder.<init>(ConfigBuilder.java:11)
        at io.fabric8.kubernetes.client.ConfigBuilder.<init>(ConfigBuilder.java:8)
        at io.fabric8.kubernetes.client.BaseClient.<init>(BaseClient.java:45)
        at io.fabric8.kubernetes.client.BaseKubernetesClient.<init>(BaseKubernetesClient.java:151)
        at io.fabric8.kubernetes.client.DefaultKubernetesClient.<init>(DefaultKubernetesClient.java:32)

...if however you put that same "gcloud | jq" logic into an (executable) shell script, the following configuration works just fine...

  user:
    exec:
      apiVersion: "client.authentication.k8s.io/v1beta1"
      command: "/tmp/auth-helper.sh"

#!/bin/sh
# this is /tmp/auth-helper.sh
gcloud config config-helper --format=json | jq '{"apiVersion": "client.authentication.k8s.io/v1beta1", "kind": "ExecCredential", "status": {"token": .credential.access_token, "expirationTimestamp": .credential.token_expiry}}'

hossman avatar Jan 11 '22 00:01 hossman

:heart: Thanks for the detailed explanation.

So I understand that a temporary workaround would be to use something similar to your auth-helper.sh script.

manusa avatar Jan 11 '22 08:01 manusa

Assuming everything I said is correct, and I'm not misunderstanding something, then short answer: "Yes" ... using the auth-helper.sh I posted -- instead of using gcloud container clusters get-credentials ... -- could be considered a workaround for using fabric8io/kubernetes-client with GKE.

Longer Answer: If i were a maintainer for fabric8io/kubernetes-client I would:

  1. File a new issue XXX to "fix" whatever regressed between pull #1348 and today such that using a config like the one cited as working in that pull request -- where the command is "sh" -- no longer works

    • Note in the issue description that the workaround is to put your shell commands into an executable script. (like my auth-helper.sh)
    • side note: I think this problem was introduced by #2308 adding Utils.getCommandPlatformPrefix() and the string concat of the command + args. Instead of execing "sh","-c","foo | bar" it execs "sh","-c","sh -c foo | bar" ... which is not the same
  2. File a new (low priority) "feature" request YYY to support the (evidently legacy) config syntax generated by gcloud container clusters get-credentials ...

    • noting in the issue summary that the config generated by gcloud container clusters get-credentials ... is not officially documented but if any GKE users want to contribute code to make it work they are welcome to.
    • point out that a workaround for the bug in GKE is to use gcloud config config-helper --format=json | jq '{"apiVersion": "client.authentication.k8s.io/v1beta1", "kind": "ExecCredential", "status": {"token": .credential.access_token, "expirationTimestamp": .credential.token_expiry}}'
    • but also link to the above bug XXX as the reason why this needs to be in a shell script
  3. resolve #2112 as "not a bug" due to the confusion/convoluted history of this issue and the fact that there is no single "fix" (let alone a single fix in this repo - get-credentials is the real culprit)

    • cross link to XXX and YYY for more details
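The subshell double-wrapping described in the side note of item 1 can be demonstrated without any Kubernetes tooling. A minimal sketch, with an `echo`/`tr` pipeline standing in for the real `gcloud | jq` pipeline:

```shell
#!/bin/sh
# Intended exec: argv = ["sh", "-c", "echo token | tr a-z A-Z"]
# The inner shell runs the whole pipeline, so we get the transformed output.
direct=$(sh -c 'echo token | tr a-z A-Z')
echo "direct=[$direct]"    # direct=[TOKEN]

# After the prefixing, the exec is: ["sh", "-c", "sh -c echo token | tr a-z A-Z"]
# Now the OUTER shell parses the pipe, and the inner command collapses to
# `sh -c echo` with $0=token, which prints only an empty line.
wrapped=$(sh -c 'sh -c echo token | tr a-z A-Z')
echo "wrapped=[$wrapped]"  # wrapped=[]
```

With gcloud in place of echo, that collapsed inner command emits usage/help text (including ANSI escape codes, hence the 0x1B character in the SnakeYAML error above) instead of the ExecCredential JSON.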

hossman avatar Jan 11 '22 17:01 hossman

The current implementation does not support BoundServiceAccountTokenVolume which is enabled as the default in EKS 1.21+. See the following links:

https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html#kubernetes-1.21

https://github.com/kubernetes/enhancements/issues/542

iciclespider avatar May 17 '22 04:05 iciclespider

When is this feature going to be supported https://github.com/kubernetes/enhancements/issues/542 in fabri8io clients. As per AWS, the api server token expires ( after 1 hour), so the clients should auto refresh the tokens to invoke the api server.

techguy0079 avatar May 19 '22 10:05 techguy0079

Removing the never-stale label. There could still be follow-up on items 1 and 2 in https://github.com/fabric8io/kubernetes-client/issues/2112#issuecomment-1010183636, but the basic support for expiration is ensured with #4802. That is, we'll always fully refresh on a failure. The only true expiration awareness was added for OIDC refreshes - exec and other tokens from the config will be refreshed every minute.

shawkins avatar May 24 '23 12:05 shawkins

This issue has been automatically marked as stale because it has had no activity for 90 days. It will be closed if no further activity occurs within 7 days. Thank you for your contributions!

stale[bot] avatar Aug 23 '23 23:08 stale[bot]