Kubernetes 1.24: Token-handling code assumes auto-created service account token secrets

Open liggitt opened this issue 2 years ago • 17 comments

A sweep for code that scrapes auto-generated Kubernetes token secrets, done in preparation for Kubernetes 1.24, showed that the following code locations assume auto-generated tokens will exist:

https://github.com/argoproj/argo-workflows/blob/62e0a8ce4e74d2e19f3a9c0fb5e52bd58a6b944b/workflow/controller/operator.go#L3605

https://github.com/argoproj/argo-workflows/blob/a3c326fdf0d2133d5e78ef71854499f576e7e530/server/auth/webhook/interceptor.go#L90

https://github.com/argoproj/argo-workflows/blob/a3c326fdf0d2133d5e78ef71854499f576e7e530/server/auth/gatekeeper.go#L321

That assumption is not universally correct.

In 1.21+, secret-based tokens are no longer used for mounting into pods (ephemeral time-limited tokens are), and the token controller can be turned off.

In 1.24+, secret-based tokens are no longer auto-created by default for new service accounts.

Using ephemeral time-bound tokens is preferred in 1.21+ (see the TokenRequest API) if possible.

If a secret-based token is still desired, one can be created manually, but will not be referenced from the service account's .secrets list.
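
A minimal sketch of what a TokenRequest call looks like at the HTTP level, using only the Go standard library. The struct and function names (`tokenRequest`, `tokenRequestPath`, `newTokenRequestBody`) and the `argo`/`default` values are illustrative assumptions, not code from argo-workflows; with client-go the same request is normally made through the typed client rather than by hand.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tokenRequest models only the fields of the authentication.k8s.io/v1
// TokenRequest object needed to ask for a time-bound token.
// It is a simplified stand-in, not the real API type.
type tokenRequest struct {
	APIVersion string           `json:"apiVersion"`
	Kind       string           `json:"kind"`
	Spec       tokenRequestSpec `json:"spec"`
}

type tokenRequestSpec struct {
	Audiences         []string `json:"audiences,omitempty"`
	ExpirationSeconds int64    `json:"expirationSeconds"`
}

// tokenRequestPath returns the API path of the "token" subresource
// of a service account.
func tokenRequestPath(namespace, serviceAccount string) string {
	return fmt.Sprintf("/api/v1/namespaces/%s/serviceaccounts/%s/token", namespace, serviceAccount)
}

// newTokenRequestBody builds the JSON body to POST to tokenRequestPath.
func newTokenRequestBody(expirationSeconds int64) ([]byte, error) {
	return json.Marshal(tokenRequest{
		APIVersion: "authentication.k8s.io/v1",
		Kind:       "TokenRequest",
		Spec:       tokenRequestSpec{ExpirationSeconds: expirationSeconds},
	})
}

func main() {
	body, err := newTokenRequestBody(3600)
	if err != nil {
		panic(err)
	}
	// Namespace and service account name are placeholders.
	fmt.Println("POST", tokenRequestPath("argo", "default"))
	fmt.Println(string(body))
}
```

The response carries the issued token in its status, so nothing has to be stored in a Secret object.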


Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

liggitt avatar Apr 05 '22 18:04 liggitt

@liggitt do you have some thoughts on how to adapt the code to address this?

alexec avatar Apr 05 '22 20:04 alexec

if you need tokens to use outside the context of a pod (where they continue to be mounted in automatically for use), either request one using the TokenRequest API, or create a secret to hold one.

liggitt avatar Apr 05 '22 20:04 liggitt

None of these SAs are useful if they do not have a token. The user could presumably create a token?

alexec avatar Apr 05 '22 21:04 alexec

Service account tokens are not automatically ambient in secrets in 1.24+ (and possibly in 1.21+ if the cluster owner turned off the token controller since it is no longer required for getting tokens into pods).

Tokens are created automatically for mounting into pods (but are not persisted in Secret API objects).

If a secret-based token is desired/needed, it can be created manually, but would not be referenced from the ServiceAccount's secrets list field.

liggitt avatar Apr 05 '22 21:04 liggitt

it looks like at least some of the uses of getServiceAccountTokenName are to get the name of a Secret object in order to create a volume mount mounting in the token?

If you can't resolve a secret name, I'd recommend using projected token volumes instead, which let you request a token be mounted into a pod without needing to look up a Secret name.

See https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/#bound-service-account-token-volume for details on how to replicate the content of an old secret-based token volume with the new projected token volumes

liggitt avatar Apr 05 '22 21:04 liggitt

Do projected volumes work on GKE Autopilot?

alexec avatar Apr 05 '22 21:04 alexec

yes

liggitt avatar Apr 05 '22 21:04 liggitt

@jessesuen @jannfis FYI

alexec avatar Apr 05 '22 21:04 alexec

Projected volumes will not work for our case. We don't know the name of the service account token ahead of time. A single pod reads many different tokens.

I think users will have to create the tokens themselves. We should update the code to return an error (rather than panic) so that users know what to do.

@liggitt thoughts?

alexec avatar Apr 05 '22 21:04 alexec

> We don't know the name of the service account token ahead of time.

The name of the token, or the name of the service account?

> A single pod read many different tokens.

Projected token volumes can be used multiple times, but only for a single service account (the one referenced by the pod spec)

> I think users will have to create the tokens themselves

If so, they'll either need to give you the names of the secrets they created (since they won't be referenced from the serviceaccount's .secrets field), or you'll have to query for the secrets (filtering by type, then matching up the kubernetes.io/service-account.name annotation to the name of the service account you're looking for).
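
The query-and-match option above could look roughly like this. It is a sketch under assumptions: `secret` is a simplified stand-in for corev1.Secret, `findTokenSecret` is a hypothetical helper, and the slice passed in is assumed to be the result of listing secrets in the namespace.

```go
package main

import "fmt"

const (
	// Secret type used for service account token secrets.
	serviceAccountTokenType = "kubernetes.io/service-account-token"
	// Annotation that ties a token secret to its service account.
	serviceAccountNameKey = "kubernetes.io/service-account.name"
)

// secret is a simplified, hypothetical stand-in for corev1.Secret.
type secret struct {
	Name        string
	Type        string
	Annotations map[string]string
}

// findTokenSecret filters by secret type, then matches the
// service-account.name annotation against the wanted service account.
// It returns the first match and whether one was found.
func findTokenSecret(secrets []secret, serviceAccountName string) (secret, bool) {
	for _, s := range secrets {
		if s.Type != serviceAccountTokenType {
			continue
		}
		if s.Annotations[serviceAccountNameKey] == serviceAccountName {
			return s, true
		}
	}
	return secret{}, false
}

func main() {
	secrets := []secret{
		{Name: "registry-creds", Type: "kubernetes.io/dockerconfigjson"},
		{
			Name:        "my-sa-token",
			Type:        serviceAccountTokenType,
			Annotations: map[string]string{serviceAccountNameKey: "my-sa"},
		},
	}
	if s, ok := findTokenSecret(secrets, "my-sa"); ok {
		fmt.Println("found token secret:", s.Name)
	}
}
```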

Use of secret-based tokens (even manually created ones) is discouraged, since it exposes token content via the API.

liggitt avatar Apr 05 '22 22:04 liggitt

> > We don't know the name of the service account token ahead of time.

> The name of the token, or the name of the service account?

Sorry. We don't know the service account name ahead of time (i.e. when pod starts up).

> If so, they'll either need to give you the names of the secrets they created (since they won't be referenced from the serviceaccount's .secrets field), or you'll have to query for the secrets (filtering by type, then matching up the kubernetes.io/service-account.name annotation to the name of the service account you're looking for).

> Use of secret-based tokens (even manually created ones) is discouraged, since it exposes token content via the API.

I understand.

alexec avatar Apr 05 '22 22:04 alexec

This looks related to "argocd cluster add ..." timing out while waiting for a secret.

lknite avatar May 07 '22 22:05 lknite

Looking at the code, you could change the method getServiceAccountTokenName so that it retrieves the secret by the annotation kubernetes.io/service-account.name: "service account name", using operator.go as an example, but that requires users to explicitly create the secret. This means you would be missing out on the security enhancements provided in 1.24.

It seems to me that the long-term solution is to create a token at run time from the argoexec container using something like

...CoreV1().ServiceAccounts(Namespace).CreateToken(ctx, name, &tokenRequest, metav1.CreateOptions{})

based on the name of the service account, and inject it into the container as a tmpfs volume, as there will be no way to mount a pre-existing service account secret.

jrhoward avatar Sep 09 '22 07:09 jrhoward

I found another place where we assume tokens to be auto-generated:

https://github.com/argoproj/argo-workflows/blob/30bd96b4c030fb728a3da78e0045982bf778d554/server/auth/gatekeeper.go#L319

This essentially means that you can't run argo-workflows on k8s 1.24 (like we do) with SSO enabled.

Until it gets addressed, maybe we should add something to the documentation, or pin this issue with a note explaining that it is not compatible with k8s 1.24.

@juliev0 @sarabala1979 this issue needs to be scheduled and fixed.

alexec avatar Sep 14 '22 16:09 alexec

@alexec I will add it to RTB. We will look at this after 3.4.0; we can target it for 3.4.1.

sarabala1979 avatar Sep 14 '22 19:09 sarabala1979

v3.5 would be fine I think.

alexec avatar Sep 14 '22 19:09 alexec

Consider testing with k8s v1.25.

gcsfred2 avatar Sep 19 '22 21:09 gcsfred2

work around https://stackoverflow.com/questions/73320413/argo-http-workflow-failed-to-get-token-volumes-service-account-argo-default-d

sarabala1979 avatar Sep 19 '22 21:09 sarabala1979

> work around https://stackoverflow.com/questions/73320413/argo-http-workflow-failed-to-get-token-volumes-service-account-argo-default-d

I attempted this workaround with 1.24.4 and was unsuccessful; I ended up with the same error message. The only thing that worked for me was setting up K8s 1.23.10 and installing Argo Workflows, which created the SA with the Secret, then upgrading to K8s 1.24.4 (the upgrade doesn't retroactively destroy my SA).

However, it could be that I didn't create the service-account-token correctly...

apiVersion: v1
kind: Secret
metadata:
  name: default-token
  annotations:
    kubernetes.io/service-account.name: "default"
type: kubernetes.io/service-account-token

callMeFord avatar Sep 19 '22 23:09 callMeFord

That is not a workaround. The current implementation in Workflows looks for a section in the service account manifest that references the secret. That section is no longer populated automatically in 1.24, so it requires a code change in Workflows.

apiVersion: v1
kind: ServiceAccount
.......
secrets:
- name: the-secret-token-name

jrhoward avatar Sep 20 '22 02:09 jrhoward

Is there a way to install AW with @alexec 's PR?

gcsfred2 avatar Sep 23 '22 16:09 gcsfred2

I need someone to take over the PR, as I do not have bandwidth. Can someone please volunteer?

alexec avatar Sep 23 '22 16:09 alexec

> Is there a way to install AW with @alexec 's PR?

You can follow the running locally documentation, but check out the branch fix-8320.

Note: I had to remove go.sum and rerun make clean to account for the module changes.

jrhoward avatar Sep 23 '22 23:09 jrhoward

Don't think I am qualified to take over the PR but I'll help out where I can as I have the branch running locally with 1.24.

I have installed the sample executor plugin and successfully run a sample workflow based on the revised instructions with a service account token suffixed with .service-account-token.

jrhoward avatar Sep 23 '22 23:09 jrhoward

interceptor.go is missing updates. I modified and built locally.

https://github.com/argoproj/argo-workflows/blob/a3c326fdf0d2133d5e78ef71854499f576e7e530/server/auth/webhook/interceptor.go#L87-L90

Remove lines 87 to 89 and change line 90 from serviceAccount.Secrets[0].Name to leverage the new util:

tokenSecret, err := secretsInterface.Get(ctx, secrets.SecretName(serviceAccount), metav1.GetOptions{})
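
The thread does not show what secrets.SecretName does internally. One plausible convention, assumed here from the `.service-account-token` suffix mentioned earlier, is to prefer a secret listed on the service account and otherwise fall back to a well-known name. The helper and types below are hypothetical and only illustrate that idea.

```go
package main

import "fmt"

// tokenSecretSuffix is an assumption based on the suffix mentioned in this
// thread; it is not confirmed to be what the PR uses.
const tokenSecretSuffix = ".service-account-token"

// objectReference and serviceAccount are simplified, hypothetical stand-ins
// for the corev1 types.
type objectReference struct{ Name string }

type serviceAccount struct {
	Name    string
	Secrets []objectReference
}

// secretName prefers the first secret listed on the service account
// (pre-1.24 behaviour) and otherwise falls back to the conventional
// "<service account name>.service-account-token" name, which the user
// would be expected to create manually.
func secretName(sa serviceAccount) string {
	if len(sa.Secrets) > 0 {
		return sa.Secrets[0].Name
	}
	return sa.Name + tokenSecretSuffix
}

func main() {
	legacy := serviceAccount{Name: "default", Secrets: []objectReference{{Name: "default-token-x7k2p"}}}
	modern := serviceAccount{Name: "default"}
	fmt.Println(secretName(legacy))
	fmt.Println(secretName(modern))
}
```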

jrhoward avatar Sep 24 '22 00:09 jrhoward

I followed the steps of https://argoproj.github.io/argo-workflows/running-locally/ cloning @terrytangyuan 's https://github.com/terrytangyuan/argo-workflows/tree/support-k8s-124 and I got an error with make start: https://pastebin.com/NCn3s7Wr

gcsfred2 avatar Sep 26 '22 18:09 gcsfred2

@gcsfred2 That is unrelated. Try creating a new local cluster and start again.

terrytangyuan avatar Sep 26 '22 18:09 terrytangyuan

Retrying https://github.com/terrytangyuan/argo-workflows/tree/support-k8s-124 and running locally. My recent error: https://pastebin.com/hBySXzWV . There was no error running "make start UI=true" and the Hello World example.

gcsfred2 avatar Sep 29 '22 23:09 gcsfred2

OK, I just discovered that there is a workaround which does not require a code change in Argo Workflows.

I'm not sure if it is documented, so I can't vouch for how long it will last, but you can manually insert the secret name into the service account manifest, which means you can deploy Workflows without a fix on 1.24. As an example, this works on server version: {Major:"1", Minor:"24", GitVersion:"v1.24.4+k3s1"} (yes, tested with k3d > k3s).

Note that the trick is to define the secret name in the service account manifest.

I applied the following manifests:

apiVersion: v1
kind: Secret
type: kubernetes.io/service-account-token
metadata:
  name: uber-argo
  annotations:
    kubernetes.io/service-account.name: "uber-argo"
---

apiVersion: v1
kind: ServiceAccount
metadata:
  name: uber-argo
secrets:
- name: uber-argo

The auto-creation functionality has been removed, but the secrets field is still valid in the spec.

jrhoward avatar Oct 07 '22 09:10 jrhoward