Cannot use wildcard (*) namespace in kops when using IRSA
What happened:
We are trying to use the wildcard namespace feature in kops that was introduced in https://github.com/kubernetes/kops/pull/16113. When we set a wildcard namespace in the kops cluster manifest and then create a pod that references the service account and IAM policy, the pod is not mutated and the pod-identity-webhook logs show this error:
I0821 08:42:55.226946 1 handler.go:395] Pod was not mutated. Reason: Service account did not have the right annotations or was not found in the cache. Pod=ssm-ec2-test, ServiceAccount=ssm-ec2, Namespace=default
What you expected to happen: Pod to be mutated and contain the required policy/role.
How to reproduce it (as minimally and precisely as possible): In the kops cluster manifest, we have this:
spec:
  iam:
    allowContainerRegistry: true
    legacy: false
    serviceAccountExternalPermissions:
    - name: ssm-ec2
      aws:
        policyARNs:
        - arn:aws:iam::<ACCOUNT_ID>:policy/access-ec2-with-ssm
      namespace: "*"
Then we try to deploy a workload:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ssm-ec2
  namespace: default
---
apiVersion: v1
kind: Pod
metadata:
  name: ssm-ec2-test
  namespace: default
spec:
  containers:
  - name: aws-cli
    image: amazon/aws-cli:latest
    command:
    - sleep
    - "30000"
  serviceAccountName: "ssm-ec2"
pod-identity-webhook complains with:
I0821 08:42:54.833148 1 cache.go:179] Adding SA default/ssm-ec2 to SA cache: &{RoleARN: Audience: UseRegionalSTS:false TokenExpiration:0}
I0821 08:42:54.833397 1 cache.go:179] Adding SA default/ssm-ec2 to SA cache: &{RoleARN: Audience: UseRegionalSTS:false TokenExpiration:0}
I0821 08:42:55.226659 1 cache.go:80] Fetching sa default/ssm-ec2 from cache
I0821 08:42:55.226847 1 cache.go:93] Service account default/ssm-ec2 not found in cache
I0821 08:42:55.226946 1 handler.go:395] Pod was not mutated. Reason: Service account did not have the right annotations or was not found in the cache. Pod=ssm-ec2-test, ServiceAccount=ssm-ec2, Namespace=default
Anything else we need to know?: When we change the "*" to a specific namespace (for example, default), everything works as expected.
Environment:
- AWS Region: tested in eu-west-1 but should be valid in all.
- Kubernetes version (if using EKS, run aws eks describe-cluster --name <name> --query cluster.version): 1.24.16 (not EKS)
- Webhook Version: v0.4.0
I discussed this with @olemarkus in the #kops-users Slack channel, and he thinks that https://github.com/aws/amazon-eks-pod-identity-webhook/blob/master/pkg/cache/cache.go#L130 needs to check for both namespace + name and "*" + name.
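To illustrate the suggested lookup, here is a minimal Go sketch of checking namespace + name first and falling back to "*" + name. It is not the webhook's actual code: the `saCache` type, the `lookup` method, the "namespace/name" key format, and the role ARN are all hypothetical and exist only for this example.

```go
package main

import "fmt"

// saCache is a hypothetical stand-in for the webhook's service account cache.
// Keys use the form "namespace/name"; values are role ARNs.
type saCache map[string]string

// lookup returns the role ARN for a service account. It prefers an exact
// namespace match and falls back to a wildcard ("*") entry with the same name.
func (c saCache) lookup(namespace, name string) (string, bool) {
	if arn, ok := c[namespace+"/"+name]; ok {
		return arn, true
	}
	arn, ok := c["*/"+name]
	return arn, ok
}

func main() {
	// Assumed cache content: one wildcard entry with a made-up role ARN.
	cache := saCache{
		"*/ssm-ec2": "arn:aws:iam::111122223333:role/example-wildcard-role",
	}

	// A pod in "default" using the ssm-ec2 service account resolves via
	// the wildcard entry because no exact "default/ssm-ec2" key exists.
	arn, ok := cache.lookup("default", "ssm-ec2")
	fmt.Println(arn, ok)
}
```

With a lookup like this, an exact namespace entry still takes precedence over the wildcard, so existing per-namespace configuration would keep its current behavior.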
EDIT: I can provide the full cluster spec after redacting sensitive parts if needed.
@kmala Do you know if there's anyone that could take a look at this? Thanks!
The change looks small, since we want to support a wildcard for all namespaces, and I don't see any issue with supporting this. Let me check if anyone can work on it.
Awesome, thanks for checking!
I can probably do the PR as well, but it will take a few days before I can find the time.
Hi @kmala @olemarkus I was wondering if you might have some sort of update for this? We are actually waiting to adopt this feature, which is kinda blocked by this issue.
🙏 @olemarkus. That would be greatly appreciated.
Although, @kmala, it looks like @olemarkus is no longer active: I don't see any commits since August in his GH profile. Would you or anyone else be able to make this small fix?
I am a bit busy currently and hence can't commit to it, but I can help review the changes. Otherwise, I will try to get this prioritized.
I am active, just working in other ways :) I have a few things with slightly higher priority, but I'll try to have something by tomorrow.
Have a look at https://github.com/aws/amazon-eks-pod-identity-webhook/pull/251