Annotations eks.amazonaws.com/skip-containers and eks.amazonaws.com/sts-regional-endpoints aren't working
What happened:
I am using EKS version 1.21 and trying to use IRSA. For that, I'm setting these annotations:

```yaml
eks.amazonaws.com/sts-regional-endpoints: "true"
eks.amazonaws.com/skip-containers: sidecar-busybox-container
```

However, the sidecar-busybox-container container is still being injected with the IRSA environment variables, and I don't see the pod using the STS regional endpoints.
What you expected to happen:
As per the docs here, the webhook should have skipped mutating the sidecar-busybox-container container and added the AWS_STS_REGIONAL_ENDPOINTS environment variable so the pod uses the regional STS endpoints.
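For reference, the mutation I expected on the iam-test container looks like this (a sketch; the `regional` value matches what the webhook injects when the annotation is honored, as seen in a later comment in this thread):

```yaml
# Expected on the iam-test container only, with sidecar-busybox-container skipped:
env:
- name: AWS_STS_REGIONAL_ENDPOINTS
  value: regional
```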
How to reproduce it (as minimally and precisely as possible):
- Create a cluster with the below YAML:

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: iam-cluster
  region: us-east-1
  version: "1.21"
availabilityZones:
  - us-east-1a
  - us-east-1b
  - us-east-1c
iam:
  withOIDC: true
  serviceAccounts:
    - metadata:
        name: s3-reader
      attachPolicyARNs:
        - "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
managedNodeGroups:
  - name: managed-ng-1
    instanceType: t3a.medium
    minSize: 1
    maxSize: 4
    desiredCapacity:
```
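Assuming the YAML above is saved as cluster.yaml (the filename is just an example), the cluster is created with:

```sh
eksctl create cluster -f cluster.yaml
```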
- Annotate the service account s3-reader with these annotations:
```sh
kubectl annotate \
  sa s3-reader \
  "eks.amazonaws.com/audience=sts.amazonaws.com" \
  "eks.amazonaws.com/sts-regional-endpoints=true" \
  "eks.amazonaws.com/token-expiration=43200" \
  "eks.amazonaws.com/skip-containers=sidecar-busybox-container"
```
- Create a pod with 2 containers:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: iam-test
spec:
  serviceAccountName: s3-reader
  restartPolicy: Never
  containers:
    - name: iam-test
      image: amazon/aws-cli
      args: [ "sts", "get-caller-identity" ]
    - name: sidecar-busybox-container
      image: radial/busyboxplus:curl
```
Once the pod is created, check the environment variables for the containers:

```sh
kubectl get pods iam-test -o json | jq -r '.spec.containers[].env'
```
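To see at a glance which container was mutated, this variant groups the injected variable names per container (a sketch; plain kubectl and jq):

```sh
# Prints one line per container: "<container-name>: <env var names>".
# Containers the webhook skipped should list no AWS_* variables.
kubectl get pod iam-test -o json \
  | jq -r '.spec.containers[] | "\(.name): \([.env[]?.name] | join(", "))"'
```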
Anything else we need to know?:
Environment: EKS v1.21
- AWS Region: us-east-1
- EKS Platform version (if using EKS, run `aws eks describe-cluster --name <name> --query cluster.platformVersion`): "eks.2"
- Kubernetes version (if using EKS, run `aws eks describe-cluster --name <name> --query cluster.version`): 1.21
- Webhook Version: Not sure how to get it from the cluster.
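(As far as I know, on EKS the pod identity webhook runs on the managed control plane, so its image version isn't directly visible from the cluster; the closest check I'm aware of is inspecting the registered webhook configuration, assuming the default object name pod-identity-webhook:

```sh
# Shows the mutating webhook registration; confirms the webhook is active
# but does not expose its version on EKS.
kubectl get mutatingwebhookconfiguration pod-identity-webhook -o yaml
```
)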
For anyone looking for a solution: eks.amazonaws.com/skip-containers only works as a pod annotation (not on the service account), and eks.amazonaws.com/sts-regional-endpoints isn't working on EKS 1.21 due to an issue that was fixed recently.
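A minimal sketch of the corrected placement, reusing the repro pod from above (the annotation moves into the pod metadata; untested):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: iam-test
  annotations:
    # Pod-level annotation: this is where the webhook looks for skip-containers.
    eks.amazonaws.com/skip-containers: sidecar-busybox-container
spec:
  serviceAccountName: s3-reader
  restartPolicy: Never
  containers:
    - name: iam-test
      image: amazon/aws-cli
      args: [ "sts", "get-caller-identity" ]
    - name: sidecar-busybox-container
      image: radial/busyboxplus:curl
```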
eks.amazonaws.com/sts-regional-endpoints isn't working on our EKS clusters with version 1.21
@oba11 is this working for you now? Trying to confirm whether the fix for https://github.com/aws/amazon-eks-pod-identity-webhook/issues/110 was included in the "Platform" release of eks.3 under 1.21.
There was an outage in the AWS us-east-1 region yesterday, and we discovered that the latest "eks.3 + k8s 1.21" platform version doesn't include this fix. Adding an eks.amazonaws.com/sts-regional-endpoints annotation doesn't work.
We were hitting the STS endpoint in the us-east-1 region, which resulted in the pods running with the EKS node role instead of the IAM role that was passed to them via the ServiceAccount annotation.
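For anyone needing to confirm which identity a pod actually ended up with, one quick check is to call STS from inside the pod (a sketch, assuming the container has the AWS CLI available):

```sh
# If the returned ARN is the node instance role rather than the role from the
# eks.amazonaws.com/role-arn annotation, the pod fell back to the node profile.
kubectl exec <pod-name> -- aws sts get-caller-identity
```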
Same here, it does not work for me either.
Kubernetes version: 1.21, platform version: eks.2.
Can we please prioritize fixing this?
This should be fixed in EKS 1.21 eks.3. @vgrigoruk, could you share your serviceaccount and pod specs if possible (with ARN, account ID, etc. redacted)?
Here are my specs and test for reference; I see AWS_STS_REGIONAL_ENDPOINTS on my Pod as expected:
```sh
$ kubectl version
Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.2-eks-06eac09", GitCommit:"5f6d83fe4cb7febb5f4f4e39b3b2b64ebbbe3e97", GitTreeState:"clean", BuildDate:"2021-09-13T14:20:15Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}
```
```sh
$ eksctl create iamserviceaccount \
    --name matthew \
    --namespace default \
    --cluster my-cluster \
    --attach-policy-arn arn:aws:iam::aws:policy/IAMReadOnlyAccess \
    --approve \
    --override-existing-serviceaccounts

$ kubectl annotate serviceaccount -n default matthew eks.amazonaws.com/sts-regional-endpoints=true --overwrite \
    && k delete po pause && k create -f kubernetes/pod-matthew.yaml && k get po pause -o yaml | grep AWS
serviceaccount/matthew annotated
pod "pause" deleted
pod/pause created
    - name: AWS_STS_REGIONAL_ENDPOINTS
    - name: AWS_DEFAULT_REGION
    - name: AWS_REGION
    - name: AWS_ROLE_ARN
    - name: AWS_WEB_IDENTITY_TOKEN_FILE
```
```yaml
~ $ k get sa matthew -o yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::x:role/eksctl-my-cluster-addon-iam-Role1-1X9GRP4HWB56F
    eks.amazonaws.com/sts-regional-endpoints: "true"
  creationTimestamp: "2021-12-15T21:46:18Z"
  labels:
    app.kubernetes.io/managed-by: eksctl
  name: matthew
  namespace: default
  resourceVersion: "4491"
  uid: 448b4fc8-5f29-4503-852d-3e18125624df
secrets:
- name: matthew-token-lgpvj
```
```yaml
~ $ cat kubernetes/pod-matthew.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pause
spec:
  containers:
  - name: pause
    image: k8s.gcr.io/pause
  serviceAccount: matthew
```
```yaml
~ $ k get po pause -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/psp: eks.privileged
  creationTimestamp: "2021-12-17T00:40:42Z"
  name: pause
  namespace: default
  resourceVersion: "215892"
  uid: 40f639a8-7c0f-4670-b4ea-e443dc91167d
spec:
  containers:
  - env:
    - name: AWS_STS_REGIONAL_ENDPOINTS
      value: regional
    - name: AWS_DEFAULT_REGION
      value: us-west-2
    - name: AWS_REGION
      value: us-west-2
    - name: AWS_ROLE_ARN
      value: arn:aws:iam::x:role/eksctl-my-cluster-addon-iam-Role1-1X9GRP4HWB56F
    - name: AWS_WEB_IDENTITY_TOKEN_FILE
      value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
    image: k8s.gcr.io/pause
    imagePullPolicy: Always
    name: pause
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-8gbqg
      readOnly: true
    - mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
      name: aws-iam-token
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: ip-192-168-101-177.us-west-2.compute.internal
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: matthew
  serviceAccountName: matthew
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: aws-iam-token
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          audience: sts.amazonaws.com
          expirationSeconds: 86400
          path: token
  - name: kube-api-access-8gbqg
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2021-12-17T00:40:42Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2021-12-17T00:40:44Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2021-12-17T00:40:44Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2021-12-17T00:40:42Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://aab8cd024e1de7051116d4c5dd287dcd81c71b8deee6a521bcb0f062182d5d2e
    image: k8s.gcr.io/pause:latest
    imageID: docker-pullable://k8s.gcr.io/pause@sha256:a78c2d6208eff9b672de43f880093100050983047b7b0afe0217d3656e1b0d5f
    lastState: {}
    name: pause
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2021-12-17T00:40:43Z"
  hostIP: 192.168.101.177
  phase: Running
  podIP: 192.168.101.209
  podIPs:
  - ip: 192.168.101.209
  qosClass: BestEffort
  startTime: "2021-12-17T00:40:42Z"
```
I'm running on eks.4, and although the environment variables are now set correctly, the endpoint it tries to use is still us-east-1. I've got the AWS region/default environment variables set to eu-west-2, but it calls the us-east-1 endpoint.
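One client-side way to see which endpoint is actually being resolved (a sketch; --debug is a standard AWS CLI flag, though its output format varies by version):

```sh
# A regional setup should log sts.eu-west-2.amazonaws.com rather than
# sts.amazonaws.com (the global endpoint, which is served from us-east-1).
kubectl exec <pod-name> -- aws sts get-caller-identity --debug 2>&1 | grep -i 'sts'
```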
@barrydobson I'd like to know a bit more to debug the issue you're facing. How are you verifying that the regional endpoint is not used? Are you using CloudTrail events? You could use the CloudTrail Event history and look for the event name AssumeRoleWithWebIdentity. In the event record, you'll find clientProvidedHostHeader. Does that value contain us-east-1?
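For reference, a sketch of that CloudTrail lookup via the CLI (aws cloudtrail lookup-events is the relevant command; adjust the region and add --start-time/--end-time to narrow the window):

```sh
# Pulls recent AssumeRoleWithWebIdentity events and extracts the host header
# each caller used, which reveals global vs. regional STS endpoints.
aws cloudtrail lookup-events \
  --region eu-west-2 \
  --lookup-attributes AttributeKey=EventName,AttributeValue=AssumeRoleWithWebIdentity \
  --query 'Events[].CloudTrailEvent' --output text \
  | grep -o '"clientProvidedHostHeader":"[^"]*"'
```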
Any update on when this will be fixed? I have pods that are annotated correctly to use the regional endpoint, and I still get intermittent issues where the pod falls back to the EKS node instance profile. Most of the time when the exception is raised, the logs show it tries to use the global endpoint and us-east-1.