aws-load-balancer-controller
Unable to migrate to pod-identity
Describe the bug
I am trying to configure the aws-load-balancer-controller to use pod identity, but I keep getting: NoCredentialProviders: no valid providers in chain. Deprecated. For verbose messaging see aws.Config.CredentialsChainVerboseErrors
Steps to reproduce
1. Deployed the aws-load-balancer-controller component using Helm (latest version).
2. In a cluster with OIDC, I set up:
- a Role with enough permissions
- a trust policy that allows the aws-load-balancer-controller serviceAccount to assume said role using the OIDC provider
This setup renders a working release of aws-load-balancer-controller.
When migrating to EKS Pod Identity, I set up:
- the EKS add-on for Pod Identity (tested to work, as I had already migrated the secrets-store-csi-driver component)
- an association between the aws-load-balancer-controller SA and the Role
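The association step above can be sketched with the AWS CLI; cluster name, account ID, and role name below are placeholders, not values from this issue:

```shell
# Create the association between the controller's service account and the IAM role
# (cluster name and role ARN are hypothetical):
aws eks create-pod-identity-association \
  --cluster-name my-cluster \
  --namespace kube-system \
  --service-account aws-load-balancer-controller \
  --role-arn arn:aws:iam::111122223333:role/aws-load-balancer-controller

# Verify the association was registered for that namespace:
aws eks list-pod-identity-associations \
  --cluster-name my-cluster \
  --namespace kube-system
```

Note that unlike IRSA, the role's trust policy here must trust the pods.eks.amazonaws.com service principal rather than the cluster's OIDC provider.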
Restarting the deployment generates new pods with their corresponding environment variables. When applying a load balancer, I do see the infamous
{"level":"error","ts":"2024-09-13T20:07:16Z","msg":"Reconciler error","controller":"ingress","object":{"name":"my-test-ingress","namespace":"some-namespace"},"namespace":"kube-system","name":"my-test-ingress","reconcileID":"70fc8680-1c00-4efc-9d38-5b90019b1e37","error":"ingress: kube-system/my-test-ingress NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"}
Expected outcome
I expect a load balancer to be provisioned and the error to no longer appear.
Environment
- AWS Load Balancer controller version: 2.8.2
- Kubernetes version 1.30
- Using EKS (yes/no), if so version? using EKS 1.30
Additional context: I already verified that the hop limit is set to 2 in the instance metadata options, made sure no iptables rule is dropping instance metadata traffic, and reviewed that no userdata script prevents access to the service. I also have another chart migrated without issues.
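For anyone following along, the hop-limit check mentioned above can be done from the CLI; a sketch, with a placeholder instance ID:

```shell
# Inspect the metadata options of a worker node (instance ID is hypothetical):
aws ec2 describe-instances \
  --instance-ids i-0123456789abcdef0 \
  --query 'Reservations[].Instances[].MetadataOptions.{HopLimit:HttpPutResponseHopLimit,Tokens:HttpTokens}' \
  --output table
```

A hop limit of 2 (or more) is what allows pods on a bridged network to reach IMDS; with Pod Identity the SDK should not need IMDS at all, which makes an IMDS error a sign the container credentials were not picked up.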
@ricfdez Will you be able to provide us with the environment variables from the controller?
I'd like to pick up this issue. I believe that I am having the same issue as described here.
Environment
- AWS Load Balancer controller version: 2.9.0
- Kubernetes version: 1.30
- Using EKS (yes/no), if so version? using EKS 1.30
My error log is slightly different:
"error": "ingress: hello/hello: operation error ACM: ListCertificates, get identity: get credentials: failed to refresh cached credentials, no EC2 IMDS role found, operation error ec2imds: GetMetadata, canceled, context deadline exceeded"
My version is looking for EC2 credentials instead of just not finding any; either way, the pod identity credentials are not being picked up.
With respect to environment variables, I can't get env to run with kubectl exec on the albc pod; but I started an awscli pod with the same service account and got the following environment variables:
AWS_STS_REGIONAL_ENDPOINTS=regional
AWS_DEFAULT_REGION=us-west-2
AWS_REGION=us-west-2
AWS_CONTAINER_CREDENTIALS_FULL_URI=http://169.254.170.23/v1/credentials
AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE=/var/run/secrets/pods.eks.amazonaws.com/serviceaccount/eks-pod-identity-token
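Given those two AWS_CONTAINER_* variables, the pod-identity endpoint can be exercised by hand from inside any pod with the association; this mirrors what the SDK does when resolving container credentials:

```shell
# Read the projected pod-identity token and call the agent's credentials endpoint.
# Both environment variables are injected by the EKS Pod Identity webhook.
TOKEN="$(cat "$AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE")"
curl -s -H "Authorization: $TOKEN" "$AWS_CONTAINER_CREDENTIALS_FULL_URI"
# A working setup returns a JSON document containing AccessKeyId,
# SecretAccessKey, Token, and Expiration for the associated role.
```

If this call fails or times out, the problem is between the pod and the eks-pod-identity-agent (or the association itself) rather than in the controller.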
I checked out the aws/config source version that you are using to ensure that they will support the pod identity credentials, and they appear to load them here: https://github.com/aws/aws-sdk-go-v2/blob/config/v1.27.27/config/resolve_credentials.go#L312
Not sure what the issue is at this point, but I'll take a closer look in the future.
The issue here appears to have been a stale tag on my EKS cluster subnets from a previous setup attempt. The tags described here pointed to the wrong cluster; after removing those tags, everything worked fine.
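For others hitting this, the stale cluster-ownership tags can be spotted from the CLI; a sketch, with placeholder subnet IDs:

```shell
# List the kubernetes.io/cluster/* ownership tags on the cluster's subnets
# (subnet IDs are hypothetical) to spot tags pointing at an old cluster name:
aws ec2 describe-subnets \
  --subnet-ids subnet-0aaa subnet-0bbb \
  --query 'Subnets[].{Id:SubnetId,ClusterTags:Tags[?starts_with(Key, `kubernetes.io/cluster/`)]}' \
  --output json
```

Any kubernetes.io/cluster/&lt;name&gt; tag whose name does not match the current cluster is a candidate for removal.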
I will update some documentation to satisfy this issue and demonstrate how pod identities can work.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Hello, was this officially resolved? I am getting the same error you describe here, and I don't have old tags on my subnets; I only have the proper tags.
Everything else checks out: the pod identity agent is running, the IAM role is in place, and it is associated with the service account in the EKS console's access panel. What else could be causing this?
Just FYI, I got past this error by running
aws ec2 modify-instance-metadata-options --instance-id eks_node_id --http-put-response-hop-limit 3
for every EKS node. I'm confused why this was needed, because it worked in GovCloud without it, but in commercial I had to do it; not sure where the difference is.