aws-load-balancer-controller
Unable to migrate to pod-identity
Describe the bug
I am trying to configure the aws-load-balancer-controller to use pod identity, but I keep getting: NoCredentialProviders: no valid providers in chain. Deprecated. For verbose messaging see aws.Config.CredentialsChainVerboseErrors
Steps to reproduce
1. Deployed the aws-load-balancer-controller component using Helm (latest version).
2. In a cluster with OIDC, I set up:
- a Role with enough permissions
- a trust policy that allows the aws-load-balancer-controller serviceAccount to assume said role using the OIDC provider
This setup renders a working release of aws-load-balancer-controller.
When migrating to EKS Pod Identity, I set up:
- the EKS add-on for Pod Identity (tested to work, as I had already migrated the secrets-store-csi-driver component)
- an association between the aws-load-balancer-controller SA and the Role
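The association step above can be sketched with the AWS CLI; cluster name, account ID, and role name below are placeholders, not values from this issue:

```shell
# Create the association between the controller's service account and the IAM role
# (cluster name and role ARN are hypothetical):
aws eks create-pod-identity-association \
  --cluster-name my-cluster \
  --namespace kube-system \
  --service-account aws-load-balancer-controller \
  --role-arn arn:aws:iam::111122223333:role/aws-load-balancer-controller

# Verify the association was registered for that namespace:
aws eks list-pod-identity-associations \
  --cluster-name my-cluster \
  --namespace kube-system
```

Note that unlike IRSA, the role's trust policy here must trust the pods.eks.amazonaws.com service principal rather than the cluster's OIDC provider.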
Restarting the deployment generates new pods with their corresponding environment variables. When applying a load balancer, I do see the infamous
{"level":"error","ts":"2024-09-13T20:07:16Z","msg":"Reconciler error","controller":"ingress","object":{"name":"my-test-ingress","namespace":"some-namespace"},"namespace":"kube-system","name":"my-test-ingress","reconcileID":"70fc8680-1c00-4efc-9d38-5b90019b1e37","error":"ingress: kube-system/my-test-ingress NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"}
Expected outcome
I expect a load balancer to be provisioned and the error to no longer appear.
Environment
- AWS Load Balancer controller version: 2.8.2
- Kubernetes version 1.30
- Using EKS (yes/no), if so version? using EKS 1.30
Additional context: I already verified that the hop limit is set to 2 in the instance metadata options, made sure no iptables rule is dropping instance metadata traffic, and reviewed that no userdata script prevents access to the service. I also have another chart migrated without issues.
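For anyone following along, the hop-limit check mentioned above can be done from the CLI; a sketch, with a placeholder instance ID:

```shell
# Inspect the metadata options of a worker node (instance ID is hypothetical):
aws ec2 describe-instances \
  --instance-ids i-0123456789abcdef0 \
  --query 'Reservations[].Instances[].MetadataOptions.{HopLimit:HttpPutResponseHopLimit,Tokens:HttpTokens}' \
  --output table
```

A hop limit of 2 (or more) is what allows pods on a bridged network to reach IMDS; with Pod Identity the SDK should not need IMDS at all, which makes an IMDS error a sign the container credentials were not picked up.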
@ricfdez Will you be able to provide us with the environment variables from the controller?
I'd like to pick up this issue. I believe that I am having the same issue as described here.
Environment
- AWS Load Balancer controller version: 2.9.0
- Kubernetes version: 1.30
- Using EKS (yes/no), if so version? using EKS 1.30
My error log is slightly different:
"error": "ingress: hello/hello: operation error ACM: ListCertificates, get identity: get credentials: failed to refresh cached credentials, no EC2 IMDS role found, operation error ec2imds: GetMetadata, canceled, context deadline exceeded"
My version is looking for EC2 credentials instead of just not finding any; either way, the pod identity credentials are not being picked up.
With respect to environment variables, I can't get env to run with kubectl exec on the albc pod; but I started an awscli pod with the same service account and got the following environment variables:
AWS_STS_REGIONAL_ENDPOINTS=regional
AWS_DEFAULT_REGION=us-west-2
AWS_REGION=us-west-2
AWS_CONTAINER_CREDENTIALS_FULL_URI=http://169.254.170.23/v1/credentials
AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE=/var/run/secrets/pods.eks.amazonaws.com/serviceaccount/eks-pod-identity-token
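Given those two AWS_CONTAINER_* variables, the pod-identity endpoint can be exercised by hand from inside any pod with the association; this mirrors what the SDK does when resolving container credentials:

```shell
# Read the projected pod-identity token and call the agent's credentials endpoint.
# Both environment variables are injected by the EKS Pod Identity webhook.
TOKEN="$(cat "$AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE")"
curl -s -H "Authorization: $TOKEN" "$AWS_CONTAINER_CREDENTIALS_FULL_URI"
# A working setup returns a JSON document containing AccessKeyId,
# SecretAccessKey, Token, and Expiration for the associated role.
```

If this call fails or times out, the problem is between the pod and the eks-pod-identity-agent (or the association itself) rather than in the controller.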
I checked out the aws/config source version that you are using to ensure that they will support the pod identity credentials, and they appear to load them here: https://github.com/aws/aws-sdk-go-v2/blob/config/v1.27.27/config/resolve_credentials.go#L312
Not sure what the issue is at this point, but I'll take a closer look in the future.
The issue here appears to have been a stale tag on my EKS cluster subnets from a previous setup attempt. The tags described here pointed to the wrong cluster; after removing those tags, everything worked fine.
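For others hitting this, the stale cluster-ownership tags can be spotted from the CLI; a sketch, with placeholder subnet IDs:

```shell
# List the kubernetes.io/cluster/* ownership tags on the cluster's subnets
# (subnet IDs are hypothetical) to spot tags pointing at an old cluster name:
aws ec2 describe-subnets \
  --subnet-ids subnet-0aaa subnet-0bbb \
  --query 'Subnets[].{Id:SubnetId,ClusterTags:Tags[?starts_with(Key, `kubernetes.io/cluster/`)]}' \
  --output json
```

Any kubernetes.io/cluster/&lt;name&gt; tag whose name does not match the current cluster is a candidate for removal.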
I will update some documentation to satisfy this issue and demonstrate how pod identities can work.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Hello, was this officially resolved? I am getting the same error you describe here, and I don't have old tags on my subnets; I only have the proper tags.
Everything else checks out: the pod identity agent is running, the IAM role is in place, and it is associated with the service account in the EKS console's access panel. What else could be causing this?
Just FYI, I got past this error by running
aws ec2 modify-instance-metadata-options --instance-id eks_node_id --http-put-response-hop-limit 3
for every EKS node. I'm confused why this was needed, because it worked in GovCloud without it, but in commercial I had to do it; not sure where the difference is.