Assuming an IAM role from within an EKS Pod Identity-enabled container does not work using a named profile

Open rkubik-hostersi opened this issue 1 year ago • 22 comments

Describe the bug

When working in an EKS pod with Pod Identity assigned, it is not possible to assume another role using named profiles in ~/.aws/config.

A profile that specifies role_arn in ~/.aws/config must also provide source_profile or credential_source. Since we are in the pod, source_profile is not an option. Unfortunately, credential_source is pretty limited:

  • Environment does not work because there are no credentials in environment variables
  • Ec2InstanceMetadata points to the IAM role attached to the EC2 instance, so Pod Identity is not used
  • EcsContainer is for ECS

Expected Behavior

It should be possible to instruct aws-cli to use EKS Pod Identity as a credential_source.

Current Behavior

It is not possible to use aws-cli with the Assume Role mechanism through named profiles in ~/.aws/config when working in EKS Pod Identity-enabled pods.

Reproduction Steps

  1. Create an EKS cluster with the Pod Identity Agent
  2. Grant the sts:AssumeRole permission to the pod
  3. Prepare the IAM role to be assumed
  4. Create the pod with the Pod Identity assigned and prepare ~/.aws/config
  5. Try to assume a different IAM role using aws --profile

Possible Solution

No response

Additional Information/Context

No response

CLI version used

2.15.57

Environment details (OS name and version, etc.)

aws-cli/2.15.57 Python/3.12.6 Linux/6.8.0-41-generic source/x86_64.alpine.3

rkubik-hostersi avatar Sep 11 '24 13:09 rkubik-hostersi

experiencing this as well using hashicorp/terraform:1.5.6.

after installing the aws cli and running aws configure set role_arn <role-arn> getting:

Error relocating /usr/lib/python3.11/lib-dynload/pyexpat.cpython-311-x86_64-linux-musl.so: XML_SetReparseDeferralEnabled: symbol not found

alex-rochette avatar Sep 11 '24 14:09 alex-rochette

experiencing this as well using hashicorp/terraform:1.5.6.

after installing the aws cli and running aws configure set role_arn <role-arn> getting:

Error relocating /usr/lib/python3.11/lib-dynload/pyexpat.cpython-311-x86_64-linux-musl.so: XML_SetReparseDeferralEnabled: symbol not found

Same error as https://github.com/aws/aws-cli/issues/8913, replied there:

Looks like this is the same as hashicorp/terraform#35715, where a member of Terraform replied:

The Dockerfile for the build wasn't changed during that time, so any differences would be solely from the upstream image. Your above example works correctly if the package is updated, and I also confirmed that newer images have already updated the problematic packages. Closing since there's nothing the Terraform CLI can do to fix the old docker image.

Can you confirm that this is fixed in newer images?

tim-finnigan avatar Sep 11 '24 17:09 tim-finnigan

But the original issue here looks related to https://github.com/aws/aws-cli/issues/3875 and https://github.com/aws/aws-sdk/issues/350.

tim-finnigan avatar Sep 11 '24 17:09 tim-finnigan

I am encountering this as well, and it is breaking our GitLab CI, which uses apk add aws-cli.

Here is the relevant section from a working run from yesterday:

$ apk add --no-cache aws-cli
fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/community/x86_64/APKINDEX.tar.gz
(1/59) Installing libbz2 (1.0.8-r5)
(2/59) Installing libffi (3.4.4-r2)
(3/59) Installing gdbm (1.23-r1)
(4/59) Installing xz-libs (5.4.3-r0)
(5/59) Installing libgcc (12.2.1_git20220924-r10)
(6/59) Installing libstdc++ (12.2.1_git20220924-r10)
(7/59) Installing mpdecimal (2.5.1-r2)
(8/59) Installing libpanelw (6.4_p20230506-r0)
(9/59) Installing readline (8.2.1-r1)
(10/59) Installing sqlite-libs (3.41.2-r3)
(11/59) Installing python3 (3.11.8-r1)
(12/59) Installing python3-pycache-pyc0 (3.11.8-r1)
(13/59) Installing pyc (0.1-r0)
(14/59) Installing py3-certifi (2024.2.2-r0)
(15/59) Installing py3-certifi-pyc (2024.2.2-r0)
(16/59) Installing py3-cparser (2.21-r2)
(17/59) Installing py3-cparser-pyc (2.21-r2)
(18/59) Installing py3-cffi (1.15.1-r3)
(19/59) Installing py3-cffi-pyc (1.15.1-r3)
(20/59) Installing py3-cryptography (41.0.3-r0)
(21/59) Installing py3-cryptography-pyc (41.0.3-r0)
(22/59) Installing py3-six (1.16.0-r6)
(23/59) Installing py3-six-pyc (1.16.0-r6)
(24/59) Installing py3-dateutil (2.8.2-r3)
(25/59) Installing py3-dateutil-pyc (2.8.2-r3)
(26/59) Installing py3-distro (1.8.0-r2)
(27/59) Installing py3-distro-pyc (1.8.0-r2)
(28/59) Installing py3-colorama (0.4.6-r3)
(29/59) Installing py3-colorama-pyc (0.4.6-r3)
(30/59) Installing py3-docutils (0.19-r4)
(31/59) Installing py3-docutils-pyc (0.19-r4)
(32/59) Installing py3-jmespath (1.0.1-r1)
(33/59) Installing py3-jmespath-pyc (1.0.1-r1)
(34/59) Installing py3-urllib3 (1.26.18-r0)
(35/59) Installing py3-urllib3-pyc (1.26.18-r0)
(36/59) Installing py3-wcwidth (0.2.6-r2)
(37/59) Installing py3-wcwidth-pyc (0.2.6-r2)
(38/59) Installing py3-prompt_toolkit (3.0.38-r1)
(39/59) Installing py3-prompt_toolkit-pyc (3.0.38-r1)
(40/59) Installing py3-ruamel.yaml.clib (0.2.7-r1)
(41/59) Installing py3-ruamel.yaml (0.17.28-r0)
(42/59) Installing py3-ruamel.yaml-pyc (0.17.28-r0)
(43/59) Installing aws-cli-pyc (2.15.14-r0)
(44/59) Installing py3-awscrt-pyc (0.20.2-r0)
(45/59) Installing python3-pyc (3.11.8-r1)
(46/59) Installing aws-c-common (0.9.12-r0)
(47/59) Installing aws-c-cal (0.6.9-r0)
(48/59) Installing aws-c-compression (0.2.17-r0)
(49/59) Installing s2n-tls (1.3.47-r0)
(50/59) Installing aws-c-io (0.14.2-r0)
(51/59) Installing aws-c-http (0.8.0-r0)
(52/59) Installing aws-c-sdkutils (0.1.14-r0)
(53/59) Installing aws-c-auth (0.7.14-r0)
(54/59) Installing aws-checksums (0.1.17-r0)
(55/59) Installing aws-c-event-stream (0.4.1-r0)
(56/59) Installing aws-c-mqtt (0.10.1-r0)
(57/59) Installing aws-c-s3 (0.4.10-r0)
(58/59) Installing py3-awscrt (0.20.2-r0)
(59/59) Installing aws-cli (2.15.14-r0)
Executing busybox-1.36.1-r2.trigger
OK: 200 MiB in 100 packages

The CI job then goes on to use the AWS CLI successfully.

And here is a broken one today:

$ apk add --no-cache aws-cli
fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/community/x86_64/APKINDEX.tar.gz
(1/59) Installing libbz2 (1.0.8-r5)
(2/59) Installing libffi (3.4.4-r2)
(3/59) Installing gdbm (1.23-r1)
(4/59) Installing xz-libs (5.4.3-r0)
(5/59) Installing libgcc (12.2.1_git20220924-r10)
(6/59) Installing libstdc++ (12.2.1_git20220924-r10)
(7/59) Installing mpdecimal (2.5.1-r2)
(8/59) Installing libpanelw (6.4_p20230506-r0)
(9/59) Installing readline (8.2.1-r1)
(10/59) Installing sqlite-libs (3.41.2-r3)
(11/59) Installing python3 (3.11.10-r0)
(12/59) Installing python3-pycache-pyc0 (3.11.10-r0)
(13/59) Installing pyc (0.1-r0)
(14/59) Installing py3-certifi (2024.2.2-r0)
(15/59) Installing py3-certifi-pyc (2024.2.2-r0)
(16/59) Installing py3-cparser (2.21-r2)
(17/59) Installing py3-cparser-pyc (2.21-r2)
(18/59) Installing py3-cffi (1.15.1-r3)
(19/59) Installing py3-cffi-pyc (1.15.1-r3)
(20/59) Installing py3-cryptography (41.0.3-r0)
(21/59) Installing py3-cryptography-pyc (41.0.3-r0)
(22/59) Installing py3-six (1.16.0-r6)
(23/59) Installing py3-six-pyc (1.16.0-r6)
(24/59) Installing py3-dateutil (2.8.2-r3)
(25/59) Installing py3-dateutil-pyc (2.8.2-r3)
(26/59) Installing py3-distro (1.8.0-r2)
(27/59) Installing py3-distro-pyc (1.8.0-r2)
(28/59) Installing py3-colorama (0.4.6-r3)
(29/59) Installing py3-colorama-pyc (0.4.6-r3)
(30/59) Installing py3-docutils (0.19-r4)
(31/59) Installing py3-docutils-pyc (0.19-r4)
(32/59) Installing py3-jmespath (1.0.1-r1)
(33/59) Installing py3-jmespath-pyc (1.0.1-r1)
(34/59) Installing py3-urllib3 (1.26.18-r0)
(35/59) Installing py3-urllib3-pyc (1.26.18-r0)
(36/59) Installing py3-wcwidth (0.2.6-r2)
(37/59) Installing py3-wcwidth-pyc (0.2.6-r2)
(38/59) Installing py3-prompt_toolkit (3.0.38-r1)
(39/59) Installing py3-prompt_toolkit-pyc (3.0.38-r1)
(40/59) Installing py3-ruamel.yaml.clib (0.2.7-r1)
(41/59) Installing py3-ruamel.yaml (0.17.28-r0)
(42/59) Installing py3-ruamel.yaml-pyc (0.17.28-r0)
(43/59) Installing aws-cli-pyc (2.15.14-r0)
(44/59) Installing py3-awscrt-pyc (0.20.2-r0)
(45/59) Installing python3-pyc (3.11.10-r0)
(46/59) Installing aws-c-common (0.9.12-r0)
(47/59) Installing aws-c-cal (0.6.9-r0)
(48/59) Installing aws-c-compression (0.2.17-r0)
(49/59) Installing s2n-tls (1.3.47-r0)
(50/59) Installing aws-c-io (0.14.2-r0)
(51/59) Installing aws-c-http (0.8.0-r0)
(52/59) Installing aws-c-sdkutils (0.1.14-r0)
(53/59) Installing aws-c-auth (0.7.14-r0)
(54/59) Installing aws-checksums (0.1.17-r0)
(55/59) Installing aws-c-event-stream (0.4.1-r0)
(56/59) Installing aws-c-mqtt (0.10.1-r0)
(57/59) Installing aws-c-s3 (0.4.10-r0)
(58/59) Installing py3-awscrt (0.20.2-r0)
(59/59) Installing aws-cli (2.15.14-r0)
Executing busybox-1.36.1-r2.trigger
OK: 200 MiB in 100 packages

Which then fails with Error relocating /usr/lib/python3.11/lib-dynload/pyexpat.cpython-311-x86_64-linux-musl.so: XML_SetReparseDeferralEnabled: symbol not found

The difference I'm seeing is that Python 3.11.10-r0 is installed now instead of 3.11.8-r1, so maybe this is a new issue there?
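With two 59-line logs, spotting the changed package by eye is error-prone. The following is a minimal sketch, not part of the original report, that parses the "Installing" lines of two apk logs and lists the packages whose versions differ. The function names are made up for illustration, and the sample input is a two-line excerpt of the logs above.

```python
import re

# Matches apk lines such as "(11/59) Installing python3 (3.11.8-r1)".
INSTALL_LINE = re.compile(r"^\(\d+/\d+\) Installing (\S+) \((\S+)\)$")


def parse_versions(log_text):
    """Map package name -> version for every 'Installing' line in an apk log."""
    versions = {}
    for line in log_text.splitlines():
        match = INSTALL_LINE.match(line.strip())
        if match:
            versions[match.group(1)] = match.group(2)
    return versions


def changed_packages(old_log, new_log):
    """Return {package: (old_version, new_version)} for versions that differ."""
    old, new = parse_versions(old_log), parse_versions(new_log)
    return {
        name: (old[name], new[name])
        for name in sorted(old.keys() & new.keys())
        if old[name] != new[name]
    }


if __name__ == "__main__":
    # Two-line excerpts of the logs above; paste the full logs in practice.
    working = (
        "(10/59) Installing sqlite-libs (3.41.2-r3)\n"
        "(11/59) Installing python3 (3.11.8-r1)"
    )
    broken = (
        "(10/59) Installing sqlite-libs (3.41.2-r3)\n"
        "(11/59) Installing python3 (3.11.10-r0)"
    )
    print(changed_packages(working, broken))
```

Run against the full logs, this should report only the python3 packages, since every other version in the two runs is identical.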

jcary741 avatar Sep 11 '24 19:09 jcary741

We are seeing this across our CI/CD. All versions of 1.5.x are impacted. So far, in our brief testing, 1.6 through 1.9 are not impacted. We're scrambling to test newer versions and update our shared templates.

The Python 3 package from Alpine is likely the issue. Its timestamp is 9/11:

https://dl-cdn.alpinelinux.org/alpine/v3.18/main/x86_64/ python3-3.11.10-r0.apk 11-Sep-2024 10:14 9M

joerawr avatar Sep 11 '24 19:09 joerawr

Here is a similar issue with Alpine 3.18 via Terraform 1.5.7: https://gitlab.alpinelinux.org/alpine/aports/-/issues/16441

joerawr avatar Sep 11 '24 19:09 joerawr

For those using Terraform have you referred to: https://github.com/hashicorp/terraform/issues/35715?

tim-finnigan avatar Sep 11 '24 20:09 tim-finnigan

Guys, this is not about Terraform or any other library, or even Python versions. This is about the missing credential_source value for running aws in an EKS Pod Identity-enabled container. The AWS CLI version does not matter either, as there is no "legit" credential_source value to use in Pod Identity containers on EKS.

The scenario is described in the first post. We need to be able to use aws --profile from within a pod to assume an external role with Pod Identity. This is not officially possible for now. :)

rkubik-hostersi avatar Sep 12 '24 13:09 rkubik-hostersi

My bad @rkubik-hostersi. The timing of your submission and the environment you described, followed by what drunkensway said, made me think we were encountering different versions of the same problem. I see now that your issue is actually different. For anyone who happens upon this issue and makes the same mistake: the problem we were encountering appears to have been resolved in Python build 3.11.10-r1.

jcary741 avatar Sep 13 '24 14:09 jcary741

@tim-finnigan I just don't understand why this is being marked as a feature request. IMO it's a bug, as it prevents the EKS Pod Identity feature from being used fully with the aws-cli tool. The documentation says that Pod Identities are supported in various SDK versions and in the AWS CLI, but they are not (fully).

rkubik-hostersi avatar Sep 27 '24 08:09 rkubik-hostersi

https://github.com/aws/aws-cli/issues/3875 is not exactly about the same behavior; it's a more generic case.

rkubik-hostersi avatar Sep 27 '24 08:09 rkubik-hostersi

100% agree with @rkubik-hostersi that this is not a feature request. It is a bug. Please label it accordingly and please prioritize it.

gamma425 avatar Nov 03 '24 19:11 gamma425

Checking in again — can you specify which documentation is not accurate? Here is the EKS User Guide on Pod Identities: https://docs.aws.amazon.com/eks/latest/userguide/pod-id-how-it-works.html , and the AWS CLI documentation on authentication and access credentials: https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-authentication.html

tim-finnigan avatar Nov 04 '24 18:11 tim-finnigan

Creating a profile which is defined as source_profile inside the target profile worked for me.

For a more detailed explanation, see the example "Use chained AssumeRole operations" in the official docs.

[profile account_b_role]
source_profile = account_a_role
role_arn=arn:aws:iam::444455556666:role/account-b-role

[profile account_a_role]
web_identity_token_file = /var/run/secrets/eks.amazonaws.com/serviceaccount/token
role_arn=arn:aws:iam::111122223333:role/account-a-role

Executing the command below inside a Pod running in account 111122223333 lists all AWS S3 buckets in account 444455556666.

aws s3 ls --profile account_b_role
2024-11-23 16:00:00 my-test-bucket 

hebestreit avatar Nov 23 '24 15:11 hebestreit

Creating a profile which is defined as source_profile inside the target profile worked for me.

@hebestreit this does not work with Pod Identity; maybe it works with IRSA.

An error occurred (InvalidIdentityToken) when calling the AssumeRoleWithWebIdentity operation: No OpenIDConnect provider found in your account for...
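The error is consistent with how the two mechanisms differ: IRSA exchanges a web identity token through an IAM OIDC provider, while Pod Identity serves credentials from a local agent endpoint and does not use an OIDC provider. A minimal sketch for checking which mechanism a pod was given, assuming the environment variables each one normally injects (the function name is hypothetical):

```python
import os


def detect_pod_credential_mechanism(env=None):
    """Guess which EKS credential mechanism a pod uses from its environment."""
    env = os.environ if env is None else env
    # Pod Identity: the SDK calls the agent URI with a token read from a file.
    if env.get("AWS_CONTAINER_CREDENTIALS_FULL_URI") and env.get(
        "AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE"
    ):
        return "pod-identity"
    # IRSA: the SDK calls AssumeRoleWithWebIdentity with a projected token.
    if env.get("AWS_WEB_IDENTITY_TOKEN_FILE") and env.get("AWS_ROLE_ARN"):
        return "irsa"
    return "unknown"


if __name__ == "__main__":
    print(detect_pod_credential_mechanism())
```

If this prints pod-identity, a profile built on web_identity_token_file has nothing valid to work with, which matches the behavior reported here.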

rkubik-hostersi avatar Mar 18 '25 09:03 rkubik-hostersi

@tim-finnigan As in my first post, there is no way to use Pod Identity and then switch to another IAM role by assuming it. The credential_source field should accept something like EksPiContainer, similar to the ECS option EcsContainer.

rkubik-hostersi avatar Mar 18 '25 09:03 rkubik-hostersi

I'm not sure if this is aws-cli related or affects the general AWS SDK behavior.

rkubik-hostersi avatar Mar 18 '25 09:03 rkubik-hostersi

I think I found a solution for this, and it worked:

I created the AWS config below in the home directory of the pod (~/.aws/config):

[profile destination]
source_profile=source
role_arn=arn:aws:iam::1234:role/pod-identity-crossaccount
region=eu-west-2

[profile source]
credential_process=/eks-credential-processrole.sh
region=eu-west-2

The role ARN "arn:aws:iam::1234:role/pod-identity-crossaccount" is the cross-account IAM role that I am trying to assume, and eks-credential-processrole.sh is the following script:

#!/bin/bash
curl -H "Authorization: $(cat $AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE)" $AWS_CONTAINER_CREDENTIALS_FULL_URI \
  | jq -c '{AccessKeyId: .AccessKeyId, SecretAccessKey: .SecretAccessKey, SessionToken: .Token, Expiration: .Expiration, Version: 1}'

With this in place, my aws s3 ls --profile destination worked.
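For images where Python is available but curl or jq is not, the same idea can be sketched with the standard library only. This is an illustrative equivalent of the shell script above, not a tested drop-in: it assumes the two Pod Identity environment variables are set and that the agent responds with the same field names the jq filter relies on. The function names are made up.

```python
#!/usr/bin/env python3
import json
import os
import sys
import urllib.request


def to_credential_process_output(agent_response):
    """Reshape the agent's JSON into the credential_process output format."""
    return {
        "Version": 1,
        "AccessKeyId": agent_response["AccessKeyId"],
        "SecretAccessKey": agent_response["SecretAccessKey"],
        "SessionToken": agent_response["Token"],
        "Expiration": agent_response["Expiration"],
    }


def fetch_agent_credentials(env=os.environ):
    """Call the Pod Identity agent, sending the token file as Authorization."""
    with open(env["AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE"]) as token_file:
        token = token_file.read().strip()
    request = urllib.request.Request(
        env["AWS_CONTAINER_CREDENTIALS_FULL_URI"],
        headers={"Authorization": token},
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)


if __name__ == "__main__":
    if "AWS_CONTAINER_CREDENTIALS_FULL_URI" in os.environ:
        json.dump(to_credential_process_output(fetch_agent_credentials()), sys.stdout)
    else:
        # A production helper should also exit non-zero here.
        print("Pod Identity environment variables not found", file=sys.stderr)
```

To try it, save it as an executable file and point credential_process in the source profile at that path instead of the shell script.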

Tips:

  • When using Pod Identity, the trust policy of the destination role needs to allow "sts:TagSession" along with "sts:AssumeRole", and the permission policy attached to the pod in the source account needs to allow both "sts:TagSession" and "sts:AssumeRole" on the destination role ARN
  • jq must also be installed in the pod
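The first tip can be sketched as two policy documents. sts:TagSession is presumably needed because Pod Identity credentials carry session tags that are passed along when the next role is assumed. The source role ARN below is a hypothetical placeholder, and the function names are made up; the destination ARN is the one from the config above.

```python
import json


def destination_trust_policy(source_role_arn):
    """Trust policy for the destination role: lets the pod's role assume it."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": source_role_arn},
                "Action": ["sts:AssumeRole", "sts:TagSession"],
            }
        ],
    }


def source_permission_policy(destination_role_arn):
    """Permission policy for the pod's role in the source account."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["sts:AssumeRole", "sts:TagSession"],
                "Resource": destination_role_arn,
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical ARN of the role associated with the pod via Pod Identity.
    source_role = "arn:aws:iam::111122223333:role/example-pod-role"
    destination_role = "arn:aws:iam::1234:role/pod-identity-crossaccount"
    print(json.dumps(destination_trust_policy(source_role), indent=2))
    print(json.dumps(source_permission_policy(destination_role), indent=2))
```

The printed JSON can be pasted into the IAM console or passed to infrastructure tooling; only the two actions and the direction of each policy are taken from the tip itself.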

adarshthakur499 avatar Apr 02 '25 18:04 adarshthakur499

@adarshthakur499 This looks cool, thanks! However this is of course a workaround only ;)

rkubik-hostersi avatar Apr 03 '25 07:04 rkubik-hostersi

To chime in: this is problematic when trying to use GitHub ARC runners in pods on EKS with Pod Identity.

casey-robertson-paypal avatar Aug 15 '25 21:08 casey-robertson-paypal

any update on the timeline for this?

gamma425 avatar Sep 22 '25 20:09 gamma425

Any updates on this that don't require the workaround?

drdj85 avatar Nov 18 '25 22:11 drdj85