Add support for S3 to work with IAM assume role credential provider
We recently hit an issue with a customer while reading data from an Iceberg setup that uses the assume role credential provider. In this scenario, a temporary set of credentials is fetched regularly to access AWS resources. More details here: https://docs.aws.amazon.com/sdkref/latest/guide/access-assume-role.html
Currently, our S3 code only supports static credentials; it does not support the refreshing credentials that assume role relies on. That is why the reads with Iceberg fail with the error 401 access denied.
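To make the static-versus-refreshing distinction concrete, here is a minimal, SDK-free sketch of a credentials source that re-fetches a temporary credential set shortly before it expires. The names `TempCredentials` and `RefreshingCredentialsSource` are hypothetical and exist neither in Deephaven nor in the AWS SDK; the `fetcher` stands in for whatever call obtains fresh assume-role credentials.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

// Hypothetical stand-in for a temporary credential set with an expiration time.
record TempCredentials(String accessKeyId, String secretAccessKey, String sessionToken, Instant expiration) {}

// Hypothetical refreshing source: returns cached credentials until they are close to expiry.
final class RefreshingCredentialsSource {
    private final Supplier<TempCredentials> fetcher; // assumption: wraps the call that fetches new credentials
    private final Supplier<Instant> clock;           // injectable so the behavior can be tested
    private final Duration refreshMargin;            // refresh this long before the expiration time
    private TempCredentials cached;

    RefreshingCredentialsSource(Supplier<TempCredentials> fetcher, Supplier<Instant> clock, Duration refreshMargin) {
        this.fetcher = fetcher;
        this.clock = clock;
        this.refreshMargin = refreshMargin;
    }

    // Fetches on first use, and again once "now" reaches (expiration - refreshMargin).
    synchronized TempCredentials resolve() {
        Instant now = clock.get();
        if (cached == null || !now.isBefore(cached.expiration().minus(refreshMargin))) {
            cached = fetcher.get();
        }
        return cached;
    }
}
```

A client built from a one-time snapshot of such credentials keeps working only until the expiration time, which is consistent with the failures described above; a client that calls `resolve()` on each request would pick up the refreshed set.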
As part of this issue, we should
- Add support for assume role credential provider
- Add wrappers for other basic credential providers like SystemPropertyCredentialsProvider, ProfileCredentialsProvider, etc.
- Make AWSSDKV2Credentials public with annotation @InternalUseOnly so that the Enterprise team can add more credentials provider overloads in the future independently of core.
I will be working on the core side changes and @abaranec will be working on providing quick fixes on the enterprise side.
Not sure if relevant, but note from https://docs.aws.amazon.com/sdkref/latest/guide/feature-assume-role-credentials.html in regards to SDK for Java 2.x:
mfa_serial not supported. Use AWS_ROLE_ARN instead of AWS_IAM_ROLE_ARN. Use AWS_ROLE_SESSION_NAME instead of AWS_IAM_ROLE_SESSION_NAME.
If not executing on EC2 (where I think the IAM role will be discovered automatically?), it looks like there is a manual workflow for this: https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/credentials-explicit.html
Note: when running on EC2, the final step of the default credentials chain should check the IAM role attached to the EC2 instance:
Amazon EC2 instance IAM role-provided credentials: The SDK uses the InstanceProfileCredentialsProvider class to load temporary credentials from the Amazon EC2 metadata service.
see https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/credentials-chain.html
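As a rough illustration of what "final step of the chain" means, the sketch below tries a list of credential sources in order and reports the first one that can supply credentials. It is not the AWS SDK's implementation; `CredentialsChain`, `Source`, and the source names used in the usage note are hypothetical.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical illustration of a credentials chain; not the AWS SDK implementation.
final class CredentialsChain {
    // A named source that may or may not be able to supply credentials.
    record Source(String name, Supplier<Optional<String>> lookup) {}

    private final List<Source> sources;

    CredentialsChain(List<Source> sources) {
        this.sources = List.copyOf(sources);
    }

    // Tries each source in order and returns the name of the first one that yields credentials.
    Optional<String> resolveSourceName() {
        for (Source source : sources) {
            if (source.lookup().get().isPresent()) {
                return Optional.of(source.name());
            }
        }
        return Optional.empty();
    }
}
```

With sources ordered as, say, environment variables, profile file, and EC2 instance profile, the instance role is consulted only when every earlier source comes back empty.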
It's worthwhile to look at how Iceberg builds the client when using AssumeRole:
https://github.com/apache/iceberg/blob/apache-iceberg-1.6.1/aws/src/main/java/org/apache/iceberg/aws/AssumeRoleAwsClientFactory.java#L92-L110
We'll need to ensure we adapt the Iceberg S3 properties like org.apache.iceberg.aws.AwsProperties#CLIENT_ASSUME_ROLE_ARN "client.assume-role.arn", etc., via io.deephaven.iceberg.util.S3InstructionsProviderPlugin.
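A minimal sketch of the property-adaptation step, using only the JDK: it pulls the assume-role settings out of a catalog property map so they could later be handed to whatever builds the S3 credentials. `AssumeRoleSettings` is a hypothetical type, not part of Deephaven or Iceberg. Only "client.assume-role.arn" is taken from the comment above; the other keys and the 3600-second default are my recollection of AwsProperties and should be checked against the Iceberg version in use.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical holder for assume-role settings read from Iceberg catalog properties.
record AssumeRoleSettings(String roleArn, Optional<String> region, Optional<String> externalId,
                          Optional<String> sessionName, int timeoutSec) {

    // "client.assume-role.arn" is confirmed above; the remaining keys are assumptions to verify.
    static final String ROLE_ARN = "client.assume-role.arn";
    static final String REGION = "client.assume-role.region";
    static final String EXTERNAL_ID = "client.assume-role.external-id";
    static final String SESSION_NAME = "client.assume-role.session-name";
    static final String TIMEOUT_SEC = "client.assume-role.timeout-sec";
    static final int TIMEOUT_SEC_DEFAULT = 3600; // assumed default, in seconds

    // Returns empty when no role ARN is configured, i.e. assume-role was not requested.
    static Optional<AssumeRoleSettings> fromProperties(Map<String, String> properties) {
        String arn = properties.get(ROLE_ARN);
        if (arn == null || arn.isBlank()) {
            return Optional.empty();
        }
        String timeout = properties.get(TIMEOUT_SEC);
        return Optional.of(new AssumeRoleSettings(
                arn,
                Optional.ofNullable(properties.get(REGION)),
                Optional.ofNullable(properties.get(EXTERNAL_ID)),
                Optional.ofNullable(properties.get(SESSION_NAME)),
                timeout == null ? TIMEOUT_SEC_DEFAULT : Integer.parseInt(timeout)));
    }
}
```

Returning an empty Optional when the ARN is absent lets the caller fall back to its existing credential handling without special-casing.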
I think a workaround is to have it specified in a configuration file:
[default]
source_profile = B
role_arn = arn:aws:iam::1234567890:role/MyRoleName
role_session_name = my-test-session
[profile B]
aws_access_key_id=...
aws_secret_access_key=...
or, if credentials are provided by EC2:
[default]
credential_source = Ec2InstanceMetadata
role_arn = arn:aws:iam::1234567890:role/MyRoleName
role_session_name = my-test-session
or, if credentials are provided by ECS:
[default]
credential_source = EcsContainer
role_arn = arn:aws:iam::1234567890:role/MyRoleName
role_session_name = my-test-session
Going forward, the recommended approach is to use configuration and credentials files to specify the roles, profile, etc.
We have added support in S3Instructions to specify the AWS default profile to use for Deephaven, and made AWSSDKV2Credentials public with annotation @InternalUseOnly as part of PR #6130.
So closing this issue.