delta-rs
AWS S3 error message is misleading for insufficient access
Description
From what I can read, the Python package seems to require that S3 credentials always come from environment variables. Example: https://github.com/delta-io/delta-rs/blob/0afed35fd83b74412e988eb4fc317ecb47100d56/python/tests/conftest.py#L7
Are there other ways to pass credentials, such as ~/.aws, a profile, an IAM role, or creating and passing an S3 client/session, so that the S3 connection succeeds? Did I miss some documentation on how to do this?
Use Case
import boto3
import deltalake
s3 = boto3.client('s3')
buckets = s3.list_buckets()
# This successfully reaches S3 and prints the buckets.
print(buckets)
# some_s3_delta_table_path is the s3:// URI of an existing Delta table.
# The next call fails with:
# "deltalake.PyDeltaTableError: Failed to load checkpoint: Failed to read checkpoint content: Failed to read S3 object content: Couldn't find AWS credentials in environment, credentials file, or IAM role."
table = deltalake.DeltaTable(some_s3_delta_table_path)
print(table)
Related Issue(s)
It should support the standard AWS credential lookup chain; we use it with an IAM role in AWS ECS tasks. The only part of the credential lookup chain that the Rust AWS SDK does not support is SSO login. Did you generate your credentials with SSO login?
It turns out I had to add read/write permission on the S3 path of the Delta tables to the IAM role. I was doing a quick proof of concept and thought the IAM role had read/write access to the whole S3 bucket. It was not SSO related.
I had hoped the error message would be more precise (e.g. unable to read the S3 URI) instead of giving the impression that the credentials are missing entirely, but I believe this is not up to this project.
Thanks @houqp. I can close this issue if you understand what happened on my end and agree that there is nothing to do in this project.
I agree this error message is very misleading. Let's keep this issue open to track improving the error message. There might be a way to match on the error returned by rusoto and construct a friendlier message on our end.
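In the meantime, a rough way to tell "no credentials" apart from "credentials found but access denied" is to probe the table's _delta_log prefix directly and classify the error yourself. The snippet below is only a sketch of that idea from the Python side: probe_table_access and classify_s3_error are hypothetical helper names, not part of deltalake, and the check relies on boto3 attaching an error code to the exception's response attribute. Listing a prefix needs s3:ListBucket, while reading the table also needs s3:GetObject, so an "ok" here does not prove the table is readable.

```python
from urllib.parse import urlparse


def classify_s3_error(exc):
    """Map an exception raised by an S3 call to a short cause (hypothetical helper)."""
    response = getattr(exc, "response", None)
    code = ""
    if isinstance(response, dict):
        # botocore's ClientError carries the service error code here.
        code = str(response.get("Error", {}).get("Code", ""))
    if code in {"AccessDenied", "Forbidden", "403"}:
        return "access denied"
    if code in {"NoSuchBucket", "NoSuchKey", "NotFound", "404"}:
        return "not found"
    # Assumption: a missing-credentials failure is raised as botocore's
    # NoCredentialsError or PartialCredentialsError; matched by class name
    # so this helper does not need to import botocore.
    if type(exc).__name__ in {"NoCredentialsError", "PartialCredentialsError"}:
        return "no credentials"
    return "other"


def probe_table_access(table_uri, client=None):
    """List one key under <table>/_delta_log/ and report what went wrong, if anything."""
    if client is None:
        import boto3  # imported lazily so the classifier works without boto3

        client = boto3.client("s3")
    parsed = urlparse(table_uri)
    bucket = parsed.netloc
    prefix = parsed.path.strip("/") + "/_delta_log/"
    try:
        client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
    except Exception as exc:
        return classify_s3_error(exc)
    return "ok"
```

Calling probe_table_access(some_s3_delta_table_path) with the same role that fails in deltalake should return "access denied" when the role lacks permission on that path, rather than suggesting that no credentials were found.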
I too am having an authentication chain issue, but I know I have sufficient permissions. I can successfully load a table from an EC2 instance with an IAM role; however, it fails with the misleading error when I run the same code locally using the AWS_PROFILE=xxxx environment variable.
I know the profile is configured correctly in my ~/.aws/config because I use it constantly with the AWS CLI, and can, for example, run aws --profile xxxx s3 ls s3://path/to/my/delta/table just fine. I can also use the profile with the AWS CLI to download files successfully, so there is no doubt that my permissions are sufficient.
Are the AWS environment variables not supported like in the default chain provider?
@zcking we use the default chain provider at https://github.com/delta-io/delta-rs/blob/649fdce8560251b1773d2f8dd578fa6bbe4a9834/rust/src/storage/s3/mod.rs#L443
So I think it should support the AWS_PROFILE env var? Can you perhaps try changing the profile you want to test to the default profile to see if it works? That might help us narrow down exactly where the bug is.
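Since SSO login was mentioned earlier as the one part of the chain that is not supported, it may also be worth checking whether the profile is SSO-based. The sketch below is only a guess at a useful diagnostic, not a confirmed cause: profile_uses_sso is a hypothetical helper, SAMPLE_CONFIG is made-up data, and the check assumes the usual ~/.aws/config layout where named profiles live in "[profile name]" sections and SSO profiles carry sso_* keys.

```python
import configparser

# Keys that mark a profile as SSO-based in ~/.aws/config.
SSO_KEYS = ("sso_start_url", "sso_session", "sso_account_id", "sso_role_name")

# Made-up config text for illustration only.
SAMPLE_CONFIG = """
[default]
region = us-east-1

[profile xxxx]
sso_start_url = https://example.awsapps.com/start
sso_region = us-east-1
sso_account_id = 123456789012
sso_role_name = ReadOnly
region = us-east-1
"""


def profile_uses_sso(config_text, profile):
    """Return True if the named profile contains any SSO setting."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    # The default profile is "[default]"; every other one is "[profile <name>]".
    section = "default" if profile == "default" else f"profile {profile}"
    if not parser.has_section(section):
        raise KeyError(f"profile {profile!r} not found in config")
    return any(key in parser[section] for key in SSO_KEYS)


sample_default_is_sso = profile_uses_sso(SAMPLE_CONFIG, "default")
sample_named_is_sso = profile_uses_sso(SAMPLE_CONFIG, "xxxx")
```

To run it against a real setup, pass the contents of ~/.aws/config (for example pathlib.Path.home().joinpath(".aws", "config").read_text()) and the profile name from AWS_PROFILE. If it returns True, the SSO limitation is a plausible explanation; if it returns False, the problem is somewhere else in the chain.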
Our credentials lookup has changed dramatically and the underlying object_store crate does a good job of handling things here.