AWS SSO not working
I was trying to use AWS SSO to access files on S3 from my development machine. I looked at the other issues previously opened for this and tried to use the information there to resolve it, but no luck so far. So I'm reaching out for a bit more guidance.
First I run SSO login to make sure I have fresh credentials:
$ aws sso login --profile dev
Successfully logged into Start URL: https://xxx.awsapps.com/start#
This is my redacted config:
$ cat ~/.aws/config
[profile dev]
sso_session = xxx
sso_account_id = xxx
sso_role_name = xxx
region = us-east-1
output = json
[sso-session Formative]
sso_start_url = https://xxx.awsapps.com/start#
sso_region = us-east-1
sso_registration_scopes = sso:account:access
I run duckdb, you can see the version in the prompt:
$ duckdb
v1.1.3 19864453f7
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
I create the credentials per the documentation:
D CREATE OR REPLACE SECRET s3 (TYPE S3, PROVIDER CREDENTIAL_CHAIN, PROFILE 'dev');
100% ▕████████████████████████████████████████████████████████████▏
┌─────────┐
│ Success │
│ boolean │
├─────────┤
│ true │
└─────────┘
I try to use an S3 operation that requires permissions:
D SELECT filename, last_modified FROM read_text('s3://bucket/path/*.js');
HTTP Error: HTTP GET error on '/?encoding-type=url&list-type=2&prefix=js%2Fvnd%2Fi18next%2F' (HTTP 403)
Now I check whether the secret was created with credentials in it; it was not:
D SELECT * FROM duckdb_secrets();
┌─────────┬─────────┬──────────────────┬────────────┬─────────┬─────────────────────────┬──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ name │ type │ provider │ persistent │ storage │ scope │ secret_string │
│ varchar │ varchar │ varchar │ boolean │ varchar │ varchar[] │ varchar │
├─────────┼─────────┼──────────────────┼────────────┼─────────┼─────────────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ s3 │ s3 │ credential_chain │ false │ memory │ [s3://, s3n://, s3a://] │ name=s3;type=s3;provider=credential_chain;serializable=true;scope=s3://,s3n://,s3a://;endpoint=s3.amazonaws.com;region=us-east-1 │
└─────────┴─────────┴──────────────────┴────────────┴─────────┴─────────────────────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
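One way to read the table above: a credential_chain secret that actually resolved credentials would include key_id, secret, and session_token entries in its secret_string, while this one only has endpoint and region. A small sketch (plain string parsing, not a DuckDB API) that checks for this:

```python
# Sketch: parse a duckdb_secrets() secret_string (semicolon-separated
# key=value pairs) and check whether the chain resolved any credentials.
secret_string = (
    "name=s3;type=s3;provider=credential_chain;serializable=true;"
    "scope=s3://,s3n://,s3a://;endpoint=s3.amazonaws.com;region=us-east-1"
)
fields = dict(part.split("=", 1) for part in secret_string.split(";"))
has_credentials = "key_id" in fields and "secret" in fields
print(has_credentials)  # False: the chain found no credentials
```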
Just getting the extension versions for reference:
D SELECT * FROM duckdb_extensions();
┌──────────────────┬─────────┬───────────┬─────────────────────────────────────────────────────────────────────────────────────────────┬────────────────────────────────────────────────────────────────────────────────────┬───────────────────┬───────────────────┬───────────────────┬────────────────┐
│ extension_name │ loaded │ installed │ install_path │ description │ aliases │ extension_version │ install_mode │ installed_from │
│ varchar │ boolean │ boolean │ varchar │ varchar │ varchar[] │ varchar │ varchar │ varchar │
├──────────────────┼─────────┼───────────┼─────────────────────────────────────────────────────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┼───────────────────┼───────────────────┼───────────────────┼────────────────┤
│ arrow │ false │ false │ │ A zero-copy data integration between Apache Arrow and DuckDB │ [] │ │ │ │
│ autocomplete │ true │ true │ (BUILT-IN) │ Adds support for autocomplete in the shell │ [] │ │ STATICALLY_LINKED │ │
│ aws │ true │ true │ /home/dobes/snap/duckdb/9/.duckdb/extensions/v1.1.3/linux_amd64_gcc4/aws.duckdb_extension │ Provides features that depend on the AWS SDK │ [] │ f743d4b │ REPOSITORY │ core │
│ azure │ false │ false │ │ Adds a filesystem abstraction for Azure blob storage to DuckDB │ [] │ │ │ │
│ delta │ false │ false │ │ Adds support for Delta Lake │ [] │ │ │ │
│ excel │ false │ false │ │ Adds support for Excel-like format strings │ [] │ │ │ │
│ fts │ true │ true │ (BUILT-IN) │ Adds support for Full-Text Search Indexes │ [] │ v1.1.3 │ STATICALLY_LINKED │ │
│ httpfs │ true │ true │ /home/dobes/snap/duckdb/9/.duckdb/extensions/v1.1.3/linux_amd64_gcc4/httpfs.duckdb_extens… │ Adds support for reading and writing files over a HTTP(S) connection │ [http, https, s3] │ v1.1.3 │ REPOSITORY │ core │
│ iceberg │ false │ false │ │ Adds support for Apache Iceberg │ [] │ │ │ │
│ icu │ true │ true │ (BUILT-IN) │ Adds support for time zones and collations using the ICU library │ [] │ v1.1.3 │ STATICALLY_LINKED │ │
│ inet │ false │ false │ │ Adds support for IP-related data types and functions │ [] │ │ │ │
│ jemalloc │ true │ true │ (BUILT-IN) │ Overwrites system allocator with JEMalloc │ [] │ v1.1.3 │ STATICALLY_LINKED │ │
│ json │ true │ true │ (BUILT-IN) │ Adds support for JSON operations │ [] │ v1.1.3 │ STATICALLY_LINKED │ │
│ motherduck │ false │ false │ │ Enables motherduck integration with the system │ [md] │ │ │ │
│ mysql_scanner │ false │ false │ │ Adds support for connecting to a MySQL database │ [mysql] │ │ │ │
│ parquet │ true │ true │ (BUILT-IN) │ Adds support for reading and writing parquet files │ [] │ v1.1.3 │ STATICALLY_LINKED │ │
│ postgres_scanner │ false │ false │ │ Adds support for connecting to a Postgres database │ [postgres] │ │ │ │
│ shell │ true │ true │ │ Adds CLI-specific support and functionalities │ [] │ │ STATICALLY_LINKED │ │
│ spatial │ false │ false │ │ Geospatial extension that adds support for working with spatial data and functions │ [] │ │ │ │
│ sqlite_scanner │ false │ false │ │ Adds support for reading and writing SQLite database files │ [sqlite, sqlite3] │ │ │ │
│ substrait │ false │ false │ │ Adds support for the Substrait integration │ [] │ │ │ │
│ tpcds │ false │ false │ │ Adds TPC-DS data generation and query support │ [] │ │ │ │
│ tpch │ true │ true │ (BUILT-IN) │ Adds TPC-H data generation and query support │ [] │ v1.1.3 │ STATICALLY_LINKED │ │
│ vss │ false │ false │ │ Adds indexing support to accelerate Vector Similarity Search │ [] │ │ │ │
├──────────────────┴─────────┴───────────┴─────────────────────────────────────────────────────────────────────────────────────────────┴────────────────────────────────────────────────────────────────────────────────────┴───────────────────┴───────────────────┴───────────────────┴────────────────┤
│ 24 rows 9 columns │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
I thought I'd try to update to nightly, but it doesn't work:
D force install aws from core_nightly;
HTTP Error: Failed to download extension "aws" at URL "http://nightly-extensions.duckdb.org/v1.1.3/linux_amd64_gcc4/aws.duckdb_extension.gz" (HTTP 403)
Extension "aws" is an existing extension.
Afterwards I had a couple of ideas that I hoped would help:
- My AWS profile uses sso_session, which isn't supported by the Kubernetes command line tools, so I thought maybe duckdb might also not support it. However, switching to a profile that doesn't use sso_session didn't fix the issue.
- I initially installed duckdb using snap, and I wondered if maybe it was running in a sandbox and couldn't access my AWS credentials. I uninstalled and reinstalled using the ZIP file, and that did not fix the issue either.
My understanding is that I can probably export temporary credentials to an env file and use those (or something along those lines), but I thought I should open an issue and see if this more convenient option can be made to work.
I am able to access S3 if I export the credentials to environment variables, e.g.:
$ eval '$(aws configure export-credentials --profile dev --format env)'
$ echo $AWS_ACCESS_KEY_ID
ASIA....
$ duckdb
v1.1.3 19864453f7
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D CREATE OR REPLACE SECRET s3 (TYPE S3, PROVIDER CREDENTIAL_CHAIN, CHAIN 'env');
100% ▕████████████████████████████████████████████████████████████▏
┌─────────┐
│ Success │
│ boolean │
├─────────┤
│ true │
└─────────┘
D SELECT * FROM duckdb_secrets();
┌─────────┬─────────┬──────────────────┬────────────┬─────────┬─────────────────────────┬─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ name │ type │ provider │ persistent │ storage │ scope │ secret_string │
│ varchar │ varchar │ varchar │ boolean │ varchar │ varchar[] │ varchar │
├─────────┼─────────┼──────────────────┼────────────┼─────────┼─────────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ s3 │ s3 │ credential_chain │ false │ memory │ [s3://, s3n://, s3a://] │ name=s3;type=s3;provider=credential_chain;serializable=true;scope=s3://,s3n://,s3a://;endpoint=s3.amazonaws.com;key_id=...;region=us-east-1;secret=redacted;session_token=redacted │
└─────────┴─────────┴──────────────────┴────────────┴─────────┴─────────────────────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
D SELECT filename, last_modified FROM read_text('s3://bucket/.../*.js');
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────┬─────────────────────┐
│ filename │ last_modified │
│ varchar │ timestamp │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────┤
│ s3://bucket/path/xxx.js │ 2025-01-15 07:20:21 │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────┴─────────────────────┘
The route you tried initially, CREATE OR REPLACE SECRET s3 (TYPE S3, PROVIDER CREDENTIAL_CHAIN, PROFILE 'dev');, actually worked for me. Presumably it looks for the named profile in ~/.aws/credentials.
Unfortunately it's not documented here: https://duckdb.org/docs/extensions/httpfs/s3api.html
Are you using the same versions of duckdb and duckdb-aws as I am? I wonder if this is something that has been fixed or broken in between.
I realized I was actually running an old version of duckdb so I tried again with 1.2.1 just now, but the results appear to be the same.
Are you using a profile configured using aws configure sso and aws sso login? As far as I know, those SSO login sessions do not use or update ~/.aws/credentials, so your mention of that file makes me wonder whether you're testing the same issue I am reporting.
I tried installing the aws-sso-util and setting the credential_process in ~/.aws/config, and that actually worked. Perhaps this means that the AWS SDK used by duckdb doesn't support the new credentials storage format used by the AWS CLI v2?
Actually there's a better credential_process built into the aws CLI now. So to get your AWS profile working with DuckDB if you are using AWS SSO sessions:
Add to the profile's config in ~/.aws/config:
credential_process = aws configure export-credentials --profile dev --format process
Make sure you run aws sso login --profile dev to log in to that profile.
Run duckdb with AWS_PROFILE set to the profile name, e.g.:
AWS_PROFILE=dev duckdb
Now you can create the S3 secret very easily:
CREATE OR REPLACE SECRET s3 (TYPE S3, PROVIDER CREDENTIAL_CHAIN);
And now you'll be able to access S3:
SELECT filename, last_modified FROM read_text('s3://some-bucket/*');
Note that you'll want to do this again regularly to refresh the access key and secret, as they are cached in duckdb and not refreshed automatically (at least at the time of writing).
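Putting the steps above together, the relevant section of ~/.aws/config might look like this (a sketch only, mirroring the redacted config from earlier in the thread; the xxx values are placeholders):

```ini
[profile dev]
sso_session = xxx
sso_account_id = xxx
sso_role_name = xxx
region = us-east-1
output = json
credential_process = aws configure export-credentials --profile dev --format process

[sso-session xxx]
sso_start_url = https://xxx.awsapps.com/start#
sso_region = us-east-1
sso_registration_scopes = sso:account:access
```

With this in place, the credential chain can shell out to the AWS CLI itself, which does understand the SSO token cache.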
@dobesv I've been debugging some of the AWS SSO authentication logic and I found two things:
- If the sso_start_url parameter ends in a hash character ('#'), like this:
https://orgname.awsapps.com/start#
then for some reason the AWS SDK library computes a different hash from the one used by the aws sso login command.
So I had to remove the trailing '#' from the sso_start_url parameter.
- I have to set the env var AWS_PROFILE=profilename before running duckdb, then I can create a secret like this:
CREATE SECRET secret2 ( TYPE s3, PROVIDER credential_chain, REGION 'myregion', ENDPOINT 's3.myregion.amazonaws.com' );
However, setting the PROFILE option for some reason does not work:
CREATE SECRET secret2 ( TYPE s3, PROVIDER credential_chain, PROFILE 'myprofile', REGION 'myregion', ENDPOINT 's3.myregion.amazonaws.com' );
I tested with duckdb v1.2.1
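The first finding above (the trailing '#' in sso_start_url) is consistent with how the AWS CLI caches SSO tokens: as far as I can tell, the cache file under ~/.aws/sso/cache is named after a SHA-1 digest of the start URL (for legacy SSO profiles), so a trailing '#' yields a different digest and the SDK looks up a cache entry that aws sso login never wrote. A minimal illustration of the mismatch (the org name is a placeholder):

```python
import hashlib

# Two versions of the same start URL, differing only in the trailing '#'.
with_hash = "https://orgname.awsapps.com/start#"
without_hash = "https://orgname.awsapps.com/start"

# SSO token cache file names are derived from a SHA-1 of the start URL,
# so the two URLs point at different cache entries.
digest_a = hashlib.sha1(with_hash.encode("utf-8")).hexdigest()
digest_b = hashlib.sha1(without_hash.encode("utf-8")).hexdigest()
print(digest_a == digest_b)  # False: the '#' changes the digest
```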
Does not work for me :( duckdb 1.2.1 seems to not even start the credential_process.
I have the following profile defined in ~/.aws/config:
[profile duckdb-test]
credential_process = 'touch /Users/user1/.duckdb/IwasHere'
and I create the secret like this:
con.execute("""
CREATE OR REPLACE SECRET processTest (
TYPE s3,
PROVIDER CREDENTIAL_CHAIN,
PROFILE 'duckdb-test'
);
""")
In addition, I have set up the AWS_PROFILE environment variable to duckdb-test as well. The file /Users/user1/.duckdb/IwasHere is never created, though, which makes me think that duckdb-aws is not executing the credential_process at all.
Have you tried running duckdb like:
AWS_PROFILE=duckdb-test duckdb
And then running
CREATE OR REPLACE SECRET s3 (TYPE S3, PROVIDER CREDENTIAL_CHAIN);
Inside that duckdb prompt? Someone said the PROFILE argument wasn't working for this case, so maybe try it without relying on that.
Also, I wonder whether an AWS profile with only credential_process set is valid. Were you able to use that profile with the AWS CLI? E.g., if you run something like AWS_PROFILE=duckdb-test aws s3 ls s3://foo, would it create the file in that case?
Yes, I tried various different ways to set the profile, including setting the AWS_PROFILE environment variable before I started the duckdb prompt.
Yes, I can confirm that the file is created when I run aws s3 ls s3://somebucket --profile duckdb-test
Is there any other way to debug this further?
Had some time to debug into this, documenting my findings:
aws_secret.cpp:
if (input.options.find("profile") != input.options.end()) {
Aws::Auth::ProfileConfigFileAWSCredentialsProvider provider(profile.c_str());
credentials = provider.GetAWSCredentials();
} else {
Aws::Auth::DefaultAWSCredentialsProviderChain provider;
credentials = provider.GetAWSCredentials();
}
This explains why setting the profile only works via the AWS_PROFILE environment variable. If it is set via a DuckDB configuration option, the first branch of the condition is taken and ProfileConfigFileAWSCredentialsProvider is used, which does not use ProcessCredentialsProvider at all and therefore never executes the credential_process. The underlying issue seems to be that we can't pass a profile into DefaultAWSCredentialsProviderChain, so the logic explicitly uses ProfileConfigFileAWSCredentialsProvider, which does not use ProcessCredentialsProvider.
On this computer I used the latest duckdb from master, and the IwasHere file is actually created when I specify the profile using the environment variable! Now, to fix this, I'm trying the manual way with a CHAIN configuration option:
CREATE OR REPLACE SECRET s3 (TYPE S3, PROVIDER CREDENTIAL_CHAIN, CHAIN 'process');
This also creates the IwasHere file - and I think the profile can be passed if I change these lines:
else if (item == "process") {
AddProvider(std::make_shared<Aws::Auth::ProcessCredentialsProvider>());
}
to:
else if (item == "process") {
if (profile.empty()) {
AddProvider(std::make_shared<Aws::Auth::ProcessCredentialsProvider>());
} else {
AddProvider(std::make_shared<Aws::Auth::ProcessCredentialsProvider>(profile.c_str()));
}
}
I will prepare a PR for that fix.
@nicornk
What do you think about the SSO path? Is there a way to use the PROFILE specified for the SSO provider (or when using the default chain)?
When you use the sso path, the profile is already taken into consideration:
else if (item == "sso") {
	if (profile.empty()) {
		AddProvider(std::make_shared<Aws::Auth::SSOCredentialsProvider>());
	} else {
		AddProvider(std::make_shared<Aws::Auth::SSOCredentialsProvider>(profile));
	}
}
https://github.com/duckdb/duckdb-aws/blob/035c589a846b448d5c9cf3523ebfe439053a4406/src/aws_secret.cpp#L85C6-L90C6
In the default chain the AWS SDK does not allow passing in a profile - that could be a feature request for the AWS C++ SDK?
I opened https://github.com/aws/aws-sdk-cpp/issues/3395 - it might be possible to pass the profile into the default chain if that is implemented.
You're right, if you specify CHAIN 'sso' then the provided profile is used:
CREATE SECRET secret2 ( TYPE s3, PROVIDER credential_chain, CHAIN 'sso', PROFILE 'myprofile', REGION 'af-south-1', ENDPOINT 's3.my-region.amazonaws.com' );
select * from read_parquet('...');
It also works if you set AWS_PROFILE before launching duckdb:
CREATE SECRET secret2 ( TYPE s3, PROVIDER credential_chain, CHAIN 'sso', REGION 'af-south-1', ENDPOINT 's3.af-south-1.amazonaws.com' );
select * from read_parquet('...');
Somehow I missed these combinations of options; it could be because I had that problem with the '#' character in the sso_start_url.
Still have this issue in duckdb 1.3.2.
> Still have this issue in duckdb 1.3.2.
See my last comments or show us the config you are using.
> I opened aws/aws-sdk-cpp#3395 - might be a possibility to pass the profile into the Default Chain if implemented.
It seems this has been added to aws-sdk-cpp - I might be able to do a PR here soon.
> Still have this issue in duckdb 1.3.2.
> See my last comments or show us the config you are using
I misunderstood, I thought your comment was a workaround. I don't have any issues with the configuration you've described.
So your problem is solved, or there is another config you're trying to use which fails?
Seems like https://github.com/duckdb/duckdb-aws/issues/62#issuecomment-2839235394 is a workaround/fix?
@nicornk Happy to keep your PR open so we can include it in the v1.5 release.
Is anyone else still experiencing issues? Or can I close this?
Please reopen if the provided workaround does not work for people