postgres-aws-s3
table_import_from_s3 in localstack has Access Key error
I am trying to import s3 data from localstack, using:
select aws_s3.table_import_from_s3(
  'tablename',
  'col1,col2',
  '(format csv, header true)',
  aws_commons.create_s3_uri('my-bucket', 'test.csv', 'us-west-2'),
  aws_commons.create_aws_credentials('none', 'none', ''),
  'http://localstack:4566'
);
'none' is what I use for all localstack calls, but using that results in:
ERROR:  spiexceptions.ExternalRoutineException: botocore.exceptions.ClientError: An error occurred (InvalidAccessKeyId) when calling the GetObject operation: The AWS Access Key Id you provided does not exist in our records.
CONTEXT:  Traceback (most recent call last):
  PL/Python function "table_import_from_s3", line 7, in
    return plan.execute(
PL/Python function "table_import_from_s3"
I have ensured this bucket is publicly accessible via:
AWS_PAGER="" \
AWS_ACCESS_KEY_ID=none \
AWS_SECRET_ACCESS_KEY=none \
aws \
--endpoint-url=http://localstack:4566 \
--region=us-west-2 \
s3api put-bucket-policy \
--bucket my-bucket \
--policy '{
  "Id": "Policy1397632521960",
  "Statement": [
    {
      "Sid": "Stmt1397633323327",
      "Action": [
        "s3:GetObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::my-bucket/*",
      "Principal": {
        "AWS": [
          "*"
        ]
      }
    }
  ]
}'
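For completeness, here is the same policy as a Python dict (purely illustrative; the bucket name is from my setup), which is convenient if you want to apply it with boto3's put_bucket_policy instead of the CLI:

```python
import json

# Public-read policy granting anonymous GetObject on the bucket.
policy = {
    "Id": "Policy1397632521960",
    "Statement": [
        {
            "Sid": "Stmt1397633323327",
            "Action": ["s3:GetObject"],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::my-bucket/*",
            "Principal": {"AWS": ["*"]},
        }
    ],
}

# boto3's put_bucket_policy takes the policy as a JSON string:
# s3client.put_bucket_policy(Bucket='my-bucket', Policy=json.dumps(policy))
print(json.dumps(policy, indent=2))
```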
@huiser I've also tried adding a secret key of 'none', and I've verified I can publicly access the files using wget without any credentials. So is there a way for aws_s3 to accept no credentials at all, or 'none' as I've shown above?
wget http://localstack:4566/my-bucket/test.csv
Resolving localstack (localstack)... 172.25.0.3
Connecting to localstack (localstack)|172.25.0.3|:4566... connected.
HTTP request sent, awaiting response... 200
Length: 133635 (131K) [text/csv]
Saving to: 'test.csv'
and if I use boto3 locally, it also successfully gets the object:
import boto3
s3client = boto3.client(
    's3',
    region_name='us-west-2',
    endpoint_url='http://localstack:4566',
    aws_access_key_id='none',
    aws_secret_access_key='none',
    aws_session_token='none',
)
print(s3client.get_object(Bucket='my-bucket',Key='test.csv'))
{'ResponseMetadata': {'RequestId': '1I8DPGPWYI21YKAKHO9Z5Y33OSA74FQU5UTOQ08LJCHZFUJ9TGY7', 'HTTPStatusCode': 200, ...
...
...
So if boto3 can connect in my simple example above, do you think this is a bug in postgres-aws-s3?
Besides this possibly being a bug, I see a lot of benefit here: credentials are optional in Amazon's aws_s3 API, so supporting optional credentials would also keep this extension a consistent mirror of their product.
Also, the reason I don't use credentials is that I only communicate internally within my VPC (between RDS, S3, and Lambda). I'm using an external (local) Postgres database only because localstack currently doesn't have full/free support for RDS. Thanks to Docker, I can bridge connections to a local Postgres database that is otherwise identical to RDS.
I found the alternative method that allows me to specify bucket and region separately. When I use this, it works:
https://github.com/chimpler/postgres-aws-s3/blob/b817be9caf54e5b09c5c6edb924cf1b17df0e75c/aws_s3--0.0.1.sql#L41
It's not ideal, though, since it doesn't mirror my production code, which uses the s3_uri object instead. It's also not clear why this method works but the other doesn't. Thanks for any help with this; I hope it's valuable feedback.
This is a great project! Thanks for sharing :)