filesystem_spec
Can't access data from S3 Buckets
import fsspec

s3_fs = fsspec.filesystem(
    "s3",
    key="xxxxxx",
    secret="xxxxxxxxx",
    client_kwargs={
        "region_name": "xxxxxxx",
    },
)
s3_fs.ls('s3://')
It lists all the buckets, but when I use
s3_fs.ls('s3://Bucket_name')
it returns an empty list.
The same bucket, with all of its contents, can be accessed with boto3, but fsspec returns nothing. Please help me solve this issue.
If I try to read a file using fsspec.open, it gives a Bad Request error because the file was not found.
I have also tried s3fs; it has the same issue.
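For comparison, a minimal boto3 listing along the lines described as working might look like the sketch below; the bucket name, credentials, and region are placeholders.

import boto3

# Sketch of the boto3 call reported to work for the same bucket;
# "my-bucket" and the credentials/region are placeholders.
s3 = boto3.client(
    "s3",
    aws_access_key_id="xxxxxx",
    aws_secret_access_key="xxxxxxxxx",
    region_name="xxxxxxx",
)
response = s3.list_objects_v2(Bucket="my-bucket")
for obj in response.get("Contents", []):
    print(obj["Key"])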
What version of s3fs are you using? For boto3, are you also using key/secret, or something else?
cc https://github.com/fsspec/s3fs/issues/701
cc https://github.com/fsspec/s3fs/issues/700
Hi Martin, thanks for the quick response. I am using s3fs==2023.1.0.
I solved it by specifying the required bucket in the endpoint: endpoint_url = 'https://s3.amazonaws.com/' + S3_BUCKET_NAME
But if I don't specify my bucket's endpoint, it lists all the buckets but not the contents of the bucket. Is that the usual behavior?
That is fascinating and also mysterious - definitely not how it should work.
@elephantum you were working with endpoint_url, does this ring any bells?
@Eugeny , maybe an interaction with transient bucket state?
@isidentical , long shot, but maybe related to bucket regions?
This is strange; I have not encountered an S3 endpoint_url in the form s3.amazonaws.com/{BUCKET_NAME}.
Just a hunch: does it work with just https://s3.amazonaws.com as the endpoint? If yes, then there is probably a misconfiguration in some AWS-related config.
No, it does not work.
import fsspec

bucket_name = 'test'
config = {
    'key': 'xxxxxxxxxxxxxxx',
    'secret': 'xxxxxxxxxxxxxxx',
    'client_kwargs': {
        "endpoint_url": 'https://s3.amazonaws.com/' + bucket_name,
        "region_name": 'region',
    },
}
s3 = fsspec.filesystem('s3', **config)

file_name = 'xyz'
with s3.open(f"{bucket_name}/{file_name}", "rb") as f:
    file_contents = f.read()
print(file_contents)
This is the complete code I am using to read the file from the bucket, and it works, but if I change the endpoint URL it stops working: it lists the buckets but not their contents. The behavior is the same if I remove client_kwargs altogether.
It gives a Bad Request error when the bucket name is not added to the endpoint URL, which I think is due to the file not being found. I have tried to access both public and private buckets without the bucket name in the endpoint URL, and neither can be accessed.
Would you mind listing the set of AWS_ environment variables you have defined (not their values, except where safe)? Do you have .boto or .aws files? Are you running this from within an AWS service?
aws_access_key_id, aws_secret_key_id, bucket_name, aws_region = us-east-1
I am running this from a Jupyter notebook.
You have a variable called BUCKET_NAME?
But is that notebook running within AWS, perhaps on EC2 or another virtual machine?
I am running Jupyter locally, not inside AWS. The bucket_name variable is just for my own use.
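Since the question hinges on which credentials and region the SDK resolves locally, a quick diagnostic along these lines could help; this is a sketch, assuming boto3 is installed, and s3fs (via aiobotocore) should follow largely the same resolution chain.

import os
import boto3

# Check which region and credentials the standard AWS SDK resolves on this
# machine; a mismatch with the bucket's region could explain the empty listings.
session = boto3.Session()
creds = session.get_credentials()
print("resolved region:", session.region_name or os.environ.get("AWS_DEFAULT_REGION"))
print("credentials resolved:", creds is not None)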
I think I have an idea.
https://s3.amazonaws.com/ is not the correct endpoint for us-east-1.
Try providing: endpoint_url = https://s3.us-east-1.amazonaws.com
Endpoints reference: https://docs.aws.amazon.com/general/latest/gr/s3.html
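For reference, passing that regional endpoint through client_kwargs might look like the sketch below; the credentials and bucket name are placeholders.

import fsspec

# Sketch of the suggestion above: use the regional endpoint rather than the
# global one. Credentials and the bucket name are placeholders.
s3 = fsspec.filesystem(
    "s3",
    key="xxxxxxxxxxxxxxx",
    secret="xxxxxxxxxxxxxxx",
    client_kwargs={
        "endpoint_url": "https://s3.us-east-1.amazonaws.com",
        "region_name": "us-east-1",
    },
)
print(s3.ls("s3://test"))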
I tried it; it does not work.
https://github.com/fsspec/s3fs/issues/701#issuecomment-1480303225 suggests that setting cache_regions=True for S3FileSystem or specifying the region of the bucket as your default region might be what you need. Can you try?
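For reference, enabling cache_regions as that comment suggests might look like the sketch below; credentials and the bucket name are placeholders. cache_regions makes s3fs look up and cache each bucket's actual region instead of assuming the client's default.

import fsspec

# Sketch of the cache_regions suggestion; extra kwargs are passed through to
# s3fs.S3FileSystem. Credentials and the bucket name are placeholders.
s3 = fsspec.filesystem(
    "s3",
    key="xxxxxxxxxxxxxxx",
    secret="xxxxxxxxxxxxxxx",
    cache_regions=True,
)
print(s3.ls("s3://test"))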