boto3
`FileNotFoundError` when using `download_file` method of s3 client object.
Describe the bug
When running the `download_file` method of an S3 client object, I am receiving a very unexpected error (see traceback below). Within the inner workings of boto it seems to be looking for a file at the destination path, plus some random characters appended to the end (`.6E57BFFa`).
Strangely this code was working yesterday.
I've also tested in a minimal example outside of the code base and the same error persists.
I've tested with multiple files in multiple buckets.
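For context, the random suffix in the traceback looks like s3transfer writing the download to a temporary file next to the destination and renaming it on completion, which would explain why a missing parent directory surfaces as a `FileNotFoundError` on a path you never asked for. A stdlib-only sketch of that naming scheme (the function `temp_download_path` is hypothetical, inferred from the traceback rather than taken from s3transfer's actual code):

```python
import os
import random
import string

def temp_download_path(dest):
    # s3transfer appears to download to a temporary file named
    # <dest>.<8 random characters> and rename it when the transfer
    # finishes. This is a sketch of that behavior, not the real code.
    suffix = ''.join(random.choice(string.hexdigits) for _ in range(8))
    return f"{dest}{os.extsep}{suffix}"

# If dest's parent directory does not exist, open() on this temporary
# path raises FileNotFoundError, with the random suffix visible in the
# error message -- matching the '.6E57BFFa' seen above.
```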
Expected Behavior
I expected the `download_file` method to work as designed, without throwing this error.
Current Behavior
```
  File "/home/nickpestell/apha-csu/repos/btb-phylo/utils.py", line 117, in s3_download_file
    s3.download_file(bucket, key, dest)
  File "/home/nickpestell/python-virtualenvs/btb-phylo/lib/python3.8/site-packages/boto3/s3/inject.py", line 190, in download_file
    return transfer.download_file(
  File "/home/nickpestell/python-virtualenvs/btb-phylo/lib/python3.8/site-packages/boto3/s3/transfer.py", line 320, in download_file
    future.result()
  File "/home/nickpestell/python-virtualenvs/btb-phylo/lib/python3.8/site-packages/s3transfer/futures.py", line 103, in result
    return self._coordinator.result()
  File "/home/nickpestell/python-virtualenvs/btb-phylo/lib/python3.8/site-packages/s3transfer/futures.py", line 266, in result
    raise self._exception
  File "/home/nickpestell/python-virtualenvs/btb-phylo/lib/python3.8/site-packages/s3transfer/tasks.py", line 139, in __call__
    return self._execute_main(kwargs)
  File "/home/nickpestell/python-virtualenvs/btb-phylo/lib/python3.8/site-packages/s3transfer/tasks.py", line 162, in _execute_main
    return_value = self._main(**kwargs)
  File "/home/nickpestell/python-virtualenvs/btb-phylo/lib/python3.8/site-packages/s3transfer/download.py", line 642, in _main
    fileobj.seek(offset)
  File "/home/nickpestell/python-virtualenvs/btb-phylo/lib/python3.8/site-packages/s3transfer/utils.py", line 378, in seek
    self._open_if_needed()
  File "/home/nickpestell/python-virtualenvs/btb-phylo/lib/python3.8/site-packages/s3transfer/utils.py", line 361, in _open_if_needed
    self._fileobj = self._open_function(self._filename, self._mode)
  File "/home/nickpestell/python-virtualenvs/btb-phylo/lib/python3.8/site-packages/s3transfer/utils.py", line 272, in open
    return open(filename, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/fsx-042/share/phyloConsensus/phyloConsensus/AF-12-01538-15.fas.6E57BFFa'
```
Reproduction Steps
I expect this will not reproduce, but I am simply running:

```python
s3 = boto3.client('s3')
s3.download_file(<bucket>, <key>, <dest>)
```

where `bucket`, `key`, and `dest` are all correctly formatted.
Possible Solution
No response
Additional Information/Context
No response
SDK version used
'1.24.28'
Environment details (OS name and version, etc.)
ubuntu 20.04
Hi @nick-pestell thanks for reaching out. This appears to overlap with a recent issue: https://github.com/boto/boto3/issues/3319. As mentioned there, this is likely an issue with your path/sub-directories. Can you review this comment and let me know if the posts referenced there help clarify the behavior you are seeing?
Thanks @tim-finnigan. I think I broadly understand the issue that these other users are facing, although I'm not sure it applies here. I have confirmed that the key and bucket I'm using exist and are formatted correctly. I've even confirmed that the object exists by passing the key and bucket to the following function:
```python
import boto3
import botocore

def s3_object_exists(bucket, key):
    """
    Returns True if the S3 key is in the S3 bucket, False otherwise.
    Thanks: https://stackoverflow.com/questions/33842944/check-if-a-key-exists-in-a-bucket-in-s3-using-boto3
    """
    key_exists = True
    s3 = boto3.resource('s3')
    try:
        s3.Object(bucket, key).load()
    except botocore.exceptions.ClientError as e:
        if e.response['Error']['Code'] == "404":
            # The object does not exist.
            key_exists = False
        else:
            # Something else has gone wrong.
            raise e
    return key_exists
```
Update, this is working again today, without any changes to the code. I'm not sure I understand how this can be an issue with the s3 keys if it works sometimes and not others...
It has again started throwing the error, without any change to our code or s3_keys.
Thanks @nick-pestell for following up. Does the sub-directory destination path that you're trying to download to exist? You can use an approach like this to make sure the directories are created:

```python
import os

import boto3

s3 = boto3.client('s3')
objects = s3.list_objects(Bucket='bucket')['Contents']
for s3_key in objects:
    s3_object = s3_key['Key']
    if not s3_object.endswith("/"):
        s3.download_file('bucket', s3_object, s3_object)
    elif not os.path.exists(s3_object):
        os.makedirs(s3_object)
```
Thanks @tim-finnigan. Yes, the sub-folder to which I'm attempting to download the file does exist, and the S3 object is definitely a file (it doesn't end with `/`). I don't know if it's useful, but when this error occurs I'm forced to switch to calling the AWS CLI from within Python (via a subprocess). This works fine (although it is significantly slower), again suggesting that the bug is probably caused by something within boto3/s3transfer?
@tim-finnigan, do you have any more ideas? I'm relatively convinced that this is a bug, but happy to explore other options if you have any suggestions.
Hi @nick-pestell, could you provide the debug logs by adding `boto3.set_stream_logger('')` to your script (with any sensitive info redacted)? That would help give us more insight into the behavior you're seeing.
Thanks @tim-finnigan. Frustratingly, it's working right now. I will try again over the next few days to see if I can catch it at a time when the above error occurs and send the log.
@tim-finnigan I ran into the same issue. However, I figured out what was happening in my situation.
When listing subdirectories, if I had created one manually in the S3 console, it is returned as an object by `list_objects`. However, when a folder was created by boto3 while uploading a file with an intermediate subdirectory, `list_objects` does not show the folders created by the boto3 upload method.
My guess is that this is a bug: when folders are created by the boto3 upload method, they're not added as objects to the S3 bucket.
More details:
These are the contents of my S3 bucket:
When I ran the following snippet:
```python
for obj_metadata in s3.list_objects(Bucket="faiyaz-file-transfer-test")["Contents"]:
    print(obj_metadata["Key"])
```
the output I get is:

```
.DS_Store
another-folder/
another-folder/hello.rtf
goodbye.rtf
hello.rtf
mango-folder/
subdir/goodbye.rtf
untitled folder/goodbye.rtf
```
The folders `subdir` and `untitled folder` were created when using boto3 to upload the last two files shown in the list above. You'll notice that they're not listed when `list_objects` is called. The folders `another-folder` and `mango-folder` do show up, but they were both created via the console.
Greetings! It looks like this issue hasn’t been active in longer than five days. We encourage you to check if this is still an issue in the latest release. In the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment or upvote with a reaction on the initial post to prevent automatic closure. If the issue is already closed, please feel free to open a new one.
@FyzHsn the behavior you're describing (a folder object is created if you use the S3 console to create `folder/`, but a folder object does not get created if an object is uploaded to `folder/dog.png` via boto3) is standard behavior across all S3 SDKs, not a bug. In the general case, there are no folder objects in S3 at all. An object `folder/dog.png` can exist without an object `folder/` existing. Folders are, in a sense, virtual and inferred from key prefixes of existing objects.
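The prefix inference described above can be illustrated without any S3 calls; this sketch derives the "folders" implied by a plain list of keys (the helper `inferred_folders` and the sample keys are mine, for illustration only):

```python
def inferred_folders(keys):
    # Folders in S3 are virtual: they are just the "/"-delimited
    # prefixes of existing object keys. Collect every such prefix.
    folders = set()
    for key in keys:
        parts = key.split("/")[:-1]  # drop the final component
        for i in range(1, len(parts) + 1):
            folders.add("/".join(parts[:i]) + "/")
    return sorted(folders)

keys = [
    "another-folder/",          # explicit folder object (console-created)
    "another-folder/hello.rtf",
    "goodbye.rtf",
    "subdir/goodbye.rtf",       # no "subdir/" object exists
]
print(inferred_folders(keys))  # -> ['another-folder/', 'subdir/']
```

Note that `subdir/` appears in the inferred list even though no `subdir/` object exists in the bucket, which is exactly the situation described in the comment above.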
This bug is still present after following all the suggestions made above for boto3==1.26.90. Please provide some more guidance.
I'm not sure, but in my case it seems to be raised when I try to save a file to a directory that does not exist.
Suppose our code is:

```python
bucket.download_file("my-s3-key", "foo/myfile.txt")
```

If the `foo` directory does not exist, the error is raised. You should create the `foo` directory before downloading from S3. Give it a try.
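To summarize the workaround suggested in this thread, a small wrapper (the name `safe_download` is mine, not part of boto3) that creates the destination's parent directory before delegating to `download_file`:

```python
import os

def safe_download(s3_client, bucket, key, dest):
    # Ensure the destination's parent directory exists before s3transfer
    # tries to open its temporary file there, avoiding the
    # FileNotFoundError described in this issue.
    parent = os.path.dirname(dest)
    if parent:
        os.makedirs(parent, exist_ok=True)
    s3_client.download_file(bucket, key, dest)
```

Usage would be `safe_download(boto3.client('s3'), bucket, key, dest)` in place of a bare `download_file` call.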