asyncssh
asyncssh copied to clipboard
Missing file segments in 2.12.0
Hi,
I upgraded to v2.12.0 from 2.11.0 and got the error below. It also caused the uploaded file to be corrupted: only about the second half of the file was uploaded to the server. The file is about 50kb in size.
Could not upload file to ['/usr/src/app/file.txt'] due to Traceback (most recent call last):
File "/usr/src/ap/utils/ssh.py", line 55, in _async_sftp_upload
await sftp.put(
File "/usr/local/lib/python3.9/site-packages/asyncssh/sftp.py", line 3898, in put
await self._begin_copy(local_fs, self, localpaths, remotepath, 'put',
File "/usr/local/lib/python3.9/site-packages/asyncssh/sftp.py", line 3689, in _begin_copy
await self._copy(srcfs, dstfs, srcfile, dstfile, srcname.attrs,
File "/usr/local/lib/python3.9/site-packages/asyncssh/sftp.py", line 3608, in _copy
await _SFTPFileCopier(block_size, max_requests, 0,
File "/usr/local/lib/python3.9/site-packages/asyncssh/sftp.py", line 803, in run
await self._dst.close()
File "/usr/local/lib/python3.9/site-packages/asyncssh/sftp.py", line 3436, in close
await self._handler.close(self._handle)
File "/usr/local/lib/python3.9/site-packages/asyncssh/sftp.py", line 2653, in close
await self._make_request(FXP_CLOSE, String(handle))
File "/usr/local/lib/python3.9/site-packages/asyncssh/sftp.py", line 2445, in _make_request
result = self._packet_handlers[resptype](self, resp)
File "/usr/local/lib/python3.9/site-packages/asyncssh/sftp.py", line 2461, in _process_status
raise exc
asyncssh.sftp.SFTPFailure: Missing file segements in upload:0,
I'm doing the upload like so:
async def _async_sftp_upload(
connection: asyncssh.SSHClientConnection | None,
local_paths: list[str],
remote_path: str,
) -> bool:
"""
Upload a file to a SFTP server.
Docs:
https://asyncssh.readthedocs.io/en/latest/api.html#asyncssh.SFTPClient
"""
if connection is None:
return False
try:
async with connection.start_sftp_client() as sftp:
await sftp.put(
localpaths=local_paths,
remotepath=remote_path,
)
message = f"Successfully uploaded {local_paths}"
json_log_writer(message, level="info")
return True
except asyncssh.Error as exc:
_trace = format_stacktrace(exc)
message = f"Could not upload file to {local_paths} due to {_trace}"
json_log_writer(message, level="warning")
return False
Any pointers where I'm going wrong here?
The specific error "Missing file segements in upload:0" is not something being generated by AsyncSSH. It must be coming from the server. What server are you uploading this data to?
I see a report of a similar error being returned to another SFTP client at https://github.com/theophilusx/ssh2-sftp-client/issues/145, also in a case where it is attempting to do parallel writes for speed. Another such report is at https://app.bountysource.com/issues/97514889-missing-file-segements-in-upload-error-even-though-upload-is-complete. So, it could be that the SFTP server you are using just doesn't handle this right.
Does this work properly with 2.11.0 with the exact same server and test code you show here?
Could you try enabling the debug log? The following should do:
import logging
logging.basicConfig(level='ERROR')
asyncssh.set_sftp_log_level('DEBUG')
asyncssh.set_debug_level(1)
There were changes made in the parallel file I/O implementation in between these two releases, but all the testing I did here looked ok, including some stress tests with lots of parallel I/O.
You could also try disabling the parallelism to see if that makes the problem go away. Given that your file is so small, you can probably just set block_size in the sftp.put() call to something like 65536, so that it does the entire copy in a single read and write call. Alternately, you could set max_requests to 1. This would just be a workaround, though -- without sending parallel requests, the file transfers are going to be quite a bit slower.
From your description, it sort of sounds like the file is being closed before all the parallel writes have had a chance to complete. However, I don't see anything obvious in your example code which would cause that.
@ronf apologies for the belated reply and thank you for getting back to me here. I haven't been able to look into this yet as I'm having a hard time replicating the error in a test environment. Had an idea earlier this week on how to try to implement your thoughts. I'll revert to you as soon as I've had a chance to test it.
No problem - since your original report, I've gotten a second one which looks similar (#510).
Which OS are you running this on, and are you doing anything beyond starting up an SFTP client and calling put() on it?
I'm running this in a Docker image (python:3.9-bullseye
) and there's some other things going on in the Docker as well. Just checked and the server is running CrushFTP. Not sure what version though, it's a third-party server.
The SFTP client works like a charm (even after the update) against another SFTP server not running CrushFTP so this might be related to CrushFTP.
Yeah - while this seems to be a server issue with not supporting random access file I/O, I should be able to get things working again as well as they did back in 2.11 against such servers. I have a proposed fix for this over in #510 if you'd like to give it a try.
Thanks again for reporting this! A fix for this is now checked into the "develop" branch as commit ee1baff.
Thank you! I've setup a test account on the server that I can play around with. What I'm doing here is uploading a basic CSV-file which is generated from a list. So I did this in the code: data = list(data) * 1000
to make sure I've got a lot of data to upload.
Result on 2.12.0
:
Traceback (most recent call last):
File "/redacted/utils/ssh.py", line 55, in _async_sftp_upload
await sftp.put(
File "/usr/src/.venv/lib/python3.9/site-packages/asyncssh/sftp.py", line 3898, in put
await self._begin_copy(local_fs, self, localpaths, remotepath, 'put',
File "/usr/src/.venv/lib/python3.9/site-packages/asyncssh/sftp.py", line 3689, in _begin_copy
await self._copy(srcfs, dstfs, srcfile, dstfile, srcname.attrs,
File "/usr/src/.venv/lib/python3.9/site-packages/asyncssh/sftp.py", line 3608, in _copy
await _SFTPFileCopier(block_size, max_requests, 0,
File "/usr/src/.venv/lib/python3.9/site-packages/asyncssh/sftp.py", line 803, in run
await self._dst.close()
File "/usr/src/.venv/lib/python3.9/site-packages/asyncssh/sftp.py", line 3436, in close
await self._handler.close(self._handle)
File "/usr/src/.venv/lib/python3.9/site-packages/asyncssh/sftp.py", line 2653, in close
await self._make_request(FXP_CLOSE, String(handle))
File "/usr/src/.venv/lib/python3.9/site-packages/asyncssh/sftp.py", line 2445, in _make_request
result = self._packet_handlers[resptype](self, resp)
File "/usr/src/.venv/lib/python3.9/site-packages/asyncssh/sftp.py", line 2461, in _process_status
raise exc
asyncssh.sftp.SFTPFailure: Missing file segements in upload:425984,311296,819200,491520,688128,229376,327680,32768,458752,770048,606208,49152,262144,409600,180224,376832,294912,671744,622592,65536,802816,196608,704512,737280,507904,557056,573440,0,720896,81920,278528,655360,212992,98304,114688,524288,638976,245760,475136,131072,16384,835584,753664,589824,540672,344064,163840,393216,442368,147456,851968,786432,360448,
Force-reinstalling your commit pip install git+https://github.com/ronf/asyncssh.git@ee1baff0197d3d7e66e5c7f28e24f3309d51fe60 --force-reinstall
I've also double-checked line 764 in asyncssh.sftp
to make sure I have the latest copy of the file. It reads """Copy a block of the source file"""
as expected.
Now, when I run the upload script I no longer get thrown the asyncssh.sftp.SFTPFailure
and looking at the SFTP-server there's indeed a well-formatted 2MB CSV-file, just as expected.
From my point of view, your patch resolves this issue. Thank you!
That's great, Patrik - thanks for letting me know! This change will be included in the next release...
Great, I'll remain on .11 for now and wait for the .13 release! Thanks again for the fix.