
Missing file segments in 2.12.0

Open ggravlingen opened this issue 2 years ago • 6 comments

Hi,

I upgraded from 2.11.0 to v2.12.0 and got the error below. It also caused the uploaded file to be corrupted: only about the second half of the file was uploaded to the server. The file is about 50 kB in size.

Could not upload file to ['/usr/src/app/file.txt'] due to Traceback (most recent call last):
  File "/usr/src/ap/utils/ssh.py", line 55, in _async_sftp_upload
    await sftp.put(
  File "/usr/local/lib/python3.9/site-packages/asyncssh/sftp.py", line 3898, in put
    await self._begin_copy(local_fs, self, localpaths, remotepath, 'put',
  File "/usr/local/lib/python3.9/site-packages/asyncssh/sftp.py", line 3689, in _begin_copy
    await self._copy(srcfs, dstfs, srcfile, dstfile, srcname.attrs,
  File "/usr/local/lib/python3.9/site-packages/asyncssh/sftp.py", line 3608, in _copy
    await _SFTPFileCopier(block_size, max_requests, 0,
  File "/usr/local/lib/python3.9/site-packages/asyncssh/sftp.py", line 803, in run
    await self._dst.close()
  File "/usr/local/lib/python3.9/site-packages/asyncssh/sftp.py", line 3436, in close
    await self._handler.close(self._handle)
  File "/usr/local/lib/python3.9/site-packages/asyncssh/sftp.py", line 2653, in close
    await self._make_request(FXP_CLOSE, String(handle))
  File "/usr/local/lib/python3.9/site-packages/asyncssh/sftp.py", line 2445, in _make_request
    result = self._packet_handlers[resptype](self, resp)
  File "/usr/local/lib/python3.9/site-packages/asyncssh/sftp.py", line 2461, in _process_status
    raise exc
asyncssh.sftp.SFTPFailure: Missing file segements in upload:0,

I'm doing the upload like so:

from __future__ import annotations  # allows the "X | None" annotation syntax on Python 3.9

import asyncssh


async def _async_sftp_upload(
    connection: asyncssh.SSHClientConnection | None,
    local_paths: list[str],
    remote_path: str,
) -> bool:
    """
    Upload files to an SFTP server.

    Docs:
    https://asyncssh.readthedocs.io/en/latest/api.html#asyncssh.SFTPClient
    """
    if connection is None:
        return False

    try:
        async with connection.start_sftp_client() as sftp:
            await sftp.put(
                localpaths=local_paths,
                remotepath=remote_path,
            )
            message = f"Successfully uploaded {local_paths}"
            json_log_writer(message, level="info")  # project logging helper
            return True
    except asyncssh.Error as exc:
        _trace = format_stacktrace(exc)  # project traceback formatter
        message = f"Could not upload file to {local_paths} due to {_trace}"
        json_log_writer(message, level="warning")

    return False
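
For context, I call the helper roughly like this (a sketch; the host and paths here are placeholders, and authentication is set up elsewhere):

    import asyncio

    async def main() -> None:
        # Placeholders: substitute the real host, credentials, and paths.
        async with asyncssh.connect("sftp.example.com") as conn:
            await _async_sftp_upload(conn, ["file.txt"], "/upload/file.txt")

    asyncio.run(main())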

Any pointers where I'm going wrong here?

ggravlingen avatar Aug 22 '22 15:08 ggravlingen

The specific error "Missing file segements in upload:0" is not something being generated by AsyncSSH. It must be coming from the server. What server are you uploading this data to?

I see a report of a similar error being returned to another SFTP client at https://github.com/theophilusx/ssh2-sftp-client/issues/145, also in a case where the client was attempting parallel writes for speed. Another such report is at https://app.bountysource.com/issues/97514889-missing-file-segements-in-upload-error-even-though-upload-is-complete. So it could be that the SFTP server you are using simply doesn't handle this correctly.

Does this work properly with 2.11.0 with the exact same server and test code you show here?

Could you try enabling the debug log? The following should do it:

    import logging
    logging.basicConfig(level='ERROR')
    asyncssh.set_sftp_log_level('DEBUG')
    asyncssh.set_debug_level(1)

There were changes made in the parallel file I/O implementation in between these two releases, but all the testing I did here looked ok, including some stress tests with lots of parallel I/O.

You could also try disabling the parallelism to see if that makes the problem go away. Given that your file is so small, you can probably just set block_size in the sftp.put() call to something like 65536, so that it does the entire copy in a single read and write call. Alternately, you could set max_requests to 1. This would just be a workaround, though -- without sending parallel requests, the file transfers are going to be quite a bit slower.
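
Concretely, a minimal sketch of both workarounds, reusing the sftp, local_paths, and remote_path names from your snippet:

    # Option 1: one big block, so the ~50 KB file is copied in a
    # single read and write.
    await sftp.put(
        localpaths=local_paths,
        remotepath=remote_path,
        block_size=65536,
    )

    # Option 2: default block size, but only one outstanding request
    # at a time (slower, no parallelism).
    await sftp.put(
        localpaths=local_paths,
        remotepath=remote_path,
        max_requests=1,
    )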

From your description, it sort of sounds like the file is being closed before all the parallel writes have had a chance to complete. However, I don't see anything obvious in your example code which would cause that.
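
To illustrate what I mean (a toy asyncio sketch, not AsyncSSH's actual implementation):

    import asyncio

    async def write_block(offset: int) -> None:
        """Stand-in for one parallel FXP_WRITE round trip."""
        await asyncio.sleep(0.01)

    async def copy_then_close() -> None:
        writers = [asyncio.create_task(write_block(off))
                   for off in range(0, 65536, 16384)]
        # If the close were issued before this gather, some blocks
        # would never reach the server: the "missing segments" symptom.
        await asyncio.gather(*writers)
        # Only now is it safe to close the destination file.

    asyncio.run(copy_then_close())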

ronf avatar Aug 23 '22 03:08 ronf

@ronf apologies for the belated reply, and thank you for getting back to me here. I haven't been able to look into this yet, as I'm having a hard time replicating the error in a test environment. Earlier this week I had an idea for how to test your suggestions, and I'll get back to you as soon as I've had a chance to try it.

ggravlingen avatar Sep 13 '22 19:09 ggravlingen

No problem - since your original report, I've gotten a second one which looks similar (#510).

Which OS are you running this on, and are you doing anything beyond starting up an SFTP client and calling put() on it?

ronf avatar Sep 15 '22 03:09 ronf

I'm running this in a Docker image (python:3.9-bullseye), and there are some other things going on in the container as well. I just checked, and the server is running CrushFTP. I'm not sure which version, though; it's a third-party server.

The SFTP client works like a charm (even after the update) against another SFTP server not running CrushFTP, so this might be related to CrushFTP.

ggravlingen avatar Sep 15 '22 07:09 ggravlingen

Yeah - while this seems to be a server issue with not supporting random access file I/O, I should be able to get things working again as well as they did back in 2.11 against such servers. I have a proposed fix for this over in #510 if you'd like to give it a try.

ronf avatar Sep 17 '22 22:09 ronf

Thanks again for reporting this! A fix for this is now checked into the "develop" branch as commit ee1baff.

ronf avatar Sep 20 '22 01:09 ronf

Thank you! I've set up a test account on the server that I can play around with. What I'm doing here is uploading a basic CSV file generated from a list. To make sure I had plenty of data to upload, I did data = list(data) * 1000 in the code.
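
Roughly like this (a sketch of the test setup; the row contents and filename here are just placeholders):

    import csv

    data = [["timestamp", "value"], ["2022-09-20", "42"]]
    data = list(data) * 1000  # repeat the rows so the upload spans many blocks

    with open("file.txt", "w", newline="") as f:
        csv.writer(f).writerows(data)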

Result on 2.12.0:

Traceback (most recent call last):
  File "/redacted/utils/ssh.py", line 55, in _async_sftp_upload
    await sftp.put(
  File "/usr/src/.venv/lib/python3.9/site-packages/asyncssh/sftp.py", line 3898, in put
    await self._begin_copy(local_fs, self, localpaths, remotepath, 'put',
  File "/usr/src/.venv/lib/python3.9/site-packages/asyncssh/sftp.py", line 3689, in _begin_copy
    await self._copy(srcfs, dstfs, srcfile, dstfile, srcname.attrs,
  File "/usr/src/.venv/lib/python3.9/site-packages/asyncssh/sftp.py", line 3608, in _copy
    await _SFTPFileCopier(block_size, max_requests, 0,
  File "/usr/src/.venv/lib/python3.9/site-packages/asyncssh/sftp.py", line 803, in run
    await self._dst.close()
  File "/usr/src/.venv/lib/python3.9/site-packages/asyncssh/sftp.py", line 3436, in close
    await self._handler.close(self._handle)
  File "/usr/src/.venv/lib/python3.9/site-packages/asyncssh/sftp.py", line 2653, in close
    await self._make_request(FXP_CLOSE, String(handle))
  File "/usr/src/.venv/lib/python3.9/site-packages/asyncssh/sftp.py", line 2445, in _make_request
    result = self._packet_handlers[resptype](self, resp)
  File "/usr/src/.venv/lib/python3.9/site-packages/asyncssh/sftp.py", line 2461, in _process_status
    raise exc
asyncssh.sftp.SFTPFailure: Missing file segements in upload:425984,311296,819200,491520,688128,229376,327680,32768,458752,770048,606208,49152,262144,409600,180224,376832,294912,671744,622592,65536,802816,196608,704512,737280,507904,557056,573440,0,720896,81920,278528,655360,212992,98304,114688,524288,638976,245760,475136,131072,16384,835584,753664,589824,540672,344064,163840,393216,442368,147456,851968,786432,360448,

I then force-reinstalled your commit with:

    pip install git+https://github.com/ronf/asyncssh.git@ee1baff0197d3d7e66e5c7f28e24f3309d51fe60 --force-reinstall

I've also double-checked line 764 in asyncssh.sftp to make sure I have the latest copy of the file. It reads """Copy a block of the source file""" as expected.

Now, when I run the upload script, the asyncssh.sftp.SFTPFailure is no longer raised, and looking at the SFTP server there is indeed a well-formatted 2 MB CSV file, just as expected.

From my point of view, your patch resolves this issue. Thank you!

ggravlingen avatar Sep 20 '22 14:09 ggravlingen

That's great, Patrik - thanks for letting me know! This change will be included in the next release...

ronf avatar Sep 20 '22 15:09 ronf

Great, I'll remain on .11 for now and wait for the .13 release! Thanks again for the fix.

ggravlingen avatar Sep 20 '22 16:09 ggravlingen