
If "rsync --partial --timeout=..." times out and we immediately start new "rsync --partial", then it starts from scratch

Open • safinaskar opened this issue 3 years ago • 1 comment

Hi. I'm trying to write a script that can reliably upload big files (~100 GiB) to a remote server over a ~3 MiB/s channel, so such an upload takes several hours. The local computer is a laptop, so there are complicating factors such as suspend or a change of the local IP address during the upload. My current version of the script uses this trick: while ! rsync --partial --timeout=... ...; do sleep 1; done. Unfortunately, the trick doesn't work well: sometimes the second rsync invocation starts from scratch. I think this is a bug.
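For reference, the retry trick can be sketched as a small helper that also caps the number of attempts and logs each exit code, which makes it easier to tell an rsync timeout (code 30) from an ssh connection failure (code 255). The name retry_until_ok, the attempt limit, and the USER@HOST placeholder are assumptions for illustration only; this is not the actual script.

```shell
#!/bin/sh
# retry_until_ok MAX_ATTEMPTS DELAY CMD [ARGS...]
# Hypothetical helper: runs CMD until it exits 0 or MAX_ATTEMPTS is reached.
# Returns 0 on success, otherwise the exit code of the last failed attempt.
retry_until_ok() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        # On failure, $? below holds the exit code of the command itself.
        "$@" && return 0
        status=$?
        echo "attempt $attempt failed with exit code $status" >&2
        if [ "$attempt" -ge "$max_attempts" ]; then
            return "$status"
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example call (USER@HOST is a placeholder, not a real address):
# retry_until_ok 100 1 rsync --archive --progress --partial --timeout=20 \
#     -e 'ssh -i /tmp/0.pem' /tmp/ur USER@HOST:.
```

Unlike the bare while loop, this gives up after a fixed number of attempts, but it does not change how rsync itself resumes the transfer.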

I was able to reproduce this bug under "laboratory conditions", so here are the exact steps of the experiment in which I intentionally reproduced it for this report.

My local system is Debian sid with a recent rsync 3.2.4, running in a Docker container on a very old Debian stretch host on my laptop. The remote system is Debian bullseye with a recent rsync 3.2.4 on an AWS VPS. This is a full-blown VPS, i.e. I have root access to it, and ssh starts a usual bash shell, not some restricted shell. Output of the local rsync --version:

rsync  version 3.2.4  protocol version 31
Copyright (C) 1996-2022 by Andrew Tridgell, Wayne Davison, and others.
Web site: https://rsync.samba.org/
Capabilities:
    64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
    socketpairs, symlinks, symtimes, hardlinks, hardlink-specials,
    hardlink-symlinks, IPv6, atimes, batchfiles, inplace, append, ACLs,
    xattrs, optional protect-args, iconv, prealloc, stop-at, no crtimes
Optimizations:
    SIMD-roll, no asm-roll, openssl-crypto, no asm-MD5
Checksum list:
    xxh128 xxh3 xxh64 (xxhash) md5 md4 none
Compress list:
    zstd lz4 zlibx zlib none

rsync comes with ABSOLUTELY NO WARRANTY.  This is free software, and you
are welcome to redistribute it under certain conditions.  See the GNU
General Public Licence for details.

Output of the remote rsync --version:

rsync  version 3.2.4  protocol version 31
Copyright (C) 1996-2022 by Andrew Tridgell, Wayne Davison, and others.
Web site: https://rsync.samba.org/
Capabilities:
    64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
    socketpairs, symlinks, symtimes, hardlinks, hardlink-specials,
    hardlink-symlinks, IPv6, atimes, batchfiles, inplace, append, ACLs,
    xattrs, optional protect-args, iconv, prealloc, stop-at, no crtimes
Optimizations:
    SIMD-roll, no asm-roll, openssl-crypto, no asm-MD5
Checksum list:
    xxh128 xxh3 xxh64 (xxhash) md5 md4 none
Compress list:
    zstd lz4 zlibx zlib none

rsync comes with ABSOLUTELY NO WARRANTY.  This is free software, and you
are welcome to redistribute it under certain conditions.  See the GNU
General Public Licence for details.

Then I created the file ur on the local system using this command: dd if=/dev/urandom of=/tmp/ur bs=1M count=1000. This resulted in a file of 1048576000 bytes. Then I typed this command on the local system:

while ! rsync --archive --progress --partial --timeout=20 -e 'ssh -i /tmp/0.pem' /tmp/ur [email protected]:.; do sleep 1; done

As you can see, I use --partial both for the initial invocation of rsync and for the following ones. Correct me if this is wrong.
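As a sanity check on the test file, the 1048576000-byte size follows from dd's bs=1M meaning 1024*1024 bytes, times count=1000. A minimal sketch of that check, where has_size is a hypothetical helper name:

```shell
#!/bin/sh
# Expected size of a file written by dd with bs=1M count=1000.
# dd's "M" suffix means 1024*1024 bytes.
expected_size=$((1000 * 1024 * 1024))
echo "$expected_size"   # prints 1048576000

# Hypothetical helper: succeeds if FILE ($1) is exactly SIZE ($2) bytes long.
has_size() {
    # The arithmetic expansion strips any padding that wc may print.
    actual=$(( $(wc -c < "$1") ))
    [ "$actual" -eq "$2" ]
}

# Example: has_size /tmp/ur "$expected_size" && echo "size matches"
```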

So, what happens?

At the beginning rsync starts for the first time and uploads the file normally. When it reaches 16%, I disconnect the internet (to reproduce the bug). rsync fails with a timeout, so the while loop starts it again. rsync fails again, because there is no internet. In this way rsync fails several more times. Then I reconnect the internet, the while loop restarts rsync one more time, and rsync starts to upload the file again. But rsync uploads it from scratch! The speed is the same as before, i.e. I don't see the effect of "now we only need checksums, not an actual upload".

Here is log:

root@b09ba6675f62:/# while ! rsync --archive --progress --partial --timeout=20 -e 'ssh -i /tmp/0.pem' /tmp/ur [email protected]:.; do sleep 1; done
sending incremental file list
ur
    168,460,288  16%    4.89MB/s    0:02:55  
[sender] io timeout after 20 seconds -- exiting
rsync error: timeout in data send/receive (code 30) at io.c(197) [sender=3.2.4]
ssh: connect to host 34.244.68.213 port 22: Network is unreachable
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(228) [sender=3.2.4]
ssh: connect to host 34.244.68.213 port 22: Network is unreachable
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(228) [sender=3.2.4]
ssh: connect to host 34.244.68.213 port 22: Network is unreachable
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(228) [sender=3.2.4]
ssh: connect to host 34.244.68.213 port 22: Network is unreachable
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(228) [sender=3.2.4]
ssh: connect to host 34.244.68.213 port 22: Network is unreachable
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(228) [sender=3.2.4]
ssh: connect to host 34.244.68.213 port 22: Network is unreachable
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(228) [sender=3.2.4]
ssh: connect to host 34.244.68.213 port 22: Network is unreachable
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(228) [sender=3.2.4]
ssh: connect to host 34.244.68.213 port 22: Network is unreachable
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(228) [sender=3.2.4]
sending incremental file list
ur
    477,560,832  45%    2.91MB/s    0:03:11  ^C
rsync error: unexplained error (code 255) at rsync.c(713) [sender=3.2.4]
^C
root@b09ba6675f62:/# ^C
root@b09ba6675f62:/# ^C
root@b09ba6675f62:/# ^C
root@b09ba6675f62:/#

The bug reproduces every time I try. If you have difficulty reproducing it, I can help. For example, I can report when files like .ur.Z5oY3j appear on the remote side.
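To see when such temporary files appear, something like the following could be run on the remote side between attempts. list_transfer_state is a hypothetical helper; it assumes the temporary files follow the .NAME.XXXXXX pattern (six trailing characters), as the .ur.Z5oY3j example suggests.

```shell
#!/bin/sh
# list_transfer_state DIR NAME
# Hypothetical diagnostic helper: prints "SIZE<TAB>FILENAME" for the
# destination file NAME and for every temporary file ".NAME.XXXXXX" in DIR.
list_transfer_state() {
    dir=$1
    name=$2
    for f in "$dir/$name" "$dir/.$name."??????; do
        # An unmatched glob stays literal, so skip paths that do not exist.
        if [ -e "$f" ]; then
            printf '%s\t%s\n' "$(( $(wc -c < "$f") ))" "${f##*/}"
        fi
    done
    return 0
}

# Example, run in the remote home directory: list_transfer_state . ur
```

Comparing the printed sizes before and after a restart would show whether the partially received data is still there when the next rsync begins.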

Maybe I'm using the wrong tool for my task? Or the wrong options?

safinaskar, Jun 20 '22 10:06

With --partial-dir the bug is reproducible, too (but not always). See https://github.com/WayneD/rsync/issues/519#issuecomment-1745159061

safinaskar, Oct 03 '23 15:10