
Triple vvv and above hangs rsync

Open dr-who opened this issue 3 years ago • 8 comments

Hi there,

We have had a case where the addition of an extra v (rsync -arvvv ... ...) caused rsync to hang. There are reports at https://bugs.launchpad.net/ubuntu/+source/rsync/+bug/1528921 with comments such as "I can only manually reproduce it when executing my rsync command with tripe verbosity:", so it's not just us.

Looking at options.c, six values are defined for char *debug_verbosity[] but only three are defined for char *info_verbosity[].

What happens when three v's are set? I note that adding 7 values to the char *info_verbosity array causes a compilation error, of course, but fewer than 6 does not. But what happens with the redirection with three v values, when elements 3, 4, and 5 are not defined?

dr-who avatar Sep 27 '21 23:09 dr-who

For your interest, as part of debugging I compiled with clang (clang version 7.0.1-8+deb10u2 (tags/RELEASE_701/final)) and added -fsanitize=memory. It reported errors such as:

root@xxx:~/rsync# ./rsync -ar /archives/ /mnt/stu/
==27592==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x51da98 in perform_io /root/rsync/io.c:758:7
    #1 0x524d40 in write_buf /root/rsync/io.c:2125:3
    #2 0x4a43c3 in send_file_entry /root/rsync/flist.c:565:2
    #3 0x4a43c3 in send_file_name /root/rsync/flist.c:1604
    #4 0x4a6b1d in send_directory /root/rsync/flist.c:1839:3
    #5 0x49b5b8 in send1extra /root/rsync/flist.c:1992:3
    #6 0x499f5b in send_extra_file_list /root/rsync/flist.c:2078:3
    #7 0x4d291a in send_files /root/rsync/sender.c:215:4
    #8 0x4ec71b in client_run /root/rsync/main.c:1321:3
    #9 0x4f46df in start_client /root/rsync/main.c:1586:8
    #10 0x4f46df in main /root/rsync/main.c:1834
    #11 0x7f6d40c4809a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
    #12 0x41fd49 in _start (/root/rsync/rsync+0x41fd49)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /root/rsync/io.c:758:7 in perform_io

dr-who avatar Sep 27 '21 23:09 dr-who

Did you test the latest git source? Why are you running -vvv? For a lot of verbosity, it's good to use --stderr=all (aka --msgs2stderr).

WayneD avatar Sep 27 '21 23:09 WayneD

Yes, this is the very latest source. Probably like most users, we just wanted to know exactly what was going on, and it was easier than adding an strace. We have >10TB file systems and large files, so it's nice to watch.

dr-who avatar Sep 27 '21 23:09 dr-who

This may be helpful to you.

static struct output_struct info_words[COUNT_INFO+1]

and adding the following assertion into parse_output_words:

              assert(j < COUNT_INFO+1);

I get the following with too many v options.

rsync: options.c:447: void parse_output_words(struct output_struct *, short *, const char *, unsigned char): Assertion `j < COUNT_INFO+1' failed.

dr-who avatar Sep 28 '21 00:09 dr-who

There's nothing wrong with the info_verbosity setup -- it has a defined size to make it the same as the debug_verbosity array, so the unmentioned initializers are all NULL (since it's static). Your assert fails because sometimes it is parsing the debug array, which has more items than the info array.

WayneD avatar Sep 29 '21 20:09 WayneD
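
The point about unmentioned initializers can be illustrated with a small standalone sketch. It is not rsync's code: the array name, its contents, the size macro, and both helper functions are hypothetical, chosen only to mirror the "six slots, three initializers" layout described above. It shows that the trailing slots are NULL rather than garbage, and that a lookup has to be bounded by the capacity of the array actually being read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size, mirroring the six-entry arrays discussed in the thread. */
#define MAX_LEVELS 6

/* Only three initializers are given. The C standard initializes the
 * remaining elements of the array to null pointers, so slots 3..5 are NULL. */
static const char *const sample_info_levels[MAX_LEVELS] = {
    "level0-words",
    "level1-words",
    "level2-words",
};

/* Count the leading entries that actually have a value. */
size_t count_defined_levels(const char *const *levels, size_t capacity)
{
    size_t n = 0;

    assert(levels != NULL);
    while (n < capacity && levels[n] != NULL)
        n++;
    return n;
}

/* Return the entry for a level, or NULL when the level has no
 * initializer or lies outside the array being consulted. */
const char *level_words(const char *const *levels, size_t capacity, size_t level)
{
    assert(levels != NULL);
    return level < capacity ? levels[level] : NULL;
}
```

With this layout, asking for level 3, 4, or 5 simply yields NULL, which a caller can treat as "nothing extra to enable at this level".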

The uninitialized error is a bit weird. It points to:

    if (iobuf.in_fd >= 0 && FD_ISSET(iobuf.in_fd, &r_fds)) {

which I thought shouldn't have any uninitialized memory because FD_ZERO(&r_fds) was called earlier, but maybe iobuf.in_fd is greater than the default FD_SETSIZE? It may only be 256 on some systems. I'll check into that. That could be a significant issue, but I thought that all the iobuf fd numbers were pretty low.

WayneD avatar Sep 29 '21 20:09 WayneD

It looked to me as though, with multiple v's, the INFO fields were added to the info_words[] structure until it was larger than COUNT_INFO+1. I'll keep hunting.

And yes, the uninitialized error did seem like a worry.

dr-who avatar Sep 29 '21 23:09 dr-who

I'll note that fixing this is not a very high priority, given the better choices of the -ii option and --msgs2stderr, especially since the only things that can really be done in the current protocol are to throw more memory at the backlog of messages and/or to slow down the transfer by not doing so much checksum work in the pipeline while waiting for the sender to answer back.

WayneD avatar Jan 02 '22 21:01 WayneD