rtorrent
Periodic peer disconnect and idling data transfer intervals
Hi All,
I am observing periodic interruptions in data transfer among peers that are using rtorrent. I am not sure if this is a bug, but I would appreciate it if you could help me understand the issue.
I am using rtorrent for file distribution within a private network, specifically the private subnets of a VPC on AWS. I run my own open-source bittorrent-tracker (https://github.com/webtorrent/bittorrent-tracker). My use case consists of a single seeding server that needs to distribute a ~50GB folder among ~300 machines within the same private network. The seed server (1) creates the torrent file based on the local folder and (2) shares this torrent file among all peers; then (3) each peer adds the torrent file to the /rtorrent/watch/start/ folder and connects to the same tracker server (within the same network). All ~300 peers start downloading at about the same time.
The problem that I am repeatedly observing is prolonged periods of complete idling among all peers in the network. For the first ~5 minutes everything works as expected, i.e., the seed server uploads at its maximum network bandwidth and all the peers receive pieces and also distribute chunks among themselves. After the initial ~5 minutes, all peers disconnect from each other and idle for 5-10 minutes. Eventually transfer resumes and lasts for another few minutes, only to disconnect again. Two important observations: (1) if I manually restart the source seed server during such idling, all transfer resumes for another few minutes; (2) the problem does not appear to be limited to the seed server, because during the initial period of data transfer (before the first idling) many peers manage to get the complete torrent, yet none of them share the data with the remaining peers during the idling period.
I am attaching log files from the seed server and one of the peer servers from one of my smaller scale experiments where I used only 20 peers.
In the log file I am seeing the following messages, but I am not sure about their relevance or how to debug them further.
Handshake dropped: seeder rejected.
Received error: message:7 network error.
Upload unchoked slots adjust; currently:10 adjust:1
I am using rTorrent v0.9.8 and RHL8 OS.
I would appreciate any guidance on what could be an issue here.
Thank you. server-log.log client-log.log
Did everything work as expected in the smaller test? If not, would you happen to have a log from a peer that didn't successfully get past the stall? Can you share your config?
Just to break down the log messages you mentioned a bit:
- `Handshake dropped: seeder rejected`: This can happen two ways, one of which is only possible when using magnet links. The other is when rTorrent receives a connection from a seeder when it's also a seeder, so this message seems pretty harmless.
- `Received error: message:7 network error.`: Unfortunately this can refer to a couple of different kinds of network errors, and the `master` branch has more specific strings. There are plenty of reasons this could happen during normal operation, so it's good to have for a timeline but otherwise doesn't tell much on its own.
- `Upload unchoked slots adjust; currently:10 adjust:1`: These are messages from the internal resource manager. By themselves, they're just informational messages telling you how many unchoked peer connections are active. Depending on your settings, it's unlikely but possible that rTorrent is clearing connections unnecessarily.
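To get a rough sense of how often each of these messages shows up, something like the following could be run over a log. It is only a sketch: it matches on the message text quoted above and assumes nothing else about the log line format; `count_messages` and `PATTERNS` are names made up for this example.

```python
from collections import Counter

# Message substrings quoted in this thread. The rest of each log line
# (timestamps, peer addresses, etc.) is deliberately not parsed.
PATTERNS = (
    "Handshake dropped: seeder rejected",
    "Received error: message:7 network error",
    "Upload unchoked slots adjust",
)


def count_messages(lines):
    """Count how many lines contain each known message substring."""
    counts = Counter({pattern: 0 for pattern in PATTERNS})
    for line in lines:
        for pattern in PATTERNS:
            if pattern in line:
                counts[pattern] += 1
    return counts
```

Passing an open file works, since it only needs an iterable of lines, e.g. `count_messages(open("client-log.log"))`.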
One funky thing I see in the logs that I don't think is normal is that within the space of a second, rtorrent is starting an outgoing connection, receiving an incoming connection from the same host, then declaring that both connections received a network error. It's possible there's some weird race condition that happens in low-latency networks. I assume all the clients are currently receiving the torrent at essentially the same time; would it be possible to try staggering the start across the servers?
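As a rough illustration of staggering, each peer could wait a host-specific delay before dropping the torrent file into its watch folder. This is a minimal sketch, not something rtorrent provides: `start_delay` and `staggered_add` are hypothetical helpers, and the window length is an arbitrary choice for whoever runs it.

```python
import hashlib
import shutil
import time
from pathlib import Path


def start_delay(hostname, window_seconds):
    """Map a hostname to a stable delay between 0 and window_seconds."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as a fraction of the window.
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    return fraction * window_seconds


def staggered_add(torrent_path, watch_dir, hostname, window_seconds):
    """Sleep for this host's delay, then copy the torrent into the watch folder."""
    time.sleep(start_delay(hostname, window_seconds))
    target = Path(watch_dir) / Path(torrent_path).name
    shutil.copy(torrent_path, target)
    return target
```

Hashing the hostname keeps each peer's delay the same across runs, which makes it easier to line up logs from repeated experiments than a random delay would.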
Hi @kannibalox
Thanks for the quick response and explanation of the messages!
I was able to reproduce this issue using a single seeding server and a single client. I am attaching both logs and the configuration that was used. In this experiment the client experienced a stall in less than a minute after starting downloading the file.
To answer your last question, I am already spreading start-up times across a 20-second interval; however, the objective is to distribute files as fast as possible. I can artificially slow the process further (say by 1-2 minutes), but the issue is still present in the smallest-scale tests.
Also, I don't want to divert this conversation from the original topic, but I have also observed several cases where a client shuts down halfway through downloading a file. I have observed this when the rtorrent client was launched as a detached daemon process. I am attaching this log file as client_error2.log in case it makes sense to you.
Let me know if I can provide any other debug information.
Thank you. seed_server1.log client1.log config.log client_error2.log
Hm, 20 seconds would be enough to prevent the behavior I was thinking of, and there's not anything else obvious in the 1-on-1 logs. My interest is sufficiently piqued that I may see if I can replicate it. Are there any other noteworthy details about your setup?
As for client_error2.log, that looks like a normal shutdown procedure. Those can be triggered by SIGINT or SIGHUP, or by RPC calls; see https://rtorrent-docs.readthedocs.io/en/latest/cmd-ref.html#term-system-shutdown-normal for more information. If rtorrent had encountered an error it couldn't handle, it would have just crashed hard instead.