qBittorrent icon indicating copy to clipboard operation
qBittorrent copied to clipboard

Force terminate libtorrent session after saving resume data

Open glassez opened this issue 1 year ago • 37 comments

The problem is that at the end of the session, libtorrent pedantically follows the protocol and tries to correctly de-announce torrents on all connected trackers. For various reasons, this may take a very long time, which causes dissatisfaction of users. qBittorrent is usually constantly running and exits when logging out or shutting down the computer, so if such problems occur, it prevents the system from shutting down, which is extremely unacceptable. Given that such graceful session shutdown is unimportant to the user, this patch modifies qBittorrent in such a way as to terminate the libtorrent session after resume data saving is completed. Since the network interaction takes place in parallel with the resume data saving, I added a small timeout that should be respected before the session is interrupted if the resume data is saved too quickly.

Closes #18697.

glassez avatar Mar 27 '23 14:03 glassez

The protocol is important to follow. It allows trackers to remove the peer from the database when the user sends the stopped event. It also allows private trackers to update the user's statistics between their last announce and when they stopped the torrent.

Roardom avatar Mar 28 '23 10:03 Roardom

It would seem far more preferable to have the client ignore trackers that have a status of "not working", rather than blindly ignore the spec for the benefit of users using broken trackers.

Audionut avatar Mar 28 '23 11:03 Audionut

It would seem far more preferable to have the client ignore trackers that have a status of "not working"

Who said that we are talking about "not working" trackers? It is unlikely that libtorrent sends stopped event to such trackers. You could look at #18697 for some details.

In fact, I could just make it optional so that everyone can choose what is more important to him - to comply with all the formalities of the protocol at all costs or turn off the computer without unacceptable delay.

Of course, it would be preferable to get some kind of support from libtorrent itself, so that it uses strictly limited timeouts for stopped event requests, so as to cut off only really "bad" trackers which do not respond within a reasonable time. But I'm not sure what @arvidn really thinks about it. After all, this problem is not new.

glassez avatar Mar 28 '23 12:03 glassez

Of course, it would be preferable to get some kind of support from libtorrent itself, so that it uses strictly limited timeouts for stopped event requests, so as to cut off only really "bad" trackers which do not respond within a reasonable time. But I'm not sure what @arvidn really thinks about it. After all, this problem is not new.

It turns out that qBittorrent/libtorrent even has an associated setting "Stop tracker timeout". I wonder what kind of influence it has in this or that case? For example, if there are a lot (N) of "stopped" requests to a "bad" tracker, how much time will it take in the end? N*T? Then this is hardly an effective solution. @Voxxel, could you check the value of "Stop tracker timeout" in your qBittorrent settings? Could you try to set it to 1sec? Could you then set it to 0 (IIRC, it should avoid to send stop events in this case)?

EDIT:

IIRC, it should avoid to send stop events in this case

If it really works in such a way that this PR does not make sense at all.

glassez avatar Mar 28 '23 13:03 glassez

Indeed, I didn't read far enough in the issue. Apologies.

Audionut avatar Mar 28 '23 13:03 Audionut

It is better to add an option so that the user himself chooses whether he wants to close the client faster without sending statistics to trackers.

stalkerok avatar Mar 28 '23 15:03 stalkerok

It is better to add an option so that the user himself chooses whether he wants to close the client faster without sending statistics to trackers.

https://github.com/qbittorrent/qBittorrent/pull/18772#issuecomment-1486851350

glassez avatar Mar 28 '23 15:03 glassez

I'm only saying that this feature can be improved, because in its current form, many people did not like it. In principle, I also agree, you should not make a forced session termination by default. But this feature is definitely needed.

stalkerok avatar Mar 28 '23 17:03 stalkerok

IIRC, it should avoid to send stop events in this case

If it really works in such a way that this PR does not make sense at all.

From the other point this patch has advantage over zero Stop tracker timeout method is that it still gives it the opportunity to send stopped events before interrupting the session. So I'll merge it anyway, but first I'll make it optional.

glassez avatar Mar 28 '23 18:03 glassez

the ways libtorrent tries to mitigate this issue are:

  • a shorter timeout for trackers when sending event=stopped (stop_tracker_timeout)
  • caching DNS lookups for trackers, and when shutting down, using the cached IP unconditionally, since DNS lookups can sometimes stall for a very long time. If (for some reason) the cached IP has been evicted, the tracker won't be stopped.
  • When initiating the shutdown, all currently outstanding tracker requests (except event=stopped ones) are aborted.
  • When initiating the shutdown, all queued DNS lookups for trackers (that aren't stopping a torrent) are cancelled.
  • If you want to simply not even try to send event=stopped event to trackers, you can set stop_tracker_timeout to 0

Another aspect to slow shut-downs are sessions with hundreds of active torrents, libtorrent will queue the tracker requests and only keep 50 (by default) announces outstanding at any given time. This can be controlled by (max_concurrent_http_announces). The issue people would run into before this logic was running out of memory because there was so many SSL contexts allocated for all the tracker requests.

Some torrents may also have multiple trackers that need to be stopped. libtorrent already tracks which trackers it successfully sent an event=started to, and will only try to stop those.

Also note that DNS lookups aren't cancelable. If a DNS lookup is stuck, it can hold up the queue for minutes. Killing the thread is really the only way to avoid that unfortunately.

arvidn avatar Mar 28 '23 21:03 arvidn

Also note that DNS lookups aren't cancelable. If a DNS lookup is stuck, it can hold up the queue for minutes. Killing the thread is really the only way to avoid that unfortunately.

Why? it seems libtorrent is using boost.asio for DNS lookup so it should support async lookups. I mean why not ditch an lookup query after it has exceed some timeout limit (5secs or configurable) and presume the host isn't reachable instead of waiting/locking up to minutes? It would be user's own problem if their DNS resolver/provider isn't performant and causes lookups to drop in this scenario of shutting down.

Chocobo1 avatar Mar 29 '23 07:03 Chocobo1

Why? it seems libtorrent is using boost.asio for DNS lookup so it should support async lookups.

because boost.asio uses underlying operating system calls to perform DNS lookups, and they are blocking. Specifically, getaddrinfo().

It does run this in separate threads, to emulate asyncronous operations, but that doesn't solve the cancel operation.

In theory it would be possible for asio to implement the DNS protocol itself, and make lookups async and cancelable, but last time I looked it doesn't. It would still need to query the system for the DNS configuration.

arvidn avatar Mar 29 '23 09:03 arvidn

@Voxxel, could you check the value of "Stop tracker timeout" in your qBittorrent settings? Could you try to set it to 1sec? Could you then set it to 0 (IIRC, it should avoid to send stop events in this case)?

Sorry just spotted this right now. So the default for "Stop tracker timeout" was on 5 sec. I lowered it to 0 sec, the exiting time went down to 4-5 sec but it then is unable to refresh my seed/stopped status for my properly working private torrents too. That's bad because my private torrent site has a strict seed policy. I then changed it to 1 sec : The exiting time went up to about 8-10 seconds but the tracker of my private torrents is once again able to refresh my status (which I can query on their website.)

Voxxel avatar Mar 29 '23 19:03 Voxxel

@arvidn

  1. Is DNS query considered as integral part of tracker announce accounted by max_concurrent_http_announces? Or such queries are performed independently?
  2. What does it do when you need to make several consecutive announcements on the same tracker (effectively, the tracker domain)? Will the DNS query be executed only once?

glassez avatar Mar 30 '23 04:03 glassez

  • caching DNS lookups for trackers, and when shutting down, using the cached IP unconditionally, since DNS lookups can sometimes stall for a very long time. If (for some reason) the cached IP has been evicted, the tracker won't be stopped.

Do all previously successfully announced trackers have a cached DNS lookups, or is cache limited in size/timeout, so some trackers still execute the DNS query while sending the stopped event?

glassez avatar Mar 31 '23 11:03 glassez

From the other point this patch has advantage over zero Stop tracker timeout method is that it still gives it the opportunity to send stopped events before interrupting the session.

Unfortunately, we may still find ourselves in a situation where exactly one of the first requests will be stuck and will hold the rest of the queue before thread is interrupted. Since private torrents need sending stopped events more, it would be nice if libtorrent stopped such torrents in the first place.

glassez avatar Mar 31 '23 12:03 glassez

Do all previously successfully announced trackers have a cached DNS lookups, or is cache limited in size/timeout, so some trackers still execute the DNS query while sending the stopped event?

There's a (hard coded) cap on the DNS cache size. It's currently 700 entries.

As long as you have no more than 700 different hostnames in trackers, every successful lookup will be cached.

To avoid shutdown stalls because of DNS, trackers are only announced to if they have an entry in this cache. There's a flag called cache_only, implemented here. It's potentially problematic to not have the cache size configurable, nor the shutdown behavior of trackers.

The rationale for doing it this was was to really prioritize avoiding shutdown stalls. I'm open to suggestions on how to improve it.

arvidn avatar Apr 03 '23 07:04 arvidn

I tried to exit qBittorent over 12 hours ago. It is still running and using network/disk.

bbeuning avatar Apr 03 '23 11:04 bbeuning

I attached MSVC 2019 to it, and this is what the threads are doing. After that are the thread stacks that are not in wait.

Not Flagged 30696 0 Worker Thread qbittorrent.exe!thread_start<unsigned int (__cdecl*)(void ),1> qbittorrent.exe!QWindowsFileSystemWatcherEngineThread::run Not Flagged 10532 0 Main Thread Main Thread win32u.dll!NtUserMsgWaitForMultipleObjectsEx Not Flagged 37928 0 Worker Thread qbittorrent.exe!thread_start<unsigned int (__cdecl)(void ),1> [Inline Frame] qbittorrent.exe!Concurrency::details::stl_condition_variable_win7::wait_for Not Flagged 34292 0 Worker Thread qbittorrent.exe!thread_start<unsigned int (__cdecl)(void ),1> [Inline Frame] qbittorrent.exe!Concurrency::details::stl_condition_variable_win7::wait_for Not Flagged 15016 0 Worker Thread qbittorrent.exe!thread_start<unsigned int (__cdecl)(void ),1> [Inline Frame] qbittorrent.exe!Concurrency::details::stl_condition_variable_win7::wait_for Not Flagged 31376 0 Worker Thread qbittorrent.exe!thread_start<unsigned int (__cdecl)(void ),1> win32u.dll!NtUserMsgWaitForMultipleObjectsEx Not Flagged 32724 0 Worker Thread qbittorrent.exe!thread_start<unsigned int (__cdecl)(void ),1> win32u.dll!NtUserMsgWaitForMultipleObjectsEx Not Flagged 22684 0 Worker Thread qbittorrent.exe!thread_start<unsigned int (__cdecl)(void ),1> win32u.dll!NtUserMsgWaitForMultipleObjectsEx Not Flagged 10960 0 Worker Thread qbittorrent.exe!thread_start<unsigned int (__cdecl)(void ),1> win32u.dll!NtUserMsgWaitForMultipleObjectsEx Not Flagged 29760 0 Worker Thread qbittorrent.exe!thread_start<unsigned int (__cdecl)(void ),1> [Inline Frame] qbittorrent.exe!Concurrency::details::stl_condition_variable_win7::wait_for Not Flagged 9212 0 Worker Thread winmm.dll!timeThread winmm.dll!timeThread Not Flagged 10592 0 Worker Thread mswsock.dll!SockAsyncThread mswsock.dll!SockAsyncThread Not Flagged 24388 0 Worker Thread qbittorrent.exe!thread_start<unsigned int (__cdecl)(void ),1> mswsock.dll!SockWaitForSingleObject Not Flagged 32620 0 Worker Thread qbittorrent.exe!thread_start<unsigned int (__cdecl)(void ),1> qbittorrent.exe!boost::asio::detail::win_iocp_io_context::do_one Not Flagged 29640 0 Worker Thread qbittorrent.exe!thread_start<unsigned int (__cdecl)(void ),1> win32u.dll!NtUserMsgWaitForMultipleObjectsEx Not Flagged 32116 0 Worker Thread qbittorrent.exe!thread_start<unsigned int (__cdecl)(void ),1> qbittorrent.exe!boost::asio::detail::win_iocp_io_context::do_one Not Flagged 19140 0 Worker Thread ntdll.dll!TppWorkerThread ntdll.dll!NtWaitForWorkViaWorkerFactory Not Flagged > 32556 0 Worker Thread qbittorrent.exe!thread_start<unsigned int (__cdecl)(void ),1> [Inline Frame] qbittorrent.exe!boost::asio::detail::win_iocp_io_context::timer_thread_function::operator() Not Flagged 19560 0 Worker Thread qbittorrent.exe!thread_start<unsigned int (__cdecl)(void ),1> win32u.dll!NtUserMsgWaitForMultipleObjectsEx Not Flagged 33440 0 Worker Thread qbittorrent.exe!thread_start<unsigned int (__cdecl)(void *),1> win32u.dll!NtUserMsgWaitForMultipleObjectsEx Not Flagged 26436 0 Worker Thread ntdll.dll!TppWorkerThread ntdll.dll!NtWaitForWorkViaWorkerFactory

[External Code]	
[Inline Frame] qbittorrent.exe!std::condition_variable::wait(std::unique_lock<std::mutex> &) Line 598	C++
qbittorrent.exe!libtorrent::disk_io_thread::wait_for_job(libtorrent::disk_io_thread::job_queue & jobq, libtorrent::disk_io_thread_pool & threads, std::unique_lock<std::mutex> & l) Line 3113	C++
qbittorrent.exe!libtorrent::disk_io_thread::thread_fun(libtorrent::disk_io_thread::job_queue & queue, libtorrent::disk_io_thread_pool & pool) Line 3137	C++
qbittorrent.exe!libtorrent::disk_io_thread::job_queue::thread_fun(libtorrent::disk_io_thread_pool & pool, boost::asio::io_context::work work) Line 402	C++
[External Code]	

qbittorrent.exe!thread_start<unsigned int (__cdecl*)(void *),1>(void * const parameter) Line 97 C++ [External Code]

[External Code]	
qbittorrent.exe!boost::asio::detail::win_iocp_io_context::do_one(unsigned long this_thread, boost::asio::detail::win_iocp_thread_info & ec, boost::system::error_code &) Line 434	C++
qbittorrent.exe!boost::asio::detail::win_iocp_io_context::run(boost::system::error_code & ec) Line 204	C++
[Inline Frame] qbittorrent.exe!boost::asio::detail::win_iocp_io_context::thread_function::operator()() Line 47	C++
qbittorrent.exe!boost::asio::detail::win_thread::func<boost::asio::detail::win_iocp_io_context::thread_function>::run() Line 123	C++
qbittorrent.exe!boost::asio::detail::win_thread_function(void * arg) Line 127	C++

qbittorrent.exe!thread_start<unsigned int (__cdecl*)(void *),1>(void * const parameter) Line 97 C++ [External Code]

[External Code]	
[Inline Frame] qbittorrent.exe!boost::asio::detail::win_iocp_io_context::timer_thread_function::operator()() Line 70	C++
qbittorrent.exe!boost::asio::detail::win_thread::func<boost::asio::detail::win_iocp_io_context::timer_thread_function>::run() Line 122	C++
qbittorrent.exe!boost::asio::detail::win_thread_function(void * arg) Line 127	C++

qbittorrent.exe!thread_start<unsigned int (__cdecl*)(void *),1>(void * const parameter) Line 97 C++ [External Code]

bbeuning avatar Apr 03 '23 12:04 bbeuning

@arvidn

The mechanisms listed above to deal with shutdown freezes due to "bad" trackers should cover most possible cases. But here's what's weird. The user reports that deleting non-working trackers still affects the delay, although it shouldn't. So my question is, is all of the above inherent in both v1.2 and v2.0 libtorrent branches?

glassez avatar Apr 03 '23 13:04 glassez

a tracker can be bad in other ways. In fact, if the DNS lookup is taking a long time I would think it is mostly a property of the DNS server being used. The tracker can be slow in responding for instance.

@bbeuning that stall of yours is a separate issue. In your case libtorrent is failing to shut down entirely, it's not just taking a long time.

arvidn avatar Apr 03 '23 22:04 arvidn

a tracker can be bad in other ways.

The case I mentioned above is:

I removed the two 'No such host' trackers (on the picture) and the delay suddenly reduced to 5-6 seconds.

So I suppose it should fall under:

libtorrent already tracks which trackers it successfully sent an event=started to, and will only try to stop those.

...since sending an event=started to those trackers seems to be failed.

glassez avatar Apr 04 '23 04:04 glassez

...since sending an event=started to those trackers seems to be failed.

Right, I would expect those host names to not have entries in the cache, and just not be announced to when shutting down. But perhaps they were stuck trying to announce start by the time you shut down.

arvidn avatar Apr 04 '23 10:04 arvidn

But perhaps they were stuck trying to announce start by the time you shut down.

Well, maybe. It seems to be impossible to be protected from all problems.

I still believe that the following could bring some improvement:

Since private torrents need sending stopped events more, it would be nice if libtorrent stopped such torrents in the first place.

glassez avatar Apr 04 '23 10:04 glassez

Since private torrents need sending stopped events more, it would be nice if libtorrent stopped such torrents in the first place.

Do you mean to prioritize private torrents, to announce stop to those first, before trying others?

arvidn avatar Apr 04 '23 11:04 arvidn

Do you mean to prioritize private torrents, to announce stop to those first, before trying others?

Yes. But in fact we are talking about prioritizing private torrents to be paused first once session pause is invoked.

glassez avatar Apr 04 '23 12:04 glassez

2. What does it do when you need to make several consecutive announcements on the same tracker (effectively, the tracker domain)? Will the DNS query be executed only once?

The answer to this question is important to understand whether a single "bad" tracker referred by many torrents can ruin everything.

glassez avatar Apr 04 '23 12:04 glassez

  1. What does it do when you need to make several consecutive announcements on the same tracker (effectively, the tracker domain)? Will the DNS query be executed only once?

The answer to this question is important to understand whether a single "bad" tracker referred by many torrents can ruin everything.

I mean, if it announces 50 torrents in a row on the same tracker, will it execute 50 DNS queries?

glassez avatar Apr 05 '23 04:04 glassez

I mean, if it announces 50 torrents in a row on the same tracker, will it execute 50 DNS queries?

If the DNS lookup fails, it will be re-tried 50 times, yes. It would probably make sense to cache failures as well. Now, I would kind of expect operating systems to do things like this, but it's probably not a safe assumption.

If the lookup succeeds, then the remaining 49 announces will use the cache. Since lookups are serialized, there's no overlap where the cache may not be available yet.

arvidn avatar Apr 05 '23 08:04 arvidn

It would probably make sense to cache failures as well.

👍 Temporarily preventing repeated DNS queries from being executed could be nice.

glassez avatar Apr 05 '23 08:04 glassez