qBittorrent icon indicating copy to clipboard operation
qBittorrent copied to clipboard

Don't check files when restoring torrents

Open glassez opened this issue 3 years ago • 25 comments

It redesigns the session restoring process, using a feature previously added to libtorrent from my submission, so as not to check files of torrents. Now errors with files (for example, missing files) will be detected only "on demand" (when you really need to read/write a file). This kind of emulates the continuity of the session, so there is no difference whether the file is corrupted while the application is running, or between its launches. This should save us from the old-fashioned problem with handling "missing files" state. It should also appeal to fans of so-called "archived" torrents, so that they can (temporarily) delete or move files of completed (downloaded and stopped) torrents without causing any problems. Since this may seem like a rather drastic change (which made me hesitate to implement it), I would like it to be thoroughly tested in advance before the next "feature" update (v4.5).

glassez avatar Mar 04 '22 08:03 glassez

The PR is marked as a "Draft" since I haven't had time to do the preliminary tests yet.

glassez avatar Mar 04 '22 08:03 glassez

Now errors with files (for example, missing files) will be detected only "on demand" (when you really need to read/write a file).

Doesn't happen currently. Moving a torrent elsewhere and starting qBt, that torrent is "Seeding" (same as before, i mean that was the state it was in before this PR). And i don't know if this is related to qBt but if i check another file in that torrent so it starts downloading, after a while i get a crash:

1  pthread_mutex_lock * *GLIBC_2.2.5               0x7ffff5ad48a4 
2  ??                                              0x7ffff7ea1b70 
3  ??                                              0x7ffff7c71618 
4  ??                                              0x7ffff7e89366 
5  ??                                              0x7ffff7c6a00f 
6  ??                                              0x7ffff7c67d69 
7  ??                                              0x7ffff7c60a36 
8  ??                                              0x7ffff7bd9456 
9  std::execute_native_thread_routine thread.cc 82 0x7ffff5e264d4 
10 start_thread                                    0x7ffff5ad15c2 
11 clone                                           0x7ffff5b56584 

thalieht avatar Mar 04 '22 12:03 thalieht

How about the concept itself? I want to know if it makes sense for me to continue with this.

glassez avatar Mar 07 '22 06:03 glassez

How about the concept itself?

I think it is fine. But are there any major drawbacks of it? If there are, then perhaps make it an option for everyone?

Chocobo1 avatar Mar 07 '22 07:03 Chocobo1

One major drawback is that you can fake seed torrents without having any data on disk.

ghost avatar Mar 11 '22 04:03 ghost

One major drawback is that you can fake seed torrents without having any data on disk.

Do you mean malicious behavior of users? Does it make sense? The torrent will fail on first read attempt.

glassez avatar Mar 11 '22 05:03 glassez

One major drawback is that you can fake seed torrents without having any data on disk.

Do you mean malicious behavior of users? Does it make sense? The torrent will fail on first read attempt.

Yes. But there may never be a read attempt if you fake seed very old torrents with no activity. It only makes sense on pvt trackers which provide seedbonus upon seeding.

ghost avatar Mar 11 '22 06:03 ghost

One major drawback is that you can fake seed torrents without having any data on disk.

Do you mean malicious behavior of users? Does it make sense? The torrent will fail on first read attempt.

Maybe a random check of file size on disk vs torrent file size could block cheaters. Currently having to recheck files over nas can take days if you have a torrents of over 15-20tb. So not rechecking until an error occurs/ a piece is not found is I think is good idea. A filename and size match should be good enough to stop most cheaters. Cheaters if they want to cheat can use other apps anyway so just to stop cheating not give regular users a better option should not be a concern

Dnkhatri avatar Mar 11 '22 12:03 Dnkhatri

@Dnkhatri This topic about checking of file existence and not about hash checking.

glassez avatar Mar 11 '22 17:03 glassez

One major drawback is that you can fake seed torrents without having any data on disk.

Do you mean malicious behavior of users? Does it make sense? The torrent will fail on first read attempt.

Yes. But there may never be a read attempt if you fake seed very old torrents with no activity. It only makes sense on pvt trackers which provide seedbonus upon seeding.

100% agree and can be sure there will be people that use this maliciously on private trackers.

HDVinnie avatar Mar 18 '22 10:03 HDVinnie

I installed this as a quick test with 135k torrents, startup time is about the same but it doesn't thrash my HDD the whole time.

oorzkws avatar Apr 01 '22 01:04 oorzkws

Giving this some thought - I think the optimization is worth considering. It's a pretty big change in load-time performance with many files.

That being said, about the concerns - is it really that different? I can leave qBittorrent open for weeks after deleting the file if I understand things correctly, this existence check is only on startup or when the file is requested.

Edit: I think this was meant to go here: https://github.com/qbittorrent/qBittorrent/pull/16840#issuecomment-1095338531

oorzkws avatar Apr 11 '22 12:04 oorzkws

Yes. But there may never be a read attempt if you fake seed very old torrents with no activity. It only makes sense on pvt trackers which provide seedbonus upon seeding.

Then what about making this PR only applicable to non-private torrents? Would that be possible/sensible?

Chocobo1 avatar Apr 15 '22 03:04 Chocobo1

Cheating with qb this way would be less efficient and perhaps more detectable (you'll stop announcing any time there's a downloader that can connect to you) than the dedicated cheating software out there all ready.

Forwarded from a private discussion of the changes.

oorzkws avatar Apr 15 '22 06:04 oorzkws

What if someone forgot to plug in their external hard drive and started qBt. They’d be bombarded with I/O error notifications as soon as peers start connecting.

ghost avatar Apr 15 '22 08:04 ghost

Doesn't make sense to me to reduce integrity checking. Putting aside bonus points, cheating and so forth, this patch says, lets just not care about the integrity of the files until it's potentially to late.

I've managed to catch quite the number of broken files on a system restart, thanks to the error reporting at client startup.

What about delaying the checking until some defined period after startup, and, being able to set a data rate limit for the process?

Also, an actual option that will verify files when manually selected?

Audionut avatar Apr 21 '22 09:04 Audionut

Would it be acceptable as an advanced setting, then? Many who seed multiple TB (myself included) would appreciate faster startup. There was a user on the qBittorrent discord today who didn't understand why opening qBittorrent was thrashing his disk for hours.

oorzkws avatar May 11 '22 10:05 oorzkws

What if someone forgot to plug in their external hard drive and started qBt. They’d be bombarded with I/O error notifications as soon as peers start connecting.

It would make sense that any IO error in this situation would prevent the torrent from continuing to seed. Not exactly a hard situation to deal with..

Ryrynz avatar Jun 10 '22 06:06 Ryrynz

Since this seems controversial, what about starting with a middle ground?

QB will check all files in a directory when checking regardless of if they are marked to download or seed. I just readded an old archived torrent over 1tb because I needed to redownload 1 small file. QB wanted to check all 1tb even though they were all marked "do not download".

Shouldn't the checking operation de-prioritize checking files marked to not download/seed then?

My solution was to create a new folder and pull the files I know I needed over there skipping the rest so I didn't wait all day for a check. It was messy because halfway through the check renaming the folder and setting a new location made a split location torrent and started moving the files I had no intention of including.

tastyratz avatar Jun 17 '22 18:06 tastyratz

This PR is not about file rechecking. It's about checking if files exist on startup. This checking does not equate to checking the hash of pieces. So take that discussion elsewhere, preferably in a issue ticket.

ghost avatar Jun 17 '22 18:06 ghost

@glassez what about a system where if a torrent was restored in paused state, libtorrent would not check for existing files. This would allow users that just want to keep an archieve of torrents at 100% completition without having any actual files on disk. However, as soon as such torrent is resumed, libtorrent would perform the skipped file existence check to ensure that data is on disk. I believe this would be one acceptable solution to this problem and also remove unwanted checking of inactive torrents on startup.

Another solution is to apply the changes from this PR only for public torrents.

Also you should take care of the cases where they may be a ton of I/O error notifications on disconnected drives on startup.

ghost avatar Jul 23 '22 06:07 ghost

@glassez what about a system where if a torrent was restored in paused state, libtorrent would not check for existing files.

I also had such an idea. I'll probably go that way when I come back here again.

glassez avatar Jul 23 '22 10:07 glassez

The best way would be to simply add a setting to enable/disable the file check at startup. I, personnaly, want the file check at startup.

tristanleboss avatar Sep 08 '22 21:09 tristanleboss

From what I've read, the only real downside is the possibility of "fake seeding" of inactive torrents, which could be abused on some private trackers. But AFAIK most torrent clients don't do file checks on startup and yet they're allowed and commonly used on private trackers... Correct me if I'm wrong, though.

dardoor avatar Sep 16 '22 09:09 dardoor

+1 to all who want this to be optional. I, for one, have ~38K torrents, around 150 Tb. Starting up qBittorrent takes ~30 minutes for me. Which I think is a fair price to pay for a little extra peace of mind. I know this check helped me quickly detect some issues in the past. So I would rather keep it.

That said, I also understand that some folks see this excessive and would like to skip this check.

However, if we end up making this check optional, it should be something you can change before starting the app. How? Well, maybe a command-line option would help. Or even something as quick and dirty as "if file XYZ is present in working directory, then go ahead and skip the checks and just start the app already."

baccccccc avatar Dec 04 '22 06:12 baccccccc

This PR is stale because it has been 60 days with no activity. This PR will be automatically closed within 7 days if there is no further activity.

github-actions[bot] avatar Sep 25 '23 00:09 github-actions[bot]

This PR was closed because it has been stalled for some time with no activity.

github-actions[bot] avatar Oct 02 '23 00:10 github-actions[bot]

Let's say I start downloading another torrent with the same name. I just hope that the new torrent doesn't overwrite the existing torrent data, with no checking of qbittorrent's. It sounds like that if the data is only checked when a torrent starts seeding, then perhaps the one torrent that fully downloaded may sit there for a long time, showing 100%, but yet the data is corrupted.

as-muncher avatar Oct 11 '23 00:10 as-muncher