
Low throughput

Open StyMaar opened this issue 8 years ago • 6 comments

I'm running fddf on Debian Jessie, and the I/O read rate (as shown by iotop) never goes above 3 MB/s. The task isn't CPU-bound either: usage is ~25% on each of the two cores. By comparison, ls -R reads between 10 and 15 MB per second, and so does rsync on the same workload.

The directory I'm running fddf on contains a lot of small files (text files), a large number of medium-sized files (pictures or mp3s) and a decent number of big files (movies or .iso images).

I have no idea how file I/O works on Linux, so I don't know how to speed this up.

StyMaar avatar Jul 24 '17 05:07 StyMaar

I'm running fdupes right now to get an «apples-to-apples» comparison.

StyMaar avatar Jul 24 '17 05:07 StyMaar

Oh, I forgot to say which version I'm running: master (b2da1856bb407339f2f8737f19bed42954d33286), built with Rust 1.19 (cargo build --release).

StyMaar avatar Jul 24 '17 05:07 StyMaar

fdupes is pretty irregular, but faster (1-10 MB/s).

StyMaar avatar Jul 24 '17 05:07 StyMaar

The raw figures aren't really interesting (it's a RAID array with encryption, which slows things down), but I think the difference from the other tools is relevant.

StyMaar avatar Jul 24 '17 06:07 StyMaar

Increasing the number of threads in the thread pool (I arbitrarily chose 20) helped me reach 10MB/s during the first part of the process (when walking the directories and hashing files), and during the second part (exact file comparison) I'm currently around 40MB/s. For the second part, I don't really know if increasing the number of threads changed anything.
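For anyone who wants to try the same experiment outside fddf, here is a minimal sketch of hashing files with a configurable number of worker threads. It is not fddf's actual code. The function names (`collect_files`, `hash_file`, `hash_all`) are made up for illustration, and `DefaultHasher` from the standard library is only a stand-in for whatever hash fddf really uses. The default of 20 threads is the arbitrary value mentioned above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs::{self, File};
use std::hash::Hasher;
use std::io::{self, Read};
use std::path::{Path, PathBuf};
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Recursively collect regular files under `root` (symlinks are skipped).
fn collect_files(root: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            collect_files(&entry.path(), out)?;
        } else if file_type.is_file() {
            out.push(entry.path());
        }
    }
    Ok(())
}

/// Hash a file's contents in 64 KiB chunks.
/// Assumption: DefaultHasher is a placeholder, not the hash fddf uses.
fn hash_file(path: &Path) -> io::Result<u64> {
    let mut file = File::open(path)?;
    let mut hasher = DefaultHasher::new();
    let mut buf = vec![0u8; 64 * 1024];
    loop {
        let n = file.read(&mut buf)?;
        if n == 0 {
            break;
        }
        hasher.write(&buf[..n]);
    }
    Ok(hasher.finish())
}

/// Hash every file using `n_threads` workers that pull paths from a shared queue.
/// Files that cannot be read are silently skipped.
fn hash_all(files: Vec<PathBuf>, n_threads: usize) -> Vec<(PathBuf, u64)> {
    let queue = Arc::new(Mutex::new(files.into_iter()));
    let (tx, rx) = mpsc::channel();
    let mut handles = Vec::new();
    for _ in 0..n_threads.max(1) {
        let queue = Arc::clone(&queue);
        let tx = tx.clone();
        handles.push(thread::spawn(move || loop {
            // The lock is released at the end of this statement,
            // so the file read below happens outside the lock.
            let next = queue.lock().unwrap().next();
            match next {
                Some(path) => {
                    if let Ok(hash) = hash_file(&path) {
                        tx.send((path, hash)).unwrap();
                    }
                }
                None => break,
            }
        }));
    }
    // Drop the original sender so the receiver ends once all workers finish.
    drop(tx);
    let results: Vec<(PathBuf, u64)> = rx.iter().collect();
    for handle in handles {
        handle.join().unwrap();
    }
    results
}

fn main() -> io::Result<()> {
    // Usage: <binary> [root_dir] [n_threads]
    let mut args = std::env::args().skip(1);
    let root = args.next().unwrap_or_else(|| ".".to_string());
    let n_threads = args.next().and_then(|s| s.parse().ok()).unwrap_or(20);
    let mut files = Vec::new();
    collect_files(Path::new(&root), &mut files)?;
    let hashed = hash_all(files, n_threads);
    println!("hashed {} files with {} threads", hashed.len(), n_threads);
    Ok(())
}
```

Running it with different thread counts while watching iotop should show whether more concurrent reads raise throughput on a given disk. The numbers will depend on the hardware, so treat any result as specific to that setup.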

StyMaar avatar Jul 24 '17 18:07 StyMaar

According to this benchmark my HDD (Seagate HDD 1TB, ST1000LM014) performs best when the number of outstanding IO operations is 32. Does that mean I should use a threadpool of 32+1 threads?

Boscop avatar Oct 01 '17 20:10 Boscop