Use concurrency in scanner.Walk

Open aminvakil opened this issue 3 years ago • 3 comments

First discussed in https://forum.syncthing.net/t/syncthing-does-not-use-all-cpu-cores/18667

TL;DR: Syncthing does not use all cores to read from the database.

As discussed on the forum, and if I have understood @calmh's suggestion correctly, folders with a large number of files (2 million in my case) would benefit from the following approach. Quoting @calmh:

The database supports concurrent reads, and the filesystem supports concurrent stats, so I could see it being a gain for you to have one walk routine that feeds filenames to a queue and multiple processing routines handling the files from that queue.

Please change my issue title any way you see fit.

aminvakil avatar Jul 05 '22 14:07 aminvakil

While looking to improve scanning performance on a Raspberry Pi 2, I came across this almost nine-year-old issue that already has some thoughts on this topic: #293. Maybe it can be of use to anybody taking a second look at this.

For reference: scanning 946 GB on a USB HDD with one core at 8.2 MB/s takes ~30 h out of the box, with no tuning applied.

ArthurusDent avatar Mar 27 '23 19:03 ArthurusDent

We have used all CPU cores for hashing since that issue was closed (six years ago). This issue is about using more cores in the walk stage, which is only useful when all data is already hashed but there are many files to check for changes.

calmh avatar Mar 27 '23 20:03 calmh

Ok, thanks for explaining. Will take my questions to the forum then.

ArthurusDent avatar Mar 27 '23 20:03 ArthurusDent