Poor first-time indexing performance
For larger repositories, the indexer struggles and takes many minutes to complete. For particularly large projects such as the Linux kernel, it effectively never finishes: I had it running for several hours on an AMD Ryzen 1800X and eventually had to abort it. Since #866 this only affects the first run, but it still prevents the search bar from being used until indexing is done, and on a laptop a long-running task like this is not ideal if it can be optimised substantially.
Looking at the code, there are a few improvements that can be made.
- Reading the indexer files could be mmap-ed, or at the very least buffered. Reading single bytes, as done in Index::readVInt, is anything but performant. I prototyped something using mmap, and that seems to shave off about 10-15% of the time.
- Similarly, writing small blocks is also pretty inefficient (this happens in Index::writeVInt, for example). I wonder if there's some buffering mechanism in Qt that could be used, but a prototype that buffers 64 KiB directly in the code shaves off about 15-20%.
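For illustration, the two ideas above could look roughly like the sketch below. This is not the prototype code and not the actual Index implementation: `BufferedWriter`, `MemoryReader` and `roundTrip` are hypothetical names, `std::ostream` stands in for whatever device the indexer really writes to, and the 7-bits-per-byte VInt layout is an assumption that would need to be checked against Index::readVInt and Index::writeVInt. The reader works on any contiguous byte range, so the memory could come from mmap (or possibly QFileDevice::map in Qt) or from a single large read.

```cpp
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical writer: collects bytes in memory and hands them to the sink
// in chunks of `capacity` bytes (64 KiB by default) instead of one at a time.
class BufferedWriter {
public:
  explicit BufferedWriter(std::ostream &sink, std::size_t capacity = 64 * 1024)
      : mSink(sink), mCapacity(capacity) {
    mBuffer.reserve(capacity);
  }
  ~BufferedWriter() { flush(); }

  void writeByte(std::uint8_t byte) {
    if (mBuffer.size() >= mCapacity)
      flush();
    mBuffer.push_back(static_cast<char>(byte));
  }

  // Assumed encoding: 7 payload bits per byte, low bits first,
  // high bit set on every byte except the last one.
  void writeVInt(std::uint32_t value) {
    while (value > 0x7F) {
      writeByte(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
      value >>= 7;
    }
    writeByte(static_cast<std::uint8_t>(value));
  }

  void flush() {
    if (!mBuffer.empty()) {
      mSink.write(mBuffer.data(), static_cast<std::streamsize>(mBuffer.size()));
      mBuffer.clear();
    }
  }

private:
  std::ostream &mSink;
  std::size_t mCapacity;
  std::vector<char> mBuffer;
};

// Hypothetical reader over a contiguous byte range, e.g. a memory-mapped file.
class MemoryReader {
public:
  MemoryReader(const std::uint8_t *data, std::size_t size)
      : mPos(data), mEnd(data + size) {}

  bool atEnd() const { return mPos == mEnd; }

  // Returns false if the input is truncated or longer than 5 bytes.
  bool readVInt(std::uint32_t &value) {
    value = 0;
    for (int shift = 0; shift < 35; shift += 7) {
      if (mPos == mEnd)
        return false;
      const std::uint8_t byte = *mPos++;
      value |= static_cast<std::uint32_t>(byte & 0x7F) << shift;
      if (!(byte & 0x80))
        return true;
    }
    return false;
  }

private:
  const std::uint8_t *mPos;
  const std::uint8_t *mEnd;
};

// Usage sketch: encode through the buffered writer, then decode from memory.
std::vector<std::uint32_t> roundTrip(const std::vector<std::uint32_t> &values) {
  std::ostringstream sink;
  {
    BufferedWriter writer(sink);
    for (std::uint32_t v : values)
      writer.writeVInt(v);
  } // the destructor flushes the remaining bytes
  const std::string bytes = sink.str();
  MemoryReader reader(reinterpret_cast<const std::uint8_t *>(bytes.data()),
                      bytes.size());
  std::vector<std::uint32_t> decoded;
  std::uint32_t v = 0;
  while (!reader.atEnd() && reader.readVInt(v))
    decoded.push_back(v);
  return decoded;
}
```

The point of both classes is the same: the per-value call stays cheap (a bounds check and a pointer or vector access), and the expensive I/O call happens once per chunk rather than once per byte.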
In total, this shaves off about 30% of the time needed when indexing something like poky. On my machine, running the indexer without the background argument (i.e. directly on the command line), that translates to ~3 minutes instead of ~4 minutes. With the background argument I assume the difference is bigger.
@Murmele Would it be of interest to create pull requests for these changes? The code needs a decent clean-up, hence me asking.
PS: I've not looked at the connection to the search bar, but obviously, point 1 will improve the performance there as well.