Improve performance of excluded files filter
The current algorithm works roughly as "collect all included files, then subtract all excluded files". Collecting both sets requires traversing the file system, which becomes slow when the exclude patterns resolve to a large number of files.
The new approach collects only the lintable files and checks each of them against the exclude patterns. This check is an in-memory regular expression match on the path strings, so the exclusion step needs no further file system access.
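
To illustrate the idea (not the actual code in this PR), here is a minimal sketch of filtering already-collected paths purely in memory. The helper name `filterExcluded` and the sample paths are hypothetical, and the exclude patterns are assumed to have been converted to regular expressions already.

```swift
import Foundation

/// Hypothetical helper: keeps only the paths that match none of the exclude regexes.
/// It works on strings only, so no file system access is involved.
func filterExcluded(_ paths: [String], excludePatterns: [String]) -> [String] {
    // Compile every pattern once up front. Invalid patterns are silently skipped in this sketch.
    let regexes = excludePatterns.compactMap { try? NSRegularExpression(pattern: $0) }
    return paths.filter { path in
        let range = NSRange(path.startIndex..., in: path)
        // A path is kept if no exclude regex finds a match in it.
        return !regexes.contains { $0.firstMatch(in: path, options: [], range: range) != nil }
    }
}

// Sample data, made up for illustration.
let lintable = ["Sources/App/Main.swift", "Sources/Generated/API.swift", "Tests/AppTests.swift"]
let kept = filterExcluded(lintable, excludePatterns: ["^Sources/Generated/.*$"])
print(kept)
```

The cost of this step scales with the number of lintable files times the number of patterns, independent of how many files the exclude patterns would resolve to on disk.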
The most critical part is the conversion of glob patterns to regular expressions. I might have missed cases.
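
For reviewers, a rough sketch of what such a conversion can look like. This is not the implementation in this PR; `regexPattern(fromGlob:)` and `matches(_:glob:)` are hypothetical names, and the glob semantics are assumptions: `**/` spans zero or more directories, `**` matches anything, `*` and `?` do not cross `/`. Character classes, brace expansion and escapes are deliberately left out, which is exactly the kind of case that is easy to miss.

```swift
import Foundation

/// Hypothetical converter from a glob to an anchored regular expression.
/// Assumed semantics: `**/` = zero or more directories, `**` = anything,
/// `*` = any run of non-`/` characters, `?` = a single non-`/` character.
func regexPattern(fromGlob glob: String) -> String {
    var result = "^"
    let chars = Array(glob)
    var i = 0
    while i < chars.count {
        let c = chars[i]
        if c == "*" {
            if i + 1 < chars.count, chars[i + 1] == "*" {
                if i + 2 < chars.count, chars[i + 2] == "/" {
                    // `**/`: optionally match any number of leading directories.
                    result += "(?:.*/)?"
                    i += 3
                } else {
                    // Bare `**`: match anything, including `/`.
                    result += ".*"
                    i += 2
                }
                continue
            }
            // Single `*`: stay within one path component.
            result += "[^/]*"
        } else if c == "?" {
            result += "[^/]"
        } else {
            // Escape everything else so that `.`, `+`, `(` etc. are matched literally.
            result += NSRegularExpression.escapedPattern(for: String(c))
        }
        i += 1
    }
    return result + "$"
}

/// Hypothetical convenience wrapper: true if the whole path matches the glob.
func matches(_ path: String, glob: String) -> Bool {
    path.range(of: regexPattern(fromGlob: glob), options: .regularExpression) != nil
}

print(matches("Sources/A/B/C.swift", glob: "Sources/**/*.swift"))
```

Note that other matchers choose different semantics (Python's `fnmatch`, for example, lets `*` match `/` as well), so the behavior has to be compared against what the existing glob resolution does.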
Fixes #5018.
| 17 Messages | |
|---|---|
| :book: | Linting Aerial with this PR took 0.91s vs 0.91s on main (0% slower) |
| :book: | Linting Alamofire with this PR took 1.26s vs 1.26s on main (0% slower) |
| :book: | Linting Brave with this PR took 7.38s vs 7.36s on main (0% slower) |
| :book: | Linting DuckDuckGo with this PR took 4.29s vs 4.28s on main (0% slower) |
| :book: | Linting Firefox with this PR took 10.74s vs 10.65s on main (0% slower) |
| :book: | Linting Kickstarter with this PR took 9.25s vs 9.23s on main (0% slower) |
| :book: | Linting Moya with this PR took 0.52s vs 0.53s on main (1% faster) |
| :book: | Linting NetNewsWire with this PR took 2.51s vs 2.51s on main (0% slower) |
| :book: | Linting Nimble with this PR took 0.74s vs 0.75s on main (1% faster) |
| :book: | Linting PocketCasts with this PR took 8.16s vs 8.28s on main (1% faster) |
| :book: | Linting Quick with this PR took 0.43s vs 0.43s on main (0% slower) |
| :book: | Linting Realm with this PR took 4.6s vs 4.6s on main (0% slower) |
| :book: | Linting Sourcery with this PR took 2.33s vs 2.33s on main (0% slower) |
| :book: | Linting Swift with this PR took 4.53s vs 4.53s on main (0% slower) |
| :book: | Linting VLC with this PR took 1.23s vs 1.23s on main (0% slower) |
| :book: | Linting Wire with this PR took 17.81s vs 17.57s on main (1% slower) |
| :book: | Linting WordPress with this PR took 11.88s vs 11.83s on main (0% slower) |
Generated by :no_entry_sign: Danger
Periphery had a similar performance issue not long ago, and I noticed there wasn't a solid glob to regex implementation, so I ported Python's fnmatch to Swift: https://github.com/ileitch/swift-filename-matcher. It might be useful here too.
Last time we tried to speed this up, it caused some slight differences in the result of what was matched vs not, so please be super careful here.
At the moment, I'm rather concerned that normal runs without any excludes or includes sometimes seem to become much slower.
> Periphery had a similar performance issue not long ago, and I noticed there wasn't a solid glob to regex implementation, so I ported Python's fnmatch to Swift: https://github.com/ileitch/swift-filename-matcher. It might be useful here too.

This is a very helpful tip. I don't want to invent a half-baked version myself. Thanks!
Any chance this can land? 🙏
> Any chance this can land? 🙏

This is a critical change that needs thorough testing. Unfortunately, I don't have projects of my own with intricate `included` and `excluded` configurations.
@JaviSoto: If this change makes a difference in your projects, I'd appreciate your feedback.