glob
glob copied to clipboard
Cache file names in fill_todo
Background:
While working with cargo, I've noticed that it takes ~30s to cargo clean -p with large enough target directory (~200GB). Under a profiler, it turned out that most of the time was spent retrieving paths for removal via glob::glob in rm_rf_glob (and not actually removing the files).
Change description:
In call to .sort_by, we repetitively parse the paths to obtain file names for comparison. This commit caches file names in PathWrapper object, akin to #135 that did so for dir info.
For my use case, a cargo build using that branch takes ~14s to clean files instead of previous 30s (I've measured against main branch of this repository, to account for changes made since glob 0.3.1). Still not ideal, but hey, we're shaving 50% of time off for a bit heavier memory use.
I benchmarked this locally and this improves the speed (on top of https://github.com/rust-lang/glob/pull/135) on the rust-lang/rust repository by 20%, which is quite nice.
@the8472 Could you please take a look? The patch looks good to me.
Huh, the CI failure seems a bit spurious to me..
Uh-oh, this repo still runs CI on Rust 1.23.0, but one of the dev-dependencies has upgraded to a version that uses a too new Rust feature.
Could we follow suit and add a sort_paths field to MatchOptions that'd default to true?
Yeah that'd make sense I think but should be done in a separate PR since the fields of MatchOptions are all pub so it'd be a breaking change and needs a new version.
What you can try for this PR is to check if unstable sort would provide another speedup. Since filenames in a directory should be unique (barring pathological filesystems) we don't need a stable sort.
You should also rebase your PR to get the CI fixes.
Yeah that'd make sense I think but should be done in a separate PR since the fields of
MatchOptionsare all pub so it'd be a breaking change and needs a new version.
The next breaking change should probably also mark this #[non_exhaustive]. Which requires bumping the MSRV to at least 1.40, but I think that's unobjectionable anyway.
sort_unstable_by doesn't seem to be substantially faster for rust-lang/rust repro used for #135.
Is there anything I could do to push this PR forward?
Sorry, I totally forgot about this. I iterated on the design a little bit, and there's probably a lot of improvements that could be done here, e.g. using arena allocation. But what I noticed in my benchmarks that really the most expensive part is not the allocation itself, but rather iterating over the components of a path, especially if it has a bunch of nested directories.
I think that one improvement could also be to get rid of sorting or use some smarter sorting mechanism, such as popping things in a sorted order, using a heap or something like that.
But as the first step, I would like to propose a middle ground that is smaller than this PR and achieves almost the same gain. The thing to note is that we only ever need the filename for sorting in fill_todo, so it is wasteful to store it in PathWrapper for all paths. Furthermore, if we extract the filename from DirEntry, it's actually a pretty cheap operation (as the filename is already known, and we "just" need to heap allocate it). So we can only extract the filename in this specific situation.
I tried to implement this here: https://github.com/rust-lang/glob/compare/master...Kobzol:glob:cache-filename?expand=1 This reduced the find to grep for all .rs files in a rustc checkout from ~1000ms to ~800ms. Let me know what you think.
That sounds ok to me.
I did also benchmarks on this branch and my solution seems to be faster on rust-lang/rust. So I'd rather go forward with that, I'll open a PR.
Thanks for finding this, btw. Eventually we might want to remove the sorting altogether in 1.0, but this is a pretty nice speedup for a small code change.