glob icon indicating copy to clipboard operation
glob copied to clipboard

Cache file names in fill_todo

Open osiewicz opened this issue 1 year ago • 8 comments

Background: While working with cargo, I've noticed that it takes ~30s to cargo clean -p with large enough target directory (~200GB). Under a profiler, it turned out that most of the time was spent retrieving paths for removal via glob::glob in rm_rf_glob (and not actually removing the files). image

Change description: In call to .sort_by, we repetitively parse the paths to obtain file names for comparison. This commit caches file names in PathWrapper object, akin to #135 that did so for dir info.

For my use case, a cargo build using that branch takes ~14s to clean files instead of previous 30s (I've measured against main branch of this repository, to account for changes made since glob 0.3.1). Still not ideal, but hey, we're shaving 50% of time off for a bit heavier memory use.

osiewicz avatar Apr 27 '24 11:04 osiewicz

I benchmarked this locally and this improves the speed (on top of https://github.com/rust-lang/glob/pull/135) on the rust-lang/rust repository by 20%, which is quite nice.

@the8472 Could you please take a look? The patch looks good to me.

Kobzol avatar Dec 23 '24 11:12 Kobzol

Huh, the CI failure seems a bit spurious to me..

osiewicz avatar Dec 23 '24 12:12 osiewicz

Uh-oh, this repo still runs CI on Rust 1.23.0, but one of the dev-dependencies has upgraded to a version that uses a too new Rust feature.

Kobzol avatar Dec 23 '24 12:12 Kobzol

Could we follow suit and add a sort_paths field to MatchOptions that'd default to true?

osiewicz avatar Dec 30 '24 01:12 osiewicz

Yeah that'd make sense I think but should be done in a separate PR since the fields of MatchOptions are all pub so it'd be a breaking change and needs a new version.

What you can try for this PR is to check if unstable sort would provide another speedup. Since filenames in a directory should be unique (barring pathological filesystems) we don't need a stable sort.

the8472 avatar Dec 30 '24 01:12 the8472

You should also rebase your PR to get the CI fixes.

the8472 avatar Dec 30 '24 01:12 the8472

Yeah that'd make sense I think but should be done in a separate PR since the fields of MatchOptions are all pub so it'd be a breaking change and needs a new version.

The next breaking change should probably also mark this #[non_exhaustive]. Which requires bumping the MSRV to at least 1.40, but I think that's unobjectionable anyway.

tgross35 avatar Dec 30 '24 01:12 tgross35

sort_unstable_by doesn't seem to be substantially faster for rust-lang/rust repro used for #135.

osiewicz avatar Dec 30 '24 09:12 osiewicz

Is there anything I could do to push this PR forward?

osiewicz avatar Oct 21 '25 08:10 osiewicz

Sorry, I totally forgot about this. I iterated on the design a little bit, and there's probably a lot of improvements that could be done here, e.g. using arena allocation. But what I noticed in my benchmarks that really the most expensive part is not the allocation itself, but rather iterating over the components of a path, especially if it has a bunch of nested directories.

I think that one improvement could also be to get rid of sorting or use some smarter sorting mechanism, such as popping things in a sorted order, using a heap or something like that.

But as the first step, I would like to propose a middle ground that is smaller than this PR and achieves almost the same gain. The thing to note is that we only ever need the filename for sorting in fill_todo, so it is wasteful to store it in PathWrapper for all paths. Furthermore, if we extract the filename from DirEntry, it's actually a pretty cheap operation (as the filename is already known, and we "just" need to heap allocate it). So we can only extract the filename in this specific situation.

I tried to implement this here: https://github.com/rust-lang/glob/compare/master...Kobzol:glob:cache-filename?expand=1 This reduced the find to grep for all .rs files in a rustc checkout from ~1000ms to ~800ms. Let me know what you think.

Kobzol avatar Oct 22 '25 08:10 Kobzol

That sounds ok to me.

osiewicz avatar Oct 22 '25 10:10 osiewicz

I did also benchmarks on this branch and my solution seems to be faster on rust-lang/rust. So I'd rather go forward with that, I'll open a PR.

Kobzol avatar Oct 23 '25 07:10 Kobzol

Thanks for finding this, btw. Eventually we might want to remove the sorting altogether in 1.0, but this is a pretty nice speedup for a small code change.

Kobzol avatar Oct 23 '25 07:10 Kobzol