cargo
cargo copied to clipboard
Clean package perf improvements
What does this PR try to resolve?
I've noticed that cargo clean -p
execution time scales poorly with size of target directory; in my case (~250GB target directory on M1 Mac) running cargo clean -p
takes circa 35 seconds. Notably, it's the file listing that takes that time, not deleting the package itself. That is, when running cargo clean -p SOME_PACKAGE
twice in a row, both executions take roughly the same time.
I've tracked it down to the fact that we seem quite happy to use glob::glob
function, which iterates over contents of target dir. It also was a bit sub-optimal when it came to doing that, for which I've already filled a PR in https://github.com/rust-lang/glob/pull/144 - that PR alone takes down cleaning time down to ~14 seconds. While it is a good improvement for a relatively straightforward change, this PR tries to take it even further. With glob PR applied + changes from this PR, my test case goes down to ~6 seconds. I'm pretty sure that we could squeeze this further, but I'd rather do so in a follow-up PR.
Notably, this PR doesn't help with just super-large target directories. cargo clean -p serde
on cargo repo (with ~7Gb target directory size) went down from ~380ms to ~100ms for me. Not too shabby.
How should we test and review this PR?
I've mostly tested it manually, running cargo clean
against multiple different repos.
Additional information
TODO:
- [x] c770700 is not quite correct; we need to consider that it changes how progress reporting works; as is, we're gonna report all progress relatively quickly and stall at the end (when we're actually iterating over directories, globbing, removing files, that kind of jazz). I'll address that.
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @ehuss (or someone else) some time within the next two weeks.
Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review
and S-waiting-on-author
) stays updated, invoking these commands when appropriate:
-
@rustbot author
: the review is finished, PR author should check the comments and take action accordingly -
@rustbot review
: the author is ready for a review, this PR will be queued again in the reviewer's queue
#5931 will require re-working the target directory to be organized around package names. Depending on the complexity this entails, I wonder if its better to wait on that. Granted, my bias is that I never use cargo clean -p
and instead do cargo clean
.
Ah, I see; thanks for pointing that issue out. That's definitely another good argument for not squeezing every last bit out of cargo clean -p
if it's gonna be reworked in near-ish term.
The only pending item I want to address is that TODO for progress reporting; I expect this PR to be ready for review tomorrow-ish.
@bors r+
:pushpin: Commit 9c1139ec72b7fbc9bfc8678a7c50276a533c7765 has been approved by weihanglo
It is now in the queue for this repository.
:hourglass: Testing commit 9c1139ec72b7fbc9bfc8678a7c50276a533c7765 with merge 27d3e3d603e75d96c9a4bd94d651fc8445c63536...
:sunny: Test successful - checks-actions Approved by: weihanglo Pushing 27d3e3d603e75d96c9a4bd94d651fc8445c63536 to master...