Ensure discoverability of automatic garbage collection

Open matthieu-m opened this issue 2 years ago • 6 comments

Problem

Cache cleaning is coming to Cargo, which I am really excited about.

Ideally, it should (once mature) be opt-out: active by default, with reasonable defaults, unless a user configures it. But how is a new user ever supposed to know the feature exists and that it can be configured?

This may catch users unaware, wiping out their crate cache behind their back just before they embark on a journey during which they wanted to revive an old project while sitting on a train or plane, in a cabin in the woods, etc.

Proposed Solution

The GC feature should be discoverable, so that new users, even those who didn't read the documentation, or who forgot about it, are regularly reminded about its existence.

A potential solution would be to remind users of the feature... whenever the feature kicks in. According to the article:

When you run cargo, once a day it will inspect the last-use cache tracker, and determine if any cache elements have not been used in a while.

It would be very useful if, whenever automatic cleaning runs to determine what to clean, it printed:

  • Either a single line indicating there's nothing to clean.
  • Or, if it does clean anything, one line per crate/version that is cleaned.

Since this feedback is not time-sensitive, it would be fine -- should the cleaning run in the background -- to provide it on the next run of cargo after cleaning completes instead. It would still be only once a day, and it would still make the feature discoverable by new users.
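To make the proposal concrete, here is a minimal sketch of the reporting rule described above: one line when nothing was cleaned, otherwise one line per crate/version. The `Removed` type, the `report_lines` function, and the message wording are all hypothetical and not part of Cargo.

```rust
/// Hypothetical record of one cache entry removed by automatic cleaning.
/// (Illustrative only; not a Cargo type.)
pub struct Removed {
    pub name: String,
    pub version: String,
}

/// Builds the lines to print after an automatic cleaning pass:
/// a single line when nothing was removed, otherwise one line
/// per crate/version that was removed.
pub fn report_lines(removed: &[Removed]) -> Vec<String> {
    if removed.is_empty() {
        // Assumed wording; the issue does not prescribe an exact message.
        return vec!["cache cleaning: nothing to clean".to_string()];
    }
    removed
        .iter()
        .map(|r| format!("cache cleaning: removed {} v{}", r.name, r.version))
        .collect()
}
```

Because the function only builds strings, the caller is free to print them immediately or to store them and print them on the next run of cargo, as suggested for background cleaning.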

Notes

Bottom of the reddit comment chain which inspired this issue.

matthieu-m • Dec 15 '23 17:12

This may catch users unaware, wiping out their crate cache behind their back just before they embark on a journey during which they wanted to revive an old project while sitting on a train or plane, in a cabin in the woods, etc.

If the focus is on helping people when they go offline, would #13137 prevent people from running into that problem, so that it's not needed?

As suggested, my biggest concern is bloated output. People already complain about how noisy cargo is.

epage • Dec 15 '23 17:12

#13137 would certainly help alleviate the most common case of this. But if I have a project versioned in git with e.g. feature branches that pull in additional dependencies, I don't think there's a good way for cargo to detect that without this feature getting significantly more complex.

And even if cargo went so far as to search git branches, what if I'm using a VCS other than git? What if I keep some of my projects on an external USB drive? What if I simply moved my project folders around while reorganizing my home directory?

The reality is that Cargo doesn't (and can't) have control over the things that use its cache, and so it fundamentally can't (reliably) know if any item in the cache is actually no longer used by something. And trying to make the autoclean feature more sophisticated to handle more and more cases is never going to be enough, and would likely become a significant maintenance burden even if that were attempted.

So it would be both easier and (in some sense) more robust to just take the simple approach of ensuring that the user is informed and can opt out if needed/desired for their usage patterns.

As suggested, my biggest concern is bloated output. People already complain about how noisy cargo is.

Yeah, noise can be an issue. I would suggest just keeping the message short. Maybe something along the lines of:

Cargo deleted 7 crates from its local cache that haven't been used in the last 3 months.
See <link to relevant manual page> for cargo's garbage collection configuration options.
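A rough sketch of how such a short notice could be assembled, assuming the caller already knows how many crates were deleted and the age threshold in months. The `summary_notice` function and its parameters are hypothetical, and `docs_url` stands in for the unspecified manual link.

```rust
/// Formats the short two-line notice suggested above, or returns `None`
/// when nothing was deleted so that no extra output is produced.
/// (Hypothetical helper; not part of Cargo.)
pub fn summary_notice(deleted: usize, months: u32, docs_url: &str) -> Option<String> {
    if deleted == 0 {
        return None;
    }
    // Keep the grammar correct for the singular case.
    let (noun, verb) = if deleted == 1 {
        ("crate", "hasn't")
    } else {
        ("crates", "haven't")
    };
    let period = if months == 1 {
        "month".to_string()
    } else {
        format!("{} months", months)
    };
    Some(format!(
        "Cargo deleted {} {} from its local cache that {} been used in the last {}.\n\
         See {} for cargo's garbage collection configuration options.",
        deleted, noun, verb, period, docs_url
    ))
}
```

Returning `None` for the empty case keeps the output silent on days when nothing is cleaned, which addresses the concern about noise at the cost of fewer reminders.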

cessen • Dec 15 '23 19:12

Alternatively, when -Zgc is disabled, we can emit something like what git does:

warning: There are too many unreachable loose objects; run 'git prune' to remove them

So that people are aware of the feature even when they intentionally disabled it.

weihanglo • Feb 07 '24 20:02

We could start by calling it "cache cleaning" rather than "garbage collection", which already means something else?

airstrike • Apr 21 '24 18:04

Yeah, that's a good point: garbage collection generally means cleaning stuff up that is provably unreachable/useless (garbage), whereas this is indeed a cache and (as outlined in this issue) its items may still have use.

cessen • Apr 23 '24 05:04

Note that naming is also being discussed on #12633. We likely don't need that discussion in two places, and this issue is more about how to raise visibility than about what name will catch people's attention.

epage • Apr 23 '24 10:04