Add `must-use-output` attribute
This adds an RFC proposing a new attribute that would help detect mistakes when a parameter of a function is not used after the function returns. E.g. when inserting into a collection that is then immediately dropped.
Cc @joshtriplett
I've incorporated all suggestions I completely agree with.
Thanks @kpreid ! I've addressed your feedback. Also @joshtriplett thanks for your review so far, I have also addressed the other rewording.
While I like the idea of this feature, I do wonder how many custom attributes we're going to have to add to the language before we just decide to reintroduce the idea of labeling functions as pure.
@clarfonthey note that if we had pure attribute/keyword it wouldn't help here because, according to Wikipedia:
the function has no side effects (no mutation of local static variables, non-local variables, mutable reference arguments or input/output streams)
An attribute that disallows side effects except for allocation and mutation of parameters might be interesting but we would need to invent a name for it. :)
I personally think that it's okay to consider a function pure if its mutations are explicitly defined; the issue with purity is when they're not defined, i.e. external state.
There isn't much of a distinction between taking and returning a value versus taking a mutable reference, other than calling convention. The idea of purity really boils down to whether a function call can be removed if its output is not used, which is what's relevant here.
Sure, maybe calling it pure is a bad idea, and maybe const isn't good enough either, but we're definitely approaching the point where typing pure fn is a whole lot easier than adding multiple attributes.
Note also that under this model, stuff like must_use are still useful, but useful for cases when it's actually not creating dead code and instead suggesting to the user they should handle extra cases. So, pushing to a vector that isn't used means the entire lifetime of the vector can be deleted, whereas ignoring an error still means the function is run.
The other benefit of this is that, with an actual language feature for this, we can do complete dead code analysis, whereas at the moment, the best we can do is emit one lint at the final call site if the code path ends.
@clarfonthey I agree pretty much completely and I think excluding mutation through pointer arguments was a bad idea when someone coined the term pure. But the term exists and if possible I'd prefer a different one, if possible, to avoid confusion.
Related: #[must_use] should go to the return type:
fn foo() -> #[must_use] Output;
This feels more consistent with #[must_use_output] on function parameters.
This would also address the problem that you can't mark part of a return type as #[must_use]:
#[must_use]
fn read() -> Result<usize> {...}
read()?; // does not trigger the lint
What we want is this:
fn read() -> Result<#[must_use] usize> {...}
@Aloso in principle yes but there isn't much we can do now about it apart from having it as an alternative way to specify the atribute.
I'm not sure where specifically in the RFC to comment but the RFC and this thread both don't seem to be mentioning it. As presented this will warn on data structures which impl Drop (it has to, otherwise the vec example at the top cannot work), but it is entirely possible to use a data structure to hold RAII guards (as one example). That is:
struct FancyDataStructure {
something_atomic: AtomicU64,
lock: Mutex<()>,
}
fn foo() {
// snip...
let mut guards = vec![];
for thing in structures{
guards.push(thing.lock.lock().unwrap());
}
// Don't touch guards, but when we drop they'l unlock.
}
As far as I can tell that is roughly equivalent to the first example but we actually do for real rely on the drop only. If you're wondering why you'd do something like this, it can come up in concurrent data structures when it is safe for readers to access the other fields.
As another possible write-only do-not-consume case, consider File where you are passing raw fds off to something else but need the File objects to stay around.
The trouble is that in order to do this generically--at least as far as I see--the rule has to become more complex. Something like "warn if the container's type parameters do not impl Drop" or something. Otherwise, you get false positives in these cases. Consider a crate like smallvec, which is what I would really use for cases like these; it has to extend to that.
There's the obvious rule "if any type parameter impls Drop then ignore it" but that immediately fails out on Vec<Box<Anything>> or Vec<Vec<Anything>>.
I kind of agree with @clarfonthey in that this is sort of poor man's pure. As written you can basically only use it on pure functions. If it can't handle the Drop case then even putting it in std doesn't work unless the function is pure.
Of course I imagine there'd be some way to turn this off if it's a false positive. Maybe people feel that we should make people do that. I have weak opinions on this whole thing in general: I see no harm in it but have never personally run into a case where it matters.
@ahicks92 in typical Rust that example will be written not as a for loop but a map+collect:
let guards = structures.iter()
.map(|s| s.lock.lock().unwrap())
.collect::<Vec<_>>();
which currently rustc will warn #[warn(unused_variables)]. The standard solution is to name it with a leading underscore:
let _guards = structures.iter()
.map(|s| s.lock.lock().unwrap())
.collect::<Vec<_>>();
I think the situation should be the same here. If we require to use Vec::push and don't want the warning and don't want to use #[expect] or drop(guards), maybe the RFC should support recognizing variables with a leading underscore:
let mut _guards = vec![];
for thing in structures{
_guards.push(thing.lock.lock().unwrap());
// suppress warning since `Vec::push` is called on `&mut _guards`?
}
_ prefixes don't play well with this, especially since they are taught to people as unused variable:
let mut v = vec![];
v.push(something);
if let Some(front) = v.front() {
// stuff.
}
v.push(another_thing);
Where does the _ go?
let _ = whatever.push(...) under the same logic as Result might be the best option, otherwise as far as I can tell you have to allow() it somewhere at least with current mechanisms. For this I'd at least like the mitigation/override to be on/near the line that triggers it.
I also feel the need to push back on the "not idiomatic" response. I started with significant compiler contributions in 2017 and have spent the last 4 years doing Rust full-time. I do understand that iterators are in some sense better and sometimes faster, I do understand that most experienced Rust people use them, but I could also make an argument for why I don't always do that rooted primarily in interacting with polyglot teams. In this case though, the goal is illustrating a problem I see with the RFC in a straightforward manner, not coming up with something idiomatic or arguing about it. In fact "learn to use iterator chains, and it won't warn" is exactly the opposite of what I would want to do to anyone--that's actively hostile if it's the solution, whether that hostility is intended or not, and in addition isn't something that Rustc could advise people about.
@ahicks92 I haven't used the word "idiomatic" in my response ("standard solution" is talking about suppressing the unused_variable lint), I'm just pointing out if this is written in terms of iterators instead of for-loop, we see that rustc provided one escape hatch which this RFC may copy. I'm sorry if you find it confused.
In more complex situation one should #[expect] the lint to clearly convey the intention why it's ok v is no longer used.
let mut v = vec![];
v.push(something);
if let Some(front) = v.front() {
// stuff.
}
#[expect(unused_must_use, reason="v takes ownership of another_thing which is a lock, ensuring it won't be released until v is dropped at the end of function")]
v.push(another_thing);
We discussed this in today's @rust-lang/lang meeting.
We agreed that #[must_use] is less ambiguous than I previously thought, for a few different reasons. One of them is that we already check that parameters are used, and require that they're named with a leading `_ if they aren't.
Given that, we'd like to see this updated to use #[must_use] after all. Thank you!
@rustbot labels -I-lang-nominated -P-lang-drag-3 +I-lang-radar @rustbot author
Reminder, once the PR becomes ready for a review, use @rustbot ready.