RFC: Never patterns
Summary
A ! pattern indicates a type with no valid values. It is used to indicate an impossible case when matching an empty type in unsafe code.
enum Void {}
unsafe fn empty_tup<T>(tup: *const (T, Void)) -> ! {
    unsafe {
        match *tup { ! }
    }
}
Note: this RFC is purely about a new pattern syntax. It does not propose changes to exhaustiveness checking or to the operational semantics of uninhabited types.
It might be a good idea to instead require an empty block as the match arm (or maybe ! => !, where the second ! is an expression: a constructor for the ! type that the compiler ensures is never reachable). That would work much better with macros, since all macros currently expect every match arm to have a body.
There are two separate issues here: rust-lang/rust#131452 is entirely independent from this RFC, since this RFC is purely about adding a new pattern syntax.
I don't think I agree that these are separate. The RFC says that ! is needed "in unsafe contexts". This is entangled with #131452 and the discussion about the semantics of uninhabited types, as discussed here: https://github.com/rust-lang/rust/issues/131452#issuecomment-2402800267
My own concern is primarily the possibility (which afaict is not explicitly excluded by this RFC text) that people writing safe code might be required to write ! arms. I think we must avoid that. That is one way this RFC is entangled with #131452.
Avoiding that means we need a distinction between safe and unsafe places. That distinction is also the thing that solves #131452 (which is about safe code). With that distinction this RFC will become clearer. It will also be more obvious what the alternatives are to never patterns (generally, I think, additional explicit dereferencing of the unsafe place?)
In any case, before we have fixed #131452, the handling of &Void in safe code (which is a prominent problem with Rust today) is going to keep turning up in the conversation, making it difficult to talk about even the aspects of the never patterns RFC that don't relate to safe code. Even if you consider the questions entirely unrelated, the apparent relationship between them makes the conversation confusing.
So I think we must deal with #131452 first.
I was trying to keep the conversations separate to make discussion easier, but it's becoming clear that this is not helping x) I'll go ahead with my plan for https://github.com/rust-lang/rust/issues/131452 and come back to this afterwards.
The most confusing bit of this RFC for me is exactly how it interacts with safety.
Specifically, it doesn't clarify when the need would arise for match x { ! } versus match x {}. It doesn't even specify whether match x { ! } requires that the match be enclosed in an unsafe block, even though the examples in the RFC seem to imply this.
I get that the goal here is to just provide a syntax, but there are too many unanswered questions about the motivation to really feel like the syntax is justified. As folks have mentioned, the lack of an arm means that this now has to be covered separately in macros, although I think that never_pat as its own macro matcher could probably be fine.
Additionally, would you be expected to mix uninhabited patterns with inhabited patterns? What does this mean? How does this play into where the unsafe block is located? Doesn't this mean that the unsafe block is now placed farther away from the unsafety, not localised to the specific unsafe bit (the never pattern) but the entire match block?
I just don't really see why adding a specific never pattern is a good solution to any problem, and I also don't understand what problems require them.
It doesn't even specify whether match x { ! } requires that the match be enclosed in an unsafe block, even though the examples in the RFC seem to imply this.
A never pattern is not an unsafe operation, since you can't cause UB by using it on safe values. It's exactly as safe as the true pattern on booleans. So this requires no unsafe block.
The part that requires unsafe in these examples is the match scrutinee. We can't write match (unsafe { *ptr }) { .. } because that would have a different meaning: it would copy the value out of the pointed-to place. To match on the place directly we have to write unsafe { match *ptr { .. } }. That's just how Rust works today.
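The difference between matching on the copied value and matching on the place can be observed in today's Rust with a ref binding. The following is a minimal sketch, assuming a Copy pointee; the function name place_vs_copy is hypothetical and does not come from the RFC.

```rust
/// Returns (borrowed_from_place, borrowed_from_copy).
fn place_vs_copy() -> (bool, bool) {
    let value: u32 = 7;
    let ptr: *const u32 = &value;

    // Matching on the place `*ptr`: `ref r` borrows the pointee itself,
    // so its address equals `ptr`.
    let from_place = unsafe {
        match *ptr {
            ref r => std::ptr::eq(r as *const u32, ptr),
        }
    };

    // Matching on the value of the block `unsafe { *ptr }` (parentheses
    // omitted here): the u32 is first copied into a temporary, so `ref r`
    // borrows that temporary rather than the pointee.
    let from_copy = match unsafe { *ptr } {
        ref r => std::ptr::eq(r as *const u32, ptr),
    };

    (from_place, from_copy)
}
```

Only the first form inspects the memory behind the pointer in place, which is why the examples put the whole match inside the unsafe block.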
Specifically, it doesn't clarify when the need would arise for match x { ! } versus match x {}.
match <expr> { ! } is always allowed if the type allows it. It's a pattern like any other. match <expr> {} is allowed when <expr> is not inside a pointer or union (that's not a change, that's how Rust works today, though admittedly that changed last month). That's why the examples use unsafe: never patterns are mostly useful only around unsafe code.
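For comparison, this is what the empty match x {} form looks like in safe code on stable Rust, where the scrutinee is an owned value rather than a place behind a pointer or union. It is a minimal sketch; absurd and unwrap_infallible are hypothetical helper names used only for illustration.

```rust
enum Void {}

/// An owned `Void` can never exist, so an empty match can produce any type.
fn absurd<T>(v: Void) -> T {
    match v {}
}

/// Extracts the `Ok` value from a result whose error type is empty.
fn unwrap_infallible<T>(r: Result<T, Void>) -> T {
    match r {
        Ok(t) => t,
        // No never pattern is needed here: `e` is a safe, owned value.
        Err(e) => absurd(e),
    }
}
```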
Additionally, would you be expected to mix uninhabited patterns with inhabited patterns? What does this mean?
I don't know what you mean. You use patterns as usual, depending on the type you want to match on. It's just that for empty types there's now a new pattern you can use. You can mix and match as usual.
I just don't really see why adding a specific never pattern is a good solution to any problem, and I also don't understand what problems require them.
To be fair this is a rather subtle point of rust operational semantics. To rephrase the "motivation" section in the RFC: it solves the problem of having an explicit way to assert validity of data at a place of uninhabited type using a pattern.
I guess I should have clarified on the mixing: specifically, imagine the case like this:
match x {
    Ok(x) => /* ... */,
    Err(!),
}
and this:
match x {
    Err(!),
    Ok(x) => /* ... */,
}
And you can imagine more than two arms for more complicated examples. If we allow more than one never pattern, then our macro-matching situation gets more complicated, and you have to explicitly use tt-matchers to handle both cases, which can get a lot more complicated than just having multiple arm matchers. However, I can see value in allowing multiple never patterns separated by commas, compared to one never pattern separated by pipes, for convenience and to allow sorting patterns better.
Like, that is a rather sizeable complication to the syntax which needs to be explained, and should probably have style guidelines too.
Adding an additional example for clarity:
match x {
    Ok(None) => /* ... */,
    Ok(Some(!)),
    Err(None) => /* ... */,
    Err(Some(!)),
}
Note that since it's not just all inhabited patterns followed by an optional never pattern, and instead perfectly allowed mixing, you can't use a single-rule matcher for macros any more: it has to be a tt-matcher with separate rules for the different arm cases.
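To make the macro concern concrete, here is a sketch of the kind of single-rule matcher described above, written for today's arms that all have bodies. The macro name count_arms and the wrapper function are hypothetical. Because the rule requires => and a body after every pattern, a body-less arm such as Err(!), would not match it and would need a rule of its own.

```rust
/// Counts match-like arms of the form `pattern => body`.
/// Every arm must have a body for this single rule to apply.
macro_rules! count_arms {
    ($($pat:pat => $body:expr),* $(,)?) => {
        0usize $(+ {
            // The arm is only stringified, never evaluated.
            let _ = stringify!($pat => $body);
            1usize
        })*
    };
}

fn arm_count() -> usize {
    let n: usize = count_arms!(
        Ok(x) => x + 1,
        Err(_) => 0,
    );
    n
}
```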
Adding a relevant bit here after discussing a bit in rust-lang/rust#131452: I really don't think that motivation is properly described in this RFC, at all. I understand the desire to remain brief, but there are a lot of very, very subtle details that do not feel clarified at all in the RFC. Even the "guide-level" explanation, which is supposed to be for people who don't understand these subtle details, uses very specific terminology like "place" that isn't properly described to the person reading.
Like, my understanding is that the opsem team would prefer to define pattern-matching such that reads happen on a per-arm basis, meaning that you can always look at the arms of a match to determine what reads are happening, and thus the points at which UB can occur if a reference were not well-formed. This avoids the need for a "discriminant read" operation, since you can rely on pattern-matching instead.
While syntactically matches of uninhabited types would generally be allowed to be left empty, with never patterns being added "automatically" by the compiler, semantically, the idea is that they would still be there, and still able to cause accesses to memory per the memory model. To help ensure that people don't get caught up by these "automatically added" arms, people can opt into using these patterns, either by denying a lint that checks for them or via a separate lint which will only fire when they are omitted in the presence of unsafe code, to be worked out later.
This RFC is attempting to sidestep the operational semantics and simply propose a new syntax, but I genuinely do not think that's a good idea. The semantics do matter: match x {} or match x { ! } and match x { _ => unreachable!() } no longer have the same meaning, assuming that the implicit ! pattern is added to the end and not the beginning. And, with the proposed semantics, the unreachable!() branch is not dead code, and cannot be automatically removed by the compiler.
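The per-arm view of reads has a visible analogue in safe Rust today: whether a match moves out of its scrutinee place depends on the patterns in its arms. A wildcard arm does not touch the place, while a binding arm does. The sketch below uses hypothetical function names and illustrates existing move semantics only; it is not a statement about what the final operational semantics will be.

```rust
fn wildcard_leaves_place_untouched() -> String {
    let s = String::from("still here");
    match s {
        // `_` binds nothing, so `s` is not moved.
        _ => {}
    }
    s // still usable after the match
}

fn binding_moves_out_of_place() -> usize {
    let s = String::from("moved");
    match s {
        // Binding by value moves the String out of `s`;
        // `s` cannot be used again after this match.
        t => t.len(),
    }
}
```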
Like, now that I actually understand the motivation for this change, I really do get it. I just don't really know if the proposed solution is the correct one: since the semantics aren't really negotiable, that means either we add an explicit syntax for never patterns and recommend that they sometimes be used in unsafe code, or we keep the semantics for these patterns being added implicitly and don't add any special syntax for them. I'm not going to argue whether the syntax is worth the downsides it brings, but I do think that the semantic side plays a lot into making that justification, and I don't think it's adequately stated here.
I don't see any mention of writing let ! = returns_uninhabited();. I can only assume that would be valid, and that might be a common idiom even more so than match usage.
I'm not necessarily advocating for this but a more conservative version of this RFC would be to not add body-less arms. You can instead do
match x {
    None => {}
    Some(never) => {
        let ! = never;
    }
}
(or Some(never @ !) => never,)
Indeed let ! = ... would be natural; I believe that's already allowed in the implemented feature. To avoid bodiless arms my thinking was even to do:
match x {
    None => 42,
    Some(!) => {}
}
The reasoning is that the arm body is known to be unreachable so can take any type, just like { return; } would. Just gotta update the RFC :)
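The coercion this relies on already exists for diverging blocks. Here is a minimal sketch on stable Rust, with a hypothetical function name, in which one arm's body is a block ending in return and still type-checks next to an i32 arm.

```rust
fn doubled_first(values: &[i32]) -> i32 {
    let first: i32 = match values.first() {
        Some(n) => *n,
        // This block never produces a value, so it is accepted
        // where an i32 is expected.
        None => {
            return -1;
        }
    };
    first * 2
}
```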
I kinda like the bodiless arms personally as it clearly communicates that the arm is unreachable. Since the arm pattern is unreachable code, the arm body is like adding unreachable code on top of unreachable code. It would be especially nice in big complicated patterns where the ! is harder to spot.
match x {
    None => {}
    Some(Foo {
        bing: _,
        bap: _,
        bop: (_, _, !),
    }),
}