`is` operator for pattern-matching and binding
Introduce an `is` operator in Rust 2024, to test if an expression matches a pattern and bind the variables in the pattern. This is in addition to `let`-chaining; this RFC proposes that we allow both `let`-chaining and the `is` operator.
Previous discussions around `let`-chains have treated the `is` operator as an alternative on the basis that they serve similar functions, rather than proposing that they can and should coexist. This RFC proposes that we allow `let`-chaining and add the `is` operator.
The `is` operator allows developers to chain multiple matching-and-binding operations and simplify what would otherwise require complex nested conditionals. The `is` operator allows writing and reading a pattern match from left-to-right, which reads more naturally in many circumstances. For instance, consider an expression like `x is Some(y) && y > 5`; that boolean expression reads more naturally from left-to-right than `let Some(y) = x && y > 5`.
This is even more true at the end of a longer expression chain, such as `x.method()?.another_method().await? is Some(y)`. Rust method chaining and `?` and `.await` all encourage writing code that reads in operation order from left to right, and `is` fits naturally at the end of such a sequence.
Having an `is` operator would also help to reduce the demand for methods on types such as `Option` and `Result` (e.g. `Option::is_some_and` and `Result::is_ok_and` and `Result::is_err_and`), by allowing prospective users of those methods to write a natural-looking condition using `is` instead.
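As a rough illustration of what the proposal is competing with, the sketch below spells out the condition `x is Some(y) && y > 5` in three ways that compile in Rust today. The function names are hypothetical, and the `is` form appears only in a comment because it is not valid Rust at present.

```rust
// Proposed form (not valid Rust today): x is Some(y) && y > 5

/// Nested `if let`: the binding `y` is available inside the outer block.
fn nested_if_let(x: Option<i32>) -> bool {
    if let Some(y) = x {
        if y > 5 {
            return true;
        }
    }
    false
}

/// `matches!` with a guard: tests the pattern, but `y` does not escape the macro.
fn with_matches(x: Option<i32>) -> bool {
    matches!(x, Some(y) if y > 5)
}

/// `Option::is_some_and`: one of the helper methods the RFC hopes to need less often.
fn with_is_some_and(x: Option<i32>) -> bool {
    x.is_some_and(|y| y > 5)
}
```

All three return the same answer for any input; they differ only in reading order and in where `y` is in scope.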
Nominating because this is making a proposal for the 2024 edition.
I see there is no mention of pattern types, though it seems they would be a similar but distinct use of `is` as an operator?
Is this a prerequisite of pattern types (to get the keyword in the language?) or does it conflict with the types usage?
When combined with pattern types, which way does the precedence go?
So, does `v as i32 is 5` parse as `(v as i32) is 5` or `v as (i32 is 5)`? Or is it ambiguous and an error, requiring parentheses?
@fbstj wrote:
I see there is no mention of pattern types though it seems they would be similar but distinct use of `is` as an operator? is this a pre-requisite of pattern types (to get the keyword in the language?) or does it conflict with the types usage?
This is not related to pattern types. I believe we can do both without conflict. I added some text to the "unresolved questions" section to confirm that we can do both without conflicts.
@programmerjake wrote:
when combined with pattern types, what way does the precedence go? so, does `v as i32 is 5` parse as `(v as i32) is 5` or `v as (i32 is 5)`? or is it ambiguous and errors, requiring parenthesis?
I've added some text to the RFC, stating that this should require parentheses (assuming pattern types work with `as`).
What patterns does `is` enable that aren't covered by `matches!`?
@dev-ardi One example:
if expr is Some(x) && x > 3 {
    println!("value is {x}");
}
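To make the difference from `matches!` concrete: the macro can test the same condition, but `x` is not bound in the body that follows, so using the value means matching again. A sketch in today's Rust, with a hypothetical `describe` function standing in for the `println!` above:

```rust
/// Returns a message only when `expr` is `Some(x)` with `x > 3`.
fn describe(expr: Option<i32>) -> Option<String> {
    // `matches!(expr, Some(x) if x > 3)` would give the same yes/no answer,
    // but `x` would not be in scope afterwards, so we match explicitly instead.
    match expr {
        Some(x) if x > 3 => Some(format!("value is {x}")),
        _ => None,
    }
}
```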
I find it a bit odd that we would want both `is` expressions and `let` chains. They serve exactly the same purpose, the only difference being their reading order. I can understand the argument that we would want to have `let` chains due to people expecting them to work given we already have `if let` and the like, but this feels like the wrong way to address that. I would instead expect us to deprecate `if let` and `while let` in favor of `is` and drop `let` chains.
I feel like that should be added to the alternatives and/or pad out the feature duplication drawbacks paragraph.
@Veykril wrote:
I would instead expect us to deprecate `if let` and `while let` in favor of `is`
That would be a massive amount of churn for very little benefit.
Nonetheless, you're right that I should add this to the alternatives section.
Adding multiple ways to do the same thing also makes teaching Rust harder: `let` in Rust is everywhere: `if let`, `while let`, `let`-chains, `let ... else`, ... So you have to teach pattern matching with `let` anyway, meaning this "right-to-left" reading order will quickly become natural to Rust users. Introducing a different way, while easy and intuitive to understand, won't help much with code clarity IMO, as people are already used to reading `let` patterns.
I'd expect `is` to be a pretty common variable name, so it's maybe worth exploring less common words, like `Some(y) binds x && y > 5` or `x matches Some(y) && y > 5`.
I do think larger expressions make the left vs right swap interesting, but remember Perl created chaos with its left vs right trickery, so one should really be careful here. `matches` maybe works both ways.
Yes, both `let Some(y) = x && y > 5` and `let .. else` become extremely confusing, but humans could parse some sensibly bracketed flavors, like `{ let super Some(x) = foo } && y > 5` à la https://blog.m-ou.se/super-let/
If we add `is` as a keyword, we should also reserve `isnot` as a keyword for future NOT-patterns:
if expr isnot Some(x) {
    println!("error");
}
Edited: I'm sorry for some impoliteness with "must"
The author mentioned just one alternative name for `is`: `~`.
But I think we should add other alternative names to the RFC, like `equal` or `identic`:
if expr identic Some(x) && x > 3 {
    println!("value = {x}");
}
@VitWW I don't think so, Vit.
@VitWW
If we add `is` as a keyword, we must also reserve `isnot` as a keyword for future NOT-patterns
You speak in a commanding way ("we must"), without justification. So, having considered my own thoughts: ...I disagree! Please offer a justification for your reasoning, and especially, why it should be addressed now, not "we do this for future expansion opportunities". It seems it will simply run up against all the concerns we're already facing, and we can wait until then.
Author mentioned just one alternative name for `is`: `~`. But I think we should add another alternative names in RFC, like `equal` or `identic`:
To say we should do something is better than to command, but I don't think you have explained the prior art or other reasoning why it must be addressed in the RFC. Perhaps you were building off the point that burdges made? But unfortunately `equal` is also a common function name in Rust, used in e.g. polars (as public API) and the stdlib (as private), and also seems to be a reasonably popular variable name. So it at least doesn't feel obvious as to why we would go with that.
For everyone else suggesting alternative keywords, I do really recommend everyone at least check using grep.app or something similar if their recommendation is in Rust public API somewhere, and how many cases, and be forthcoming on how many examples they find. You will likely pull hundreds of pages, so you may wish to do extrapolation or more exact queries using other tools after downloading the crates.io index.
Of course, we do have our system of keyword reservation, the `k#` and `r#` stropping, and edition-sensitive keyword parsing, so I think this is not the only thing to consider, and we can in fact simply pick the nicest-looking syntax if it doesn't seem an overwhelming problem. But it is best if we keep in mind any induced complexity in the lexer and parser, and the community reaction, while we rummage through our collection of Pantone chips for this shed.
@joshtriplett While I think `is`, er, is a fine choice, I wish to (gently) refute `~` as lacking a history as a "pattern-matching operator", and provide some background that might be worth at least reviewing. First, SQL does have bare `~` but I think it is reasonable to mostly omit considering SQL's language features, as it is deliberately unlike most other PLs for reasons beyond this discussion. However, `~=` and `=~` do have prominent histories as a pattern-matching operator!
- Swift does use `~=` as a pattern-matching operator, and even uses it as part of `case` evaluation: https://developer.apple.com/documentation/swift/range/~=(_:_:)
- Ruby offers the inverse, `=~`, for a regex-centric pattern-matching: https://ruby-doc.org/core-2.6.3/Regexp.html#class-Regexp-label-3D~+and+Regexp-23match
- And part of why it does so is because Bash does it: https://www.gnu.org/software/bash/manual/html_node/Conditional-Constructs.html#index-_005b_005b
Obviously, the regexp-centric examples don't exactly match to the Rust pattern language, but it's clearly a popular choice if three exceptionally common procedural PLs use it. Other examples like Vimscript and PromQL also use them, but obviously that gets increasingly niche. Wiktionary even asserts `~=` is used in mathematics... but also mentions `~=` is also used as an equivalent to Rust's `!=`, e.g. Lua and MATLAB.
It seems to me when `~` is included in an operator's symbol, either it means negation, or it does imply something akin to saying "roughly like...", an approximate match, which may be why Dart uses `~/` for divide-to-integer (as opposed to dividing to a double, which more accurately represents the result of `3 / 2`). Of course, that very page I just cited also mentions Dart has `is`, so I only consider this to be interesting context!
Some reactions I had while walking and thinking on this earlier:
- I like `is` for legibility and I think it will probably read nicer than let chains in almost all cases
- Python has `is` operator as object identity which is almost only used for `x is None`, which the operator here would support. A possible addition to the prior art.
- I strongly agree with the concern that's been repeated a few times here that we already have forms like `if let` and also `let - else`, and the distinction here is currently proposed to just be a style choice.
Especially as the recent language survey seemed to highlight language bloat as one of the largest risks to the language, having this purely be stylistic seems to be in direct opposition to the data.
If we were to move forward with this I'd hope that this RFC takes a stronger stance on when to use `let` forms and when to use `is` forms, and strongly considers the deprecation alternative.
Possible observation: by allowing `expr is PAT && condition` here, users may be more likely to try `PAT && condition` as match arms instead of the current `PAT if condition`. We may want to allow that:
match color {
    (RGB(r, g, b) | RGBA(r, g, b, _)) && r == b && g < 1 => /* ... */,
    //                                ^^ this is currently a compile error, should be `if`
    _ => /* ... */
}
... I think it'd ease refactoring and reduce papercuts when converting code from `x is PAT && y { ... }` to `match x { PAT && y => ... }`
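For reference, this is how the same arm is written with the current `if` guard syntax. The `Color` enum and the function name are made up for this sketch, with `f32` channels assumed so that `g < 1` has a meaning.

```rust
// Hypothetical colour type; the thread does not define one.
enum Color {
    Rgb(f32, f32, f32),
    Rgba(f32, f32, f32, f32),
}

/// True when red equals blue and green is below one.
fn red_eq_blue_and_low_green(color: Color) -> bool {
    use Color::{Rgb, Rgba};
    match color {
        // Today the guard is introduced by `if`; `&&` directly after the
        // pattern is a compile error. The guard applies to both alternatives.
        Rgb(r, g, b) | Rgba(r, g, b, _) if r == b && g < 1.0 => true,
        _ => false,
    }
}
```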
While I think `is`, er, is a fine choice, I wish to (gently) refute `~` as lacking a history as a "pattern-matching operator", and provide some background that might be worth at least reviewing. First, SQL does have bare `~` but I think it is reasonable to mostly omit considering SQL's language features, as it is deliberately unlike most other PLs for reasons beyond this discussion. However, `~=` and `=~` do have prominent histories as a pattern-matching operator!
- Swift does use `~=` as a pattern-matching operator, and even uses it as part of `case` evaluation: https://developer.apple.com/documentation/swift/range/~=(_:_:)
- Ruby offers the inverse, `=~`, for a regex-centric pattern-matching: https://ruby-doc.org/core-2.6.3/Regexp.html#class-Regexp-label-3D~+and+Regexp-23match
- And part of why it does so is because Bash does it: https://www.gnu.org/software/bash/manual/html_node/Conditional-Constructs.html#index-_005b_005b
Obviously, the regexp-centric examples don't exactly match to the Rust pattern language, but it's clearly a popular choice if three exceptionally common procedural PLs use it. Other examples like Vimscript and PromQL also use them, but obviously that gets increasingly niche. Wiktionary even asserts `~=` is used in mathematics... but also mentions `~=` is also used as an equivalent to Rust's `!=`, e.g. Lua and MATLAB.
It seems to me when `~` is included in an operator's symbol, either it means negation, or it does imply something akin to saying "roughly like...", an approximate match, which may be why Dart uses `~/` for divide-to-integer (as opposed to dividing to a double, which more accurately represents the result of `3 / 2`). Of course, that very page I just cited also mentions Dart has `is`, so I only consider this to be interesting context!
Just to follow up on this a bit, particularly from a mathematical perspective. Yes, you're right that ~ has some similarity to ≈, which means "approximately equal to," and thus it makes sense as a pattern-matching operator.
However, `~=` and `=~`, from a programming perspective, are far too loaded to really work well as that kind of operator. Like, I've been writing a lot of Lua lately and `~=` is just straight-up `!=` in Lua.
Plus, with the way Rust tends to organise its operators, the existence of `~=` implies that there should be a standalone `~`, which wouldn't be the case here. So, I would advocate against that regardless.
Drawing to the bigger point of what this operator should be: I genuinely don't think that there's something better than `is`. It's two characters, which is as long as many existing operators. People say that it's a common variable name, but I think that it's only common as a pluralisation of `i`, where `i_s` could easily serve that purpose. And the only other reasonable alternative that I can think of is `~`, which is shorter and less clear. Any other keywords are going to be longer, more awkward, and more likely to cause name conflicts.
I point out some of the alternatives in the RFC because I think that we should definitely include the best arguments in favour of `is` in the RFC, but I genuinely do think that it's the best choice.
If we add `is` as a keyword, we must also reserve `isnot` as a keyword for future NOT-patterns
I disagree; IMO not-patterns can be written as `!Some(_)` (`!`-patterns can be used everywhere a fallible pattern is (`match`, `if let`, `is`, `let ... else`), so it isn't an `is!` operator).
This means there are two ways to write it, with the not operator:
`!(a is Some(v)) || v == 0`
or with a not pattern:
`a is !Some(v) || v == 0` or `a is (!Some(_) | Some(0))`
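Reading the intent as "either `a` is not `Some`, or its value is zero", the same check can already be expressed in today's Rust. The sketch below assumes `a: Option<i32>`; the function names are hypothetical, and not-patterns themselves remain hypothetical syntax.

```rust
/// Positive formulation: list the accepted shapes directly.
fn none_or_zero(a: Option<i32>) -> bool {
    matches!(a, None | Some(0))
}

/// Negated formulation: reject exactly the `Some(v)` values with `v != 0`.
fn none_or_zero_negated(a: Option<i32>) -> bool {
    !matches!(a, Some(v) if v != 0)
}
```

The two functions agree on every input, which is the De Morgan-style equivalence the comment relies on.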
You speak in a commanding way ("we must"), without justification. .... Please offer a justification for your reasoning, and especially, why it should be addressed now
@workingjubilee I'm sorry for some impoliteness with "must".
Not-patterns also weren't added because in today's Rust syntax they are ugly to write: `NOT(Some(x)) = expr`, and they become almost pretty with an `isnot` keyword.
It should be reserved as a keyword now, because it is dual to `is`, just like `>=`/`<=` and `==`/`!=`, and it is strange to add just one from a dual pair.
But unfortunately `equal` is also a common function name in Rust
Uups
People say that it's a common variable name, but I think that it's only common as a pluralisation of `i`, where `i_s` could easily serve that purpose.
I won’t claim it’s common, but it’s probably worth noting that `is` is the country code for Iceland, and so is a natural variable name for strings containing Icelandic-language text.
I prototyped this feature back in 2018 and converted rustfmt to this style, but later dropped the corresponding rustfmt branch, accidentally and unfortunately. But the experience report is preserved at least - https://github.com/rust-lang/rfcs/pull/2260#issuecomment-367158854.
I still think this is the right thing to do, and something that should have been added instead of if-let chains from the start.
It would be unfortunate if the scenario I predicted in https://github.com/rust-lang/rfcs/pull/2497#issuecomment-404860099 plays out and `EXPR is PAT` is not added for social reasons because if-let chains already exist.
@petrochenkov Agreed. I think `let` chains have value because `if`-`let` already exists and people expect `let` chains to work, but I don't think that should prevent us from adding `is`. That would feel like a suboptimal path caused by path dependence.
Considering the multiple bugs around temporaries that were found with let chains, perhaps we should just reserve the `is` keyword in edition 2024 and give the implementation more time to mature?
Just because I haven't seen anyone comment on it yet, I would like to know if my intuition that `is` should have higher precedence than `==` (but still recommend parentheses, similar to mixing `&&` and `||`) matches others' intuition as well. I could just be an outlier here and would love if others pitched in how they feel as well.
Particularly this thread: https://github.com/rust-lang/rfcs/pull/3573#discussion_r1492740859
Feel free to just thumbs up/thumbs down to express support if you don't have much else to add.
@clarfonthey wrote:
Just because I haven't seen anyone comment on it yet, I would like to know if my intuition that `is` should have higher precedence than `==` (but still recommend parentheses, similar to mixing `&&` and `||`) matches others' intuition as well. I could just be an outlier here and would love if others pitched in how they feel as well.
My intuition tells me "there is no possible circumstance in which I would ever want to see these combined without parentheses", which makes me feel that it's irrelevant what their relative precedence is.
(I think that's true for a few other cases in the existing precedence table as well.)
That is disappointing to hear. People tend to eschew parentheses where they are unnecessary because the language already has many cases where some kind of parenthetical or brace or bracket is already either mandated by the syntactic form or is mandated by expressing the desired result, and it does not actually make the code significantly less clear to imitate Lisp slightly less.
Why would anyone need either `boolean == (x is Some(z))` or `(value == y) is true` so frequently that one or two pairs of parentheses are going to bother them? :confused:
People tend to eschew parentheses where they are unnecessary
That's my preference as well, for cases that are widely parsed correctly by people who don't have the precedence table memorized. But for instance, the lint against using `&&` and `||` together without parentheses is a good example where we suggest that they are more necessary than the precedence table would otherwise indicate. I think there are some cases that are intuitively obvious to people, and others where if you don't have the precedence table memorized you're likely to find them confusing. And I've regularly seen confusion about (for instance) the parsing of `as`.
I do personally think mixing `==` and `is` without parentheses seems more likely to lead to confusion than clarity. If many people feel strongly in the other direction, I could imagine changing that from "parentheses are always required" to "warning lint for not using parentheses", like `&&` and `||`. In any case, I will include it in the alternatives section.
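The `&&`/`||` comparison is easy to demonstrate: `&&` binds more tightly than `||`, so the unparenthesized form groups differently from a plain left-to-right reading. A small sketch with hypothetical function names:

```rust
/// Written without parentheses.
fn unparenthesized(a: bool, b: bool, c: bool) -> bool {
    // Parsed as `a || (b && c)` because `&&` has higher precedence than `||`.
    a || b && c
}

/// The grouping a left-to-right reader might assume instead.
fn left_to_right(a: bool, b: bool, c: bool) -> bool {
    (a || b) && c
}
```

With `a = true` and `b = c = false`, the two functions disagree, which is the kind of surprise that requiring parentheses around `is` and `==` would rule out.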
On parens:
The safe thing to do is start out always requiring them, since then we could look at how the code comes out with them, and remove the requirement as a non-breaking change later once we have evidence.