
Explicit move binding mode

Open schuelermine opened this issue 2 years ago • 42 comments

This RFC suggests using the move keyword to explicitly specify the moving binding mode to override match ergonomics
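For readers who have not run into the gap being discussed, here is a minimal sketch of today's behavior. The function names are made up for illustration, and the `move` form in the trailing comment is the RFC's proposed syntax, which does not compile today.

```rust
/// Match ergonomics: matching a `&(i32, String)` against a non-reference
/// pattern binds every variable by reference.
fn split_ergonomic(pair: &(i32, String)) -> (i32, usize) {
    let (id, name) = pair; // id: &i32, name: &String
    (*id, name.len())
}

/// Classic pattern: the pattern's type matches the value's type, so each
/// binding mode is written out explicitly.
fn split_classic(pair: &(i32, String)) -> (i32, usize) {
    let &(id, ref name) = pair; // id: i32 (copied), name: &String
    (id, name.len())
}

// Proposed by the RFC (not valid Rust today):
//     let (move id, name) = pair; // id: i32, name: &String
```

Both functions return the same result; the difference is only in how the binding for `id` is obtained.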

Rendered

schuelermine avatar Apr 07 '23 21:04 schuelermine

For context, there’s a URLO discussion where the motivation for writing this RFC probably emerged. I don’t know yet what I’d like to contribute from that thread to this discussion… but to paraphrase my own take on binding modes in the context of match ergonomics from that thread:

I believe it might be worth a change to lint (or eventually even error, or error-by-default-lint) in general against any ref x, ref mut x or mut x that appears in places where match ergonomics have changed the default binding mode. I think the current behavior has confusion points such as

  • ref x and x, or ref mut x and x, are identical in meaning in match-ergonomics contexts, and different syntax for the same thing can lead to confusion, in particular when their meaning differs significantly in other contexts.
  • the current state of mut x and x (in the context of match ergonomics) differing in whether or not binding-by-reference happens is just obviously confusing and nothing else

In case that isn’t clear from the above, I believe that complicated settings should require the user to fall back to patterns that properly match the type, i.e. to no longer use match ergonomics. This could also be aided by help messages in diagnostics.


I don’t want to say that there’s no truth to the argument that “requiring the user to add lots of ref” to convert from match ergonomics to classical patterns, just in order to change one binding mode, might be a significant inconvenience in some use-cases. But the current state of binding-mode specifiers interacting with match ergonomics’ default binding modes seems so messy and inconsistent to me that patching it up by adding even more syntax seems like the wrong approach. That being said, I would like to explore how reasonable the thing listed under “alternatives” could be, an idea that I described in the URLO thread with the words

let (PAT1, PAT2) = r; with r: &(S, T) will create a &S and match it against PAT1 ; as well as creating a &T and matching it against PAT2

there’d be some consistency to that; and the way to write move x from this thread under that approach would be simply &x, as opposed to x which would (already) bind x to a reference. Naturally such a change would need disallowing any binding mode specifiers in places with changed default binding mode as a prerequisite. Following such an idea, a place with (what is now) a bind-by-reference default binding mode context would (if we re-allowed binding modes in such places at all) treat ref x as opposed to x by binding x to yet another level of (reference) indirection; which would probably not be particularly useful, but at least consistent (and different from what it does now); and mut x as opposed to x would still bind by-reference, but the variable x holding the reference would become mutable.
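The alternative above can be emulated by hand in current Rust, which may make it easier to picture. This is only a sketch: `distribute` is a hypothetical helper that performs explicitly what the alternative would do implicitly, and `id_and_name_len` is an illustrative name.

```rust
/// Hypothetical helper that does by hand what the alternative would do
/// implicitly: turn a `&(S, T)` into a `(&S, &T)`.
fn distribute<S, T>(r: &(S, T)) -> (&S, &T) {
    (&r.0, &r.1)
}

fn id_and_name_len(r: &(i32, String)) -> (i32, usize) {
    // `&id` is matched against `&i32`, so `id: i32` (a copy).
    // `name` is matched against `&String`, so `name: &String`.
    let (&id, name) = distribute(r);
    (id, name.len())
}
```

Under that model `&id` plays the role that `move id` plays in the RFC, while a plain `id` would stay a reference.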

steffahn avatar Apr 08 '23 10:04 steffahn

You are correct that this RFC originates from that discussion. I originally included a link to it but was advised against it by a member of the Discord community. I will update the alternatives section to elaborate further on your alternative.

schuelermine avatar Apr 08 '23 10:04 schuelermine

As an alternative: Rust uses *mut and *const for raw pointers, so it would be handy to reuse the const keyword here

let (x, const y, mut z) = &mut xyz;
// The type of `x` is `&mut i32` and
// the type of `y` is `&i32` and
// the type of `z` is `i32` (and the binding is mutable)

let (x, const y) = &xy;
// The type of `x` is `&i32` and
// the type of `y` is `i32`

P.S. It is legal already to write

let (x, mut y) = &xy;
let (x, ref y, mut z) = &mut xyz;

VitWW avatar Apr 08 '23 12:04 VitWW

For context, an alternative to ref/move

(ref | move)? mut? IDENTIFIER (@ PatternNoTopAlt)?

is mut/const

ref? (const| mut)? IDENTIFIER (@ PatternNoTopAlt)?

or both

VitWW avatar Apr 09 '23 00:04 VitWW

There is no need for anything like this. It's just piling even more special cases on top of the already-confusing match "ergonomics". Just make sure that the pattern's type matches that of the value, it's as simple as that.

H2CO3 avatar Apr 10 '23 11:04 H2CO3

This isn’t a special case. It’s simply filling an obvious gap in the existing match ergonomics. If you don’t like match ergonomics, that’s fine, but it’s here and people use it.

Even if the “better” approach is to just not use match ergonomics, having a half-baked feature in the language is worse than having one that’s on par with the alternative in what it can do.

Also, have you read the alternatives section of the RFC? It contains an alternative implementation of match ergonomics suggested by @steffahn that may be more to your liking.

schuelermine avatar Apr 10 '23 11:04 schuelermine

I consider the RFC well-motivated: from conversations with people learning Rust, match ergonomics has generally been a success, but it definitely has nasty edge cases around Copy types (and I know that there is a significant set of users that find it confusing). We expected at some point to resolve those by making references to copy types (i.e., values of type &impl Copy) usable more like the underlying type itself, but we've made no progress on adding coercions or other rules to make that true, and coercions could never fully hide the distinction between &impl Copy and impl Copy. In the meantime I've become generally more pessimistic about the benefit of partially hiding that distinction, though I'm still curious to experiment more there.

The choice of move keyword makes sense but is also confusing, in that moving a copy type doesn't move, and actually moving (in the strict sense) from inside a reference is always an error. The fact that taking ownership of a type is called moving in the closure sugar, and that moving only copies a copy type, is definitely a bit of pre-existing confusion that we would be doubling down on here. I'm not sure how to feel about that, tbh.

The other obvious solution would be to allow & patterns in this position -- i.e., let & be used to "cancel out" an implicit ref binding mode. I think @cramertj was going to propose this but it never happened.

I've definitely reached the point where I'd like to see us do something here. I kind of like the move keyword option a bit more than the &x option, in that &x is known to be confusing, but I am a bit worried about the choice of move keyword here.

nikomatsakis avatar Apr 10 '23 13:04 nikomatsakis

The usage of the word move matches the usage of it in the match ergonomics RFC.

schuelermine avatar Apr 10 '23 14:04 schuelermine

I think using the move keyword for it isn't that bad as that's how move closures and move async blocks use it too, so it would only be consistent to use move here as well.
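As a small illustration of how `move` already behaves on closures (function names are invented for the example): for a `Copy` type the capture is a copy, for other types it is a real move.

```rust
/// `move` on a closure captures by value. For a `Copy` type that is a copy,
/// so the original stays usable.
fn copy_capture() -> (i32, i32) {
    let n = 7;
    let add_n = move |x: i32| x + n; // `n` is copied into the closure
    (add_n(1), n) // `n` is still usable here
}

/// For a non-`Copy` type the same keyword really moves the value.
fn move_capture() -> usize {
    let s = String::from("hello");
    let len_of_s = move || s.len(); // `s` is moved into the closure
    // Using `s` here would be a compile error: the value was moved.
    len_of_s()
}
```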

CryZe avatar Apr 10 '23 16:04 CryZe

@nikomatsakis

I am a bit worried about the choice of move keyword here.

If move isn't the right keyword here, then it isn't the right keyword for move |x| .. closures as well. I think that using two keywords for the same concept is unnecessarily confusing.

And if Rust ever finds a better keyword for this, then the closure syntax ought to also be updated in a new Rust edition (while I see that the bar for changing syntax should be higher, this should meet it on the grounds of user confusion, just like bare trait objects were updated to dyn Trait)

That said,

The choice of move keyword makes sense but is also confusing, in that moving a copy type doesn't move, and actually moving (in the strict sense) from inside a reference is always an error. The fact that taking ownership of a type is called moving in the closure sugar, and that moving only copies a copy type, is definitely a bit of pre-existing confusion that we would be doubling down on here. I'm not sure how to feel about that, tbh.

I think that move is a good keyword in both cases. It's a (minor) confusion, but rooted in some (incomplete but widely held) mental model that has existed since Rust 1.0. And saying that Copy types aren't actually moved when they are moved is maybe more of a semantic distinction - you could also say that variables with Copy types can be used even after they are moved, and it would be equivalent.

In some ways, it kind of mirrors the confusion of &mut being called that way for "mutable" (even though you can also mutate through an & with interior mutability), rather than a more descriptive name like &exclusive or something along those lines. It's a misunderstanding in absolute terms but is rooted in a useful mental model.
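Both halves of that analogy can be seen in a few lines. This is just a sketch with invented function names: a value of a `Copy` type stays usable after being "moved", and a plain `&` reference can still mutate through `Cell`.

```rust
use std::cell::Cell;

/// Mutation through a shared reference: `&Cell<i32>` is enough to change
/// the value, even though no `&mut` is involved.
fn bump(counter: &Cell<i32>) {
    counter.set(counter.get() + 1);
}

/// "Moving" a `Copy` value leaves the original usable.
fn move_then_reuse() -> (i32, i32) {
    let a = 5;
    let b = a; // a "move" that is really a copy
    (a, b) // `a` can still be used after being "moved"
}
```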

dlight avatar Apr 10 '23 22:04 dlight

@afetisov Why do you dislike the proposal? Do you have any concrete misgivings?

@steffahn Do you think I adequately describe the alternative(s) in the RFC?

schuelermine avatar Apr 11 '23 20:04 schuelermine

I really like @steffahn's idea that &(T, U) could be treated like (&T, &U) in patterns. Here's an example where this would be useful:

some_iter
    .enumerate()
    .filter(|(&i, item)| ...)

Currently the pattern must be written as &(i, ref item). Match ergonomics currently don't work here, because i should be an owned usize, but item should be bound by reference. But I think the above could be allowed and would feel both intuitive and consistent.
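For comparison, these are the two spellings that compile today. The function names and the `Vec<String>` input are assumptions chosen to make the example self-contained; the proposed `(&i, item)` form is not shown as code because it is rejected at the moment.

```rust
/// Classic pattern, as required today: `&` matches the reference that
/// `filter` passes in, `i` is copied out, and `ref` re-borrows `item`.
fn even_classic(items: Vec<String>) -> Vec<String> {
    items
        .into_iter()
        .enumerate()
        .filter(|&(i, ref item)| i % 2 == 0 && !item.is_empty())
        .map(|(_, item)| item)
        .collect()
}

/// Match ergonomics: both bindings become references, so `i` is a `&usize`
/// and has to be dereferenced.
fn even_ergonomic(items: Vec<String>) -> Vec<String> {
    items
        .into_iter()
        .enumerate()
        .filter(|(i, item)| *i % 2 == 0 && !item.is_empty())
        .map(|(_, item)| item)
        .collect()
}
```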

Aloso avatar Apr 13 '23 02:04 Aloso

My core objection is that this proposal doesn't fix match ergonomics or make it more consistent. Instead, it piles on yet another hack, with likely more unintuitive interactions. Worse, it does that by modifying the pattern syntax, so it affects not just match ergonomics, but any place where a pattern can be used: function and closure arguments, let bindings. The proposed addition is useless and undesirable in all those other cases.

Some more specific issues:

  • The syntax doesn't make sense. The pattern in match &x already binds to a reference.

    match &Some(3) {
      Some(move x) => {}
      _ => {}
    }
    

    Here x is a reference. The intuitive reading is that we "move a reference", which is absolutely not the intended semantics. This just doubles-down on the current confusing interpretation of mut bindings. Things get even more confusing if the matched field is itself a reference.

  • The semantics don't make sense. Without match ergonomics, everything is clear: we match a struct/enum by value, all fields are exposed, and you can either move them or create a borrow. With match ergonomics, the matched expression is a reference. The only thing one can naturally produce for its fields is references. How can you even move a field which you don't own? So either the field has type allowing it to be moved out of a borrowed state (i.e. it's Copy), in which case one can do this move manually in a match arm, or it should be a compile error "move of borrowed value" anyway.

  • For by-value matches, the new binding mode is useless. It's just syntactic noise, and the only reasonable treatment is to lint against it. Still, you can be sure that some people will ignore/disable this lint, and insist on "explicit moves", so the codebases will be littered with useless moves.

  • The keyword is actually too good to waste on this hack. There are many other potential uses for move, including in patterns, which would be confusing or incompatible with the proposed addition. E.g. there are many discussions about &move T references, which would naturally require ref move binding modes in patterns.
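To make the "do this move manually in a match arm" option from the second point concrete, here is a minimal sketch (function names are illustrative). It only works because `i32` is `Copy`; with a non-`Copy` field both versions would fail with a move-out-of-borrow error.

```rust
/// With match ergonomics `x` is a `&i32`; the copy is done by hand in the arm.
fn unwrap_or_zero_ergonomic(opt: &Option<i32>) -> i32 {
    match opt {
        Some(x) => *x, // explicit dereference copies the `i32` out
        None => 0,
    }
}

/// The fully explicit pattern: its type matches the scrutinee's type.
fn unwrap_or_zero_explicit(opt: &Option<i32>) -> i32 {
    match opt {
        &Some(x) => x, // `x: i32`, copied out because `i32: Copy`
        &None => 0,
    }
}
```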

afetisov avatar Apr 14 '23 11:04 afetisov

I don’t see this as piling on another hack on top of match ergonomics. While I grant this does not make match ergonomics more intuitive, it does fill a hole in the currently existing implementation of match ergonomics. When introducing something that changes a default, it seems to me that specifying something non-default is obviously a necessary element, which was neglected in the match ergonomics RFC. I agree that the current implementation of match ergonomics is confusing, and I believe this is in part because the idea of binding modes is not well-known and not fleshed out, which this proposal aims to rectify partially.

  • Yes, mut is a confusing case, but I believe it becomes less confusing when move exists as a binding mode specifier with similar properties. The most intuitive mental model that fits your description would be @steffahn’s idea. Note also that move is used consistently with the terminology in the match ergonomics RFC.

  • No, the semantics make sense. We are not moving a field we don’t own, we are changing the binding mode to the moving mode, which is the default. This causes a copy when the value is Copy, and an error otherwise. This is an argument over nomenclature: Should we say that a copy is strictly not a move, or that a move can be a copy? I believe that saying that a copy is a move that can be used in its original place afterward is a fine mental model. Generally I find this argument is much more an argument against match ergonomics as-is and less an argument against this proposal.

  • I don’t think so. The same would be possible with existing match ergonomics (explicit ref) but I have not come across this (feel free to educate me). This proposal actually makes this less likely as linting against unnecessary ref and ref mut is part of it.

  • I don’t see how this keyword is incompatible with that proposed addition. ref move is well-distinct from move. In fact, I believe even a differentiation between ref mut and mut ref would be perfectly understandable, as is suggested in the alternatives section.

I grant that you present strong points against the current implementation of match ergonomics. But a replacement of match ergonomics can happen even after this proposal, and this proposal can serve a purpose in providing more flexible match ergonomics in the mean-time.

If you believe that match ergonomics is just a misfeature, I very much disagree with you. Using ref is less readable, more noisy, and tedious to write. Match ergonomics as they exist aren’t perfect but they have made getting references to elements of data-structures much more intuitive. As @nikomatsakis has said, match ergonomics has been embraced by people learning Rust. I can also attest that I first learned about match ergonomics (before knowing they were called that), and only afterwards discovered ref and was confused by it, not match ergonomics. Therefore I feel that something like match ergonomics should exist, even if the corner-cases match ergonomics currently introduces are unintuitive.

schuelermine avatar Apr 14 '23 17:04 schuelermine

I don’t see this as piling on another hack on top of match ergonomics. While I grant this does not make match ergonomics more intuitive, it does fill a hole in the currently existing implementation of match ergonomics.

The whole (supposed) point of match ergonomics was/is to make patterns more intuitive. If any addition fails this criterion, then it's pointless.

Since moving and referencing can already be accomplished by being explicit about the pattern, there's no need for a completely redundant way to do the same thing with a different syntax. It simply doesn't carry its weight, and since you need to be explicit anyway, it doesn't make anything easier at all.

H2CO3 avatar Apr 14 '23 18:04 H2CO3

Using ref is less readable, more noisy, and tedious to write

If ref is unreadable, noisy, and tedious to write, then why isn't the same true for move (which is longer)? This argument is inconsistent.

H2CO3 avatar Apr 14 '23 18:04 H2CO3

We are not moving a field we don’t own, we are changing the binding mode to the moving mode, which is the default.

This phrase only makes sense if you know exactly the algorithm of match ergonomics. The point of match ergonomics is exactly that the user does not need to know or care about "binding modes". You match a reference, you get a reference, it's as simple as that. Once you get into "binding modes" you'd be better served by writing out the classic derefed match with ref and ref mut explicitly.

If you don't know about "binding modes", the proposed syntax makes zero sense.

This is an argument over nomenclature: Should we say that a copy is strictly not a move, or that a move can be a copy?

That's pedantic. Nobody cares about that, beyond the first steps in Rust when they learn that every move is a copy, but the compiler may prevent you from using the original.

The same would be possible with existing match ergonomics (explicit ref) but I have not come across this (feel free to educate me).

I have seen plenty of such code in the wild. Most often it's a codebase which was started before match ergonomics were implemented. Sometimes it's a preference of the people writing it. In those cases match ergonomics aren't used, and the matching is done old-style, with a dereference on the scrutinee and explicit ref/ref mut bindings.

I have seen people argue that old-style matching makes it more explicit which of the bindings are moved, and which are borrowed. With inferred types of scrutinees, I tend to agree, though I still prefer match ergonomics, both because it's shorter and because I find the binding modes very counterintuitive.
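For anyone who has not seen the two styles side by side, a minimal sketch (function names are illustrative only):

```rust
/// Old-style matching: dereference the scrutinee and spell out `ref mut`.
fn increment_classic(slot: &mut Option<u32>) {
    match *slot {
        Some(ref mut n) => *n += 1, // n: &mut u32, stated explicitly
        None => {}
    }
}

/// Match ergonomics: the `&mut` scrutinee makes `n` a `&mut u32` implicitly.
fn increment_ergonomic(slot: &mut Option<u32>) {
    match slot {
        Some(n) => *n += 1, // n: &mut u32, inferred from the scrutinee
        None => {}
    }
}
```

Both functions behave identically; the old style simply makes the borrow visible in the pattern itself.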


That said, I like @steffahn 's suggestion (the alternative you listed). I have often wondered why match ergonomics doesn't already work this way. Is there a reason beyond "nobody implemented it"?

afetisov avatar Apr 14 '23 18:04 afetisov

If ref is unreadable, noisy, and tedious to write, then why isn't the same true for move (which is longer)? This argument is inconsistent.

I’m not suggesting putting move everywhere. What I’m suggesting is reducing noise by setting the default binding mode to the most common one and explicitly specifying the binding sites that deviate from this.

schuelermine avatar Apr 14 '23 18:04 schuelermine

That said, I like @steffahn 's suggestion (the alternative you listed). I have often wondered why match ergonomics doesn't already work this way. Is there a reason beyond "nobody implemented it"?

I have talked to other community members on the community Discord Server about this and there seems to be no clear answer. Perhaps no-one came up with this during the match ergonomics RFC period. However, there may be subtle problems in the implementation, which is why I didn’t feel confident to suggest this change. I feel this proposal is much more conservative as it suggests exposing functionality that is already present (mut’s effect on match ergonomics) and does not require an edition boundary.

schuelermine avatar Apr 14 '23 18:04 schuelermine

I have seen plenty of such code in the wild. Most often it's a codebase which was started before match ergonomics were implemented. Sometimes it's a preference of the people writing it. In those cases match ergonomics aren't used, and the matching is done old-style, with a dereference on the scrutinee and explicit ref/ref mut bindings.

I was talking about unnecessary ref or ref mut, when match ergonomics already set them as the default. Have you seen that?

schuelermine avatar Apr 14 '23 18:04 schuelermine

Well, the point of match ergonomics is to avoid writing ref/ref mut. So no, I haven't seen those used together.

Please don't spread your replies over many messages; it makes it hard to keep track of the discussion.

afetisov avatar Apr 14 '23 19:04 afetisov

I kind of like the move keyword option a bit more than the &x option, in that &x is known to be confusing, but I am a bit worried about the choice of move keyword here.

Regardless of choice of keyword (let's go with move for the sake of the argument), if people learn that move x can be used in certain contexts to turn what used to be a reference x: &i32, without the word move, into an owned x: i32, there will IMO inevitably be confusion as to why that doesn't work consistently everywhere. It will intuitively function like a dereferencing pattern, because the thing that it intuitively does is dereferencing - the fact that the "true" functionality would be to affect the "default binding mode" is a language design detail that most people will be unaware of, or at least not know in full detail.

Of course, assuming this outcome is realistic, it would be a bad outcome, because we already have true dereferencing patterns &x and &mut x, so if the common perception would be that move x is effectively a third kind of dereferencing pattern that must be used in certain cases, whereas it cannot be used in others, that seems to me like bad language design and unnecessary complication. E.g. iterating over HashMap<i32, String>, if you want k: i32 and v: &String, you would need to write .iter().for_each(|(&k, v)|) but for Vec<(i32, String)> it would be .iter().for_each(|(move k, v)|) – a differentiation in contrast to the case of a simple (k, v) pattern, which can be used for both consistently and with the consistent outcome k: &i32 and v: &String.

If we follow the argument that &x is more confusing than something like move x, which may very well be true, as reference patterns are known to be a slightly hard thing to learn due to the dual nature of patterns, then someone might even come to prefer move x and/or learn about it first, and then wonder why they cannot simply write .iter().for_each(|(move k, v)|) in the HashMap<i32, String> case - or possibly even - why they can still write it, but with vastly different outcome (as move k would change nothing compared to k).
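The HashMap/Vec contrast can be written out in Rust that compiles at the time of this thread. The function names and the summing logic are invented for the sketch; the Vec case uses today's explicit `&(k, ref v)` form where the RFC would permit `(move k, v)`.

```rust
use std::collections::HashMap;

/// `HashMap::iter` yields `(&i32, &String)`, so `&k` is a true
/// dereferencing pattern here.
fn total_from_map(map: &HashMap<i32, String>) -> i32 {
    let mut total = 0;
    map.iter().for_each(|(&k, v)| {
        // k: i32, v: &String
        total += k + v.len() as i32;
    });
    total
}

/// `slice::iter` yields `&(i32, String)`, so `(&k, v)` is rejected here.
/// Today the explicit form is `&(k, ref v)`.
fn total_from_vec(pairs: &[(i32, String)]) -> i32 {
    let mut total = 0;
    pairs.iter().for_each(|&(k, ref v)| {
        // k: i32, v: &String
        total += k + v.len() as i32;
    });
    total
}
```

A plain `(k, v)` pattern would be accepted by both closures, with `k: &i32` and `v: &String` in each case.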


I feel this proposal is much more conservative as it suggests exposing functionality that is already present (mut’s effect on match ergonomics) and does not require an edition boundary.

Regarding edition boundaries... the kind of syntax that might change in meaning over an edition is mut x and ref x/ref mut x in by-reference default binding mode contexts. These are of limited usefulness anyways, their new respective meanings would be to create a reference held in a mutable variable, and to create a reference to a reference. Both are operations that used to be completely impossible to achieve anyways with normal patterns and binding modes before any match ergonomics, so they cannot be all that important.

Without/before an edition boundary, we can simply lint against all this syntax that would change in meaning (because it would then become a corner case incompatible with the new concept of how match ergonomics operate, only kept for backwards compatibility). Even after an edition we could still consider just having it error instead of introducing the new consistent meaning of mut x/ref x/ref mut x in these contexts. If there's not much of a practical need to ever bind to references in mutable variables or references-to-references, that would not be much of a restriction.

steffahn avatar Apr 15 '23 00:04 steffahn

@steffahn That's why const (not mut) is an alternative to move/value (not ref). It is already in use for raw pointers, and the connotation of immutability is clear

VitWW avatar Apr 15 '23 00:04 VitWW

@VitWW I don’t think const is a good keyword, since it usually implies knowledge about the value at compile time. Raw pointers are the exception to this, not the rule. Also, it highlights that the value is immutable, as opposed to mut, but mut’s behaviour here is already confusing, and we’d rather highlight that mut and move don’t bind by reference but by value.

schuelermine avatar Apr 18 '23 15:04 schuelermine

May I leave my two cents?

I think Rust should go in the opposite direction. Rely solely on match ergonomics and deprecate the ref keyword.

In my experience, I manage to never use the ref keyword (almost nobody in our team/project uses it). Almost all code can be written without ref in match, and exceptions are so rare that the language simplification outweighs these rare inconveniences.

stepancheg avatar Apr 20 '23 16:04 stepancheg

Rely solely on match ergonomics and deprecate the ref keyword.

That would be massively harmful, primarily in unsafe code. Only having an option where types do not match is unacceptable in any language seeking correctness to any degree.

H2CO3 avatar Apr 20 '23 18:04 H2CO3

@H2CO3 Can you elaborate on that further? With this RFC match ergonomics are almost identical in power to "Rust 2015 matching", meaning that Rust 2015 matching can almost be deprecated. I don't know how any of this has anything to do with unsafe or types that don't match.

CryZe avatar Apr 20 '23 18:04 CryZe

With this RFC match ergonomics are almost identical in power to "Rust 2015 matching"

First, I don't care about "power"; it's not what I am talking about. I am talking about situations where whether or not a type is a reference is important, and you want to be explicit about types rather than letting the compiler second-guess you. In unsafe, this can cause problems (read: even UB). Here's a trivial demonstration where the call is intended to overwrite x = 137 with the value 42 but fails to, due to match ergonomics. The equivalent with explicit patterns works as intended. This is "only" a logic bug, but if the value of x were relied on by unsafe code, it can trivially lead to UB. E.g. you can rewrite the above example with 0 and a non-zero value instead, which then gets fed into NonZeroU8::new_unchecked(). Here it is.

meaning that Rust 2015 matching can almost be deprecated

There's no need to deprecate anything. Just because there's a convenience feature, it doesn't mean that the option of being explicit should be taken away from people. This is exactly the kind of slippery slope I and many others had been worrying and warning about when match ergonomics was introduced.

H2CO3 avatar Apr 20 '23 18:04 H2CO3

@H2CO3 your example shows how match ergonomics might be dangerous in an unsafe context. OK.

How would ref make code safe? You didn't use ref in your examples.

There's no need to deprecate anything.

I explained the reasons: ref is needed rarely, code can be written without it, and we do not need to have two ways to write the same code. Maybe these are not compelling reasons, I don't know.

If you are arguing that match ergonomics need to be deprecated instead, this is a different topic.

stepancheg avatar Apr 20 '23 20:04 stepancheg

If you are arguing that match ergonomics need to be deprecated instead

I was not; refrain from putting words into my mouth.

How would ref make code safe? You didn't use ref in your examples.

Being explicit about the types would make the code safe. That may or may not involve ref.

H2CO3 avatar Apr 20 '23 20:04 H2CO3