rfcs icon indicating copy to clipboard operation
rfcs copied to clipboard

Extending deref/index with ownership transfer: DerefMove, IndexMove, IndexSet

Open nikomatsakis opened this issue 10 years ago • 89 comments

It is clear that there is a need for the ability to move out of smart pointers and indexable things (DerefMove, IndexMove). The other frequently desired option is to IndexSet, which would be a special-cased version for indexing in this situation:

map[key] = value

currently, this is handled via IndexMut, but that is sub-optimal, because if the key is not already part of the map, the result is a panic.

Basic plan

DerefMove/IndexMove/IndexSet should layer on top of the traits we have now. One needs to be careful here because of subtle interactions with autoref and so forth.

Postponed RFCs

  • https://github.com/rust-lang/rfcs/pull/178
  • https://github.com/rust-lang/rfcs/pull/159
  • https://github.com/rust-lang/rfcs/pull/1129 (map[key] = value)

nikomatsakis avatar Mar 21 '15 11:03 nikomatsakis

I'd like DerefMove for similar reasons that Servo wants it: https://github.com/servo/servo/issues/3868

Namely, I have an type Foo, which owns a raw pointer from a crate that has FFI bindings:

struct Foo {
    handle: my_ffi_crate::foo // foo is a *mut to an uninhabited enum
}

and a "borrowed" version of that called BFoo:

struct BFoo<'a> {
    handle: my_ffi_crate::bfoo, // bfoo is the *const equivalent of my_ffi_crate::foo
    _phantom: PhantomData<&'a Foo>
}

Foo is the Rust equivalent of a type in the C library that's a flexible data storage type: think something like a C type that works like JSON. The C library supports things like: list indexing, map lookup, etc. which, in the Rust crate, have signatures like fn get<'a>(&'a self, key: Key) -> BFoo<'a>.

Most of the immutable "accessor" methods of Foo have to be implemented for both Foo and BFoo<'a> (a map might be from keys to more maps), which is annoying. It would make sense, I believe, to implement DerefMove<Target=BFoo<'a>> for &'a Foo, and then just implement those methods on BFoo<'a>. Of course, that might also require some additional autoref/autoderef magic.

tomjakubowski avatar Aug 03 '15 22:08 tomjakubowski

A potential problem for the design of DerefMove - How would unsized types be handled? (e.g. FnOnce)

thepowersgang avatar Nov 20 '15 07:11 thepowersgang

Missing from this list is also IndexGet, which is the "returns a value" version of Index.

Gankra avatar Nov 30 '15 04:11 Gankra

@Gankro what is the difference between your IndexGet versus the IndexMove in the list ?

pnkfelix avatar Nov 30 '15 12:11 pnkfelix

Ah I assumed IndexMove took self by-value (given the comparison to DerefMove), but if it takes &mut self, then never mind.

Gankra avatar Nov 30 '15 13:11 Gankra

what does the RFCs being postponed mean? will they simply be back on track as soon as someone has made up a coherent idea of how everything should play together and writes it up?

flying-sheep avatar Mar 17 '16 11:03 flying-sheep

@flying-sheep it means "we're not rejecting this RFC because it's bad, now is just not the right time." This happened a lot before 1.0, so we could focus on what needed to ship in 1.0.

And yes, if there's a postponed RFC, someone needs to go through and evaluate if it's still valid in its current form, or update it, and re-submit it.

steveklabnik avatar Mar 17 '16 12:03 steveklabnik

i’m glad that IndexGet/IndexMove will be a thing.

things like ndarray profit from it. generally, all data structures where “slice” views can’t be rust slices.

should i try to update the “remaining Index traits” RFC (#159)?

if so:

  • which name? the IndexRef/IndexValue train is gone, and @Gankro was confused by IndexMove, so maybe IndexGet?
  • better do index_set as default-implemented trait for IndexMut or in a dedicated IndexSet?
  • also merge in IndexAssign (#1129)?

flying-sheep avatar Mar 17 '16 13:03 flying-sheep

I would say that IndexGet is the clearest, because of the ease of implementing it for both T and &T. IndexMove is too restricted as a name, in that you might not actually want to move anything, but instead you might want to copy.

Perhaps it would be possible to use type inference to decide whether to use Index or IndexGet. Example:

let some_array: [i32; 4] = [1, 2, 3, 4];
let some_slice = &some_array[0..3];
let x = some_slice[2]; // x: &i32 (default behavior)
let y: i32 = some_slice[2]; // y: i32 (would use `IndexGet` to pass by value

This would more or less follow from the current inference of IndexMut.

I do see the potential for compiler confusion in choosing which trait to use for a given operation, but I would at least appreciate a default implementation of IndexGet<E, R> for &[R] where R: Copy. Syntactic sugar ([E]) aside, this would let us assume that slices would be acceptable, and additional implementors of IndexGet could define their own behavior (whether they want to copy, clone, whatever).

I would also like to mention that even though copying-when-indexing is antithetical to C/C++, it is a very common idiom when it comes to Ruby, etc. I don't think this is a weakness in Ruby, but in fact allowing the end-user to use one idiom or other other should help ease the learning curve for people coming from dynamic languages that assume copy.

One of my greatest frustrations with the Ruby language is not knowing whether array indexing was happening by-value or by-reference. I think if Rust could acknowledge both of these domains, and allow programmers the explicit option of choosing one or the other, it would go a long way.

Edit: thanks @comex for pointing out that I indexed a stack array rather than a slice initially

andrewcsmith avatar Apr 01 '16 18:04 andrewcsmith

@andrewcsmith some_slice[2] is already i32 - you need &some_slice[2] for &i32. The indexing operator carries an implicit dereference.

(some_slice is also not a slice...)

comex avatar Apr 02 '16 06:04 comex

Theory

Both deref and index are for types containing values of other types. They allow to conveniently access contained value. The main difference is that deref types usually contain only one instance of other type, while index types usually contain multiple instances. There are actually many ways to access internal instance, and index/deref families should cover most of them.

There are two cases currently supported:

(1) accept ref, return ref (Deref/Index) (2) accept mut ref, return mut ref (DerefMut,IndexMut)

There is one case requested for derefs:

(3) consume value, return value (DerefMove?,IndexMove?)

It's an open question whether or not we need Index counterpart. I personally don't think it will be very useful.

There are two more index cases requested:

(4) accept ref, return value (IndexGet?) (5) accept mut ref, accept value, set value (IndexSet?)

They can be used for collections not actually storing value instances (e.g. bit array setting/returning bools). Case 5) can also be used to insert new values to hashmaps, which I think is a highly desirable feature. These cases apparently do not need deref counterparts.

Practice

Now let's get to practice, and please correct me if I'm wrong about current rules. Here I assume that S is source type, T is target type and I is index type. s, t and i will be corresponding instances.

Deref

Current deref rules

Current manual deref follows simple rules:

  1. &*s- perform Deref
  2. &mut *s - perform DerefMut
  3. *s and S is Copy - perform Deref and copy Note that if there is no DerefMut and S is Copy, then &mut *s fails, but &mut {*s} succeeds. If T is not Copy, then *s will fail. Copyability of S does not matter.

Autoderef follows the same rules, but also depends on context:

  1. If context needs &T - perform Deref
  2. If context needs &mut T - perform DerefMut
  3. If context needs value of T and T is Copyable - perform Deref and copy. Note that rule autoderef rule 3) is somewhat inconsistent, it works for self in method calls, but does not work for arguments in function calls. If context needs value and T is not Copy, then it fails. Copyability of S does not matter.

Proposed deref rules

Deref rules proposed in #178, in short, are: try to move copyables, try to ref non-copyables, otherwise use whatever possible. I do mostly agree with the proposed rules, but don't really like 'use whatever possible' part. #178 says that, for example, if there is no Deref, but DerefMove, then it should be used for all derefs. Let's imagine that there is no Deref, but T is Copy. In this case &*s would perform DerefMove and take reference of the result, which is both non-intuitive and contradicting with how Deref currently works (see my note about missing DerefMut). I'd say it's better to fail in this case. Another, even simpler option is to make Deref implementation mandatory for types implementing DerefMove. I think it's reasonable and makes everything easier.

Proposed manual deref rules:

  1. &*s - perform Deref
  2. &mut *s - perform DerefMut
  3. *s and S is Copy and implements DerefMove - copy source and perform DerefMove
  4. *s and T is Copy and S implements Deref - perform Deref and copy result
  5. *s and neither 3) or 4) apply - perform DerefMove

Proposed autoderef rules:

  1. If context needs &T - perform Deref
  2. If context needs &mut T - perform DerefMut
  3. If context needs value of T, S is Copy and implements DerefMove - copy source and perform DerefMove
  4. If context needs value of T, T is Copy and S implements Deref - perform Deref and copy result
  5. If context needs value of T, but neither 3) or 4) apply - perform DerefMove

If rule condition is satisfied, but required trait is not available, then that's a failure and compiler should not try to apply other rules (that's the only thing different from #178)

It is possible to support DerefGet/DerefSet, but, then again, it's noticeably ramps complexity and likely not very useful. If needed, DerefGet/DerefSet should follow strategy for IndexGet/IndexSet described below.

Indexing

Current indexing rules

Indexing is more or less follow deref, but is generally simpler, because there is no autoderef counterpart.

Current rules for indexing:

  1. &s[i] - perform Index
  2. &mut s[i] - perform IndexMut
  3. s[i] and T is Copy - perform Index and Copy If T is not Copy, then s[i] will fail.

Proposed indexing rules

This extends both #159 and #1129, adds more details and answers some unresolved questions.

Main problem is IndexGet. Option of having it adds lots of ambiguity into rules. Also, possiblity of different implementation of Index and IndexGet scares me. I think that if collection need IndexGet, then having Index/IndexMut/IndexMove does not make sense.

I propose to make IndexGet and Index/IndexMut/IndexMove mutually exclusive. Types implementing IndexGet can have automatically generated default implementation of Index/IndexMove for trait compatibility purposes, but manually implementing Index/IndexMut/IndexMove should be forbidden.

With mutually exclusive IndexGet and Index/IndexMut/IndexMove, indexing rules actually become pretty simple.

For types implementing IndexGet:

  1. s[i] - perform IndexGet
  2. s[i] = expr (asssignment) - perform IndexSet
  3. s[i] += expr (compound assignment) - perform IndexGet, apply non-compound operation, then IndexSet result With these rules &s[i] and &mut s[i] are still possible, but they represent references to temporary value returned by IndexGet. Note that it differs from the way Deref works, but I think it worth it. For deref it is not as useful because autoderef can handle most of function interface compatibility issues. Indexing does not have autoderef counterpart and therefore it's not wise to prohibit taking references.

For all other types:

  1. &s[i] - perform Index
  2. &mut s[i] - perform IndexMut
  3. s[i] = expr (asssignment) - perform IndexSet (note that it should NOT fallback to IndexMut)
  4. s[i] += expr (compound asssignment) - perform IndexMut, apply compound operation on it's result (note that it should NOT fallback to IndexSet)
  5. s[i] and T is Copy - perform Index and Copy
  6. s[i] and T is not Copy - perform IndexMove

Just like with derefs, if rule condition is satisfied, but required trait is not available, then that's a failure and compiler should not try to apply other rules.

Note that I'm not sure if we need IndexMove (consume value, return value), but I list it here for completeness sake.

Indexing examples

// Bits is a tightly packed bits array, implements IndexGet/IndexSet
println!("first bit is: {}", bits[0]); // bits.index_get(0)
bits[1] = true; // bits.index_set(1, true)
bits[2] ^= true; // bits.index_set(2, (bits.index_get(2) ^ true))
let tref = &bits[3] // &bits.index_get(3), reference to temporary

// Array implements Index/IndexMut/IndexSet
println!("first element is: {}", array[0]) // *array.index(0)
let second = &array[1] // array.index(1), ref of second element
let third = &mut array[2] // array.index_mut(2), mutable ref of third element
array[3] = 2; // array.index_set(3, 2)
array[4] += 3; // (*array.index_mut(4)) += 3

// Map implements:
//   Index/IndexMut for Index types where Key impl Borrow<Index>
//   IndexSet/IndexMove for Index types where Key impl From<Index>
let vref = &array["key"] // array.index("key"), ref of value
let vrefmut = &mut array["key"] // array.index_mut("key"), mutable ref of value
map["new key"] = 1; // map.index_set("new key", 1), creates new entry
map["new key"] += 2 // (*map.index_mut("new key")) += 2, modifies existing entry
return map["key"]; // map.index_move("key"), consumes map
// failures:
&map["missing key"] // map.index("missing key"), fails
&mut map["missing key"] // map.index_mut("missing key"), fails
map["missing key"] += 2 // (*map.index_mut("missing key")) += 2, does not insert value, fails

P.S.

  • It's the longest comment I've ever wrote
  • If necessary, I can create a new rfc with even more implementation details
  • Or I can start hacking, you know =)

VFLashM avatar Jun 15 '16 20:06 VFLashM

In the case of indexing, what would happen to s[i].do(), for some impl T {fn do(&self);}?

would it become s.index_move(i).do() by rule 6 or perhaps worse s.index(i).clone().do() by rule 5, where clone has the Copy implementation

obviously it would be nice if it became (&s[i]).do()

spiveeworks avatar Jun 08 '17 02:06 spiveeworks

Wow, it's been a while.

You're absolutely right, that's a very important use case I missed.

I guess rules have to be modified like this to make it work:

  1. &s[i] or s[i].method(...) where method needs &self - perform Index
  2. &mut s[i] or s[i].method(...) where method needs &mut self - perform IndexMut
  3. ...

I hate to see how complex it gets, but still can't find a more usable solution.

VFLashM avatar Jun 08 '17 02:06 VFLashM

The same goes for Borrow. The Borrow trait is pretty useless as it is now. It even needs a workaround in the language to work for &[] and &str as the conversion creates a (hidden) new type.

chpio avatar Jun 08 '17 10:06 chpio

Given that we're in the impl Period, is there any motion to make forward progress on this? In addition to the heavily discussed stuff, this issue seems to be a blocker for things like Box<FnOnce> which have been a sore spot for years now.

I also don't see a clear summary of what is blocking this, in case someone wanted to hack on this, which I think means the design work was never completed? Just trying to get a feel for the situation.

coder543 avatar Oct 22 '17 21:10 coder543

IndexGet plus IndexSet should not be invoked together, say by s[i] += ... It'd become a nightmare for performance when using larger numerical types, like even 256 bits, which may or may not be copy.

A priori, I'd think indexing into collection types should return Entry/Guard like types that provide access to the memory using DerefMut or DerefMove, whatever that looks like. Adding this IndexGet and IndexSet zoo sounds problematic, especially when HashMap's Entry took so much effort to hammer out recently.

burdges avatar Oct 22 '17 23:10 burdges

Hello. I'd like this to move forward (especially because of Box<FnOnce>). Is there anything I could do to help to move forward?

vorner avatar Jan 17 '18 19:01 vorner

@vorner A formal RFC needs to be written, possibly based off of @VFLashM's comment.

alercah avatar Mar 22 '18 20:03 alercah

Anything happened with this lately?

alexreg avatar May 01 '18 21:05 alexreg

Random thought while getting myself up to speed---IndexMove may not be desirable to overload on existing syntax, because the effect of the move is necessarily dynamic. I can definitely see someone doing something like:

let s = m[i];
{ ... }
let t = m[i];

and getting confused that the latter fails if the result type isn't copyable. This can't be caught at compile-time, since we don't actually move out of m. So perhaps we should require explicitly invoking moves on indexing operations, such as let s = m[move i]?

alercah avatar May 01 '18 22:05 alercah

IndexMove would consume m - that would fail to compile since m no longer exists on the third line.

sfackler avatar May 01 '18 22:05 sfackler

Err, right. I forget that case is useful. In that case, I meant IndexGet---the one that moves out of a container without consuming it.

alercah avatar May 01 '18 22:05 alercah

@alercah Moving out is consuming it, as far as I can tell. Unless you mean something different by "consuming it"...

alexreg avatar May 01 '18 22:05 alexreg

Ah, IndexGet isn't quite that. It's designed for cases where you have a data structure which can't return a reference into it (e.g. a bitset). The signature is fn index_get(&self) -> Self::Item, so the underlying datastructure isn't modified.

sfackler avatar May 01 '18 22:05 sfackler

Ahhh, right, ok! That makes sense, then. :)

What's the use case of IndexMove, then?

alercah avatar May 02 '18 01:05 alercah

I think IndexMove is more of an intellectual curiosity than a thing we have a concrete usecase for.

On Tue, May 1, 2018 at 9:12 PM, Alexis Hunt [email protected] wrote:

Ahhh, right, ok! That makes sense, then. :)

What's the use case of IndexMove, then?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rust-lang/rfcs/issues/997#issuecomment-385835902, or mute the thread https://github.com/notifications/unsubscribe-auth/ABFY4LLu26WeXqHBvZ9Qzqm68KqYycWEks5tuQf1gaJpZM4Dybu7 .

Gankra avatar May 02 '18 13:05 Gankra

I was under the impression that IndexMove was consuming the container: fn index_move(self, index) -> Self::Item, such that one could use

vec[i]

rather than the current nicest way (as far as I'm aware?)

vec.into_iter().nth(i).unwrap()

alecmocatta avatar May 02 '18 14:05 alecmocatta

Wouldn’t it do the equivalent of remove(index) for collections?

alexreg avatar May 02 '18 14:05 alexreg

@alexreg It is equivalent to the extent that it returns a T without requiring Copy/Clone. It differs in that remove takes &mut collection, and does does nontrivial work – shifting all the remaining elements 1 to the left. One could continue to use the collection of remaining elements.

The IndexMut I described consumes the collection, but need not perform the potentially expensive shifting. As it's been consumed, one can no longer use it nor the other elements it might have contained.

alecmocatta avatar May 02 '18 14:05 alecmocatta

@alecmotta Right. But with IndexMove, it would just be statically guaranteed that (for example) iterating over the collection after an IndexMove operation would not include that element, or what?

alexreg avatar May 02 '18 15:05 alexreg