Arbitrary self types v2
This PR suggests small changes to the existing unstable "aribtrary self types" feature to make it more flexible. In particular, it suggests driving this feature from a new (ish) Receiver trait instead of from Deref, but to maintain compatibility by having a blanket implementation for all Deref types.
This is a squashed commit of many edits by various folks including @Urhengulas, @Veykril , @madsmtm and myself. Thanks also to @davidhewitt, @Manishearth and many folks over on Zulip for feedback.
another good use case for arbitrary self types that may be worth mentioning in the rfc:
#[derive(Default)]
pub struct GCArena {
objects: Vec<Rc<dyn Any>>,
}
impl GCArena {
pub fn alloc<T: 'static>(&mut self, value: T) -> Gc<T> {
let rc: Rc<T> = Rc::new(value);
let weak = Rc::downgrade(&rc);
self.objects.push(rc);
Gc(weak)
}
}
#[derive(Clone)]
pub struct Gc<T: 'static + ?Sized>(Weak<T>);
impl PartialEq for Gc<T: 'static + ?Sized> {
fn eq(&self, other: &Gc<T>) -> bool {
self.0.ptr_eq(&other.0)
}
}
impl Eq for Gc<T: 'static + ?Sized> {}
impl Hash for Gc<T: 'static + ?Sized> {
fn hash<H: Hasher>(&self, state: &mut H) {
(self.0.as_ptr() as *const ()).hash(state);
}
}
impl<T: 'static + ?Sized> Receiver for Gc<T> {
type Target = T;
}
impl<T: 'static + ?Sized> Receiver for Gc<T> {
pub fn get(&self) -> Rc<T> {
self.0.upgrade().expect("GCArena has been dropped")
}
}
// demo of using Gc:
pub struct Node<T> {
pub edges: Vec<Gc<Node<T>>>,
pub data: T,
}
impl<T> Node<T> {
pub fn walk(self: &Gc<Self>, seen: &mut HashSet<Gc<Self>>, f: &mut impl FnMut(&Gc<Self>)) {
if seen.insert(self.clone()) {
f(self);
for i in &self.get().edges {
i.walk(seen, f);
}
}
}
}
@programmerjake I think this is now covered by the paragraph I wrote that is now part of the RFC?
For example, taking &Arc<T> allows me to both clone the smart pointer (noting that the underlying T might not implement Clone) in addition to access the data inside the type, which is useful for some methods. Also, being able to change a method from accepting &self to self: &Arc<Self> can be done in a mostly frictionless way, whereas changing from &self to a static method accepting &Arc<Self> will always require some amount of refactoring.
This feels like the same case you present: you want the ability to both access the inner type and clone the smart pointer, and simply accessing the inner type isn't enough.
This feels like the same case you present: you want the ability to both access the inner type and clone the smart pointer, and simply accessing the inner type isn't enough.
well, i had intended to demonstrate a type where you can't just impl Deref, since the inner value may have already been dropped -- I didn't do a very good job of that.
another use case: a type where the inner value doesn't exist locally, e.g. for RPC with promise pipelining:
pub struct Id(u32);
pub struct Promise<T>(Id, PhantomData<T>);
impl<T> Receiver for Promise<T> {
type Target = T;
}
pub struct MyRemoteData;
pub struct SomeIntermediate;
impl MyRemoteData {
pub fn get() -> Promise<Self> {
...
}
pub fn foo(self: Promise<Self>, a: i32) -> Promise<SomeIntermediate> {
...
}
}
impl SomeIntermediate {
pub fn bar(self: Promise<Self>) -> Promise<()> {
...
}
}
pub async fn demo() {
MyRemoteData::get().foo(5).bar().await;
}
All, I just wanted to say thanks for all the discussion so far! @Urhengulas and I are working through the various things one at a time - sorry if some of your comment threads are awaiting an RFC update from us, we just haven't got to them all yet. (I'm currently poking at lifetime elision.)
@rustbot labels +I-lang-nominated
I'm nominated to take the temperature of the @rust-lang/lang team on this RFC -- I'd like to see it move forward! I'm going to give it a good read before our meeting
Gave this a quick read. One thing I would note is that I think that the RFC should highlight more that it also enables *const self methods -- this is a big win for unsafe code, in my opinion, for all the same reasons. Right now code that wants to take a raw pointer and doesn't want to guarantee reference validity is pretty stuck.
One small point regarding stabilization: the purpose of the current Receiver trait is to tell the compiler which types can be used as a method receiver from code that doesn't have the arbitrary_self_types feature enabled. This way, types like Rc, Arc, and Pin can be defined in std and marked as vetted receiver types that can be used from stable user code, but arbitrary types that implement Deref<Target=Self> cannot be used as method receivers without arbitrary_self_types enabled.
This RFC proposes adding a blanket impl of the new Receiver trait for anything that implements Deref. With that blanket impl, the new Receiver trait will not be useful as a way to mark those vetted receiver types, because every type that implements Deref will implement Receiver. If we want to avoid insta-stabilizing the ability to use arbitrary types that implement Deref as method receivers, we will need some way to tell which ones are the vetted receiver types and which ones aren't, likely with another marker trait that serves the same purpose as the existing Receiver trait. (This marker trait can be removed once arbitrary_self_types is stabilized.)
Another note: I'd like to see some more details of how method resolution and deref coercion would work with this change. Currently, it says that the algorithm would remain the same, but instead of using the Deref trait, it would use the new Receiver trait. I'm not sure this is quite right, because some method calls require deref coercions, and this new trait does not imply Deref.
In the following example, the Deref trait would still need to be involved to check if the calls to frobnicate_ref are allowed.
struct Foo;
impl Foo{
fn frobnicate_ref(&self) {}
fn frobnicate_cpp_ref(self: CppRef<Self>) {}
}
fn main() {
let foo_rc: Rc<Foo> = Rc::new(Foo);
// this compiles fine, and desugars to `Foo::frobnicate_ref(&*foo_rc)`
foo_rc.frobnicate_ref();
let foo_cpp_ref: CppRef<Foo> = CppRef::new(Foo);
// should not compile because it would desugar to `Foo::frobnicate_ref(&*foo_cpp_ref)`
// and you can't deref a CppRef
foo_cpp_ref.frobnicate_ref();
// should compile, because we're not dereffing the CppRef
// desugars to `Foo::frobnicate_cpp_ref(foo_cpp_ref)`
foo_cpp_ref.frobnicate_cpp_ref();
}
Aside: thanks @adetaylor and everyone that's helped for taking on the challenge of this RFC. I can tell there's been a lot of effort put in already and a lot of details have been thought out, and it's pretty nice to be able to walk in now, read a well put-together RFC and give a little bit of feedback.
With that blanket impl, the new
Receivertrait will not be useful as a way to mark those vetted receiver types
Assuming we gate the new behavior behind arbitrary_self_types, we can:
- Have a blanket
impl Receiver for T: Derefgated behindarbitrary_self_types - Retain
impl Receiver for Box<T>etc. gated behindreceiver_traitI'm finding we need to do something like that to maintain compatibility with the bootstrap compiler anyway.
@adetaylor it's been a while since I was involved in compiler development, but I don't remember it being possible to gate an impl behind a feature gate. If it is, then maybe that could work.
In the following example, the
Dereftrait would still need to be involved to check if the calls tofrobnicate_refare allowed.
Using my current playing around branch (which is roughly equivalent to what's proposed in this RFC, though the feature gates are different):
#![feature(receiver_trait)]
use std::rc::Rc;
struct Foo;
struct CppRef<T: ?Sized>(T);
impl<T> CppRef<T> {
fn new(t: T) -> Self {
Self(t)
}
}
impl<T: ?Sized> std::ops::Receiver for CppRef<T> {
type Target = T;
}
//impl<T: ?Sized> std::ops::Deref for CppRef<T> {
//type Target = T;
//fn deref(&self) -> &Self::Target {
//&self.0
//}
//}
impl Foo{
fn frobnicate_ref(&self) {}
fn frobnicate_cpp_ref(self: CppRef<Self>) {}
}
fn main() {
let foo_rc: Rc<Foo> = Rc::new(Foo);
// this compiles fine, and desugars to `Foo::frobnicate_ref(&*foo_rc)`
foo_rc.frobnicate_ref();
let foo_cpp_ref: CppRef<Foo> = CppRef::new(Foo);
// should not compile because it would desugar to `Foo::frobnicate_ref(&*foo_cpp_ref)`
// and you can't deref a CppRef
// Indeed does not compile. Currently, it causes a compiler panic, which is non-desirable.
// This suggests to me a mismatch between the method candidate identification code (which we've updated)
// and the actual dispatch code.
// This line DOES compile if CppRef impls Deref above instead of Receiver, as we'd hope.
// foo_cpp_ref.frobnicate_ref();
// should compile, because we're not dereffing the CppRef
// desugars to `Foo::frobnicate_cpp_ref(foo_cpp_ref)`
// does indeed compile fine
foo_cpp_ref.frobnicate_cpp_ref();
}
So playing around, I think it's behaving as you propose it should? And yes, this implies that Deref is involved at some level, and you're right, we should try to describe how exactly. But meanwhile if you can confirm that my comments in the snippet above are what you're expecting (apart from the compiler panic!) that would be great.
@adetaylor based on your comments, it sounds like you understand what I was getting at with this example. In the case of foo_cpp_ref.frobnicate_ref(), there should be an error during type checking. If the compiler panic is happening at a later stage, then it should go away once the error is caught earlier during type checking.
Another note: I'd like to see some more details of how method resolution and deref coercion would work with this change. Currently, it says that the algorithm would remain the same, but instead of using the Deref trait, it would use the new Receiver trait. I'm not sure this is quite right, because some method calls require deref coercions, and this new trait does not imply Deref.
Yeah, I caught that too.
I think the RFC is mixing two sets of semantics at times. As far as I understand, the reference as it's currently written doesn't explicitly address Box/Arc/etc as receivers, but it does implicitly address them in this paragraph:
Then, for each candidate type T, search for a visible method with a receiver of that type in the following places:
- T's inherent methods (methods implemented directly on T).
- Any of the methods provided by a visible trait implemented by T. If T is a type parameter, methods provided by trait bounds on T are looked up first. Then all remaining methods in scope are looked up.
That first bullet point is the part that must be amended to clarify that "T's inherent methods" need to take the Receiver trait into account; the paragraph above about dereferencing chains must be kept untouched.
It's therefore a major compatibility break if existing Deref implementors cease to be usable as self parameters. Just in the standard library, we'd have to add Receiver implementations for Cow, Ref, ManuallyDrop and possibly many other existing implementors of Deref: third party libraries would have to do the same. Without that, method calls on these types would not be possible:
Here too, I think the RFC is confusing "Box<Self> as a receiver" with "Box<Self> as a way to get a &self which is a receiver".
Right now the only custom receivers (outside nightly crates) are the ones defined in the standard library; changing how they're defined isn't a breaking change.
I'd like to give some context to the "smart pointers in GUI" use-case.
In Masonry, I'm trying to create a way to access a widget hierarchy imperatively, that still automatically updates the render tree whenever something is changed. My ideal API looks something like this:
struct Button {
label: WidgetPod<Label>,
size: Size,
color: Color,
callback: impl Fn(Stuff),
}
impl Button {
pub fn new(text: impl Into<String>) -> Button {
Button::from_label(Label::new(text))
}
pub fn from_label(label: Label) -> Button {
Button {
label: WidgetPod::new(label),
// ...
}
}
pub fn set_size(self: &mut WidgetMut<'_, Self>, size: Size) {
self.widget.size = size;
self.ctx.request_layout();
}
pub fn set_color(self: &mut WidgetMut<'_, Self>, color: Color) {
self.widget.color = color;
self.ctx.request_paint();
}
pub fn set_callback(self: &mut WidgetMut<'_, Self>, callback: impl Fn(Stuff)) {
self.widget.callback = callback;
}
}
That is, every widget would be defined in a way that it's not possible for an external user to directly modify the fields of the widget. Instead users would have to use a WidgetMut<MyWidget> receiver. Widgets might have internal methods taking mutable references, but every public method mutating a widget would take a WidgetMut<MyWidget> receiver; that WidgetMut would include methods to notify the system when repaint or relayout is needed, and people writing new widgets would make sure that setters use those methods.
Some widgets would also have children. These widgets would have method to go from WidgetMut<Parent> to WidgetMut<Child>:
impl Button {
pub fn label_mut(self: &mut WidgetMut<'_, Self>) -> WidgetMut<'_, Label> {
self.ctx.get_child_mut(&mut self.widget.label)
}
pub fn set_text(self: &mut WidgetMut<'_, Self>, new_text: impl Into<String>) {
self.label_mut().set_text(new_text.into());
}
}
So far this RFC looks like it would help me satisfy that use case, so I'm hoping it makes progress! :heart:
EDIT: Note that I've tried to implement an API that looks like the above in Masonry, with workarounds for the lack of arbitrary receivers. Those workarounds are pretty ugly, so I would love for arbitrary receivers to get stabilized, or at least make enough progress that I feel confident going nightly-only just to get them.
I'd like to give some context to the "smart pointers in GUI" use-case.
@PoignardAzur Thanks for adding the additional context - very happy to accept edits if you think any of this should belong in the Motivation section!
I'm not sure it belongs in the motivation section. ^^
I just wanted to have the full explanation in an easy-to-find place.
We talked about this at length in today's @rust-lang/lang meeting, and had several thoughts. We were all generally in favor of arbitrary self types, but also had a number of questions and potential improvements. Several of us are going to post our thoughts, in separate comments.
In general this is looking great, and I'd definitely like this to become something possible.
I just have one thing about pointers to discuss.
I'm having trouble understanding how both of these make sense at the same time:
[
Receiver] is also implemented for &T, &mut T, *const T and *mut T.
Weak<T>andNonNull<T>were not supported by the prior unstable arbitrary self types support, but they share the property that it may be desirable to implement method calls to T using them as self types. Unfortunately they also share the property that these types have many Rust methods usingself,&selfor&mut self.
We also have, if I counted correctly, 42 methods on *const _ right now. If we go back in time there were far fewer, but today it seems like there's enough that we can't assume it's a full set -- the strict provenance stuff added a whole bunch relatively recently, for example.
More generally, what about all the other possible types in the standard library? I could imagine someone reasonably wanting to have self: MaybeUninit<Self>, for example. But since that doesn't Deref, it also has loads of methods on it, and thus it probably shouldn't be Receiver either? Same for Mutex or Vec or ...
Thus my instinct is that we should require that they make something like the CppRef type you describe in the Motivation, rather than having a pointer be a receiver directly. That doesn't seem that unreasonable, either, since most of the methods on pointers aren't things that one should be doing on a C++ reference anyway.
Then we stick with the "you shouldn't do this if you have methods" rule, which means no impl<T> Receiver for Vec<T>, but it could mean impl<T> Receiver for ManuallyDrop<T>, or similar. (Well, I guess that one has Deref so it'd be caught by the proposed blanket impl.)
Just as a SHOULD, not as a MUST, in the language rule though. If you really want inherent methods -- that always win in resolution -- on something impl Receiver, I think that's fine, just not for standard library things. An external crate could always do a major version bump to add more inherent methods like that if it really wanted to.
Unless there's some strong reason that methods need pointer receivers directly that I'm missing? My instinct is that if you can implement Receiver for your own type, that's fine, and it's not needed on raw pointers directly.
The restriction on NonNull and Weak seems unfortunate, for multiple reasons.
- These are types that make a lot of sense to use as receivers.
- It seems like raw pointers also have lots of methods, and we may well add more in the future, so it's not obvious why it's OK to have raw-pointer receivers but not
NonNull/Weakreceivers. Is there a reason why the same considerations don't apply?
I'd be enthusiastically in favor of this, but I'd like to find a way that we can allow it for NonNull and Weak as well, and in general for receivers that do have methods people might want to call.
When you add a new method to a trait, that's not a breaking change, because inherent methods get priority.
However, with Deref, we already have the rule that methods directly on the type have priority over methods on the Deref::Target. Which makes it a breaking change to add methods directly on the type.
For Receiver, the other way around seems more robust against breakage, since it seems acceptable for a method impl SomeType { fn method(self: FooPtr<Self>); } to shadow FooPtr::method. However, that would be incompatible with existing usage with Deref, and I don't see any way we could get there even if we wanted to.
That said, I think we should consider (in order from most to least preferable):
- Adding some way to make it not a breaking change to add methods on a type that implements
Receiver, or to make it reasonable to put it in the category of "acceptable breakage" like we do inference failure. Then, implementReceiverforNonNullandWeak. - Implement
ReceiverforNonNullandWeakanyway, if we're going to have it for raw pointers. In particular, I don't think it makes sense to have it for raw pointers and not forNonNull. - Making raw pointers not a
Receiver. Same argument: it doesn't make sense to have it for raw pointer and notNonNull. - Add a way for us to transition over an edition to moving the various methods on raw pointers and
NonNullto associated functions. That seems like a massive amount of churn, though. I'm only mentioning this possibility for completeness, in case it inspires some better idea.
it seems acceptable for a method
impl SomeType { fn method(self: FooPtr<Self>); }to shadowFooPtr::method. However, that would be incompatible with existing usage withDeref
Why is it incompatible? From what I understand we could treat Deref and Receiver differently when deciding shadowing precedence. The only types where this could break are types that are already Receiver today, i.e. Box, Rc and friends, but they already don't have inherent methods so that's fine.
Implement
ReceiverforNonNullandWeak
Once we do those, one might ask: why don't we also allow self: Cell<Self>, self: [Self], self: HashSet<Self> and so on?
So yeah, definitely a good point, I think the motivation for allowing raw pointers was also because it lined up nicely with the proposed changes to the method resolution, but we're changing those anyway, so it might not make so much sense any more?
Will have to consider this further!
Thanks for all the discussion around NotNull, Weak etc. I thought I'd take the liberty of summarizing the different perspectives to see if it helps us reach a consensus:
- The existing unstable
arbitrary_self_typesfeature allows use of raw pointers as self types (but notNotNull,Weaketc. because they don't implementDeref) - @scottmcm feels we shouldn't have raw pointer self types, but require people to wrap those pointers in a newtype wrapper (this is my instinct too but just because I'm automatically nervous of larger changes!)
- @nikomatsakis is of the view that raw pointer self types are important for better quality unsafe code and I've heard related points from @Manishearth who says:
raw pointer receivers are quite important for the future of safe Rust, because stacked borrows makes it illegal to materialize references in many positions, and there are a lot of operations (like going from a raw pointer to a raw pointer to a field) where you don't need to or want to do that.
- @joshtriplett says we should implement
ReceiverforNonNullandWeakif we implement it for raw pointers, but hopes we can find a way to avoid it being a compatibility break
I think: it seems like there's enough desire for raw pointers, NotNull, Weak etc. as self types that we should try pretty hard to find a way to avoid compatibility breaks, even if it means we do something a bit unexpected from the user's perspective. There are some proposals around that in this comment thread about the RFC which I haven't had a chance to properly get my head around, let alone work into RFC revisions! But further comments on that thread are very much appreciated.
(Edit: that linked comment thread now contains an actual proposal and a trial implementation, please take a look)
it seems acceptable for a method
impl SomeType { fn method(self: FooPtr<Self>); }to shadowFooPtr::method. However, that would be incompatible with existing usage withDerefWhy is it incompatible? From what I understand we could treat
DerefandReceiverdifferently when deciding shadowing precedence.
I mean that it'd be confusing if some type SmartPtr that implements Deref (and thus gets the blanket impl of Receiver) has methods that take precedence over methods of the type it Derefs to if they take a T, but don't take precedence if they take a SmartPtr<T>.
I mean that it'd be confusing if some type SmartPtr that implements Deref (and thus gets the blanket impl of Receiver) has methods that take precedence over methods of the type it Derefs to if they take a T, but don't take precedence if they take a SmartPtr<T>.
I think of it as "most specific impl first" (with caveat and details here). It makes sense in my head, it's kind of like specialization: a foo(Box<Self>) takes priority over a generic foo(self) on Box<T>. I think we can teach this in a way that's not too confusing.
Periodic update on what's happening here:
- @madsmtm is kindly fixing up the current method call documentation
- I have a PR to try to fix pre-existing bugs in the lifetime elision rules, although current consensus is that we should try to understand and document the current rules before fixing them :) Help and assistance appreciated!
- I've been trying to implement this proposal for searching
implblocks (viaReceiver) differently from actually probing methods (viaDeref) - it's proving fiddly I'm in the middle of a very busy two weeks so that's why @madsmtm is doing the only actual effective stuff right now :)
@rustbot labels -I-lang-nominated
This was discussed in the T-lang meeting on 2023-11-27. Some members have left comments above deriving from this.
The general feeling seemed to be that we wanted to find some way to enable this, including for raw pointers, NonNull, etc., but we were feeling unsure about the path to get there.
In light of this, probably further reflection and discussion on this thread or in Zulip is warranted, and a revision to the RFC that addresses this and proposed a plan here would be in order.
Once that is done, please renominate for T-lang so that the team can discuss the changes.
@traviscross Thank you! Yes.
Another periodic status update:
- I've figured out (I think) how to solve the problem with non-dereffable types - I intend to write it up. I've even made some (non-fancy) diagrams of how it all works right now and how it will change.
- I think we can make it work for
NonNulland friends using the proposals in this thread... but it's not quite working yet, TBD - Together, the two things above will require a significant rewrite of the Reference section of this RFC (maybe even with diagrams, I don't know if that's normal in Rust RFCs)
- In parallel, the lifetime elision anomalies are being explored here including a crater run
Here's the new problem with NonNull and friends which I allude to in the previous comment.
#![feature(receiver_trait)]
struct MyNonNull<T>(*const T);
#[allow(dead_code)]
impl<T> MyNonNull<T> {
fn foo<U>(&self) -> *const U {
self.cast().bar()
}
fn cast<U>(&self) -> MyNonNull<U> {
MyNonNull(self.0 as *const U)
}
fn bar(&self) -> *const T {
self.0
}
}
This compiles. But if I implement Receiver<Target=T> for MyNonNull<T>, we get:
error[E0282]: type annotations needed
--> /Users/adetaylor/dev/rust/tests/ui/self/recursive-receiver.rs:13:14
|
LL | self.cast().bar()
| ^^^^ --- type must be known at this point
| |
| cannot infer type of the type parameter `U` declared on the method `cast`
|
help: consider specifying the generic argument
|
LL | self.cast::<U>().bar()
| +++++
This is with my working branch here. The only explanation is that the probing must be coing up with more possible method candidates, and this is how the ambiguity is presented. I don't quite understand this yet, but as there's a determination to implement Receiver for NonNull I do think we need to get to the bottom of it before making progress.
Here's how I think method resolution should work after this RFC. This approach allows us to use NonNull method receivers, using something approximating the approach in this thread, and solves the problem highlighted around non-dereffable types here. It also solves the problem in the last comment.
First, this assumes that there is a blanket impl of Receiver for T: Deref. To abuse UML a little bit, things look like this:
classDiagram
Receiver <|-- Deref
Receiver <|-- CppRef
note for CppRef "or other types that\nimpl Receiver but not Deref\ne.g.NonNull"
Deref <|-- Box
note for Box "or other types that\nimpl Deref\ne.g.Rc"
class usize
note for usize "or other types that\nimpl neither"
So, for any given type, say A<B<C<D<E>>>>, there's a chain of Derefs which can be followed, and a chain of Receivers. The Receiver chain is always just an extension of the Deref chain.
graph LR
A --Receiver--> B
A --Deref--> B
B --Receiver--> C
B --Deref--> C
C --Receiver--> D
D --Receiver-->E
How method probing works now
This is roughly a diagrammatic version of this excellent page though it includes a few different details which will be important for this RFC. This is all, of course, quite simplified.
Everything is driven from autoderef which knows how to iterate through the chain of Deref links shown above. This is used to check valid receiver types (in wfcheck.rs) and twice in method probing, in probe.rs. At present, the same chain of Derefs is used in all three places.
flowchart TB
subgraph autoderef
direction TB
A --Deref--> B
B --Deref--> C
end
subgraph wfcheck[wfcheck.rs]
wfchecker[Confirm valid self types in each method]
end
autoderef-->wfchecker
subgraph probe[probe.rs]
method_autoderef_steps
candidate_steps[(list of candidate steps with\nself_ty\nnumber of derefs)]
method_autoderef_steps-->candidate_steps
probe_op[probe_op & various\n'functions calling push_candidate]
candidate_steps-->probe_op
candidates[(list of candidate methods with\nassociated item &\nxformed self ty)]
probe_op-->candidates
candidates-->pick_all_method
pick_all_method[pick which candidate\nby iterating types and\nlooking for matching methods.\nAt each step]
candidate_steps--work out types to which\nreceiver can be converted-->pick_all_method
end
autoderef-->method_autoderef_steps
ambiguity[Report method ambiguity error]
confirm[confirm.rs]
pick_all_method--exactly 1 method-->confirm
pick_all_method--multiple methods-->ambiguity
e0282[error 0282: type annotation needed]
method_autoderef_steps--final type was a type variable-->e0282
Note in particular that the candidate_steps are effectively used twice.
How this changes with arbitrary self types
With this RFC, the following changes occur:
- Sometimes we follow the chain of
Receiverinstead ofDeref. Specifically,- We do this in
wfcheck.rswhen checking for well-formedselftypes - We also do this in
probe.rswhen assembling the list of candidate functions, but not when we are working out the types to which the actual method call receiver can be converted. For that we continue to useDeref.
- We do this in
- Sometimes, if there are multiple candidates, we pick one but show a warning about the ambiguity. (This is what allows us to use
impl Receiver for NonNulldespite the fact that we might want to add methods toNonNullin future).
So the picture changes like this:
flowchart TB
subgraph autoderef[autoderef using Deref]
direction TB
A --Deref--> B
B --Deref--> C
end
subgraph autoderef2[autoderef using Receiver]
direction TB
A2[A] --Receiver--> B2[C]
B2[B] --Receiver--> C2[C]
C2[C] --Receiver--> D2[D]
D2[D] --Receiver--> E2[E]
end
subgraph wfcheck[wfcheck.rs]
wfchecker[Confirm valid self types in each method]
end
autoderef2-->wfchecker
subgraph probe[probe.rs]
method_autoderef_steps[method_autoderef_steps using Deref]
method_autoderef_steps2[method_autoderef_steps using Receiver]
candidate_steps[(list of candidate steps with\nself_ty\nnumber of derefs)]
candidate_steps2[(list of candidate steps with\nself_ty\nnumber of receiver hops)]
method_autoderef_steps-->candidate_steps
method_autoderef_steps2-->candidate_steps2
probe_op[probe_op & various\n'functions calling push_candidate]
candidate_steps2-->probe_op
candidates[(list of candidate methods with\nassociated item &\nxformed self ty)]
probe_op-->candidates
candidates-->pick_all_method
pick_all_method[pick which candidate\nby iterating types and\nlooking for matching methods.\nAt each step]
candidate_steps--work out types to which\nreceiver can be converted-->pick_all_method
end
autoderef-->method_autoderef_steps
autoderef2-->method_autoderef_steps2
ambiguity[Report method ambiguity error]
confirm[confirm.rs]
confirm_warn[confirm but warn of ambiguity]
pick_all_method--exactly 1 method-->confirm
pick_all_method--exactly 1 'innermost' method-->confirm_warn
pick_all_method--multiple methods-->ambiguity
e0282[error 0282: type annotation needed]
method_autoderef_steps--final type was a type variable-->e0282
'Innermost' in this context means a viable candidate method on the 'innermost' type, that is, the type we got via the largest number of steps along the Receiver chain.
In practice, the code isn't this complex - we don't have to store two different lists of candidate_steps - but logically this is what we're doing.
More detail on how we support the NotNull<T> case
pick_all_method iterates through the chain of types built by the Deref trait, to look for matching methods.
For example, given a receiver type like this:
graph LR
A --Receiver--> B
A --Deref--> B
B --Receiver--> C
B --Deref--> C
C --Receiver--> D
D --Receiver-->E
it will iterate first through A<B<C<D<E>>>> and then B<C<D<E>>> and then C<D<E>>. At each of those steps, it will look for candidate functions found during the probe_op stage.
Within a given step of iteration, for instance, just considering a receiver type of A<B<C<D<E>>>> with no derefs applied, we might find candidates functions within the impl blocks for A<B<C<D<E>>>>, B<C<D<E>>>, C<D<E>>, D<E> or E. Any of those might have fn something(self: A<B<C<D<E>>>>>).
If we do find multiple candidates in each level of iteration, currently, we show an error about the ambiguity.
The proposed change is:
- switch the error to a warning
- only if this ambiguity results from differently-nested candidates
- pick the most-nested candidate
This all seems counterintuitive, since the overall iteration iterates from the "outer" type via the deref chain. So we always iterate from the outside inwards in terms of dereferencing the receiver. But within each individual iteration step, if there are candidates found in impl blocks of differing types, we'll now pick the innermost.
Although slightly counterintuitive, the warning should cause everyone to resolve the ambiguity by being explicit, so this should not result in an ongoing cognitive load for Rustaceans. This is a compatible change because previously, errors would always have been shown in this circumstance.
Code example: (pseudo-code)
// Assume NonNull<T> implements Receiver
struct Dog;
impl Dog {
fn bark(self: NonNull<Self>) { }
}
fn main() {
let d: NonNull<Dog> = visit_kennels();
d.bark(); // obviously calls Dog::bark
}
// Now imagine NonNull adds a bark(self) method in the standard library
// Proposal:
// Compiler CONTINUES to choose Dog::bark but shows a warning along the lines of:
// Call to Dog::bark shadows NonNull::bark. Be explicit about which method you're aiming to call,
// e.g. Dog::bark(dog)
Overall it appears to make it possible for us to support using NonNull<T> and friends as receivers. There are some details to figure out:
- Currently, all by-value methods are searched before autoreffed methods. So, if we had
Dog::bark(self: &NonNull<Self>)but then addedNonNull::bark(self)then this would, I think, still cause undesirable shadowing. We may need to rearrange the wayself_pick_all_methodworks so it can spot this scenario. - There are bits of the algorithm around collapsing trait picks, and unstable candidates, which I haven't figured out and we need to see if it fits this proposal.
Takeaways
The main bits of cunning here are:
- We follow the chain of
Receiverwhile identifying candidate methods, but we follow the shorter chain ofDerefwhen working out the types to which the actual method receiver can be converted. - In
pick_all_method, when we're considering a particular self typeA<B>, if there's ambiguity about whether to choose a method onAorBinA<B>, we'll choose the method onBbut show a warning. This allows us to add new methods on types likeNonNull<T>. - The error 0282 discussed in the last comment is solved by considering bad types at the end of the
Derefchain not at the end of theReceiverchain.
Thoughts? Does this all make any sense? Can you see any pitfalls?
This is all implemented hackily here.
(@madsmtm @Urhengulas!)