
Implement function delegation in rustc

Open petrochenkov opened this issue 1 year ago • 12 comments

Summary

This RFC proposes a syntactic sugar for delegating implementations of functions to other already implemented functions.

There were two major delegation RFCs in the past, the first RFC in 2015 (https://github.com/rust-lang/rfcs/pull/1406) and the second one in 2018 (https://github.com/rust-lang/rfcs/pull/2393).

The second RFC was postponed by the language team in 2021 (https://github.com/rust-lang/rfcs/pull/2393#issuecomment-816822011). We hope to revive that work again.

How this proposal is different from the previous ones:

  • This proposal follows the "prototype first, finalized design later" approach, so it's oriented towards compiler team as well, not just language team. The prototyping is already in progress and we are ready to provide resources for getting the feature to production quality if accepted.
  • This proposal takes a more data driven approach, and builds the initial design on relatively detailed statistics about use of delegation-like patterns collected from code in the wild. The resulting design turns out closer in spirit to the original proposal by @contactomorph than to later iterations.

This proposal is also the subject of an experimental feature gate: https://github.com/rust-lang/rust/pull/117978.

Rendered

petrochenkov avatar Nov 15 '23 19:11 petrochenkov

@petrochenkov: In the T-lang meeting, some people who had read the RFC were feeling good about it, and there was some interest in proposing FCP merge on this as a normal RFC in addition to going forward with the experiment while the FCP is pending.

However, some text in the body of the RFC and in the PR description here describes this RFC as an experimental one, which, as you note, is no longer a thing.

If you're ready for this RFC to move forward as a normal RFC, if you could, please remove the language about the RFC being experimental.

traviscross avatar Nov 22 '23 17:11 traviscross

My 2 cents (as a maintainer of https://crates.io/crates/delegate, which is probably the most used crate for delegation, in combination with https://crates.io/crates/ambassador):

There are a lot of configuration options/knobs and variants of delegation that are important to different users, which can be seen from the myriad of options implemented in the delegate crate over the years (and there are still some use-cases in our issue tracker that haven't even been implemented yet).

Here is an example of a few such things:

  • Should the forwarding functions be inline or not? Should they be marked with some other custom attributes?
  • Should their return values or input parameters be coerced somehow? Into, TryInto, AsRef etc.? Should the result be unwrapped?
  • Do I want to change the delegated function name?
  • Should additional expressions be passed to the forwarded methods?
  • Should await be called on the forwarded methods?

It's probably not practical to add support for all these use-cases on the language level. A language level solution would hopefully solve the most common use-cases, and ideally also provide some extension points to go beyond these basic use-cases, but that might be quite challenging. If the language won't solve the problems of Rust users, they will just go back to using a third-party crate. I think that for gauging which problems are actually the most important, it might be better to take a look at the crates that use delegate or ambassador, and examine their usage patterns, rather than scanning the source code and trying to derive possible delegation patterns from that (although having this code-mining platform available will be definitely incredibly useful too!).

From my view, the most important aspect of delegation (that is not easily possible to perform with third-party crates at the moment) is the automatic enumeration of things that should be delegated (e.g. the signatures of methods of a trait), rather than the "forwarding" itself. It's quite common to delegate a trait impl to a field of a structure (or to a field of each enum variant), but since we don't have the ability to query the existing signatures, we have to both enumerate the method names, and also repeat their signatures.

Thus for me, a solution to delegation in Rust could be to implement a different language feature - some form of reflection that would give us the ability to code-gen the signatures of methods based on their names, and also the signatures of all methods of a trait. If this was available, any third-party crate (such as delegate) would be able to implement pretty much any delegation pattern without any further support from the language. Right now, delegate doesn't know the signatures of trait methods, so the user has to repeat the signatures, which is quite annoying. Repeating only the names of the functions would be already much better. With some form of reflection being available, we could provide the low-level building block in the language (and reflection is also useful for many other things), but leave the many details to third-party crates that are much easier to iterate upon than the language itself.

That being said, if we decide to go with the "delegation in language" way, I think that it should support the automatic enumeration of trait methods, and perhaps as an MVP also just enable delegating traits with some simple syntax. I like how it works in Kotlin, where you specify the interface (so a trait) that you want to delegate, and the expression/field that you want to forward that interface to. From my experience at least, this is the most common use-case for delegation.

So e.g. it would be great if something like this was possible to do with "language-level delegation":

struct Wrapper(Foo);

impl Bar for Wrapper by self.0 {
  // optionally allow to override the generated forwarding implementations
}
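For comparison, here is a minimal sketch in today's stable Rust of what such an impl would presumably have to generate. The methods of Bar and the fields of Foo are hypothetical, invented only for illustration; the point is that every signature has to be repeated by hand while every body only forwards to the field.

```rust
// Hypothetical trait and types, named after the example above.
trait Bar {
    fn name(&self) -> String;
    fn add(&mut self, amount: u32) -> u32;
}

struct Foo {
    total: u32,
}

impl Bar for Foo {
    fn name(&self) -> String {
        String::from("foo")
    }

    fn add(&mut self, amount: u32) -> u32 {
        self.total += amount;
        self.total
    }
}

struct Wrapper(Foo);

// Hand-written stand-in for `impl Bar for Wrapper by self.0 {}` (assumed
// expansion): each signature is copied, each body forwards to `self.0`.
impl Bar for Wrapper {
    fn name(&self) -> String {
        self.0.name()
    }

    fn add(&mut self, amount: u32) -> u32 {
        self.0.add(amount)
    }
}
```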

Kobzol avatar Nov 23 '23 12:11 Kobzol

I think that for gauging which problems are actually the most important, it might be better to take a look at the crates that use delegate or ambassador, and examine their usage patterns, rather than scanning the source code and trying to derive possible delegation patterns from that.

I agree, that sounds like an additional useful source of data!

@petrochenkov, do you have any interest in the following approach? I suppose they’re not mutually exclusive—there could be language support for the common delegation scenarios, and eventually also add reflection support to enable crates to “fill in the rest” of the more complex/niche functionality. Plus, better reflection support would unlock a lot of other functionality for Rust.

Thus for me, a solution to delegation in Rust could be to implement a different language feature - some form of reflection that would give us the ability to code-gen the signatures of methods based on their names, and also the signatures of all methods of a trait

ericsampson avatar Nov 23 '23 14:11 ericsampson

@traviscross I've removed the process disclaimer. It's probably fine to treat this as a normal compiler & language RFC for an experimental feature, the text is relatively detailed.

But it will certainly need a second iteration with finalized syntax and other design choices before any attempts at stabilization. I also didn't add some clauses typical for proper language RFCs, like guide-level explanation, high-level alternatives (e.g. reflection as mentioned by other commenters) or drawbacks.

petrochenkov avatar Nov 23 '23 15:11 petrochenkov

@Kobzol @ericsampson I did look at delegate, ambassador and everything else in literature before collecting my statistics, that knowledge sort of guided which statistics to collect.

Regarding the specific suggestions:

Do I want to change the delegated function name?

This is supported - renaming.

Should the forwarding functions be inline or not? Should they be marked with some other custom attributes?

Extra attributes can be added because delegation item is still an item, and items may have attributes. inline is an open question (front matter). It should probably be added by default, but overridable.

just enable delegating traits with some simple syntax. From my experience at least, this is the most common use-case for delegation.

Yep, the statistics also show that it's common, just not overwhelmingly common, see the numbers in postponed features. So we'll certainly try to do it, but it is postponed as a second layer of sugar, until we implement the basic layer.

  • Should their return values or input parameters be coerced somehow? Into, TryInto, AsRef etc.? Should the result be unwrapped?
  • Should additional expressions be passed to the forwarded methods?
  • Should await be called on the forwarded methods?

That all falls under arbitrary argument pre-processing and result post-processing, and is therefore not supported because it does not fit into the syntactic budget for a built-in feature - rejected features, syntactic budget.

If some cases are supported by the same method as SameUpToSelfType then it's good (that may be possible for the From/Into/AsRef/AsMut-like stuff), otherwise not supported. The data collection pass can actually be extended to check which specific kinds of pre- and post-processing are common.

I guess there is a general "workaround" for a case in which most of the method body can be delegated - make a private delegated function, and a real function that calls the delegated one and does the necessary pre-post-processing.
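A rough sketch of that workaround, written by hand in stable Rust. All names here (Inner, Wrapper, scaled, scaled_raw) are hypothetical; the private function stands in for what a delegation item could generate, and the public one adds the pre- and post-processing around it.

```rust
struct Inner {
    value: i64,
}

impl Inner {
    fn scaled(&self, factor: i64) -> i64 {
        self.value * factor
    }
}

struct Wrapper {
    inner: Inner,
}

impl Wrapper {
    // Private forwarder: the part that plain delegation could cover.
    fn scaled_raw(&self, factor: i64) -> i64 {
        self.inner.scaled(factor)
    }

    // Public function: converts the argument first, filters the result after.
    pub fn scaled(&self, factor: impl Into<i64>) -> Option<i64> {
        let factor = factor.into(); // pre-processing
        Some(self.scaled_raw(factor)).filter(|v| *v >= 0) // post-processing
    }
}
```

The delegated part stays trivial, so the extra function costs one more name but keeps the custom logic out of the delegation syntax.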

petrochenkov avatar Nov 23 '23 16:11 petrochenkov

Regarding reflection, I'm interested in seeing proposals, but I'm not sure it's even implementable in a general enough form with Rust type system, and the readiness timeline will be 5-10 years at least. Even this proposal with its signature copying may open sort of a can of worms, for which I'm not sure rustc is ready. We'll see, I guess.

petrochenkov avatar Nov 23 '23 16:11 petrochenkov

Regarding reflection, I'm interested in seeing proposals, but I'm not sure it's even implementable in a general enough form with Rust type system, and the readiness timeline will be 5-10 years at least.

Yes, sadly I also think that reflection is way off. So while I think that it would be a more general solution, if we want to get delegation faster, it makes sense to handle it specifically, and not wait for reflection.

Kobzol avatar Nov 23 '23 16:11 Kobzol

Yeah for sure.

Just for clarity, I am very excited that this is being worked on, so thanks a ton!!!

The RFC is well-written, and I really appreciate the data-driven approach.

Cheers 😊

ericsampson avatar Nov 23 '23 19:11 ericsampson

If we end up doing data analysis of broader crates.io, I wonder how many of the ArgsPreproc.Other cases involve Pin? Delegating traits like AsyncRead/AsyncWrite currently sometimes requires invocations of Pin::new.

Those aren't likely to come up in your current data set, though.
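To illustrate the kind of receiver pre-processing meant here without pulling in an async I/O crate, the sketch below uses std's Future trait instead of AsyncRead/AsyncWrite. Wrapper, NoopWake and poll_once are hypothetical names; the forwarding body cannot just pass the field along, it has to re-pin it with Pin::new (which works here because the inner future is Unpin).

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical wrapper around some `Unpin` future.
struct Wrapper<F> {
    inner: F,
}

impl<F: Future + Unpin> Future for Wrapper<F> {
    type Output = F::Output;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // The receiver is `Pin<&mut Self>`, so the field must be re-pinned
        // before the call can be forwarded.
        Pin::new(&mut self.inner).poll(cx)
    }
}

// Waker that does nothing, only needed to build a `Context` for polling.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Polls a future exactly once and returns the result.
fn poll_once<F: Future + Unpin>(mut fut: F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(&mut fut).poll(&mut cx)
}
```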

joshtriplett avatar Dec 05 '23 22:12 joshtriplett

Design question: how exactly the body template is duplicated for list or glob delegation

Typically the target expression (i.e. to what we delegate) in a delegation item is very simple, like:

reuse a::b { self.field }
// or
reuse b::c { self.getter() }

However, it's possible that it may contain some entities with identity, like items or closures. I would expect anonymous constants, closures and imports to be most useful in this context.

reuse prefix::{a, b, c} {
    use some::import; // import
    (
        self.field.map(|x| x.y), // closure
        [10; SIZE], // Anonymous constant
    )
}

This poses a question - how are all these entities duplicated when we desugar the list delegation prefix::{a, b, c} or glob delegation prefix::* into multiple actual functions:

fn a { ... }
fn b { ... }
fn c { ... }

Will there be 3 separate imports, closures and anonymous constants with their own identities, or will there be only one instance, with the generated functions referring to it?

Due to implementation details of rustc (how DefId tree is built), the answer dictates the compilation stage at which the desugaring happens.

Alternative 1: Duplication at token / AST level

Every generated function will contain its own version of items/closures/constants/etc.

This is closest to what would happen if we wrote the function bodies manually instead of delegating.
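As a sketch of that manual equivalent, assuming the target expression is substituted for the first argument of each delegated function: the module prefix, the types and the constant SIZE below are hypothetical stand-ins, and only two of the functions are shown. Each body carries its own import, closure and anonymous constant.

```rust
mod prefix {
    // Hypothetical delegation targets; the first argument is the value
    // produced by the target expression.
    pub fn a(arg: (Option<u32>, [u8; 2])) -> u32 {
        arg.0.unwrap_or(0) + u32::from(arg.1[0])
    }

    pub fn b(arg: (Option<u32>, [u8; 2])) -> u32 {
        arg.0.unwrap_or(0) * u32::from(arg.1[1])
    }
}

const SIZE: usize = 2;

#[derive(Clone, Copy)]
struct Inner {
    y: u32,
}

struct Wrapper {
    field: Option<Inner>,
}

impl Wrapper {
    fn a(&self) -> u32 {
        use std::convert::identity; // import owned by `a`
        prefix::a(identity((
            self.field.map(|x| x.y), // closure owned by `a`
            [10; SIZE],              // anonymous constant owned by `a`
        )))
    }

    fn b(&self) -> u32 {
        use std::convert::identity; // separate import owned by `b`
        prefix::b(identity((
            self.field.map(|x| x.y), // separate closure owned by `b`
            [10; SIZE],              // separate anonymous constant owned by `b`
        )))
    }
}
```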

Each item clone will have its own identity - its own DefId. That means the desugaring of list and glob delegations into single delegations must happen before definition collector runs, that means during macro expansion and import resolution, not even during AST -> HIR lowering or right before it.

For glob delegation (reuse prefix::*) it means that prefix must be resolvable early, but that seems fine because we are only going to realistically support glob delegation to traits (reuse Trait::*).

Both list and glob delegation, including target expression cloning, essentially become macro features. That means we don't even need to parse the target block contents as Rust code if the delegation list is empty (prefix::{} { random garbage tokens }), or if the glob refers to a trait without items (EmptyTrait::* { random garbage tokens }). It still probably makes sense to parse it (but not name-resolve it), then such code will be equivalent to code under #[cfg(FALSE)] or any other code dropped by a macro.
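The cfg comparison can be seen in stable Rust today. In this small sketch the always-false predicate cfg(any()) plays the role of #[cfg(FALSE)], and the function names are made up: the dropped body has to parse, but its paths are never name-resolved.

```rust
// `any()` with no arguments is always false, so this item is removed
// during expansion, before name resolution sees its body.
#[cfg(any())]
fn dropped() {
    // Parsed as Rust, but none of these names need to exist.
    no_such_crate::no_such_function(no_such_variable);
}

// An ordinary item next to it, compiled as usual.
fn kept() -> u32 {
    1
}
```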

Unfortunate detail: prefix still needs to be resolved for empty delegations, and therefore stability checked as well. It means "delegation stems" will need to be preserved in HIR to reach stability checking, similarly to import stems.

Drawback of this approach: name resolution will run multiple times on every copy of the target block, even if there are no items inside it and all the results are the same. Possible solution: better name resolution caching, not just for delegation, but for everything.

Alternative 2: "Semantic" duplication

Every generated function will contain references to items/closures/constants/etc canonically defined in the single delegation item block.

Item definitions will be parented under DefId of the delegation item itself, not under DefIds of functions that it produces.

I suppose this can work for imports, and other item definitions (e.g. structs).

However, I'm not sure how this is going to work for e.g. closures. Type checking results for different functions generated from a delegation item may be different, that's a significant design point. Even in simplest cases different bodies may differ in their use of Deref or DerefMut, for example. Closure signatures are also produced by type inference, so in theory they may differ too, but that's impossible if all the closures have a single identity.
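A small hand-written illustration of the Deref/DerefMut point, with hypothetical names (Counter, get, bump). Both bodies use the same target, self.inner, yet type checking resolves one through Deref and the other through DerefMut.

```rust
struct Counter {
    hits: u32,
}

// Free functions standing in for the delegation targets.
fn get(c: &Counter) -> u32 {
    c.hits
}

fn bump(c: &mut Counter) -> u32 {
    c.hits += 1;
    c.hits
}

struct Wrapper {
    inner: Box<Counter>,
}

impl Wrapper {
    fn get(&self) -> u32 {
        // `&Box<Counter>` coerces to `&Counter` through `Deref`.
        get(&self.inner)
    }

    fn bump(&mut self) -> u32 {
        // `&mut Box<Counter>` coerces to `&mut Counter` through `DerefMut`.
        bump(&mut self.inner)
    }
}
```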

With this approach the block body will be parsed and name resolved once, even if the delegation item is "empty" and produces no functions.

This approach also seems just difficult to implement, for no good reason, especially for things like closures that are actually a part of the executable code, can capture local variables, etc.

Alternative 3: Prohibit everything with identity in target blocks

This is actually combinable with both Alternative 1 and Alternative 2, we can emit a hard error or a feature gate for this case until a practical case requiring it arises.

However, we still need to make a choice between 1 and 2 because the implementation strategy depends on it. We also still need to decide how much checking is performed for target blocks or "empty" delegations.

The choice

I suggest selecting the Alternative 1.

It does make delegation sort of a macro feature to a larger degree, but makes life easier in all other regards, and keeps the "desugaring is identical to manually written code" property.

petrochenkov avatar Mar 26 '24 16:03 petrochenkov

Design question: what does the block around the target expression mean?

Suppose we have two delegation items

reuse just_expr { self.0 }

reuse multi_statements {
    let x = something;
    self.get(x)
}

What is the meaning of the curly braces around self.0 - is it actually a block expression or just a syntax that looks like a block expression? In the just_expr case it's likely the latter, and in the multi_statements case it's likely the former (but there are nuances).

Single expression

It's clear that if we implemented some different syntax, e.g. reuse PATH from EXPR, then the just_expr example would be written as

reuse just_expr from self.0;

and not as

reuse just_expr from { self.0 };

The block expression here would be not just noisy, but harmful in the common case. For example, the body generated for this code could fail borrow checking (autoref/deref is assumed).

just_expr(&{ self.0 }) // ERROR cannot move out of `self`
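The same effect can be reproduced in stable Rust with a hand-written forwarding function. This is only a sketch with a hypothetical Wrapper holding a String; the failing variant is kept as a comment so that the rest compiles.

```rust
struct Wrapper(String);

// Free function standing in for the delegation target.
fn just_expr(s: &str) -> usize {
    s.len()
}

impl Wrapper {
    fn just_expr(&self) -> usize {
        // just_expr(&{ self.0 })
        // would not compile: the block is a value expression, so it tries to
        // move the `String` out from behind `&self`.

        // Without the block, `self.0` stays a place and is only borrowed.
        just_expr(&self.0)
    }
}
```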

Multiple statements (alternative 1)

The second example with the alternative syntax would look like this.

reuse multi_statements from {
    let x = something;
    self.get(x)
}

And the generated body would look like this.

multi_statements({ let x = something; self.get(x)})

The block expression is clearly necessary here.

Multiple statements (alternative 2)

An alternative body desugaring for the multi-statement case could be

let x = something;
multi_statements(self.get(x))

This could potentially be better for borrow checking, but it needs to be proven with some practical cases.
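One hypothetical case where the two shapes differ, written by hand in stable Rust (Wrapper, names and longest are made-up names, and this is not taken from real-world data): when the value passed to the target borrows from a local introduced by the statements, the local has to outlive the call.

```rust
struct Wrapper {
    names: Vec<String>,
}

// Free function standing in for the delegation target.
fn longest(words: &[&str]) -> usize {
    words.iter().map(|w| w.len()).max().unwrap_or(0)
}

impl Wrapper {
    fn longest(&self) -> usize {
        // "Alternative 2" shape: the statement runs first and `views`
        // lives until the end of the function body.
        let views: Vec<&str> = self.names.iter().map(String::as_str).collect();
        longest(&views)
        // The "alternative 1" shape would be
        //     longest({ let views: Vec<&str> = ...; &views })
        // and is rejected, because `views` is dropped at the end of the
        // block while the returned reference is still in use.
    }
}
```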

The catch is that it no longer fits into some alternative expression only syntax like reuse PATH from EXPR, because the block here is clearly not a block expression. If we need to actually delegate to a block expression we will need "double blocking" reuse foo { { bar } }.

The choice

Not clear yet. The current implementation uses the "alternative 2" desugaring, but strips the block if it contains only a single expression, which is a bit hacky.

petrochenkov avatar Jun 28 '24 15:06 petrochenkov