rfcs icon indicating copy to clipboard operation
rfcs copied to clipboard

Trait objects for multiple traits

Open sgrif opened this issue 7 years ago • 44 comments

Given arbitrary traits Foo and Bar, it'd be great to have Box<Foo + Bar> compile, but there are unresolved questions that need to be laid out explicitly. (Moved from https://github.com/rust-lang/rust/issues/32220)

sgrif avatar Jun 17 '17 20:06 sgrif

Just dealt with a really frustrating workaround for the lack of upcasting, I'd be interested in working on this. A few questions/comments (retrospectively, maybe more than a few):

It looks like the best way to start out with this is super trait coercion, and then move on to arbitrary combined traits. The first looks more or less straight forward, and the second still has some unanswered tradeoff questions, but I do think it's good to keep the second in mind so that casting is implementing in a way that can eventually segue into multiple-trait pointers.

For super traits, it seems that we could for the most part just concatenate vtables together. However, consider this example:

trait A: Debug + Display { /* pretend there are methods here */ }
trait B: Debug + Display { /* same here */ }
trait C: A + B {}

What does the vtable for Box<C> look like? You can only have one implementation of Debug and Display but if you want to allow coercion to Box<A> or Box<B> and then from either of those to Debug or Display, you basically have to list duplicate implementations of those in A's vtable. So how big does this realistically get and will it be a problem? It seems feasible to me that you might have cases where you implement several traits, and some of them require other custom traits, in addition to the occasional PartialEq, etc, which will lead to lots of repetitions. This gets worse when you allow Box<X + Y + Z> to convert to arbitrary combinations of bounds, potentially, although realistically there may not be enough uses of this. I also have a feeling that trait aliases will encourage people not to overdo it, since you'll just keep passing Box<Alias> rather than different combinations of Box<FirstThingINeed + SecondThingINeed>.

As an aside, there is a possible optimization around the case of Box<A + B> - since we know which combinations are used, we can change the order that supertraits are listed for implementations of those combinations to minimize the number of new vtables created. This pretty much rules out dynamic linking, but I'm not sure that's really feasible anyway for this feature, since you'd basically need a runtime function that scans external vtables and writes new ones for any possible combinations you may use.

Which brings up another question - going from Box<A + B> to Box<B> is easy because you just move the offset. But what if you try to go from Box<A + B + C> to Box<A + C>? Even if you're only storing offset tables, if those three traits appear together often, like say Debug + PartialEq + PartialOrd, that can end up being a lot of offset tables.

The only really feasible option I can think of for this is to have the compiler trace which types have a potential path to any of those combinations, and only generate extra tables for those types. Not familiar at all with internals, so I don't know how much work that would be.

For reference, here's how C++ does virtual inheritance. The problem is that virtual inheritance in C++ is not nearly as common as multiple trait implementation in Rust, so the overhead of writing all the extra tables and offsets isn't as much. On the other hand, most of the common std traits are just one or two methods, so I don't know if it would even help to do it the C++ way with offset tables, and C++ doesn't have a way to pass a pointer to two classes, you just have to use dynamic_cast, so I think the Rust solution is going to be fairly different. I do want to spend some more time looking at this though and seeing what I can come up with for Rust.

parkovski avatar Sep 18 '17 01:09 parkovski

Which brings up another question - going from Box<A + B> to Box<B> is easy because you just move the offset. But what if you try to go from Box<A + B + C> to Box<A + C>? Even if you're only storing offset tables, if those three traits appear together often, like say Debug + PartialEq + PartialOrd, that can end up being a lot of offset tables.

Is there any reason you can't have one vtable for A + B + C and an entirely different vtable for a + c? The benefits of reusing the same vtable if possible are clear - but in cases where that's not possible, what's stopping the compiler from generating a new one for just A + C?

ejmahler avatar Apr 08 '18 23:04 ejmahler

I just ran my head into this, and it's left me pretty disappointed. I'm developing the rust_dct crate.

Each of the Discrete Cosine transforms type 1 through for has its own trait (DCT1, DCT2, DCT3, DCT4), and same with discrete sine transforms (DST1, DST2, DST3, DST4). Previously the structs that implemented these traits were completely disjoint: There's a struct that converts DCT3 problems into FFT problems, and an entirely separate struct that converts DCT2 problems into FFT problems. They're completely separated.

But recently, I discovered that DCT2, DST2, DCT3, and DST3 problems pretty much always require the same precomputed data and pre-allocated buffers, and so I've started creating single structs that can compute all four. So a single "convert to FFT" structs all four traits: DCT2, DST2, DCT3, and DST3. All D{C,S}{2,3} structs are now implemented this way.

Sometimes, I have a problem that needs both a DCT2 and a DCT3. Currently, I have to write this:

fn my_algorithm(input: &[f32], output: &mut [f32], dct2: Arc<DCT2>, dct3: Arc<DCT3>) {
    
}

If I had a problem that needed DCT2, DST2, DCT3, and DST3, I'd have to be even more verbose:

fn my_algorithm2(input: &[f32], output: &mut [f32], dct2: Arc<DCT2>, dct3: Arc<DCT3>, dst2: Arc<DCT2>, dst3: Arc<DCT3>) {
    
}

I absolutely despise this API though, because it requires an unreasonable amount of repetition by the user, and in the end all four Arcs will be pointing to the same thing. It would be much, much more ergonomic if I could write this:

fn my_algorithm3(input: &[f32], output: &mut [f32], dct: Arc<DCT2 + DCT3 + DST2 + DST3>) {
    
}

And then I can use the same dct object to compute all four transforms. If, internally, my_algorithm3 delegates to the following method:

fn my_algorithm4(input: &[f32], output: &mut [f32], dct: Arc<DCT2 + DCT3>) {
    
}

wouldn't it be ridiculously convenient if I could just pass the dct object along and let the compiler figure it out?

fn my_algorithm3(input: &[f32], output: &mut [f32], dct: Arc<DCT2 + DCT3 + DST2 + DST3>) {
    my_algorithm4(input, output, dct);
}

ejmahler avatar Apr 09 '18 00:04 ejmahler

Is there any reason you can't have one vtable for A + B + C and an entirely different vtable for a + c? The benefits of reusing the same vtable if possible are clear - but in cases where that's not possible, what's stopping the compiler from generating a new one for just A + C?

Mainly the fact that that requires a number of vtables which is exponential in the number of +s.

wouldn't it be ridiculously convenient if I could just pass the dct object along and let the compiler figure it out?

I agree this should 'just work' one way or another. But as a workaround, for the record, rather than taking separate Arc parameters, I'd make a wrapper trait or traits, like trait Foo : DCT2 + DCT3 + DST2 + DST3.

Alternately… do you actually need to be using trait objects in the first place? Just from a glance at your use case, my guess is that having my_algorithm* use a generic parameter instead would work fine – i.e. it's unlikely that one program would need to choose at runtime between a large number of different algorithms for the same calculation.

comex avatar Apr 09 '18 00:04 comex

Alternately… do you actually need to be using trait objects in the first place? Just from a glance at your use case, my guess is that having my_algorithm* use a generic parameter instead would work fine – i.e. it's unlikely that one program would need to choose at runtime between a large number of different algorithms for the same calculation.

It's definitely occurred to me that my situation works as expected if this is all done at compile time instead of with trait objects. To explain why trait object are more or less necessary here, let me provide two more bits of context:

  1. A crucial piece of my library is the "planner" which takes a given problem size, and returns an assembled set of algorithms that compute the relevant problem size. The intention is that a user doesn't need to know all the different ways to compute a DCT Type 2, they just give the library a size, and the library returns a thing that can compute a DCT Type 2 of that size.
  2. Some DCT algorithms are implemented in terms of other DCT algorithms, or are even implemented recursively. See this DCT3 algorithm which computes the DCT3 by dividing it into one DCT3 of half-size, and another DCT3 of quarter-size. If the half and quarter algorithms are generic parameters instead of trait object, then you'd have to write out both types anywhere you used it, in addition to the parent struct. And what if the half_dct and quarter_dct structs are also split radix? They need their own generic parameters. Suddenly you have an exploding tree of generic types.

ejmahler avatar Apr 09 '18 00:04 ejmahler

This is a good idea and I think a bunch of people would like to see this implemented. Does anyone want to have a crack at an RFC? (I don't feel too expert about it myself.)

alexreg avatar Jun 11 '18 02:06 alexreg

I was reading accepted RFC 1733 — trait aliases and came to the erroneous conclusion that it would supersede this RFC:

  1. Aliases can be defined from multiple traits: trait DebugDefault = Debug + Default;
  2. Aliases can be used as trait objects: Box<MyTraitAlias>

Only careful reading of the RFC's examples showed me this was not the case:

trait PrintableIterator = Iterator<Item=i32> + Display;
fn bar3(x: Box<PrintableIterator>) { ... } // ERROR: too many traits (*)

There's even an example immediately after that that makes it look like this would work, but only because one of the magic auto traits was renamed:

trait Sink = Sync;
trait IntIterator = Iterator<Item=i32>;
fn bar4(x: Box<IntIterator + Sink + 'static>) { ... } // ok (*)

Anyway, my point is that I think that RFC 1733 is going to exacerbate the occurrences of this issue.

shepmaster avatar Jun 11 '18 03:06 shepmaster

Which brings up another question - going from Box<A + B> to Box<B> is easy because you just move the offset. But what if you try to go from Box<A + B + C> to Box<A + C>?

This is a complication which doesn't have to be solved immediately — the compiler can simply state that up-casting to multi-trait objects is not (currently) supported. As a workaround users can use trait AC: A + C {} and cast from Box<AC + B>, although this doesn't cover all cases. (Alternatively the compiler could implement vtables and casts only when used; this likely requires a unique vtable for each source/target combination.)

What should be supported is:

  • Box<A + B + ...>Box<A> for any A, B, ...
  • Box<D>Box<A> where trait D: A { ... }
  • x.foo where x: Box<A + B> and A::foo exists
  • x.foo where x: Box<D>, D: A and A::foo exists

Issue: name conflict where A::foo and B::foo both exist. Solution: calling x.foo() for x: Box<A + B> should be illegal and UFCS required, same as for static dispatch.

Issue: if A: C and B: C, then Box<A + B>Box<C> has conflicting implementations. Poor solution: disable direct up-cast on multi-trait-objects; i.e. require x as Box<A> as Box<C> for x: Box<A + B>. Better solution: only disable direct up-cast on multi-trait-objects where there are conflicts. This is more convenient but means that Box<A + C>Box<C> where A: C would silently ignore the indirect upcast option even though it would otherwise be an option. Another solution: something akin to UFCS syntax but for upcasts.

Is this enough of an RFC? It doesn't detail what vtables should look like, but this is probably best left unspecified (I don't see any further complications).

dhardy avatar Sep 17 '18 15:09 dhardy

trait A: Debug + Display { /* pretend there are methods here */ }
trait B: Debug + Display { /* same here */ }
trait C: A + B {}

Going back to @parkovski's example: can we not use a reduced vtable for C (basically just A plus unique methods from B), then use a lookup table to replace one vtable with another to support &C → &B?

This implies that trait-object-casting functions may need a small dataset (of pointers to vtables) embedded in the executable and that trait-object-cast may be slow, but I don't think those are real problems?

dhardy avatar Oct 08 '18 18:10 dhardy

The hard question is what happens if you want to cast Box<A+B+C+D+...> to a trait object of a subset of {A, B, C, D, ...} rathern than a single one. If it's allowed, there are a large number of ways that could be be implemented, but most ways scale badly as you add more traits to the multi-trait object (e.g., pointer sizes increase linearly in the number of traits, or the number of vtables that are emitted increases exponentially in the number of traits). The "lookup table holding the vtables you need for upcast" approach falls under the latter if it's extended up "upcast to another multi-trait object" in the way I expect.

For more discussion of those trade-offs see https://internals.rust-lang.org/t/wheres-the-catch-with-box-read-write/6617 -- there is one approach in there (vorner's) that side-steps the aforementioned problems but instead sacrifices efficency of some virtual calls.

hanna-kruppe avatar Oct 08 '18 18:10 hanna-kruppe

I opened a discussion at https://internals.rust-lang.org/t/casting-families-for-fast-dynamic-trait-object-casts-and-multi-trait-objects/12195 that should address the issues with layering compilation units.

burdges avatar Apr 19 '20 12:04 burdges

I figured I'd mention since I haven't seen any discussion of it here, that there is a minimal version of this that is probably much more trivial to add which is adding support just for user-defined marker traits. Because marker traits purely exist at the type level and shouldn't (AFAIK) require any vtables. I ran into wanting this due to trying to make an object safe wrapper trait for an existing non-object safe trait. Anywhere the original trait used Self or an associated type I tried substituting Box<dyn (Any + MarkerForAssociatedType)>. Of course Box<dyn Any> can hold anything, but I wanted the marker trait in order to preserve a little type safety -- it made so I couldn't accidentally pass a Box<dyn (Any + MarkerForAssociatedTypeA)> into a method expecting Box<dyn (Any + MarkerForAssociatedTypeB)>, making it a bit more likely my runtime downcasts would succeed. It also made it less likely that somebody would unintentionally pass in a Box<dyn Any> from some other library.

jgarvin avatar Aug 04 '20 04:08 jgarvin

@jgarvin Good comment! Makes me think that, rather than Send and Sync being special, perhaps they should just be ordinary #[marker] traits, and all #[marker] traits should get the "can be added to other traits in dyn" behaviour.

EDIT: Come to think of it, the "trait addition ok" is probably an ~~OIBIT~~auto trait behaviour, not special to Send/Sync.

scottmcm avatar Aug 04 '20 06:08 scottmcm

@scottmcm I believe it is currently an auto trait behavior because the error message you get when you try Box<dyn Foo + Bar> specifically complains about having more than one non-auto trait. But I think expanding the behavior to include anything #[marker] makes sense.

jgarvin avatar Aug 05 '20 00:08 jgarvin

It feels weird to me that this doesn't compile:

fn write_something(w: &mut (dyn Write + Seek));

But this does:

trait WriteSeek: Write + Seek {}
fn write_something(w: &mut dyn WriteSeek);

Will this fix that?

Timmmm avatar Aug 06 '20 12:08 Timmmm

@scottmcm @jgarvin why #[marker] in particular, but not empty traits (without any virtual methods) in general?

SOF3 avatar Aug 06 '20 13:08 SOF3

@SOF3 I'd assume for the same reason only closures allow type inference in function signatures. It'd be too easy to conflate interface and implementation.

ssokolow avatar Aug 06 '20 14:08 ssokolow

A trait could add new methods with default implementations without breaking semver compatibility.

bjorn3 avatar Aug 06 '20 14:08 bjorn3

@SOF3 @ssokolow I just assumed #[marker] was some way to identify intentionally empty traits. I'm not actually sure what the advantage of labeling them this way is, or what it has to do with interface vs implementation. Their defining feature seems to be the absence of an interface :)

jgarvin avatar Aug 08 '20 20:08 jgarvin

or what it has to do with interface vs implementation. Their defining feature seems to be the absence of an interface :)

A trait like Send or Sync can be added to a trait bound because it requires no vtable or instance pointer. It's just a constraint on what programs will be allowed to compile. Adding methods would break that by requiring a vtable and Rust is all about making costs and decisions explicit, rather than doing transformations behind the scenes to make the pieces match up.

In that context, having #[marker] required to opt you into that behaviour would mean that you could get a compile-time error if you tried to add methods to a marker trait without removing #[marker].

ssokolow avatar Aug 08 '20 20:08 ssokolow

@ssokolow I understand the vtable implication, that's why I was proposing that an easy extension would be to just allow it for all empty traits. Requiring empty traits to be labeled with #[marker] though doesn't have a clear benefit to me -- if the user adds a method and any uses of the trait actually depend on it being a marker they should get an error anyway, and if none do, why make them jump through an extra hoop?

jgarvin avatar Aug 08 '20 21:08 jgarvin

if the user adds a method and any uses of the trait actually depend on it being a marker they should get an error anyway, and if none do, why make them jump through an extra hoop?

Requiring #[marker] prevents downstream crates from depending on the trait being a marker trait when it is not guaranteed. Otherwise a downstream crate could depend on it, when the defining crate may change it to a non-marker trait. This would make adding a method to a trait without methods a breaking change. Requiring #[marker] avoids this problem.

bjorn3 avatar Aug 08 '20 22:08 bjorn3

@bjorn3 why require #[marker] but not require #[object_safe]?

jgarvin avatar Aug 10 '20 00:08 jgarvin

@jgarvin That's been discussed as a possible change for a future edition, actually. To make it so that you can't dyn Foo unless it was originally defined as dyn trait Foo, both for the semver of whether it's object-safe as well as a way to get compiler errors if you accidentally add something to make something not object-safe even though it used to be.

scottmcm avatar Aug 11 '20 05:08 scottmcm

Late to the party, and also don't know if this wasn't pointed before, but dyn (A + B) can be translated to machine code with more fat pointer - that is, three pointers instead of two (one for the data, one for A's vtable, and one for B's vtable). Then casting this to dyn A or dyn B is just matter of playing with the pointers.

Example (pseudocode):

fn foo(x: &dyn (A + B + C)) {
    let y: &dyn (A + C) = x;
    let z: &dyn C = y;
    z.method();
}
// -->
fn foo(x: [*const (); 4]) {
    let y: [*const (); 3] = [x[0], x[1], x[3]];
    let z: [*const (); 2] = [y[0], y[2]];
    unsafe { std::mem::transmute::<*const (), fn(*const ())>(z[1].add(METHOD_OFFSET))(z[0]) };
}

ChayimFriedman2 avatar Mar 03 '21 05:03 ChayimFriedman2

Yes, this is already known, but some people don't want extra fat pointers.(Personally I would be fine with them)

RustyYato avatar Mar 04 '21 13:03 RustyYato

If you don't want extra fat pointers, don't use multiple traits. This is exactly the concept of "zero-cost abstraction".

ChayimFriedman2 avatar Mar 04 '21 13:03 ChayimFriedman2

There exist other approaches like https://internals.rust-lang.org/t/casting-families-for-fast-dynamic-trait-object-casts-and-multi-trait-objects/12195 that pay extra indirection runtime costs but avoid compiler complexity and avoid enlarging metadata, and likely handle up-down casting better too. A priori, it fits less well with borrowing but maybe that's solvable by being careful.

burdges avatar Mar 04 '21 13:03 burdges

@burdges I thought the point of multi-trait objects is ergonomics, and it doesn't seem very ergonomic to me to declare trait implementations to get this. I think I simpler solution is to lean on trait aliases.

  1. You can use that trait alias as a trait object (if all composed traits are object-safe) that's 2 pointers wide.
trait Foo = Bar + Yam + Tek;
fn foo() {
    assert_eq!(std::mem::size_of::<&dyn Foo>(), 2 * std::mem::size_of::<usize>());
}
  1. You can upcast to any one of the traits that make up the trait object i.e.:

dyn Foo can be upcast to one of Bar, Yam, Tek. If you want to upcast to Yam + Tek then you should declare an alias

trait BarTek = Bar + Tek;
trait Foo = Bar + BarTek;

These rules are simple, straightforward, and handle the most common use-case. Which is just fine for sugar.

  1. Explicitly don't support upcasting to arbitrary subsets. (we could allow this if you are statically linking under some circumstances, but it's not necessary)

RustyYato avatar Mar 05 '21 13:03 RustyYato

I do not think multi-trait objects exist only for ergonomics, but rust rarely gets used for such code currently.

I do not disagree otherwise, but the families trick I gave provides an unergonomic casting solution that works now if you really require this, and aligns well with an ergonomic family based implementation doable in rustc, like what you describe. We both declare all valid hops between vtables and place these hops into vtables, but..

How do you cast from dyn Borrow<Foo> to trait BorrowFooBar = Borrow<Foo>+Borrow<Bar>? As declared in core, the vtable of dyn Borrow<Foo> contains a size, alignment, and one method pointer. Are downstream compilation units allowed to increase vtable size?

Yes, there are several issues caused by https://github.com/rust-lang/rust/issues/46139 so vtables already need to be split out into object files touched by multiple compliation units.

Yet, there also exist reasons vtables should be locked by earlier compilation unit, like if vtables were provided by the standard library of a rust-friendly operating system, as opposed to rust's own standard library.

We'd presumably distinguish these like dyn(..) Trait, so dyn(redox) BorrowFooBar cannot be created and dyn(redox) Borrow<Foo> cannot be upcast to anything newer.

burdges avatar Mar 05 '21 14:03 burdges