rust icon indicating copy to clipboard operation
rust copied to clipboard

Tracking issue for RFC #1909: Unsized Rvalues (unsized_locals, unsized_fn_params)

Open aturon opened this issue 7 years ago • 64 comments
trafficstars

This is a tracking issue for the RFC "Unsized Rvalues " (rust-lang/rfcs#1909).

Steps:

Blocking bugs for unsized_fn_params:

  • https://github.com/rust-lang/rust/issues/111175
  • https://github.com/rust-lang/rust/issues/115709 (bad interaction with extern_type: we either need to be okay with post-mono checks or need a trait for "dynamically sized" types)
  • Reject unsized arguments for functions with non-Rust ABI

Related bugs:

  • [x] https://github.com/rust-lang/rust/issues/61335 -- ICE when combined with async-await
  • [x] https://github.com/rust-lang/rust/issues/68304 -- Box<dyn FnOnce> doesn't respect self alignment

Unresolved questions:

  • [ ] What are the MIR semantics for unsized locals? We currently do not have operational semantics for them, and the way they currently work, there are no good operational semantics. This needs a complete from-scratch re-design.
  • [ ] Can we carve out a path of "guaranteed no alloca" optimization? (See #68304 for some related discussion)
  • [ ] Given that LLVM doesn't seem to support alloca with alignment, how do we expect to respect alignment limitations? (See #68304 for one specific instance)
  • [ ] How can we mitigate the risk of unintended unsized or large allocas? Note that the problem already exists today with large structs/arrays. A MIR lint against large/variable stack sizes would probably help users avoid these stack overflows. Do we want it in Clippy? rustc?
  • [ ] How do we handle truely-unsized DSTs when we get them? They can theoretically be passed to functions, but they can never be put in temporaries.
  • [ ] Decide on a concrete syntax for VLAs.
  • [ ] What about the interactions between async-await/generators and unsized locals?
  • [ ] We currently allow extern type arguments with unsized_fn_params, but that does not make much sense and leads to ICEs: https://github.com/rust-lang/rust/issues/115709

aturon avatar Feb 07 '18 17:02 aturon

How do we handle truely-unsized DSTs when we get them?

@aturon: Are you referring to extern type?

Aaron1011 avatar Feb 20 '18 21:02 Aaron1011

@Aaron1011 that was copied straight from the RFC. But yes, I presume that's what it's referring to.

aturon avatar Feb 20 '18 21:02 aturon

Why would unsized temporaries ever be necessary? The only way it would make sense to pass them as arguments would be by fat pointer, and I cannot think of a situation that would require the memory to be copied/moved. They cannot be assigned or returned from functions under the RFC. Unsized local variables could also be treated as pointers.

In other words, is there any reason why unsized temporary elision shouldn't be always guaranteed?

ldr709 avatar Feb 28 '18 03:02 ldr709

Is there any progress on this issue? I'm trying to implement VLA in the compiler. For the AST and HIR part, I added a new enum member for syntax::ast::ExprKind::Repeat and hir::Expr_::ExprRepeat to save the count expression as below:

enum RepeatSyntax { Dyn, None }
syntax::ast::ExprKind::Repeat(P<Expr>, P<Expr>, RepeatSyntax)

enum RepeatExprCount {
  Const(BodyId),
  Dyn(P<Expr>),
}
hir::Expr_::ExprRepeat(P<Expr>, RepeatExprCount)

But for the MIR part, I have no idea how to construct a correct MIR. Should I update the structure of mir::RValue::Repeat and corresponding trans_rvalue function? What should they look like? What is the expected LLVM-IR?

Thanks in advance if someone would like to write a simple mentoring instruction.

F001 avatar May 11 '18 09:05 F001

I'm trying to remove the Sized bounds and translate MIRs accordingly.

qnighy avatar May 26 '18 14:05 qnighy

An alternative that would solve both of the unresolved questions would be explicit &move references. We could have an explicit alloca! expression that returns &move T, and truly unsized types work with &move T because it is just a pointer.

If I remember correctly, the main reason for this RFC was to get dyn FnOnce() to be callable. Since FnOnce() is not implementable in stable Rust, would it be a backward-compatible change to make FnOnce::call_once take &move Self instead? If that was the case, then we could make &move FnOnce() be callable, as well as Box<FnOnce()> (via DerefMove).

cc @arielb1 (RFC author) @qnighy (currently implementing this RFC in #51131) @eddyb (knows a lot about this stuff)

mikeyhew avatar Jul 14 '18 19:07 mikeyhew

@mikeyhew There's not really much of a problem with making by-value self work and IMO it's more ergonomic anyway. We might eventually even have DerefMove without &move at all.

eddyb avatar Jul 14 '18 20:07 eddyb

@eddyb

I guess I can see why people think it's more ergonomic: in order to opt into it, you just have to add ?Sized to your function signature, or in the case of trait methods, do nothing. And maybe it will help new users of the language, since &move wouldn't be show up in documentation everywhere.

If we're going to go ahead with this implicit syntax, then there are a few details that would be good to nail down:

  • If this is syntactic sugar for &move references, what does it desugar too? For function arguments, this could be pretty straightforward: the lifetime of the reference would be limited to the function call, and if you want to extend it past that, you'd have to use explicit &move references. So

    fn call_once(f: FnOnce(i32))) -> i32
    

    desugars too

    fn call_once(f: &move FnOnce(i32)) -> i32
    

    and you can call the function directly on its argument, so foo(|x| x + 1) desugars to foo(&move (|x| x + 1)).

    And to do something fancier, you'd have to resort to the explicit version:

    fn make_owned_pin<'a, T: 'a + ?Sized>(value: &'a move T) -> PinMove<'a, T> { ... }
    
    struct Thunk<'a> {
        f: &'a move FnOnce()
    }
    

    Given the above semantics, DerefMove could be expressed using unsized rvalues, as you said:

    EDIT: This is kind of sketchy though. What happens if the implementation is wrong, and doesn't call f?

    // this is the "closure" version of DerefMove. The alternative would be to have an associated type
    // `Cleanup` and return `(Self::Target, Self::Cleanup)`, but that wouldn't work with unsized
    // rvalues because you can't return a DST by value
    fn deref_move<F: FnOnce(Self::Target) -> O, O>(self, f: F) -> O;
    
    // explicit form
    fn deref_move<F: for<'a>FnOnce(&'a move Self::Target) -> O, O>(&'a move self, f: F) -> O;
    

    I should probably write an RFC for this.

  • When do there need to be implicit allocas? I can't actually think of a case where an implicit alloca would be needed. Any function arguments would last as long as the function does, and wouldn't need to be alloca'd. Maybe something involving stack-allocated dynamic arrays, if they are returned from a block, but I'm pretty sure that's explicitly disallowed by the RFC.

mikeyhew avatar Jul 20 '18 03:07 mikeyhew

@eddyb have you seen @alercah's RFC for DerefMove? https://github.com/rust-lang/rfcs/pull/2439

mikeyhew avatar Jul 20 '18 05:07 mikeyhew

As a next step, I'll be working on trait object safety.

qnighy avatar Aug 20 '18 00:08 qnighy

@mikeyhew Sadly @alercah just postponed their DerefMove RFC, but I think a separate RFC for &move that complements that (when it does get revived) would be very much desirable. I would be glad to assist with that even, if you're interested.

alexreg avatar Aug 22 '18 01:08 alexreg

@alexreg I would definitely appreciate your help, if I end up writing an RFC for &move.

The idea I have so far is to treat unsized rvalues as a sort of sugar for &move references with an implicit lifetime. So if a function argument has type T, it will be either be passed by value (if T is Sized) or as a &'a move T, and the lifetime 'a of the reference will outlive the function call, but we can't assume any more than that. For an unsized local variable, the lifetime would be the variable's scope. If you want something that lives longer than that, e.g. you want to take an unsized value and return it, you'd have to use an explicit &move reference so that the borrow checker can make sure it lives long enough.

mikeyhew avatar Aug 23 '18 03:08 mikeyhew

@mikeyhew That sounds like a reasonable approach to me. Has anyone specified the supposed semantics of &move yet, even informally? (Also, I'm not sure if bikeshedding on this has already been done, but we should probably consider calling it &own.)

alexreg avatar Aug 24 '18 03:08 alexreg

Not sure if this is the right place to document this, but I found a way to make a subset of unsized returns (technically, all of them, given a T -> Box<T> lang item) work without ABI (LLVM) support:

  • only Rust ABI functions can return unsized types
  • instead of passing a return pointer in the call ABI, we pass a return continuation
    • we can already pass unsized values to functions, so if we could CPS-convert Rust functions (or wanted to), we'd be done (at the cost of a stack that keeps growing)
    • @nikomatsakis came up with something similar (but only for Box) a few years ago
  • however, only the callee (potentially a virtual method) needs to be CPS-like, and only in the ABI, the callers can be restricted and/or rely on dynamic allocation, not get CPS-transformed
  • while Clone becoming object-safe is harder, this is an alright starting point:
// Rust definitions
trait CloneAs<T: ?Sized> {
    fn clone_as(&self) -> T;
}
impl<T:  Trait + Clone> CloneAs<dyn Trait> for T {
    fn clone_as(&self) -> dyn Trait { self.clone() }
}
trait Trait: CloneAs<dyn Trait> {}
// Call ABI signature for `<dyn Trait as CloneAs<dyn Trait>>::clone_as`
fn(
     // opaque pointer passed to `ret` as the first argument
    ret_opaque: *(),
    // called to return the unsized value
    ret: fn(
        // `ret_opaque` from above
        opaque: *(),
        // the `dyn Trait` return value's components
        ptr: *(), vtable: *(),
    ) -> (),
    // `self: &dyn Trait`'s components
    self_ptr: *(), self_vtable: *(),
) -> ()
  • the caller would use the ret_opaque pointer to pass one or more sized values to its stack frame
    • could allow ret return one or two pointer-sized values, but that's an optional optimization
  • we can start by allowing composed calls, of this MIR shape:
y = call f(x); // returns an unsized value
z = call g(y); // takes the unsized value and returns a sized one
// by compiling it into:
f(&mut z, |z, y| { *z = call g(y); }, x)
  • this should work out of the box for {Box,Rc,...}::new(obj.clone_as())
  • while we could extract entire portions of the MIR into these "return continuations", that's not necessary for being able to express most things: worst case, you write a separate function
  • since Box::new works, anything with a global allocator around could fall back to that
    • let y = f(x); would work as well as let y = *Box::new(f(x));
    • its cost might be a bit high, but so would that of a proper "unsized return" ABI
  • we can, at any point, switch to an ABI where e.g. the value is copied onto the caller's stack, effectively "extending it on return", and there shouldn't be any observable differences

cc @rust-lang/compiler

eddyb avatar Aug 25 '18 16:08 eddyb

@alexreg

Has anyone specified the supposed semantics of &move yet, even informally?

I don't think it's been formally specified. Informally, &'a move T is a reference that owns its T. It's like

  • an &'a mut T that owns the T instead of mutably borrowing it, and therefore drops the T when dropped, or
  • a Box<T> that is only valid for the lifetime 'a, and doesn't free heap allocated memory when dropped (but still drops the T).

(Also, I'm not sure if bikeshedding on this has already been done, but we should probably consider calling it &own.)

Don't think that bikeshed has been painted yet. I guess &own is better. It requires a new keyword, but afaik it can be a contextual keyword, and it more accurately describes what is going on. Often times you would use it to avoid moving something in memory, so calling it &move T would be confusing, and plus there's the problem of &move ||{}, which looks like &move (||{}) but would have to mean & (move ||{}) for backward compatibility.

mikeyhew avatar Aug 25 '18 20:08 mikeyhew

@mikeyhew Oh, I'm sorry I haven't replied to the &move thread, only noticed just now that some of it was addressed at me. I don't think &move is a good desugaring for unsized values.

At most, &move T (without an explicit lifetime), is an ABI detail of passing down a T argument "by reference" - we already do this for types larger than registers, and it naturally extends to unsized T.

And even with &move T in the language, you don't get a mechanism for returning "detached" ownership of an unsized value, based solely on it, as returning &'a move T means the T value must live "higher-up" (i.e. 'a is for how long the caller provided the memory space for the T value).

My previous comment, https://github.com/rust-lang/rust/issues/48055#issuecomment-415980600 provides one such mechanism, that we can implement today (others exist but require e.g. adding a new calling convention to LLVM). Only one such mechanism, if general enough to support all callees (of Rust ABI), is needed, in order to support the surface-level feature, and its details can/should remain implementation-defined.


So IMO &move T is orthogonal and the only intersection here is that it "happens to" match what some ABIs do when they have to pass large (or in this case, of unknown size) values in calls. I do want something like &move T for opting into Box's behavior from any library etc. but not for this.

eddyb avatar Aug 26 '18 06:08 eddyb

@eddyb

Automatic boxing by the compiler feels weird to me (it occurs only in this special case), and people will do MyRef::new(x.clone(), do_xyz()?), which feels like it would require autoboxing to implement sanely.

However, we can force unsized returns to always happen the "right way" (i.e., to immediately be passed to a function with an unsized parameter) by a check in MIR. Maybe we should do that?

arielb1 avatar Aug 26 '18 07:08 arielb1

@arielb1 Yes, I'm suggesting that we can start with that MIR check and maybe add autoboxing if we want to (and only if an allocator is even available).

eddyb avatar Aug 26 '18 09:08 eddyb

A check that the unsized return value is immediately passed to another function, which can then be CPS'd without all the usual problems of CPS in Rust, sounds like it would work well enough technically. However, the user experience might be non-obvious:

  • The restriction on what to do with the return value can be learned, but is still weird
  • Having to CPS-transform manually in more complex scenarios is quite annoying
  • It affects peak stack usage in a way that is not visible from the source code and completely surprising unless one knows how it's implemented

I think it's worthwhile thinking long and hard whether there might be more tailored features that address the usage patterns that we want to enable with unsized return values. For example, cloning trait objects might also be achieved by extending box.

hanna-kruppe avatar Aug 28 '18 14:08 hanna-kruppe

@rkruppe If we can manage to "pass" unsized values down the stack without actually copying them, I don't think the stack usage should actually increase from what you would need to e.g. call Box::new.

For example, cloning trait objects might also be achieved by extending box.

Sure, but, AFAIK, all ideas, until now, for "emplace" / boxing, seemed to involve generics, whereas my scheme operates at the ABI level and requires making no assumptions about the unsized type. (e.g. @nikomatsakis' previous ideas relied on the fact that Clone::clone returns Self, therefore, you can get the size/alignment from its self value, and allocate the destination before calling it)

eddyb avatar Aug 29 '18 01:08 eddyb

@rkruppe If we can manage to "pass" unsized values down the stack without actually copying them, I don't think the stack usage should actually increase from what you would need to e.g. call Box::new.

The stack size increase is not from the unsized value but from the rest of the stack frame of the unsized-value-returning function. For example,

fn new() -> dyn Debug {
    let _buf1 = [0u8; 1 << 10]; // this remains allocated "after returning" ...
    "hello world" // ... because here we actually call take()
}
fn take(arg: dyn Debug) {
    let _buf2 = [0u8; 1 << 10];
    println!("{:?}", arg);
}
fn foo() {
    take(new()); // 2 KB peak stack usage, not 1 KB
}

(And we can't do tail call optimization because new is passing a pointer to its stack frame to the continuation, so its stack frame needs to be preserved until after the call.)

hanna-kruppe avatar Aug 29 '18 11:08 hanna-kruppe

@rkruppe I assume you meant take instead of arg? ~~However, I think you need to take into account that _buf1 is not live when the the function returns, and continuation get called. At that point, only a copy of "hello world" should be left on the stack.~~

~~In other words, your new function is entirely equivalent to this:~~

fn new() -> dyn Debug {
    let return_value: dyn Debug = {
        let _buf1 = [0u8; 1 << 10];
        "hello world"
    };
    return_value
}

EDIT: @rkruppe clarified below.

eddyb avatar Sep 08 '18 12:09 eddyb

We discussed on IRC, for the record: LLVM does not shrink the stack frame (by adjusting the stack pointer), not ever for static allocs and only on explicit request (stacksave/stackrestore intrinsics) for dynamic allocas.

hanna-kruppe avatar Sep 08 '18 21:09 hanna-kruppe

@eddyb would you help me with design decision? For by-value trait object safety, I need to insert shims to force the receiver to be passed by reference in the ABI. For example, for

trait Foo {
    fn foo(self);
}

to be object safe, <Foo as Foo>::foo should be able to call <T as Foo>::foo without knowing T. However, concrete <T as Foo>::foos may receive self directly, making it difficult to call the method without knowledge of T. Therefore we need a shim of <T as Foo>::foo that always receives self indirectly.

The problem is: how do I introduce two instances of <i32 as Foo>::foo for example? AFAICT there are two ways:

  1. Introduce a new DefPath Foo::foo::{{vtable-shim}}. This is rather straightforward but we'll need to change all the related IDs: NodeId, HirId, DefId, Node, and DefPath. This seems too much for mere shims.
  2. Use the same DefPath Foo::foo, but include another salt for the symbol hash to distinguish it from the original Foo::foo. It still affects back to rustc::ty but will not change syntax.

I've been mainly working with the first approach, but am being attracted by the second one. I'd like to know if there are any circumstances that we should prefer either of these.

qnighy avatar Sep 09 '18 06:09 qnighy

I'd think you'd use a new variant in InstanceDef, and manually craft it, when creating the vtable. And in the MIR shim code, you'd need to habdle that new variant. I would expect that would be enough for the symbols to be distinct (if not, that's a separate bug).

eddyb avatar Sep 09 '18 06:09 eddyb

With respect to the surface syntax for VLAs, I'm highly (to put it mildly) skeptical of permitting [T; n] because:

  • the risk of introducing accidental VLAs is too high particularly for folks who are used to new int[n] (Java).
  • n may be a const n: usize parameter and therefore "captures no variables" is not enough.

I'd suggest that we should use [T; dyn expr] or [T; let expr] to distinguish; this also has the advantage that you don't have to store anything in a temporary.

I'll add an unresolved question for this.

Centril avatar Oct 07 '18 10:10 Centril

[T; dyn expr] makes sense to me, for the reason you suggest.

alexreg avatar Oct 07 '18 14:10 alexreg

As #54183 is merged, <dyn FnOnce>::call_once is callable now! I'm going to work on these two things:

Next step I: unsized temporary elision

As a next step, I'm thinking of implementing unsized temporary elision. I think we can do this as follows: firstly, we can restrict allocas to "unsized rvalue generation sites". The only unsized rvalue generation sites we already have are dereferences of Box. We can codegen other unsized rvalue assignments as mere copies of references.

In addition to that, we can probably infer lifetimes of each unsized rvalue using the borrow checker. Then, if the origin of an unsized rvalue generation site lives longer than required, we can elide the alloca there.

Next step II: Box<FnOnce>

Not being exactly part of #48055, it would be useful to implement FnOnce for Box<impl FnOnce>. I think I can do this without making it insta-stable or immediately breaking fnbox #28796, thanks to specialization.

qnighy avatar Oct 27 '18 23:10 qnighy

@qnighy Great work. Thanks for your ongoing efforts with this... and I agree, those seem like good next steps to take.

alexreg avatar Oct 27 '18 23:10 alexreg

Ah, annotating a trait impl with #[unstable] doesn't make sense? impl CoerceUnsized<NonNull<U>> for NonNull<T> has one but doesn't show any message on the doc. It can be used without features. Does anyone have an idea to prevent impl<F: FnOnce> FnOnce for Box<F> from being insta-stable?

qnighy avatar Oct 28 '18 02:10 qnighy