nll-rfc [lifetimes] lifetime-liveness of drops of uninitialized values

Moved from a comment on #40 and modified.

@nikomatsakis had discovered the following interaction:

fn foo() {
    let x = Box::new(RefCell::new(42));

    let ref_x: &'α RefCell<u32> = &x;
    let inner_ref: Ref<'α, u32> = RefCell::borrow(ref_x);
    // ^ for all you guys following me, `Ref` maintains a borrow to its parent
    // `RefCell` that it clears during its very-not-blind destructor.
    std::mem::drop(inner_ref);

//  (†)
    std::mem::drop(x); // free the box
    maybe_unwind();
//  (‡)
}

This code is legal only of 'α is not required to be live at (†). There are no explicit uses of 'α after inner_ref is dropped, so it might seem like 'α wouldn't be live at †, but there are 2 MIR drops for inner_ref, that can both use ref_x and keep it live:

at end of scope at point ‡
at the unwind path, which can be reached from maybe_unwind

Dynamically speaking, inner_ref is in an uninitialized state after each of the drops, so both drops are no-ops and this program is sound.

However, if we go according to liveness requirements according to the current dropck rules, both of these implicit MIR drops will force 'α to be live at (†) and cause the conflict.

If we care about it, a simple dataflow analysis could discover that the drop at ‡ is always dead.

The drop at the unwind path looks like it might be a hard nut to crack - there's only 1 unwind path for the function, ending at the single resume terminator. However, I think there's a trick we can use - if a value is uninitialized, then I'm quite sure it can never be live. We could backwards-propagate α live if VARS are initialized dataflow facts, and combine them with forwards-propagated VARS is maybe-initialized facts.

Aug 22 '17 19:08 arielb1

(To be clear, in the example that @arielb1 added, the drop is something more like std::mem::drop(inner_ref), and not the MIR drop operation. Or else I am confused.)

Aug 24 '17 11:08 nikomatsakis

The drop at the unwind path looks like it might be a hard nut to crack - there's only 1 unwind path for the function, ending at the single resume terminator. However, I think there's a trick we can use - if a value is uninitialized, then I'm quite sure it can never be live. We could backwards-propagate α live if VARS are initialized dataflow facts, and combine them with forwards-propagated VARS is maybe-initialized facts.

This is roughly the same analyses that we do for drop elaboration, right? I was wondering if it would make sense to think about separating out the initialization analysis and drop elaboration from the rest of borrow check. In other words, we would first check that everything is initialized when it is used (where a move counts as a de-initialization). We would then elaborate drops. We could then run borrow-check on this elaborated form, with the analysis assuming that all values are initialized when used (even if it's not always obvious). Would this suffice? I guess I have to experiment and check out what MIR results.

Aug 24 '17 11:08 nikomatsakis

(To be clear, in the example that @arielb1 added, the drop is something more like std::mem::drop(inner_ref), and not the MIR drop operation. Or else I am confused.)

Sure, because you can't write a MIR drop.

Aug 24 '17 11:08 arielb1

This is roughly the same analyses that we do for drop elaboration, right? I was wondering if it would make sense to think about separating out the initialization analysis and drop elaboration from the rest of borrow check. In other words, we would first check that everything is initialized when it is used (where a move counts as a de-initialization).

No. The MIR generated is (taking a parameter to make the MIR simpler):

fn foo(x: Box<RefCell<u32>>) {
bb1:
    let ref_x: &'α RefCell<u32> = &x;
    let inner_ref: Ref<'α, u32> = RefCell::borrow(ref_x) -> bb2 [unwind=bbU1]
bb2:
    // This wasn't present in the original, but I expected it - turns
    // out to be somewhat of a complication
    maybe_unwind() -> bb3 [unwind=bb2U]
bb3:
    // ^ for all you guys following me, `Ref` maintains a borrow to its parent
    // `RefCell` that it clears during its very-not-blind destructor.
    std::mem::drop(inner_ref) -> bb4 [unwind=bbU2]
bb4:
//  (†)
    std::mem::drop(x) -> bb5 [unwind=bbU2] // free the box
bb5:
    maybe_unwind() -> bb6 [unwind=bbU2]
bb6:
//  (‡)
    DROP inner_ref // this drop is deleted by drop elaboration
    DROP x // this drop is deleted by drop elaboration
    ret = () // don't nitpick
    return
bbU2:
    // this drop can be called from both bb2 & bb4,
    // and therefore can be both live and dead, and
    // therefore we create a drop flag.
    //
    // Yes, this is an inefficiency, but *I think* LLVM
    // can get rid of it pretty quickly in SROA+SimplifyCFG.
    DROP inner_ref -> bbU1 
bbU1:
    // same, but less important for us.
    DROP x -> bbU0
bbU0:
    resume
    
}

Aug 24 '17 11:08 arielb1

@arielb1 ok, so the problem is that in bbU2, we have incoming branches from both bb2 (where the DROP would occur) and bb4 (where it would not)? In other words, if we were to do a more "thorough" drop elaboration, we'd be ok here (not saying that is the impl strategy we should take necessarily, I can imagine it would lead to explosions in the graph or something like that).

Aug 24 '17 12:08 nikomatsakis

@nikomatsakis

Sure.

Aug 24 '17 15:08 arielb1

@arielb1 so I was talking about this issue with @pnkfelix and it got me thinking. It seems to me that if we do a simple is-initialized analysis, we will have the same problem that the "elaborate drops" pass does: the unwinding path both before and after invoking mem::drop(x) is the same, and hence we get into a "maybe initialized" state.

I was thinking about how we might fix this. The most obvious thing is to generate more unwind paths, but that's going to lead to worse compile time.

I think what you were saying is that, when we merge in bits from a successor, if we see that the value may be dropped, and we also know that it is definitely not initialized, we can clear those bits? If so, that makes sense. (I guess this means that the "may drop" bits from bbU2 would be ignored by bb4, but not bb2).

Seems like something we could readily prototype.

Aug 24 '17 21:08 nikomatsakis

@nikomatsakis

I had some odd dataflow-ish solution in mind, which I could try to write in more detail. I believe it means we can do this sanely in a sane amount of time.

Aug 24 '17 22:08 arielb1

@arielb1 ok -- I was imagining we might do a "must be moved" analysis first, and then use the results of that to inform the liveness (i.e., if a variable must be moved at some point X, we don't care if it is live or if it may be dropped).

Aug 24 '17 22:08 nikomatsakis

nll-rfc nll-rfc copied to clipboard

[lifetimes] lifetime-liveness of drops of uninitialized values

nll-rfc
nll-rfc copied to clipboard