creusot icon indicating copy to clipboard operation
creusot copied to clipboard

Non extensionality

Open arnaudgolfouse opened this issue 1 year ago • 33 comments

Test the removal of extensionality for mutable references (see #883).

Some specifications no longer work, like

#[ensures(result == &mut x.0)]
fn f(x: &mut (i32, i32)) -> &mut i32 {
    &mut x.0
}

arnaudgolfouse avatar Oct 26 '23 12:10 arnaudgolfouse

Here is a list of the tests that now fail:

  • should_succeed/bdd.mlcfg
  • should_succeed/heapsort_generic.mlcfg
  • should_succeed/hillel.mlcfg
  • should_succeed/iterators/02_iter_mut.mlcfg
  • should_succeed/iterators/03_std_iterators.mlcfg
  • should_succeed/iterators/04_skip.mlcfg
  • should_succeed/iterators/05_map.mlcfg
  • should_succeed/iterators/06_map_precond.mlcfg
  • should_succeed/iterators/12_zip.mlcfg
  • should_succeed/iterators/13_cloned.mlcfg
  • should_succeed/iterators/14_copied.mlcfg
  • should_succeed/iterators/15_enumerate.mlcfg
  • should_succeed/iterators/16_take.mlcfg
  • should_succeed/knapsack.mlcfg
  • should_succeed/knapsack_full.mlcfg
  • should_succeed/list_reversal_lasso.mlcfg
  • should_succeed/projection_toggle.mlcfg
  • should_succeed/red_black_tree.mlcfg
  • should_succeed/rusthorn/inc_max.mlcfg
  • should_succeed/rusthorn/inc_max_many.mlcfg
  • should_succeed/rusthorn/inc_max_repeat.mlcfg
  • should_succeed/selection_sort_generic.mlcfg
  • should_succeed/sparse_array.mlcfg
  • should_succeed/take_first_mut.mlcfg
  • should_succeed/vector/01.mlcfg
  • should_succeed/vector/02_gnome.mlcfg
  • should_succeed/vector/03_knuth_shuffle.mlcfg
  • should_succeed/vector/06_knights_tour.mlcfg

arnaudgolfouse avatar Oct 26 '23 15:10 arnaudgolfouse

should_succeed/rusthorn/inc_max.mlcfg should_succeed/rusthorn/inc_max_many.mlcfg should_succeed/rusthorn/inc_max_repeat.mlcfg

These ones are interesting, it means we are really losing something compared to rusthorn.

xldenis avatar Oct 26 '23 16:10 xldenis

Are we sure this is not yet another caprice from provers?

jhjourdan avatar Oct 26 '23 18:10 jhjourdan

Are we sure this is not yet another caprice from provers?

These proofs are very stable and fast so I don't think so.

#[ensures(if *ma >= *mb { *mb == ^mb && result == ma }
                   else { *ma == ^ma && result == mb })]
fn take_max<'a>(ma: &'a mut u32, mb: &'a mut u32) -> &'a mut u32 {
    if *ma >= *mb {
        ma
    } else {
        mb
    }
}

I think that the version @arnaudgolfouse currently implemented is probably losing the identity of borrows in some preventable situations like &mut * x.

xldenis avatar Oct 26 '23 19:10 xldenis

I think we might need something like a Borrow.update (x : borrowed 'a) (y : borrowed 'a) which guarantees that the result is "x with y as its current value" but I don't know how to write that spec precisely enough..

xldenis avatar Oct 26 '23 19:10 xldenis

I wonder if another option wouldn't be to simply add a third field addr to the borrow type of some opaque type t. We wouldn't have to change anything about the translation then. It should avoid most of the breakage we've seen here.

xldenis avatar Oct 27 '23 10:10 xldenis

Would that addr field be inherited on reborrows (so TermKind::Reborrow { cur, fin } would become TermKind::Reborrow { cur, fin, addr }) ?

arnaudgolfouse avatar Oct 27 '23 12:10 arnaudgolfouse

Would that addr field be inherited on reborrows

Yes, that's the essential thing we need to fix here.

(so TermKind::Reborrow { cur, fin } would become TermKind::Reborrow { cur, fin, addr }) ?

No, my revised proposal is that this is entirely hidden within creusot and only happens at the prelude.mlw level:

module Borrow
 
type t 

type borrowed 'a = { current: 'a; final : 'a, addr : t }
end

Then doing { x with current = foo } will implicitly preserve the address while ensuring that *x = *y -> ^x = ^y -> x = y does not hold.

xldenis avatar Oct 27 '23 12:10 xldenis

I'm pretty sure that naively adding an addr field is unsound. I have not searched for a counter example, but @dewert99's example seems to show that.

I think we should assume the third field is unchanged when:

  • assigning to a mutable borrow (without creating a reborrow)
  • reborrowing a borrow (or a subfield of it) which we resolve immediately after.

If we do this, then we should remain sound (I can imagine an extension of RustHornBelt supporting this), but we support the two problematic examples we have seen above.

jhjourdan avatar Oct 27 '23 13:10 jhjourdan

I have not searched for a counter example, but @dewert99's example seems to show that.

Which example? this one? https://github.com/xldenis/creusot/issues/869#issue-1913537036

I don't understand how it would imply that an opaque 'address' field would still be unsound?

  • assigning to a mutable borrow (without creating a reborrow)
  • reborrowing a borrow (or a subfield of it) which we resolve immediately after.

The problem is these aren't really notions that are present in MIR and thus highly non-compositional.

I think that the 'address' should be preserved in the same situations the physical address would be, so in particular a reborrow should have no impact on it since it's a nop operationally.

xldenis avatar Oct 27 '23 13:10 xldenis

I think that the 'address' should be preserved in the same situations the physical address would be, so in particular a reborrow should have no impact on it since it's a nop operationally.

And I think this is unsound, since there can be several prophecies for the same address (hence the third field does not determines the prophecy, and hence the equality of borrows is not implied by the equality of current values and of the third field).

At the very least, I don't see how we could build a model of this in RustHornBelt.

I have not searched for a counter example, but @dewert99's example seems to show that.

Which example? this one? https://github.com/xldenis/creusot/issues/869#issue-1913537036 I don't understand how it would imply that an opaque 'address' field would still be unsound?

This one: https://github.com/xldenis/creusot/pull/902#discussion_r1373742595

The crucial point here is that evil and *bg share the same addr fields, since they are both reborrows of xm.

jhjourdan avatar Oct 27 '23 13:10 jhjourdan

  • assigning to a mutable borrow (without creating a reborrow)
  • reborrowing a borrow (or a subfield of it) which we resolve immediately after.

The problem is these aren't really notions that are present in MIR and thus highly non-compositional.

Does assigning to a mutable borrow really is not something primitive in MIR?

As for reborrowing, I agree this is something that does not exists in MIR but it is typically something the Rust user writes directly, so we could see this as a kind of "decompilation".

jhjourdan avatar Oct 27 '23 13:10 jhjourdan

Does assigning to a mutable borrow really is not something primitive in MIR?

The only thing that exists in MIR is assigning a place to a place; that can contain arbitrarily nested dereferences, indexing, and field accesses.

Similarly, in MIR all that exists is borrowing a place, the place being borrowed can itself contain dereferences, indexing or other field accesses.

Additionally, reborrowing is done systematically in Rust, the compiler intentionally performs it before any operation that could involve unifiying lifetimes ie: every function parameter, many simple borrow manipulations. The handling that Creusot has for all this operations is natural and extensional, I would be very reticent to give that up for a non-modular translation.

I think we should relax extensionality, I do think adding an address field should be sound (ignoring ghost code), and not costly in terms of expressivity if we're careful. If cost code introduces unsoundness then it's ghost code that should be amputated not the other way around.

xldenis avatar Oct 27 '23 14:10 xldenis

The only thing that exists in MIR is assigning a place to a place; that can contain arbitrarily nested dereferences, indexing, and field accesses.

Ok, that's what I thought. Then why do you say that assigning to a borrow (i.e., assigning to a place containing a dereference) is not primitive in MIR?

Similarly, in MIR all that exists is borrowing a place, the place being borrowed can itself contain dereferences, indexing or other field accesses.

The problem is that Rust inserts reborrows where the user has not written a reborrow (example: take_max). Also, MIR does only support splitting a borrow via reborrowing.

Another (perhaps more "composable") way to do this, in a more or less equivalent way: when reborrowing, we assume the third field stay the same only if the lifetimes (of the new borrow and of the dereferenced borrow) are the same. We should be cautious about unnesting, though.

jhjourdan avatar Oct 27 '23 14:10 jhjourdan

Ok, that's what I thought. Then why do you say that assigning to a borrow (i.e., assigning to a place containing a dereference) is not primitive in MIR?

Because it's not actually assigning to a borrow, that borrow could be nested inside of another borrow etc... we cannot translate it it into a 'primitive' update of the why3 local. My point is that we shouldn't try to distinguish 'oh you're just doing *x' from other writes, because in MIR there is no difference. Going against the grain here is a recipe for pain and weird broken edge cases.

The problem is that Rust inserts reborrows where the user has not written a reborrow (example: take_max). Also, MIR does only support splitting a borrow via reborrowing.

Another (perhaps more "composable") way to do this, in a more or less equivalent way: when reborrowing, we assume the third field stay the same only if the lifetimes (of the new borrow and of the dereferenced borrow) are the same. We should be cautious about unnesting, though.

That poses the usual difficulties of lifetime-dependent translations which we've mostly avoided until now: MIR has all lifetime information erased in it. @voidc's work has enabled us to recover some of that, but I'm not sure it covers every single lifetime which appears in a MIR body.

Additionally, why wouldn't we just recover the third field of the 'inner' borrow in an unnest?

See: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=cfe66114b2ac8ceee26246fd4f95d4ef

xldenis avatar Oct 27 '23 14:10 xldenis

I am wondering if addr could be a list of addresses: then a reborrow would append to that list, meaning that

  • A borrow and its reborrow are 'compatible', since one address list is a prefix of the other
  • Two reborrows of the same borrow are 'incompatible'

I'm not sure if that would actually work, and in particular how to express this compatibility property.

arnaudgolfouse avatar Oct 27 '23 14:10 arnaudgolfouse

Additionally, why wouldn't we just recover the third field of the 'inner' borrow in an unnest?

Again, that's unsound for the same reason as in @dewert99 's example!

jhjourdan avatar Oct 27 '23 14:10 jhjourdan

I am wondering if addr could be a list of addresses: then a reborrow would append to that list, meaning that

We need to build a notion of equality over borrows that we can use in both ghost and logic code without losing soundness.

The equality relation needs to be an equivalence relation (that's equality, after all). So your "compatibility" relation needs to be an equivalence relation. If reborrows are compatible with their initial borrow, then they must be compatible between themselves, don't you think?

jhjourdan avatar Oct 27 '23 14:10 jhjourdan

Hm, I see...

arnaudgolfouse avatar Oct 27 '23 14:10 arnaudgolfouse

We need to build a notion of equality over borrows that we can use in both ghost and logic code without losing soundness.

I think that equality in ghost code should just be PartialEq::eq, nothing more nothing less.

EDIT: Or similarly, we need gh! to only work on snapshots of values so that prophecies are stripped out.

xldenis avatar Oct 27 '23 14:10 xldenis

EDIT: Or similarly, we need gh! to only work on snapshots of values so that prophecies are stripped out.

We already discussed that. This does not compose, because that would mean defining a notion of "prophecy stripping" that works for every data structure.

jhjourdan avatar Oct 27 '23 15:10 jhjourdan

I think that equality in ghost code should just be PartialEq::eq, nothing more nothing less.

This proposal also has its own problems: we need to make sure that PartialEq::eq always terminates, which is not something we are currently doing because this is a program function, and I don't expect this to be easy in the general case because it is typically a function which may be defined by mutual (or nested) recursion.

jhjourdan avatar Oct 27 '23 15:10 jhjourdan

One simple option would be not to allow checking equality in ghost code. We could potentially also make a trait:

trait GhostEq {
  #[ghost]
  #[ensures(result == (self == other))]
  fn ghost_eq(self, other: Self) -> bool;
}

and allow checking equality of types that implement GhostEq. It should be possible to implement GhostEq on most (all?) types other than mutable references. If we want a connection PartialEq::eq we could add the constraint to DeepModelTy: GhostEq to the DeepModel trait so that x.deep_model() == y.deep_model() (which is the logical interpretation of PartialEq).

dewert99 avatar Oct 27 '23 16:10 dewert99

It seems to me that if we do this, then @dewert99 still needs mutable references to be non-extensional. @dewert99, do you confirm?

In that case, I would really prefer if we could solve both problems at the same time if a solution like the one I'm proposing (adding the right assumption about the third field) do not have too many drawbacks.

@xldenis : don't take me wrong. I do agree that we need a notion of ghost code that is close to the notion of executable code in Creusot. But that's mostly for reasons related to expressiveness and usability, not because of soundness. The current situation where (logically) comparing borrows in ghost code is unsound seems really bad to me. To me, this is counter-intuitive and dangerous. Even if we syntactically forbid this, the user could easily axiomatize a ghost function performing this comparison (maybe this user does not even know she is doing this, she might be comparing polymorphic objects), and then she just introduced an unsound hypotheses which is particularly difficult to detect.

jhjourdan avatar Oct 27 '23 16:10 jhjourdan

It seems to me that if we do this, then @dewert99 still needs mutable references to be non-extensional. @dewert99, do you confirm?

Yes, although for the Prusti->Creusot translation, mutable references with the same physical address can be considered equal.

dewert99 avatar Oct 27 '23 19:10 dewert99

But that's mostly for reasons related to expressiveness and usability, not because of soundness. The current situation where (logically) comparing borrows in ghost code is unsound seems really bad to me.

Agreed, but if ghost code really were just program code (but erased), then the problem wouldn't occur either. In all these discussions I think there's a deeper underlying issue about the nature of 'snapshots' in ghost code which is involved in the unsoundnesses we've seen. I think the idea of being able to "copy" a program value into ghost code is highly problematic with Creusot's underlying translation; it inherently leads to the duplication of the prophecy. I believe that until we solve that issue the soundness of ghost code will always be precarious at best.

However, in the meantime, we need to salvage a clear path forward for @arnaudgolfouse, I still think that my approach of adding a third field satisfies your critieria @jhjourdan, but if not we need to discuss a new plan asap.

xldenis avatar Oct 30 '23 09:10 xldenis

I added a third field, that is not inherited for reborrows currently (a new one is generated for each borrow/reborrow). This is equivalent with the previous approach, while fixing the following tests:

  • should_succeed/hillel
  • should_succeed/iterators/04_skip
  • should_succeed/knapsack
  • should_succeed/list_reversal_lasso
  • should_succeed/vector/01
  • should_succeed/vector/02_gnome
  • should_succeed/vector/03_knuth_shuffle

This is because in loops where a reborrow happen, like

let v : &mut Vec<i32> = /* */;
loop {
    v.push(0);
}

^v stays the same, and thus it should be added as an invariant to the loop. Making addr an explicit third field helps with that.

arnaudgolfouse avatar Nov 03 '23 09:11 arnaudgolfouse

So we talked about this, and here is what we think should be done:

  • Borrows should have a third "addr" (or "id") field (this is the current state of this branch) ;
  • This field should be "inherited" for reborrows: "direct" reborrows (&mut *x) do not change it, and borrows of fields are deterministic (two successive &mut x.field yield the same address).

@dewert99 This should satisfy the needs of the Prusti translation ? The issue of unsoundness in ghost code will be addressed in another way.

I don't really have the time and knowledge of the code to implement this, but I'm leaving this branch free for you to take over 🙂

arnaudgolfouse avatar Nov 09 '23 16:11 arnaudgolfouse

The issue of unsoundness in ghost code will be addressed in another way.

What is the plan for this?

dewert99 avatar Nov 09 '23 17:11 dewert99

To disallow the use of comparisons (==, !=) inside gh! { ... } blocks

arnaudgolfouse avatar Nov 10 '23 09:11 arnaudgolfouse