dejavu
dejavu copied to clipboard
Test the soundness of `vm::Value`
Edit: This has been addressed; this issue now tracks testing to ensure this doesn't get broken.
The VM stack contains three kinds of raw pointers, with relatively unclear rules around when and how they can be used soundly. This makes it hard to determine whether (modifications to) the interpreter are correct. Worse, it exposes the entirely-safe engine APIs to unsoundness- for example, this is the main blocker for implementing instance_create
and instance_destroy
.
These are the three kinds of pointers:
-
vm::Value
can encode avm::Array
, which is itself anRc<UnsafeCell>
. However, becausevm::Value
isCopy
, theRc
is not helpful at this level- it's there to implement copy-on-write arrays at the GML level. -
vm::Row
is an intermediateptr::NonNull
into avm::Array
. It is intended to be "short-lived" (i.e. not stored in instances) but we may still want Dejavu to be allowed to produce optimized bytecode such that it lives across calls to other scripts or even engine APIs. - A
with
iterator is aptr::NonNull
to some array ofvm::Entity
s owned by thevm::Thread
orvm::World
. It is "short-lived" in the same way asvm::Row
, but unlikevm::Row
is required by GML semantics to live across calls in the body of thewith
loop.
There are two separate-but-related problems we need to avoid:
- The most obvious problem is that these pointers are not tracked by the borrow checker, making it easy to use one after its target is gone, on accident. This risk is mitigated somewhat for
vm::Row
andwith
iterators because they are confined to the VM stack, but it's still there. - The second problem is scarier and more subtle- the interpreter or the engine might violate Rust's aliasing rules and hit undefined behavior. This risk is also mitigated somewhat for raw pointers, but the code was not written with very much attention to this detail.
What I'd like to do to build confidence that this is sound:
- [x] Figure out a way to resolve the fact that
Value
contains anRc
but is alsoCopy
- maybe this is okay if we can enforce that there is always a corresponding liveRc
on the VM stack, or maybe we need to drop theCopy
impl. (#45) - [x] Fix the way
with
and instances work so it doesn't immediately explode if you e.g.instance_create
in awith
body. @Zekka suggested making the arrays copy-on-write, which solves the use-after-free issue; we just need to work out the specifics. (#46) - [x] Encode the above in the types exposed to the engine. We might need to expose the concept of a frame lifetime, add some reference counting, etc.
- [x] Audit the whole thing and fix any remaining shenanigans with references and other kinds of aliasing.
- [ ] Expand the test suite with this in mind and run it through Miri, ideally on CI. The current test suite passes, but it doesn't exercise much of this stuff.
Regarding the first checkbox (Value
contains an Rc
but is also Copy
) I believe my original intent was that Value
s outside of the VM stack were really more like RcBorrow
s- they don't increment the refcount, but they don't decrement it either. But to expand on what the checkbox says: this isn't quite enough, because they don't actually borrow from anything.
This suggests a way to drop the Copy
impl without drastically increasing refcount traffic- most of what the interpreter and engine APIs do can be done without a fully owned Value
, so they can deal in &Value
s and use clone
when they need to keep it around longer.
On its own, this just replaces all that refcount traffic with a bunch of extra references instead. It may be better to introduce something like a ValueRef<'v>
type with the same representation as the Value
it borrows from, but Copy
like the original implementation of Value
.
I've addressed the known issues here, so the remaining work is to make sure we have good test coverage. This might include running the test suite under Miri and other sanitizers, as well as fuzzing.