rusty_v8
rusty_v8 copied to clipboard
GcCell
This introduces a utility to aid authoring garbage collected objects using cppgc with safe code.
The challenge is this:
- Since #1802, all objects that implement
GarbageCollectedmust beSend + Sync. - But in practice, cppgc objects are only accessed from a single, single-threaded isolate.
- Except when the GC is actually running. In that case, the objects may be traced and sweeped from other threads than the one owning the isolate.
There is no built-in synchronization primitive in Rust that encodes this "phasic" access pattern. In practice, the only option is to wrap things behind mutexes or atomics, and that is needlessly expensive. Very undesirable for fast access to small fields such as this.
The solution I'm proposing is inspired by qcell. The idea is to have a special cell type (GcCell), whose contents are accessible as long as you have access to the isolate. This should solve all aliasing problems with zero runtime overhead.
This also solves the main use case for https://github.com/denoland/deno_core/issues/787, as interior mutability in cppgc objects becomes free of charge (no RefCell overhead).
I think interior mutable access is still unsound without a runtime check due to reenetrancy/aliasing through JS.
I think interior mutable access is still unsound without a runtime check due to reenetrancy/aliasing through JS.
Can you elaborate? I believe you need a &mut Isolate (such as through a handle scope) to get any kind of aliasing, which should be statically prevented here. What am I missing?
if you write a rust function which performs get_mut and then also evaluates some javascript, that javascript may call the rust function again, which will do get_mut again. you could also have a rust function with multiple arguments, pass the same object in js to both arguments, unwrap both arguments to a GcCell, and call get_mut on both of them.
I should perhaps have explained more clearly - apologies. This technique is inspired by qcell, where interior mutability is achieved by "outsourcing" it to something other than the physical memory location of a variable - a unique integer, a unique type instance, a unique lifetime, or in this case, v8::Isolate.
if you write a rust function which performs
get_mutand then also evaluates some javascript, that javascript may call the rust function again, which will doget_mutagain.
No. Because of the lifetime annotation of the returned reference from get_mut(), it's not possible to invoke more JavaScript while the reference is held, because it borrows the isolate mutably, and you need mutable access to the isolate to make any calls.
you could also have a rust function with multiple arguments, pass the same object in js to both arguments, unwrap both arguments to a GcCell, and call get_mut on both of them.
Again, this would require two overlapping mutable borrows of the isolate.
Illustrating examples:
fn cannot_reenter(scope: &mut v8::HandleScope, a: &GcCell<u32>, b: v8::Local<v8::Object>) {
let a = a.get_mut(scope);
let b = b.get_index(scope, 0); // Not fine: scope is borrowed by `a`
// <something keeping `a` alive>
}
fn cannot_mutably_alias(scope: &mut v8::HandleScope, a: &GcCell<u32>, b: &GcCell<u32>) -> u32 {
let a = a.get_mut(scope); // Fine
let b = b.get_mut(scope); // Not fine: scope is borrowed by `a`
*a += *b;
*b = 0;
return *a;
}
In short, GcCell acts as if it is a pointer to data inside the v8::Isolate for the purposes of borrow checking.
However, I should point out: As is, this API is not safe in the presence of multiple Isolates. If you can produce two overlapping &mut v8::Isolate (e.g. by creating two isolates), you can produce UB with the current implementation here, which kind of assumes that v8::Isolate is a thread-local singleton. I'm curious if anybody has ideas about how to counter this.
Options:
- Runtime check to see that the
GcCellwas created from within the same isolate that re-opens it. Very heavy, at least 8 bytes memory overhead per instance, an extra validation step on every call. Not nice. - Some way to ensure that
GcCells can only be accessed when owned by av8::Object(i.e. holding a cppgc heap object). This can come down to markingnewasunsafeand documenting that theGcCellmust not travel between isolates, the same way that any otherv8::Localmustn't. - More generally, it's actually a little unclear to me how references in V8 are guarded from migrating between isolates in the host/embedder code. It's clearly not a valid thing for them to do, but do they check at runtime? Anyone knows?
In short, GcCell acts as if it is a pointer to data inside the v8::Isolate for the purposes of borrow checking.
thanks for clarifying, i should've read the patch before commenting :)
it's actually a little unclear to me how references in V8 are guarded from migrating between isolates in the host/embedder code.
v8 doesn't generally prevent you from doing this, it will happily corrupt itself. though some places where isolate pointers happen to be available do contain debug assertions that they match.
thanks for clarifying, i should've read the patch before commenting :)
No worries :)
v8 doesn't generally prevent you from doing this, it will happily corrupt itself. though some places where isolate pointers happen to be available do contain debug assertions that they match.
I see. That seems hard. 😅
I guess that the situation isn't much worse (or better) in terms of risk - there's just a bit of a potential for UB in safe code moving references between isolates just with the current vanilla V8 Rust API, I suppose? In other words, v8::cppgc::Member et al have the same problem.
The only way to really fix it might be to require that all calls that operate on a scope and 1 or more v8::Locals have the same 'scope lifetime parameter from v8::HandleScope<'scope>, which would imply moving all methods from, e.g., v8::Object to v8::Local<v8::Object> (or replacing &self with this: v8::Local<'scope, Self>). Nightmarish refactor, massively breaking lots of Deno code, and would it even be sufficient?
(Upside: The Rust API would then become significantly less error prone than the C++ API.)
Anyway, for the purposes of this PR... It seems "sufficient" to make the user document that the GcCell reference comes from a v8::Local<v8::Object>, but this would interfere with Deno's op2 macro, which only gives you &T for cppgc types, and not the wrapping object.
@devsnek Hmm, any ideas how to move forward with this? 🤔 Or do you have alternatives in mind?