threads
threads copied to clipboard
Consider supporting verifiably safe concurrency
What is the sentiment for supporting a subset of wasm that is verifiably safe with respect to concurrency?
The motivation is to allow modules to interoperate efficiently (lock free, zero-copy), without depending on the correctness of the other party to preserve their safety.
This is an extension of Issue #4 in the GC proposal, which suggests maintaining a verifiably safe subset of wasm.
I'm not sure I understand what you're looking for exactly. IIUC you'd like to share a memory, but know that parts of it aren't modified? That could be done through support for multiple memories.
I may be wrong, but IIUC the Midori approach can be layered on top of WebAssembly, so can Rust or any other languages / safety approaches (say, CFI).
Somewhat related are https://github.com/tc39/ecmascript_sharedmem/issues/133 and https://github.com/tc39/ecmascript_sharedmem/issues/88 which I expect WebAssembly to inherit from. This will likely be discussed at the first official WebAssembly meeting (watch this space for dates).
It's true that high-level code in a safe language like Rust is lowered to unsafe machine instructions for execution. The question I'm interested in is whether, as the module passes through the intermediate stage of web assembly, the safe semantics that were apparent in the original program can be verified in the byte code.
In Midori, the approach we took was to define a safe subset of our byte code that excluded sharing mutable memory. We provided an efficient operation for zero-copy conversion of mutable memory to immutable memory, zero-copy transfer of ownership of mutable memory between agents (i.e., the sender revokes it's own access to the memory), and an operation for copying immutable memory to mutable memory.
This approach allowed Midori to elide most locks and defensive copies, and still be provably free of low-level concurrency hazards at the byte code level, regardless of which mix of compilers produced the modules. (Excluding the uKernel and a handful of other low-level modules that used small amounts of "unsafe" code.)
(BTW - Thank you for the pointers to the sharedmem discussions. I really appreciate how cognizant you are being of the subtle/non-intuitive pitfalls in the discussion.)
Its true that high-level code in a safe language like Rust is lowered to unsafe machine instructions for execution. The question I'm interested in is whether, as the module passes through the intermediate stage of web assembly, the safe semantics that were apparent in the original program can be verified in the byte code.
In Midori, the approach we took was to define a safe subset of our byte code that excluded sharing mutable memory. We provided an efficient operation for zero-copy conversion of mutable memory to immutable memory, zero-copy transfer of ownership of mutable memory between agents (i.e., the sender revokes it's own access to the memory), and an operation for copying immutable memory to mutable memory.
This approach allowed Midori to elide most locks and defensive copies, and still be provably free of low-level concurrency hazards at the byte code level, regardless of which mix of compilers produced the modules. (Excluding the uKernel and a handful of other low-level modules that used small amounts of "unsafe" code.)
I'm not familiar enough with Midori's approach besides reading random publications and talking to @mtrofin : does this allow varied languages' concurrency and parallelism models to map efficiently to the underlying bytecode? Can I efficiently lower and validate say C++, Java, Rust and .NET?
I ask because I agree that the properties you describe are desirable, but I'm not convinced this is a practical and proven approach versus or an open research topic. We can definitely cover both proven as well as research avenues, but for WebAssembly threads my impression, which I'll validate at the upcoming meeting, is that most people want something by the end of the year. If we want to significantly diverge from the current approach then we need a champion, fast, who delivers by that timeline.
I don't think @DLehenbauer suggests diverging from the current approach, rather it's about gaging interest in "a subset of wasm that is verifiably safe with respect to concurrency". I'm not reading that as "at the cost/exclusion of traditional concurrency mechanisms". It seems complementary.
On that, probably having a clear beneficiary of the verifiable approach (ideally, engaged and wanting to ship something), and a wasm specific proposal, would go a long way.
I don't think @DLehenbauer suggests diverging from the current approach, rather it's about gaging interest in "a subset of wasm that is verifiably safe with respect to concurrency". I'm not reading that as "at the cost/exclusion of traditional concurrency mechanisms". It seems complementary.
On that, probably having a clear beneficiary of the verifiable approach (ideally, engaged and wanting to ship something), and a wasm specific proposal, would go a long way.
OK, my understanding was that the model may have to change for a verifiable subset to be tractable. Now's the right time to do such changes, but the window is quickly closing!
If no change is needed, then that subset can be defined any time.
Good clarification. I was thinking of the two as complimentary, although the two of you are making me realize that there might still be some time pressure to understand how these should fit together harmoniously.
(I'll respond soon abt. the problem of lowering different languages to verifiable byte code.)
Verifiable type/memory safety depends on languages sharing a common type system, like the one introduced by the GC proposal. There are inevitable tradeoffs in how completely and efficiently different languages can target a shared type system.
What Midori did for concurrency safety is simpler and more agnostic. Similar to the web, Midori did not allow safe code to block the current thread (related to issue #6), and required agents to either pass memory by copy, transfer ownership of it, or convert it to immutable memory before sharing references.
Conveniently, the memory aspects talk about bytes as bytes, although down the road the type system will enable compelling additions. In particular:
- A transferred/copied buffer that is strongly typed can embed references to immutable shared memory.
- Agents running in the same GC heap can directly exchange references to immutable types.
I think I could reasonably have a concrete proposal for the concurrency aspects in June and represent it at the 7/17 meeting. I also don't believe there would be large impact to the current threads proposal, beyond some minor byte code details to make it easy to detect which functions can block, etc.