Multiple Managed Memories
This is a suggestion rather than an 'issue'.
There is a potential conflict between multiple threads and managed memory (MM), especially if that managed memory is to be shared with JS. On the other hand, many MM languages also have threads (Java, C#, ...), and prohibiting those languages from accessing wasm-based MM would seriously reduce the ROI of the proposed GC scheme.
Note that not all MM languages that are multi-threaded share a single MM: e.g., Erlang has a process model where different processes do not share heaps.
So, this suggestion is to allow multiple MMs, with the additional constraint that a direct write of a reference is not permitted to cross an MM boundary (i.e., when using multiple MMs, message posting between them is required).
The metadata recording which MM a reference belongs to can be embedded relatively cheaply in the same structure used to record the structural type of the object, so testing whether a write is legal is not enormously expensive.
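To make the cost of that check concrete, here is a minimal sketch of such a write barrier. Everything in it is hypothetical and only for illustration (`TypeInfo`, `Obj`, `write_ref`, and the heap ids are made-up names, not part of any wasm proposal); it assumes each object header carries the id of its owning MM next to its structural type.

```python
from dataclasses import dataclass, field


class CrossHeapWriteError(Exception):
    """Raised when a reference write would cross a managed-memory boundary."""


@dataclass
class TypeInfo:
    # Hypothetical per-object header: structural type plus owning heap id.
    type_name: str
    heap_id: int


@dataclass(eq=False)
class Obj:
    info: TypeInfo
    fields: dict = field(default_factory=dict)


def write_ref(target: Obj, field_name: str, value: Obj) -> None:
    """Store a reference, rejecting writes that cross an MM boundary."""
    # The check is a single comparison on data already in the header.
    if target.info.heap_id != value.info.heap_id:
        raise CrossHeapWriteError(
            f"heap {value.info.heap_id} ref written into heap {target.info.heap_id}"
        )
    target.fields[field_name] = value


# Assumed heap ids for the example: one heap shared with JS, one for Java threads.
JS_HEAP, JAVA_HEAP = 0, 1

java_a = Obj(TypeInfo("Node", JAVA_HEAP))
java_b = Obj(TypeInfo("Node", JAVA_HEAP))
js_obj = Obj(TypeInfo("JSObject", JS_HEAP))

write_ref(java_a, "next", java_b)      # same heap: allowed
try:
    write_ref(java_a, "peer", js_obj)  # crosses a boundary: rejected
except CrossHeapWriteError as err:
    print("rejected:", err)
```

In this sketch a rejected write leaves the target untouched; data that has to cross between heaps would go through message posting instead.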
TL;DR. Instead of having a single MM, allow multiple MMs, and require that 'extra' threads do not have direct access to the MM that is used for JS objects. This will make it easier to implement languages like Java over wasm whilst preserving the single-threaded nature of JS itself.
(There are follow-on issues, such as invoking imported JS functions from multiple threads ...)
Would a monitor (synchronized) not be a more effective way to access the single-threaded parts of the JavaScript side? The disadvantages of having two MMs seem large to me.
But I think these are details that the implementer of the wasm runtime must check.
The current idea for supporting the combination of references and threads is that you can only share references across threads that are explicitly marked as sharable. That is, every reference type will have a Boolean "shared" attribute, similar to the one added to memories. Validation then ensures that these attributes are transitively consistent, e.g., shared struct refs can only be formed for structs whose ref fields are shared references themselves.
Under such a regime, any non-shared ref is thread-local by construction, and you can think of each thread as having its own private heap, plus a global heap of shared objects. Local heap objects can point into the global heap, but not vice versa.
For the MVP, we will restrict all references to be non-shared, however.
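As a rough illustration of the consistency rule described above, the following sketch checks that a shared reference is only formed for a struct whose ref fields are themselves shared. The data model (`RefType`, `StructType`, `shared_ref_is_valid`, `validate_module`) is hypothetical and much simpler than real wasm validation; it only assumes that each reference type carries a Boolean "shared" attribute.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class RefType:
    target: str   # name of the struct type being referenced
    shared: bool  # the Boolean "shared" attribute


@dataclass(frozen=True)
class StructType:
    name: str
    ref_fields: Tuple[RefType, ...]


def shared_ref_is_valid(ref: RefType, types: Dict[str, StructType]) -> bool:
    """A shared ref may only target a struct whose ref fields are all shared."""
    if not ref.shared:
        return True  # non-shared refs are thread-local; no constraint applies
    return all(f.shared for f in types[ref.target].ref_fields)


def validate_module(types: Dict[str, StructType]) -> List[str]:
    """Check every ref field in the module. Because each shared ref is checked
    against its target, consistency holds transitively."""
    errors = []
    for struct in types.values():
        for f in struct.ref_fields:
            if not shared_ref_is_valid(f, types):
                errors.append(f"{struct.name}: invalid shared ref to {f.target}")
    return errors


types = {
    # All ref fields shared: may live in the global heap.
    "Node": StructType("Node", (RefType("Node", True),)),
    # Local object pointing into the global heap (allowed) and to itself.
    "Local": StructType("Local", (RefType("Node", True), RefType("Local", False))),
    # Tries to hold a shared ref to a struct with a non-shared field.
    "Bad": StructType("Bad", (RefType("Local", True),)),
}

print(validate_module(types))
```

Here `Local` may point into the global heap through its shared `Node` field, while `Bad` is rejected because a shared ref to `Local` would let the global heap point back into a thread-local one.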
This split of local heaps + global heap implies that, for languages like Java, there will be nothing in the local heap. I am pretty sure that was not the intention, but it would be the consequence. I was suggesting a more agnostic stance: support multiple heaps, but restrict only where necessary (such as the JS heap).
In reply to Andreas Rossberg's comment above (Fri, Mar 8, 2019, 3:38 AM).
-- Francis McCabe SWE
That does not sound very practicable. At what point would the decision be made between the local heap and the global shared heap? The compiler can't decide this.
Every language compiler is going to map the semantics of its language to wasm. In the case of Java, there will be a single heap for all threads. In the case of Erlang, there will be multiple heaps: one per thread. The main question is about the heap that is shared with JS. Since most languages are not aware of JS, it will be up to the compiler writer to decide how to handle that. One approach, for Java say, would be to use the JS heap in a highly restricted way: only for communication with the 'OS'.
In reply to Volker Berlin's comment above (Fri, Mar 8, 2019, 8:39 AM).
-- Francis McCabe SWE
Indeed, for Java all refs are shared. Isn't that what you would expect?
Partitioning the global heap into disjoint logical "worlds" would be possible by extending the "shared" attribute with an abstract region parameter and allowing programs to declare new regions. However, it is not clear to me what the practical benefit of such a mechanism would be.
JS is single-threaded, so all its objects must be non-shared and naturally live on the local heap(s) of the main thread or respective workers.
@rossberg But I can pass any object from JS to wasm as an anyref. So all objects in JS would also have to be declared as shared. With host bindings this should be possible.
@Horcrux7, in the extension I described anyref and shared anyref are different types (the latter is a subtype of the former). JS values can only be given type anyref, not shared anyref. Everything else would completely break JavaScript semantics.
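A small sketch of that subtyping relation, under the assumption that there are just these two reference types. The names `RefKind`, `is_subtype`, and `type_of_js_value` are invented for the example and do not come from any proposal text.

```python
from enum import Enum


class RefKind(Enum):
    ANYREF = "anyref"
    SHARED_ANYREF = "shared anyref"


def is_subtype(sub: RefKind, sup: RefKind) -> bool:
    """shared anyref <: anyref, and every type is a subtype of itself."""
    return sub is sup or (sub is RefKind.SHARED_ANYREF and sup is RefKind.ANYREF)


def type_of_js_value(_value: object) -> RefKind:
    """Assumption from the comment above: JS values are only ever typed anyref."""
    return RefKind.ANYREF


js_type = type_of_js_value({"some": "object"})

# A JS value can flow to a parameter of type anyref ...
print(is_subtype(js_type, RefKind.ANYREF))
# ... but not to one that requires shared anyref.
print(is_subtype(js_type, RefKind.SHARED_ANYREF))
# A shared reference can still be used wherever a plain anyref is expected.
print(is_subtype(RefKind.SHARED_ANYREF, RefKind.ANYREF))
```

Under these assumptions, passing a JS object into wasm never requires it to be shared; it simply cannot be handed to code that demands a shared reference.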
Everything else would completely break JavaScript semantics.
@rossberg I understand this. But I have no idea how this should work if wasm supports multiple threads in the future. I hope you core developers know what you are doing.
We have shared references listed in our post-MVP doc and would welcome PRs adding separate managed heaps as well, but I'll close this issue since neither feature will be included in the MVP.