Extension for access to JS array buffers from core wasm
Many important Web APIs use 'array buffers' (either ArrayBuffer or SharedArrayBuffer) or 'typed array views' (Int8Array, etc.) to represent bulk data. With externref, wasm can hold on to references to these objects. However, wasm still needs to call out to imported JS code to actually access the contents of these buffers, and this can be expensive.
Engines can optimize this pattern, but similarly to js-string-builtins it's nicer if critical operations have consistent performance and don't rely on heuristics.
In addition to the above, once there is a native type for a 'view of bytes' we could add extensions to wasm to create views of GC arrays or even linear memory.
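As a rough illustration of the call-out pattern described above, the following JavaScript glue shows what a module has to import today in order to read or write a buffer that it holds only as an externref. The import names (buffer_length, buffer_load_i32, buffer_store_i32) are hypothetical, and the sketch assumes little-endian access to match wasm linear memory.

```javascript
// Hypothetical import object: each buffer access from wasm becomes a call
// into one of these JS functions, with the buffer passed as an externref.
const bufferImports = {
  buffer_length: (buf) => buf.byteLength,
  // Allocating a DataView per call keeps the sketch short; a real binding
  // would probably cache views, which is part of the cost being discussed.
  buffer_load_i32: (buf, offset) => new DataView(buf).getInt32(offset, true),
  buffer_store_i32: (buf, offset, value) =>
    new DataView(buf).setInt32(offset, value, true),
};

// Simulate the calls a wasm module would make through its imports.
const demoBuffer = new ArrayBuffer(16);
bufferImports.buffer_store_i32(demoBuffer, 4, 0x12345678);
const loaded = bufferImports.buffer_load_i32(demoBuffer, 4);
console.log(loaded.toString(16)); // 12345678
```

Every load or store crosses the wasm/JS boundary here, which is the overhead a native slice type is meant to avoid.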
Basic Proposal
Introduce a new type (slice share? mut? addrtype). A shorthand of sliceref is given for (slice mut i32). A slice is a view into some other buffer of bytes.
A slice differs from a wasm array by being an object that points into another buffer, as opposed to being just a buffer. A slice is expected to have slightly higher overhead than an array because it must always have an extra indirection to get the referenced buffer, while a wasm array can theoretically be implemented as just a pointer to the data it owns.
Open question: I'm not sure if a slice makes more sense as a heap type or a defined type (that would go in the type section).
JS-API conversions
With the JS-API, a wasm slice is interchangeable with the JS array buffer types and typed array views. ToWebAssemblyValue(slice) will coerce any typed array view down to its underlying array buffer type, then pass it into wasm as a sliceref. This is equivalent to the WebIDL 'BufferSource' behavior. ToJSValue(slice) will return the array buffer type.
Open question: Do we need the coercion behavior from a typed array view down to the underlying buffer? This seems nice from an ergonomic perspective, but it does break round-tripping strict equality.
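To make the round-tripping concern concrete, here is a small JavaScript model of the coercion. The function toSliceValue is a hypothetical stand-in for what ToWebAssemblyValue(slice) would do under this proposal; it is not an existing API.

```javascript
// Hypothetical model of ToWebAssemblyValue(slice): typed array views and
// DataViews are coerced down to their underlying buffer.
function toSliceValue(value) {
  if (ArrayBuffer.isView(value)) return value.buffer;
  const isShared =
    typeof SharedArrayBuffer !== "undefined" &&
    value instanceof SharedArrayBuffer;
  if (value instanceof ArrayBuffer || isShared) return value;
  throw new TypeError("expected an array buffer or a view of one");
}

const view = new Uint8Array(8);
// ToJSValue(slice) would hand back the buffer, not the view that went in.
const roundTripped = toSliceValue(view);

console.log(roundTripped === view);        // false: identity is lost
console.log(roundTripped === view.buffer); // true
```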
Sharing
JS array buffers can be shared using SharedArrayBuffer. A shared wasm slice is interchangeable with a SharedArrayBuffer, and an unshared wasm slice is interchangeable with a plain ArrayBuffer.
Address types
JS array buffers can have lengths that do not fit in 32 bits. Following the memory64 proposal, slices are parameterized by their address type (either i32 or i64). A dynamic length check is performed when a sliceable value is passed into wasm to ensure the length fits within the slice type's limit. For growable array buffer types, this check must also apply to the maximum length.
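The dynamic check could look roughly like the sketch below. The function name checkSliceLength and the exact limits are assumptions: the i32 limit is taken to be the largest unsigned 32-bit value, and the i64 limit is capped at Number.MAX_SAFE_INTEGER because JS buffer lengths are Numbers. The function accepts any object with a byteLength so that oversized buffers can be modeled without allocating them.

```javascript
// Assumed limits on byte length for each address type.
const ADDR_LIMIT = { i32: 2 ** 32 - 1, i64: Number.MAX_SAFE_INTEGER };

// Hypothetical check run when a buffer is passed into wasm as a slice.
function checkSliceLength(buffer, addrtype) {
  // Growable buffers must be checked against their maximum length; fall back
  // to byteLength on engines that do not expose maxByteLength.
  const worstCase = buffer.maxByteLength ?? buffer.byteLength;
  if (worstCase > ADDR_LIMIT[addrtype]) {
    throw new TypeError(`buffer length does not fit in ${addrtype}`);
  }
  return buffer;
}

console.log(checkSliceLength(new ArrayBuffer(16), "i32").byteLength); // 16

// Plain-object stand-in for a growable buffer with an 8 GiB maximum.
const growable = { byteLength: 16, maxByteLength: 2 ** 33 };
try {
  checkSliceLength(growable, "i32");
} catch (e) {
  console.log(e.message); // buffer length does not fit in i32
}
console.log(checkSliceLength(growable, "i64") === growable); // true
```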
Access instructions
We would define instructions for loading and storing to the slice, along with accessing the length.
- Get the length of a slice in bytes
slice.length $typeidx
$typeidx refers to a defined slice type
[(ref $t)] => [addrtype]
- Load from the slice
slice.$numtype.load $typeidx
Defined for the numtypes {i32, i64, f32, f64}
$typeidx refers to a defined slice type
[(ref $t) addrtype] => [$numtype]
slice.$packedtype.load_s/u $typeidx
Defined for the packedtypes {i8, i16}
$typeidx refers to a defined slice type
[(ref $t) addrtype] => [$numtype]
- Store to the slice
slice.$storagetype.store $typeidx
Defined for the storagetypes {i8, i16, i32, i64, f32, f64}
$typeidx refers to a defined slice type
The slice must be mutable
[(ref $t) addrtype $storagetype] => []
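The load_s/load_u distinction for packed types mirrors the signed and unsigned getters on a JS DataView. The sketch below models the packed loads on an ArrayBuffer. The sliceLoad helper names are hypothetical, multi-byte accesses are assumed to be little-endian as in wasm linear memory, and the RangeError that DataView throws on an out-of-bounds access is only a stand-in for whatever trap the instructions would define.

```javascript
// Hypothetical JS models of slice.i8.load_s/u and slice.i16.load_s/u.
const sliceLoad = {
  i8_s: (buf, addr) => new DataView(buf).getInt8(addr),
  i8_u: (buf, addr) => new DataView(buf).getUint8(addr),
  i16_s: (buf, addr) => new DataView(buf).getInt16(addr, true),
  i16_u: (buf, addr) => new DataView(buf).getUint16(addr, true),
};

const packed = new ArrayBuffer(4);
new Uint8Array(packed).set([0xff, 0xfe, 0x01, 0x00]);

console.log(sliceLoad.i8_s(packed, 0));  // -1    (sign-extended 0xff)
console.log(sliceLoad.i8_u(packed, 0));  // 255   (zero-extended 0xff)
console.log(sliceLoad.i16_s(packed, 0)); // -257  (sign-extended 0xfeff)
console.log(sliceLoad.i16_u(packed, 0)); // 65279 (zero-extended 0xfeff)
```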
Creation instructions (optional)
We do not need to define instructions for creating slices, and could rely on wasm getting access to slices from the host. However, it could be useful to have instructions for creating slices that refer to core wasm types.
- Creating a slice from a wasm GC array
slice.of_array $typeidx
$typeidx must refer to a wasm array
[(ref $typeidx) i32 i32] => [(slice mut? i32)]
- Creating a slice from a wasm memory
slice.of_memory $memidx
[addrtype addrtype] => [(slice mut addrtype)]
Bulk operations (optional)
We optionally could have bulk instructions for copying/filling slices. It's a bit annoying to add another column to the types that we want efficient copying between, but I'm not sure of another alternative. In JS engines, if there is no native instruction for this users could call out to JS and perform the equivalent operation.
Detaching and resizing
JS array buffer types can be detached via Web APIs or through a transfer method. This operation semantically changes the length of the buffer to zero. In addition, ArrayBuffers can be resizable and shrink in size. To handle this, slices are required to shrink in size if their underlying buffer source shrinks such that the slice's view is no longer valid. This matches the behavior of the JS typed array views.
This behavior will need to be permitted by the core wasm semantics, but we do not need to expose it through a core wasm instruction. This will permit other kinds of hosts of wasm that do not have such a behavior to optimize their slices a bit more.
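The typed array behavior that slices would match can be observed directly in JS. The sketch below detaches a buffer by transferring it with structuredClone, which is just one of several ways to detach, and assumes an environment where structuredClone is available.

```javascript
const original = new ArrayBuffer(16);
const bytes = new Uint8Array(original);

// Transferring the buffer detaches the original object.
const moved = structuredClone(original, { transfer: [original] });

console.log(original.byteLength); // 0: a detached buffer reports length zero
console.log(bytes.length);        // 0: the existing view shrinks with it
console.log(bytes[0]);            // undefined: reads no longer reach storage
console.log(moved.byteLength);    // 16: the storage lives on in the new buffer
```

A slice holding the original buffer would have to report a length of zero after the transfer in the same way.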
Alternatives
The same approach taken by js-string-builtins could be used here. We could define a wasm:typed-array builtin collection which exposes all the critical operations of JS typed arrays. This would be perfectly fine by me.
However, in this particular case I believe the semantics of JS array buffers and typed arrays are simple and general enough that we could define a core wasm extension that has broad applicability beyond just JS environments.
Naming
slice and sliceref seem like fine names for this concept. It has been pointed out to me that slice is typically used in programming languages for a reference to a range inside an array of arbitrary type, not just bytes. One alternative would be buffer and bufferref.
Interesting!
Do you have ideas for how the wasm would be generated for this API? I'm curious what use cases you have in mind. (Specifically, it seems C++/Rust will have some of the same challenges as using a second memory, and we don't really have a good solution there AFAIK)
cc @brendandahl
How would you imagine slice would fit into the type hierarchies? Would it be its own hierarchy, a subtype of extern, or a subtype of any? extern would make the most sense to me, except for the part where you can make a slice from a WasmGC array.
What would happen when a slice created from a WasmGC array flows out to JS? My reading of the OP is that JS would get a reference to the underlying WasmGC array. Is that right? It would also be useful to be able to have JS TypedArrays backed by WasmGC arrays. Could that fit into this proposal somehow?
Would we want to provide atomic accessors to slices as well?
The nice thing about this proposal is that it can serve several related use cases, for example:
- multi-byte access to WasmGC arrays
- efficient access to JS ArrayBuffers
- possibly access to WasmGC arrays from JS
- generalizing over memories, arrays, and host buffers
But I think this flexibility also means we risk scope creep, and it's possible that these separate use cases would be better served by separate features. As a first step, I think it would be useful to enumerate the use cases that are intended to be in-scope.
This seems like a great idea, and a long time coming. To be clear, I don't think it will solve the general "zero-copy access of external memory from wasm" problem that linear-memory languages like C/C++ have (since all loads/stores implicitly access the default linear memory), but perhaps it would for wasm-gc languages, and it seems useful even to linear-memory languages if the developer is willing to write custom code to work directly with slicerefs.
I imagine this would be exposed to C++ via some smart-pointer-esque class that contained an i32 index into a table of slicerefs (using the class's dtor to clear the element) and then the class's methods would contain whatever compiler-intrinsic magic was needed to emit the table and sliceref load/store instructions; is that what you were thinking?
Interesting!
Do you have ideas for how the wasm would be generated for this API? I'm curious what use cases you have in mind. (Specifically, it seems C++/Rust will have some of the same challenges as using a second memory, and we don't really have a good solution there AFAIK)
I don't think this alone will be enough for linear memory languages to have both ergonomic and efficient access to typed arrays. This is primarily targeted at GC languages which could compile a source language type to use slice natively under the hood. I'm not an expert in all the different GC languages, but from a quick search I see things like Dart's ByteBuffer and Kotlin's ByteArray which could be candidates.
It would make it easier for linear memory languages to use the approach Luke mentions, an i32 index into a table of slicerefs. Without a native type you could still do that, but you'd need to use externref, call into JS glue code, and have the engine find a way to make that fast.
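For reference, the JS glue version of the index-into-a-table approach might look like the sketch below. The BufferTable class and its method names are hypothetical; with a native slice type the same bookkeeping could live in a wasm table of slicerefs instead of in JS.

```javascript
// Hypothetical handle table: linear-memory code holds only an i32 handle,
// and the table maps that handle to the actual buffer object.
class BufferTable {
  #slots = [];
  #free = [];

  add(buffer) {
    const handle = this.#free.length ? this.#free.pop() : this.#slots.length;
    this.#slots[handle] = buffer;
    return handle;
  }

  get(handle) {
    const buffer = this.#slots[handle];
    if (buffer === undefined) throw new RangeError(`invalid handle ${handle}`);
    return buffer;
  }

  // Would be called from the destructor in the smart-pointer design.
  release(handle) {
    this.get(handle); // reject handles that are not live
    this.#slots[handle] = undefined;
    this.#free.push(handle);
  }
}

const table = new BufferTable();
const h = table.add(new ArrayBuffer(32));
console.log(table.get(h).byteLength); // 32
table.release(h);
console.log(table.add(new ArrayBuffer(8)) === h); // true: the slot is reused
```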
How would you imagine slice would fit into the type hierarchies? Would it be its own hierarchy, a subtype of extern, or a subtype of any? extern would make the most sense to me, except for the part where you can make a slice from a WasmGC array.
I would lean towards it being a subtype of eq in the any hierarchy, but I haven't thought deeply about that yet.
What would happen when a slice created from a WasmGC array flows out to JS? My reading of the OP is that JS would get a reference to the underlying WasmGC array. Is that right? It would also be useful to be able to have JS TypedArrays backed by WasmGC arrays. Could that fit into this proposal somehow?
No, in the case you use slice.of_array you will be creating a new JS ArrayBuffer that shares the same backing store as the wasm GC array. The new slice object will point at the storage for the wasm GC array. When the slice flows out into JS you will see that JS ArrayBuffer.
My mental model here is that the value representation for a slice in a JS environment is just a JS array buffer object, so their capabilities would be tied together.
Would we want to provide atomic accessors to slices as well?
I think we would want that as well. Basically any memory operation that we have would be in scope in my opinion.
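For comparison, this is what atomic access to a shared buffer looks like from JS today through the Atomics object. Atomic slice accessors would presumably cover similar operations, but that mapping is an assumption and not something the proposal specifies.

```javascript
const shared = new SharedArrayBuffer(8);
const cells = new Int32Array(shared);

Atomics.store(cells, 0, 41);
const previous = Atomics.add(cells, 0, 1); // returns the value before the add
console.log(previous, Atomics.load(cells, 0)); // 41 42
```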
The nice thing about this proposal is that it can serve several related use cases, for example:
- multi-byte access to WasmGC arrays
- efficient access to JS ArrayBuffers
- possibly access to WasmGC arrays from JS
- generalizing over memories, arrays, and host buffers
But I think this flexibility also means we risk scope creep, and it's possible that these separate use cases would be better served by separate features. As a first step, I think it would be useful to enumerate the use cases that are intended to be in-scope.
Specifically for multi-byte access to WasmGC arrays, I believe that this proposal could give you access to this, but at the cost of having to allocate a slice object. I would support an alternative proposal that just adds multi-byte accessors for (array i8). I think both arrays and slices could have a place in the future. My mental model for an array is that it's conceptually just a pointer to some fixed amount of memory, while a slice is more expensive because it has to maintain more metadata and an extra indirection to point at memory owned by some other thing.
For this proposal, the main use-case in scope for me is 'efficient access to JS ArrayBuffers'. The other use cases are interesting if they work out, but not critical.
JS-API conversions
It is not clear to me whether slices correspond to whole ArrayBuffers or just a portion of them. A coercion down to the underlying array buffer would correspond to the former while the BufferView behavior would correspond to the latter. I think it would be simpler if slices are just ArrayBuffers.
Access instructions
We should probably support loading and storing half-precision floating point numbers as well.
It could be useful to have variants of the instructions that use the platform's native byte order, to emulate JavaScript typed arrays and to interact with Web APIs such as WebGL. Or maybe we can consider that interesting platforms are all little-endian?
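The difference between the two byte-order behaviors is visible in JS: typed arrays use the platform's order, while DataView takes the order as an explicit argument. The following sketch detects the platform order and compares the two, and it is written to print the same output on either kind of platform.

```javascript
// Typed arrays store multi-byte values in the platform's byte order.
const probe = new Uint32Array([0x11223344]);
const probeBytes = new Uint8Array(probe.buffer);
const platformIsLittleEndian = probeBytes[0] === 0x44;

const dv = new DataView(probe.buffer);
// Reading with the platform's order reproduces the typed array value.
console.log(dv.getUint32(0, platformIsLittleEndian) === 0x11223344); // true
// Reading with the opposite order yields the byte-swapped value.
console.log(dv.getUint32(0, !platformIsLittleEndian).toString(16)); // 44332211
```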
Creation instructions
Creating a slice from a whole Wasm memory should be easy to implement since the memory can already be accessed from JavaScript as an ArrayBuffer or a SharedArrayBuffer. I'm not sure about slicing a part of the memory, unless we add a level of indirection.
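A short sketch of both cases using the existing JS API: the whole memory is already an ArrayBuffer, while a sub-range needs a separate view object layered on top of it.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 }); // one 64 KiB page

// The whole memory is already visible to JS as a single ArrayBuffer.
const whole = memory.buffer;
console.log(whole instanceof ArrayBuffer, whole.byteLength); // true 65536

// A sub-range needs a view on top of that buffer, which is the extra level
// of indirection mentioned above.
const part = new Uint8Array(whole, 1024, 256);
part[0] = 7;
console.log(new Uint8Array(whole)[1024]); // 7: same storage
console.log(part.buffer === whole);       // true
```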
Being able to create a slice from a Wasm GC array puts some constraints on the implementation: this might be harder to implement or have some performance impact. For instance, if the array is moved in memory by the GC, the slice has to be updated to still point to it.
Sharing
It seems annoying to me if we cannot use the same code to manipulate both ArrayBuffers and SharedArrayBuffers. Maybe a SharedArrayBuffer can be viewed as an unshared Wasm slice as well?
Bulk instructions
Copying to/from Wasm arrays (i8 arrays, at least) would be useful, and we don't have any alternative at the moment to transfer data between Wasm arrays and ArrayBuffers.
Implementing an operation in JavaScript will have some overhead if we have to allocate typed arrays to perform it:
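// Copy len bytes from src (starting at byte i) to dst (starting at byte j).
function copy(src, i, dst, j, len) {
  // Two temporary views are allocated on every call; set() returns undefined.
  new Uint8Array(dst).set(new Uint8Array(src, i, len), j);
}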
Alternatives
Exposing DataView operations would make more sense to me than exposing typed array operations (I'm not sure how we would deal with the different kinds of typed arrays). V8 already implements this, but a standardized API would be better. However, DataViews do not provide any bulk operation. Also, using a DataView adds one level of indirection compared to operating directly on the underlying ArrayBuffer.
Use case
OCaml has BigArrays, which are multi-dimensional arrays of integers and floating-point numbers. They could be implemented using slices as the underlying buffer.
At the moment, I'm using typed arrays as the underlying representation. This provides a way to manipulate JavaScript typed arrays from OCaml. But this is not very efficient. I could import DataView primitives, but this complicates the implementation since I need both a typed array for bulk operations and a dataview for accessing single values.
It is not clear to me whether slices correspond to whole ArrayBuffers or just a portion of them. A coercion down to the underlying array buffer would correspond to the former while the BufferView behavior would correspond to the latter. I think it would be simpler if slices are just ArrayBuffers.
Ah, yes I forgot that it's just the typed array classes that can be sub-ranges, while the backing ArrayBuffer always refers to the entire store. So coercing the typed arrays down to the underlying buffer would be very confusing. I will drop that part.
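The confusion is easy to reproduce in JS. In this sketch a typed array covers only part of its buffer, and coercing it down to the buffer silently drops both the offset and the length.

```javascript
const backing = new ArrayBuffer(16);
new Uint8Array(backing).set([...Array(16).keys()]); // bytes 0..15

const window4 = new Uint8Array(backing, 4, 8); // covers bytes 4..11 only
console.log(window4[0]);                        // 4
console.log(window4.buffer.byteLength);         // 16: the whole store, not 8
console.log(new Uint8Array(window4.buffer)[0]); // 0: the offset is gone
```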
Also, the bits about creating a slice from just a subsection of wasm memory won't work either unless we're creating a new ArrayBuffer that aliases the main ArrayBuffer for the wasm memory, and I'm not sure what the implications of that are.
We should probably support loading and storing half-precision floating point numbers as well
I believe we would just need a slice.load_v128 sort of instruction and that would support half-precision? And I think that's covered as one of the numeric types.
Being able to create a slice from a Wasm GC array puts some constraints on the implementation: this might be harder to implement or have some performance impact. For instance, if the array is moved in memory by the GC, the slice has to be updated to still point to it.
Yeah I would expect this to be hard to implement, but also pretty valuable. I think it would be fair to drop it if an engine finds it not viable to implement.
It seems annoying to me if we cannot use the same code to manipulate both ArrayBuffers and SharedArrayBuffers. Maybe a SharedArrayBuffer can be viewed as an unshared Wasm slice as well?
I agree it's annoying but this is a hard constraint. SAB must be accessed and handled differently from an array buffer, and they have different representations in at least SpiderMonkey.
We should probably support loading and storing half-precision floating point numbers as well
I believe we would just need a slice.load_v128 sort of instruction and that would support half-precision? And I think that's covered as one of the numeric types.
I was thinking of an equivalent to f32.load_f16 and f32.store_f16, to load and store single values.
It seems annoying to me if we cannot use the same code to manipulate both ArrayBuffers and SharedArrayBuffers. Maybe a SharedArrayBuffer can be viewed as an unshared Wasm slice as well?
I agree it's annoying but this is a hard constraint. SAB must be accessed and handled differently from an array buffer, and they have different representations in at least SpiderMonkey.
I suppose this is an issue for the alternative of importing builtins as well.
We should probably support loading and storing half-precision floating point numbers as well
I believe we would just need a slice.load_v128 sort of instruction and that would support half-precision? And I think that's covered as one of the numeric types.
I was thinking of an equivalent to f32.load_f16 and f32.store_f16, to load and store single values.
Oh, yeah those load/store instructions could well be in scope.
As a C programmer, it feels unfortunate to me that Wasm continues to distance itself from being a good target for C. Sure, this feature doesn’t specifically introduce the problem, but it accentuates it a bit. Ideally you’d have types with which you can use normal C functions and operations, e.g. memcpy, strlen, subscripting/indexing/dereferencing, etc. But with both opaque types (e.g. externref) and addressable values outside of the main memory (e.g. multiple memories, arrays, slices), it becomes increasingly more difficult to handle Wasm data in C without copies in an ergonomic way.
On the other hand, the premise of unifying “things that can be addressed” is something that sounds very appealing, I think. Regardless of the language, I think it would be very awkward if this introduced a new addressable type without a way to refer to other addressable things. In a GC language, imagine e.g. a Buffer type (that wants to refer to either Wasm memories, or JS ArrayBuffer, or Wasm arrays) with e.g. a buffer1.copyFrom(buffer2) method, which would have to match all possible cases for to/from buffers.
As a C programmer, it feels unfortunate to me that Wasm continues to distance itself from being a good target for C. Sure, this feature doesn’t specifically introduce the problem, but it accentuates it a bit. Ideally you’d have types with which you can use normal C functions and operations, e.g. memcpy, strlen, subscripting/indexing/dereferencing, etc. But with both opaque types (e.g. externref) and addressable values outside of the main memory (e.g. multiple memories, arrays, slices), it becomes increasingly more difficult to handle Wasm data in C without copies in an ergonomic way.
I think linear memory languages (like C) and GC languages are going to need different solutions here. You may be interested in the memory-control proposal which aims to solve this use-case for linear memory languages. It's an early stage proposal, so new ideas there are welcome.