
Branch of the spec repo scoped to discussion of SIMD in WebAssembly

Results 47 simd issues

Hi everyone, @ngzhian has encouraged me to share the [discussion we're having on building an x64 constant pool and related optimizations in V8](https://groups.google.com/u/3/g/v8-dev/c/QJfpvc55Hfg/m/A7ZEuASOBQAJ) since it might be helpful to other...

[`bitselect`](https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#bitwise-select) is a 3-instruction lowering in [cranelift](https://github.com/bytecodealliance/cranelift/blob/48029b4a16264672ce24afbee1050b37e1e68020/cranelift-codegen/meta/src/isa/x86/legalize.rs#L447-L458) and a 4-instruction lowering in [v8](https://github.com/v8/v8/blob/19be4913881bb02c5d9b4f1c7547ee2d1273120b/src/compiler/backend/x64/code-generator-x64.cc#L3558-L3566).

perf documentation
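As a rough illustration of why `bitselect` needs several instructions, here is a scalar sketch that models a 128-bit vector as a Python integer. The function names are hypothetical, and the XOR-based form is just one well-known three-operation formulation; it is not claimed to be the sequence either engine emits.

```python
MASK128 = (1 << 128) - 1  # a v128 value is modeled as a 128-bit unsigned integer


def bitselect(v1, v2, c):
    """Reference semantics: bits of v1 where c is 1, bits of v2 where c is 0."""
    return ((v1 & c) | (v2 & ~c)) & MASK128


def bitselect_xor(v1, v2, c):
    """Equivalent form using three bitwise operations: xor, and, xor."""
    return (((v1 ^ v2) & c) ^ v2) & MASK128
```

The direct form needs an `and`, an `and-not`, and an `or`; the XOR form trades them for two `xor`s and one `and`, which can help when register pressure or operand clobbering matters.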

[`splat`](https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#create-vector-with-identical-lanes) has 2- to 3-instruction lowerings in [cranelift](https://github.com/bytecodealliance/cranelift/blob/48029b4a16264672ce24afbee1050b37e1e68020/cranelift-codegen/meta/src/isa/x86/legalize.rs#L348-L406) and [v8](https://github.com/v8/v8/blob/19be4913881bb02c5d9b4f1c7547ee2d1273120b/src/compiler/backend/x64/code-generator-x64.cc#L3095-L3105). I believe the "splat all ones" and "splat all zeroes" cases are single-instruction lowerings on both platforms but it...

perf documentation
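The all-zeroes and all-ones special cases can be sketched with a scalar model in which a vector is a list of unsigned lanes. The helper names below are hypothetical; the sketch only shows that XOR-ing a register with itself and comparing a register with itself produce the two constants regardless of the register's previous contents, which is the idea behind the usual x86 `pxor` and `pcmpeq` idioms.

```python
def splat(value, lanes, bits):
    """Replicate one scalar into every lane, truncated to the lane width."""
    mask = (1 << bits) - 1
    return [value & mask] * lanes


def xor_with_self(vec):
    """Any vector XOR-ed with itself is all zeroes."""
    return [lane ^ lane for lane in vec]


def compare_equal_with_self(vec, bits):
    """A lane-wise integer equality of a vector with itself is all ones."""
    ones = (1 << bits) - 1
    return [ones if lane == lane else 0 for lane in vec]
```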

Certain [SIMD conversions](https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#conversions) seem to have inefficient lowerings on x64. `f32x4.convert_i32x4_u` is lowered to 8 instructions by [v8](https://github.com/v8/v8/blob/19be4913881bb02c5d9b4f1c7547ee2d1273120b/src/compiler/backend/x64/code-generator-x64.cc#L2448-L2464). The signed version, `f32x4.convert_i32x4_s`, on the other hand, is lowered to a...

perf documentation
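To show why the unsigned conversion costs more than the signed one when only a signed integer-to-float conversion is available, here is a scalar model of one split-and-recombine strategy. It is an assumption made for illustration and is not taken from either code base; the function names are hypothetical. Each step stays within the signed 32-bit range, and only the final addition rounds.

```python
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def convert_u32_reference(x):
    """Direct unsigned 32-bit integer to binary32 conversion."""
    return to_f32(float(x))


def convert_u32_split(x):
    """Same result using only conversions of values below 2**31."""
    low = x & 0xFFFF                   # low 16 bits: exact in binary32
    high = x - low                     # high 16 bits, low half cleared
    low_f = to_f32(float(low))
    half_f = to_f32(float(high >> 1))  # below 2**31, at most 16 significant bits
    high_f = to_f32(half_f + half_f)   # doubling is exact
    return to_f32(low_f + high_f)      # the single rounding step
```

Because both halves convert exactly, the only rounding happens in the last addition, so the result matches a direct conversion.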

[`all_true`](https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#all-lanes-true) checks if all lanes are (unsigned) greater than 0. This requires 4 instructions in [cranelift](https://github.com/bytecodealliance/cranelift/blob/48029b4a16264672ce24afbee1050b37e1e68020/cranelift-codegen/meta/src/isa/x86/legalize.rs#L471-L501) and 6 in [v8](https://github.com/v8/v8/blob/19be4913881bb02c5d9b4f1c7547ee2d1273120b/src/compiler/backend/x64/code-generator-x64.cc#L590-L602). Perhaps there is a more granular way to reduce lanes...

perf documentation
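A scalar sketch of the `all_true` semantics may help here. It models the reduction as a lane-wise compare against zero followed by a single whole-vector test; the helper names are hypothetical, and the two-step shape is an assumption about how such a lowering is typically structured, not a transcription of either engine.

```python
def all_true(lanes):
    """Reference semantics: 1 if every lane is non-zero, else 0."""
    return int(all(lane != 0 for lane in lanes))


def all_true_via_compare(lanes, bits):
    """Compare each lane with zero, then check that no lane matched."""
    ones = (1 << bits) - 1
    eq_zero = [ones if lane == 0 else 0 for lane in lanes]  # lane-wise compare
    return int(not any(eq_zero))                            # whole-vector test
```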

In both cranelift and v8, unsigned integer comparisons are lowered to more than one instruction:
- unsigned greater/less-than takes 4 instructions; e.g. [cranelift](https://github.com/bytecodealliance/cranelift/blob/48029b4a16264672ce24afbee1050b37e1e68020/cranelift-codegen/meta/src/isa/x86/legalize.rs#L525-L532) and [v8](https://github.com/v8/v8/blob/19be4913881bb02c5d9b4f1c7547ee2d1273120b/src/compiler/backend/x64/code-generator-x64.cc#L3071-L3081)
- both unsigned and signed greater/less-than-or-equal...

perf documentation
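Two standard ways to build unsigned comparisons from other primitives can be sketched in scalar form. The function names are hypothetical and the sketch is not claimed to match either engine's exact sequence: the first flips the sign bit of both operands so that a signed compare yields the unsigned ordering, and the second expresses greater-than-or-equal as an unsigned maximum followed by an equality test.

```python
def to_signed(x, bits):
    """Reinterpret an unsigned lane value as two's-complement signed."""
    return x - (1 << bits) if x >> (bits - 1) else x


def gt_u_via_sign_flip(a, b, bits=32):
    """Unsigned a > b using only a signed comparison."""
    bias = 1 << (bits - 1)  # the sign bit of the lane
    return to_signed(a ^ bias, bits) > to_signed(b ^ bias, bits)


def ge_u_via_max(a, b):
    """Unsigned a >= b as max(a, b) == a."""
    return max(a, b) == a
```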

In both [cranelift](https://github.com/bytecodealliance/cranelift/blob/48029b4a16264672ce24afbee1050b37e1e68020/cranelift-codegen/meta/src/isa/x86/legalize.rs#L575-L607) (ignore the bitcasts) and [v8](https://github.com/v8/v8/blob/19be4913881bb02c5d9b4f1c7547ee2d1273120b/src/compiler/backend/x64/code-generator-x64.cc#L2465-L2491), floating-point absolute value and floating-point negation are 3-instruction lowerings. I don't believe there is any better lowering than these (is there?) for...

perf documentation
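For reference, floating-point absolute value and negation reduce to a single bitwise operation on the sign bit once a mask is available, which suggests the remaining instructions go into materializing that mask. The scalar sketch below operates on one binary32 lane; the helper names are hypothetical.

```python
import struct

SIGN_BIT = 0x80000000  # sign bit of a binary32 lane
ABS_MASK = 0x7FFFFFFF  # every bit except the sign bit


def f32_to_bits(x):
    """Bitcast a binary32 value to its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(b):
    """Bitcast a 32-bit pattern back to a binary32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def fabs_lane(x):
    """Absolute value: clear the sign bit with an AND."""
    return bits_to_f32(f32_to_bits(x) & ABS_MASK)


def fneg_lane(x):
    """Negation: flip the sign bit with an XOR."""
    return bits_to_f32(f32_to_bits(x) ^ SIGN_BIT)
```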

While attempting to lower `shl` and `shr` (https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#bit-shifts) in cranelift, I observed that the following instructions would involve a non-optimal lowering to x86:
- `i8x16.shl`
- `i8x16.shr_s`
- `i8x16.shr_u`
- `i64x2.shr_s`...

perf documentation
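The `i8x16.shl` case can be illustrated with a scalar model, assuming the difficulty is that SSE offers packed shifts for 16-bit and wider lanes but not for 8-bit lanes. The sketch shifts pairs of bytes as one 16-bit lane and then masks off the bits that spilled in from the neighboring byte. The function names are hypothetical, and this is one possible emulation rather than the lowering cranelift adopted.

```python
def i8x16_shl_reference(lanes, n):
    """Per-byte left shift; the shift count is taken modulo the lane width."""
    n %= 8
    return [(b << n) & 0xFF for b in lanes]


def i8x16_shl_via_16bit(lanes, n):
    """Emulate the byte shift with a 16-bit lane shift plus a mask."""
    n %= 8
    keep = (0xFF << n) & 0xFF  # clears the low n bits of every byte
    out = []
    for i in range(0, len(lanes), 2):
        wide = lanes[i] | (lanes[i + 1] << 8)  # little-endian byte pair
        wide = (wide << n) & 0xFFFF            # 16-bit lane shift
        out.append(wide & keep)                # low byte
        out.append((wide >> 8) & keep)         # high byte, spill-in removed
    return out
```

The mask is needed because the top bits of the low byte land in the bottom bits of the high byte after the wide shift.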

Suggestion from https://github.com/WebAssembly/simd/pull/455#issuecomment-775787091. We can try doing this after adding all the instructions to syntax.

spectext

Sign-replication is an often-used operation that replicates the sign bit of a SIMD lane into all bits of the lane. There are two reasons why we need to pay attention...

perf documentation
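Sign replication as described above can be sketched for a single lane in two equivalent ways: an arithmetic right shift by the lane width minus one, or a signed "less than zero" comparison that yields an all-ones or all-zeroes lane. The helper names are hypothetical.

```python
def to_signed(x, bits):
    """Reinterpret an unsigned lane value as two's-complement signed."""
    return x - (1 << bits) if x >> (bits - 1) else x


def sign_replicate_shift(x, bits):
    """Arithmetic shift right by bits - 1 copies the sign bit into every bit."""
    mask = (1 << bits) - 1
    return (to_signed(x, bits) >> (bits - 1)) & mask


def sign_replicate_compare(x, bits):
    """A signed 0 > x comparison yields the same all-ones/all-zeroes lane."""
    mask = (1 << bits) - 1
    return mask if to_signed(x, bits) < 0 else 0
```

The comparison form matters on targets that lack an arithmetic right shift for a given lane width.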