Deepti Gandluri comments

JS API for SIMD: WA.Global<v128>, lane accessors, coercions at calls

Given that JS not being able to talk to V128 values was by design, i.e. we expect that most JS developers won't need to interact with SIMD code, would this...

Summarizing previous polls, criteria for including SIMD operations

> What are the "multiple relevant arches"?
>
> FWIW, some SIMD operations aren't necessarily more performant than their scalar counterparts, but are often necessary and faster than moving...

Summarizing previous polls, criteria for including SIMD operations

I agree at a high level with the axes for evaluating operations: usefulness, architecture support, and performance. But there are a couple of things that are being conflated, and...

Requirements for phase 3

Looks like this discussion has stalled. Was this discussed in a meeting with additional follow-up? If not, I'm wondering if, in addition to the existing spec tests, would [WPT](https://web-platform-tests.org/)...

Documenting performance tradeoffs

Thanks @zeux for documenting this. As it documents the V8 implementation, I'm leaving a disclaimer here that the support in V8 is still experimental and may change as we shift...

Accelerated shuffle masks

Thanks @penzn for filing - I'm guessing the purpose of this is currently for documenting known fast shuffles? I'm marking this with a documentation label till #196 is resolved to...
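
To illustrate what a "known fast shuffle" might look like, here is a minimal sketch that checks whether a 16-byte shuffle mask only moves whole, aligned 32-bit lanes, which is the kind of pattern that can typically be lowered to a single lane-wise shuffle. The function name `match_32x4_shuffle` and the choice of pattern are assumptions made for illustration; they are not taken from the issue or from any engine's API.

```python
def match_32x4_shuffle(mask):
    """Return four 32-bit lane indices if `mask` moves whole aligned
    32-bit lanes, otherwise None.

    `mask` is a list of 16 byte indices in the range 0..31, as used by a
    two-input byte shuffle. This helper is hypothetical.
    """
    if len(mask) != 16 or any(not 0 <= b < 32 for b in mask):
        raise ValueError("mask must hold 16 byte indices in 0..31")

    lanes = []
    for i in range(4):
        first = mask[4 * i]
        # The group must start on a 4-byte boundary of the source...
        if first % 4 != 0:
            return None
        # ...and select the four consecutive bytes of that lane.
        if any(mask[4 * i + j] != first + j for j in range(4)):
            return None
        lanes.append(first // 4)
    return lanes
```

A mask that swaps whole 32-bit lanes matches, while a byte-reversal mask does not and would need a more general byte shuffle.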

Inefficient x64 codegen for splat

This is very specifically an Intel ISA quirk because `pshufd/pshufw/pshufb` all have different semantics. In the specific case that you linked for i16x8.splat, the `pshufw` instruction only operates on 64-bit...
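
The difference in semantics can be sketched with a small scalar model of the instructions, written in Python so it runs anywhere. `pshufw` shuffles only the four 16-bit lanes of a 64-bit MMX register, so a 128-bit splat needs a different sequence. The `movd` + `pshuflw` + `pshufd` sequence below is one common SSE2 way to splat a 16-bit value; it is an assumption for illustration and not necessarily what any particular engine emits. Registers are modeled as lists of 16-bit lanes, lowest lane first.

```python
def movd(value):
    """Model of `movd xmm, r32`: low 32 bits loaded, upper 96 bits zeroed."""
    value &= 0xFFFFFFFF
    return [value & 0xFFFF, value >> 16] + [0] * 6


def pshufw(src4, imm8):
    """Model of `pshufw`: shuffles the four 16-bit lanes of a 64-bit value."""
    return [src4[(imm8 >> (2 * i)) & 3] for i in range(4)]


def pshuflw(src8, imm8):
    """Model of `pshuflw`: shuffles the low four 16-bit lanes of a 128-bit
    value and copies the high four lanes unchanged."""
    return pshufw(src8[:4], imm8) + list(src8[4:])


def pshufd(src8, imm8):
    """Model of `pshufd`: shuffles four 32-bit lanes, each modeled here as
    a pair of adjacent 16-bit lanes."""
    out = []
    for i in range(4):
        sel = (imm8 >> (2 * i)) & 3
        out += src8[2 * sel:2 * sel + 2]
    return out


def i16x8_splat(value):
    """Hypothetical lowering: broadcast the low 16 bits to all eight lanes."""
    v = movd(value)      # value lands in lane 0
    v = pshuflw(v, 0)    # lanes 0..3 now hold the value
    return pshufd(v, 0)  # 32-bit lane 0 is copied to all four 32-bit lanes
```

In this model the splat needs the initial move plus two shuffles, because no single word shuffle covers all 128 bits.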

Inefficient x64 codegen for splat

The different semantics are an issue for the specific i16x8.splat that you linked code to, but I agree that the additional `mov/pinsr*` instruction for splats is harder to get rid...

Inefficient x64 codegen for splat

There doesn't seem to be anything actionable here, so closing this issue - please reopen if you have suggestions for more we can do here.

Inefficient x64 codegen for splat

Not sure if you need permissions to reopen as the original author of the issue, but reopening. This was previously discussed at a meeting (03/06), and there was an AI...