
SIMD horizontal adds

Open ppenzin opened this issue 1 year ago • 3 comments

Originally thought to be a post-MVP feature: https://github.com/WebAssembly/simd/issues/20

There is a PR to LLVM that introduces shuffle patterns which, combined with other instructions, would translate to horizontal additions. That is motivation to restart the conversation on horizontal SIMD ops, or at least to provide a way to disseminate these patterns among the runtimes.

@sparker-arm as the author of the patch, sorry to put you on the spot.

ppenzin, Jun 04 '24 17:06

From my brief look at this spec, it looks like we'd use dup_odd to pattern match a pairwise operation. So, IIUC, pattern matching with flexible vectors should be a bit easier than the current SIMD shuffles.

But we would still have the trouble of choosing a canonical form that maps well to hardware, and all the runtimes would have to perform the matching. For instance, the current shuffle approach in LLVM would map to concat_lower_upper and that, again, isn't useful for the horizontal FP instructions that I'm aware of.

So, I would definitely be in favour of having dedicated wasm instruction(s), for both fixed and flexible!

Arm hardware-wise, Neon includes faddp for floats, which is chained for a full reduction, and addv is used for integer reduction. SVE includes faddv, which performs a recursive pairwise reduction on floats, but I'm not sure what we use for integers.
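
To illustrate the chaining described above, here is a minimal scalar sketch in Python of a faddp-style pairwise add and a full reduction built from it. The function names `pairwise_add` and `horizontal_sum` are hypothetical and are not part of any proposal or ISA; the lane layout (adjacent pairs of the first operand, then of the second) is an assumption modelled on the Neon pairwise-add behaviour.

```python
def pairwise_add(a, b):
    """Scalar model (assumed semantics) of a pairwise add:
    adjacent lanes of a are summed, then adjacent lanes of b."""
    assert len(a) == len(b) and len(a) % 2 == 0

    def pairs(v):
        return [v[i] + v[i + 1] for i in range(0, len(v), 2)]

    return pairs(a) + pairs(b)


def horizontal_sum(v):
    """Full reduction by chaining pairwise adds log2(len(v)) times.
    Requires a power-of-two lane count."""
    n = len(v)
    assert n > 0 and n & (n - 1) == 0
    for _ in range(n.bit_length() - 1):
        # Feeding the same vector as both operands halves the number
        # of distinct partial sums on every step.
        v = pairwise_add(v, v)
    return v[0]
```

For a 4-lane vector this takes two pairwise steps, which is the kind of instruction sequence a runtime would need to recognise if the operation were expressed through shuffles rather than a dedicated instruction.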

sparker-arm, Jun 05 '24 08:06

We have the SADDV and UADDV instructions to deal with integers (signed and unsigned respectively) in SVE. BTW another option for floating-point values is FADDA, which is strictly ordered, but realistically has a performance cost.

There are also pairwise operations, i.e. ADDP and FADDP.
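
The difference between a strictly ordered reduction and a pairwise (tree) reduction matters for floats because floating-point addition is not associative. A minimal scalar sketch in Python follows; `ordered_sum` and `tree_sum` are hypothetical names, and the adjacent-lane pairing used in `tree_sum` is an assumption for illustration, not a statement of how any particular instruction pairs its lanes.

```python
def ordered_sum(lanes, init=0.0):
    """Strictly ordered reduction: accumulate lane by lane, left to right."""
    acc = init
    for x in lanes:
        acc += x
    return acc


def tree_sum(lanes):
    """Recursive pairwise reduction over a power-of-two lane count
    (adjacent-lane pairing is an assumption of this sketch)."""
    v = list(lanes)
    assert len(v) > 0 and len(v) & (len(v) - 1) == 0
    while len(v) > 1:
        v = [v[i] + v[i + 1] for i in range(0, len(v), 2)]
    return v[0]


lanes = [1e17, 1.0, -1e17, 1.0]
print(ordered_sum(lanes))  # 1.0: the large terms cancel before the last 1.0 is added
print(tree_sum(lanes))     # 0.0: each 1.0 is absorbed by a large term first
```

Both results are valid roundings of the same mathematical sum, so a wasm instruction would have to either fix one order or explicitly allow the result to vary between runtimes.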

akirilov-arm, Jun 05 '24 12:06

I vaguely remember horizontal ops were not great on x86. @rrwinterton, do you have any thoughts?

ppenzin, Jun 06 '24 16:06