Jan Wassenberg comments

Results: 405 comments of Jan Wassenberg

`RVV` target test failures

Oh, good catch! That's likely it. We seem to have 512-bit VLEN. Note that SortTag uses LMUL=1/2. The problem is that the base case is meant to handle at least...
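As a rough illustration of the arithmetic involved: on RVV the maximum number of lanes per vector is VLEN × LMUL / SEW. The helper below is a hypothetical sketch, not part of Highway, and the 64-bit and 32-bit lane widths in the usage note are assumptions chosen for the example.

```cpp
#include <cstddef>

// Hypothetical helper: maximum lanes for an RVV configuration.
// vlen_bits: hardware vector length in bits (e.g. 512).
// lmul_num / lmul_den: register grouping as a fraction (1/2 => 1, 2).
// sew_bits: element width in bits.
constexpr std::size_t MaxLanes(std::size_t vlen_bits, std::size_t lmul_num,
                               std::size_t lmul_den, std::size_t sew_bits) {
  return vlen_bits * lmul_num / (lmul_den * sew_bits);
}
```

With VLEN=512 and LMUL=1/2, `MaxLanes(512, 1, 2, 64)` gives 4 lanes of 64 bits and `MaxLanes(512, 1, 2, 32)` gives 8 lanes of 32 bits. Doubling VLEN to 1024 doubles both counts.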

`RVV` target test failures

The sort itself does check for the problem, but TestAllPartition did not, and soon will. Unfortunately our CI doesn't work with 1024, so I am not entirely certain this fixes it.

Support for saturating doubling multiply add

It sounds reasonable to want access to this instruction. I'm curious about the purpose of the extra 2x mul in their definition? Because NEON differs from SVE in that it...

Support for saturating doubling multiply add

Makes sense. `ReorderWidenMulAccumulate` also returns a second value using an output param, so there is precedent. So the proposed op would call both vqdmlal and vqdmlal_high on NEON, and svqdmlalb...

Support for saturating doubling multiply add

Got it, thanks. In that case adding FixedPoint to the op name may be helpful.
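For readers unfamiliar with the instruction, here is a minimal scalar sketch of what a saturating doubling multiply-add computes per lane, assuming 16-bit inputs and a 32-bit accumulator as in NEON `vqdmlal`. The function names are hypothetical and this is not the proposed Highway op. In Q15 fixed-point arithmetic, the product of two Q15 values is Q30, so the doubling brings it to Q31, which is one common reason for the extra 2x.

```cpp
#include <cstdint>
#include <limits>

// Clamps a 64-bit intermediate to the int32_t range.
constexpr int32_t SaturateToI32(int64_t v) {
  return v < std::numeric_limits<int32_t>::min()
             ? std::numeric_limits<int32_t>::min()
             : (v > std::numeric_limits<int32_t>::max()
                    ? std::numeric_limits<int32_t>::max()
                    : static_cast<int32_t>(v));
}

// Hypothetical scalar reference: acc + sat(2 * a * b), saturating the
// doubled product first and then the accumulation.
constexpr int32_t SatDoublingMulAdd(int32_t acc, int16_t a, int16_t b) {
  return SaturateToI32(
      static_cast<int64_t>(acc) +
      SaturateToI32(2 * static_cast<int64_t>(a) * static_cast<int64_t>(b)));
}
```

The only input pair whose doubled product overflows is `a == b == -32768`, where 2 × 2^30 = 2^31 is clamped to `INT32_MAX` before the accumulator is added.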

Support for saturating doubling multiply add

Very nice! @Ryo-not-rio FYI John's solution defines the op in terms of NEON/x86, so for SVE we have two extra Zip. Does that work for you?

Choosing NEON over SVE when fixed size vectors are used where possible

If I understand correctly, the issue is that we use `FixedTag`, which on SVE requires Load/Store etc to do extra work to limit the work to 128 bits. +1 to...

Choosing NEON over SVE when fixed size vectors are used where possible

Hm, if the code is isolated and not alternating between SVE/NEON in the same function or source file, it is easy to compile one source file with SVE disabled (so...

Choosing NEON over SVE when fixed size vectors are used where possible

It can work like this.
```
template NeonType Func(D d) { return NeonType(); }
template SveType Func(D d) { return SveType(); }
```
and for functions not involving a `D=Simd`,...
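A self-contained sketch of the overloading idea, under stated assumptions: `NeonTag`, `SveTag`, `NeonType` and `SveType` are hypothetical stand-ins defined here, and the `std::enable_if` constraints are one possible way to select an overload from the tag type, not necessarily the mechanism the comment had in mind.

```cpp
#include <type_traits>

// Hypothetical stand-ins for the per-target tag and vector types.
struct NeonTag {};
struct SveTag {};
struct NeonType {};
struct SveType {};

// Chosen only when the tag is NeonTag; the return type follows the tag.
template <class D, typename std::enable_if<
                       std::is_same<D, NeonTag>::value, int>::type = 0>
constexpr NeonType Func(D /*d*/) {
  return NeonType();
}

// Chosen only when the tag is SveTag.
template <class D, typename std::enable_if<
                       std::is_same<D, SveTag>::value, int>::type = 0>
constexpr SveType Func(D /*d*/) {
  return SveType();
}
```

Calling `Func(NeonTag())` yields a `NeonType` and `Func(SveTag())` yields an `SveType`, so both code paths can coexist in one translation unit and the caller picks one by passing the matching tag.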