Anton Kirilov

40 comments by Anton Kirilov

This operation is not that easy to implement with Neon and SVE either because of the variable index. It is basically equivalent to `extract_lane`, followed by an indexed `DUP` (index...

> Also, for 8-bit types, it is not that hard to implement on all archs: DUP the index into a vector, and use this vector in a TBL. I am...
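
The quoted idea for 8-bit lanes (DUP the index into a vector, then use that vector in a TBL) can be sketched in portable C. This is only a scalar model under stated assumptions: a fixed 16-lane vector, and the hypothetical helpers `tbl_model` and `broadcast_lane_u8`, which are illustrative names and not part of any proposal or library. The model follows the single-register AArch64 `TBL` behaviour, where an out-of-range index produces zero.

```c
#include <stdint.h>

#define LANES 16 /* assumption: 128-bit vector of 8-bit lanes */

/* Scalar model of a single-register TBL: each output byte is
 * table[idx[i]], or 0 when idx[i] is out of range. */
static void tbl_model(uint8_t out[LANES], const uint8_t table[LANES],
                      const uint8_t idx[LANES]) {
    for (int i = 0; i < LANES; i++) {
        out[i] = idx[i] < LANES ? table[idx[i]] : 0;
    }
}

/* Broadcast the lane selected at run time to every lane of the result. */
static void broadcast_lane_u8(uint8_t out[LANES], const uint8_t v[LANES],
                              uint8_t lane) {
    uint8_t idx[LANES];
    for (int i = 0; i < LANES; i++) {
        idx[i] = lane; /* models DUP of the scalar index */
    }
    tbl_model(out, v, idx); /* models TBL with the splatted index */
}
```

With `v[i] = 10 + i`, calling `broadcast_lane_u8(out, v, 3)` fills every lane of `out` with 13. Wider element types would need a different index vector, which this sketch does not cover.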

I don't assume JIT compilation in my analysis and I use vector length-agnostic code generation - in fact, AOT compilation of WebAssembly doesn't sound like something completely out of the...

I won't be able to make it this Friday either.

> The motivation is simply hardware availability (mostly) on the consumer market. On the other hand, we obviously don't want to close the door to SVE and RISC-V, though the...

> For unaligned loads, it might be necessary to add a check to see if the full load might cross a page boundary and fall back to a slower load in that case (because...
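
The page-boundary check mentioned in the quote can be sketched as a small predicate. This is a minimal illustration, assuming 4 KiB pages and a 16-byte vector; `PAGE_SIZE`, `VEC_BYTES` and `load_crosses_page` are hypothetical names chosen for the example, and real code would take the page size from the platform.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u /* assumption: 4 KiB pages */
#define VEC_BYTES 16u   /* assumption: 128-bit vector load */

/* Returns 1 if a VEC_BYTES-wide load starting at addr would touch
 * the following page, 0 if it stays within the current page. */
static int load_crosses_page(uintptr_t addr) {
    uintptr_t offset = addr & (PAGE_SIZE - 1u); /* offset within the page */
    return offset > PAGE_SIZE - VEC_BYTES;
}
```

A caller would take the full-width load when the predicate returns 0 and use a slower, element-wise load otherwise. For example, an address at page offset 4080 is still safe for a 16-byte load, while offset 4081 is not.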

Sure, considering each branch in isolation, the branching pattern should be mostly regular, but the branch density might have negative interactions with branch prediction (which wouldn't be an issue with...

I have a couple of remarks about setting the vector length dynamically, as currently defined by the proposal. I am not sure if this is the best place to write...

Is there a place where the collection of code generation samples for the operations that are currently defined by the proposal would be hosted? I would be quite happy to...