Gather ops

Open: valadaptive opened this issue 1 year ago • 1 comment

I'm not sure how important this is across architectures, since only AVX2 supports gather: ARM only supports gather via SVE, which LLVM doesn't seem to support, and WASM seems to only support it via an experimental proposal. That said, I was porting a noise library to another SIMD library a while back, and using AVX2 gather sped things up by ~20-40%. It might be worth implementing for AVX2 and falling back to a scalar implementation for everything else.

valadaptive, Dec 01 '24 11:12

sounds good. i can have that ready soon

sarah-quinones, Dec 01 '24 12:12
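
For reference, the "AVX2 gather with a scalar fallback" idea from the issue could look roughly like the sketch below. It is not pulp's API: `gather8`, `gather8_scalar`, and `gather8_avx2` are hypothetical names, and the sketch calls the `std::arch` intrinsic `_mm256_i32gather_ps` directly with runtime feature detection. It also assumes 8 lanes of `f32` and `i32` indices that are bounds-checked up front.

```rust
/// Hypothetical helper (not part of pulp): returns `table[idx[i]]` for 8 lanes.
pub fn gather8(table: &[f32], idx: [i32; 8]) -> [f32; 8] {
    // Check bounds once so the unsafe AVX2 path can never read out of range.
    for &i in &idx {
        assert!(i >= 0 && (i as usize) < table.len(), "index out of bounds");
    }

    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 was detected at runtime and all indices were checked.
            return unsafe { gather8_avx2(table, idx) };
        }
    }

    // Everything else (ARM, WASM, older x86) takes the scalar path.
    gather8_scalar(table, idx)
}

/// Scalar fallback: one indexed load per lane.
pub fn gather8_scalar(table: &[f32], idx: [i32; 8]) -> [f32; 8] {
    idx.map(|i| table[i as usize])
}

/// AVX2 path: a single hardware gather for all 8 lanes.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn gather8_avx2(table: &[f32], idx: [i32; 8]) -> [f32; 8] {
    use std::arch::x86_64::*;
    unsafe {
        // Load the 8 indices into a 256-bit integer register (unaligned load).
        let offsets = _mm256_loadu_si256(idx.as_ptr() as *const __m256i);
        // SCALE = 4: each index is multiplied by size_of::<f32>() bytes.
        let v = _mm256_i32gather_ps::<4>(table.as_ptr(), offsets);
        let mut out = [0.0f32; 8];
        _mm256_storeu_ps(out.as_mut_ptr(), v);
        out
    }
}
```

Both paths return the same values, so callers only see a speed difference. Whether the gather path is actually faster depends on the CPU and the access pattern, so the ~20-40% figure above should be re-measured for any given workload.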