Xiangyang (Mark) Guo
Recently, .NET Core enabled hardware intrinsics that generate SIMD instructions from SSE to AVX2, and more instructions are being added to the .NET Core API. The instruction list can be...
Optimize `Vectorized exp()` with NEON SIMD instructions. The implementation is copied from https://github.com/ARM-software/optimized-routines/blob/master/math/aarch64/v_expf.c with minor changes. cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10
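For readers unfamiliar with how such kernels are usually structured, here is a minimal scalar sketch of the range-reduction idea that vectorized `expf` routines commonly rely on: write exp(x) = 2^n * exp(r) with n = round(x / ln 2), approximate exp(r) with a short polynomial, then build 2^n directly from the float exponent bits. This is not the ARM routine and not the code in the PR. The names `exp_sketch` and `exp_array_sketch` are hypothetical, the polynomial uses plain Taylor coefficients rather than tuned minimax coefficients, and the clamping bounds are an assumption chosen to avoid overflow and subnormal handling.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical scalar sketch of range-reduced expf (not the ARM implementation).
inline float exp_sketch(float x) {
    // Assumption: clamp so that 2^n stays a normal float (no special-case path).
    if (x > 88.0f) x = 88.0f;
    if (x < -87.0f) x = -87.0f;

    const float inv_ln2 = 1.44269504f;       // 1 / ln(2)
    const float ln2_hi  = 0.693359375f;      // high part of ln(2), few mantissa bits
    const float ln2_lo  = -2.12194440e-4f;   // ln(2) - ln2_hi

    // n = round(x / ln2); r = x - n*ln2 computed in two steps to limit rounding error.
    float n = std::nearbyint(x * inv_ln2);
    float r = x - n * ln2_hi;
    r = r - n * ln2_lo;

    // Degree-6 Taylor polynomial for exp(r), evaluated with Horner's scheme.
    float p = 1.0f / 720.0f;
    p = p * r + 1.0f / 120.0f;
    p = p * r + 1.0f / 24.0f;
    p = p * r + 1.0f / 6.0f;
    p = p * r + 0.5f;
    p = p * r + 1.0f;
    p = p * r + 1.0f;

    // Build 2^n by writing the biased exponent (n + 127) into the float bit pattern.
    std::uint32_t bits = static_cast<std::uint32_t>(static_cast<std::int32_t>(n) + 127) << 23;
    float scale;
    std::memcpy(&scale, &bits, sizeof(scale));
    return p * scale;
}

// Apply the sketch element-wise; a SIMD kernel would process several lanes per step.
inline void exp_array_sketch(const float* in, float* out, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = exp_sketch(in[i]);
    }
}
```

In a NEON version each of these steps would map to lane-wise operations on four floats at a time; the scalar form is used here only so the sketch compiles on any platform.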
https://github.com/ARM-software/optimized-routines/tree/master/math/aarch64 implements some math operations with NEON SIMD instructions. The performance looks good, especially for exp(). Would it be possible to integrate ARM-software/optimized-routines into SLEEF? Thanks!
Use SLEEF for aarch64 by default. cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang
Summary: Skip the asmjit test on ARM because asmjit doesn't work on ARM. `kernel_32` and `kernel_64` are generated by `GenerateEmbeddingSpMDMNBit`, which calls the auto-vectorized version on ARM. Differential Revision: D60181430