whisper.cpp
Simplify the SIMD code
This is an attempt to simplify and unify the SIMD implementation for various architectures. The idea is similar to https://github.com/ggerganov/whisper.cpp/pull/95#issuecomment-1344845687
We define a common set of C macros that map to architecture-specific intrinsics, selected at compile time for the target architecture:
#define GGML_F32_VEC ...
#define GGML_F32_VEC_ZERO ...
#define GGML_F32_VEC_SET1 ...
#define GGML_F32_VEC_LOAD ...
#define GGML_F32_VEC_STORE ...
#define GGML_F32_VEC_FMA ...
#define GGML_F32_VEC_ADD ...
#define GGML_F32_VEC_MUL ...
#define GGML_F32_VEC_REDUCE ...
#define GGML_F16_VEC ...
#define GGML_F16_VEC_ZERO ...
#define GGML_F16_VEC_SET1 ...
#define GGML_F16_VEC_LOAD ...
#define GGML_F16_VEC_STORE ...
#define GGML_F16_VEC_FMA ...
#define GGML_F16_VEC_ADD ...
#define GGML_F16_VEC_MUL ...
#define GGML_F16_VEC_REDUCE ...
We then implement the SIMD functions using only these macros. Refactored functions:
- [x] ggml_vec_dot_f32
- [x] ggml_vec_dot_f16
- [x] ggml_vec_mad_f32
- [x] ggml_vec_mad_f16
- [x] ggml_scale_f32
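
To illustrate the approach, here is a minimal sketch of a dot product written only against the macros. The macro names come from the list above, but their bodies here are assumptions for illustration and are not the definitions used in this PR. `SKETCH_F32_EPR` (elements per register), `sketch_avx_reduce` and `sketch_vec_dot_f32` are hypothetical names, and `GGML_F32_VEC_REDUCE` is simplified to return a scalar. A scalar fallback keeps the sketch compilable on any target.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative macro bodies only; the real definitions may differ. */
#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define SKETCH_F32_EPR 4                      /* floats per 128-bit register */
#define GGML_F32_VEC            float32x4_t
#define GGML_F32_VEC_ZERO       vdupq_n_f32(0.0f)
#define GGML_F32_VEC_LOAD(p)    vld1q_f32(p)
#define GGML_F32_VEC_FMA(a,b,c) vfmaq_f32(a, b, c)   /* a + b*c */
#define GGML_F32_VEC_REDUCE(v)  vaddvq_f32(v)        /* horizontal sum */
#elif defined(__AVX__)
#include <immintrin.h>
#define SKETCH_F32_EPR 8                      /* floats per 256-bit register */
static inline float sketch_avx_reduce(__m256 v) {
    /* Fold 8 lanes to 4, then 4 to 2, then 2 to 1. */
    __m128 s = _mm_add_ps(_mm256_castps256_ps128(v),
                          _mm256_extractf128_ps(v, 1));
    s = _mm_add_ps(s, _mm_movehl_ps(s, s));
    s = _mm_add_ss(s, _mm_shuffle_ps(s, s, 1));
    return _mm_cvtss_f32(s);
}
#define GGML_F32_VEC            __m256
#define GGML_F32_VEC_ZERO       _mm256_setzero_ps()
#define GGML_F32_VEC_LOAD(p)    _mm256_loadu_ps(p)
/* Separate mul + add so that plain AVX (without the FMA extension) suffices. */
#define GGML_F32_VEC_FMA(a,b,c) _mm256_add_ps(_mm256_mul_ps(b, c), a)
#define GGML_F32_VEC_REDUCE(v)  sketch_avx_reduce(v)
#else
/* Scalar fallback: a "vector" is a single float. */
#define SKETCH_F32_EPR 1
#define GGML_F32_VEC            float
#define GGML_F32_VEC_ZERO       0.0f
#define GGML_F32_VEC_LOAD(p)    (*(p))
#define GGML_F32_VEC_FMA(a,b,c) ((a) + (b) * (c))
#define GGML_F32_VEC_REDUCE(v)  (v)
#endif

/* Dot product of x and y (n elements each), using only the macros above. */
float sketch_vec_dot_f32(size_t n, const float *x, const float *y) {
    assert(x != NULL && y != NULL);

    GGML_F32_VEC sum = GGML_F32_VEC_ZERO;
    size_t i = 0;

    /* Main loop: one full register of floats per iteration. */
    for (; i + SKETCH_F32_EPR <= n; i += SKETCH_F32_EPR) {
        GGML_F32_VEC ax = GGML_F32_VEC_LOAD(x + i);
        GGML_F32_VEC ay = GGML_F32_VEC_LOAD(y + i);
        sum = GGML_F32_VEC_FMA(sum, ax, ay);
    }

    float res = GGML_F32_VEC_REDUCE(sum);

    /* Leftover elements that do not fill a register. */
    for (; i < n; ++i) {
        res += x[i] * y[i];
    }
    return res;
}
```

The function body contains no architecture-specific code, so supporting a new target in a sketch like this would only mean adding another `#elif` branch of macro definitions.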
Supported and tested architectures:
- [x] Arm64
- [x] ARMv7
- [x] x86
- [x] WASM
- [ ] ppc64le
I am still not sure whether this will be easier to maintain, but I will give it a try.