whisper.cpp

Reorganize POWER9 SIMD code

Open fitzsim opened this issue 1 year ago • 0 comments

I could not eliminate the separate index argument in the f16 load and store macros, so this patch set needs testing on other architectures.

The existing GGML_F32x4_REDUCE macro performs as well as the implementation in #366, so I used the existing one.
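For readers unfamiliar with this kind of reduction, here is a minimal portable sketch of the general pattern: several 4-lane accumulators are added together pairwise until one remains, and its lanes are then summed. It is not the ggml macro itself. The names `f32x4_sketch`, `f32x4_sketch_add` and `f32x4_sketch_reduce` are hypothetical, and plain scalar C stands in for the vector instructions so that it compiles on any architecture.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a 4-lane SIMD float register. */
typedef struct { float lane[4]; } f32x4_sketch;

/* Lane-wise add, the scalar equivalent of a single vector add. */
static f32x4_sketch f32x4_sketch_add(f32x4_sketch a, f32x4_sketch b) {
    f32x4_sketch r;
    for (int i = 0; i < 4; ++i) {
        r.lane[i] = a.lane[i] + b.lane[i];
    }
    return r;
}

/* Reduce n_acc accumulators to one float. n_acc must be a power of two.
 * Accumulators are combined pairwise, halving the count on each pass,
 * then the four lanes of the survivor are summed. acc[] is overwritten. */
static float f32x4_sketch_reduce(f32x4_sketch *acc, size_t n_acc) {
    assert(n_acc > 0 && (n_acc & (n_acc - 1)) == 0);
    for (size_t half = n_acc / 2; half > 0; half /= 2) {
        for (size_t i = 0; i < half; ++i) {
            acc[i] = f32x4_sketch_add(acc[i], acc[i + half]);
        }
    }
    return acc[0].lane[0] + acc[0].lane[1] + acc[0].lane[2] + acc[0].lane[3];
}
```

Keeping several independent accumulators and merging them only at the end is a common way to shorten the dependency chain in a dot-product loop. The final horizontal sum is the only step that is not lane-parallel.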

When I test with the F32 model, ggml_vec_dot_f16 and ggml_vec_mad_f16 are still being called. Is that expected?

fitzsim, Jan 04 '23 08:01