whisper.cpp
Enable POWER9 fp32 and fp16 SIMD code
With the FP32 base model and this patch set, the jfk example takes about 3.2 seconds to transcribe. This is another data point for #300, and it is about one second faster than the current FP16 SIMD code.
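For readers unfamiliar with what an fp32 SIMD path on POWER9 looks like, here is a minimal sketch of a dot product. It is not the code from this patch set: the function name `dot_f32` is hypothetical, and it assumes a compiler that defines `__POWER9_VECTOR__` (for example GCC with `-mcpu=power9`) and provides the VSX intrinsics `vec_xl`, `vec_madd`, `vec_splats`, and `vec_extract` from `altivec.h`. On any other target it falls back to a plain scalar loop.

```c
#include <stddef.h>

#if defined(__POWER9_VECTOR__)
#include <altivec.h>
#endif

/* Hypothetical helper for illustration only (not taken from the patch):
 * computes the dot product of two fp32 arrays of length n. */
float dot_f32(const float *x, const float *y, size_t n) {
    float sum = 0.0f;
    size_t i = 0;

#if defined(__POWER9_VECTOR__)
    /* Assumed POWER9 path: process 4 floats per iteration with VSX. */
    __vector float acc = vec_splats(0.0f);
    for (; i + 4 <= n; i += 4) {
        __vector float vx = vec_xl(0, x + i);  /* unaligned 16-byte load */
        __vector float vy = vec_xl(0, y + i);
        acc = vec_madd(vx, vy, acc);           /* acc += vx * vy */
    }
    /* Horizontal sum of the four accumulator lanes. */
    sum = vec_extract(acc, 0) + vec_extract(acc, 1)
        + vec_extract(acc, 2) + vec_extract(acc, 3);
#endif

    /* Scalar tail, and the full loop when no POWER9 vector unit is available. */
    for (; i < n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

Calling `dot_f32(x, y, n)` gives the same kind of result on either path, though the vector path sums in a different order, so floating-point rounding may differ slightly from the scalar loop.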