whisper.cpp
Can all the calculations use F16?
Hi @ggerganov, at this point you have already implemented AVX intrinsics support, F16, and OpenBLAS. I don't know how to accelerate the computation further; can you give some advice? Thanks! For example, would it work if all the calculations used F16? How should convert-pt-to-ggml.py be modified to convert everything to F16?
To convert all weights to F16, simply comment out the following lines:
https://github.com/ggerganov/whisper.cpp/blob/46a68fb9b5b19e322b2c7ee21550481798f0061c/models/convert-pt-to-ggml.py#L295-L302
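The linked lines are the converter's fallback that keeps certain tensors in F32. A rough sketch of the idea, assuming NumPy-style tensors (choose_ftype is a hypothetical helper for illustration, not a function in convert-pt-to-ggml.py, and the exact condition in the script at that commit may differ):

```python
import numpy as np

def choose_ftype(data: np.ndarray, force_f16: bool = False):
    """Sketch of the converter's per-tensor type decision.

    In the script, small 1-dimensional tensors (biases, norms) are kept
    as F32; commenting that fallback out converts everything to F16.
    `force_f16` models the commented-out version.
    """
    if data.ndim < 2 and not force_f16:
        # the fallback the answer suggests commenting out
        return data.astype(np.float32), 0   # ftype 0 = F32
    return data.astype(np.float16), 1       # ftype 1 = F16

bias = np.zeros(512, dtype=np.float32)      # a 1-d tensor
_, ft = choose_ftype(bias)
print(ft)  # 0 with the default script
_, ft = choose_ftype(bias, force_f16=True)
print(ft)  # 1 once the F32 fallback is commented out
```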
You then have to change all GGML_TYPE_F32 to wtype in whisper_model_load().
However, I think there are still some operations that do not support F16 - not sure. It shouldn't be too difficult to implement.
But you won't gain much performance from that, because only the small 1-dimensional tensors are currently F32. Since they are small, the computation on them is already very fast, so going to F16 wouldn't help much.
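To see why the gain is negligible, here is a back-of-the-envelope count with made-up shapes (the model dimension and tensor counts below are illustrative, not Whisper's actual ones): the 2-d weight matrices dwarf the 1-d bias and norm vectors.

```python
# Hypothetical transformer-layer parameter counts, for illustration only.
d = 512                               # assumed model dimension
weights_2d = 4 * d * d + 4 * d * d    # rough: attention + MLP matrices
vectors_1d = 10 * d                   # rough: biases and layer-norm params

share = vectors_1d / (weights_2d + vectors_1d)
print(f"{share:.3%}")  # well under 1% of parameters are 1-d
```

So even converting every 1-d tensor to F16 touches only a tiny fraction of the parameters, which matches the point above.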