
Can all the calc use in F16?

Open xyx361100238 opened this issue 1 year ago • 1 comments

Hi @ggerganov: at this point you have already added AVX intrinsics support, F16, and OpenBLAS. I don't know how to accelerate the calculation in any other way — can you give some advice? Thanks! For example: if all the calculations used F16, would it work? How would I modify convert-pt-to-ggml.py to convert everything to F16?

xyx361100238 avatar Nov 09 '22 10:11 xyx361100238

To convert all weights to F16, simply comment the following:

https://github.com/ggerganov/whisper.cpp/blob/46a68fb9b5b19e322b2c7ee21550481798f0061c/models/convert-pt-to-ggml.py#L295-L302
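The idea behind that special case can be sketched as follows. This is a minimal illustration, not the script's actual code — `convert_tensor` and its parameters are hypothetical — showing the kind of 1-D exception that keeps biases and norms in F32 and that would be commented out to force everything to F16:

```python
import numpy as np

# Hypothetical sketch of the conversion logic in convert-pt-to-ggml.py.
# In F16 mode the script normally keeps small 1-D tensors (biases,
# layer norms) in F32; commenting out that exception converts all
# tensors to F16.
def convert_tensor(data: np.ndarray, use_f16: bool) -> np.ndarray:
    data = data.astype(np.float32)
    if use_f16:
        # The kind of special case that would be commented out:
        # if data.ndim == 1:
        #     return data  # keep 1-D tensors in F32
        return data.astype(np.float16)
    return data
```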

You then have to change all occurrences of GGML_TYPE_F32 to wtype in whisper_model_load().

However, I think there are still some operations that do not support F16 — I'm not sure. It shouldn't be too difficult to implement them.

But you won't gain much performance from this: only the small 1-dimensional tensors are currently F32, and the computation on them is already very fast because they are small, so moving them to F16 wouldn't help much.

ggerganov avatar Nov 11 '22 16:11 ggerganov