whisper.cpp
Can all the calculations use F16?
Hi @ggerganov, at this point you have already implemented AVX intrinsics support, F16, and OpenBLAS. I don't know how to accelerate the computation further; can you give some advice? Thanks! For example, would it work if all the calculations used F16? How should convert-pt-to-ggml.py be modified to convert everything to F16?
To convert all weights to F16, simply comment out the following lines:
https://github.com/ggerganov/whisper.cpp/blob/46a68fb9b5b19e322b2c7ee21550481798f0061c/models/convert-pt-to-ggml.py#L295-L302
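The linked lines are the converter's fallback that keeps certain tensors in F32. A rough sketch of the idea, assuming NumPy-style tensors (choose_ftype is a hypothetical helper for illustration, not a function in convert-pt-to-ggml.py, and the exact condition in the script at that commit may differ):

```python
import numpy as np

def choose_ftype(data: np.ndarray, force_f16: bool = False):
    """Sketch of the converter's per-tensor type decision.

    In the script, small 1-dimensional tensors (biases, norms) are kept
    as F32; commenting that fallback out converts everything to F16.
    `force_f16` models the commented-out version.
    """
    if data.ndim < 2 and not force_f16:
        # the fallback the answer suggests commenting out
        return data.astype(np.float32), 0   # ftype 0 = F32
    return data.astype(np.float16), 1       # ftype 1 = F16

bias = np.zeros(512, dtype=np.float32)      # a 1-d tensor
_, ft = choose_ftype(bias)
print(ft)  # 0 with the default script
_, ft = choose_ftype(bias, force_f16=True)
print(ft)  # 1 once the F32 fallback is commented out
```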
You then have to change all GGML_TYPE_F32 to wtype in whisper_model_load().
However, I think there are still some operations that do not support F16 - not sure. It shouldn't be too difficult to implement.
But you won't gain much performance from that, because only the small 1-dimensional tensors are currently F32. Since they are small, the computation on them is already very fast, so going to F16 wouldn't help much.
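To see why the gain is negligible, here is a back-of-the-envelope count with made-up shapes (the model dimension and tensor counts below are illustrative, not Whisper's actual ones): the 2-d weight matrices dwarf the 1-d bias and norm vectors.

```python
# Hypothetical transformer-layer parameter counts, for illustration only.
d = 512                               # assumed model dimension
weights_2d = 4 * d * d + 4 * d * d    # rough: attention + MLP matrices
vectors_1d = 10 * d                   # rough: biases and layer-norm params

share = vectors_1d / (weights_2d + vectors_1d)
print(f"{share:.3%}")  # well under 1% of parameters are 1-d
```

So even converting every 1-d tensor to F16 touches only a tiny fraction of the parameters, which matches the point above.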