whisper.cpp Android run demo by small model

Android run demo by small model

Open qcxu2 opened this issue 1 year ago • 3 comments

I ran the demo on a real Android device with the small model copied over, but transcription takes too long. Is this a device problem?

Feb 21 '23 09:02 qcxu2

I have the same problem on a Samsung S10e (Exynos) with the base model.

Feb 22 '23 22:02 ron-diesel

Make sure to use a release build and select the tiny or base model. On my Redmi Note 9S (Cortex-A75), a 48 sec voice sample can be transcribed in 37 sec with the base model.
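To put those numbers in perspective, here is a minimal sketch of the real-time factor (processing time divided by audio duration). The function name `real_time_factor` is made up for illustration and is not part of whisper.cpp; the inputs are the 37 sec and 48 sec figures quoted above.

```python
def real_time_factor(processing_s: float, audio_s: float) -> float:
    """Processing time divided by audio duration.

    A value below 1.0 means transcription runs faster than real time.
    """
    if audio_s <= 0:
        raise ValueError("audio duration must be positive")
    return processing_s / audio_s


# Figures from the comment above: 48 sec of audio transcribed in 37 sec.
rtf = real_time_factor(37, 48)
print(f"RTF = {rtf:.2f}")
```

A factor under 1.0, as here, means the base model keeps up with the audio on that device; the small model from the original question is larger than base, so it would be expected to take longer.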

Feb 27 '23 14:02 tinoue

In addition to @tinoue's comment: in the future you might be able to use quantised models for better performance on mobile devices. Follow #540 for more information.

Feb 27 '23 19:02 ggerganov

> Make sure to use release build and select tiny or base model.

This. There was a huge difference in transcription time between a debug APK and a release one.

Model	Variant	Input audio time	Transcription time
Tiny (ggml-tiny.en.bin)	Debug	11001 ms (11 sec)	37519 ms (37.51 sec)
Tiny (ggml-tiny.en.bin)	Release	11001 ms (11 sec)	5123 ms (5.12 sec)

Using a quantized model made almost no difference in transcription time, but it gives a major advantage in APK size without compromising accuracy (in my observation).

Model	Variant	Input audio time	Transcription time
Tiny Q4_2 (ggml-tiny.en-q4_2.bin)	Debug	11001 ms (11 sec)	46517 ms (46.51 sec)
Tiny Q4_2 (ggml-tiny.en-q4_2.bin)	Release	11001 ms (11 sec)	5309 ms (5.3 sec)
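For a quick comparison, the measurements in the two tables can be reduced to ratios. This is only a rough sketch over the four data points reported here; the `runs` list and the helper names `time_ms` and `speedup` are made up for illustration.

```python
# (model, build variant, input audio ms, transcription ms), copied from the tables above.
runs = [
    ("tiny.en", "debug", 11001, 37519),
    ("tiny.en", "release", 11001, 5123),
    ("tiny.en-q4_2", "debug", 11001, 46517),
    ("tiny.en-q4_2", "release", 11001, 5309),
]


def time_ms(model: str, build: str) -> int:
    """Look up the transcription time for one model/build pair."""
    for m, b, _audio_ms, transcribe_ms in runs:
        if m == model and b == build:
            return transcribe_ms
    raise KeyError((model, build))


def speedup(model: str) -> float:
    """How many times faster the release build is than the debug build."""
    return time_ms(model, "debug") / time_ms(model, "release")


for model in ("tiny.en", "tiny.en-q4_2"):
    print(f"{model}: release is {speedup(model):.1f}x faster than debug")

# Relative cost of the quantized model in the release build.
overhead = time_ms("tiny.en-q4_2", "release") / time_ms("tiny.en", "release") - 1
print(f"q4_2 vs. unquantized (release): {overhead:+.1%}")
```

On these numbers the release build is roughly 7x to 9x faster than debug, while the quantized model is only a few percent slower than the unquantized one in release. That is consistent with the observation that quantization mainly helps APK size here.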

Also, clicking the benchmark button in the Android app crashes it. I don't know the reason since I'm not very familiar with Android.

May 07 '23 05:05 VipinVIP
