Saleh Soleimani
We've already used the Vosk model for real-time processing, but given its low accuracy we really need the quality of Whisper's audio transcription in real time. The inference time, however, is taking...
> Any updates? up
> Use the whisper-v3-turbo model. It's 3x faster than the others; also make sure you enable CUDA (or OpenCL on AMD). Thanks, but I searched and found large-v3-turbo, and I didn't find a...
> Yes, it is possible. We are working on capturing audio from the mic and transcribing it in real time. Any updates?
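For anyone following along, the mic-to-transcript pattern being described usually looks like a producer/consumer loop: one thread keeps recording fixed-size chunks while another transcribes the previous chunk. A minimal sketch below; `record_chunk` and `transcribe` are placeholders for your actual audio backend (e.g. sounddevice) and Whisper binding, not anything from this thread.

```python
import queue
import threading

def stream_transcribe(record_chunk, transcribe, n_chunks):
    """Record and transcribe concurrently: transcription of chunk N
    overlaps with the recording of chunk N+1."""
    chunks = queue.Queue()

    def producer():
        for _ in range(n_chunks):
            chunks.put(record_chunk())  # blocks until a chunk of audio is ready
        chunks.put(None)                # sentinel: no more audio

    threading.Thread(target=producer, daemon=True).start()

    texts = []
    while (chunk := chunks.get()) is not None:
        texts.append(transcribe(chunk))  # runs while the next chunk records
    return texts
```

The point of the queue is that a slow `transcribe` call never blocks the microphone; it only grows the backlog, which you can then bound or drop from.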
> Well, is there any way to prevent and ignore other languages like Arabic, Hindi, etc., and load only one language model, so we can optimize it...
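On the single-language question: Whisper implementations generally let you pin the language up front, which skips the language-detection pass entirely. A hedged sketch below; the `language`/`task` keyword names match the openai-whisper and faster-whisper Python APIs (whisper.cpp exposes the same idea via its `-l`/`--language` flag), and the `model` object is whatever binding you use.

```python
def transcribe_fixed_language(model, audio, language="en"):
    """Transcribe with a pinned language.

    Passing an explicit `language` skips auto-detection, so no compute
    is spent scoring Arabic, Hindi, etc.  `task="transcribe"` keeps the
    output in the spoken language rather than translating to English.
    """
    return model.transcribe(audio, language=language, task="transcribe")
```

Note this pins the decoder's language; it does not shrink the model itself. For a genuinely smaller single-language model you'd pick an `.en` checkpoint such as `small.en`.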
> @salehsoleimani, did you try the 8-bit models? https://huggingface.co/ggerganov/whisper.cpp e.g. large-v3-turbo-q8_0. I guess the currently used model is already a quantized version of Whisper, isn't it? @vilassn
> No, the default model uses float32 or bfloat16 AFAIK. Here are some more quantized models: https://huggingface.co/ctranslate2-4you/distil-whisper-small.en-ct2-bfloat16 Oh, thanks. I'm going to try the 8/5-bit models. Do you know whether these...
> @salehsoleimani, did you check the 8-bit models? Any news? No, actually, I've been told quantization doesn't affect speed much.
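That report is plausible: 8-bit quantization always cuts memory and bandwidth roughly 4x, but whether it speeds up compute depends on the backend's int8 kernels, which is why experiences differ. A toy illustration below (symmetric per-tensor quantization, not whisper.cpp's exact q8_0 scheme) showing the size reduction and the bounded reconstruction error.

```python
import numpy as np

def quantize_q8(w):
    """Symmetric 8-bit quantization of a float32 tensor.

    Maps the largest |weight| to 127 and rounds everything else onto
    the int8 grid; a single float scale recovers approximate values.
    """
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)
q, s = quantize_q8(w)
err = np.abs(dequantize(q, s) - w).max()  # at most half a quantization step
```

The int8 tensor is a quarter the size of the float32 one, which is the guaranteed win; any latency improvement on top of that is hardware- and kernel-dependent.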
> Also try this and report back, please: https://github.com/huggingface/distil-whisper They claim the code runs 6x faster than Whisper large-v3. Sure, thanks, I'm going to try it out.
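Before committing to a model on the strength of a "6x faster" claim, it's worth timing it on your own hardware. A small, model-agnostic harness below; `transcribe` is any callable, e.g. a whisper or distil-whisper pipeline, and taking the best of several runs damps warm-up and scheduling noise.

```python
import time

def time_transcribe(transcribe, audio, repeats=3):
    """Return the best wall-clock time (seconds) over `repeats` runs
    of transcribe(audio)."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        transcribe(audio)
        best = min(best, time.perf_counter() - t0)
    return best
```

Run it once per candidate model on the same clip and compare the numbers; the ratio you measure locally is the only one that matters for the real-time budget.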
> Also, can someone tell me how to convert these to whisper...