Whisper large-v3 differences between whisper.cpp and Python
I'm a bit stuck and maybe someone can help me.
Running the Whisper large-v3 model via whisper.cpp is significantly more efficient than running it through Python, in both VRAM usage and run time. On a large file, the Python implementation was taking 40GB of VRAM (on a Mac Studio).
However, I find that running large-v3 through whisper.cpp can cause weird anomalies and repetitions that I just don't see when running it through Python. Running it through Python gives almost perfect accuracy with no weird hallucinations.
What am I missing? How can they be so different?
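For anyone comparing the two outputs, one rough way to put a number on the repetitions described above is to count runs of identical consecutive lines in each transcript. This is only a minimal sketch: it assumes each transcript is plain text with one segment per line, and the function name `find_repeated_runs`, the `min_repeats` threshold, and the sample lines are hypothetical, not part of whisper.cpp or the Python package.

```python
from itertools import groupby


def find_repeated_runs(lines, min_repeats=3):
    """Return (text, count) for each run of identical consecutive lines."""
    # Normalise whitespace and case so trivial differences do not hide a loop.
    normalised = [" ".join(line.lower().split()) for line in lines]
    runs = []
    for text, group in groupby(normalised):
        count = sum(1 for _ in group)
        # Ignore empty lines and runs shorter than the threshold.
        if text and count >= min_repeats:
            runs.append((text, count))
    return runs


# Hypothetical transcript segments, one per line.
sample = [
    "Hello and welcome.",
    "Thank you.",
    "Thank you.",
    "thank you. ",
    "Let's get started.",
]
print(find_repeated_runs(sample))
```

Running this on the transcript from each implementation gives a crude side-by-side count of repetition loops. It will not catch hallucinations that are not verbatim repeats.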
Medium on whisper.cpp seems to be more accurate and hallucinates less than large-v3
I'm working on resolving this issue. It shouldn't affect performance.
Thanks for the hint, BTW. I think it solved my problem :) [switched to medium]
Interestingly, if I enable -tr (translate) on medium, it also starts hallucinating. It seems to be a problem with silences in a foreign-language context. And for some unknown reason, large-v3 translates even if -tr is disabled.
@xaionaro Yes, your experience with translation is also happening to me. I think I am going to continue using V2 until the V3 problems are solved.