Whisper LargeV3 differences between whisper.cpp & python

Open magnacartatron opened this issue 2 years ago • 4 comments

I'm a bit stuck and maybe someone can help me.

Running the Whisper large-v3 model via whisper.cpp is significantly more efficient than running it through Python, both in VRAM use and in run time. On a large file, the Python implementation was taking 40 GB of VRAM (on a Mac Studio).

However, I find that running large-v3 through whisper.cpp can cause weird anomalies and repetitions that I just don't see when running it through Python. Running it through Python gives almost perfect accuracy with no weird hallucinations.
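To compare the two implementations on more than impressions, the repetitions can be counted. The following is a minimal sketch, assuming each transcript is available as a list of segment strings (one per output line); the function name `find_repeated_runs` and the `min_run` threshold are made up for illustration and are not part of whisper.cpp or the Python package.

```python
from itertools import groupby


def find_repeated_runs(segments, min_run=3):
    """Return (text, count) for every run of at least `min_run`
    identical consecutive segments, ignoring case and extra whitespace."""
    # Normalize so that "Thank you." and "thank  you." count as the same segment.
    normalized = (" ".join(s.lower().split()) for s in segments)
    runs = []
    for text, group in groupby(normalized):
        count = sum(1 for _ in group)
        # Skip empty segments; they are silence, not repeated text.
        if text and count >= min_run:
            runs.append((text, count))
    return runs


if __name__ == "__main__":
    # Hypothetical transcript lines, for illustration only.
    sample = ["Hello there.", "Thank you.", "thank you.", "Thank  you.", "Goodbye."]
    print(find_repeated_runs(sample))
```

Running this on the whisper.cpp output and on the Python output for the same file gives a rough, comparable count of looping segments. It only catches exact consecutive repeats, so it will miss paraphrased hallucinations.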

What am I missing? How are they so different?

Medium on whisper.cpp seems to be more accurate and hallucinates less than large-v3

magnacartatron avatar Feb 02 '24 10:02 magnacartatron

I'm working on resolving this issue. It shouldn't affect performance.

bobqianic avatar Feb 02 '24 16:02 bobqianic

Medium on whisper.cpp seems to be more accurate and hallucinates less than large-v3

Thanks for the hint, BTW. I think it solved my problem :) [switched to medium]


Interestingly, if I enable -tr on medium, it also starts hallucinating. It seems to be a problem with silences in a foreign-language context. And for some unknown reason, large-v3 translates even if -tr is disabled.
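One thing that might be worth ruling out is the language setting. The sketch below builds a whisper.cpp command line that pins the spoken language with `-l` and adds `-tr` only on request. It is a sketch under assumptions: the binary name `./main`, the model path, and the audio path are placeholders, and my understanding that the CLI falls back to English when `-l` is omitted should be checked against the `--help` output of your build.

```python
import shlex


def build_whisper_cpp_cmd(binary, model_path, audio_path, language, translate=False):
    """Build an argument list for the whisper.cpp CLI.

    `-m`, `-f`, `-l` and `-tr` are whisper.cpp CLI flags; the paths passed in
    are whatever the caller has on disk (placeholders in the example below).
    """
    cmd = [binary, "-m", model_path, "-f", audio_path, "-l", language]
    if translate:
        # Only request translation to English when explicitly asked for.
        cmd.append("-tr")
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = build_whisper_cpp_cmd(
        "./main", "models/ggml-large-v3.bin", "audio.wav", language="de"
    )
    print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd)` as-is. Whether pinning the language changes the translation behaviour described here is untested; it is only a cheap variable to eliminate.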

xaionaro avatar Feb 29 '24 19:02 xaionaro

@xaionaro Yes, your experience with translation is also happening to me. I think I am going to continue using V2 until the V3 problems are solved.

RazeBerry avatar Mar 27 '24 16:03 RazeBerry

Sent from my iPad

On 28 Mar 2024, at 3:56 AM, Jeff Koons Hater wrote: @xaionaro Yes your experience with translation is also happening to me. I think I am going to continue using V2 until V3 problems are solved.

magnacartatron avatar Apr 20 '24 00:04 magnacartatron