
whisper.cpp 1.20 produces different output than OpenAI Whisper, with higher WER

Open jordimas opened this issue 1 year ago • 2 comments

Hello!

First, thanks for writing such a great tool.

whisper.cpp: version 1.20
OpenAI Whisper: version openai-whisper-20230124
Model used: medium

Audio file used: https://github.com/jordimas/whisper-cpp-error/raw/main/15GdH9-curt.mp3
OpenAI Whisper transcription: https://raw.githubusercontent.com/jordimas/whisper-cpp-error/main/15GdH9-curt/15GdH9-curt.mp3.txt
whisper.cpp transcription: https://raw.githubusercontent.com/jordimas/whisper-cpp-error/main/15GdH9-curt.wav.txt

I would expect whisper.cpp to produce the same output as OpenAI Whisper when given the same model and input.

Measured against the human-transcribed reference text file, the WER is 28.08 for OpenAI Whisper and 35.86 for whisper.cpp.

If there is anything I can do to help, let me know.

Thanks

jordimas, Feb 05 '23 09:02

Thanks for the data point! How do I calculate WER scores?

ggerganov, Feb 14 '23 17:02

Basically:

  1. I run both tools from the command line (whisper.cpp and the OpenAI Python client).
  2. I use the HuggingFace WER metric module to calculate the difference between each transcription and the expected (reference) file: https://github.com/jordimas/whisper-cpp-error/blob/main/benchmark.py#L37
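
For readers unfamiliar with the metric, here is a minimal, dependency-free sketch of how a word error rate can be computed. It is not the code in benchmark.py, and the function name `word_error_rate` is hypothetical. It assumes plain whitespace tokenization with no normalization of case or punctuation, so its numbers may differ from what the HuggingFace metric reports.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: word-level edit distance / reference word count.

    Assumption: words are split on whitespace only, with no case or
    punctuation normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("a b c d", "a x c d"))
```

Because WER counts substitutions, deletions, and insertions relative to the reference length, both transcriptions should be scored against the same reference file and with the same text normalization for the comparison to be meaningful.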

However, you can also see that the produced files are different.

Thanks

jordimas, Feb 14 '23 19:02