
Confused about recently added distil-whisper support (CPU?)

Open Arche151 opened this issue on Feb 26, 2024 • 2 comments

First of all, I wanna say how thankful I am for faster-whisper and that I'm using it every single day!

I saw that distil-whisper support was added to faster-whisper two days ago. My question: can I use distil-whisper inside faster-whisper on the CPU?

Distil-whisper itself supports CPU, but the faster-whisper docs only mention GPU transcription with distil-whisper.

And lastly, if distil-whisper does work inside faster-whisper on the CPU, how does its performance compare to normal faster-whisper CPU transcription?

Thanks in advance!

Arche151 avatar Feb 26 '24 10:02 Arche151

@Arche151, hello. Yes, you can use the CPU. In my tests, FW Distil large-v2 is about 2x faster than normal FW large-v2. I tested with a 192-second mp3 file:

  • FW Distil large-v2: 85.31s (condition_on_previous_text=False)
  • Normal FW large-v2: 194.67s (condition_on_previous_text=False)
  • Normal FW large-v2: 230.51s (condition_on_previous_text=True)

Note that you should use condition_on_previous_text=False with the Distil model to improve transcription quality (the default is True):

from faster_whisper import WhisperModel

# Load the distilled model on the CPU; jfk_path is the path to the audio file to transcribe
model = WhisperModel('distil-large-v2', device='cpu')
segments, info = model.transcribe(jfk_path, word_timestamps=True, condition_on_previous_text=False)
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))

trungkienbkhn avatar Feb 27 '24 08:02 trungkienbkhn

@trungkienbkhn Thank you so much for the info and the comparisons! Now, I only have to wait for distil-whisper to support large-v3 haha

Arche151 avatar Feb 27 '24 09:02 Arche151