WhisperLive Investigate Word-level timestamps for discarding processed audio

Open makaveli10 opened this issue 9 months ago • 0 comments

In the faster-whisper backend, the current implementation treats the last segment as incomplete, on the assumption that its final word might have been cut off, so that processed audio can be discarded correctly. The idea is to use word-level timestamps instead and keep only the audio for the last word.
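A rough sketch of what this could look like, not the actual WhisperLive code. It assumes the words come from faster-whisper with `word_timestamps=True`, where each word carries `start`, `end` and `word`, and that the timestamps are relative to the start of the current audio buffer. The `Word` class and the helpers `split_at_last_word` and `trim_buffer` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Word:
    # Stand-in for a faster-whisper word entry (hypothetical local type).
    start: float  # seconds, assumed relative to the start of the buffer
    end: float
    word: str


def split_at_last_word(words: List[Word]) -> Tuple[str, float]:
    """Return (confirmed_text, discard_seconds).

    Every word except the last is treated as complete. The last word may
    be cut off, so the audio from its start onward is kept.
    """
    if len(words) < 2:
        # Nothing can be confirmed yet, so keep the whole buffer.
        return "", 0.0
    confirmed = words[:-1]
    text = "".join(w.word for w in confirmed).strip()
    return text, words[-1].start


def trim_buffer(samples: Sequence, discard_seconds: float,
                sample_rate: int = 16000) -> Sequence:
    """Drop the processed audio from the front of the buffer."""
    offset = int(discard_seconds * sample_rate)
    return samples[offset:]


if __name__ == "__main__":
    words = [
        Word(0.0, 0.4, " hello"),
        Word(0.5, 0.9, " world"),
        Word(1.0, 1.2, " fo"),  # possibly a truncated word
    ]
    text, discard = split_at_last_word(words)
    remaining = trim_buffer(list(range(32000)), discard)
    print(text, discard, len(remaining))
```

Compared with holding back the whole last segment, this would keep only the audio from the start of the last word, so less audio gets re-transcribed on the next pass. Whether the word timestamps are accurate enough for that is the part to investigate.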

Jan 21 '25 08:01 makaveli10
