WhisperLive
Investigate word-level timestamps for discarding processed audio
In the faster-whisper backend, the current implementation treats the last segment as incomplete, on the assumption that its final word may have been cut off, so that processed audio can be discarded correctly. The idea is to use word-level timestamps instead and keep only the audio for the last word.
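A rough sketch of the idea, not the actual WhisperLive code. It assumes the words of the current transcription window are available as objects with `start`, `end`, and `word` fields (as faster-whisper returns when `word_timestamps=True` is passed to `transcribe`). The `Word` dataclass, the `split_at_last_word` helper, and the 16 kHz sample rate are hypothetical stand-ins for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    """Hypothetical stand-in for a word-level timestamp entry."""
    start: float  # seconds from the start of the audio buffer
    end: float    # seconds from the start of the audio buffer
    word: str     # token text, usually with a leading space


def split_at_last_word(
    words: List[Word], sample_rate: int = 16000
) -> Tuple[str, float, int]:
    """Commit every word except the last and report how much audio to drop.

    Returns (committed_text, trim_seconds, trim_samples). Only the last
    word is treated as possibly cut off, so the buffer is trimmed up to
    the start of that word instead of the start of the last segment.
    """
    if len(words) < 2:
        # Nothing can be committed safely; keep the whole buffer.
        return "", 0.0, 0

    committed = words[:-1]
    trim_seconds = words[-1].start
    text = "".join(w.word for w in committed).strip()
    return text, trim_seconds, int(trim_seconds * sample_rate)


if __name__ == "__main__":
    window = [
        Word(0.0, 0.4, " Hello"),
        Word(0.5, 0.9, " world"),
        Word(1.0, 1.3, " thi"),  # possibly truncated word
    ]
    text, seconds, samples = split_at_last_word(window)
    print(text, seconds, samples)
```

Compared with holding back the whole last segment, this would retain at most one word of audio between iterations. Whether the word boundaries are accurate enough for that is the part that needs investigating.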