whisper.cpp

Label different speakers

Open savchenko opened this issue 1 year ago • 1 comments

Might be a stretch, but would it be possible to label different speakers if the audio has more than one person talking?

This would come in handy for conference recordings with multiple presenters, etc.

savchenko, Nov 29 '22 07:11

Thinking about a possible implementation, the simplest approach might be to label speakers based on the audio channel.

Say we have a stereo recording:

  1. Recognize L/R separately
  2. Label accordingly
  3. Done
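
The steps above could be sketched roughly like this in Python. This is only an illustration of the idea, not how whisper.cpp does it: `transcribe` is a hypothetical callable that takes mono samples and returns `(start, end, text)` segments, and the `Speaker L` / `Speaker R` labels are made up for the example.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical segment shape: (start_seconds, end_seconds, text)
Segment = Tuple[float, float, str]
# Hypothetical recognizer: mono samples in, timed segments out
Transcriber = Callable[[Sequence[float]], List[Segment]]


def split_stereo(interleaved: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Split interleaved stereo samples (L, R, L, R, ...) into two mono lists."""
    if len(interleaved) % 2 != 0:
        raise ValueError("interleaved stereo data must have an even sample count")
    return list(interleaved[0::2]), list(interleaved[1::2])


def label_by_channel(
    interleaved: Sequence[float],
    transcribe: Transcriber,
    labels: Tuple[str, str] = ("Speaker L", "Speaker R"),
) -> List[Tuple[float, float, str, str]]:
    """Recognize each channel separately and tag its segments with a label."""
    left, right = split_stereo(interleaved)
    labelled = []
    for label, channel in zip(labels, (left, right)):
        # Step 1: recognize this channel on its own
        for start, end, text in transcribe(channel):
            # Step 2: label the segment with the channel it came from
            labelled.append((start, end, label, text))
    # Merge both channels into one timeline, ordered by start time
    labelled.sort(key=lambda seg: (seg[0], seg[1]))
    return labelled
```

This only works when each speaker is recorded on their own channel, e.g. a two-microphone interview. A mono conference recording would need proper diarization.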

savchenko, Nov 29 '22 12:11

Stereo diarization is already implemented; see #64. I have some other ideas in mind for general diarization, but they are low priority for the moment.

ggerganov, Dec 01 '22 17:12