whisper.cpp
Label different speakers
Might be a stretch, but would it be possible to label different speakers if the audio has more than one person talking?
This would come in handy for conference recordings with multiple presenters, etc.
Thinking about possible implementation, the simplest one might be to label based on the audio channel.
Say we have a stereo recording:
- Recognize L/R separately
- Label accordingly
- Done
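The three steps above could be sketched roughly as follows. This is only an illustration of the idea, not how whisper.cpp does it: `split_stereo` and `label_by_channel` are hypothetical helper names, and `transcribe` stands in for whatever recognizer is used, assumed here to take mono samples and return `(start, end, text)` segments.

```python
from typing import Callable, List, Sequence, Tuple

# (start_seconds, end_seconds, text) for one recognized segment
Segment = Tuple[float, float, str]
# (channel_label, start_seconds, end_seconds, text)
LabeledSegment = Tuple[str, float, float, str]


def split_stereo(interleaved: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Split interleaved stereo samples [L0, R0, L1, R1, ...] into two mono lists."""
    if len(interleaved) % 2 != 0:
        raise ValueError("interleaved stereo data must have an even sample count")
    return list(interleaved[0::2]), list(interleaved[1::2])


def label_by_channel(
    interleaved: Sequence[float],
    transcribe: Callable[[List[float]], List[Segment]],
) -> List[LabeledSegment]:
    """Recognize each channel separately and tag every segment with its channel.

    `transcribe` is a hypothetical callable supplied by the caller; it is
    not part of any real library API.
    """
    left, right = split_stereo(interleaved)
    labeled: List[LabeledSegment] = []
    for label, channel in (("L", left), ("R", right)):
        for start, end, text in transcribe(channel):
            labeled.append((label, start, end, text))
    # Merge both channels into a single timeline ordered by start time.
    labeled.sort(key=lambda item: item[1])
    return labeled


if __name__ == "__main__":
    # Stub recognizer for demonstration only: it fakes one segment per channel.
    def fake_transcribe(samples: List[float]) -> List[Segment]:
        if samples and samples[0] > 0:
            return [(0.0, 1.0, "hello")]
        return [(1.0, 2.0, "hi there")]

    for label, start, end, text in label_by_channel([0.5, -0.5, 0.5, -0.5], fake_transcribe):
        print(f"[{start:.1f} -> {end:.1f}] ({label}) {text}")
```

This only works when each speaker is recorded on their own channel; if both voices bleed into both channels, the labels will not mean much.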
Stereo diarization is already implemented; see #64.
I have some other ideas in mind for general diarization, but they are low-priority for the moment.