whisperx implementation

Open Infinitay opened this issue 1 year ago • 0 comments

Are you familiar with whisperX? Would it be possible to support using whisperX's model instead of Whisper? I took a look at doing this with the plugins, but to be honest I wasn't sure how to go about it. The maintainer recently committed support for their batched inference, allowing for (nearly) real-time transcripts, along with faster-whisper. I think it would be nice to have for its (phoneme-based?) alignment, in addition to other things such as diarization.

I personally don't use whispering for VRChat, but I suppose whisperX's diarization would be beneficial there: for example, in a lobby where many people are talking, you could at least distinguish different speakers even if there are no labels. If you decide to support whisperX, it would be cool to dynamically support adding labels in the future, either via the web app (maybe changing the background color of the transcription per speaker) or your UI.
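To illustrate the speaker-coloring idea, here is a rough sketch in Python. It assumes diarized segments arrive as dicts carrying a `speaker` label such as `SPEAKER_00`; the `color_segments` helper, the `bg_color` key, and the palette are made up for illustration and are not part of whisperX or whispering.

```python
from itertools import cycle

# Hypothetical palette; any list of CSS colors would do.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]


def color_segments(segments, palette=PALETTE):
    """Attach a stable background color to each segment based on its speaker label.

    Colors are handed out in order of first appearance and reused when the
    palette runs out, so unlabeled speakers can still be told apart.
    """
    colors = {}
    palette_cycle = cycle(palette)
    colored = []
    for seg in segments:
        # Segments without a speaker label fall into a shared "UNKNOWN" bucket.
        speaker = seg.get("speaker", "UNKNOWN")
        if speaker not in colors:
            colors[speaker] = next(palette_cycle)
        colored.append({**seg, "bg_color": colors[speaker]})
    return colored


# Assumed input shape: one dict per transcribed segment.
segments = [
    {"start": 0.0, "end": 1.2, "text": "Hey, can you hear me?", "speaker": "SPEAKER_00"},
    {"start": 1.3, "end": 2.0, "text": "Yep, loud and clear.", "speaker": "SPEAKER_01"},
    {"start": 2.1, "end": 3.4, "text": "Great.", "speaker": "SPEAKER_00"},
]
colored = color_segments(segments)
```

The web app or UI could then read `bg_color` when rendering each line, and a user-supplied name could later replace the raw speaker label without changing the color mapping.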

Infinitay • May 28 '23 03:05