whisper-diarization

Diarization sometimes assigns speakers 0 and 1 incorrectly

Open manjunath7472 opened this issue 1 year ago • 2 comments

Transcription is good, but the diarization speaker labels are sometimes wrong: speaker 0 is mapped as speaker 1 further down the line, and vice versa. I am using an Indian English conversation as audio input. It is an online conversation between a teacher and a student. Could you suggest any more precise methods or alterations? Are any NeMo configs available other than telephonic? Does it require additional training for the Indian English accent? Could anyone suggest a near-perfect pipeline for this? @MahmoudAshraf97
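To pin down where the labels flip, a rough check like the one below might help. It is only a sketch, not part of whisper-diarization or NeMo: it assumes exactly two speakers labelled 0 and 1, diarization output already converted to `(start, end, label)` tuples in seconds, and a short hand-labelled reference in the same shape. The function names and the 30-second window are made up for illustration.

```python
def overlap(a, b):
    """Seconds shared by two (start, end) intervals."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def agreement(hyp, ref, window, swap):
    """Seconds inside `window` where hypothesis and reference labels agree.

    If `swap` is True, hypothesis labels 0 and 1 are exchanged first.
    """
    total = 0.0
    for h_start, h_end, h_label in hyp:
        label = 1 - h_label if swap else h_label
        # Clip the hypothesis segment to the window being inspected.
        clipped = (max(h_start, window[0]), min(h_end, window[1]))
        if clipped[1] <= clipped[0]:
            continue
        for r_start, r_end, r_label in ref:
            if r_label == label:
                total += overlap(clipped, (r_start, r_end))
    return total


def swapped_windows(hyp, ref, duration, step=30.0):
    """Return the windows where exchanging labels 0 and 1 fits the reference better."""
    flagged = []
    t = 0.0
    while t < duration:
        window = (t, min(t + step, duration))
        if agreement(hyp, ref, window, swap=True) > agreement(hyp, ref, window, swap=False):
            flagged.append(window)
        t += step
    return flagged


# Toy data (hypothetical): the labels are exchanged from 20 s onwards.
ref = [(0.0, 10.0, 0), (10.0, 20.0, 1), (20.0, 30.0, 0), (30.0, 40.0, 1)]
hyp = [(0.0, 10.0, 0), (10.0, 20.0, 1), (20.0, 30.0, 1), (30.0, 40.0, 0)]
print(swapped_windows(hyp, ref, duration=40.0, step=20.0))
```

If only a few windows are flagged, the segmentation is probably fine and the problem is label consistency over time rather than who-spoke-when boundaries.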

manjunath7472 • Oct 19 '23 09:10

It seems the diarization from NeMo is not good enough. Has anyone else had the same problem?

v-nhandt21 • Nov 01 '23 09:11

This might be the solution. https://github.com/google/uis-rnn

manjunath7472 • Nov 09 '23 06:11