whisper-diarization
Diarization sometimes swaps the speaker 0 and speaker 1 labels
The transcription is good, but the diarization speaker labels are sometimes wrong: speaker 0 gets mapped to speaker 1 further down the transcript, and vice versa. I am using an Indian English conversation as the audio input. It is an online session between a teacher and a student. Could you suggest any more precise methods or alterations? Are there any other NeMo configs available besides the telephonic one? Does it require any additional training for the Indian English accent? Could anyone suggest a near-perfect pipeline for this? @MahmoudAshraf97
It seems the diarization from NeMo is not good enough. Has anyone else run into the same problem?
This might be the solution. https://github.com/google/uis-rnn
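Before switching models, it may help to measure how often speaker 0 and speaker 1 actually get swapped, so that different configs can be compared on the same recording. The following is only a minimal sketch, not part of whisper-diarization or NeMo: it assumes exactly two speakers labeled 0 and 1, and that you have hand-labeled a short stretch of the audio so that `reference` and `hypothesis` are aligned label lists (one entry per word or segment). The function names `swap_flags` and `swapped_fraction` and the `window` parameter are hypothetical.

```python
from typing import List


def swap_flags(reference: List[int], hypothesis: List[int], window: int = 5) -> List[bool]:
    """Flag positions where the local window matches the reference better
    with labels 0 and 1 swapped than with the labels as given.

    Assumes two speakers labeled 0 and 1 and position-aligned lists.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")

    half = window // 2
    flags = []
    for i in range(len(reference)):
        lo = max(0, i - half)
        hi = min(len(reference), i + half + 1)
        pairs = list(zip(reference[lo:hi], hypothesis[lo:hi]))
        # Agreement if the hypothesis labels are taken as-is.
        same = sum(r == h for r, h in pairs)
        # Agreement if the hypothesis labels are flipped (0 <-> 1).
        swapped = sum(r == 1 - h for r, h in pairs)
        flags.append(swapped > same)
    return flags


def swapped_fraction(reference: List[int], hypothesis: List[int], window: int = 5) -> float:
    """Fraction of positions whose local window looks label-swapped."""
    flags = swap_flags(reference, hypothesis, window)
    return sum(flags) / len(flags) if flags else 0.0


if __name__ == "__main__":
    # Toy data: the second half of the hypothesis has the labels flipped.
    ref = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0]
    hyp = [0, 0, 1, 1, 0, 1, 0, 0, 1, 1]
    print(swap_flags(ref, hyp, window=3))
    print(swapped_fraction(ref, hyp, window=3))
```

A long run of flagged positions points to a consistent label flip from that point on, while isolated flags look more like ordinary per-segment errors.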