
speaker diarization

Open tomchang25 opened this issue 1 year ago • 3 comments

tomchang25, Mar 08 '23 10:03

I have tested pyannote-audio for speaker diarization. The error rate is about 30%, and it needs a lot of extra installation steps. On the other hand, segmentation (and VAD) works pretty well. I'll put speaker diarization on hold until the beta version is complete.

tomchang25, Mar 11 '23 16:03

A successful example of Whisper + speaker diarization:

https://github.com/MahmoudAshraf97/whisper-diarization

tomchang25, Mar 12 '23 06:03

Another example: https://huggingface.co/spaces/vumichien/Whisper_speaker_diarization/blob/main/app.py

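    import os

    import torch
    from pyannote.audio.pipelines.speaker_verification import PretrainedSpeakerEmbedding
    from transformers import pipeline

    # Excerpt: MODEL_NAME and lang are defined elsewhere in app.py.
    # Use the first GPU if one is available, otherwise fall back to the CPU.
    device = 0 if torch.cuda.is_available() else "cpu"

    # Whisper ASR pipeline that processes audio in 30-second chunks.
    pipe = pipeline(
        task="automatic-speech-recognition",
        model=MODEL_NAME,
        chunk_length_s=30,
        device=device,
    )
    os.makedirs('output', exist_ok=True)
    # Force the decoder to transcribe in the given language.
    pipe.model.config.forced_decoder_ids = pipe.tokenizer.get_decoder_prompt_ids(language=lang, task="transcribe")

    # Speaker embedding model used for the diarization step.
    embedding_model = PretrainedSpeakerEmbedding(
        "speechbrain/spkrec-ecapa-voxceleb",
        device=torch.device("cuda" if torch.cuda.is_available() else "cpu"))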
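For reference, one common way to combine the two outputs is to give each transcript segment the speaker whose diarization turns overlap it the most. The snippet below is only a rough sketch of that idea, not what either linked project necessarily does. `Segment`, `Turn`, and `assign_speakers` are hypothetical names made up for illustration, and the timestamps are assumed to be in seconds.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A transcript segment (hypothetical shape: start/end in seconds plus text)."""
    start: float
    end: float
    text: str


@dataclass
class Turn:
    """A diarization turn (hypothetical shape: start/end in seconds plus speaker label)."""
    start: float
    end: float
    speaker: str


def overlap(a_start, a_end, b_start, b_end):
    """Return the length in seconds of the intersection of two intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns, unknown="UNKNOWN"):
    """Label each segment with the speaker that overlaps it the longest."""
    labeled = []
    for seg in segments:
        totals = {}
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                # Sum overlap per speaker, since one speaker may have several turns.
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        # Segments that no turn overlaps get the placeholder label.
        speaker = max(totals, key=totals.get) if totals else unknown
        labeled.append((speaker, seg.text))
    return labeled
```

With this approach a segment that straddles a speaker change still gets a single label, so short segments (or word-level timestamps) should give cleaner results.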


tomchang25, Apr 16 '23 06:04