insanely-fast-whisper
[Discussion] Speaker diarisation options
Currently, we are leveraging Pyannote's speaker diarisation. However, there is still scope for improvement here, and we should be able to leverage other open-source packages like NVIDIA NeMo.
I'd like to know whether the community has any experience with this, and how pyannote and NeMo compare for diarisation.
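For reference, here is a minimal sketch of how the pyannote diarisation pipeline is typically driven; the pipeline id, token handling and file name are assumptions, so check the pyannote.audio docs for the exact model you use:

```python
# Minimal pyannote diarisation sketch (pipeline id and token are assumptions).
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",  # assumed pretrained pipeline id
    use_auth_token="hf_xxx",             # HF token, needed at least for the first download
)

diarization = pipeline("audio.wav")

# Each track is a (segment, track_id, speaker_label) triple.
for segment, _, speaker in diarization.itertracks(yield_label=True):
    print(f"{segment.start:.1f}s - {segment.end:.1f}s: {speaker}")
```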
Copying a comment from #46
Might I suggest using the NeMo toolkit instead? It seems to avoid pyannote's requirement of a Hugging Face token to access the model. @omarsiddiqi224 posted a link to a repository that relies on NeMo instead of pyannote.
My two cents :-) One can actually use pyannote pretrained models without Hugging Face authentication. As soon as the model has been downloaded and cached once (yes, this needs an HF token), you no longer need the token for subsequent calls.

@Vaibhavs10, you might want to add an `insanely-fast-whisper download --hf-token ...` command to do just that (= download and cache the models once and for all). Subsequent calls to `insanely-fast-whisper` would then use this cached version...

The only reason for this HF token thing is for me to know a bit more about my user base. I am completely blind without this. Thanks for your understanding.
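A rough sketch of what such a `download` subcommand could do under the hood, assuming it only needs to pre-populate the local Hugging Face cache; the command name and repo ids below are hypothetical, not an existing CLI:

```python
# Hypothetical "download once, run offline afterwards" helper -- a sketch of
# what an `insanely-fast-whisper download --hf-token ...` subcommand could do.
from huggingface_hub import snapshot_download

def cache_diarisation_models(hf_token: str) -> None:
    # Pull the pyannote pipeline (and its segmentation model) into the local
    # HF cache so later calls don't need the token again. Repo ids are assumptions.
    for repo_id in ("pyannote/speaker-diarization-3.1", "pyannote/segmentation-3.0"):
        snapshot_download(repo_id, token=hf_token)

cache_diarisation_models("hf_xxx")
```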
Yeah! Makes sense! Adding an HF Token, in my opinion, is not much of an inconvenience. I'd rework the overall API a bit more over the weekend to make it easier for people to use.
Looking at the codebase, do you have any suggestions for me to make the diarisation process even faster btw?
Here is one repo: https://github.com/MahmoudAshraf97/whisper-diarization
Also, for diarization you haven't updated the README file, and it's not working in Colab either.
@Vaibhavs10 @akashAD98 Can you please provide example code or a notebook for diarization? Something like this one: https://github.com/Vaibhavs10/insanely-fast-whisper/blob/main/notebooks/infer_faster_whisper_large_v2.ipynb
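Until an official notebook lands, here is a hedged sketch of pairing Whisper chunk timestamps with pyannote speaker turns. The model ids, token, device and the overlap heuristic are assumptions, not the project's actual implementation:

```python
# Rough sketch: transcribe with Whisper, diarise with pyannote, then assign
# each transcribed chunk to the speaker whose turn overlaps it the most.
import torch
from transformers import pipeline
from pyannote.audio import Pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3",   # assumed model id
    torch_dtype=torch.float16,
    device="cuda:0",                   # assumes a CUDA GPU is available
)
diarizer = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1", use_auth_token="hf_xxx"
)

audio = "audio.wav"
asr_out = asr(audio, chunk_length_s=30, return_timestamps=True)
diarization = diarizer(audio)

def speaker_for(start, end):
    # Pick the speaker label with the largest temporal overlap with the chunk.
    best, best_overlap = "UNKNOWN", 0.0
    for turn, _, label in diarization.itertracks(yield_label=True):
        overlap = min(end, turn.end) - max(start, turn.start)
        if overlap > best_overlap:
            best, best_overlap = label, overlap
    return best

for chunk in asr_out["chunks"]:
    start, end = chunk["timestamp"]
    if end is None:  # the last chunk can have an open-ended timestamp
        end = start
    print(f"[{speaker_for(start, end)}] {chunk['text']}")
```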
Do we have any updates on running Pyannote locally? Would it be possible to download the model and run it locally?
@omarsiddiqi224 - It does already. After downloading the weights the first time, it should work locally without needing to pass the token.
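For illustration, once the pipeline has been cached, a call like the following should work without passing the token again (assuming the same pipeline id as above has already been downloaded once):

```python
# After the first download, the cached pipeline can usually be loaded token-free.
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-3.1")
diarization = pipeline("audio.wav")
```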