Why did you use whisperX for loading? Is that method especially efficient?
I am working on the same subject; you can find work done by Majdoddin here: https://github.com/Majdoddin/nlp. It is not perfect, but it is a good way to start. I'll push my solution when...
If it happens, it will be linked to the faster_whisper implementation rather than to the whisperX side. Good news: there is an open issue on the faster_whisper repo: https://github.com/guillaumekln/faster-whisper/issues/533
You can fine-tune faster_whisper models. Just fine-tune a regular Whisper model and you will be able to pass those weights to a faster_whisper model.
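Here is a minimal sketch of that workflow, assuming a fine-tuned Hugging Face Whisper checkpoint saved under `./whisper-finetuned` (a hypothetical path) and that `transformers`, `ctranslate2`, and `faster-whisper` are installed; the checkpoint is converted to the CTranslate2 format that faster_whisper loads:

```python
# Sketch only: assumes a fine-tuned HF Whisper checkpoint at ./whisper-finetuned
# (hypothetical path) with its tokenizer files saved alongside the model.
from ctranslate2.converters import TransformersConverter
from faster_whisper import WhisperModel

# Convert the fine-tuned Transformers checkpoint to CTranslate2 format,
# copying the tokenizer/preprocessor files so faster_whisper can find them.
converter = TransformersConverter(
    "./whisper-finetuned",
    copy_files=["tokenizer.json", "preprocessor_config.json"],
)
converter.convert("./whisper-finetuned-ct2", quantization="float16")

# Load the converted weights with faster_whisper and transcribe a placeholder file.
model = WhisperModel("./whisper-finetuned-ct2")
segments, info = model.transcribe("audio.wav")
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
```

The same conversion can also be done from the command line with the `ct2-transformers-converter` tool that ships with CTranslate2, which is the approach documented in the faster-whisper README.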
When using int8 on a Mac with an M2, whisperX responds with `TypeError: TranscriptionOptions.__new__() missing 3 required positional arguments: 'max_new_tokens', 'clip_timestamps', and 'hallucination_silence_threshold'`
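For reference, a minimal repro sketch of the call pattern that appears to trigger this, assuming the standard whisperX API and a placeholder audio file name:

```python
# Repro sketch (assumed setup): whisperX on an M2 Mac, CPU backend, int8 compute type.
# "audio.wav" is a placeholder file name.
import whisperx

# The TypeError seems to surface here, when whisperX builds faster_whisper's
# TranscriptionOptions without the newer fields the installed faster-whisper expects.
model = whisperx.load_model("large-v2", device="cpu", compute_type="int8")

audio = whisperx.load_audio("audio.wav")
result = model.transcribe(audio, batch_size=8)
print(result["segments"])
```

This error usually points to a version mismatch: newer faster-whisper releases added those three fields to `TranscriptionOptions`, so pinning faster-whisper to the version your whisperX release was tested against (or upgrading whisperX) tends to resolve it.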