Comments by Hervé BREDIN (270 results)

What are the alternatives? Feel free to open a PR.

Fine-tuning the speaker embedding is currently not implemented, as `pyannote` relies on external libraries for that part. You can, however, tune the clustering threshold for your use case. [This tutorial](https://github.com/pyannote/pyannote-audio/blob/develop/tutorials/adapting_pretrained_pipeline.ipynb) may...
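For what it's worth, here is a minimal sketch of how the clustering threshold of a pretrained pipeline could be overridden; the checkpoint name, the token placeholder, and the threshold value are assumptions to adapt to your own setup:

```python
from pyannote.audio import Pipeline

# assumption: the pyannote/speaker-diarization-3.1 checkpoint and a valid Hugging Face token
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1", use_auth_token="YOUR_HF_TOKEN"
)

# start from the pretrained hyper-parameters and only change the clustering threshold
params = pipeline.parameters(instantiated=True)
params["clustering"]["threshold"] = 0.75  # tune this value on your own development set
pipeline.instantiate(params)

diarization = pipeline("audio.wav")
```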

Plaquet's paper comes with a companion repository (https://github.com/FrenchKrab/IS2023-powerset-diarization) that does include a pipeline based on `speechbrain` ECAPA-TDNN.
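Untested sketch of what such a pipeline could look like in `pyannote.audio` with a `speechbrain` ECAPA-TDNN embedding; the model names and hyper-parameter values below are assumptions, and the companion repository remains the reference for the exact configuration:

```python
from pyannote.audio.pipelines import SpeakerDiarization

# model names are assumptions; a Hugging Face token may be required for gated models
pipeline = SpeakerDiarization(
    segmentation="pyannote/segmentation-3.0",
    embedding="speechbrain/spkrec-ecapa-voxceleb",
    clustering="AgglomerativeClustering",
)

# hyper-parameter values are placeholders, not the ones tuned in the companion repository
pipeline.instantiate({
    "segmentation": {"min_duration_off": 0.0},
    "clustering": {"method": "centroid", "min_cluster_size": 12, "threshold": 0.7},
})

diarization = pipeline("audio.wav")
```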

What you are looking for is speaker separation, not speaker diarization. `pyannote` does not do that... yet... but we are working on it! In the meantime, you might want to...

You may want to try reducing `pipeline.embedding_batch_size`, which [defaults to 32](https://huggingface.co/pyannote/speaker-diarization-3.1/blob/eb9d8dd72c3ae9de0c77346f4254dfb62d861cb3/config.yaml#L8).
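A minimal sketch, assuming the `pyannote/speaker-diarization-3.1` checkpoint and a placeholder token:

```python
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1", use_auth_token="YOUR_HF_TOKEN"
)

# smaller batches trade some speed for a lower peak memory footprint; 8 is an arbitrary example
pipeline.embedding_batch_size = 8

diarization = pipeline("audio.wav")
```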

`pyannote` relies on `torchaudio` to read audio files. If `torchaudio.load` can load a file, it is supported. If `torchaudio.load` cannot load a file, it is not supported.
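In other words, a quick way to check support for a given file (the path is just an example):

```python
import torchaudio

# if this call succeeds, pyannote should be able to process the file;
# if it raises an error, the format is not supported
waveform, sample_rate = torchaudio.load("audio.m4a")
print(waveform.shape, sample_rate)
```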

You can definitely use `pyannote.audio` to train such a model. See [this tutorial](https://github.com/pyannote/pyannote-audio/blob/develop/tutorials/add_your_own_task.ipynb). However, you'll need labeled training data.
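As a rough illustration of the training loop (using the built-in voice activity detection task as a stand-in, with a placeholder protocol name pointing to your own labeled data):

```python
import pytorch_lightning as pl
from pyannote.database import FileFinder, get_protocol
from pyannote.audio.tasks import VoiceActivityDetection
from pyannote.audio.models.segmentation import PyanNet

# "MyDatabase.SpeakerDiarization.MyProtocol" is a placeholder: it must be declared
# in your own pyannote.database configuration and point to labeled audio files
protocol = get_protocol(
    "MyDatabase.SpeakerDiarization.MyProtocol", preprocessors={"audio": FileFinder()}
)

task = VoiceActivityDetection(protocol, duration=2.0, batch_size=32)
model = PyanNet(task=task)

trainer = pl.Trainer(max_epochs=1)
trainer.fit(model)
```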

I have no plans to train such a model... but one should never say never ;-)