pyannote-audio: Is it possible to convert the model to ONNX and then use it in C++?

Is it possible to convert the model to ONNX and then use it in C++ for speaker diarization? Thanks.

Apr 11 '23 07:04 leohuang2013

We found the following entry in the FAQ which you may find helpful:

Does pyannote support streaming speaker diarization?

Feel free to close this issue if you found an answer in the FAQ. Otherwise, please give us a little time to review.

This is an automated reply, generated by FAQtory

Apr 11 '23 07:04 github-actions[bot]

https://github.com/pengzhendong/pyannote-onnx

Jul 20 '23 03:07 pengzhendong

@pengzhendong

I am looking to implement the speaker-diarization of pyannote with ONNX. I've been referring to this link: https://github.com/pengzhendong/pyannote-onnx. However, the repository linked doesn't seem to have the speaker-diarization output implemented.

I want to make the necessary adjustments myself, but pyannote's speaker-diarization operates by loading multiple models. Considering this, I'm unsure how to proceed with the modifications. I would appreciate it if you could provide me with advice or instructions on the specific steps or methods to follow.

Aug 04 '23 05:08 kfsky

@kfsky Could you provide a link showing where "pyannote's speaker-diarization operates by loading multiple models"?

Aug 04 '23 06:08 pengzhendong

@pengzhendong I have been referring to this notebook: https://github.com/pyannote/pyannote-audio/blob/develop/tutorials/applying_a_pipeline.ipynb. When executing the following section of the notebook, multiple models get downloaded:

from pyannote.audio import Pipeline
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization@develop", use_auth_token=True)

Therefore, I believe these multiple models are necessary for the conversion to ONNX. Is my understanding incorrect?

Aug 04 '23 07:08 kfsky

@kfsky There are two models:

https://huggingface.co/pyannote/segmentation
https://huggingface.co/speechbrain/spkrec-ecapa-voxceleb

The first one is used to segment the audio (pyannote-onnx does the same thing). The second one is used to get the embeddings of the segments.
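
To make the division of labour concrete, here is a minimal sketch of how the outputs of the two models could be combined once both run outside PyTorch. It assumes you already have segment boundaries from the segmentation model and one embedding vector per segment from the embedding model. The function names (`cosine`, `assign_speakers`) and the 0.7 threshold are hypothetical, and the greedy clustering is a simplified stand-in, not what pyannote's pipeline actually does; the real clustering logic lives in speaker_diarization.py.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_speakers(segments, embeddings, threshold=0.7):
    """Greedy clustering (hypothetical helper, not pyannote's algorithm).

    segments:   list of (start, end) tuples from the segmentation model
    embeddings: list of vectors, one per segment, from the embedding model
    threshold:  assumed similarity cut-off for reusing an existing speaker
    """
    # One running sum vector per speaker. Cosine similarity ignores scale,
    # so comparing against the sum is equivalent to comparing against the mean.
    cluster_sums = []
    labels = []
    for emb in embeddings:
        best, best_sim = None, threshold
        for idx, total in enumerate(cluster_sums):
            sim = cosine(emb, total)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            # No existing speaker is similar enough: open a new cluster.
            cluster_sums.append(list(emb))
            best = len(cluster_sums) - 1
        else:
            # Fold this embedding into the matched speaker's running sum.
            cluster_sums[best] = [t + e for t, e in zip(cluster_sums[best], emb)]
        labels.append(best)
    return [
        (start, end, f"SPEAKER_{label:02d}")
        for (start, end), label in zip(segments, labels)
    ]


# Toy data: three segments with 2-dimensional embeddings.
segments = [(0.0, 1.5), (1.5, 3.0), (3.0, 4.2)]
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
result = assign_speakers(segments, embeddings)
```

With the toy data, the first and third segments end up with the same label because their embeddings point in nearly the same direction. In an ONNX port, only the two neural networks need to be exported; a step like this one is plain array code that can be rewritten directly in C++ or Rust.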

Aug 04 '23 07:08 pengzhendong

@pengzhendong

"The second one is used to get the embeddings of the segments."

Could you possibly share some ideas on the steps to follow when incorporating the second model into pyannote-onnx?

Aug 04 '23 08:08 kfsky

@kfsky Please refer to this file: https://github.com/pyannote/pyannote-audio/blob/develop/pyannote/audio/pipelines/speaker_diarization.py

Aug 04 '23 08:08 pengzhendong

@kfsky Did you manage to export the whole diarization pipeline to ONNX?

Aug 22 '23 09:08 mark95

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Feb 18 '24 14:02 stale[bot]

I'm also looking to convert the pyannote model to ONNX format and then use it from Rust with ort. Did anyone manage to use it in C++?

Jun 29 '24 18:06 thewh1teagle
