Maintaining state across file audio source chunks
Hello,
I am looking for a way to do chunk-based inference on audio files instead of streaming inference. The issue is that each audio file currently gets its own inference run and therefore its own state (new speaker embeddings), which is unwanted behaviour for my program.
How can I achieve the desired behaviour of running inference on larger chunks of audio (such as 20 seconds) while keeping the pipeline state across chunks?
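For illustration, here is a minimal sketch of one way this could work: reuse a single `SpeakerDiarization` pipeline object for every chunk file, since the pipeline instance holds the clustering state (speaker embeddings). The file names are placeholders, and whether `StreamingInference` resets the pipeline between runs should be verified against your diart version; if it does, calling the pipeline directly on chunks would be the alternative.

```python
from diart import SpeakerDiarization
from diart.inference import StreamingInference
from diart.sources import FileAudioSource

# One pipeline instance -> one shared clustering state (speaker embeddings)
pipeline = SpeakerDiarization()

# Hypothetical 20-second chunk files produced by splitting a longer recording
for path in ["chunk_000.wav", "chunk_001.wav", "chunk_002.wav"]:
    source = FileAudioSource(path, sample_rate=pipeline.config.sample_rate)
    inference = StreamingInference(pipeline, source, show_progress=False)
    prediction = inference()  # speaker-labeled annotation for this chunk
    print(path, prediction)

# To start over for an unrelated recording, clear the state explicitly:
# pipeline.reset()
```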
Hi @Aduomas, given your description, do you actually require a streaming pipeline? It looks like pyannote.audio could achieve what you want.
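If streaming is not actually needed, an offline pyannote.audio pipeline processes the whole file in one run, so all chunks share a single set of speakers by construction. A minimal sketch, assuming a pretrained checkpoint name and a placeholder Hugging Face token (check the model hub for the current checkpoint):

```python
from pyannote.audio import Pipeline

# Assumed checkpoint name; requires a Hugging Face access token
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token="HF_TOKEN",  # placeholder token
)

# Offline inference over the whole file: one run, one consistent speaker set
diarization = pipeline("recording.wav")
for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"{turn.start:.1f}s - {turn.end:.1f}s: {speaker}")
```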