Maintaining state across file audio source chunks

Open Aduomas opened this issue 1 year ago • 1 comment

Hello,

I am looking for a way to do chunk-based inference on audio files instead of streaming inference. The issue is that each audio file currently gets a new inference run and therefore a new state (new speaker embeddings), which is unwanted behaviour for my program.

How should I achieve the desired behaviour of running inference on larger chunks of audio (such as 20 seconds) while keeping the pipeline state across chunks?
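
To make the question concrete, here is a rough sketch of what I have in mind, using the classes shown in the diart README (`SpeakerDiarization`, `FileAudioSource`, `StreamingInference`). I am assuming the pipeline object keeps its internal speaker state when it is reused across inference runs, and I am not sure whether `StreamingInference` resets it, which is essentially my question:

```python
from diart import SpeakerDiarization
from diart.sources import FileAudioSource
from diart.inference import StreamingInference

# Build the pipeline once so that its internal state (speaker embeddings /
# clustering centroids) would be shared across all files processed below.
pipeline = SpeakerDiarization()
sample_rate = pipeline.config.sample_rate

predictions = []
for path in ["chunk_000.wav", "chunk_001.wav"]:  # e.g. consecutive 20-second files
    source = FileAudioSource(path, sample_rate)
    # Assumption: reusing the same pipeline instance carries the speaker
    # identities over from one file to the next instead of starting fresh.
    inference = StreamingInference(pipeline, source, do_plot=False)
    predictions.append(inference())
```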

Aduomas avatar Oct 28 '24 12:10 Aduomas

Hi @Aduomas, given your description, do you actually require a streaming pipeline? It looks like pyannote.audio could achieve what you want.
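
For reference, a typical offline pyannote.audio run looks roughly like this (the model name and token are placeholders, see the pyannote.audio documentation for the exact setup). Since the whole recording is processed in one pass, speaker labels stay consistent across the entire file:

```python
from pyannote.audio import Pipeline

# Placeholder model name and token; accept the model's terms on Hugging Face first.
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token="HUGGINGFACE_TOKEN",
)

# Offline diarization of a whole file in one pass, so speaker identities
# are consistent across the full recording.
diarization = pipeline("audio.wav")

for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"{turn.start:.1f}s - {turn.end:.1f}s: {speaker}")
```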

juanmc2005 avatar Dec 13 '24 09:12 juanmc2005