
Real-Time ASR with Device Capture using torchaudio StreamReader

Open benluks opened this issue 1 year ago • 5 comments

Describe the bug

I made a demo of using StreamingASR with live device input using torchaudio StreamReader. The setup is a little tedious, and it would be nice to integrate this as an official feature to abstract threading and buffering away from the user.

Of course this can be applied to any model or pipeline, provided there are streaming capabilities.
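As a rough illustration of the threading and buffering that such a feature would hide from the user, here is a minimal sketch using only the Python standard library. It is not the demo's code: `stream_transcribe`, `chunks`, `transcribe_chunk`, and `max_buffered` are hypothetical names, the chunk source stands in for whatever yields audio from the device (for example a torchaudio StreamReader loop), and the callback stands in for a streaming model's per-chunk transcription call.

```python
import queue
import threading
from typing import Callable, Iterable, Iterator, List

_STOP = object()  # sentinel marking the end of the stream


def stream_transcribe(
    chunks: Iterable[List[float]],
    transcribe_chunk: Callable[[List[float]], str],
    max_buffered: int = 8,
) -> Iterator[str]:
    """Yield text per chunk while capture runs in a background thread.

    Hypothetical helper: `chunks` stands in for the device reader and
    `transcribe_chunk` for the streaming model call.
    """
    buffer: queue.Queue = queue.Queue(maxsize=max_buffered)

    def capture() -> None:
        # Producer: pulls chunks from the source and buffers them.
        try:
            for chunk in chunks:
                buffer.put(chunk)  # blocks while the buffer is full
        finally:
            buffer.put(_STOP)  # always signal the end of the stream

    threading.Thread(target=capture, daemon=True).start()

    # Consumer: transcribes chunks in arrival order.
    while True:
        chunk = buffer.get()
        if chunk is _STOP:
            break
        text = transcribe_chunk(chunk)
        if text:  # skip chunks that produced no text
            yield text


if __name__ == "__main__":
    fake_chunks = [[0.0] * 4, [0.1] * 4, [0.2] * 4]  # placeholder audio
    for text in stream_transcribe(fake_chunks, lambda c: f"<{len(c)} samples>"):
        print(text)
```

The bounded queue is one possible design choice: it applies back-pressure to the capture thread when transcription falls behind, rather than letting the buffer grow without limit.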

Expected behaviour

Streaming ASR from device

To Reproduce

No response

Environment Details

No response

Relevant Log Output


Additional Context

No response

benluks avatar Feb 20 '25 15:02 benluks

@pplantinga @Adel-Moumen, this could be our first pipeline example, what do you think? I'm wondering if we could put some sort of voice activity detection on top of this to make it a full, proper pipeline.
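To make the idea concrete, a voice activity gate in front of the recognizer could look roughly like the sketch below. It uses a plain energy threshold as a stand-in for a real VAD model, and `rms`, `speech_only`, and the `threshold` value are hypothetical names and numbers chosen for illustration only.

```python
import math
from typing import Iterable, Iterator, List


def rms(chunk: List[float]) -> float:
    """Root-mean-square energy of one chunk of samples."""
    if not chunk:
        return 0.0
    return math.sqrt(sum(s * s for s in chunk) / len(chunk))


def speech_only(
    chunks: Iterable[List[float]], threshold: float = 0.01
) -> Iterator[List[float]]:
    """Pass through only chunks whose energy reaches the threshold.

    Assumption: samples are floats in [-1, 1]; the threshold is arbitrary
    and would need tuning for a real microphone.
    """
    for chunk in chunks:
        if rms(chunk) >= threshold:
            yield chunk


if __name__ == "__main__":
    stream = [[0.0, 0.0], [0.5, -0.5], [0.001, 0.0]]
    print(list(speech_only(stream)))
```

Dropping chunks changes what a streaming model sees as context, so a real pipeline would probably need to handle segment boundaries more carefully than this gate does.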

TParcollet avatar Feb 20 '25 15:02 TParcollet

I'd be happy to tackle the issue if no one's claimed it yet

benluks avatar Mar 01 '25 17:03 benluks

Hi! We have very limited bandwidth, but we will keep this proposal in mind!

TParcollet avatar Mar 01 '25 18:03 TParcollet

> I'd be happy to tackle the issue if no one's claimed it yet

I would be happy to have you on board. If you are willing to push this feature and make it SpeechBrain-like, we can team up on that. What do you think? :)

Adel-Moumen avatar Mar 02 '25 17:03 Adel-Moumen

Sounds great! Let's chat :)

benluks avatar Mar 02 '25 19:03 benluks