A better VAD than webrtc

Open Dnkhatri opened this issue 3 years ago • 1 comments

I saw this on Hacker News; it seems to be better at distinguishing noise from voice.

https://thegradient.pub/one-voice-detector-to-rule-them-all/

https://github.com/snakers4/silero-vad

Dnkhatri avatar Feb 22 '22 16:02 Dnkhatri

Thanks for bringing this to my attention! Will check it out.

smacke avatar Mar 07 '22 02:03 smacke

I did some exploration here but wasn't able to get anything interesting working. The Silero VAD can be used by passing --vad silero to ffsubsync version >= 0.4.21, provided torch and torchaudio are both installed (neither is listed as a project requirement), but I was never able to get decent performance compared to webrtc. (Admittedly, I didn't try very hard; just documenting here in case others want to give it a shot.)

smacke avatar Dec 31 '22 01:12 smacke
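
For anyone who wants to try the --vad silero option mentioned above, here is a minimal sketch of how the invocation might be scripted. It only checks that torch and torchaudio are importable and assembles the command line; it does not run ffsubsync. The file names (video.mp4, unsynchronized.srt, synchronized.srt) and the helper names are hypothetical, chosen for illustration, and the -i / -o flags are assumed to be the usual ffsubsync input and output options.

```python
import importlib.util
import shlex

# Optional packages the comment above says the silero VAD needs.
SILERO_DEPS = ("torch", "torchaudio")


def missing_silero_deps(find_spec=importlib.util.find_spec):
    """Return the names of the optional packages that cannot be imported."""
    return [name for name in SILERO_DEPS if find_spec(name) is None]


def build_ffsubsync_command(reference, subtitles, output, vad="silero"):
    """Assemble an ffsubsync command line as a list of arguments.

    Hypothetical helper: it only builds the argument list and never
    executes anything.
    """
    return ["ffsubsync", reference, "-i", subtitles, "-o", output, "--vad", vad]


if __name__ == "__main__":
    missing = missing_silero_deps()
    if missing:
        print("Install these first: " + ", ".join(missing))
    # Placeholder file names; substitute your own video and subtitle files.
    cmd = build_ffsubsync_command(
        "video.mp4", "unsynchronized.srt", "synchronized.srt"
    )
    print(shlex.join(cmd))
```

Passing vad="webrtc" to the same helper gives a command for a side-by-side comparison on the same files, which is the comparison described in the comment above.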