ffsubsync
A better VAD than webrtc
I saw this on Hacker News; it seems to be better at distinguishing noise from voice:
https://thegradient.pub/one-voice-detector-to-rule-them-all/
https://github.com/snakers4/silero-vad
Thanks for bringing this to my attention! Will check it out.
I did some exploration here but wasn't able to get anything interesting working. The Silero VAD can be used by passing --vad silero to ffsubsync version >= 0.4.21 (provided torch and torchaudio are both installed; neither is listed as a project requirement), but I was never able to get decent performance compared to webrtc. (Admittedly, I didn't try very hard; I'm just documenting this here in case others want to give it a shot.)
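For anyone who wants to try it, here is a rough sketch of how a run could be set up from Python. It first checks whether torch and torchaudio are importable, then builds the command line with --vad silero. The helper names (`missing_silero_deps`, `build_ffsubsync_command`) and the file names are made up for illustration and are not part of ffsubsync; only the `-i`, `-o`, and `--vad` options are the tool's own.

```python
import importlib.util


def missing_silero_deps():
    """Return the optional packages needed for --vad silero that are not importable."""
    # torch and torchaudio are not installed automatically with ffsubsync.
    return [
        name
        for name in ("torch", "torchaudio")
        if importlib.util.find_spec(name) is None
    ]


def build_ffsubsync_command(video, subs_in, subs_out, vad="silero"):
    """Build the argv list for one ffsubsync run with the chosen VAD (hypothetical helper)."""
    return ["ffsubsync", video, "-i", subs_in, "-o", subs_out, "--vad", vad]


if __name__ == "__main__":
    missing = missing_silero_deps()
    if missing:
        print("Install these first:", ", ".join(missing))
    # Placeholder file names; swap in your own video and subtitle paths.
    cmd = build_ffsubsync_command("video.mp4", "unsynchronized.srt", "synchronized.srt")
    print(" ".join(cmd))
```

Building the same command with vad="webrtc" and comparing the two outputs on the same video is probably the simplest way to see whether Silero helps for a given file.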