TensorflowASR

Hello, would you consider adding silero-vad to the project?

Open TszSimLaw opened this issue 1 year ago • 2 comments

TszSimLaw avatar Apr 25 '23 08:04 TszSimLaw

I haven't used that project yet, so I haven't figured out how to integrate it. I'll plan it out later.

Z-yq avatar Apr 25 '23 09:04 Z-yq

I'm not all that sure about silero-vad, as the Number Detector and Language Classifier make it a bit 'fat' for just VAD. Maybe there are simpler and easier ways to chunk spoken audio so that incoming realtime audio fits the beam search lengths?
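As a rough illustration of what a "simpler" chunker could look like, here is a minimal energy-threshold sketch in Python. It is not silero-vad and not anything from TensorflowASR; the function name `energy_chunks`, the 30 ms frame size, the -35 dBFS threshold, the hangover count and the 10 s cap are all assumptions picked for the example, and the input is assumed to be mono float audio in [-1, 1].

```python
import numpy as np

def energy_chunks(samples, sample_rate=16000, frame_ms=30,
                  threshold_db=-35.0, max_chunk_s=10.0, hangover_frames=5):
    """Return (start, end) sample indices of speech-like chunks.

    Hypothetical helper: a frame counts as speech when its RMS level
    (dB relative to full scale) exceeds threshold_db. A chunk is closed
    after more than hangover_frames quiet frames, or once it reaches
    max_chunk_s so it stays within a decoder's length budget.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(samples) // frame_len
    max_frames = int(max_chunk_s * 1000 / frame_ms)
    chunks = []
    start = None   # index of the first frame of the open chunk
    silence = 0    # consecutive quiet frames inside the open chunk
    for i in range(n_frames):
        frame = np.asarray(samples[i * frame_len:(i + 1) * frame_len],
                           dtype=np.float64)
        rms = np.sqrt(np.mean(frame ** 2) + 1e-12)
        is_speech = 20.0 * np.log10(rms) > threshold_db
        if is_speech:
            if start is None:
                start = i
            silence = 0
        elif start is not None:
            silence += 1
        if start is not None:
            too_long = (i + 1 - start) >= max_frames
            if silence > hangover_frames or too_long:
                end = i + 1 - silence  # drop the trailing quiet frames
                chunks.append((start * frame_len, end * frame_len))
                start, silence = None, 0
    if start is not None:  # audio ended while a chunk was still open
        chunks.append((start * frame_len, (n_frames - silence) * frame_len))
    return chunks
```

A fixed energy threshold like this falls over in noisy rooms, which is where a small learned model would still earn its keep; the point is only that the chunking logic itself is tiny.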

Z-yq, I haven't looked into it much, but a simpler, lower-parameter model than silero could likely be used.

Also, I think far-field and BSS/beamforming are likely best handled by wireless distributed arrays feeding a central ASR, given the range of uses that zonal systems could have.

https://github.com/breizhn/DTLN is a pretty good filter, but the dataset needs to be mixed with noise and then processed by DTLN (or whichever filter is used) so that the filter's artefacts are trained in. https://github.com/Rikorose/DeepFilterNet is truly outstanding but heavier on load, and it's a shame the LADSPA plugin uses Tract as its ML framework, as it's single-threaded only.
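To make the "mix with noise, then run through the filter" step concrete, here is a minimal Python sketch. The names `mix_at_snr` and `make_training_example` are made up for illustration, and `denoise` is a placeholder for whatever filter is actually used; this does not call the DTLN or DeepFilterNet APIs, and it assumes mono float arrays at the same sample rate.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so the mix has roughly the requested SNR in dB."""
    speech = np.asarray(speech, dtype=np.float64)
    # Loop or trim the noise so it matches the speech length.
    noise = np.resize(np.asarray(noise, dtype=np.float64), speech.shape)
    speech_power = np.mean(speech ** 2) + 1e-12
    noise_power = np.mean(noise ** 2) + 1e-12
    # Scale the noise so speech_power / scaled_noise_power == 10^(snr_db / 10).
    scale = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return speech + scale * noise

def make_training_example(speech, noise, snr_db, denoise):
    """Return the filtered noisy mix that the ASR model would be trained on.

    `denoise` is a hypothetical callable (array in, array out) standing in
    for the real filter, so its artefacts end up in the training audio.
    """
    return denoise(mix_at_snr(speech, noise, snr_db))
```

The transcript labels stay the same as for the clean utterance; only the audio changes, so the ASR model sees the same kind of artefacts at training time that it will see behind the filter at inference time.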

StuartIanNaylor avatar May 17 '23 17:05 StuartIanNaylor