Question regarding vad_process (AIS-1679)

Open mike-2020 opened this issue 1 year ago • 1 comment

Hello,

I understand that this function is used to detect speech in the received audio. But when it returns VAD_SPEECH, does that mean the current frame (the data passed in to the current call) contains speech? Or does it mean that the current frame, together with a number of previous frames, contains speech?

I'd like to record speech only, so I want to make sure that when vad_process returns VAD_SPEECH it is the right time to start recording, and that no speech audio will be missed.

Aug 22 '24 14:08 mike-2020

Your understanding is correct, but keep the limits of VAD in mind: it cannot be 100% accurate. Take the status of previous frames into account when deciding whether certain frames should be ignored.
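
One possible way to take previous frames into account is a small debounce gate around the per-frame result. The sketch below is only an illustration, not part of the library: `frame_vad_t`, `speech_gate_t`, `speech_gate_init` and `speech_gate_update` are hypothetical names, and the thresholds are assumptions to tune for your audio. It does not call vad_process itself; you would map each returned state onto `FRAME_SPEECH` or `FRAME_SILENCE` before feeding it in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-frame decision; map the real VAD result onto this. */
typedef enum { FRAME_SILENCE = 0, FRAME_SPEECH = 1 } frame_vad_t;

/* Hypothetical gate state; not a type from the library. */
typedef struct {
    int start_frames;  /* consecutive speech frames needed to start recording */
    int hang_frames;   /* consecutive silence frames needed to stop recording */
    int speech_run;    /* current run of speech frames */
    int silence_run;   /* current run of silence frames */
    bool recording;    /* true while frames should be recorded */
} speech_gate_t;

static void speech_gate_init(speech_gate_t *g, int start_frames, int hang_frames)
{
    assert(g != NULL && start_frames > 0 && hang_frames > 0);
    g->start_frames = start_frames;
    g->hang_frames = hang_frames;
    g->speech_run = 0;
    g->silence_run = 0;
    g->recording = false;
}

/* Feed one per-frame decision; returns true while recording should be on. */
static bool speech_gate_update(speech_gate_t *g, frame_vad_t v)
{
    assert(g != NULL);
    if (v == FRAME_SPEECH) {
        g->speech_run++;
        g->silence_run = 0;
        /* Ignore isolated speech frames: open only after a long enough run. */
        if (!g->recording && g->speech_run >= g->start_frames) {
            g->recording = true;
        }
    } else {
        g->silence_run++;
        g->speech_run = 0;
        /* Ignore short dropouts: close only after a long enough silence. */
        if (g->recording && g->silence_run >= g->hang_frames) {
            g->recording = false;
        }
    }
    return g->recording;
}
```

Because this gate opens only after `start_frames` speech frames, the first frames of an utterance arrive before recording starts. To avoid clipping the onset, one option is to keep the most recent frames in a small ring buffer and write them out when the gate opens.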

Aug 30 '24 08:08 sun-xiangyu