
VAD on singing voice?

Open shoegazerstella opened this issue 5 years ago • 0 comments

I am trying to adapt this script to detect voice and silence segments in an audio file containing a source-separated singing voice signal obtained with http://github.com/sigsep/open-unmix-pytorch

I have some questions:

  • Does it make sense to compute the threshold for each data_window independently, instead of using a fixed speech_energy_threshold? I would do that by computing the energy of the data_window signal, normalizing it, and taking its mean value. If this value equals 0.0, I can label that segment as silence.

  • Is there a clever way to choose parameters like sample_window, sample_overlap, and speech_window so that they are more appropriate for singing voice signals?
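
To make the first question concrete, here is a minimal sketch of the per-window idea. It is not taken from this script: `frame_energies`, `label_silence`, `window_len`, `hop_len`, and `tol` are hypothetical names. It also assumes two things I am unsure about: that the energy is normalized by the loudest window in the file, and that a small tolerance is used instead of an exact 0.0 comparison, since I suspect a separated stem is rarely exactly zero.

```python
import math


def frame_energies(signal, window_len, hop_len):
    """Mean squared amplitude of each window (hypothetical helper)."""
    if not signal:
        return []
    energies = []
    last_start = max(len(signal) - window_len, 0)
    for start in range(0, last_start + 1, hop_len):
        window = signal[start:start + window_len]
        energies.append(sum(x * x for x in window) / len(window))
    return energies


def label_silence(signal, window_len, hop_len, tol=1e-4):
    """Return one boolean per window: True means 'silence'.

    Assumption: energies are normalized by the loudest window, and a
    window is silent when its normalized mean energy is at most `tol`.
    """
    energies = frame_energies(signal, window_len, hop_len)
    peak = max(energies, default=0.0)
    if peak == 0.0:
        # The whole signal is digital silence.
        return [True] * len(energies)
    return [energy / peak <= tol for energy in energies]


# Toy signal: 100 silent samples, 100 samples of a sine, 100 silent samples.
tone = [0.5 * math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
toy_signal = [0.0] * 100 + tone + [0.0] * 100

print(label_silence(toy_signal, window_len=50, hop_len=50))
# [True, True, False, False, True, True]
```

With `tol=0.0` this reduces to the exact "equals 0.0" test from the question, which only fires on windows that are entirely zero.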

Thanks a lot!

shoegazerstella, Oct 30 '19 17:10