Necessary preprocessing for inference wav data

Open takan1 opened this issue 2 years ago • 6 comments

Hi, thanks for the great work! I'm trying to test inference with my own wav file, but the output quality is lower than I expected, and I suspect the input file is the cause. Could you give me some instructions on how to preprocess the source and target wav files?

takan1 avatar Dec 09 '22 02:12 takan1

For the source wav, denoise it if it is too noisy and normalize it if the volume is too loud or too quiet. For the target wav, trim its silent segments.
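
As a rough illustration of the normalization and silence-trimming steps, here is a minimal NumPy sketch. It is not part of FreeVC: the function names `peak_normalize` and `trim_silence`, the 0.95 peak, the 20 ms frame size, and the -40 dB threshold are all assumptions chosen for the example. It expects a mono float waveform and drops every quiet frame, including pauses in the middle of the clip.

```python
import numpy as np


def peak_normalize(wav: np.ndarray, peak: float = 0.95) -> np.ndarray:
    """Scale the waveform so its largest absolute sample equals `peak`."""
    max_abs = np.max(np.abs(wav)) if wav.size else 0.0
    if max_abs == 0.0:
        return wav.copy()  # all-zero or empty input: nothing to scale
    return wav * (peak / max_abs)


def trim_silence(wav: np.ndarray, sr: int = 16000,
                 frame_ms: int = 20, threshold_db: float = -40.0) -> np.ndarray:
    """Drop frames whose RMS is `threshold_db` or more below the loudest frame.

    Hypothetical helper: thresholds are illustrative, not tuned for FreeVC.
    Samples after the last full frame are discarded.
    """
    frame_len = int(sr * frame_ms / 1000)
    n_frames = len(wav) // frame_len
    if n_frames == 0:
        return wav.copy()  # shorter than one frame: leave untouched
    frames = wav[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames.astype(np.float64) ** 2, axis=1))
    ref = rms.max()
    if ref == 0.0:
        return wav[:0].copy()  # entirely silent clip
    level_db = 20.0 * np.log10(np.maximum(rms, 1e-10) / ref)
    return frames[level_db > threshold_db].reshape(-1)
```

A source clip would typically go through `peak_normalize`, and a target clip through `trim_silence`; denoising needs a dedicated tool and is not covered here.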

OlaWod avatar Dec 09 '22 08:12 OlaWod

@OlaWod Thanks a lot for the answer! Understood. I'll try the above. Also, would there be a preferred sampling rate for both source and target wav?

takan1 avatar Dec 10 '22 01:12 takan1

16 kHz
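
One possible way to bring a clip to that rate, sketched with SciPy's `resample_poly`. The helper name `resample_to_16k` is hypothetical, and the sketch assumes the audio is already loaded as a mono NumPy array with a known original sampling rate.

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly

TARGET_SR = 16000  # rate recommended in this thread


def resample_to_16k(wav: np.ndarray, orig_sr: int) -> np.ndarray:
    """Resample a mono waveform from `orig_sr` to 16 kHz (hypothetical helper)."""
    if orig_sr == TARGET_SR:
        return wav.astype(np.float32)
    # Reduce the rate ratio to the smallest integer up/down factors.
    g = gcd(orig_sr, TARGET_SR)
    up, down = TARGET_SR // g, orig_sr // g
    # Polyphase resampling applies an anti-aliasing filter internally.
    return resample_poly(wav, up, down).astype(np.float32)
```

For example, one second of 44.1 kHz audio (44100 samples) comes out as 16000 samples.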

OlaWod avatar Dec 10 '22 04:12 OlaWod

Great, thanks for your response :)

takan1 avatar Dec 12 '22 09:12 takan1

For the target wav, do you have recommendations as to the length? Does performance of the model improve/deteriorate with longer target wavs? If so, where's the sweet spot?

Crazy-Duck avatar Jan 05 '23 21:01 Crazy-Duck

I haven't explored this yet, but I think a longer target wav is likely to contain more silence, which can degrade performance. I think the sweet spot is a target wav that is not too short and contains almost no silence.
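
To check a candidate target clip against that rule of thumb, one could measure its duration and the fraction of near-silent frames. The sketch below is only an assumption about how to do that: `silence_stats`, the 20 ms frames, and the -40 dB threshold are illustrative, and the thread gives no numeric cut-offs for "too short" or "too much silence".

```python
import numpy as np


def silence_stats(wav: np.ndarray, sr: int = 16000,
                  frame_ms: int = 20, threshold_db: float = -40.0):
    """Return (duration_seconds, silent_fraction) for a mono waveform.

    A frame counts as silent when its RMS is `threshold_db` or more below
    the loudest frame. Hypothetical helper with illustrative defaults.
    """
    duration = len(wav) / sr
    frame_len = int(sr * frame_ms / 1000)
    n_frames = len(wav) // frame_len
    if n_frames == 0:
        return duration, 0.0  # too short to analyze
    frames = wav[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames.astype(np.float64) ** 2, axis=1))
    ref = rms.max()
    if ref == 0.0:
        return duration, 1.0  # the whole clip is silent
    level_db = 20.0 * np.log10(np.maximum(rms, 1e-10) / ref)
    return duration, float(np.mean(level_db <= threshold_db))
```

A clip with a high silent fraction is a candidate for trimming before it is used as the target.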

OlaWod avatar Jan 08 '23 14:01 OlaWod