FreeVC
Necessary preprocessing for inference wav data
Hi, thanks for the great work! I'm trying to run inference on my own wav file, but the output quality is lower than I expected, and I suspect the input file is the cause. Could you give me some instructions on how to preprocess the source and target wavs?
For the source wav, it is better to denoise it if it is too noisy and to normalize it if the volume is too loud or too low. For the target wav, it is better to trim its silent segments.
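For anyone who wants to try the normalization and trimming steps, here is a minimal sketch using only NumPy. It is not taken from the FreeVC code: the function names, the 0.95 peak, the -40 dB threshold, and the 20 ms frame size are all assumptions chosen for illustration, and denoising is left out because it needs a dedicated tool.

```python
import numpy as np

def peak_normalize(wav, peak=0.95):
    """Scale a float waveform so its largest absolute sample equals `peak`.
    `peak=0.95` is an illustrative default, not a FreeVC requirement."""
    max_abs = np.max(np.abs(wav)) if wav.size else 0.0
    if max_abs == 0:
        return wav.copy()  # all-silent input: nothing to scale
    return wav * (peak / max_abs)

def trim_silence(wav, sr, threshold_db=-40.0, frame_ms=20):
    """Drop leading and trailing frames whose RMS is more than
    `threshold_db` below the loudest frame. Pauses in the middle are kept,
    and a trailing partial frame is discarded."""
    frame = max(1, int(sr * frame_ms / 1000))
    n_frames = len(wav) // frame
    if n_frames == 0:
        return wav
    frames = wav[: n_frames * frame].reshape(n_frames, frame)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    ref = rms.max()
    if ref == 0:
        return wav[:0]  # the whole clip is silent
    level_db = 20 * np.log10(np.maximum(rms, 1e-10) / ref)
    voiced = np.where(level_db > threshold_db)[0]
    return wav[voiced[0] * frame : (voiced[-1] + 1) * frame]
```

Both functions expect a 1-D float array, for example the samples returned by whatever audio loader you already use. A reasonable order is to trim first and normalize afterwards.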
@OlaWod Thanks a lot for the answer! Understood. I'll try the above. Also, would there be a preferred sampling rate for both source and target wav?
16kHz
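If your files are at another rate, a sketch like the following can bring them to 16 kHz. It assumes SciPy is installed and uses `scipy.signal.resample_poly`; the helper name `resample_to_16k` is hypothetical, and this is just one common way to resample, not necessarily what the FreeVC scripts do.

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly

TARGET_SR = 16000  # the rate recommended in this thread

def resample_to_16k(wav, sr):
    """Resample a 1-D waveform from `sr` Hz to 16 kHz.
    Hypothetical helper for illustration; polyphase filtering applies
    an anti-aliasing filter when downsampling."""
    wav = np.asarray(wav, dtype=np.float32)
    if sr == TARGET_SR:
        return wav
    g = gcd(TARGET_SR, sr)
    # up/down are the reduced ratio TARGET_SR : sr
    return resample_poly(wav, TARGET_SR // g, sr // g)
```

For example, one second of 44.1 kHz audio (44100 samples) comes out as 16000 samples.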
Great, thanks for your response :)
For the target wav, do you have recommendations as to the length? Does performance of the model improve/deteriorate with longer target wavs? If so, where's the sweet spot?
I haven't explored this yet, but I think a longer target wav might contain more silence, which can degrade performance. I think the sweet spot is a target wav that is not too short and contains almost no silence.
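To check a candidate target wav against that rule of thumb, a rough sketch like this reports its duration and the fraction of near-silent frames. The function name, the -40 dB threshold, and the 20 ms frame size are assumptions for illustration; no particular cutoff has been validated for FreeVC.

```python
import numpy as np

def target_stats(wav, sr, threshold_db=-40.0, frame_ms=20):
    """Return (duration_seconds, silence_ratio) for a 1-D float waveform.
    A frame counts as silent when its RMS is more than `threshold_db`
    below the loudest frame. Illustrative helper, thresholds are guesses."""
    duration = len(wav) / sr
    frame = max(1, int(sr * frame_ms / 1000))
    n_frames = len(wav) // frame
    if n_frames == 0:
        return duration, 0.0
    frames = np.asarray(wav[: n_frames * frame], dtype=np.float64)
    rms = np.sqrt(np.mean(frames.reshape(n_frames, frame) ** 2, axis=1))
    ref = rms.max()
    if ref == 0:
        return duration, 1.0  # the whole clip is silent
    level_db = 20 * np.log10(np.maximum(rms, 1e-10) / ref)
    silence_ratio = float(np.mean(level_db <= threshold_db))
    return duration, silence_ratio
```

A clip with a high silence ratio is a candidate for trimming before it is used as the target.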