Retrieval-based-Voice-Conversion-WebUI

Questions about the pretrain

Open ShiromiyaG opened this issue 1 year ago • 1 comments

How exactly was it trained? Was a different configuration used for the training? When I and other people tried to train a new model, we ran into a problem with lines in the spectrum of the audio generated by these pretrains. @RVC-Boss Were the speakers separated when the dataset was prepared? Was a different config used in the .json files?

Here is an image of the line problem: image

ShiromiyaG, Sep 22 '24 08:09

After examining multiple RVC forks, the result is the same: using the net_g Synthesizer as a blank slate (without loading weights from a voice model) to infer on silent audio produces the following:

image

So training a model from scratch blends this heavy noise into the output, and perhaps after 1000+ epochs it gets diluted until only the solid 8 kHz noise remains.
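For anyone who wants to put a number on the 8 kHz line instead of eyeballing a spectrogram, here is a minimal sketch. It is not part of RVC. The function names `goertzel_power` and `line_prominence_db` and the offset choices are made up for illustration. It assumes you already have the generated audio as a list of float samples and that its sample rate is above 16 kHz. It uses the Goertzel recurrence to compare the power at 8 kHz with the power at a few nearby frequencies.

```python
import math


def goertzel_power(samples, sr, freq_hz):
    """Power of the signal's spectrum at one frequency (Goertzel recurrence)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sr)
    s1 = 0.0  # state from one sample ago
    s2 = 0.0  # state from two samples ago
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def line_prominence_db(samples, sr, center_hz=8000.0,
                       offsets_hz=(300.0, 400.0, 500.0)):
    """dB by which power at center_hz exceeds the mean power at nearby frequencies.

    The offsets are arbitrary illustration values, not anything taken from RVC.
    """
    centre = goertzel_power(samples, sr, center_hz)
    nearby = [
        goertzel_power(samples, sr, center_hz + sign * off)
        for off in offsets_hz
        for sign in (-1.0, 1.0)
    ]
    floor = sum(nearby) / len(nearby)
    tiny = 1e-20  # avoids log(0) on perfectly silent input
    return 10.0 * math.log10((centre + tiny) / (floor + tiny))
```

A value far above 0 dB suggests a narrow line at 8 kHz, while broadband noise alone should stay near 0 dB, give or take several dB, since a single-frequency estimate is noisy. Comparing the value for a from-scratch model against one fine-tuned from the pretrains would be one way to check the dilution idea above.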

AznamirWoW, Sep 22 '24 21:09