Retrieval-based-Voice-Conversion-WebUI
How to get good results from voice lines taken from video game files
I tried training several models on different voice lines taken from a video game. Each set was about 10 minutes long, but I couldn't get good results. In the audio I used, the individual lines were very close to each other, about 0.5 s apart. I trained for 200 to 250 epochs, but the output was never clear. I also noticed that the model could not reproduce silence very well; it filled it with random noise. How can I improve the results? Unfortunately, the voice lines from the game files are at most 10 to 11 minutes long, so the duration is not something I can change.
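To make the "about 0.5 s apart" description easier to verify, here is a minimal sketch of how the gaps between lines could be measured. It is not part of the WebUI; `find_silent_gaps`, the `threshold` value, and `min_gap_s` are hypothetical names and settings chosen for illustration, and it assumes the audio has already been loaded as a list of float samples in the range -1.0 to 1.0.

```python
def find_silent_gaps(samples, sample_rate, threshold=0.01, min_gap_s=0.1):
    """Return (start_s, end_s) for each run of quiet samples.

    A sample counts as quiet when its absolute value is below `threshold`
    (an assumed value; real recordings may need a different one).
    Runs shorter than `min_gap_s` seconds are ignored.
    """
    gaps = []
    run_start = None  # index where the current quiet run began, if any
    for i, x in enumerate(samples):
        if abs(x) < threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            # A loud sample ends the quiet run; keep it if long enough.
            if (i - run_start) / sample_rate >= min_gap_s:
                gaps.append((run_start / sample_rate, i / sample_rate))
            run_start = None
    # Handle a quiet run that lasts until the end of the audio.
    if run_start is not None:
        if (len(samples) - run_start) / sample_rate >= min_gap_s:
            gaps.append((run_start / sample_rate, len(samples) / sample_rate))
    return gaps


# Synthetic example: 1 s of "speech", 0.5 s of silence, 1 s of "speech",
# at a toy sample rate of 10 samples per second.
demo = [0.5] * 10 + [0.0] * 5 + [0.5] * 10
print(find_silent_gaps(demo, sample_rate=10))
```

Running something like this over the training audio would show how long the pauses really are and whether the "silent" parts are actually silent or contain low-level background noise, which could be useful information to include when asking for help.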