Cross-Lingual-Voice-Cloning
How many hours of speech and how many epochs does it take to reach the quality reported in the paper?
I am training an English-Russian model. How many hours of speech does it take to reach the quality reported in the paper, and can I train the model on a GPU with 6 GB of VRAM?