From which version does Coqui TTS start supporting voice conversion and cloning?

Open tieincred opened this issue 3 years ago • 2 comments

Hi @Edresson, I am fairly new to the field, so please forgive the naive question. I am trying to use the voice cloning feature. I trained a model with coqui-ai TTS version 0.6 and am running inference in that same environment. I am using the command below to do the cloning, but it fails with an error saying that the tts command does not expect "reference_wav":

tts --model_path trained_model/best_model.pth.tar --config_path trained_model/config.json --speaker_idx "icici" --out_path output.wav --reference_wav target_content/asura_10secs.wav

This might be because that version did not support voice conversion yet. Can you please confirm? Also, the model trained on version 0.6 doesn't run with the latest version and ends in a dimension mismatch error, which I assume is due to a change in the model structure. Please shed some light on this; it would be really helpful.

tieincred avatar Jul 15 '22 03:07 tieincred

Voice conversion inference support was introduced in TTS v0.6.2 (I recommend always using the latest release because of bug fixes). Between versions, we changed config parameters, not the model structure. You can get the config.json file from the released model and adjust your own config.json to match. The model cache is available at ~/.local/share/tts/.
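One way to approach the config adjustment described above is to list which keys differ between the two files before editing anything by hand. The following is a minimal sketch, not part of Coqui TTS: the helper names (`flatten_keys`, `diff_config_keys`, `load_config`) are hypothetical, and it assumes both config.json files are plain JSON without comments.

```python
import json
import sys


def flatten_keys(cfg, prefix=""):
    """Return the set of dotted key paths found in a nested dict."""
    keys = set()
    for name, value in cfg.items():
        path = f"{prefix}{name}"
        keys.add(path)
        if isinstance(value, dict):
            keys |= flatten_keys(value, prefix=path + ".")
    return keys


def diff_config_keys(old_cfg, new_cfg):
    """Compare two config dicts by key path.

    Returns (missing, extra): keys present only in new_cfg, and keys
    present only in old_cfg, each as a sorted list.
    """
    old_keys = flatten_keys(old_cfg)
    new_keys = flatten_keys(new_cfg)
    return sorted(new_keys - old_keys), sorted(old_keys - new_keys)


def load_config(path):
    """Load a config file, assuming it is plain JSON (an assumption)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Hypothetical usage: python diff_config.py my/config.json released/config.json
if __name__ == "__main__" and len(sys.argv) == 3:
    missing, extra = diff_config_keys(load_config(sys.argv[1]), load_config(sys.argv[2]))
    print("Keys only in the released config:", *missing, sep="\n  ")
    print("Keys only in your config:", *extra, sep="\n  ")
```

This only reports differences in key names; deciding which values to copy or rename still has to be done by hand against the released model's config.json.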

Edresson avatar Jul 15 '22 11:07 Edresson

Thanks a lot, @Edresson! I really appreciate the time you've taken to answer this. It will help me a lot. I have one more small question: can a model trained on version 0.6 be used for inference and voice conversion in the latest version? If the architecture is the same, I hope it can.

tieincred avatar Jul 16 '22 03:07 tieincred

I think so, but it might need some changes in the config file.

Edresson avatar Dec 12 '22 17:12 Edresson