From which version does Coqui TTS start supporting voice conversion and cloning?
Hi @Edresson,
I am fairly new to the field, so please forgive the naive question.
I am trying to use the voice cloning feature.
I trained a model with coqui-ai TTS version 0.6 and am running it in that same installed environment.
I am using the command below to do the cloning, but it fails with an error saying that the tts command does not expect "reference_wav":
tts --model_path trained_model/best_model.pth.tar --config_path trained_model/config.json --speaker_idx "icici" --out_path output.wav --reference_wav target_content/asura_10secs.wav
This might be because that version did not support voice conversion yet.
Can you please confirm?
Also, the model trained on version 0.6 doesn't run with the latest version and ends in a dimension mismatch error, which I assume is due to a change in the model structure.
Please shed some light on this; it would be really helpful.
Voice conversion inference support was introduced in TTS v0.6.2 (I recommend always using the latest release because of bug fixes). Between versions, we changed config parameters, not the model structure. You can get the config.json file from the released model and adjust your config.json to match. The model cache is available at ~/.local/share/tts/.
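To see which parameters need adjusting, it can help to diff the two config files key by key. The following is a minimal sketch, not part of the TTS library: the helper names `diff_config` and `load_json` are hypothetical, and it assumes both files are plain JSON (a config containing comments would have to be cleaned first). It only reports differences; which ones matter for your model is still a manual decision.

```python
import json
import sys


def diff_config(mine, reference, prefix=""):
    """Recursively compare two config dicts.

    Returns a list of (key_path, kind, my_value, reference_value) tuples,
    where kind is "missing_in_mine", "only_in_mine", or "changed".
    """
    diffs = []
    for key in sorted(set(mine) | set(reference)):
        path = f"{prefix}{key}"
        if key not in mine:
            diffs.append((path, "missing_in_mine", None, reference[key]))
        elif key not in reference:
            diffs.append((path, "only_in_mine", mine[key], None))
        elif isinstance(mine[key], dict) and isinstance(reference[key], dict):
            # Descend into nested sections and report dotted key paths.
            diffs.extend(diff_config(mine[key], reference[key], prefix=path + "."))
        elif mine[key] != reference[key]:
            diffs.append((path, "changed", mine[key], reference[key]))
    return diffs


def load_json(path):
    """Load a plain JSON file (assumption: no comments in the file)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python diff_config.py <my config.json> <released config.json>
    for path, kind, mine_value, ref_value in diff_config(
        load_json(sys.argv[1]), load_json(sys.argv[2])
    ):
        print(f"{kind:16} {path}: {mine_value!r} -> {ref_value!r}")
```

For example, you could pass trained_model/config.json as the first argument and the config.json of a released model from the cache directory as the second.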
Thanks a lot, @Edresson! I really appreciate the time you've taken to answer this; it will help me a lot. I have one small follow-up question: can a model trained on version 0.6 be used for inference and voice conversion in the latest version? If the architecture is the same, I hope it can.
I think so, but it might need some changes in the config file.