OpenVoice
Some questions
Hi, I have some questions, listed below:
- Do the speaker encoders of the base TTS models and the tone color converter model share the same model structure? Is there any connection between the base TTS models and the tone color converter model?
- During training, for a text-audio pair <x, y>, are the reference speaker audio, the output of the tone color converter model (speech with the reference tone color and controlled styles), and the g used in both the flow and the reverse flow all derived from y?
- Do you plan to release the training code? We still have not been able to train a good model by following your paper. Thanks a lot.
Are you guys planning to release the code?