
Training on a NEW language

Open rudransh2004 opened this issue 4 months ago • 3 comments

Suppose we want to train this TTS model on a language whose tokens are not covered by the Flan-T5 tokenizer. Can I simply change the name of the tokenizer in config.json, or do I also have to make code changes? NOTE: the new tokenizer will not be a Flan-T5 tokenizer.

rudransh2004 avatar Apr 15 '24 12:04 rudransh2004

Hey @rudransh2004, you can do this but you'd have to retrain the model from scratch!

ylacombe avatar Apr 20 '24 17:04 ylacombe

Hey @ylacombe, thank you so much for your reply. Could you share a recipe for doing this with another language, without the annotations if we wish?

rudransh2004 avatar Apr 20 '24 17:04 rudransh2004

Hey @rudransh2004, if you want to avoid using the annotations, you could simply use a description column in which every sample is the empty string "". Note that the model currently doesn't support passing samples without annotations, but the trick above should work.

ylacombe avatar Apr 26 '24 12:04 ylacombe
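
For anyone landing here later, the empty-description trick could be sketched roughly as below. This is only an illustration in plain Python, not code from the parler-tts repo: the helper `add_empty_descriptions`, the column name `description`, and the sample fields `audio_path` and `text` are all assumptions, so use whatever column names your own training config expects.

```python
from typing import Any, Dict, List


def add_empty_descriptions(
    samples: List[Dict[str, Any]],
    column: str = "description",  # assumed column name, adjust to your config
) -> List[Dict[str, Any]]:
    """Return copies of the samples where a missing (or None) description
    is replaced by the empty string, leaving existing descriptions alone."""
    return [{**sample, column: sample.get(column) or ""} for sample in samples]


# Hypothetical un-annotated samples for a new language.
samples = [
    {"audio_path": "clip_0001.wav", "text": "first transcript"},
    {"audio_path": "clip_0002.wav", "text": "second transcript"},
]

prepared = add_empty_descriptions(samples)
for row in prepared:
    print(row)
```

If your data lives in a Hugging Face `datasets.Dataset`, the same idea should be expressible with `Dataset.add_column`, passing a list of empty strings of the right length, but I haven't checked that against the training script.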