supervoice-vall-e-2
Misspelling issues.
I have tried your models (voicebox and this one), and vall-e-2 sounds more natural, but there are a lot of misspellings in the generated speech. Is it because of the dataset? Have you tried training voicebox on Libriheavy?