
Feature improvements

Open lucasjinreal opened this issue 1 year ago • 6 comments

Hello, I tried the Parler-TTS Mini model and it exceeded my expectations with very good results.

However, I have some questions and possible suggestions for improvement:

  1. Will there be a multi-lingual version available, such as support for Mandarin?
  2. Currently, numbers and punctuation marks are not rendered very accurately, and words are sometimes dropped from sentences. Will these issues be addressed in future versions?

lucasjinreal avatar May 07 '24 02:05 lucasjinreal

Thanks for the feedback!

  1. It's not planned yet, but I'd be happy to support any efforts on this!
  2. In terms of numbers, the training dataset contains no digits, only fully written-out numbers. There are some issues with punctuation that will be addressed in upcoming versions!

In terms of dropping words, this is only a v0.1 version so I'd expect the upcoming versions to be better. One thing that I noticed though is that you should always finish your prompt with a punctuation mark, otherwise the model drops the last word!
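The two workarounds above (spelling out numbers and ending the prompt with punctuation) can be sketched as a small preprocessing step. This is only an illustration, not part of Parler-TTS: `spell_number` and `prepare_prompt` are hypothetical helper names, the speller only covers standalone integers from 0 to 999, and treating `.`, `!` and `?` as the accepted closing marks is an assumption.

```python
import re

_ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
_TENS = ["", "", "twenty", "thirty", "forty", "fifty",
         "sixty", "seventy", "eighty", "ninety"]

# Standalone integers of 1-3 digits; decimals ("3.5"), grouped numbers
# ("1,000") and tokens such as "5s" or "v0.1" are deliberately left untouched.
_SMALL_INT = re.compile(r"(?<![\w.,])\d{1,3}(?!\w|[.,]\d)")


def spell_number(n: int) -> str:
    """Spell out an integer in the range 0..999 in English (hypothetical helper)."""
    if not 0 <= n <= 999:
        raise ValueError("only 0..999 is supported in this sketch")
    if n < 20:
        return _ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return _TENS[tens] + ("-" + _ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = _ONES[hundreds] + " hundred"
    return words + (" " + spell_number(rest) if rest else "")


def prepare_prompt(prompt: str) -> str:
    """Write small integers as words and make sure the prompt ends with punctuation."""
    text = _SMALL_INT.sub(lambda m: spell_number(int(m.group())), prompt.strip())
    # Assumption: '.', '!' and '?' count as valid closing punctuation.
    if text and text[-1] not in ".!?":
        text += "."
    return text


print(prepare_prompt("I have 2 cats and 45 dogs"))
```

The returned string would then be passed to the prompt tokenizer in place of the raw text. A real pipeline would need a fuller text normalizer for dates, decimals, currencies and larger numbers.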

ylacombe avatar May 09 '24 11:05 ylacombe

Thank you.

I'm still wondering what kind of data should be prepared to support Chinese, Japanese, or Korean. I think it would greatly benefit the open-source community if there were a chance to add an example for any of these non-English languages!

lucasjinreal avatar May 09 '24 11:05 lucasjinreal

@lucasjinreal, we need 3 things:

  1. A moderately clean dataset of audio samples and their transcriptions, with audio samples from roughly 5s to 30s
  2. Adapting the DataSpeech recipe to the language(s) of this dataset and running it on the dataset
  3. Deciding on a prompt tokenizer adapted to these languages (and English if you want to mix languages)
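As a rough illustration of the length requirement in step 1, the sketch below keeps only samples between 5 and 30 seconds. The sample layout (a dict with `num_samples`, `sampling_rate` and `text` keys) and the function names are assumptions made for this example, not the schema of any particular dataset.

```python
from typing import Dict, Iterable, List


def duration_seconds(sample: Dict) -> float:
    """Duration of one sample, from its frame count and sampling rate."""
    return sample["num_samples"] / sample["sampling_rate"]


def filter_by_duration(samples: Iterable[Dict],
                       min_s: float = 5.0,
                       max_s: float = 30.0) -> List[Dict]:
    """Keep samples that have a non-empty transcription and a duration in [min_s, max_s]."""
    return [
        s for s in samples
        if s["text"].strip() and min_s <= duration_seconds(s) <= max_s
    ]


# Toy metadata standing in for a real dataset (hypothetical values).
samples = [
    {"text": "too short", "num_samples": 16_000 * 2, "sampling_rate": 16_000},
    {"text": "about right", "num_samples": 16_000 * 12, "sampling_rate": 16_000},
    {"text": "too long", "num_samples": 16_000 * 45, "sampling_rate": 16_000},
]
kept = filter_by_duration(samples)
print([s["text"] for s in kept])
```

Since the lower bound is only approximate ("5s-ish"), `min_s` is left as a parameter to relax if needed.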

Let me know if that helps!

Note that step 2 should be quite achievable. We mainly need to find phonemizer(s) suited to the languages in order to compute the speaking rate.
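To make the role of the phonemizer concrete, here is a minimal sketch that assumes the speaking rate is measured as phonemes per second. The `phonemize` callable is a placeholder for whatever language-specific phonemizer is chosen, and `toy_phonemize` is a deliberately naive stand-in used only so the example runs. This is not the DataSpeech implementation.

```python
from typing import Callable, Sequence


def speaking_rate(text: str,
                  duration_s: float,
                  phonemize: Callable[[str], Sequence[str]]) -> float:
    """Phonemes per second for one utterance (assumed definition of speaking rate)."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return len(phonemize(text)) / duration_s


def toy_phonemize(text: str) -> Sequence[str]:
    """Naive stand-in: treats every letter as one phoneme. Not linguistically valid."""
    return [ch for ch in text.lower() if ch.isalpha()]


rate = speaking_rate("Hello world", 2.0, toy_phonemize)
print(rate)
```

Supporting a new language would then mostly mean swapping `toy_phonemize` for a real phonemizer for that language while leaving the rest of the computation unchanged.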

ylacombe avatar May 09 '24 12:05 ylacombe

Have you been able to get this working @lucasjinreal ?

ylacombe avatar May 27 '24 09:05 ylacombe

Hi, I haven't had time to try this recently. Has there been any progress on the Parler side toward supporting more languages?

lucasjinreal avatar May 27 '24 14:05 lucasjinreal