hifi-gan
Pause between sentences
This is great work, and it produces really high-quality voice. But it seems to work best for a single sentence: when I try to generate multiple sentences, there is no pause between them. Could you advise how to make it pause between sentences? Thanks.
What model do you use to convert the input sentences into mel spectrograms? I suspect this is the part of your pipeline that fails to add pauses between sentences.
HiFi-GAN is a vocoder: it only converts mel spectrograms into audio. The mel spectrogram itself has to contain a pause in the right place for the vocoder to render several sentences properly.
In any case, my first advice is to inspect the mel spectrogram visually and check whether it has a long enough pause between sentences.
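If a visual check is inconvenient, a rough numeric check can do the same job. This is only a sketch: `find_pauses` is a hypothetical helper, it assumes a log-mel array of shape `(n_mels, n_frames)`, and the `hop_length=256` and `sample_rate=22050` defaults are assumptions that should be replaced with the values from your own config. The `threshold` also depends on how your acoustic model scales its output, so pick it by looking at the values in a known-silent region.

```python
import numpy as np

def find_pauses(mel, threshold, hop_length=256, sample_rate=22050, min_pause_s=0.2):
    """Return (start_s, end_s) pairs for runs of quiet frames in a log-mel array.

    mel is assumed to have shape (n_mels, n_frames); a frame counts as quiet
    when its mean value over the mel bins is below `threshold`.
    """
    mel = np.asarray(mel)
    quiet = mel.mean(axis=0) < threshold

    # Pad with False so runs touching either edge still get a start and an end.
    padded = np.concatenate(([False], quiet, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = changes[0::2], changes[1::2]  # end index is exclusive

    frame_s = hop_length / sample_rate  # duration of one mel frame in seconds
    return [
        (float(s * frame_s), float(e * frame_s))
        for s, e in zip(starts, ends)
        if (e - s) * frame_s >= min_pause_s
    ]
```

If this returns nothing around the sentence boundary, the pause is already missing in the mel spectrogram, before the vocoder ever sees it.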
Hi @evrrn, thanks for your reply! Is there any way to add some markup to the text so that the model recognises it and pauses for, let's say, 0.5 seconds between sentences? Thanks.
As I mentioned, it depends on the model you use to obtain the mel spectrogram. I guess you are using TransformerTTS, so the pause issue depends only on its implementation. It's not a HiFi-GAN problem.
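One possible workaround that does not rely on text markup at all: synthesize each sentence separately and join the resulting waveforms with silence in between. The sketch below is not part of HiFi-GAN or TransformerTTS; `join_with_pauses` is a hypothetical helper, and it assumes each sentence is already a 1-D float waveform at the same sample rate (22050 Hz is an assumption here).

```python
import numpy as np

def join_with_pauses(waveforms, sample_rate=22050, pause_s=0.5):
    """Concatenate 1-D waveforms, inserting `pause_s` seconds of silence between them."""
    silence = np.zeros(int(round(sample_rate * pause_s)), dtype=np.float32)
    pieces = []
    for i, wav in enumerate(waveforms):
        if i > 0:
            pieces.append(silence)  # silence only between sentences, not at the ends
        pieces.append(np.asarray(wav, dtype=np.float32))
    if not pieces:
        return np.zeros(0, dtype=np.float32)
    return np.concatenate(pieces)
```

Usage would be something like `audio = join_with_pauses([wav_sentence_1, wav_sentence_2], pause_s=0.5)`, where each `wav_sentence_*` is the vocoder output for one sentence.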