Does v1.4 model respect punctuation?

Open hoveychen opened this issue 1 year ago • 3 comments

Self Checks

  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • [X] Please do not modify this template :) and fill in all the required fields.

1. Is this request related to a challenge you're experiencing? Tell me about your story.

We've tried synthesizing speech from both English and Chinese dialog scripts, and the model seems to ignore punctuation such as question marks: the sentences below are not read with a questioning intonation.

Example:

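  • 他为什么要劈腿?我到底哪里不好? (English: "Why did he cheat on me? What exactly is wrong with me?")
  • 我男朋友在和别的女人在一起,真人秀? (English: "My boyfriend is with another woman, a reality show?")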
  • Do you hear the whispers of the planets, the gentle hum of the galaxies far beyond our reach?

2. Additional context or comments

No response

3. Can you help us with this feature?

  • [X] I am interested in contributing to this feature.

hoveychen avatar Sep 13 '24 10:09 hoveychen

I found that if a sentence contains a semicolon (;), the entire output turns into noise.

czkoko avatar Sep 13 '24 11:09 czkoko

The training data was processed by a punctuation normalizer, so you should apply the same normalization to the input text at inference.

Stardust-minus avatar Sep 13 '24 12:09 Stardust-minus

Not sure why, but technically punctuation marks are normalized here: https://github.com/fishaudio/fish-speech/blob/c7c8c943c966a03a85ce4a61bca605f1d9bf7567/fish_speech/text/clean.py#L28 We are fine-tuning the model so that it understands different punctuation marks.
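
For readers who want to pre-normalize text before sending it to the model, here is a minimal sketch of what such a punctuation normalizer might look like. It is not the code in `clean.py`: the mapping table and the function name `normalize_punctuation` are hypothetical, and folding semicolons into commas is only an assumption, motivated by the semicolon report above.

```python
import re

# Hypothetical mapping, for illustration only. The real table in
# fish_speech/text/clean.py may differ.
PUNCT_MAP = {
    "?": "?",  # full-width question mark
    "!": "!",  # full-width exclamation mark
    ",": ",",  # full-width comma
    "。": ".",  # ideographic full stop
    "、": ",",  # ideographic comma
    ";": ",",  # full-width semicolon (assumed to be folded into a comma)
    ";": ",",   # ASCII semicolon (assumed to be folded into a comma)
}

# One regex that matches any key of the mapping.
_PUNCT_RE = re.compile("|".join(re.escape(p) for p in PUNCT_MAP))


def normalize_punctuation(text: str) -> str:
    """Replace mapped punctuation marks and collapse runs of whitespace."""
    text = _PUNCT_RE.sub(lambda m: PUNCT_MAP[m.group()], text)
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(normalize_punctuation("他为什么要劈腿?我到底哪里不好?"))
    print(normalize_punctuation("Wait; what did you say?"))
```

Whether a given mark survives normalization is a separate question from whether the model changes its intonation for it; the sketch only addresses the first.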

leng-yue avatar Sep 13 '24 23:09 leng-yue

Closed, as the issue has been resolved.

PoTaTo-Mika avatar Jan 01 '25 10:01 PoTaTo-Mika