🔍 TTS Text Frontend Issue Roundup (Text Frontend Bugs)

Open yt605155624 opened this issue 1 year ago • 25 comments

Please report TTS text frontend bugs here, for example: text normalization, polyphones, tone sandhi, etc.

We encourage developers to solve these problems.

  1. polyphone: 能说多长(zhang3 ❎)的语音呢?是否可以长(zhang3 ❎)语音合成呢?长(chang2 ✅)语音,长(zhang3 ❎)文本 -> fixed

yt605155624 avatar Jul 28 '22 02:07 yt605155624

  • 教教(jiao1)(jiao)我好不好! was read as (jiao4) -> fixed
  • 哈哈哈(❌)-> fixed

LiuChiachi avatar Jul 28 '22 09:07 LiuChiachi

  1. 干嘛(❎) -> fixed
  2. 你像夏至的分界线,是我终身里最长(❎)的那个白昼!夸夸你!-> fixed
  3. 我今天写了两行(❎)代码 -> fixed
  4. 媳妇儿 (erhua) -> fixed
  5. 小数(❎)点 -> fixed
  6. 哈哈哈哈哈哈哈-> fixed
  7. 学子 (❎ no neutral-tone sandhi needed) -> fixed
  8. 向阳中学是一所有比较长(❎)的历史的中学校 -> pronunciation of "长" fixed; the word segmentation around "所有" is still wrong
  9. 咕呱(gu1 ❎)(gua? ✅)-> fixed

yt605155624 avatar Jul 28 '22 13:07 yt605155624

https://github.com/PaddlePaddle/PaddleSpeech/issues/2206

yt605155624 avatar Jul 30 '22 00:07 yt605155624

you can try g2pw.

睡得着觉? G2pM: ['shui4', 'de2', 'zhe5', 'jue2', '?']
睡得着觉? lazy_pinyin: ['shui4', 'de2', 'zhe', 'jue2', '?']
睡得着觉? G2pW: [['shui4', 'de5', 'zhao2', 'jiao4', None]]

小数点 G2pM: ['xiao3', 'shu4', 'dian3']
小数点 lazy_pinyin: ['xiao3', 'shu3', 'dian3']
小数点 G2pW: [['xiao3', 'shu4', 'dian3']]

干嘛? G2pM: ['gan1', 'ma5', '?']
干嘛? lazy_pinyin: ['gan4', 'ma', '?']
干嘛? G2pW: [['gan4', 'ma2', None]]

我今天写了两行代码 G2pM: ['wo3', 'jin1', 'tian1', 'xie3', 'le5', 'liang3', 'xing2', 'dai4', 'ma3']
我今天写了两行代码 lazy_pinyin: ['wo3', 'jin1', 'tian1', 'xie3', 'le', 'liang3', 'xing2', 'dai4', 'ma3']
我今天写了两行代码 G2pW: [['wo3', 'jin1', 'tian1', 'xie3', 'le5', 'liang3', 'hang2', 'dai4', 'ma3']]

教教我好不好! G2pM: ['jiao4', 'jiao4', 'wo3', 'hao3', 'bu4', 'hao3', '!']
教教我好不好! lazy_pinyin: ['jiao4', 'jiao4', 'wo3', 'hao3', 'bu4', 'hao3', '!']
教教我好不好! G2pW: [['jiao1', 'jiao1', 'wo3', 'hao3', 'bu4', 'hao3', None]]

能说多长的语音呢?是否可以长语音合成呢 G2pM: ['neng2', 'shuo1', 'duo1', 'zhang3', 'de5', 'yu3', 'yin1', 'ne5', '?', 'shi4', 'fou3', 'ke3', 'yi3', 'chang2', 'yu3', 'yin1', 'he2', 'cheng2', 'ne5']
能说多长的语音呢?是否可以长语音合成呢 lazy_pinyin: ['neng2', 'shuo1', 'duo1', 'zhang3', 'de', 'yu3', 'yin1', 'ne', '?', 'shi4', 'fou3', 'ke3', 'yi3', 'zhang3', 'yu3', 'yin1', 'he2', 'cheng2', 'ne']
能说多长的语音呢?是否可以长语音合成呢 G2pW: [['neng2', 'shuo1', 'duo1', 'chang2', 'de5', 'yu3', 'yin1', 'ne5', None, 'shi4', 'fou3', 'ke3', 'yi3', 'chang2', 'yu3', 'yin1', 'he2', 'cheng2', 'ne5']]
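To make comparisons like the ones above easier to read, the per-character disagreements between two tools can be listed mechanically. The following is only a sketch over outputs copied from this comment; `normalize` and `disagreements` are hypothetical helper names, and the normalization rules (a missing tone digit treated as neutral tone 5, punctuation and `None` treated as equal) are assumptions made for the comparison, not behavior of any of the three libraries.

```python
from typing import List, Optional, Tuple


def normalize(syllable: Optional[str]) -> Optional[str]:
    """Map tool-specific conventions onto one form (assumed rules).

    Punctuation and None both become None; a syllable without a
    trailing tone digit is treated as neutral tone '5'.
    """
    if not syllable or not syllable[0].isalpha():
        return None
    return syllable if syllable[-1].isdigit() else syllable + "5"


def disagreements(
    text: str, a: List[Optional[str]], b: List[Optional[str]]
) -> List[Tuple[str, Optional[str], Optional[str]]]:
    """Return (character, reading_a, reading_b) wherever the tools differ."""
    return [
        (ch, x, y)
        for ch, x, y in zip(text, a, b)
        if normalize(x) != normalize(y)
    ]


# Outputs copied from the comparison in this thread.
text = "我今天写了两行代码"
g2pm = ['wo3', 'jin1', 'tian1', 'xie3', 'le5', 'liang3', 'xing2', 'dai4', 'ma3']
lazy = ['wo3', 'jin1', 'tian1', 'xie3', 'le', 'liang3', 'xing2', 'dai4', 'ma3']
g2pw = ['wo3', 'jin1', 'tian1', 'xie3', 'le5', 'liang3', 'hang2', 'dai4', 'ma3']

print(disagreements(text, g2pm, g2pw))  # [('行', 'xing2', 'hang2')]
print(disagreements(text, g2pm, lazy))  # []
```

Run on this sentence, the only difference between G2pM and G2pW is the polyphone 行, which is the character reported as wrong earlier in the thread.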

BarryKCL avatar Aug 02 '22 07:08 BarryKCL

We are very much looking forward to you adding g2pW to PaddleSpeech TTS through this PR: https://github.com/PaddlePaddle/PaddleSpeech/pull/2221

yt605155624 avatar Aug 04 '22 08:08 yt605155624

踩一踩一踩 -> fixed

yt605155624 avatar Aug 17 '22 05:08 yt605155624

TN:一共有1兆320万5000人 => 一共有一兆三百二十万五零零零人

pengzhendong avatar Aug 17 '22 05:08 pengzhendong

TN:一共有1兆320万5000人 => 一共有一兆三百二十万五零零零人

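Currently, a number is classified as a quantity or as an ID (read digit by digit) by checking whether it is followed by one of a designated set of units, so adding "人" here should solve the problem. Developers are welcome to submit a PR to fix it~ https://github.com/PaddlePaddle/PaddleSpeech/blob/7cc1d66863a48b50c2430059c8b84060d84b11a3/paddlespeech/t2s/frontend/zh_normalization/num.py#L31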
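The unit-lookahead rule described here can be illustrated with a small standalone sketch. This is not the code in `num.py`: the function names, the unit set, and the 4-digit limit are all assumptions made for illustration, and the real frontend handles many more cases (decimals, dates, phone numbers and so on).

```python
import re

DIGITS = "零一二三四五六七八九"
PLACES = ["", "十", "百", "千"]  # ones, tens, hundreds, thousands


def cardinal(n: int) -> str:
    """Verbalize 0 <= n <= 9999 as a Chinese cardinal, e.g. 320 -> 三百二十."""
    if n == 0:
        return "零"
    s = str(n)
    out = []
    pending_zero = False
    for i, ch in enumerate(s):
        d = int(ch)
        if d == 0:
            pending_zero = True  # emit a single 零 only if a nonzero digit follows
            continue
        if pending_zero:
            out.append("零")
            pending_zero = False
        out.append(DIGITS[d] + PLACES[len(s) - 1 - i])
    return "".join(out)


def digit_by_digit(run: str) -> str:
    """Read a digit run as an ID, one digit at a time."""
    return "".join(DIGITS[int(c)] for c in run)


def normalize_numbers(text: str, units: frozenset) -> str:
    """Read a digit run as a quantity only if a known unit follows it."""

    def repl(m: "re.Match") -> str:
        run = m.group()
        following = text[m.end():m.end() + 1]
        if following in units and len(run) <= 4:  # assumed limit of this sketch
            return cardinal(int(run))
        return digit_by_digit(run)

    return re.sub(r"[0-9]+", repl, text)


units = frozenset("兆万亿个所")  # hypothetical unit set without "人"
text = "一共有1兆320万5000人"
print(normalize_numbers(text, units))          # 一共有一兆三百二十万五零零零人
print(normalize_numbers(text, units | {"人"}))  # 一共有一兆三百二十万五千人
```

With "人" missing from the unit set, the sketch reproduces the reported reading 五零零零; once "人" is added, 5000 is read as a quantity.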

yt605155624 avatar Aug 26 '22 02:08 yt605155624

fixed by https://github.com/PaddlePaddle/PaddleSpeech/pull/2308

yt605155624 avatar Aug 26 '22 03:08 yt605155624

  • https://github.com/PaddlePaddle/PaddleSpeech/issues/2566

yt605155624 avatar Oct 21 '22 08:10 yt605155624

  • https://github.com/PaddlePaddle/PaddleSpeech/issues/2571

yt605155624 avatar Oct 24 '22 02:10 yt605155624

For the character "嗯", lazy_pinyin returns an empty result.

HandsLing avatar Oct 26 '22 09:10 HandsLing

@yt605155624

HandsLing avatar Oct 27 '22 23:10 HandsLing

  • https://github.com/PaddlePaddle/PaddleSpeech/issues/2601 -> fixed by https://github.com/PaddlePaddle/PaddleSpeech/pull/2629

yt605155624 avatar Oct 31 '22 13:10 yt605155624

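@HandsLing Search for this problem in the pypinyin issues; it is version-related, because they made a backward-incompatible upgrade. For the pypinyin version we depend on, see https://github.com/PaddlePaddle/PaddleSpeech/blob/8ea289a2517aa842ff4c7797f382832cfe13b187/setup.py#L55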

yt605155624 avatar Oct 31 '22 13:10 yt605155624

  • https://github.com/PaddlePaddle/PaddleSpeech/issues/2603 -> fixed by https://github.com/PaddlePaddle/PaddleSpeech/pull/2606

yt605155624 avatar Nov 01 '22 02:11 yt605155624

"种点薄荷" is pronounced incorrectly.

yt605155624 avatar Nov 03 '22 12:11 yt605155624

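model_alias = {
    # acoustic model
    "fastspeech2": "paddlespeech.t2s.models.fastspeech2:FastSpeech2",
    "fastspeech2_inference": "paddlespeech.t2s.models.fastspeech2:StyleFastSpeech2Inference",
    # voc
    "pwgan": "paddlespeech.t2s.models.parallel_wavegan:PWGGenerator",
    "pwgan_inference": "paddlespeech.t2s.models.parallel_wavegan:PWGInference",
}

I am using fastspeech2_mix and pwgan_aishell3 from my custom-trained voice. How should the bold part above be changed? I cannot find any documentation on it. With the code above left unchanged, the audio synthesized after loading my custom-trained voice sounds abnormal. Is that related to the fields above? It seems they should be changed to the corresponding fields.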

mogosmart avatar Nov 17 '22 16:11 mogosmart

@mogosmart That is unrelated. For using a model you trained yourself, see https://github.com/PaddlePaddle/PaddleSpeech/issues/2225
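For context, strings of the form "module.path:ClassName", as in the `model_alias` dictionary quoted above, are commonly resolved by importing the module and looking up the class by name. The sketch below shows that general pattern only; `resolve_alias` is a hypothetical helper, the standard-library targets stand in for the PaddleSpeech classes, and this is not necessarily how PaddleSpeech itself resolves its aliases.

```python
import importlib


def resolve_alias(import_path: str):
    """Resolve a 'package.module:Attribute' string to the Python object."""
    module_name, sep, attr = import_path.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module:Attribute', got {import_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# Stand-in aliases that point at standard-library classes.
demo_alias = {
    "ordered": "collections:OrderedDict",
    "fraction": "fractions:Fraction",
}

cls = resolve_alias(demo_alias["fraction"])
print(cls(1, 2))  # 1/2
```

Under this pattern an alias selects which Python class is instantiated, not which trained weights are loaded, which would be consistent with the reply that these entries do not need to change for a custom-trained voice.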

yt605155624 avatar Nov 18 '22 08:11 yt605155624

OK, I will take a look.

mogosmart avatar Nov 25 '22 13:11 mogosmart

  • https://github.com/PaddlePaddle/PaddleSpeech/issues/2720

yt605155624 avatar Dec 06 '22 03:12 yt605155624

"噢" is pronounced incorrectly: it is a polyphone in Taiwanese usage and was mispredicted -> fixed by https://github.com/PaddlePaddle/PaddleSpeech/pull/2831

yt605155624 avatar Jan 11 '23 03:01 yt605155624

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Mar 25 '23 01:03 stale[bot]

a small bug?

from paddlespeech.t2s.frontend.zh_frontend import Frontend as zhFrontend

text = '全国一共有112所211大学..'
fe = zhFrontend()
# sum(..., []) flattens the per-sentence phoneme lists into one list
print(sum(fe.get_phonemes(text), []))

# Outputs (as posted; the phonemes before 'q', 'van2' do not come from the
# string above, so the original input was presumably longer):
[全国一共有一百一十二所二幺幺大学..] not in g2pW dict,use g2pM
['j', 'ie2', 'k', 'e4', 'sp', 'n', 'i3', 'zh', 'iii1', 'd', 'ao4', 'm', 'a5', 'sp', 'q', 'van2', 'g', 'uo2', 'i2', 'g', 'ong4', 'iou3', 'i1', 'b', 'ai3',
 'i1', 'sh', 'iii2', 'er4', 's', 'uo3', 'er4', 'iao1', 'iao1', 'd', 'a4', 'x', 've2', '..', '..']

There are two '..' in the results.
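Until the cause is found, one possible workaround is to collapse adjacent duplicate punctuation tokens after flattening. This is a post-processing sketch only, not a fix in the frontend; `is_punct` and `collapse_repeated_punct` are hypothetical helpers, and treating any token with no letters or digits as punctuation is an assumption. Repeated phonemes such as 'iao1', 'iao1' are deliberately left alone.

```python
from typing import List


def is_punct(token: str) -> bool:
    """Assumed rule: a token with no letters or digits is punctuation."""
    return bool(token) and not any(c.isalnum() for c in token)


def collapse_repeated_punct(phones: List[str]) -> List[str]:
    """Drop a punctuation token when it repeats the token just before it."""
    out: List[str] = []
    for tok in phones:
        if out and tok == out[-1] and is_punct(tok):
            continue
        out.append(tok)
    return out


# Tail of the output posted above.
tail = ['er4', 'iao1', 'iao1', 'd', 'a4', 'x', 've2', '..', '..']
print(collapse_repeated_punct(tail))
# ['er4', 'iao1', 'iao1', 'd', 'a4', 'x', 've2', '..']
```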

QinlongHuang avatar May 17 '23 06:05 QinlongHuang

How was this fix implemented?

zhuqn avatar Apr 15 '24 02:04 zhuqn