mfa-models

G2P: mandarin_pinyin_g2p.zip ignores repeated tokens

Open liubc-ai opened this issue 1 year ago • 0 comments

Hi, I found that when I use mandarin_pinyin_g2p.zip to extract pinyin phonemes, repeated tokens are ignored. How can I avoid this?

Example: shi4 yi1 jia1 zhi4 yao4 gong1 si1 de5 duan3 qi1 gong1

Expected result: sh ii4 i1 j ia1 zh ii4 iao4 g o1 ng s ii1 d e5 d ua3 n q i1 g o1 ng

Actual result (the phonemes for the second gong1 are missing): sh ii4 i1 j ia1 zh ii4 iao4 g o1 ng s ii1 d e5 d ua3 n q i1
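A possible workaround, sketched under the assumption that the G2P output contains one pronunciation per distinct token (which would explain why the second gong1 disappears): treat the output as a lookup table and expand the original token sequence against it. The `lexicon` mapping and the `expand_pronunciations` helper below are hypothetical names for illustration, not part of MFA; the pronunciations are taken from the expected result above.

```python
# Hypothetical helper: rebuild the full phoneme sequence from a
# per-token pronunciation table, so repeated tokens are preserved.
def expand_pronunciations(tokens, lexicon):
    phones = []
    for token in tokens:
        # Each distinct token appears once in the table; look it up
        # every time it occurs in the sentence.
        phones.extend(lexicon[token].split())
    return phones


# Assumed shape of the G2P output: one entry per distinct token.
lexicon = {
    "shi4": "sh ii4",
    "yi1": "i1",
    "jia1": "j ia1",
    "zhi4": "zh ii4",
    "yao4": "iao4",
    "gong1": "g o1 ng",
    "si1": "s ii1",
    "de5": "d e5",
    "duan3": "d ua3 n",
    "qi1": "q i1",
}

sentence = "shi4 yi1 jia1 zhi4 yao4 gong1 si1 de5 duan3 qi1 gong1"
result = " ".join(expand_pronunciations(sentence.split(), lexicon))
print(result)
```

With this lookup, the trailing gong1 is expanded again to g o1 ng, so the printed sequence matches the expected result.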

Looking forward to your reply.

liubc-ai, Nov 22 '23 02:11