
Chinese synthetic speech pacing is very fast

Open zyb8543d opened this issue 7 years ago • 6 comments

The synthetic speech pacing is very fast. Attachment: B11_2.zip

zyb8543d avatar Jun 05 '18 03:06 zyb8543d

For Chinese, does anyone get the same test result?

zyb8543d avatar Jul 04 '18 04:07 zyb8543d

How about your training data speed?

candlewill avatar Jul 04 '18 08:07 candlewill

My training data speed is normal. When I use the training data, which is generated by forced alignment, to generate the wave, its speed is normal.

When I use the test data, which is generated by the front end, to generate the wave, its speed is fast.

I find that the training data and test data have different lab files for the same text.

zyb8543d avatar Jul 05 '18 09:07 zyb8543d
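One way to pin down where the two lab files for the same text diverge is to drop the timestamps and diff the context labels. The following is only a minimal sketch: `read_labels` and `diff_labels` are hypothetical helper names, and it assumes each non-empty line ends with the full-context label, as in the lab lines quoted in this thread.

```python
import difflib


def read_labels(lines):
    """Return the context label (last whitespace-separated field) of each non-empty line."""
    return [line.split()[-1] for line in lines if line.strip()]


def diff_labels(train_lines, test_lines):
    """Return (tag, train_labels, test_labels) for every region where the files differ."""
    a = read_labels(train_lines)
    b = read_labels(test_lines)
    # autojunk=False keeps frequently repeated labels from being treated as noise.
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

To use it, pass the lines of the forced-alignment lab file and of the front-end lab file (for example `open(path).read().splitlines()`); an empty result means the label sequences match and only the timing differs.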

There's something wrong with the lab files in the downloaded training data. Use the front end to generate new lab files.

jiangyibin avatar Jul 24 '18 15:07 jiangyibin

Cannot hear anything.

shartoo avatar Aug 14 '18 07:08 shartoo

@v-yunbin A Chinese front-end tool can generate a lab file like this:

0 0 a4^k-uai4+w=uen2@/A:4-4^2@/B:7+2@2^3^2+9#2-9-/C:n_n^u#0+1+0&/D:xx=10!xx@1-1&/E:xx|10-xx@xx#1&xx!1-1#/F:xx^10=17_1-1!
0 0 k^uai4-w+uen2=zh@/A:4-2^1@/B:8+1@3^2^3+8#3-8-/C:n_n^u#0+1+0&/D:xx=10!xx@1-1&/E:xx|10-xx@xx#1&xx!1-1#/F:xx^10=17_1-1!

Namely, there is no duration information before the phone label. Can this kind of label file be fed into Merlin to synthesize voice? Or, what further steps should I take to fill in the durations ahead of the phone information? A real lab file may look like this:

26500000 27500000 a4^k-uai4+w=uen2@/A:4-4^2@/B:7+2@2^3^2+9#2-9-/C:n_n^u#0+1+0&/D:xx=10!xx@1-1&/E:xx|10-xx@xx#1&xx!1-1#/F:xx^10=17_1-1!
27500000 28300000 k^uai4-w+uen2=zh@/A:4-2^1@/B:8+1@3^2^3+8#3-8-/C:n_n^u#0+1+0&/D:xx=10!xx@1-1&/E:xx|10-xx@xx#1&xx!1-1#/F:xx^10=17_1-1!

shartoo avatar Aug 24 '18 00:08 shartoo
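To tell the two kinds of lab file apart programmatically, the start and end fields can be parsed and converted to seconds. This is a rough sketch rather than anything from Merlin itself: the function names are hypothetical, and it assumes the timestamps use the HTK convention of 100 ns units, so `26500000` corresponds to 2.65 s.

```python
HTK_UNITS_PER_SECOND = 10_000_000  # assumption: timestamps are in 100 ns units


def parse_label_line(line):
    """Split one lab line into (start, end, label)."""
    start, end, label = line.split(None, 2)
    return int(start), int(end), label.strip()


def phone_durations(lines):
    """Return a list of (label, duration_in_seconds) for each non-empty line."""
    result = []
    for line in lines:
        if not line.strip():
            continue
        start, end, label = parse_label_line(line)
        result.append((label, (end - start) / HTK_UNITS_PER_SECOND))
    return result


def has_timing(lines):
    """True if at least one phone has a non-zero duration."""
    return any(duration > 0 for _, duration in phone_durations(lines))
```

With the examples above, the `0 0 ...` lines give `has_timing(...) == False`, while the aligned lines give phone durations of 0.1 s and 0.08 s.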