Subtitles generated from Chinese text always break at strange positions. A word that should appear within a single subtitle is instead split across two subtitles. I don't know how the --write-subtitles option segments the text. I replaced the Chinese punctuation marks with English ones, but the result is the same.
Hello, can you try this branch for me? https://github.com/rany2/edge-tts/tree/wip-subtitles
Do I need to reinstall this branch of edge-tts? I tried the command in the cmd window, but the result was unsatisfactory:
edge-tts --voice zh-CN-YunxiNeural --text "萧漪气得粉拳挥舞,二师兄,你的嘴巴也太讨厌了.谁痴呆,谁脑残了.真是的,二师兄,你这样子,没有女孩子会喜欢你.吕少卿满脸不屑,爱情什么的,有灵石重要吗?谁要女孩子喜欢了?麻烦!" --write-media D:\hello.mp3 --write-subtitles D:\hello.vtt
The subtitles I generated are as follows. I've extracted the first few sentences:
WEBVTT
00:00:00.100 --> 00:00:03.000
萧漪 气 得 粉 拳 挥舞 二师兄 你 的 嘴巴
00:00:03.025 --> 00:00:06.638
也 太 讨厌 了 谁 痴呆 谁 脑残 了 真是的
00:00:06.912 --> 00:00:11.062
二师兄 你 这 样子 没有 女孩子 会 喜欢 你 吕少卿
00:00:11.062 --> 00:00:14.325
满 脸 不屑 爱情 什么的 有 灵石 重要 吗 谁
00:00:14.325 --> 00:00:15.900
要 女孩子 喜欢 了 麻烦
========================================
If the generated subtitles are segmented as shown below, it's considered quite good:
1 00:00:00,100 --> 00:00:01,525 萧漪气得粉拳挥舞
2 00:00:01,825 --> 00:00:02,437 二师兄
3 00:00:02,799 --> 00:00:04,299 你的嘴巴也太讨厌了
4 00:00:04,862 --> 00:00:05,587 谁痴呆
5 00:00:05,900 --> 00:00:06,862 谁脑残了
6 00:00:07,412 --> 00:00:08,087 真是的
7 00:00:08,349 --> 00:00:08,900 二师兄
8 00:00:09,262 --> 00:00:10,175 你这样子
9 00:00:10,500 --> 00:00:11,637 没有女孩子会喜欢你
10 00:00:12,199 --> 00:00:13,449 吕少卿满脸不屑
11 00:00:13,750 --> 00:00:15,937 爱情什么的有灵石重要吗
12 00:00:16,500 --> 00:00:17,699 谁要女孩子喜欢了
13 00:00:18,262 --> 00:00:18,837 麻烦
You need to reinstall from that branch. I believe you didn't do so, which is why the behaviour stayed the same.
After testing, I found that the subtitle format did change, and it even retained punctuation marks. However, I'm not sure where the problem lies because the subtitles didn't start a new line at punctuation marks. Generally, it's more reasonable for subtitles to move to the next line at pauses in the audio to avoid making a subtitle too long and cluttered. Was it supposed to be like this originally?
The following is the subtitle I obtained using the command (edge-tts --voice zh-CN-YunxiNeural --text "萧漪气得粉拳挥舞,二师兄,你的嘴巴也太讨厌了。谁痴呆,谁脑残了。真是的,二师兄,你这样子,没有女孩子会喜欢你。吕少卿满脸不屑,爱情什么的,有灵石重要吗?谁要女孩子喜欢了?麻烦!" --write-media D:\hello.mp3 --write-subtitles D:\hello.vtt):
1 00:00:00,100 --> 00:00:03,525 萧漪气得粉拳挥舞,二师兄,你的嘴巴
2 00:00:03,550 --> 00:00:08,250 也太讨厌了。谁痴呆,谁脑残了。真是的
3 00:00:08,550 --> 00:00:13,112 二师兄,你这样子,没有女孩子会喜欢你。吕少卿
4 00:00:13,137 --> 00:00:17,212 满脸不屑,爱情什么的,有灵石重要吗?谁
5 00:00:17,212 --> 00:00:19,300 要女孩子喜欢了?麻烦
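To make the request above concrete, here is a minimal sketch of breaking cues at punctuation instead of after a fixed number of words. It does not call edge-tts at all and is not how edge-tts works internally: `Word`, `split_at_punctuation`, `srt_timestamp`, and `to_srt` are hypothetical names, and the per-word timings in `SAMPLE_WORDS` are invented for illustration. It assumes you already have a list of timed words (for example, collected from word boundary events) plus the original input text, and it uses the original text only to find out which words are followed by punctuation.

```python
from typing import List, NamedTuple, Tuple

# Punctuation that should end a cue (full-width and ASCII forms).
PUNCTUATION = set(",。!?;:、,.!?;:")


class Word(NamedTuple):
    text: str
    start_ms: int  # start time in milliseconds
    end_ms: int    # end time in milliseconds


def split_at_punctuation(text: str, words: List[Word]) -> List[Tuple[int, int, str]]:
    """Group timed words into (start_ms, end_ms, text) cues.

    A cue is closed whenever the word just consumed is immediately
    followed by a punctuation mark in the original text.
    """
    cues: List[Tuple[int, int, str]] = []
    current: List[Word] = []
    pos = 0  # cursor into the original text
    for word in words:
        current.append(word)
        idx = text.find(word.text, pos)
        if idx == -1:
            # Word not found in the text: keep it, but do not move the cursor.
            continue
        pos = idx + len(word.text)
        if pos < len(text) and text[pos] in PUNCTUATION:
            cues.append((current[0].start_ms, current[-1].end_ms,
                         "".join(w.text for w in current)))
            current = []
    if current:  # trailing words with no closing punctuation
        cues.append((current[0].start_ms, current[-1].end_ms,
                     "".join(w.text for w in current)))
    return cues


def srt_timestamp(ms: int) -> str:
    """Format integer milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02}:{minutes:02}:{seconds:02},{millis:03}"


def to_srt(cues: List[Tuple[int, int, str]]) -> str:
    """Render cues as SRT blocks separated by blank lines."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)


# Illustrative data only: the per-word timings are made up.
SAMPLE_TEXT = "萧漪气得粉拳挥舞,二师兄,你的嘴巴也太讨厌了。"
SAMPLE_WORDS = [
    Word("萧漪", 100, 400), Word("气", 400, 600), Word("得", 600, 750),
    Word("粉", 750, 950), Word("拳", 950, 1150), Word("挥舞", 1150, 1525),
    Word("二师兄", 1825, 2437),
    Word("你", 2799, 2950), Word("的", 2950, 3050), Word("嘴巴", 3050, 3400),
    Word("也", 3400, 3550), Word("太", 3550, 3700), Word("讨厌", 3700, 4100),
    Word("了", 4100, 4299),
]

if __name__ == "__main__":
    print(to_srt(split_at_punctuation(SAMPLE_TEXT, SAMPLE_WORDS)))
```

With the sample data this groups the words into three cues, one per clause, which is the kind of segmentation I was hoping for. A real implementation would probably also need a maximum cue length for long clauses that contain no punctuation.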
Makes sense, thanks for the feedback.
Hi rany2, IIUC you are generating the subtitles from results with wordBoundaryEnabled. Have you tried generating them with sentenceBoundaryEnabled instead? That looks like a promising way to solve the issue described above.
Unfortunately, sentence boundary has many bugs when proper punctuation isn't provided, and it is deprecated by Microsoft themselves.
https://juejin.cn/post/7368637177428426752 The code in this post more or less solves it, but the sentence segmentation still isn't perfect.
Has this issue been resolved? I still have the same problem after using the latest version.