How do I remove blank lines from VTT subtitles?
WEBVTT

00:00:00.086 --> 00:00:00.961

xxxxx

00:00:01.166 --> 00:00:02.586

xxxxx
There's always a blank line between the timing line and the subtitle text.
I don't understand. Is it doing something different from https://developer.mozilla.org/en-US/docs/Web/API/WebVTT_API#webvtt_files ?
Strange, the first line in my VTT subtitles is always a blank line
WEBVTT

00:00:00.086 --> 00:00:00.961

xxxxx

00:00:01.166 --> 00:00:02.586

xxxxx
Normally it should be like this:
WEBVTT

00:00:00.086 --> 00:00:00.961
xxxxx

00:00:01.166 --> 00:00:02.586
xxxxx

And mine is this:
WEBVTT

00:00:00.086 --> 00:00:00.961

xxxxx

00:00:01.166 --> 00:00:02.586

xxxxx
I have an internal version of edge-tts which has many subtitle fixes (especially noticeable for Chinese) and uses pysrt for subtitle generation, so this issue should be fixed there. But I never had this issue in the first place, so :/
Is a new version coming soon? I'm hoping it will generate SRT directly.
If you're keen, you could test it out. I pushed my WIP branch: https://github.com/rany2/edge-tts/tree/wip-subtitles
It needs to be simplified a bit more before it's ready; right now it's more of a bodge and a proof of concept. It's not ready for master yet because the TTS service can rewrite the input text and then return the rewritten text in the word boundary data, which breaks the mapping back to the original input.
For example, if you ask the TTS service to synthesize "1k.m.", the service rewrites it internally as "1 kilometer" and the mapping fails. I've attempted to fix such issues, but it's still a WIP.
Passing newline="\n" to open(...) in the with open(...) as file: statement fixed the issue on my Windows device. It seems to be a difference in newline handling between Linux and Windows.
https://stackoverflow.com/questions/9184107/how-can-i-force-pythons-file-write-to-use-the-same-newline-format-in-windows
Line 31 in the async subtitle example should be adjusted accordingly.
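To illustrate why the newline argument matters, here is a minimal sketch. It assumes the generated subtitle text already contains "\r\n" line endings (SAMPLE_VTT is a hypothetical stand-in for what the subtitle generator returns, and write_and_read_back is a helper made up for this demo). Opening the file with newline="\r\n" mimics the default text-mode behaviour on Windows so the effect can be reproduced on any platform.

```python
import os
import tempfile

# Hypothetical sample: assumes the generated text already uses CRLF endings.
SAMPLE_VTT = "WEBVTT\r\n\r\n00:00:00.086 --> 00:00:00.961\r\nxxxxx\r\n\r\n"


def write_and_read_back(path, text, newline):
    """Write text in text mode with the given newline setting, return raw bytes."""
    with open(path, "w", encoding="utf-8", newline=newline) as file:
        file.write(text)
    with open(path, "rb") as file:
        return file.read()


path = os.path.join(tempfile.mkdtemp(), "demo.vtt")

# newline="\r\n" translates every "\n" written into "\r\n", which is what
# the default (newline=None) does on Windows: "\r\n" becomes "\r\r\n".
windows_like = write_and_read_back(path, SAMPLE_VTT, newline="\r\n")

# newline="\n" disables translation, so the text is written unchanged.
fixed = write_and_read_back(path, SAMPLE_VTT, newline="\n")

print(b"\r\r\n" in windows_like)  # True
print(b"\r\r\n" in fixed)         # False
```

Many tools treat the stray "\r" as an extra line break, which would explain the blank line after each timing line.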
You can remove the blank line after each cue timing line once the subtitles have been written to the VTT file. In the example streaming_with_subtitles.py, add this code:
with open(WEBVTT_FILE, "w", encoding="utf-8") as file:
    file.write(submaker.generate_subs())

# Delete new lines in VTT file below cue
with open(WEBVTT_FILE, "r", encoding="utf-8") as file:
    lines = file.readlines()
with open(WEBVTT_FILE, "w", encoding="utf-8") as file:
    for line in lines:
        if "-->" in line:
            file.write(line.strip() + " ")
        else:
            file.write(line)
This lets you play the audio together with the VTT file in players such as mpv and MPC-HC. Otherwise the subtitles are not displayed, because the players consider the file invalid due to its incorrect format.
Ideally, this should also be fixed in the CLI.
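As a possible alternative to rewriting the file after the fact, the same cleanup could be done in memory before writing. This is only a sketch: drop_blank_after_timing is a hypothetical helper, not part of edge-tts, and it assumes that any blank line directly after a line containing "-->" is unwanted.

```python
def drop_blank_after_timing(vtt_text):
    """Remove a blank line that directly follows a cue timing line."""
    # Normalise CRLF and stray CR characters to plain LF first.
    text = vtt_text.replace("\r\n", "\n").replace("\r", "\n")
    cleaned = []
    previous_was_timing = False
    for line in text.split("\n"):
        if previous_was_timing and line.strip() == "":
            # Skip only the first blank line after a timing line.
            previous_was_timing = False
            continue
        cleaned.append(line)
        previous_was_timing = "-->" in line
    return "\n".join(cleaned)


broken = (
    "WEBVTT\n\n"
    "00:00:00.086 --> 00:00:00.961\n\nxxxxx\n\n"
    "00:00:01.166 --> 00:00:02.586\n\nxxxxx\n"
)
print(drop_blank_after_timing(broken))
```

The result can then be written with open(..., newline="\n") so that no further newline translation happens on Windows.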