whisper.cpp

Leading space in SRT files?

Open peterk opened this issue 2 years ago • 2 comments

First, thank you for your awesome project – a great value to society!

I am using the SRT output mode and noticed that a leading space is always added to each subtitle line. For example:

2
00:00:11,000 --> 00:00:26,000
 Tack.

3
00:00:26,000 --> 00:00:36,000
 Vi ses imorgon.

I use the SRT output in conjunction with the max length parameter like this:

main -m {model_path} --output-srt --language sv -f audio.wav -ml 72

Could the leading space be a bug?

peterk, Jan 10 '23 19:01

The leading space can be significant if a word is split across multiple SRT entries. For example, with -ml 1, you might see output like:

2
00:00:00,520 --> 00:00:00,570
 Per

3
00:00:00,570 --> 00:00:00,680
formed

4
00:00:00,680 --> 00:00:00,890
 by

Note that entry 3 doesn't start with a space, indicating that it's a continuation of the previous word.
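As a rough illustration of that convention, here is a minimal Python sketch that merges subtitle fragments back into whole words. The function name `join_fragments` and the fragment list are hypothetical and not part of whisper.cpp; the sketch only assumes that a fragment beginning with a space starts a new word and that one without a space continues the previous word.

```python
def join_fragments(fragments):
    """Merge SRT text fragments into words using the leading-space convention."""
    words = []
    for fragment in fragments:
        if fragment.startswith(" ") or not words:
            # A leading space (or the very first fragment) starts a new word.
            words.append(fragment.strip())
        else:
            # No leading space: this fragment continues the previous word.
            words[-1] += fragment.strip()
    return words


# Text lines taken from the -ml 1 example above.
print(join_fragments([" Per", "formed", " by"]))  # ['Performed', 'by']
```

If the leading space were stripped at output time, entries 2 and 3 could no longer be told apart from two separate words.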

boolemancer, Jan 16 '23 10:01

I agree, but I don't see any lines without a leading space. I will try some more examples.

peterk, Jan 16 '23 10:01

@peterk Adding the --split-on-word argument to main should be what you are looking for.

ggerganov, Feb 08 '23 06:02