whisper.cpp issues

If there is a large gap between the output result and whisper, the top 1 token is also selected

whisper output:
[00:00.000 --> 00:02.000] 吃飯了嗎
main output:
[00:00:00.000 --> 00:00:00.500] (按讚)
[00:00:00.500 --> 00:00:01.200] 謝謝你
I want to know where the main difference between the two sides is in...

wuhongsheng

whisper_full: failed to generate timestamp token - skipping one second

Here's the before and after (on an Italian TV show). After 00:06:18.840, main just quietly died: no error, no exit file, nothing. `main -m ./models/ggml-medium.bin -l it -osrt -f...

janngobble

Support for other sample rates

2

It seems just adjusting `WHISPER_SAMPLE_RATE` doesn't work :)
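One possible workaround is to resample the audio to 16 kHz before handing it to whisper.cpp, instead of changing `WHISPER_SAMPLE_RATE`. The sketch below is only an illustration: `resample_linear` is a hypothetical helper, not part of whisper.cpp, and it uses plain linear interpolation with no anti-aliasing filter, so a proper resampler would likely give better quality when downsampling.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of whisper.cpp): resample mono float PCM
// from src_rate to dst_rate using linear interpolation.
// Assumption: no low-pass filtering is done, so downsampling may alias.
std::vector<float> resample_linear(const std::vector<float>& in,
                                   int src_rate,
                                   int dst_rate = 16000) {
    if (in.empty() || src_rate <= 0 || dst_rate <= 0) return {};
    if (src_rate == dst_rate) return in;

    // Number of output samples covering the same duration as the input.
    const std::size_t n_out = static_cast<std::size_t>(
        static_cast<double>(in.size()) * dst_rate / src_rate);
    std::vector<float> out(n_out);

    // Distance between consecutive output samples, in input samples.
    const double step = static_cast<double>(src_rate) / dst_rate;
    for (std::size_t i = 0; i < n_out; ++i) {
        const double pos = i * step;
        std::size_t i0 = static_cast<std::size_t>(pos);
        if (i0 >= in.size()) i0 = in.size() - 1;  // guard against rounding
        const std::size_t i1 = (i0 + 1 < in.size()) ? i0 + 1 : i0;
        const float frac = static_cast<float>(pos - static_cast<double>(i0));
        out[i] = in[i0] * (1.0f - frac) + in[i1] * frac;
    }
    return out;
}
```

For example, calling `resample_linear(samples, 48000)` on one second of 48 kHz audio yields 16000 samples.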

luke-jr

Timestamps skip leading silence

1

For some reason, transcripts always start at 0s, even when there's leading silence. P.S. It might be nicer to just handle multiple audio channels with overlapping speech gracefully ;)
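Until this is handled in the library, a caller could estimate the leading silence and add it to the reported timestamps. The sketch below is a rough illustration only: `leading_silence_ms` is a hypothetical helper, and the amplitude threshold of 0.01 is an arbitrary assumption that would need tuning for noisy recordings.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of whisper.cpp): length in milliseconds of
// the run of leading samples whose absolute amplitude is below `threshold`.
// Assumption: mono float PCM in [-1, 1]; threshold 0.01 is an arbitrary default.
long long leading_silence_ms(const std::vector<float>& pcm,
                             int sample_rate = 16000,
                             float threshold = 0.01f) {
    if (sample_rate <= 0) return 0;
    std::size_t i = 0;
    // Advance until the first sample that reaches the threshold.
    while (i < pcm.size() && std::fabs(pcm[i]) < threshold) ++i;
    return static_cast<long long>(i) * 1000 / sample_rate;
}
```

The returned offset could then be added to each segment's start and end time. A single loud click in the silence ends the scan early, so a windowed energy measure may be more robust.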

luke-jr

question

Support AVX/AVX2/FMA/F16C on x86-32 Linux

2

This works:
```diff
--- a/Makefile
+++ b/Makefile
@@ -53,7 +53,7 @@ endif
 # Architecture specific
 # TODO: probably these flags need to be tweaked on some architectures
 # feel free...
```

luke-jr

Weird words not being capitalized - even at start of sentence.

1

So, I don't know if this is more the trained data set or how whisper.cpp cuts the file up to process - and therefore doesn't realize it's the beginning of...

janngobble

question

Slow inference

1

Hello, I am writing a program to integrate whisper.cpp with ROS (Robot Operating System). At this moment: every 10 seconds, it receives through ROS some bytes that are equivalent to...

gustavoflw

build

Memory allocation in Windows

11

I don't know how many people have noticed this, but the memory handling seems rather suspicious to me. I have to use the large model to get usable results...

vitacon

bug

Suggestion: Add .clang-format file

4

It would be useful when making larger changes (like [this](https://github.com/ggerganov/whisper.cpp/pull/331)) if it could be properly formatted using clang-format. Do you happen to have a `.clang-format` file you can drop into...

asmaloney

enhancement

good first issue

command: Refactor to split command list & general transcription modes

This makes it easier to understand if you're looking for only one of the capabilities.

asmaloney
