whisper.cpp
Port of OpenAI's Whisper model in C/C++
whisper output: [00:00.000 --> 00:02.000] 吃飯了嗎 ("Have you eaten?"). main output: [00:00:00.000 --> 00:00:00.500] (按讚) ("(like)") [00:00:00.500 --> 00:00:01.200] 謝謝你 ("Thank you"). I want to know where the main difference between the two sides is in...
Here's the before and after (on an Italian TV show). (After 00:06:18.840, `main` just quietly died: no error, no exit file, nothing.) `main -m ./models/ggml-medium.bin -l it -osrt -f...
It seems just adjusting `WHISPER_SAMPLE_RATE` doesn't work :)
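For context, a common workaround is to leave the constant alone and resample the input to the rate the model expects instead. The sketch below is a hypothetical helper, not part of whisper.cpp: it assumes mono float samples and a 16 kHz target, and it uses plain linear interpolation with no anti-aliasing filter, so a proper resampler would be preferable for real use.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a whisper.cpp API): resample mono float PCM from
// src_rate to dst_rate with linear interpolation. No low-pass filtering is
// applied, so downsampling may alias; this is a sketch, not production code.
std::vector<float> resample_linear(const std::vector<float> & in, int src_rate, int dst_rate) {
    assert(src_rate > 0 && dst_rate > 0);
    if (in.empty() || src_rate == dst_rate) {
        return in;
    }

    // Number of output samples covering the same duration as the input.
    const std::size_t n_out = static_cast<std::size_t>(
        static_cast<double>(in.size()) * dst_rate / src_rate);
    const double step = static_cast<double>(src_rate) / dst_rate;

    std::vector<float> out(n_out);
    for (std::size_t i = 0; i < n_out; ++i) {
        const double pos = static_cast<double>(i) * step;
        // Clamp indices so rounding can never read past the end of the input.
        const std::size_t i0 = std::min(static_cast<std::size_t>(pos), in.size() - 1);
        const std::size_t i1 = std::min(i0 + 1, in.size() - 1);
        const double frac = pos - static_cast<double>(i0);
        out[i] = static_cast<float>((1.0 - frac) * in[i0] + frac * in[i1]);
    }
    return out;
}
```

Calling `resample_linear(samples, 48000, 16000)` on a 48 kHz buffer would return one third as many samples, which could then be passed on for transcription.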
For some reason, transcripts always start at 0s, even when there's leading silence. P.S. It might be nicer to just handle multiple audio channels with overlapping speech gracefully ;)
This works:
```diff
--- a/Makefile
+++ b/Makefile
@@ -53,7 +53,7 @@ endif
 # Architecture specific
 # TODO: probably these flags need to be tweaked on some architectures
 # feel free...
```
So, I don't know if this is more the trained data set or how whisper.cpp cuts the file up to process - and therefore doesn't realize it's the beginning of...
Hello, I am writing a program to integrate whisper.cpp with ROS (Robot Operating System). At this moment: every 10 seconds, it receives through ROS some bytes that are equivalent to...
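A step that usually comes up in this kind of integration is turning the received bytes into the float sample buffer that `whisper_full` takes. The sketch below is only an illustration under a loud assumption: the message excerpt does not say what the bytes encode, so it assumes little-endian signed 16-bit mono PCM. The helper name `pcm16le_to_float` is hypothetical and not part of whisper.cpp or ROS.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: convert raw bytes to float samples in [-1, 1).
// ASSUMPTION: the bytes are little-endian signed 16-bit mono PCM; the
// original message does not state the actual encoding.
std::vector<float> pcm16le_to_float(const std::vector<std::uint8_t> & bytes) {
    assert(bytes.size() % 2 == 0);  // two bytes per sample

    std::vector<float> out(bytes.size() / 2);
    for (std::size_t i = 0; i < out.size(); ++i) {
        // Low byte first, then high byte.
        const std::uint16_t u = static_cast<std::uint16_t>(
            bytes[2 * i] | (bytes[2 * i + 1] << 8));
        const std::int16_t s = static_cast<std::int16_t>(u);
        out[i] = static_cast<float>(s) / 32768.0f;
    }
    return out;
}
```

Each 10-second chunk could be converted this way and appended to a running buffer before transcription; if the audio arrives at a rate other than 16 kHz, it would also need resampling first.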
I don't know how many people have noticed, but the memory handling seems rather suspicious to me. I have to use the large model to get usable results...
It would be useful when making larger changes (like [this](https://github.com/ggerganov/whisper.cpp/pull/331)) if it could be properly formatted using clang-format. Do you happen to have a `.clang-format` file you can drop into...
This makes it easier to understand if you're looking for only one of the capabilities.