whisper.cpp
Slow or empty output on Windows with stream example
I've got everything built, SDL2 in-line, and it's anecdotally working.
stream -c 0 -t 8 --length 500 -vth 0.1
is the only setting I've been able to consistently get some sort of output from, but it's slow and super intermittent. I might get two or three words from a 30-second blast, and it sometimes takes minutes to show up.
This is on modern i9 hardware.
This is a minute of results from counting to 60:

Does anyone have anecdotal evidence of real-time-ish transcription working with stream on Windows?
I am currently working on making whisper.cpp consume streams, since I didn't find the given stream example sufficient for my use case (transcribing continuous speech, not a command-based interface). I am going to process data in chunks of x seconds but preserve the model context between them, unlike the stream example, which instantiates a new model each time it has data to process. I will send a PR when/if I finish my streaming approach. So far it is going well with chunks (and thus latency) of 10 seconds.
My approach is to keep running inference with the same model so that long speech context is preserved. Possibly it will also be able to detect a language change along the way.
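For anyone following along, here is a rough sketch of the chunk-and-carry-context idea described above. It is not the code from the planned PR. `ChunkedStreamer`, `ChunkResult` and `TranscribeFn` are hypothetical names made up for illustration, and the transcriber is a plain callback so the sketch compiles without whisper.cpp. In a real setup the callback would presumably wrap `whisper_full` and hand the carried tokens to it through the `prompt_tokens` / `prompt_n_tokens` fields of `whisper_full_params`; that wiring is an assumption and is left out here.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result of transcribing one chunk: the text, plus the tokens
// that should be fed back as the prompt for the next chunk.
struct ChunkResult {
    std::string text;
    std::vector<int> tokens;
};

// Hypothetical callback: (chunk samples, prompt tokens from earlier chunks) -> result.
using TranscribeFn =
    std::function<ChunkResult(const std::vector<float>&, const std::vector<int>&)>;

class ChunkedStreamer {
public:
    // chunk_samples: samples per chunk, e.g. 10 s * 16000 Hz for 10-second chunks.
    // max_prompt_tokens: how many trailing tokens to carry between chunks.
    ChunkedStreamer(std::size_t chunk_samples, std::size_t max_prompt_tokens, TranscribeFn fn)
        : chunk_samples_(chunk_samples),
          max_prompt_tokens_(max_prompt_tokens),
          fn_(std::move(fn)) {
        assert(chunk_samples_ > 0);
    }

    // Append captured audio and run the transcriber once per complete chunk.
    // Returns the text produced by this call (empty if no chunk was completed).
    std::string feed(const float* samples, std::size_t n) {
        pending_.insert(pending_.end(), samples, samples + n);
        std::string out;
        std::size_t consumed = 0;
        while (pending_.size() - consumed >= chunk_samples_) {
            auto first = pending_.begin() + static_cast<std::ptrdiff_t>(consumed);
            std::vector<float> chunk(first, first + static_cast<std::ptrdiff_t>(chunk_samples_));
            consumed += chunk_samples_;

            // The same prompt_ is passed on every call, so context survives across chunks.
            ChunkResult r = fn_(chunk, prompt_);
            out += r.text;

            // Keep only the most recent max_prompt_tokens_ tokens as context.
            prompt_.insert(prompt_.end(), r.tokens.begin(), r.tokens.end());
            if (prompt_.size() > max_prompt_tokens_) {
                prompt_.erase(prompt_.begin(),
                              prompt_.end() - static_cast<std::ptrdiff_t>(max_prompt_tokens_));
            }
        }
        // Leftover samples wait for the next feed() call.
        pending_.erase(pending_.begin(), pending_.begin() + static_cast<std::ptrdiff_t>(consumed));
        return out;
    }

    const std::vector<int>& prompt() const { return prompt_; }
    std::size_t pending() const { return pending_.size(); }

private:
    std::size_t chunk_samples_;
    std::size_t max_prompt_tokens_;
    TranscribeFn fn_;
    std::vector<float> pending_;  // audio not yet forming a full chunk
    std::vector<int> prompt_;     // tokens carried over from earlier chunks
};
```

The capture loop would call `feed()` with whatever the audio backend delivers. Latency is bounded below by the chunk length, which matches the 10-second figure mentioned above.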