[Discussion - Improvements] - Real-time (or near-real-time) transcription in the browser with React

Open · avie41 opened this issue 11 months ago · 1 comment

Hello @ggerganov

I managed to reuse the code from the stream example and integrate it into a React application using Vite.js.

Keeping the basic implementation, adapted to TypeScript, I get a latency of about 1.5 to 2 seconds on average.

However, the example's audio chunking strategy looks fairly basic, and I think it could be improved.

  • Has any work already been done on this?
  • Could the C++ code that is compiled with Emscripten be improved?
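
For anyone wondering what an improved chunking strategy might look like, here is a minimal sketch of one common option: overlapping sliding windows, where each window reuses the tail of the previous one as context. `OverlapChunker`, `windowSamples`, and `stepSamples` are hypothetical names for illustration only; they are not part of whisper.cpp or the stream example, and this is not necessarily how the example chunks audio.

```typescript
// Hypothetical helper (not a whisper.cpp API): buffers incoming PCM samples
// and emits fixed-length windows that overlap the previous window.
class OverlapChunker {
  private buffer: Float32Array = new Float32Array(0);

  constructor(
    private readonly windowSamples: number, // samples per emitted window
    private readonly stepSamples: number,   // new samples between two windows
  ) {
    if (stepSamples <= 0 || stepSamples > windowSamples) {
      throw new Error("stepSamples must be in (0, windowSamples]");
    }
  }

  // Append new samples and return every complete window now available.
  push(samples: Float32Array): Float32Array[] {
    const merged = new Float32Array(this.buffer.length + samples.length);
    merged.set(this.buffer, 0);
    merged.set(samples, this.buffer.length);
    this.buffer = merged;

    const windows: Float32Array[] = [];
    while (this.buffer.length >= this.windowSamples) {
      windows.push(this.buffer.slice(0, this.windowSamples));
      // Drop only stepSamples so the remaining tail is reused as context.
      this.buffer = this.buffer.slice(this.stepSamples);
    }
    return windows;
  }
}
```

With 16 kHz mono input, `new OverlapChunker(16000 * 5, 16000 * 1)` would emit a 5-second window for every second of new audio. A smaller step lowers latency but means more inference calls per second of audio, so the right values depend on how fast the model runs in the browser.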

Additional context:

At the moment, my application uses vosk-browser, which plugs into an audio streamer. I would like to switch to Whisper for its superior transcription quality, and I want to optimize my implementation as much as possible to get closer to real time with whisper.cpp.

avie41, Mar 19 '24 19:03

Hi @avie41, can you share the code of your implementation? I want to get streaming to work in the whisper.cpp server.

qxprakash, Apr 07 '24 07:04