[FEATURE REQUEST]: Option to use local faster-whisper server instead of OpenAI API

Open Arche151 opened this issue 1 year ago • 10 comments

Feature request

I'd like to propose adding an option to use this local faster-whisper-server instead of the OpenAI API. The faster-whisper-server is OpenAI API compatible, which suggests that implementing this option should be relatively straightforward and require minimal modifications to the existing code.

This addition would provide users with an alternative that could offer:

  1. Improved privacy by keeping audio data local
  2. Potentially faster processing times
  3. Cost savings by eliminating API usage fees

I understand this might be a niche request, but it could be valuable for users who prefer local processing or have specific performance requirements.

Thank you for considering this feature. I would highly appreciate it if you were able to implement it!

Arche151 avatar Jul 07 '24 06:07 Arche151

Great feature request, and it seems like the API is compatible! I'll put this in the queue and hopefully have an opportunity to work on it this month!

braden-w avatar Jul 10 '24 01:07 braden-w

@braden-w Yayy, happy to hear that!

Arche151 avatar Jul 10 '24 05:07 Arche151

I think the Groq API is also compatible; maybe you could add the endpoint in the settings?

OLH21 avatar Jul 11 '24 20:07 OLH21

This is my most wanted feature! Glad to see it getting some traction, I use the app all the time!

quickreactor avatar Jul 19 '24 12:07 quickreactor

I think a configurable URL without any guarantees for compatibility would be a nice ... I would personally start by testing Groq ... https://wow.groq.com/groq-runs-whisper-large-v3-at-a-164x-speed-factor-according-to-new-artificial-analysis-benchmark/

I think three configuration options are needed in the following file: recorder.svelte.CMw7ZXBL.js

  1. An option to turn off the API key check, because Groq keys start with gsk_

  2. An option to change the model

  3. An option to change the URL

    transcribe: (e, { apiKey: t, outputLanguage: n }) => L(function* () {
      var a;
      if (!t.startsWith("sk-")) return yield* new B({ title: "Invalid API Key", description: 'The API Key must start with "sk-"', action: { label: "Update API Key", goto: "/settings" } });
      const r = e.size / (1024 * 1024);
      if (r > Hh) return yield* new B({ title: `The file size (${r}MB) is too large`, description: `Please upload a file smaller than ${Hh}MB.` });
      const s = new File([e], xC), o = new FormData;
      o.append("file", s), o.append("model", "whisper-1"), n !== "auto" && o.append("language", n);
      const i = yield* $e({ try: () => fetch("https://api.openai.com/v1/audio/transcriptions", {
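
For illustration, the three options above could be sketched roughly like this. This is not the app's actual code: `PROVIDERS`, `buildTranscriptionRequest`, and the settings fields are hypothetical names, and the local URL and model are placeholders that depend on how the server is run. The point is only that the key prefix, model, and base URL come from settings instead of being hard-coded.

```javascript
// Hypothetical provider presets; none of these names come from the Whispering codebase.
const PROVIDERS = {
  openai: { baseUrl: "https://api.openai.com/v1", model: "whisper-1", keyPrefix: "sk-" },
  groq: { baseUrl: "https://api.groq.com/openai/v1", model: "whisper-large-v3", keyPrefix: "gsk_" },
  // Local server: URL and model are placeholders, and no key prefix is enforced.
  local: { baseUrl: "http://localhost:8000/v1", model: "Systran/faster-whisper-small", keyPrefix: null },
};

// Builds the URL and fetch options for an OpenAI-compatible transcription call
// without sending anything, so the result can be inspected or passed to fetch().
function buildTranscriptionRequest(settings, audioBlob, outputLanguage = "auto") {
  const { baseUrl, model, keyPrefix, apiKey } = settings;

  // Option 1: the key check is driven by settings and can be disabled (keyPrefix: null).
  if (keyPrefix && !(apiKey ?? "").startsWith(keyPrefix)) {
    throw new Error(`The API key must start with "${keyPrefix}"`);
  }

  const form = new FormData();
  form.append("file", audioBlob, "recording.webm");
  form.append("model", model); // Option 2: configurable model
  if (outputLanguage !== "auto") form.append("language", outputLanguage);

  // Only send an Authorization header when a key is configured.
  const headers = apiKey ? { Authorization: `Bearer ${apiKey}` } : {};

  // Option 3: configurable base URL (trailing slashes are trimmed).
  const url = `${baseUrl.replace(/\/+$/, "")}/audio/transcriptions`;
  return { url, init: { method: "POST", headers, body: form } };
}
```

Usage would be something like `const { url, init } = buildTranscriptionRequest({ ...PROVIDERS.groq, apiKey }, blob); await fetch(url, init);`.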

OLH21

If you're still searching for a Groq version, here is a Groq-only fork. Because patience is not my strength. ;) https://github.com/AlpSantoGlobalMomentumLLC/whisperingGroq/tree/main

Groq support has landed in the latest version! I have faster-whisper-server finally working locally (was challenging since it didn't support CORS), hoping to push it out later as well!

braden-w avatar Aug 09 '24 04:08 braden-w

I am really curious how you managed to make it work without CORS.

OLH21 avatar Aug 11 '24 17:08 OLH21

@OLH21 faster-whisper-server worked when fetching from a backend context (Node, Bun, Rust fetch), but wouldn't in a browser.

I used Tauri's backend to proxy the request. At first I wrote out actual Rust code (a function that takes a blob and constructs a request to the server), but then realized that Tauri provides a convenient JavaScript fetch function in its HTTP client that is implemented in Rust under the hood. Since it executes in the backend, it bypasses CORS. This was really convenient, but it would only work for the desktop app, not the web app! That's because using the Rust backend to proxy requests is only possible in Tauri. The equivalent in a browser context would be setting up a second local server to handle communication between the web app and faster-whisper-server.
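
A rough sketch of that idea, not the actual Whispering code: the transcription call takes the fetch function as a parameter, so the desktop build can hand it the fetch from Tauri's HTTP client while the web build hands it the browser's own `fetch`. The name `transcribeWith`, the file name, and the model value are made up for illustration.

```javascript
// `fetchImpl` is any function with the same call signature as the web fetch().
// Desktop: pass the fetch exposed by Tauri's HTTP client (runs in the Rust backend, so no CORS).
// Browser: pass globalThis.fetch (subject to CORS, so the server must allow the page's origin).
async function transcribeWith(fetchImpl, baseUrl, audioBlob) {
  const form = new FormData();
  form.append("file", audioBlob, "recording.webm");
  form.append("model", "whisper-1"); // placeholder model name

  const response = await fetchImpl(`${baseUrl}/audio/transcriptions`, {
    method: "POST",
    body: form,
  });
  if (!response.ok) {
    throw new Error(`Transcription failed with status ${response.status}`);
  }

  // OpenAI-compatible servers return JSON with a `text` field by default.
  const data = await response.json();
  return data.text;
}
```

The rest of the app then doesn't need to know which environment it is running in; only the place that picks `fetchImpl` does.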

While all of this was happening, I opened up this pull request in faster-whisper-server as a Hail Mary, hoping they would add CORS support, and surprisingly they made it work and merged the feature in!

So now, the task at hand is to implement the faster-whisper-server CORS feature so I can get support on both platforms!
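
For anyone wondering what "CORS support" means on the server side: a browser only lets page scripts read a cross-origin response if that response carries an `Access-Control-Allow-Origin` header matching the page's origin (or `*`). The helper below is a hypothetical sketch of that decision, not faster-whisper-server's actual code; the function name and the origins in the usage note are assumptions.

```javascript
// Returns the CORS response headers for a request from `requestOrigin`,
// or null if that origin is not allowed (the browser will then block the response).
function corsHeadersFor(requestOrigin, allowedOrigins) {
  const allowAll = allowedOrigins.includes("*");
  if (!allowAll && !allowedOrigins.includes(requestOrigin)) return null;

  return {
    "Access-Control-Allow-Origin": allowAll ? "*" : requestOrigin,
    "Access-Control-Allow-Methods": "POST, OPTIONS",
    "Access-Control-Allow-Headers": "Authorization, Content-Type",
    // When echoing a specific origin, tell caches the response varies by Origin.
    ...(allowAll ? {} : { Vary: "Origin" }),
  };
}
```

For example, `corsHeadersFor("http://localhost:5173", ["http://localhost:5173"])` returns headers that echo that origin, while an origin missing from the list gets `null`.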

braden-w avatar Aug 11 '24 22:08 braden-w

Unfortunately the CORS pull request broke faster-whisper-server for some and I'm waiting for the maintainer to respond 😅 but in the meantime it is now working in desktop! #290 just dropped and will be released soon

Note that for most devices, transcription will be significantly slower—which is expected! I aim to integrate more local options later, like #227

braden-w avatar Aug 15 '24 15:08 braden-w

It turns out that CORS actually works! I'm so silly 😅

Restored browser functionality in #292

braden-w avatar Aug 15 '24 15:08 braden-w

Eyy, thanks so much for adding this feature! I'll try it out later on my Ubuntu PC :)

Arche151 avatar Aug 15 '24 15:08 Arche151

Thanks for the explanation!

OLH21 avatar Aug 15 '24 15:08 OLH21

Absolute legend!

quickreactor avatar Aug 16 '24 08:08 quickreactor

It's out!

https://github.com/braden-w/whispering/releases/tag/v5.0.0

Thank you guys again for the support 🙏 let me know if there are any issues! Directions are in the settings page if you select "faster-whisper-server" as your provider.

braden-w avatar Aug 16 '24 14:08 braden-w