
Suggestion / Idea: VST, LV2, AU plugins

Open trholding opened this issue 1 year ago • 1 comment

If whisper.cpp could be built as a VST, LADSPA, or LV2 plugin, it could be used with a wide range of audio tools.

This would be especially useful for creating closed captions. Other plugins, such as filters and noise reduction, could be chained ahead of it so Whisper could do a better job at transcribing.
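To make the chaining idea concrete, here is a rough sketch of a processing chain that cleans up a block of audio before it would reach the transcriber. Everything in it is hypothetical: `Stage`, `make_noise_gate`, `make_gain` and `run_chain` are names made up for illustration, the per-sample gate is deliberately naive, and none of this is whisper.cpp or plugin-host API. In a real setup the host (Carla, a DAW, etc.) would do the chaining between plugins.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// A stage transforms a block of mono float samples in place.
// (Hypothetical type, standing in for "one plugin in the chain".)
using Stage = std::function<void(std::vector<float>&)>;

// Naive noise gate: silence samples whose magnitude is below the threshold.
Stage make_noise_gate(float threshold) {
    assert(threshold >= 0.0f);
    return [threshold](std::vector<float>& block) {
        for (float& s : block) {
            if (std::fabs(s) < threshold) s = 0.0f;
        }
    };
}

// Simple gain stage: scale every sample by a constant factor.
Stage make_gain(float gain) {
    return [gain](std::vector<float>& block) {
        for (float& s : block) s *= gain;
    };
}

// Run every stage in order; the returned block is what a transcriber
// placed at the end of the chain would receive.
std::vector<float> run_chain(const std::vector<Stage>& chain,
                             std::vector<float> block) {
    for (const Stage& stage : chain) stage(block);
    return block;
}
```

For example, `run_chain({make_noise_gate(0.1f), make_gain(2.0f)}, samples)` would gate first and then boost, and the order of the stages matters just as it does in a plugin rack.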

Adding the plugin to Carla, for example, would allow system-wide audio transcription: any sound that plays on the speakers could be transcribed. It could also be great for people who are deaf and/or blind, as the output text could be piped to a mechanical braille display https://www.afb.org/node/16207/refreshable-braille-displays . Other possibilities are MIDI triggers via chained plugins. Some examples: hot words that play certain music or notes, muting bad words (offline), triggering lights during live events with a bit of delay, live translation, several of these chained together, and so on.
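As a rough illustration of the hot-word idea, the sketch below scans a piece of transcribed text and emits a MIDI note-on message for each hot word it finds. The helper names (`note_on`, `normalize`, `triggers_for`), the fixed channel 0 and the velocity of 100 are all assumptions made for the example. The only fixed part is the standard MIDI note-on layout: status byte `0x90 | channel`, followed by note number and velocity. How the text would actually arrive from whisper.cpp is left out on purpose.

```cpp
#include <array>
#include <cassert>
#include <cctype>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <vector>

using MidiMessage = std::array<std::uint8_t, 3>;

// Build a MIDI note-on message: status 0x90 | channel, then note, velocity.
MidiMessage note_on(std::uint8_t channel, std::uint8_t note,
                    std::uint8_t velocity) {
    assert(channel < 16 && note < 128 && velocity < 128);
    return {static_cast<std::uint8_t>(0x90 | channel), note, velocity};
}

// Lowercase a word and drop punctuation so "Lights!" matches "lights".
std::string normalize(const std::string& word) {
    std::string out;
    for (unsigned char c : word) {
        if (std::isalnum(c)) out.push_back(static_cast<char>(std::tolower(c)));
    }
    return out;
}

// Return one note-on per hot word found in the text, in order of appearance.
// hot_words maps a lowercase word to the MIDI note it should trigger.
std::vector<MidiMessage> triggers_for(
        const std::string& text,
        const std::map<std::string, std::uint8_t>& hot_words) {
    std::vector<MidiMessage> messages;
    std::istringstream words(text);
    std::string word;
    while (words >> word) {
        auto it = hot_words.find(normalize(word));
        if (it != hot_words.end()) {
            messages.push_back(note_on(0, it->second, 100));
        }
    }
    return messages;
}
```

The same lookup could drive a mute instead of a note: match against a list of bad words and have the next plugin in the chain drop the gain for the matching time span, at the cost of the delay needed to transcribe first.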

VST, LADSPA, and LV2 plugins are usually cross-platform and can have a custom UI as well. One idea that comes to mind is to store the output text in a log file, copy it to the clipboard, or expose a web UI where all transcribed text is displayed.

https://lv2plug.in/
http://linux-sound.org/linux-vst-plugins.html
https://en.wikipedia.org/wiki/Audio_Units
https://en.wikipedia.org/wiki/Virtual_Studio_Technology

There are skeleton plugin projects on GitHub that can be used to figure out how such a plugin is coded.

trholding avatar Nov 16 '22 19:11 trholding

Also, whisper.cpp would make for a good VAMP plugin.

https://www.vamp-plugins.org/

This would allow it to run from https://vamp-plugins.org/sonic-annotator/ and potentially become usable by other apps that employ these plugins, from http://audacity.sourceforge.net to https://mixxx.org/ ...

NielsMayer avatar Jan 10 '23 18:01 NielsMayer