obsidian-audio-notes

custom url for whisper api

Open 7596ff opened this issue 2 years ago • 6 comments

I run Whisper in Docker, and I would like the plugin to generate transcripts automatically by calling it, instead of my having to generate them manually.

7596ff, Jan 05 '23 03:01

Great, can you share the docker build?

Is there an API available in the docker container that allows you to interact with whisper from outside the container?

jjmaldonis, Jan 05 '23 07:01

I just ran the commands in the readme: https://github.com/ahmetoner/whisper-asr-webservice

The API is pretty simple: you upload the file and tell it which format to return.
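
As a rough illustration of "upload the file and tell it which format to return", here is a minimal client sketch using only the Python standard library. The base URL (`http://localhost:9000`), the `/asr` path, the `audio_file` form field, and the `task` and `output` query parameters are assumptions about how the linked webservice is set up; check them against its README before relying on them. The helper names are hypothetical.

```python
import urllib.parse
import urllib.request
import uuid
from pathlib import Path

# Assumption: the container from the linked README is listening on this address.
DEFAULT_BASE_URL = "http://localhost:9000"


def build_asr_url(base_url: str, output: str = "json", task: str = "transcribe") -> str:
    """Build the request URL. The /asr path and parameter names are assumptions."""
    query = urllib.parse.urlencode({"task": task, "output": output})
    return f"{base_url.rstrip('/')}/asr?{query}"


def encode_multipart(field: str, filename: str, data: bytes):
    """Encode a single file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(audio_path: str, base_url: str = DEFAULT_BASE_URL, output: str = "json") -> str:
    """Upload one audio file and return the response body as text."""
    path = Path(audio_path)
    # Assumption: the service expects the upload in a form field named "audio_file".
    body, content_type = encode_multipart("audio_file", path.name, path.read_bytes())
    request = urllib.request.Request(
        build_asr_url(base_url, output),
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Keeping the base URL as a parameter is what the issue title asks for: a user could point the plugin at whatever host and port their own container uses.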

7596ff, Jan 06 '23 16:01

Ah I was hoping there was an existing docker image you used to run whisper. In order to run the commands within the plugin, it's necessary to create a REST API within the docker container to interact with Whisper, like running the model and getting the result. A downloadable docker image will need to be created to make it distributable. That's a good chunk of work and it'll be a while until I get to it. If you'd like to contribute that would be great too.

jjmaldonis, Jan 06 '23 18:01

If I'm reading you correctly, I don't think it's a good idea to automatically spin up a docker image from within Obsidian. I think it would be best to require users to spin up what I linked themselves, which is quite easy with Docker Desktop. https://github.com/djmango/obsidian-transcription does this, but it's flaky in my testing. Not that I know anything about the internals of this plugin currently, but it seems like a moderate amount of work to add a command that runs on an audio file and saves the JSON result next to it in-tree.
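
The "save the JSON result next to it" step could look roughly like the sketch below. The function name and the choice to reuse the audio file's name with a `.json` extension are assumptions for illustration; the plugin's actual file layout is not described in this thread.

```python
import json
from pathlib import Path


def save_transcript_next_to_audio(audio_path: str, transcript: dict) -> Path:
    """Write the transcript as JSON beside the audio file and return the new path."""
    audio = Path(audio_path)
    # Assumption: same directory and base name, with the extension swapped to .json.
    target = audio.with_suffix(".json")
    target.write_text(
        json.dumps(transcript, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
    return target
```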

7596ff, Jan 06 '23 19:01

The plugin would never spin up a docker container automatically.

The number of users who can spin up and use a docker container, but who do not know how to install Whisper on their own machine, is likely small. The use case for a docker container would therefore be to support people who a) cannot install python and Whisper, b) can install docker and one image, and c) cannot interact with docker. So the workflow I would want to implement would be to support accessing docker via an API, which is necessary for the plugin anyway.

Writing an API will probably take 4 hours, and as you have seen from the other image it can be finicky. This is the majority of the work: create a container, preinstall Whisper, create a REST API, and publish the container. Hooking it up in the plugin will be straightforward after that. I likely won't get to this for a while.
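
To give a sense of the REST API piece, here is a minimal sketch of an HTTP endpoint that accepts audio bytes and returns JSON, using only the Python standard library. Everything here is an assumption for illustration: the `/transcribe` path, the raw-body upload (instead of multipart), the port, and the `transcribe` callable, which stands in for actually running the Whisper model inside the container.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable


def make_handler(transcribe: Callable[[bytes], dict]):
    """Build a request handler around a transcribe function (hypothetical stand-in for Whisper)."""

    class TranscribeHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Assumption: a single hypothetical endpoint; anything else is a 404.
            if self.path != "/transcribe":
                self.send_error(404)
                return
            # Read the raw request body as the audio bytes.
            length = int(self.headers.get("Content-Length", 0))
            audio = self.rfile.read(length)
            payload = json.dumps(transcribe(audio)).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, format, *args):
            pass  # Silence the default per-request logging.

    return TranscribeHandler


def serve(transcribe: Callable[[bytes], dict], host: str = "0.0.0.0", port: int = 9000) -> None:
    """Run the server until interrupted; host and port are assumptions."""
    HTTPServer((host, port), make_handler(transcribe)).serve_forever()
```

Passing the transcription function in as a parameter keeps the HTTP layer separate from the model, so the endpoint can be exercised with a stub before Whisper is installed in the image.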

jjmaldonis, Jan 07 '23 07:01

Thanks for that clarification. I'm one of those people who can't install python and whisper, because I cannot figure out python's dependency management, environments, and so on. I guess I also don't know the difference between a docker container and a docker image.

Thanks for considering this issue.

7596ff, Jan 07 '23 21:01