
Add new provider: wit.ai

Open yshalsager opened this issue 3 years ago • 4 comments

I would like to suggest adding wit.ai API as a new Speech-to-Text engine. It's a very solid and open-source natural language processing API. https://github.com/wit-ai

I might be able to add it and send a PR if I manage to find some free time, once the idea is accepted of course.

Here's an API implementation example I wrote for another project https://github.com/yshalsager/Userge-Plugins/blob/98feca02f75ec2fa18cb49255577af85761d0c37/plugins/transcribe.py#L18

yshalsager avatar Apr 06 '21 18:04 yshalsager

@BingLingGroup I have started working on it and finished an initial implementation that works. https://github.com/yshalsager/autosub/commits/witai

However, before I make a pull request, I'd like to ask about one point. The WIT API accepts audio input as wav, mpeg3, ogg, and raw pcm, and the rate should be 8000. I got it to work by passing these options as CLI arguments: -i test.m4a -S ar-eg -sapi witai -skey xxxxx -asf .pcm -asr 8000. But I believe there should be a way to make this audio configuration autosub's default for the WIT speech engine. Wouldn't that be better?

yshalsager avatar Apr 14 '21 17:04 yshalsager

I'm not sure about the accuracy of this API, so I think it's better not to change the default API, especially since it requires signing up and getting a token to use.

BingLingGroup avatar Apr 15 '21 02:04 BingLingGroup

@BingLingGroup I didn't mean to change the default API. I meant: does the autosub code provide a way to set default settings for a speech engine?

yshalsager avatar Apr 15 '21 08:04 yshalsager

Sorry, I misunderstood. I set the default audio settings here and here. Perhaps it's better to set the constants in https://github.com/BingLingGroup/autosub/blob/dev/autosub/constants.py.

BingLingGroup avatar Apr 17 '21 08:04 BingLingGroup
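
For illustration, the per-engine defaults discussed in this thread could be sketched roughly as below. This is a minimal sketch, not autosub's actual code: the names DEFAULT_AUDIO, ENGINE_AUDIO_DEFAULTS, and resolve_audio_settings are hypothetical, and the global fallback values are assumed. Only the witai values (.pcm, 8000) come from the command line quoted above.

```python
# Hypothetical sketch: none of these names are taken from autosub itself.

# Assumed global fallback used when an engine defines no overrides.
DEFAULT_AUDIO = {"suffix": ".flac", "sample_rate": 44100}

# Per-engine overrides. The witai values mirror "-asf .pcm -asr 8000"
# from the command line quoted in this thread.
ENGINE_AUDIO_DEFAULTS = {
    "witai": {"suffix": ".pcm", "sample_rate": 8000},
}


def resolve_audio_settings(engine, suffix=None, sample_rate=None):
    """Merge settings in order: global fallback, engine defaults, user input."""
    settings = dict(DEFAULT_AUDIO)
    settings.update(ENGINE_AUDIO_DEFAULTS.get(engine, {}))
    # Explicit user-supplied values always take priority over defaults.
    if suffix is not None:
        settings["suffix"] = suffix
    if sample_rate is not None:
        settings["sample_rate"] = sample_rate
    return settings


if __name__ == "__main__":
    print(resolve_audio_settings("witai"))
    print(resolve_audio_settings("witai", sample_rate=16000))
```

With a lookup like this, selecting the engine would be enough to pick up its audio settings, while values passed explicitly on the command line would still override them.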