
Question for Implementation for Short Commands

Open MarioIb14 opened this issue 1 month ago • 7 comments

Hello, I am working on a project that requires the user to say short commands such as "in" and "out". Is there a way to modify the code for single-word commands so that commands do not chain together? Also, is there a way to reduce the time that the function waits for silence?

MarioIb14 avatar May 05 '24 02:05 MarioIb14

What kind of project exactly?

nshmyrev avatar May 06 '24 07:05 nshmyrev

Sorry, I am looking to use the Python version. I meant that the project is trying to convert voice to text for short commands, and the text will be used to interact with a GUI program through single-word commands. I have started to use SetGrammar and EndpointerMode (which I cannot import from vosk for some reason).

MarioIb14 avatar May 06 '24 16:05 MarioIb14

I'm asking what kind of short commands your software is going to recognize.

nshmyrev avatar May 06 '24 19:05 nshmyrev

So the ones I am planning to use are '["in", "out", "left", "right", "up", "down", "pause", "stop", "next", "start", "[unk]"]'.

MarioIb14 avatar May 06 '24 19:05 MarioIb14
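
For reference, a minimal sketch of how that command list could be turned into a grammar and applied to a recognizer. It assumes `recognizer` is a `vosk.KaldiRecognizer`, whose `SetGrammar` method takes a JSON-array string. The `EndpointerMode` import and the `SetEndpointerMode` call are assumptions about newer vosk builds; the import is guarded because, as noted earlier in the thread, it can fail on an installed package. The helper names `build_grammar` and `configure` are hypothetical.

```python
import json

# Command list taken from the comment above; "[unk]" gives the recognizer
# a catch-all token for speech outside the command set.
COMMANDS = ["in", "out", "left", "right", "up", "down",
            "pause", "stop", "next", "start", "[unk]"]


def build_grammar(words):
    """Return the JSON-array string used as a recognizer grammar."""
    return json.dumps(words)


def configure(recognizer, words):
    """Apply the grammar and, where available, a shorter endpointer mode.

    Returns True only if the endpointer mode could also be set.
    """
    recognizer.SetGrammar(build_grammar(words))
    try:
        # Assumption: only newer vosk builds export EndpointerMode.
        from vosk import EndpointerMode
    except ImportError:
        return False
    if hasattr(recognizer, "SetEndpointerMode"):
        # Assumption: SHORT asks for a shorter trailing-silence wait.
        recognizer.SetEndpointerMode(EndpointerMode.SHORT)
        return True
    return False
```

If `configure` returns False, the grammar is still applied and only the silence timeout stays at its default, so the restricted vocabulary works either way.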

Sorry, do you need any more clarification on the specific short commands?

MarioIb14 avatar May 07 '24 23:05 MarioIb14

Yes, I need to understand the application you are creating.

nshmyrev avatar May 10 '24 19:05 nshmyrev

So we have a MATLAB script that runs in the background and has a GUI with a text input box, into which our software needs to write voice commands. The script checks the text input on each iteration to control the movement of a robotic arm with the commands listed above (if there is no new command, it reuses the previous command). So we need a program that runs in the background, checks whether a specific single-word command has been said, writes it into the input box, and presses Enter. I am trying to make it as fast as possible by limiting the number of words (setting a grammar), preventing the program from stringing words into sentences (if possible), and reducing the time from partial to final output (when I ran the test microphone example code, I noticed that it detects the command word quickly in the partial output but then waits some time before outputting the final result).

MarioIb14 avatar May 10 '24 20:05 MarioIb14
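
One possible way to avoid both the chaining and the wait for the final result is to act on the partial output directly. The sketch below is not part of vosk; `CommandSpotter` is a hypothetical helper that takes the JSON strings returned by `PartialResult()` and `Result()` and reports each command word once, as soon as it first appears. It assumes that words in a partial hypothesis are only appended, which is not guaranteed, since a partial hypothesis can still change before the final result.

```python
import json


class CommandSpotter:
    """Report each new command word as soon as a partial result contains it."""

    def __init__(self, commands):
        self.commands = set(commands)
        self.seen = 0  # words of the current utterance already handled

    def on_partial(self, partial_json):
        """Feed the string from PartialResult(); return newly heard commands."""
        words = json.loads(partial_json).get("partial", "").split()
        if len(words) < self.seen:
            self.seen = 0  # hypothesis shrank, so start counting again
        new = words[self.seen:]
        self.seen = len(words)
        return [w for w in new if w in self.commands]

    def on_final(self, final_json):
        """Feed the string from Result(); flush missed words and reset."""
        words = json.loads(final_json).get("text", "").split()
        new = words[self.seen:]
        self.seen = 0
        return [w for w in new if w in self.commands]
```

In the microphone loop this would be called as `spotter.on_final(rec.Result())` when `rec.AcceptWaveform(data)` returns true and as `spotter.on_partial(rec.PartialResult())` otherwise, with each returned word written to the input box separately. Leaving "[unk]" out of the `commands` argument keeps unknown speech from being forwarded.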