pocketsphinx.js + MULTIPLE keywords?

Open Villmer opened this issue 9 years ago • 4 comments

I have two questions:

  1. With pocketsphinx.js, can we simultaneously listen for multiple keywords instead of just one? I've looked at the live_kws.html sample, which allows one, but I've been unable to successfully write code that listens (and responds) to multiple keywords. Is there a way to supply an array of keywords or something? Anyone who can answer this (with code) will absolutely make my day! :-)
  2. If I want to listen for a key phrase that has two words, such as "play scene", what do I need to add to my word list?

For example, which of these is correct (or more efficient)?

  1. var wordList = [["PLAY","P L EY"],["SCENE","S IY N"]]
  2. var wordList = [["PLAY","P L EY"],["SCENE","S IY N"],["PLAY-SCENE","P L EY S IY N"]]
  3. var wordList = [["PLAY-SCENE","P L EY S IY N"]]

Villmer avatar Jan 20 '16 14:01 Villmer

  1. PocketSphinx does support listening for multiple key phrases, but currently only if they are provided in a file. To do that, put your key phrases in a text file, package it into JavaScript as explained in the documentation, load that JavaScript, and initialize the decoder with the argument that loads the file ("-kws"). Because the file is loaded at initialization, you also need to pass a corresponding dictionary file that contains the words in your key phrases. All of this is documented in README.md. The drawback of adding key phrases this way is that you cannot change them at runtime, so it would be great if the pocketsphinx API supported that too. This was already discussed and documented in earlier tickets, such as this one: https://github.com/syl22-00/pocketsphinx.js/issues/45
  2. None of them is incorrect; it depends on which words you have in your key phrases. Every word must have a pronunciation. I'd say it is more straightforward to use individual words, but combining a short phrase into one word might increase accuracy. That would have to be tested. Note that you may get better performance by using a grammar with well-chosen and tested probabilities instead.
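
As a rough illustration of the file-based approach above, here is a minimal sketch that generates the text of a key phrase file and a matching dictionary file. The helper name `buildKeywordFiles`, the thresholds, and the `STOP` entry are assumptions made for this example, not part of pocketsphinx.js. The one-phrase-per-line layout with a threshold between slashes follows the usual PocketSphinx key phrase list format, but check it against README.md before relying on it.

```javascript
// Hypothetical helper (not part of pocketsphinx.js): builds the text of a
// key phrase file and of a dictionary file covering every word used.
function buildKeywordFiles(keyPhrases, pronunciations) {
  const kwsLines = [];
  const dictWords = new Set();
  for (const { phrase, threshold } of keyPhrases) {
    // One key phrase per line, followed by its detection threshold in slashes.
    kwsLines.push(`${phrase} /${threshold}/`);
    for (const word of phrase.trim().split(/\s+/)) {
      // Every word in a key phrase needs a pronunciation.
      if (!Object.prototype.hasOwnProperty.call(pronunciations, word)) {
        throw new Error(`No pronunciation for "${word}"`);
      }
      dictWords.add(word);
    }
  }
  // One "WORD PHONES" entry per line, each word listed once.
  const dictLines = [...dictWords].map((w) => `${w} ${pronunciations[w]}`);
  return {
    kws: kwsLines.join("\n") + "\n",
    dict: dictLines.join("\n") + "\n",
  };
}

// Thresholds below are placeholders and would need tuning.
const files = buildKeywordFiles(
  [
    { phrase: "PLAY SCENE", threshold: "1e-20" },
    { phrase: "STOP", threshold: "1e-10" },
  ],
  { PLAY: "P L EY", SCENE: "S IY N", STOP: "S T AA P" }
);
console.log(files.kws);
console.log(files.dict);
```

The two strings would then be saved as files, packaged into JavaScript as the README describes, and referenced at initialization: "-kws" for the key phrase file, plus a dictionary argument (assumed here to be "-dict") for the dictionary file.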

syl22-00 avatar Jan 20 '16 14:01 syl22-00

I've created a "grammar" version that works well, but I can't determine when the recognizer has reached its "last" result after I've spoken. As I speak, the recognizer goes through several guesses before it finally stops. When it does, it is often correct, BUT I can't find (in the various js files) how to tell that it is "done" listening, so that I can fire an event using only the last result.

The reason I want only the LAST guess is that I want to fire a function. As it is, I will fire several unwanted functions before the last one.

So, here is my question: How do I determine if the processing session is done (reached a final result) after I've stopped speaking? I'm making a mobile application that continually listens. The user will speak one or two-word phrases (short commands under a second).
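
To make the question concrete, here is a minimal sketch of one generic workaround: treat a hypothesis as settled once it has stopped changing for a short quiet period. It does not rely on any pocketsphinx.js internals. The helper name `createSettleDetector` and the 500 ms window are assumptions for illustration only.

```javascript
// Hypothetical helper (not part of pocketsphinx.js): reports a hypothesis
// once it has not changed for `quietMs` milliseconds.
function createSettleDetector(quietMs) {
  let lastHyp = "";
  let lastChange = 0;
  let reported = true; // nothing to report until a hypothesis arrives
  return {
    // Call for every intermediate hypothesis, with the current time in ms.
    update(hyp, nowMs) {
      if (hyp !== lastHyp) {
        lastHyp = hyp;
        lastChange = nowMs;
        reported = false;
      }
    },
    // Call periodically; returns the settled hypothesis once, otherwise null.
    poll(nowMs) {
      if (!reported && lastHyp !== "" && nowMs - lastChange >= quietMs) {
        reported = true;
        return lastHyp;
      }
      return null;
    },
  };
}

const detector = createSettleDetector(500);
detector.update("PLAY", 0);
detector.update("PLAY SCENE", 200);
console.log(detector.poll(400)); // null
console.log(detector.poll(700)); // PLAY SCENE
console.log(detector.poll(900)); // null
```

In an application, `update` would be called from the recognizer's message handler whenever a hypothesis arrives and `poll` from a timer, both passing `Date.now()`. The command function fires only when `poll` returns a string. Depending on how the recognizer accumulates hypotheses, it may also need to be stopped and restarted after each command.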

Villmer avatar Jan 20 '16 15:01 Villmer

I'm taking a look at the ticket now (https://github.com/syl22-00/pocketsphinx.js/issues/45) ... I think if I can get the multiple keywords working, it will be a better solution.

Villmer avatar Jan 20 '16 15:01 Villmer

For future reference, here is a repo containing an example that uses multiple keywords: https://github.com/miguelmota/pocketsphinxjs-multiple-keywords

miguelmota avatar Feb 21 '16 01:02 miguelmota