Recognizing Background Noise
I have tested PocketSphinx.js against a compiled Android app using the same us-ptm models. Performance was equal on both platforms, except for one annoying difference: PocketSphinx.js often recognizes background noise as words. At least, I assume it is background noise. I can leave the webapp/live.html page open for 5 minutes in a fairly quiet room without saying a word, and it will still think I said 5 or 10 of the city words in the demo. I suspect many of you have seen this in the demo, as it happens on all 5 devices I have tried. I have tried different acoustic models too. Is there any way to change this sensitivity?
Anyone have a comment on this?
The live demo uses grammars. For continuous listening, pocketsphinx requires keyword spotting mode.
@Thread7 I have a few procedures for coping with background noise, but they are not issues with pocketsphinx.js, more the way I deal with it. Send me your contact details and I will get in touch by email.
@JohnAReid please email me at this address: thread7 AT gmail.com
Thanks a lot!
I am trying to write a command and control application using pocketsphinx.js inside a web worker with recognizer.js. I have successfully implemented keyword spotting for the commands with a switch to grammar for the remaining processing. The respective FSG grammar is written using state transitions as described in pocketsphinx.js README file. My problem is how to code the grammar to ignore out of grammar words and thus achieve accurate recognition. All solutions I found for pocketsphinx (as it still does not support the '
Rejecting words with grammars is indeed a difficult problem. You can try training filler words, if you go all the way to do acoustic model training, or you can add a phoneme loop to your grammar. A loop would just be a transition from one state to the same state.
As for using JSGF grammars, you can use them by loading them from a file, using LazyLoad for instance.
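To make the loop suggestion concrete, here is a minimal sketch of a grammar object with self-loop transitions, using the {numStates, start, end, transitions} shape that appears in this thread. The helper name buildGrammarWithLoop and the filler words "G1", "G2" are hypothetical; it assumes each filler word is already in the dictionary and mapped to a single phoneme. It only shows the structure and says nothing about which logp values give good rejection.

```javascript
// Hypothetical helper, not part of pocketsphinx.js: builds a two-state grammar
// where valid words go from state 0 to state 1 and filler words loop on state 0.
function buildGrammarWithLoop(words, fillers, fillerLogp) {
  const transitions = [];
  // Valid words: one transition each from the start state to the end state.
  for (const word of words) {
    transitions.push({from: 0, to: 1, logp: 0, word: word});
  }
  // Filler loop: a transition from state 0 back to state 0 for each filler.
  for (const filler of fillers) {
    transitions.push({from: 0, to: 0, logp: fillerLogp, word: filler});
  }
  return {numStates: 2, start: 0, end: 1, transitions: transitions};
}

// Example with two valid words and two assumed filler entries.
const grammar = buildGrammarWithLoop(["FIRST", "SECOND"], ["G1", "G2"], -5);
```

The resulting object can be passed wherever the application already passes its hand-written grammar, and the filler penalty can be varied in one place when experimenting.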
I have tried to implement a parallel garbage loop, but I always get garbage as output: even when I speak the words "FIRST" and "SECOND" in the example below, I still get a combination of G1, G2, ... as the grammar output. I used:
grammarOptions = {numStates: 2, start: 0, end: 1, transitions: [
{from: 0, to: 1, logp: 0, word: "FIRST"},
{from: 0, to: 1, logp: 0, word: "SECOND"},
{from: 0, to: 1, logp: -5, word: "G1"},
{from: 0, to: 1, logp: -5, word: "G2"},
... repeat for remaining phonemes
{from: 0, to: 0, logp: -5, word: "G1"},
{from: 0, to: 0, logp: -5, word: "G2"},
... repeat for remaining phonemes
]};
For the garbage phoneme transitions (both to state 1 and remaining in state 0) I used the same logp and tried values -5, -10, -20, all with the same negative result. What am I missing please?
Try -2000, sometimes it should choose a proper variant. Such experiments are easier to conduct with pocketsphinx desktop version, not with js.
@nshmyrev Many thanks for your input. I tried -2000 but unfortunately things did not change.
As pocketsphinx still does not support the unknown word
You can share a pocketsphinx_continuous example (not js) with audio file and grammar and I'll take a look.
Hi, I have produced two pocketsphinx_continuous examples using the commands:
(1) pocketsphinx_continuous -dict /share/keyphrase.dict -fsg /share/balanceGarbageLoop.fsg -inmic yes -infile /share/allNoise.wav (an audio file with only garbage noise outside the allowed grammar)
(2) pocketsphinx_continuous -dict /share/keyphrase.dict -fsg /share/balanceGarbageLoop.fsg -inmic yes -infile /share/accountNoise.wav (an audio file with garbage noise in the middle of allowed grammar words)
The grammar used contains a garbage loop as outlined in my previous append. In both examples, the garbage noise is recognised as valid grammar words ("CURRENT"). Also, in example (2), the first valid word "CURRENT" is systematically not recognised (over many runs), and pocketsphinx_continuous instead responds with the following error:
ERROR: "fsg_search.c", line 940: Final result does not match the grammar in frame 115
Any idea regarding the above error would also be very welcome.
The audio files and grammar used for the tests are at: https://www.dropbox.com/s/5mveyre9lnajdnp/ProblemDocumentation.zip?dl=0
Hi, any news regarding the above? Many thanks
Your audio is clipped, you simply need to reduce the recording level.
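One rough way to check a recording for clipping before blaming the grammar is to count how many samples sit at or near full scale. The sketch below assumes floating-point samples in the range [-1, 1]; the function name clippedFraction and the 0.99 threshold are illustrative choices, not something taken from pocketsphinx.

```javascript
// Hypothetical helper: returns the fraction of samples whose magnitude is at
// or above the threshold. Assumes float samples normalized to [-1, 1].
function clippedFraction(samples, threshold = 0.99) {
  if (samples.length === 0) {
    return 0;
  }
  let clipped = 0;
  for (let i = 0; i < samples.length; i++) {
    if (Math.abs(samples[i]) >= threshold) {
      clipped++;
    }
  }
  return clipped / samples.length;
}

// Two of these four samples are at full scale.
const fraction = clippedFraction(new Float32Array([0, 1, -1, 0.5]));
```

A fraction that is clearly above zero over a whole utterance suggests the recording level is too high.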
@nshmyrev Many thanks for pointing out the issue with recording levels. I did some more tests with the garbage loop. I used a "jsgf" form of the grammar (included below) and picked the weights for the garbage loop and the valid grammar words so that the distance between them is as large as possible (this is only one of the many tests I did). Using a sound file of eight valid words and a simple grammar without the garbage loop and weights, pocketsphinx_continuous provided excellent recognition results for all valid words over many runs. However, after adding back the garbage loop, and despite the large distance between the weights, pocketsphinx_continuous recognized garbage instead of the valid words in the vast majority of cases, as shown below after the grammar. All files (sound, grammars) are at: https://www.dropbox.com/s/2txxo0ep98odmhk/ProblemDocumentation.zip?dl=0
Getting a garbage loop to work is proving to be a very challenging problem. Is there any news regarding proper support of the UNKNOWN word by Pocketsphinx (I could see that the development team was working on it some time back)? Many thanks for all your support on this!
Grammar (with garbage loop) used:
#JSGF V1.0;
grammar balance;
public <balanceAccount> = /0.0000000000000000000000000000000000000000001/ <garbage_loop> |
/10000000000000000000000000000000000/
Pocketsphinx_continuous recognition results with the garbage loop added (input sound file has 8 valid words):
G20 G2 G11 G31 G12 G23 G9
G20 G32 G6 G33 G28 G34 G23 G32 G31
G39 G38 G13 G37 G18 G23 G15 G38 G27 G32 G30 G39 G13 G18 G13 G37 G29 G7 G20 G32 G3 G7
G7 G22 VISA
G33 G7 G18 G29 G11 G3 G24 G3 G3 G20
G36 G34 G36 G1 G2 G29 G33 G3 G11 G16 G1 G31 G9
G7 G33 G7 G3 G29 G16 G17 G3 G15 G3 G15 G6 G3 G27 G31 G10
Hi, any news regarding the above? Many thanks
Well, ideally one would rewrite the decoder to include the loop like we have in kws search ;) Give me some more time please.
@nshmyrev Hi Nickolay, any news? Your last append sounded very promising. It would be great if we could have grammar support in Pocketsphinx with the garbage loop embedded in the decoder. Many thanks
I agree it would be great to have this working.
I was wondering whether there has been any progress on this. Many thanks.
I am experiencing the same issue as everybody else.
A thought - it might be helpful to run some kind of volume change detection and only feed audio into pocketsphinx while the volume is measurably changing. So, if the audio goes flat for a period of time (a second or two?), stop actively analyzing it.
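The volume-change idea above could be sketched roughly as follows. This is an untested assumption about how such a gate might look, not something pocketsphinx.js provides: the names rms and createVolumeGate, the minChange threshold and the holdMs window are all hypothetical, and the samples are assumed to be floats in [-1, 1].

```javascript
// Root-mean-square level of one audio buffer.
function rms(samples) {
  let sum = 0;
  for (let i = 0; i < samples.length; i++) {
    sum += samples[i] * samples[i];
  }
  return samples.length ? Math.sqrt(sum / samples.length) : 0;
}

// Hypothetical gate: returns a function that reports whether the current
// buffer should be passed on to the recognizer. Audio is passed on only while
// the level has changed by at least minChange within the last holdMs.
function createVolumeGate(minChange, holdMs) {
  let previousLevel = null;
  let lastChangeMs = -Infinity;
  return function shouldFeed(samples, nowMs) {
    const level = rms(samples);
    if (previousLevel !== null && Math.abs(level - previousLevel) >= minChange) {
      lastChangeMs = nowMs;
    }
    previousLevel = level;
    return nowMs - lastChangeMs <= holdMs;
  };
}

// Example: require a level change of 0.05 and keep feeding for 1.5 seconds.
const shouldFeed = createVolumeGate(0.05, 1500);
```

In an application, shouldFeed would be called once per captured buffer with a timestamp, and the buffer would be forwarded to the recognizer only when it returns true. Both thresholds would need tuning per microphone and room.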
This seems to be a duplicate of https://github.com/syl22-00/pocketsphinx.js/issues/60