pocketsphinx.js

default config for memory-mapping causes problems when lazy-loading pocketsphinx-provided en-us acoustic model

Open jcmoore opened this issue 9 years ago • 1 comments

tl;dr: to (lazily) load the acoustic model from here (https://github.com/cmusphinx/pocketsphinx/tree/a60982363101704eca342e7e0920754090cd49b1/model/en-us/en-us) without warnings or errors, I have to pass a ["-mmap", "no"] configuration setting to the recognizer.
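For anyone wanting to apply the same workaround, here is a minimal sketch of how the setting could be passed. It assumes the web worker interface described in the pocketsphinx.js README, where the recognizer is configured through an 'initialize' message whose data is a list of [key, value] pairs. The helper names and the "en-us" model folder are hypothetical and only there for illustration.

```javascript
// Hypothetical helper: builds the [key, value] pairs for the recognizer.
// "-mmap", "no" disables memory mapping, which is the workaround from this issue.
function buildRecognizerConfig(hmmPath, extra = []) {
  return [["-hmm", hmmPath], ["-mmap", "no"], ...extra];
}

// Hypothetical helper: wraps the config in an 'initialize' message.
// The { command, callbackId, data } shape is assumed from the
// pocketsphinx.js README and is not verified against every version.
function buildInitializeMessage(callbackId, hmmPath, extra = []) {
  return {
    command: "initialize",
    callbackId: callbackId,
    data: buildRecognizerConfig(hmmPath, extra),
  };
}

// Usage in a browser, once the model files have been (lazily) loaded.
// "en-us" is an assumed folder name inside the emscripten file system:
//   const recognizer = new Worker("js/recognizer.js");
//   recognizer.postMessage(buildInitializeMessage(0, "en-us"));
```

Keeping the message construction in a plain function makes it easy to check that "-mmap" is really set before anything is sent to the worker.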

...

When loading the en-us model, I experienced numerous errors -- many survivable (ERROR: "dict.c", line 195: Line 134722: Phone 'Z' is mising in the acoustic model; word 'zyuganov(2)' ignored) and some fatal (Uncaught Assertion failed: (ci >= 0) && (ci < m->n_ciphone), at: /home/sylvain/dev/projects/pocketsphinx.js/pocketsphinx/src/libpocketsphinx/bin_mdef.c,758,bin_mdef_phone_id at). A comparable native (non-js/emscripten) build of pocketsphinx (on OSX) experienced no such problems for the same configuration (just setting an -hmm argument).

After narrowing down differences between the logs of the native and emscripten builds, I noticed the following warning occurred a number of times for me in pocketsphinx.js:

WARN: "bin_mdef.c", line 499: Senone 0 is shared between multiple base phones

It seems the mdef was not loading properly: many (but seemingly not all) of my m->phone[i].ssid values were 0 as a result. If I understand correctly, the offending code is here (https://github.com/cmusphinx/pocketsphinx/blob/a60982363101704eca342e7e0920754090cd49b1/src/libpocketsphinx/bin_mdef.c#L403-L430). Memory mapping is on by default. I don't know emscripten that well, but I suspected it might not support memory mapping, and when I explicitly disabled it with ["-mmap", "no"], the errors and warnings went away.

Strangely, this did not seem to happen for the acoustic model cmusphinx-en-us-5.2.tar.gz here (https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/US%20English/). Go figure.

jcmoore avatar Aug 29 '16 04:08 jcmoore

I ran into this problem too. It is an issue with Emscripten that I don't fully understand: it looks like mmap() should work in the JavaScript runtime, but it actually just returns zero-filled memory. The good news is that there is no real advantage to using mmap() to load files, so you can just turn it off.

dhdaines avatar Apr 24 '20 14:04 dhdaines