
Updated instructions

Open JerryGeis opened this issue 9 months ago • 24 comments

Are there updated instructions for this? I followed the example web page provided, and it does not work. The actual example web page does work, but it looks nothing like the example web page. I would like to get this working.

jerry

JerryGeis avatar Mar 16 '25 16:03 JerryGeis

Could you give more detail? I'm guessing you're talking about the "Basic Example" in the README? And you could describe what not working means, e.g. was there a compilation error or some specific message/behavior that was unexpected?

Even better, propose a change to the documentation that could fix it.

Keep in mind this is an open source project with basically one guy making free updates. We should try to make his work easier.

(I'm not a maintainer. Just another user like you.)

erikh2000 avatar Mar 16 '25 17:03 erikh2000

820f4021-a204-403b-bea9-622bb1747630:243 Recognizer (id: ee6b9fdb-72b6-40da-8463-32f2053a4306): Could not be created due to: TypeError: Cannot convert "undefined" to float TypeError: Cannot convert "undefined" to float at Object.toWireType (blob:

This is the error I get. Correct, I used the code in the README.

Testing

jerry

JerryGeis avatar Mar 16 '25 19:03 JerryGeis

Ah, cool. It's the same as the error in your other issue, I see.

Checking my own working code to see if there's a difference. Hmm. Nope, you're doing the same thing.

I would open developer tools (or non-Chrome equivalent) in your browser and check a few things:

  1. Do you have "404" errors under the network tab, particularly for "vosk/model.tar.gz"? If so, you should verify that your web server hosts the model file where it is expected.
  2. If #1 gives no insights, then narrow down the exact line that throws the exception. You can do this by adding console.log() statements or by using the browser debugger to set a breakpoint and step through.
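
The first check can also be scripted rather than read off the network tab. This is only a rough sketch: checkModelUrl is a made-up helper (not part of vosk-browser), and the "vosk/model.tar.gz" path is just the one mentioned above, so substitute your own. It also assumes your web server answers HEAD requests.

```javascript
// Hypothetical helper (not part of vosk-browser): asks the server whether the
// model archive is reachable before the URL is handed to Vosk.createModel().
// fetchImpl is injectable so the function can be exercised without a network.
async function checkModelUrl(url, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(url, { method: 'HEAD' });
  // A 404 here means the server is not hosting the file at this path.
  return { ok: response.ok, status: response.status };
}

// Usage from the browser console (the path is an assumption; use your own):
// checkModelUrl('vosk/model.tar.gz').then(console.log);
```

If this reports a 404, the problem is the hosting path and not the recognizer code.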

erikh2000 avatar Mar 16 '25 20:03 erikh2000

This is everything in the console log up to the error:

    LOG (VoskAPI:ReadDataFiles():src/model.cc:211) Decoding params beam=10 max-active=3000 lattice-beam=2 put_char @ 820f4021-a204-403b-bea9-622bb1747630:41
    LOG (VoskAPI:ReadDataFiles():src/model.cc:214) Silence phones 1:2:3:4:5:6:7:8:9:10 put_char @ 820f4021-a204-403b-bea9-622bb1747630:41
    LOG (VoskAPI:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 0 orphan nodes. put_char @ 820f4021-a204-403b-bea9-622bb1747630:41
    LOG (VoskAPI:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 0 orphan components. put_char @ 820f4021-a204-403b-bea9-622bb1747630:41
    LOG (VoskAPI:CompileLooped():nnet-compile-looped.cc:345) Spent 0.0239999 seconds in looped compilation. put_char @ 820f4021-a204-403b-bea9-622bb1747630:41
    LOG (VoskAPI:ReadDataFiles():src/model.cc:238) Loading i-vector extractor from /vosk/vosk_model_tar_gz/ivector/final.ie put_char @ 820f4021-a204-403b-bea9-622bb1747630:41
    LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor put_char @ 820f4021-a204-403b-bea9-622bb1747630:41
    LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:198) Done. put_char @ 820f4021-a204-403b-bea9-622bb1747630:41
    LOG (VoskAPI:ReadDataFiles():src/model.cc:271) Loading HCL and G from /vosk/vosk_model_tar_gz/graph/HCLr.fst /vosk/vosk_model_tar_gz/graph/Gr.fst put_char @ 820f4021-a204-403b-bea9-622bb1747630:41
    LOG (VoskAPI:ReadDataFiles():src/model.cc:292) Loading winfo /vosk/vosk_model_tar_gz/graph/phones/word_boundary.int put_char @ 820f4021-a204-403b-bea9-622bb1747630:41
    820f4021-a204-403b-bea9-622bb1747630:243 Recognizer (id: ee6b9fdb-72b6-40da-8463-32f2053a4306): Could not be created due to: TypeError: Cannot convert "undefined" to float
    TypeError: Cannot convert "undefined" to float
        at Object.toWireType (blob:

JerryGeis avatar Mar 16 '25 20:03 JerryGeis

I added the JERRY comments:

    JERRY createModel() complete
    vosk.html:21 JERRY KaldiRecognizer complete
    1647b03b-fa7b-46b2-ae2e-79af992cec48:243 Recognizer (id: 1afb9cb3-5636-4ec2-91d8-3b248096ae94): Could not be created due to: TypeError: Cannot convert "undefined" to float
    TypeError: Cannot convert "undefined" to float
        at Object.toWireType (blob:https://devgeis.layeredsolutionsinc.com/1647b03b-fa7b-46b2-ae2e-79af992cec48:41:4178973)
        at KaldiRecognizer.constructor$KaldiRecognizer (eval at new_ (blob:https://devgeis.layeredsolutionsinc.com/1647b03b-fa7b-46b2-ae2e-79af992cec48:41:4171005), <anonymous>:8:26)
        at KaldiRecognizer.<anonymous> (blob:https://devgeis.layeredsolutionsinc.com/1647b03b-fa7b-46b2-ae2e-79af992cec48:41:4169898)
        at new KaldiRecognizer (eval at createNamedFunction (blob:https://devgeis.layeredsolutionsinc.com/1647b03b-fa7b-46b2-ae2e-79af992cec48:41:4146821), <anonymous>:4:34)
        at RecognizerWorker.<anonymous> (blob:https://devgeis.layeredsolutionsinc.com/1647b03b-fa7b-46b2-ae2e-79af992cec48:232:43)
        at Generator.next (<anonymous>)
        at blob:https://devgeis.layeredsolutionsinc.com/1647b03b-fa7b-46b2-ae2e-79af992cec48:25:75
        at new Promise (<anonymous>)
        at __awaiter (blob:https://devgeis.layeredsolutionsinc.com/1647b03b-fa7b-46b2-ae2e-79af992cec48:21:16)
        at RecognizerWorker.createRecognizer (blob:https://devgeis.layeredsolutionsinc.com/1647b03b-fa7b-46b2-ae2e-79af992cec48:224:20)
    (anonymous) @ 1647b03b-fa7b-46b2-ae2e-79af992cec48:243
    (anonymous) @ 1647b03b-fa7b-46b2-ae2e-79af992cec48:25
    __awaiter @ 1647b03b-fa7b-46b2-ae2e-79af992cec48:21
    createRecognizer @ 1647b03b-fa7b-46b2-ae2e-79af992cec48:224
    handleMessage @ 1647b03b-fa7b-46b2-ae2e-79af992cec48:153
    (anonymous) @ 1647b03b-fa7b-46b2-ae2e-79af992cec48:107
    vosk.html:33 JERRY before audiocontext
    vosk.html:36 [Deprecation] The ScriptProcessorNode is deprecated. Use AudioWorkletNode instead. (https://bit.ly/audio-worklet)
    init @ vosk.html:36
    vosk.html:47 JERRY init done

JerryGeis avatar Mar 16 '25 20:03 JerryGeis

Useful, but... again, I would use some basic JS troubleshooting to narrow it down to the line that is failing. In the screenshot below, you can see my debugger is up on the Vosk demo browser site. In this case, the call to createModel() is successful and shows healthy values. Maybe if you ran your code snippet, you'd see that the call to createModel() fails, or maybe it's the next call, const recognizer = new model.KaldiRecognizer();, that fails.

And sometimes it's useful to step inside Vosk-Browser code for more insight into what is failing. Looking at your console log above, there's clearly some variable that is undefined that is expected to be a float. By looking at source code in the debugger at that point of failure, you might learn it's like a sample rate value or something like that. And then you just backtrack through the call stack to see what should have supplied a value and where the first problem was.

Image

erikh2000 avatar Mar 16 '25 21:03 erikh2000

I posted my thing above just after you added your "JERRY" notes. (Nice, thanks.) I can see that it's likely throwing inside of new model.KaldiRecognizer(); but still I think my thoughts on using the debugger apply.

erikh2000 avatar Mar 16 '25 21:03 erikh2000

The line below marked JERRY-----> is the error

    createRecognizer({recognizerId, sampleRate, grammar, }) {
        var _a;
        return __awaiter(this, void 0, void 0, function*() {
            console.debug(`Creating recognizer (id: ${recognizerId}) with sample rate ${sampleRate} and grammar ${grammar}`);
            try {
                let kaldiRecognizer;
                if (grammar) {
                    kaldiRecognizer = new this.Vosk.KaldiRecognizer(this.model,sampleRate,grammar);
                } else {
                    kaldiRecognizer = new this.Vosk.KaldiRecognizer(this.model,sampleRate);
                }
                this.recognizers.set(recognizerId, {
                    id: recognizerId,
                    kaldiRecognizer,
                    sampleRate,
                    grammar,
                });
            } catch (error) {
                const errorMsg = `Recognizer (id: ${recognizerId}): Could not be created due to: ${error}\n${(_a = error) === null || _a === void 0 ? void 0 : _a.stack}`;
JERRY ----->>>                console.error(errorMsg);
                return {
                    error: errorMsg,
                };
            }
        });
    }

JerryGeis avatar Mar 16 '25 21:03 JerryGeis

Okay, that narrows it a little. The exception is being thrown from inside one of two constructor calls. (Either the one that passes grammar or the other.)

You've got a console debug above the call. Does it show sampleRate as being undefined/null or set to a float value? If it's undefined, then it's almost certainly the cause of the error.

If no insights come from this, I again recommend stepping into the code with a debugger. Particularly, the constructor call.
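
To make that check explicit rather than relying on reading the debug line, something like the guard below could be called before the recognizer is constructed. It is only a sketch: assertValidSampleRate is a hypothetical helper, not a vosk-browser function.

```javascript
// Hypothetical guard (not part of vosk-browser): fails early, with a readable
// message, when the sample rate is missing or not a usable number.
function assertValidSampleRate(sampleRate) {
  const isUsable =
    typeof sampleRate === 'number' && Number.isFinite(sampleRate) && sampleRate > 0;
  if (!isUsable) {
    throw new TypeError(`sampleRate must be a positive number, got: ${String(sampleRate)}`);
  }
  return sampleRate;
}

// Usage: wrap the value you are about to pass to the recognizer, e.g.
// const rate = assertValidSampleRate(mySampleRate); // mySampleRate is your own variable
```

An undefined sample rate then fails in your own code with a clear message instead of deep inside the WASM binding.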

erikh2000 avatar Mar 16 '25 22:03 erikh2000

Thanks for the assistance. What exactly should I put in the script to show what you need?

JerryGeis avatar Mar 16 '25 22:03 JerryGeis

Keep in mind I'm just another user helping out. I don't need anything! :) My friendly suggestion, one dev to another, is to dig in with the browser debugger and learn for yourself why the error occurs. My suggestions above are more like "this is what I would do for my troubleshooting if I were in your shoes".

erikh2000 avatar Mar 17 '25 01:03 erikh2000

Ok, diving deep: KaldiRecognizer() constructor says sampleRate and grammar are undefined??? Not good.

I am using the small English model. Is that what you're using? The contents of my file are:

    tar -tvf model.tar.gz
    drwxr-xr-x silentm/silentm        0 2020-12-08 10:39 model/
    drwxr-xr-x silentm/silentm        0 2020-12-08 10:33 model/graph/
    -rw-r--r-- silentm/silentm 24013795 2020-12-08 10:23 model/graph/Gr.fst
    drwxr-xr-x silentm/silentm        0 2020-12-08 10:32 model/graph/phones/
    -rw-r--r-- silentm/silentm     1761 2020-12-08 10:23 model/graph/phones/word_boundary.int
    -rw-r--r-- silentm/silentm      102 2020-12-08 10:23 model/graph/disambig_tid.int
    -rw-r--r-- silentm/silentm 22416994 2020-12-08 10:23 model/graph/HCLr.fst
    drwxr-xr-x silentm/silentm        0 2020-12-08 10:32 model/am/
    -rw-r--r-- silentm/silentm 15962575 2020-12-08 10:33 model/am/final.mdl
    drwxr-xr-x silentm/silentm        0 2020-12-08 10:33 model/ivector/
    -rw-r--r-- silentm/silentm     1080 2020-12-08 10:33 model/ivector/global_cmvn.stats
    -rw-r--r-- silentm/silentm       95 2020-12-08 10:33 model/ivector/online_cmvn.conf
    -rw-r--r-- silentm/silentm   168048 2020-12-08 10:33 model/ivector/final.dubm
    -rw-r--r-- silentm/silentm  8288887 2020-12-08 10:33 model/ivector/final.ie
    -rw-r--r-- silentm/silentm    44975 2020-12-08 10:33 model/ivector/final.mat
    -rw-r--r-- silentm/silentm       35 2020-12-08 10:33 model/ivector/splice.conf
    -rw-r--r-- silentm/silentm      199 2020-12-08 10:39 model/README
    drwxr-xr-x silentm/silentm        0 2020-12-08 10:33 model/conf/
    -rw-r--r-- silentm/silentm      290 2020-12-08 10:33 model/conf/model.conf
    -rw-r--r-- silentm/silentm      131 2020-12-08 10:33 model/conf/mfcc.conf

Is this what you have? Thanks!

JerryGeis avatar Mar 17 '25 13:03 JerryGeis

KaldiRecognizer() constructor says sampleRate

Nice, you've narrowed it. You probably just need to make sure your calling code to createRecognizer() is passing a sampleRate value.

Yes, I also use the small English model.

erikh2000 avatar Mar 17 '25 14:03 erikh2000

Do you pass parameters in your call to KaldiRecognizer()? The sample has no parameters. When I add 16000 as a parameter, I no longer see errors, but it does not work. When I add (model, 16000), I see errors again:

    vosk.js:245 Uncaught (in promise) DataCloneError: Failed to execute 'postMessage' on 'Worker': EventTarget object could not be cloned.
        at Model.postMessage (vosk.js:245:25)
        at new <anonymous> (vosk.js:297:27)
        at init (vosk.html:13:24)

Jerry

JerryGeis avatar Mar 17 '25 14:03 JerryGeis

I'm using 44100 for the sample rate. In the example apps that are in the repo, I see 48000.
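
For comparison, here is a rough sketch of recognizer setup with the sample rate passed explicitly. The createModel() call, the KaldiRecognizer(sampleRate) constructor, and the "result"/"partialresult" events follow the README's Basic Example and the constructor discussed in this thread. The createRecognizer wrapper, its parameters, and the idea of taking the rate from the AudioContext are my own assumptions, not something the docs state.

```javascript
// Sketch only: `vosk` is the object exported by vosk-browser (passed in so the
// function is easy to test), and modelUrl is whatever path your server uses.
async function createRecognizer(vosk, modelUrl, sampleRate) {
  const model = await vosk.createModel(modelUrl);
  // Pass the sample rate explicitly. Calling the constructor with no arguments
  // is what led to 'Cannot convert "undefined" to float' in this thread.
  const recognizer = new model.KaldiRecognizer(sampleRate);
  recognizer.on('result', (message) => {
    console.log(`Result: ${message.result.text}`);
  });
  recognizer.on('partialresult', (message) => {
    console.log(`Partial result: ${message.result.partial}`);
  });
  return recognizer;
}

// Browser usage (assumes the vosk-browser script is loaded as `Vosk`, and
// assumes the recognizer should use the same rate as the audio being captured):
// const audioContext = new AudioContext();
// const recognizer = await createRecognizer(Vosk, 'model.tar.gz', audioContext.sampleRate);
```

Note that the model is not passed to the constructor. The recognizer class hangs off the model object, so the only arguments are the sample rate and, optionally, the grammar.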

erikh2000 avatar Mar 17 '25 15:03 erikh2000

So I am even more confused now. I set the parameter with KaldiRecognizer(44100); but WHY is that not in the example? Now I get no errors, but nothing prints on the console when I speak.

Where does the grammar get defined? It's still "undefined".

jerry

JerryGeis avatar Mar 17 '25 16:03 JerryGeis

Can you share your start page or the "init" code like they have? Clearly there is something else I am missing.

Thanks

JerryGeis avatar Mar 17 '25 20:03 JerryGeis

Jerry, I think my project is maybe factored out in a way that will be hard to follow. But here is the recognizer startup code: https://github.com/erikh2000/sl-web-speech/blob/main/src/speech/Recognizer.ts

If you got one of the sample projects from vosk-browser running, that's probably going to be easier to compare against.

erikh2000 avatar Mar 18 '25 04:03 erikh2000

    get KaldiRecognizer() {
        const model = this;
        return class extends EventTarget {
            constructor(sampleRate, grammar) {

In this code the "grammar" is undefined. How and where do I set this? I tried looking at the examples:

    vosk-browser/examples/
        deprecated
        modern-vanilla
        react
        words-vanilla

and I don't see anything. Thanks

Jerry

JerryGeis avatar Mar 18 '25 12:03 JerryGeis

BTW, none of the examples in the vosk-browser/examples directory work for me.

Jerry

JerryGeis avatar Mar 18 '25 16:03 JerryGeis

Erik, thanks for your time and patience. I GOT ONE of the examples to work: modern-vanilla.

I had to change this line in index.js:

    const model = await Vosk.createModel('vosk/vosk-browser/examples/modern-vanilla/model.tar.gz');

Then copy my model.tar.gz file to the correct directory also.
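
One way to see which absolute URL a relative model path points at is to resolve it against the page address with the standard URL constructor. This is just a sketch: resolveModelUrl is a made-up helper and the example.com addresses are placeholders. I have not confirmed how vosk-browser resolves the path internally, so treat it as a sanity check only.

```javascript
// Hypothetical helper: resolves a relative model path against a page URL using
// the standard URL constructor, i.e. the usual rules for relative URLs.
function resolveModelUrl(relativePath, pageUrl) {
  return new URL(relativePath, pageUrl).href;
}

// Placeholder addresses for illustration only.
console.log(resolveModelUrl('model.tar.gz', 'https://example.com/examples/modern-vanilla/index.html'));
// prints "https://example.com/examples/modern-vanilla/model.tar.gz"
console.log(resolveModelUrl('vosk/model.tar.gz', 'https://example.com/app/index.html'));
// prints "https://example.com/app/vosk/model.tar.gz"

// In a real page: resolveModelUrl('model.tar.gz', document.baseURI)
```

The model file then has to be hosted at exactly the URL that comes out.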

Jerry

JerryGeis avatar Mar 18 '25 17:03 JerryGeis

So it sounds like you've got things working except for specifying grammar. I've never used that feature before, though I've always wondered how well it works.

For grammar, Vosk-browser calls into C++ code from Vosk that has been compiled to WASM. I looked up the code that is called from the Vosk project and found it here:

https://github.com/alphacep/vosk-api/blob/7012103b3bc7a3da67faf9eb452991e60fcd8af7/src/recognizer.cc

Based on that, it looks to me like the grammar parameter should be a JSON string. So in JS something like:

const grammar = '["dog", "cat", "apple", "banana"]';

Or maybe there is some kind of JSON-to-string conversion built into the JS-to-WASM interface. If the string version above doesn't work to pass, you might also try:

const grammar = ["dog", "cat", "apple", "banana"];

If you get this working, consider listing out your recommended improvements to the docs in a way that is easy for Daniel to make them. I think the project could benefit from your struggle.
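
If the JSON-string form turns out to be the right one, building it with JSON.stringify avoids quoting mistakes. This is an untested-against-Vosk sketch: buildGrammar is a hypothetical helper, and passing the result as the second constructor argument is my guess based on the constructor(sampleRate, grammar) signature quoted earlier in this thread.

```javascript
// Hypothetical helper: turns a list of phrases into the JSON string form that
// the grammar parameter appears to expect.
function buildGrammar(phrases) {
  if (!Array.isArray(phrases) || !phrases.every((p) => typeof p === 'string')) {
    throw new TypeError('phrases must be an array of strings');
  }
  return JSON.stringify(phrases);
}

const grammar = buildGrammar(['dog', 'cat', 'apple', 'banana']);
console.log(grammar); // prints ["dog","cat","apple","banana"]

// Assumed usage, not verified:
// const recognizer = new model.KaldiRecognizer(sampleRate, grammar);
```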

erikh2000 avatar Mar 18 '25 18:03 erikh2000

Will do, if I get that original code to work.

Silly question: if I say something like "open door 4", it decodes as "open door for". If I say something like "open door number 4", it gives me "open door FOUR". How would I get "open door 4"? What do I say to get that?

Thanks - this is going to be fun!

Jerry

JerryGeis avatar Mar 18 '25 19:03 JerryGeis

If my app is listening for specific words like "four", I usually write logic to catch homophones, e.g., "for". So basically, someplace I'm going to have code that is like if (word === 'for' || word === 'four') number = 4; And depending on what the app is doing, I might need to be pretty loose in accepting close-enough words, like accepting "I" for "Hi".

Very curious how well the grammar works. It might be that specifying "four" and not "for" as an accepted word will also solve this problem. But I never tried it, and I don't know if that's how it works.
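
A slightly more general version of that if-statement could look like the sketch below. The word list and function names are just illustrations I made up. It only rewrites the last word of the phrase, on the assumption that the number comes last (as in "open door for"), so that an ordinary "to" or "for" earlier in the sentence is left alone.

```javascript
// Illustrative lookup table: spoken number words plus common homophones.
const NUMBER_WORDS = new Map([
  ['one', 1], ['won', 1],
  ['two', 2], ['to', 2], ['too', 2],
  ['four', 4], ['for', 4],
  ['eight', 8], ['ate', 8],
]);

// Returns the number for a recognized word, or null when there is no match.
function wordToNumber(word) {
  const value = NUMBER_WORDS.get(word.toLowerCase());
  return value === undefined ? null : value;
}

// Replaces only the final word of the phrase when it maps to a number.
function normalizeTrailingNumber(text) {
  const words = text.trim().split(/\s+/);
  const value = wordToNumber(words[words.length - 1]);
  if (value !== null) {
    words[words.length - 1] = String(value);
  }
  return words.join(' ');
}

console.log(normalizeTrailingNumber('open door for')); // prints "open door 4"
```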

erikh2000 avatar Mar 18 '25 19:03 erikh2000