
Attempting to pass data to the KaldiRecognizer results in an odd internal error

Open TimBoettcher opened this issue 1 year ago • 4 comments

I'm trying to integrate vosk-browser into my Rust-based WASM project.

First off, I'd like to note that the API documentation linked in the README could be more precise: I only learned that model.KaldiRecognizer() requires sampleRate as an argument by looking at the source code.

I'm using the AudioRecorder web API to record a MediaStream, converting that to a Float32Array and copying that array into an AudioBuffer, which I then pass to acceptWaveform().

Apparently, the microphone records at a rate of 48 kHz, which seems reasonable to me. But when I actually pass the data to acceptWaveform(), I receive the following error:

ASSERTION_FAILED (VoskAPI:Compute():mel-computations.cc:242) Assertion failed: (!KALDI_ISNAN((*mel_energies_out)(i)))

This is followed by a second log entry that contains only undefined.

I'm not sure what this is about, honestly. Any pointers would be appreciated.

TimBoettcher avatar Jul 18 '23 11:07 TimBoettcher

Hey, Tim. What happens if you run the examples? Do you get the same error with their calls to acceptWaveform()? Just trying to narrow down the problem, e.g. maybe Vosk always barfs on 48k sample rate.

Note I am not a maintainer. Just another vosk-browser user.

erikh2000 avatar Jul 18 '23 16:07 erikh2000

Hi @TimBoettcher

There are Rust bindings for Vosk, which I guess would be a better choice for your Rust application.

As for your issue, could you share which model you're using and what sampleRate you're passing as a param to the recognizer?

ccoreilly avatar Jul 25 '23 16:07 ccoreilly

@ccoreilly The Rust bindings are for a non-WASM context, though. Since I'm compiling the Rust project to WASM, the bindings wouldn't be particularly helpful, I believe.

I'm dynamically checking for the sample rate provided by the user's media device (via the settings of the MediaStream). In my case, that's a value of 48000, which I pass to the recognizer.

As for the models, I downloaded the models intended for mobile devices from this website, excluding those models that did not comply with the file structure specified in the lib README.

The same error occurred with the small models for German and English.

TimBoettcher avatar Jul 25 '23 16:07 TimBoettcher

I would assume there is an issue with the inputs you're passing to the acceptWaveform method.

Could you share the snippet of code you use to record up to when you feed it to the recognizer?

ccoreilly avatar Jul 25 '23 17:07 ccoreilly
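
One way to test the suggestion that the inputs are at fault is to inspect the samples just before they reach acceptWaveform(). The assertion in the original report fires when a mel energy is NaN, and NaN or infinite input samples are one plausible cause, though the thread does not confirm it. The sketch below is a minimal check under that assumption. `inspectSamples` and `SampleReport` are hypothetical names and are not part of vosk-browser. The [-1, 1] bound is the nominal range of Web Audio float samples, not a documented vosk-browser requirement.

```typescript
// Hypothetical diagnostic helper; not part of the vosk-browser API.
interface SampleReport {
  length: number;          // total number of samples inspected
  nanCount: number;        // samples that are NaN
  infiniteCount: number;   // samples that are +Infinity or -Infinity
  outOfRangeCount: number; // finite samples outside the nominal [-1, 1] range
}

function inspectSamples(samples: Float32Array): SampleReport {
  const report: SampleReport = {
    length: samples.length,
    nanCount: 0,
    infiniteCount: 0,
    outOfRangeCount: 0,
  };
  samples.forEach((s) => {
    if (Number.isNaN(s)) {
      report.nanCount++;
    } else if (!Number.isFinite(s)) {
      report.infiniteCount++;
    } else if (s < -1 || s > 1) {
      report.outOfRangeCount++;
    }
  });
  return report;
}
```

Logging the report for the Float32Array that was copied into the AudioBuffer, right before the acceptWaveform() call, would show whether bad values are already present at that point. A non-zero nanCount or infiniteCount would point at the recording or conversion step rather than at the recognizer or the 48 kHz sample rate.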