vosk-api

Nodejs microphone recognition starts slow

Open s267348 opened this issue 1 year ago • 9 comments

I tried to run Vosk on Node.js after following all the steps, and I'm running into two problems at the moment:

  1. When I run the command that calls micInstance.start(); it takes almost 1 minute to start the recording. I figured the problem might be the setup of the model; is it possible to load it independently from the start of the recording?

  2. When the recording starts, this is my output: https://gyazo.com/20fbbd68389464eec26b87828a3cc682. I even tried running rec from sox beforehand to check the audio, and it sounds fine. It either gets the wrong words or keeps spamming "the" even when I'm silent.

Since I saw the same problem in issue #841, I was wondering if anyone had resolved it.

Another minor problem is that I have to use the absolute path of the model, otherwise this check gets triggered:

if (!fs.existsSync(MODEL_PATH)) {
    console.log("Please download the model from https://alphacephei.com/vosk/models and unpack as " + MODEL_PATH + " in the current folder.");
    process.exit();
}

s267348 Aug 03 '22 17:08

is it possible to load it independently from the start of the recording?

Yes

nshmyrev Aug 03 '22 17:08
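One way to do this, sketched under assumptions: create the model once at program startup and start the microphone only afterwards, timing each stage to see which one accounts for the delay. The helper timed below is hypothetical and uses only Node's perf_hooks. The commented usage assumes the vosk and mic packages (vosk.Model, vosk.Recognizer, micInstance) and is not executed here.

```javascript
const { performance } = require('node:perf_hooks');

// Hypothetical helper: run one synchronous setup stage, log how long it took,
// and return both the stage's result and the elapsed milliseconds.
function timed(label, fn) {
  const t0 = performance.now();
  const value = fn();
  const ms = performance.now() - t0;
  console.log(`${label}: ${ms.toFixed(0)} ms`);
  return { value, ms };
}

// Assumed usage with the vosk and mic packages (not executed here):
//   const model = timed('model load', () => new vosk.Model(MODEL_PATH)).value;
//   const rec = new vosk.Recognizer({ model: model, sampleRate: 16000 });
//   // ...later, once the model is ready:
//   timed('mic start', () => micInstance.start());

// Stand-in stage so the sketch runs without a model or a microphone.
const demo = timed('stand-in stage', () => 21 * 2);
```

Note that timed only measures the synchronous call itself. If the microphone activates some time after start() returns, that delay would have to be measured separately, for example as the time until the first 'data' event.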

When the recording starts, this is my output: https://gyazo.com/20fbbd68389464eec26b87828a3cc682. I even tried running rec from sox beforehand to check the audio, and it sounds fine. It either gets the wrong words or keeps spamming "the" even when I'm silent.

You need to dump the audio data you send to recognizer to a file and share it

nshmyrev Aug 03 '22 17:08

You need to dump the audio data you send to recognizer to a file and share it

I'm kinda new to this topic, so sorry for the question: do you want me to write the data received here to a file? Also, does the file have to be in a specific format?

micInputStream.on('data', (data: any) => {
    if (rec.acceptWaveform(data)) {
        console.log(rec.result());
    } else {
        console.log("Partial: " + rec.resultString());
    }
    fs.appendFile("./output.raw", data, function(err: any) {
    });
});

s267348 Aug 03 '22 18:08

This code is enough. You do not need any format, raw data is ok.

nshmyrev Aug 03 '22 19:08
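Since a raw dump has no header, its duration has to be worked out from the byte count. A small sketch, assuming the microphone delivers 16-bit mono PCM at the configured rate (the helper name rawDurationSeconds is illustrative):

```javascript
// Duration of headerless PCM audio: bytes / (samples per second * channels * bytes per sample).
// The defaults (mono, 2 bytes per sample) are assumptions about the microphone settings.
function rawDurationSeconds(byteLength, sampleRate, channels = 1, bytesPerSample = 2) {
  return byteLength / (sampleRate * channels * bytesPerSample);
}

// 64000 bytes of 16-bit mono audio at 16000 Hz
console.log(rawDurationSeconds(64000, 16000)); // 2
```

Under the same assumptions, sox should be able to play the dump with `play -t raw -r 16000 -e signed-integer -b 16 -c 1 output.raw`, which is a quick way to hear exactly what the recognizer received.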

Sorry for the late reply. Here is the file output.raw, resulting from a mix of silence and the phrase "recording attempt number one": output.zip

s267348 Aug 04 '22 11:08

The audio file you shared has 2 seconds of complete silence. Maybe your microphone is muted or you are picking up the audio from the wrong input.

nshmyrev Aug 04 '22 22:08
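A dump can be checked for silence before sharing it. The sketch below assumes 16-bit little-endian PCM and reports the peak and RMS amplitude. Values at or near zero over the whole file suggest a muted microphone or the wrong input. The helper name pcm16Stats is illustrative.

```javascript
// Peak and RMS amplitude of a Buffer holding 16-bit little-endian PCM samples.
// A trailing odd byte, if any, is ignored.
function pcm16Stats(buf) {
  const samples = Math.floor(buf.length / 2);
  let peak = 0;
  let sumSquares = 0;
  for (let i = 0; i < samples; i++) {
    const s = buf.readInt16LE(i * 2);
    peak = Math.max(peak, Math.abs(s));
    sumSquares += s * s;
  }
  const rms = samples > 0 ? Math.sqrt(sumSquares / samples) : 0;
  return { samples, peak, rms };
}

// Assumed usage on a dump file (not executed here):
//   console.log(pcm16Stats(require('node:fs').readFileSync('./output.raw')));

// Two seconds of digital silence at 16000 Hz
console.log(pcm16Stats(Buffer.alloc(16000 * 2 * 2)));
```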

The problem was that the microphone took like 5/6 seconds before activation, so my short attempts recorded just silence. I also noticed that with a rate of 8000 instead of 16000 in the settings, the partial text printed was far more accurate. It's working as intended now, apparently. Thanks!

s267348 Aug 06 '22 11:08
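To put a number on the activation delay, one option is to measure how long a dump stays below some amplitude threshold before the first audible sample. A sketch, assuming 16-bit little-endian mono PCM; the threshold of 500 is an arbitrary illustrative value and the helper name leadingSilenceSeconds is hypothetical:

```javascript
// Seconds of audio before the first sample whose absolute value exceeds the threshold.
// If no sample exceeds it, the full duration of the buffer is returned.
function leadingSilenceSeconds(buf, sampleRate, threshold = 500) {
  const samples = Math.floor(buf.length / 2);
  for (let i = 0; i < samples; i++) {
    if (Math.abs(buf.readInt16LE(i * 2)) > threshold) {
      return i / sampleRate;
    }
  }
  return samples / sampleRate;
}

// One second at 16000 Hz, silent except for a single loud sample halfway through.
const example = Buffer.alloc(16000 * 2);
example.writeInt16LE(2000, 8000 * 2);
console.log(leadingSilenceSeconds(example, 16000)); // 0.5
```

A result of several seconds on a recording where speech began immediately would point at the capture side rather than the recognizer.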

I would figure out why activation is so slow; it should not work like that.

nshmyrev Aug 08 '22 17:08

"The problem was that the microphone took like 5/6 seconds before activation" - I currently have this problem (I've already been on it for 5 to 10 hours. I've tried a lot, but I don't feel close to a solution yet!).

How did you solve it?

I use two different "node-mics":

const Microphone = require('node-microphone'); // Microphone A
const record = require('node-mic-record') // Microphone B

The problem occurs with both: when I start recording and say "a, b, c, d, e, ..." I get "c, d, e..."

Here's how I go about it (in a nutshell!):

const mic = new Microphone({ useDataEmitter: true, rate: 22050, channels: 1 }); // MIC A
const writer = new wav.Writer({ sampleRate: 44100, channels: 1, bitDepth: 16 });
const audioFilePath = path.resolve(__dirname, 'recorded_audio.wav');
const fileStream = fs.createWriteStream(audioFilePath, { encoding: 'binary' });
mic.startRecording();
writer.pipe(fileStream);
mic.on('data', (chunk) => {
    console.log('Audio data received.');
    writer.write(chunk);
});
writer.end();
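A side note on this snippet: the microphone is opened at 22050 Hz while the WAV header is written with sampleRate 44100. If the microphone really delivers 22050 Hz mono 16-bit audio, a player will trust the header and play the file at twice the speed. The arithmetic, as a small sketch (the helper name playbackSpeedFactor is illustrative):

```javascript
// Factor by which playback is sped up (> 1) or slowed down (< 1) when the
// sample rate declared in the file header differs from the actual capture rate.
function playbackSpeedFactor(captureRate, headerRate) {
  return headerRate / captureRate;
}

console.log(playbackSpeedFactor(22050, 44100)); // 2
console.log(playbackSpeedFactor(16000, 16000)); // 1
```

Also, as posted, writer.end() runs right after the handler is registered, before any chunk can arrive. In a full program it would normally belong wherever the recording is stopped.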

When I record directly from the browser using MediaRecorder this does not happen! I want to record in a node environment though! Help would be much appreciated!

The chunks that come in all seem to be processed fine; the problem is that the chunks, along with the "Audio data received." logs, come in only after a couple of seconds, not right away! I think there must be an easy solution to this! But what do I know.. oO LMK!! TYVM!!!

JonasDeitmersATACAMA Feb 22 '24 13:02
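To narrow down where the delay comes from, it may help to measure the time between starting the recording and the first 'data' event. A sketch that works with any EventEmitter-style source; watchFirstChunk is a hypothetical helper, and the commented usage assumes the mic object from the snippet above:

```javascript
const { performance } = require('node:perf_hooks');

// Hypothetical helper: record how many milliseconds pass between this call
// and the first 'data' event on the emitter. The clock is injectable for testing.
function watchFirstChunk(emitter, now = () => performance.now()) {
  const startedAt = now();
  const state = { firstChunkMs: null };
  emitter.once('data', () => {
    state.firstChunkMs = now() - startedAt;
    console.log(`first chunk after ${state.firstChunkMs.toFixed(0)} ms`);
  });
  return state;
}

// Assumed usage (not executed here):
//   const state = watchFirstChunk(mic);
//   mic.startRecording();
```

If the first chunk consistently arrives seconds late with both microphone packages, the delay is more likely in the underlying recording program or audio device than in the JavaScript code.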