nodejs-whisper
[Bug?] Audio being transcribed forever despite following standard
Hello, this might or might not be a bug, but I wanted to mention it because this package could work well in combination with node-mic.
import { Logger } from '@src/logger';
import path from 'path';
import { nodewhisper } from 'nodejs-whisper'
import fs from 'fs';
import NodeMic from 'node-mic';

const voicesPath = path.resolve(process.cwd(), 'src/voice');

export const initVoice = async (): Promise<void> => {
    const mic = new NodeMic({
        rate: 16000,
        encoding: 'signed-integer',
        bitwidth: 16,
        endian: 'little',
        channels: 2,
        threshold: 20,
        fileType: 'wav',
        debug: true,
    });

    const micInputStream = mic.getAudioStream();
    const outputFileStream = fs.createWriteStream(`${voicesPath}/output.wav`);
    micInputStream.pipe(outputFileStream);

    micInputStream.on('data', (data) => {
        // Do something with the data.
    });
    micInputStream.on('error', (err) => {
        console.log(`Error: ${err.message}`);
    });
    micInputStream.on('started', () => {
        console.log('Started');
    });
    micInputStream.on('stopped', () => {
        console.log('Stopped');
    });
    micInputStream.on('paused', () => {
        console.log('Paused');
    });
    micInputStream.on('unpaused', () => {
        console.log('Unpaused');
    });
    micInputStream.on('silence', async () => {
        console.log('Silence');
        mic.stop();
    });
    micInputStream.on('exit', async (code) => {
        console.log(`Exited with code: ${code}`);
        await transcribeWAV();
    });

    mic.start();
    Logger.DEBUG(`Voices path: ${voicesPath}`);
};

const transcribeWAV = async () => {
    try {
        Logger.DEBUG('Transcribing voice...');
        const transcript = await nodewhisper(`${voicesPath}/output.wav`, {
            modelName: "tiny",
            verbose: true
        });
        Logger.INFO(`Transcript: ${transcript}`); // output: [ {start,end,speech} ]
    } catch (error: any) {
        Logger.ERROR(`Error occurred while transcribing voice: ${error.message}`);
    }
};
This is my code.
Output:
Microphone stopped
Found silence block: 21 of 20
Recording has finished with code = 1
Exited with code: 1
[DEBUG] Transcribing voice...
[Nodejs-whisper] Checking file existence: /home/wolf/develop/nodejs/okuuai/src/voice/output.wav
[Nodejs-whisper] Converting file to WAV format: /home/wolf/develop/nodejs/okuuai/src/voice/output.wav
[Nodejs-whisper] Checking if the file is a valid WAV: /home/wolf/develop/nodejs/okuuai/src/voice/output.wav
[Nodejs-whisper] File is a valid WAV file.
[Nodejs-whisper] Constructing command for file: /home/wolf/develop/nodejs/okuuai/src/voice/output.wav
[Nodejs-whisper] Executing command: ./main -l auto -m ./models/ggml-tiny.bin -f /home/wolf/develop/nodejs/okuuai/src/voice/output.wav
output.wav seems to be at 16 kHz, using the same codec as a test recording I made with Audacity. The Audacity recording is transcribed properly, whereas the node-mic one is not, even though both report exactly the same stream info.
Am I missing something?
Update
After leaving it to transcribe for a while, this eventually came out:
[09:15:25.440 --> 09:15:35.440] [BLANK_AUDIO]
[09:15:35.440 --> 09:15:45.440] [BLANK_AUDIO]
[09:15:45.440 --> 09:15:55.440] [BLANK_AUDIO]
[09:15:55.440 --> 09:16:05.440] [BLANK_AUDIO]
[09:16:05.440 --> 09:16:15.440] [BLANK_AUDIO]
[09:16:15.440 --> 09:16:25.440] [BLANK_AUDIO]
[09:16:25.440 --> 09:16:35.440] [BLANK_AUDIO]
[09:16:35.440 --> 09:16:45.440] [BLANK_AUDIO]
[09:16:45.440 --> 09:16:55.440] [BLANK_AUDIO]
[09:16:55.440 --> 09:17:05.440] [BLANK_AUDIO]
[09:17:05.440 --> 09:17:15.440] [BLANK_AUDIO]
[09:17:15.440 --> 09:17:25.440] [BLANK_AUDIO]
[09:17:25.440 --> 09:17:35.440] [BLANK_AUDIO]
[09:17:35.440 --> 09:17:45.440] [BLANK_AUDIO]
[09:17:45.440 --> 09:17:55.440] [BLANK_AUDIO]
[09:17:55.440 --> 09:18:05.440] [BLANK_AUDIO]
[09:18:05.440 --> 09:18:15.440] [BLANK_AUDIO]
[09:18:15.440 --> 09:18:25.440] [BLANK_AUDIO]
[09:18:25.440 --> 09:18:35.440] [BLANK_AUDIO]
[09:18:35.440 --> 09:18:45.440] [BLANK_AUDIO]
[09:18:45.440 --> 09:18:55.440] [BLANK_AUDIO]
[09:18:55.440 --> 09:19:05.440] [BLANK_AUDIO]
[09:19:05.440 --> 09:19:15.440] [BLANK_AUDIO]
(Please take into consideration that this is a 0:06, i.e. six-second, recording.)
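One check that might help narrow this down, offered as an assumption rather than a confirmed diagnosis: comparing the duration declared in the WAV header with the duration the file can actually hold. Streamed recordings sometimes carry a placeholder size in the "data" chunk, and a placeholder near 2^31 bytes at 16 kHz, 16-bit, 2 channels would work out to roughly 9.3 hours, which is in the neighborhood of the timestamps above. The sketch below is a hypothetical helper (`inspectWav` and `WavInfo` are illustrative names, not part of nodejs-whisper or node-mic) that reads both values from the raw bytes.

```typescript
// Hypothetical helper, not part of nodejs-whisper or node-mic.
// Compares the duration a WAV header declares with what the bytes can hold.
interface WavInfo {
  sampleRate: number;
  channels: number;
  bitsPerSample: number;
  declaredSeconds: number; // from the size field of the "data" chunk
  actualSeconds: number;   // declared size capped by the bytes actually present
}

function inspectWav(bytes: Uint8Array): WavInfo {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // Read a 4-character ASCII chunk id at the given offset.
  const tag = (offset: number): string =>
    String.fromCharCode(bytes[offset], bytes[offset + 1], bytes[offset + 2], bytes[offset + 3]);

  if (bytes.byteLength < 12 || tag(0) !== 'RIFF' || tag(8) !== 'WAVE') {
    throw new Error('Not a RIFF/WAVE file');
  }

  let sampleRate = 0;
  let channels = 0;
  let bitsPerSample = 0;
  let offset = 12;

  // Walk the chunk list: 4-byte id, 4-byte little-endian size, then payload.
  while (offset + 8 <= bytes.byteLength) {
    const id = tag(offset);
    const size = view.getUint32(offset + 4, true);
    const payload = offset + 8;

    if (id === 'fmt ') {
      channels = view.getUint16(payload + 2, true);
      sampleRate = view.getUint32(payload + 4, true);
      bitsPerSample = view.getUint16(payload + 14, true);
    } else if (id === 'data') {
      if (sampleRate === 0) {
        throw new Error('"data" chunk appeared before "fmt "');
      }
      const bytesPerSecond = sampleRate * channels * (bitsPerSample / 8);
      const available = bytes.byteLength - payload;
      return {
        sampleRate,
        channels,
        bitsPerSample,
        declaredSeconds: size / bytesPerSecond,
        actualSeconds: Math.min(size, available) / bytesPerSecond,
      };
    }
    // Chunks are padded to an even number of bytes.
    offset = payload + size + (size % 2);
  }
  throw new Error('No "data" chunk found');
}
```

In Node, a `Buffer` is a `Uint8Array`, so passing the result of `fs.readFileSync` on `output.wav` straight into `inspectWav` should work. If `declaredSeconds` comes out far larger than `actualSeconds` for the node-mic file but not for the Audacity one, the header length would be a plausible place to look next.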