nodejs-whisper
Model not Found
I'm on a Mac and trying to use this in a Next.js project.
Code:

```js
const filePath = path.join(tempDir, 'out.wav')
console.log(filePath)

// generate the transcript with whisper
const transcript = await nodewhisper(filePath, {
  modelName: 'base.en', // downloaded model name
  autoDownloadModelName: 'base.en', // (optional) autodownload a model if not present
  whisperOptions: {
    outputInText: true, // get output result in txt file
    outputInVtt: false, // get output result in vtt file
    outputInSrt: false, // get output result in srt file
    outputInCsv: false, // get output result in csv file
    translateToEnglish: false, // translate from source language to English
    wordTimestamps: false, // word-level timestamps
    timestamps_length: 20, // amount of dialogue per timestamp pair
    splitOnWord: true, // split on word rather than on token
  },
})
```
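From the paths in the errors, it looks like webpack bundles nodejs-whisper into `.next/server`, so the package's relative paths (`cpp/whisper.cpp/models`, the download script) no longer resolve. A possible workaround I'm considering (untested; this is the Next.js 13/14 experimental option for opting a package out of the server bundle — newer Next.js versions renamed it to a top-level `serverExternalPackages`) would be something like:

```javascript
// next.config.js — hypothetical workaround: keep nodejs-whisper out of the
// webpack server bundle so it resolves its files from node_modules rather
// than from .next/server.
/** @type {import('next').NextConfig} */
const nextConfig = {
  experimental: {
    serverComponentsExternalPackages: ['nodejs-whisper'],
  },
}

module.exports = nextConfig
```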
Error:

```
cd: no such file or directory: /Users/dylanb/Documents/Github/StudyMan/studyapp/.next/server/cpp/whisper.cpp/models
[Nodejs-whisper] Autodownload Model: base
chmod: File not found: /Users/dylanb/Documents/Github/StudyMan/download-ggml-model.sh
node:internal/modules/cjs/loader:1078
  throw err;
  ^
Error: Cannot find module '/Users/dylanb/Documents/Github/StudyMan/studyapp/.next/server/vendor-chunks/exec-child.js'
    at Module._resolveFilename (node:internal/modules/cjs/loader:1075:15)
    at Module._load (node:internal/modules/cjs/loader:920:27)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12)
    at node:internal/main/run_main_module:23:47 {
  code: 'MODULE_NOT_FOUND',
  requireStack: []
}
Node.js v18.16.0
[Nodejs-whisper] Attempting to compile model...
node:internal/modules/cjs/loader:1078
  throw err;
  ^
Error: Cannot find module '/Users/dylanb/Documents/Github/StudyMan/studyapp/.next/server/vendor-chunks/exec-child.js'
    at Module._resolveFilename (node:internal/modules/cjs/loader:1075:15)
    at Module._load (node:internal/modules/cjs/loader:920:27)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12)
    at node:internal/main/run_main_module:23:47 {
  code: 'MODULE_NOT_FOUND',
  requireStack: []
}
Node.js v18.16.0
[Nodejs-whisper] Transcribing file: /var/folders/wm/6p8gkm_x6b17rlvy4178hql00000gn/T/out.wav
[Nodejs-whisper] Error: Models do not exist. Please Select a downloaded model.
Error: [Nodejs-whisper] Error: Model not found
    at constructCommand (webpack-internal:///(rsc)/./node_modules/nodejs-whisper/dist/WhisperHelper.js:33:15)
    at eval (webpack-internal:///(rsc)/./node_modules/nodejs-whisper/dist/index.js:53:62)
    at Generator.next (<anonymous>)
    at fulfilled (webpack-internal:///(rsc)/./node_modules/nodejs-whisper/dist/index.js:11:32)
```
Here is the log from running the model download command:
```
(base) Dylans-MacBook-Air:studyapp dylanb$ npx nodejs-whisper download
[Nodejs-whisper] Models do not exist. Please Select a model to download.
| Model     | Disk   | RAM     |
|-----------|--------|---------|
| tiny      | 75 MB  | ~390 MB |
| tiny.en   | 75 MB  | ~390 MB |
| base      | 142 MB | ~500 MB |
| base.en   | 142 MB | ~500 MB |
| small     | 466 MB | ~1.0 GB |
| small.en  | 466 MB | ~1.0 GB |
| medium    | 1.5 GB | ~2.6 GB |
| medium.en | 1.5 GB | ~2.6 GB |
| large-v1  | 2.9 GB | ~4.7 GB |
| large     | 2.9 GB | ~4.7 GB |
[Nodejs-whisper] Enter model name (e.g. 'tiny.en') or 'cancel' to exit
(ENTER for tiny.en): base.en
Downloading ggml model base.en from 'https://huggingface.co/ggerganov/whisper.cpp' ...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1204  100  1204    0     0   3111      0 --:--:-- --:--:-- --:--:--  3119
100  141M  100  141M    0     0  8405k      0  0:00:17  0:00:17 --:--:-- 9733k
Done! Model 'base.en' saved in 'models/ggml-base.en.bin'
You can now use it like this:

  $ ./main -m models/ggml-base.en.bin -f samples/jfk.wav

[Nodejs-whisper] Attempting to compile model...
sysctl: unknown oid 'hw.optional.arm64'
I whisper.cpp build info:
I UNAME_S:  Darwin
I UNAME_P:  i386
I UNAME_M:  x86_64
I CFLAGS:   -I. -O3 -DNDEBUG -std=c11 -fPIC -D_DARWIN_C_SOURCE -pthread -mf16c -mfma -mavx -mavx2 -DGGML_USE_ACCELERATE
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_DARWIN_C_SOURCE -pthread
I LDFLAGS:  -framework Accelerate
I CC:       Apple clang version 12.0.5 (clang-1205.0.22.11)
I CXX:      Apple clang version 12.0.5 (clang-1205.0.22.11)
cc  -I. -O3 -DNDEBUG -std=c11 -fPIC -D_DARWIN_C_SOURCE -pthread -mf16c -mfma -mavx -mavx2 -DGGML_USE_ACCELERATE -c ggml.c -o ggml.o
c++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_DARWIN_C_SOURCE -pthread -c whisper.cpp -o whisper.o
c++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_DARWIN_C_SOURCE -pthread examples/main/main.cpp examples/common.cpp examples/common-ggml.cpp ggml.o whisper.o -o main -framework Accelerate
./main -h

usage: ./main [options] file0.wav file1.wav ...

options:
  -h,        --help              [default] show this help message and exit
  -t N,      --threads N         [4      ] number of threads to use during computation
  -p N,      --processors N      [1      ] number of processors to use during computation
  -ot N,     --offset-t N        [0      ] time offset in milliseconds
  -on N,     --offset-n N        [0      ] segment index offset
  -d  N,     --duration N        [0      ] duration of audio to process in milliseconds
  -mc N,     --max-context N     [-1     ] maximum number of text context tokens to store
  -ml N,     --max-len N         [0      ] maximum segment length in characters
  -sow,      --split-on-word     [false  ] split on word rather than on token
  -bo N,     --best-of N         [2      ] number of best candidates to keep
  -bs N,     --beam-size N       [-1     ] beam size for beam search
  -wt N,     --word-thold N      [0.01   ] word timestamp probability threshold
  -et N,     --entropy-thold N   [2.40   ] entropy threshold for decoder fail
  -lpt N,    --logprob-thold N   [-1.00  ] log probability threshold for decoder fail
  -su,       --speed-up          [false  ] speed up audio by x2 (reduced accuracy)
  -tr,       --translate         [false  ] translate from source language to english
  -di,       --diarize           [false  ] stereo audio diarization
  -tdrz,     --tinydiarize       [false  ] enable tinydiarize (requires a tdrz model)
  -nf,       --no-fallback       [false  ] do not use temperature fallback while decoding
  -otxt,     --output-txt        [false  ] output result in a text file
  -ovtt,     --output-vtt        [false  ] output result in a vtt file
  -osrt,     --output-srt        [false  ] output result in a srt file
  -olrc,     --output-lrc        [false  ] output result in a lrc file
  -owts,     --output-words      [false  ] output script for generating karaoke video
  -fp,       --font-path         [/System/Library/Fonts/Supplemental/Courier New Bold.ttf] path to a monospace font for karaoke video
  -ocsv,     --output-csv        [false  ] output result in a CSV file
  -oj,       --output-json       [false  ] output result in a JSON file
  -of FNAME, --output-file FNAME [       ] output file path (without file extension)
  -ps,       --print-special     [false  ] print special tokens
  -pc,       --print-colors      [false  ] print colors
  -pp,       --print-progress    [false  ] print progress
  -nt,       --no-timestamps     [false  ] do not print timestamps
  -l LANG,   --language LANG     [en     ] spoken language ('auto' for auto-detect)
  -dl,       --detect-language   [false  ] exit after automatically detecting language
  --prompt PROMPT                [       ] initial prompt
  -m FNAME,  --model FNAME       [models/ggml-base.en.bin] model path
  -f FNAME,  --file FNAME        [       ] input WAV file path
  -oved D,   --ov-e-device DNAME [CPU    ] the OpenVINO device used for encode inference

c++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_DARWIN_C_SOURCE -pthread examples/bench/bench.cpp ggml.o whisper.o -o bench -framework Accelerate
c++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_DARWIN_C_SOURCE -pthread examples/quantize/quantize.cpp examples/common.cpp examples/common-ggml.cpp ggml.o whisper.o -o quantize -framework Accelerate
```