whisper.cpp
Limit on audio file duration
Hi - awesome work! Is there a limit on the size or duration of an audio file that can be processed using this library?
There's the --duration (or -d) switch. It takes milliseconds rather than bytes, though, so if you want to limit by file size you will need to do some calculation first (with WAV it shouldn't be too hard). The other option is to use an external tool to clip the file to the length you want.
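For the calculation mentioned above, here is a minimal sketch in Python. It assumes uncompressed PCM WAV input; the defaults (16 kHz, mono, 16-bit) are an assumption based on the sample count and duration shown in the log further down this thread. The function names are made up for illustration, and the byte count passed to bytes_to_ms is the audio data only, not including the WAV header.

```python
import wave


def wav_duration_ms(path: str) -> int:
    """Read the frame count and sample rate from a PCM WAV header
    and return the duration in milliseconds (rounded down)."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
    return frames * 1000 // rate


def bytes_to_ms(data_bytes: int,
                sample_rate: int = 16000,
                channels: int = 1,
                sample_width: int = 2) -> int:
    """Convert a number of PCM audio data bytes (header excluded)
    to milliseconds. Defaults assume 16 kHz, mono, 16-bit samples."""
    bytes_per_second = sample_rate * channels * sample_width
    return data_bytes * 1000 // bytes_per_second


if __name__ == "__main__":
    # 32000 bytes of 16 kHz mono 16-bit audio is one second.
    print(bytes_to_ms(32000))
```

The resulting number is what you would pass to -d. If you only have the file on disk, wav_duration_ms reads the real parameters from the header, which is safer than assuming the defaults.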
Hi!
I'm also wondering this - I keep running into segfaults.
❯ ./main -m models/ggml-medium.en.bin -f /users/Jonathan/Desktop/biology_1.wav -pc -d 110000 --offset-t 0 -t 8 -of /users/Jonathan/Desktop/biology_1-1 -otxt
whisper_init_from_file_no_state: loading model from 'models/ggml-medium.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51864
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 1024
whisper_model_load: n_audio_head = 16
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 1024
whisper_model_load: n_text_head = 16
whisper_model_load: n_text_layer = 24
whisper_model_load: n_mels = 80
whisper_model_load: f16 = 1
whisper_model_load: type = 4
whisper_model_load: mem required = 1720.00 MB (+ 43.00 MB per decoder)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: model ctx = 1462.35 MB
whisper_model_load: model size = 1462.12 MB
whisper_init_state: kv self size = 42.00 MB
whisper_init_state: kv cross size = 140.62 MB
system_info: n_threads = 8 / 10 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 |
main: processing '/users/Jonathan/Desktop/biology_1.wav' (65200124 samples, 4075.0 sec), 8 threads, 1 processors, lang = en, task = transcribe, timestamps = 1 ...
[00:00:00.000 --> 00:00:06.000] measuring all of these methylation marks that sit on top of their DNA.
[00:00:06.000 --> 00:00:14.000] That's cool, you can literally go out, send up the samples, get your biological information through that method.
[00:00:14.000 --> 00:00:19.000] Cool. So I digress. And now this study of internal structure.
[00:00:19.000 --> 00:00:29.000] Physiology, internal structure, pathology, and embryology become it, and virology taxonomy.
[00:00:29.000 --> 00:00:34.000] Study of classification and its part of biopharmaceuticals.
[00:00:34.000 --> 00:00:38.000] Panatology, study of ancient life.
[00:00:38.000 --> 00:00:42.000] Long-term biology, study of biological molecules.
[00:00:42.000 --> 00:00:45.000] Physiology, study of tissues.
[00:00:45.000 --> 00:00:47.000] And there are many more.
[00:00:47.000 --> 00:00:54.000] So when you say you want to study biology and beauty, it doesn't mean a whole lot, right?
[00:00:54.000 --> 00:00:59.000] Which one of these fields do you want to study?
[00:00:59.000 --> 00:01:05.000] So, what makes something alive?
[00:01:05.000 --> 00:01:12.000] What makes this lion alive and what makes this monkey not alive?
[00:01:12.000 --> 00:01:14.000] The cell structure.
[00:01:14.000 --> 00:01:17.000] Ooh, a heart. Does something need a heart for a bit of life?
[00:01:17.000 --> 00:01:19.000] No.
ggml_new_tensor_impl: not enough space in the scratch memory
[1] 12274 segmentation fault ./main -m models/ggml-medium.en.bin -f /users/Jonathan/Desktop/biology_1.wav
This line stands out: ggml_new_tensor_impl: not enough space in the scratch memory
I'm looking through the code and wondering if there's a hard limit on how much we can process at a time?
Just noting that this run is limited to 110 sec (-d 110000):
./main -m models/ggml-medium.en.bin -f /users/Jonathan/Desktop/biology_1.wav -pc -d 110000 --offset-t 0 -t 8 -of /users/Jonathan/Desktop/biology_1-1 -otxt
I started receiving the same error this week. I was testing some configurations before moving the options to the Node.js addon I'm using, but now I get the same error when I run plain whisper.cpp:
ggml_new_tensor_impl: not enough space in the scratch memory
zsh: segmentation fault ./main -m models/ggml-medium.bin -p 1 -f samples/jfk2.wav
The interesting part is that I can process the exact same audio with the Node.js addon on my server, but running the whisper.cpp main binary from the shell gives the error.
This commit fixed the issue: https://github.com/ggerganov/whisper.cpp/commit/0be9cd34979d9c989330eda80dfe9e7086b694d4