Container image `main-a1867e0dad0b21b35afa43fc815dae60c9a139d6` crashes with code 132
The latest container image, ghcr.io/ggml-org/whisper.cpp:main-a1867e0dad0b21b35afa43fc815dae60c9a139d6, crashes immediately after startup and exits with code 132 (128 + 4, meaning the process was killed by SIGILL, illegal instruction):
nadia@Nadiarch curry-admin@Curry
14:59:18 ~ $> docker run -ti --rm ghcr.io/ggml-org/whisper.cpp:main-a1867e0dad0b21b35afa43fc815dae60c9a139d6 /app/build/bin/whisper-server
whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-base.en.bin'
whisper_init_with_params_no_state: use gpu = 1
whisper_init_with_params_no_state: flash attn = 1
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw = 0
whisper_init_with_params_no_state: devices = 1
whisper_init_with_params_no_state: backends = 1
whisper_model_load: loading model
whisper_model_load: n_vocab = 51864
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 512
whisper_model_load: n_text_head = 8
whisper_model_load: n_text_layer = 6
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 2 (base)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: n_langs = 99
nadia@Nadiarch curry-admin@Curry
14:59:22 ~ $> echo $?
132
This did not happen with the previous tag.
I have reproduced this on two computers, one with an Intel(R) Core(TM) i5-6500T CPU @ 2.50GHz and another with an Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz.
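For anyone comparing results, here is a minimal sketch of how the exit status maps to a signal. Shells report 128 plus the signal number when a process is killed by a signal. The function name `decode_exit` is made up for illustration and is not part of whisper.cpp or Docker:

```shell
# Hypothetical helper: turn a shell exit status into a readable description.
decode_exit() {
  status=$1
  if [ "$status" -gt 128 ]; then
    # Statuses above 128 mean "terminated by signal (status - 128)".
    sig=$((status - 128))
    echo "signal $sig ($(kill -l "$sig"))"
  else
    echo "exit status $status"
  fi
}

# On Linux, signal 4 is SIGILL (illegal instruction).
decode_exit 132
```

Running it right after the failing `docker run` with `decode_exit $?` should give the same reading as the `echo $?` above.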
I can confirm that this happens with Intel® Core™ Ultra 7 Processor 165H too when running with docker.
The output is exactly the same, and the error happens right after the `whisper_model_load: n_langs = 99` line, with exit code 132: Illegal instruction (core dumped).
The arguments passed to the server don't seem to affect the result; it crashes regardless of the arguments.
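One common cause of SIGILL is a binary built for CPU instruction-set extensions the host does not have. That is only an assumption here, not a confirmed diagnosis for this image. A rough sketch for listing which extensions the host CPU reports on Linux follows. The helper name `has_flag` and the list of flags are illustrative choices:

```shell
# Hypothetical helper: succeed if flag $1 appears in the space-separated list $2.
has_flag() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Read the first "flags" line from /proc/cpuinfo (Linux only; empty elsewhere).
flags=$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null | cut -d: -f2)

# The set of extensions checked here is a guess at what a build might require.
for f in avx avx2 fma f16c bmi2 avx512f; do
  if has_flag "$f" "$flags"; then
    echo "$f: yes"
  else
    echo "$f: no"
  fi
done
```

Posting this output next to the CPU model would make it easier to see whether the affected machines share a missing extension.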
Downgrading the image to c62adfbd1ecdaea9e295c72d672992514a2d887c (released 2 weeks ago as of this writing) works around the issue.
To use that commit, pull the image with this tag: ghcr.io/ggml-org/whisper.cpp:main-c62adfbd1ecdaea9e295c72d672992514a2d887c
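A small sketch of pinning to that tag, reusing the same invocation as in the original report. The variable names and the `run_pinned` wrapper are made up for illustration:

```shell
# Commit of the last image reported as working in this thread.
GOOD_COMMIT=c62adfbd1ecdaea9e295c72d672992514a2d887c
IMAGE="ghcr.io/ggml-org/whisper.cpp:main-${GOOD_COMMIT}"

# Hypothetical wrapper: start the server from the pinned image.
run_pinned() {
  docker run -ti --rm "$IMAGE" /app/build/bin/whisper-server "$@"
}

echo "$IMAGE"
```

Calling `run_pinned` with no arguments matches the command from the report, only with the older tag.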
Also experiencing this on an Intel Core i3-10100 with a GeForce RTX 3090 24 GB:
- NVIDIA Driver 590.48.01
- NVIDIA CUDA 13.1
- NVIDIA Container Toolkit CLI 1.18.1
- Docker 29.1.3
- Debian 13
services:
  whisper-cpp:
    container_name: whisper-cpp
    image: ghcr.io/ggml-org/whisper.cpp:main-cuda
    entrypoint: /app/build/bin/whisper-server
    command:
      - --model
      - /models/ggml-large-v3-turbo.bin
      - --host
      - "0.0.0.0"
      - --port
      - "9477"
      - --language
      - en
      - --beam-size
      - "1"
      - --print-realtime
      - --print-progress
    environment:
      LD_LIBRARY_PATH:
    ports:
      - 9477:9477
    volumes:
      - /home/ethan/whisper-cpp:/models
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities:
                - gpu
    restart: unless-stopped