whisper.cpp Container image `main-a1867e0dad0b21b35afa43fc815dae60c9a139d6` crashes with code 132

The latest container image, ghcr.io/ggml-org/whisper.cpp:main-a1867e0dad0b21b35afa43fc815dae60c9a139d6, crashes immediately after startup and exits with code 132 (128 + 4, which usually means the process was killed by SIGILL, an illegal instruction):

nadia@Nadiarch 󱃾 curry-admin@Curry
14:59:18 ~ $> docker run -ti --rm ghcr.io/ggml-org/whisper.cpp:main-a1867e0dad0b21b35afa43fc815dae60c9a139d6 /app/build/bin/whisper-server
whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-base.en.bin'
whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 1
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
whisper_init_with_params_no_state: devices    = 1
whisper_init_with_params_no_state: backends   = 1
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head  = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 512
whisper_model_load: n_text_head   = 8
whisper_model_load: n_text_layer  = 6
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 2 (base)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: n_langs       = 99

nadia@Nadiarch 󱃾 curry-admin@Curry
14:59:22 ~ $> echo $?
132
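
For reference, a shell reports death by signal as 128 plus the signal number, so 132 maps to signal 4. A minimal sketch for decoding such a status, assuming a POSIX shell; the helper name `signal_from_exit_code` is made up for illustration:

```shell
# Hypothetical helper: map a shell exit status above 128 to a signal name.
signal_from_exit_code() {
    code=$1
    if [ "$code" -gt 128 ]; then
        # kill -l accepts an exit status and prints the matching signal name
        kill -l "$code"
    else
        echo "not a signal exit: $code"
    fi
}

signal_from_exit_code 132
```

On Linux this prints ILL, which matches the "illegal instruction" reading of the exit code.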

This did not happen with the previous tag.

I have reproduced this on two computers, one with an Intel(R) Core(TM) i5-6500T CPU @ 2.50GHz and another with an Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz.
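
An illegal-instruction crash often means the binary uses CPU instructions that the host does not support, so comparing SIMD flags across affected machines may help narrow it down. A rough sketch, assuming Linux and a `/proc/cpuinfo`-style file; the helper name and the flag list are illustrative guesses, not a confirmed cause of this crash:

```shell
# Hypothetical helper: list which of a few SIMD-related flags a cpuinfo file reports.
# The flag selection is an assumption; the instruction that actually traps is unknown.
cpu_simd_flags() {
    cpuinfo=${1:-/proc/cpuinfo}
    # Look only at the first "flags" line; all cores normally report the same set.
    awk '/^flags/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^(avx|avx2|avx512f|avx_vnni|fma|f16c|bmi2)$/) print $i
        exit
    }' "$cpuinfo" | sort -u
}
```

Running `cpu_simd_flags` with no argument reads `/proc/cpuinfo` on the current host; a flag missing from the output on every affected machine would be a candidate worth checking against the image's build options.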

Nov 10 '25 14:11 nadiamoe

I can confirm that this also happens with an Intel® Core™ Ultra 7 Processor 165H when running with Docker. The output is exactly the same, and the crash happens right after the whisper_model_load: n_langs = 99 line, with exit code 132 and the message Illegal instruction (core dumped). The arguments passed to the server don't seem to affect the result; it crashes regardless of them. Downgrading the image to c62adfbd1ecdaea9e295c72d672992514a2d887c (released two weeks ago as of writing) fixes the issue. To use that commit, run the image ghcr.io/ggml-org/whisper.cpp:main-c62adfbd1ecdaea9e295c72d672992514a2d887c
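
As a workaround sketch, the known-good tag can be pinned in a small wrapper that reuses the `docker run` invocation from the original report. The variable and function names here are illustrative only:

```shell
# Known-good commit reported in this thread; names below are illustrative.
GOOD_COMMIT=c62adfbd1ecdaea9e295c72d672992514a2d887c
IMAGE="ghcr.io/ggml-org/whisper.cpp:main-${GOOD_COMMIT}"

# Start whisper-server from the pinned image, forwarding any extra arguments.
run_pinned_server() {
    docker run -ti --rm "$IMAGE" /app/build/bin/whisper-server "$@"
}

echo "Pinned image: $IMAGE"
```

Calling `run_pinned_server` then starts the server from the pinned image instead of the latest tag.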

Nov 11 '25 17:11 Yazan-Sharaya

Also experiencing this on an Intel Core i3-10100 with a GeForce RTX 3090 24 GB:

NVIDIA Driver 590.48.01
NVIDIA CUDA 13.1
NVIDIA Container Toolkit CLI 1.18.1
Docker 29.1.3
Debian 13

services:
  whisper-cpp:
    container_name: whisper-cpp
    image: ghcr.io/ggml-org/whisper.cpp:main-cuda
    entrypoint: /app/build/bin/whisper-server
    command:
      - --model
      - /models/ggml-large-v3-turbo.bin
      - --host
      - "0.0.0.0"
      - --port
      - "9477"
      - --language
      - en
      - --beam-size
      - "1"
      - --print-realtime
      - --print-progress
    environment:
      LD_LIBRARY_PATH:
    ports:
      - 9477:9477
    volumes:
      - /home/ethan/whisper-cpp:/models
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities:
                - gpu
    restart: unless-stopped

Dec 30 '25 13:12 EthanC