whisper.cpp Dockerfile for Whisper.cpp with PyTorch, CUDA, and GPU Support (whisper-cli / whisper-stream)

Open naren200 opened this issue 9 months ago • 0 comments

Description:

Hi Community,

I'm trying to set up a Docker environment for using whisper.cpp, but I've encountered several issues with existing Dockerfiles online that don't seem to work out of the box. Specifically, I'm looking for a Dockerfile that meets the following requirements:

PyTorch with CUDA support: I need to have PyTorch installed with proper CUDA support for GPU acceleration.
NVIDIA drivers and nvidia-smi: The container should have access to the NVIDIA driver and nvidia-smi to check and manage GPU status.
NVCC toolkit: The NVIDIA CUDA toolkit (nvcc) should be installed to compile CUDA code.
Most importantly, GPU access through whisper-cli or whisper-stream: The container should expose the GPU to tools like whisper-cli or whisper-stream for inference tasks.
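
For anyone trying to reproduce this, the first three requirements can be checked from a shell inside the container before involving whisper.cpp at all. The sketch below is only a rough diagnostic: the helper name check_tool is made up for illustration, and it assumes python3 is how PyTorch would be reached in the image.

```shell
#!/bin/sh
# Hypothetical helper: report whether a command is on PATH in the container.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Run every check even if an earlier one fails.
check_tool nvidia-smi || true
check_tool nvcc || true

# PyTorch check: assumes python3 is the interpreter that has torch installed.
if command -v python3 >/dev/null 2>&1; then
    python3 -c 'import torch; print("torch cuda:", torch.cuda.is_available())' 2>/dev/null \
        || echo "missing: torch"
else
    echo "missing: python3"
fi
```

A "missing" line only says the tool is not on PATH in that container; it does not by itself explain why CUDA fails to initialize.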

Predominant outcome

One of the issues I tried: https://github.com/ggerganov/whisper.cpp/issues/2032#issuecomment-2078881661. Many users have endorsed that solution, but it doesn't seem to give whisper.cpp GPU support. Command used to run the Docker image:

docker run \
      --rm \
      --gpus all -e LD_LIBRARY_PATH="" \
      -v ./whisper_models:/app/models \
      -v ./wav_dir:/app/testdata \
      ghcr.io/ggerganov/whisper.cpp:main-cuda \
      "/app/main --file /app/testdata/harvard.wav --language en --output-txt true --model /app/models/ggml-large-v3-turbo.bin --output-file /app/testdata/harvard"

Outcome:

whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0

ggml_cuda_init: failed to initialize CUDA: unknown error
whisper_model_load:      CPU total size =  1623.92 MB
whisper_model_load: model size    = 1623.92 MB
whisper_backend_init_gpu: using CUDA backend
ggml_backend_cuda_init: invalid device 0
whisper_backend_init_gpu: ggml_backend_cuda_init() failed
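
When comparing several images or Dockerfiles, it may help to detect this failure automatically instead of reading each log by hand. A minimal sketch, assuming the failure always produces one of the three messages shown above (the function name cuda_init_failed and the file run.log are hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if the log on stdin contains one of the
# CUDA failure messages seen in the output above, non-zero otherwise.
cuda_init_failed() {
    grep -E 'failed to initialize CUDA|invalid device|ggml_backend_cuda_init\(\) failed' >/dev/null
}

# Usage (run.log is a hypothetical file holding the container's output):
#   cuda_init_failed < run.log && echo "CUDA init failed; GPU not in use"
```

The patterns are taken verbatim from the log in this issue; other whisper.cpp versions may word the messages differently.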

Please let me know if anyone has a working Dockerfile that satisfies all of the above requirements, or if anyone can point me in the right direction to resolve this. Let's settle on a Dockerfile that works on any system.

Kind request: Don't close this issue without a proper solution, because there are several Dockerfiles out there that don't work out of the box. I have been trying this for several days. No solution yet...

Jan 20 '25 14:01 naren200
