[BUG] CUDA failed with error CUDA driver version is insufficient for CUDA runtime version
Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
When running the gpu-tagged image, I get an error in the container that the CUDA driver version is insufficient for the CUDA runtime version. But unless I am mistaken, the image uses cudnn-cu12 and cublas-cu12, which should be compatible with the driver I am running (NVIDIA-SMI 550.120, Driver Version 550.120, CUDA Version 12.4). Any ideas?
Expected Behavior
No response
Steps To Reproduce
- run the image in the environment described below
Environment
- OS: Ubuntu 24.04.1 LTS
- How docker service was installed: docker repo
- GPU: Nvidia RTX A4000
- nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Jan__6_16:45:21_PST_2023
Cuda compilation tools, release 12.0, V12.0.140
Build cuda_12.0.r12.0/compiler.32267302_0
- nvidia-smi: NVIDIA-SMI 550.120 Driver Version: 550.120 CUDA Version: 12.4
CPU architecture
x86-64
Docker creation
services:
  faster-whisper:
    image: lscr.io/linuxserver/faster-whisper:gpu
    container_name: faster-whisper
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=Etc/UTC
      - WHISPER_MODEL=tiny-int8
      - WHISPER_BEAM=1 #optional
      - WHISPER_LANG=nl #optional
    volumes:
      - /mnt/local/container_data/whisper/data2:/config
    ports:
      - 10300:10300
    deploy:
      resources:
        reservations:
          generic_resources:
            - discrete_resource_spec:
                kind: "NVIDIA-GPU"
                value: 1
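As far as I know, generic_resources with a discrete_resource_spec is a Swarm-mode reservation, so with a plain docker compose up it may not actually attach the GPU to the container at all, which would explain the driver error. A minimal sketch for checking whether the driver is reachable from inside a container at all (the CUDA image tag is only an example, and this assumes the NVIDIA Container Toolkit is installed on the host):

services:
  cuda-check:
    image: nvidia/cuda:12.4.1-base-ubuntu22.04  # example tag, any CUDA 12.x base image should do
    command: nvidia-smi                          # print driver/CUDA info from inside the container
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

If nvidia-smi fails here as well, the problem is with the host or runtime setup rather than with this image.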
Container logs
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/lsiopy/lib/python3.12/site-packages/wyoming_faster_whisper/__main__.py", line 149, in <module>
run()
File "/lsiopy/lib/python3.12/site-packages/wyoming_faster_whisper/__main__.py", line 144, in run
asyncio.run(main())
File "/usr/lib/python3.12/asyncio/runners.py", line 194, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/lsiopy/lib/python3.12/site-packages/wyoming_faster_whisper/__main__.py", line 119, in main
whisper_model = faster_whisper.WhisperModel(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lsiopy/lib/python3.12/site-packages/faster_whisper/transcribe.py", line 133, in __init__
self.model = ctranslate2.models.Whisper(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA failed with error CUDA driver version is insufficient for CUDA runtime version
Thanks for opening your first issue here! Be sure to follow the relevant issue templates, or risk having this issue marked as invalid.
I have a similar issue. Any ideas?
same here.
Had the same error. In my case I started the Docker container with

services:
  faster-whisper:
    runtime: nvidia

but I was not exposing the NVIDIA GPU. Adding the following section fixed the issue for me:
services:
  faster-whisper:
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
On Unraid, on the Docker container page, I toggled Basic View to Advanced View, then next to 'Extra Parameters:' I added: --gpus=all
This fixed it for me, thanks!
I am getting the same issue with Unraid. I attempted the "--gpus=all" parameter, but still no luck.
This issue has been automatically marked as stale because it has not had recent activity. This might be due to missing feedback from OP. It will be closed if no further activity occurs. Thank you for your contributions.
Are you sure, @Phantom-Glass? I just fixed the "error" by adding --gpus all on Unraid 7.0.
After updating to 7.0, adding that into extra parameters did start working for me.
Hey, just tried with the extra parameter and the original error is gone.
But it still does not seem to work, because I found this in the logs:
INFO:faster_whisper:Processing audio with duration 00:02.820
INFO:wyoming_faster_whisper.handler:!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Any insight?
This issue has been automatically marked as stale because it has not had recent activity. This might be due to missing feedback from OP. It will be closed if no further activity occurs. Thank you for your contributions.
Attempting to deploy on a k8s server with exposed GPUs that is happily running models on GPUs for Frigate and Ollama, but getting this same error trying to run this container. Sadly not much to add other than that, but I want to scare off that bot trying to mark this as stale and then close it 😀
Edit: Alright, solved it, though I did a couple of things. Since I had 3 pods requesting 3 GPUs and I only have 3 GPUs, I had been trying not to request one at all to see if it would just run anyway (like nvidia-smi and Frigate both do if you don't set requests and limits). I set up time-sharing on the cluster so each GPU shows as 2, and then I added a request and limit for 1 GPU, so either or both of those may have been the solution for me.
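For anyone else on Kubernetes, a rough sketch of the GPU request/limit approach mentioned above (all names are illustrative, it assumes the NVIDIA device plugin is installed, and runtimeClassName only applies if your cluster defines one):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: faster-whisper            # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: faster-whisper
  template:
    metadata:
      labels:
        app: faster-whisper
    spec:
      runtimeClassName: nvidia    # only if your cluster defines an nvidia RuntimeClass
      containers:
        - name: faster-whisper
          image: lscr.io/linuxserver/faster-whisper:gpu
          ports:
            - containerPort: 10300
          resources:
            requests:
              nvidia.com/gpu: 1   # requires the NVIDIA device plugin
            limits:
              nvidia.com/gpu: 1   # with time-slicing, each physical GPU can show as more than one

Without the nvidia.com/gpu request/limit, the pod is scheduled without a GPU attached, which produces the same "driver version is insufficient" error as above.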
This issue has been automatically marked as stale because it has not had recent activity. This might be due to missing feedback from OP. It will be closed if no further activity occurs. Thank you for your contributions.
This issue is locked due to inactivity