Docker container for version 2.3.0 CUDA detection broken
System Info
Running this container on multiple services breaks CUDA GPU detection: no GPUs are detected.
- Running Llama 3.1 from HF
- Tried on Runpod/Local/Novita platforms
- GPUs tested: RTX 4090, A4500
Reverting to the container tagged :2.2.0 fixes the issue.
Just thought I would post this in case others are using 2.3.0 in production. We had an automated scaling process instantiate the new container via the :latest tag, and it brought down our production systems.
Please take a look at this issue, team.
Thank you.
Information
- [X] Docker
- [ ] The CLI directly
Tasks
- [ ] An officially supported command
- [ ] My own modifications
Reproduction
- Pull the latest 2.3.0 Docker image
- Run with any LLM
- It will fail to find the GPU
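To confirm the failure from inside the running container, a small check like the one below may help. It is only a sketch: it assumes `nvidia-smi` is on the PATH inside the container (usually the case when the NVIDIA container runtime exposes the GPUs) and that `nvidia-smi -L` prints one `GPU <n>: ...` line per device. The function names `count_gpus` and `visible_gpus` are made up for this example.

```python
import shutil
import subprocess


def count_gpus(listing: str) -> int:
    """Count device lines in `nvidia-smi -L` style output.

    Assumes one line per device, each starting with 'GPU '.
    """
    return sum(1 for line in listing.splitlines() if line.startswith("GPU "))


def visible_gpus() -> int:
    """Return the number of GPUs nvidia-smi reports, or 0 if it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        # No nvidia-smi binary: treat as "no GPUs visible".
        return 0
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return 0
    if result.returncode != 0:
        return 0
    return count_gpus(result.stdout)


if __name__ == "__main__":
    print(f"GPUs visible in this environment: {visible_gpus()}")
```

Running this in both the :2.2.0 and 2.3.0 containers on the same host should show whether the devices are missing at the container level or only from the server's point of view.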
Expected behavior
We would expect this version to automatically detect the local CUDA GPUs.
We ran into the same issue yesterday with our Docker launch scripts, which use the latest image tag. It looks like latest is pointing to the 2.3.0-rocm tag instead of 2.3.0.
Using a version-based tag addressed the issue.
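For anyone wanting to guard their launch scripts against this, here is a rough sketch of a check that rejects floating image references before deployment. The helper names (`split_image_ref`, `is_pinned`) and the `example/server` image name are hypothetical, and the parsing is deliberately simplified: it treats a reference as pinned if it has a digest or an explicit tag other than `latest`.

```python
from typing import Optional, Tuple


def split_image_ref(ref: str) -> Tuple[str, Optional[str]]:
    """Split 'repo[:tag]' into (repo, tag); tag is None when no tag is given."""
    # Ignore a digest suffix such as '@sha256:...' for tag parsing.
    name = ref.split("@", 1)[0]
    last_slash = name.rfind("/")
    colon = name.rfind(":")
    # A colon before the last slash belongs to a registry port, not a tag.
    if colon > last_slash:
        return name[:colon], name[colon + 1:]
    return name, None


def is_pinned(ref: str) -> bool:
    """True if the reference has a digest or an explicit tag other than 'latest'."""
    if "@" in ref:
        return True
    _, tag = split_image_ref(ref)
    return tag is not None and tag != "latest"


if __name__ == "__main__":
    for ref in ("example/server:2.2.0", "example/server:latest", "example/server"):
        status = "pinned" if is_pinned(ref) else "floating"
        print(f"{ref}: {status}")
```

A check like this could run in CI or at the start of an autoscaling script, so that a retagged `latest` cannot silently change what gets deployed.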