"Can't initialize NVML" on system with CUDA 13.0 and Maxwell GPU
Describe the bug
Docker command: docker run --gpus all -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-gpu
Kokoro falls back to using CPU. This is fine, but I wanted to use my GPU. As I understand it, CUDA is somewhat backwards compatible, so my CUDA 13.0 system should run a CUDA 12.8 program fine, yes?
I was able to install and use Open-WebUI and its bundled Ollama with my GPU just fine.
Screenshots or console output
2025-10-30 01:23:43.851 | INFO | __main__:download_model:60 - Model files already exist and are valid
/app/.venv/lib/python3.10/site-packages/torch/cuda/__init__.py:734: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
INFO: Started server process [30]
Branch / Deployment used
docker local on a headless machine
Docker command: docker run --gpus all -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-gpu
Operating System
me@my-box:~> cat /etc/os-release
NAME="openSUSE Leap"
VERSION="15.6"
ID="opensuse-leap"
ID_LIKE="suse opensuse"
#Linux kernel version
me@my-box:~> uname -rp
6.4.0-150600.23.73-default x86_64
me@my-box:~> sudo nvidia-smi -q
==============NVSMI LOG==============
Timestamp : Wed Oct 29 18:52:11 2025
Driver Version : 580.95.05
CUDA Version : 13.0
Attached GPUs : 1
GPU 00000000:01:00.0
Product Name : Quadro K620
Product Brand : Quadro
Product Architecture : Maxwell
me@my-box:~> docker --version
Docker version 28.3.3-ce, build bea959c7b
Additional context
Installed NVIDIA proprietary driver and NVIDIA-container-toolkit using https://en.opensuse.org/SDB:NVIDIA_drivers
I'd guess that the combination of PyTorch and the Maxwell GPU architecture is the problem. The Kokoro-FastAPI project uses PyTorch 2.8.0 with CUDA 12.9.
https://github.com/remsky/Kokoro-FastAPI/blob/88dcf00e4fc622b12eeb271e6f56aff860229646/pyproject.toml#L46
Unfortunately, PyTorch has removed support for the Maxwell and Pascal architectures from its CUDA 12.8 and 12.9 builds, although it still offers CUDA 12.6 builds that support them. Additionally, CUDA 13 has deprecated library support for Maxwell.
You could try building a container with this change: #407
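To make the architecture point above concrete, here is a minimal Python sketch of the check involved: whether a GPU's compute capability is covered by the architectures a PyTorch build was compiled for. The helper names `parse_arch` and `device_is_covered` are hypothetical, and the architecture lists in the usage note are illustrative rather than the actual lists of any particular PyTorch wheel. The Quadro K620 reports compute capability 5.0. The matching rule is a simplification: an `sm_XY` entry is treated as covering devices with the same major version and an equal or higher minor version, and a `compute_XY` (PTX) entry as covering any device at or above that capability.

```python
import re


def parse_arch(arch):
    """Split an entry such as 'sm_86' or 'compute_90' into (kind, major, minor).

    Returns None for entries this sketch does not understand (e.g. 'sm_90a').
    """
    match = re.fullmatch(r"(sm|compute)_(\d{2,3})", arch)
    if match is None:
        return None
    kind, digits = match.groups()
    # The last digit is the minor version; everything before it is the major.
    return kind, int(digits[:-1]), int(digits[-1])


def device_is_covered(capability, arch_list):
    """Return True if a device with `capability` (major, minor) can run code
    built for at least one entry in `arch_list` (simplified rule, see lead-in)."""
    major, minor = capability
    for arch in arch_list:
        parsed = parse_arch(arch)
        if parsed is None:
            continue
        kind, arch_major, arch_minor = parsed
        if kind == "sm" and arch_major == major and arch_minor <= minor:
            return True
        if kind == "compute" and (arch_major, arch_minor) <= (major, minor):
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        print("PyTorch built against CUDA:", torch.version.cuda)
        if torch.cuda.is_available():
            capability = torch.cuda.get_device_capability(0)
            arch_list = torch.cuda.get_arch_list()
            print("Device capability:", capability)
            print("Build architectures:", arch_list)
            print("Covered:", device_is_covered(capability, arch_list))
        else:
            print("torch.cuda.is_available() is False; no device to check")
```

For example, `device_is_covered((5, 0), ["sm_50", "sm_60"])` is true, while `device_is_covered((5, 0), ["sm_70", "sm_75", "sm_80"])` is false. Run inside the container, the `__main__` part only reports a result when PyTorch can actually see the GPU.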
@wired-filipino-owl were you able to test this fix?
@wired-filipino-owl I've spent a bit of time producing working builds and containers from my fork. The latest builds include the changes from the master branch merged with the changes from this PR. If you get a chance to test, please reply. Thanks!
docker run --gpus all -p 8880:8880 ghcr.io/ryan-steed-usa/kokoro-fastapi-gpu:latest
@ryan-steed-usa thank you! I will test when I get around to ripping out CUDA 13.0 and downgrading to 12.6 on my openSUSE machine. Swapping CUDA versions is an involved process 😓