Kokoro-FastAPI

Graphics card not found under Linux Mint

gitchat1 opened this issue 10 months ago · 10 comments

I bought myself a laptop with an RTX 4060 Max-Q and installed Linux Mint plus the NVIDIA drivers. When I try to build the Docker image, or run the ready-made one, it can't find the GPU. Here are the logs.

~$ docker run --gpus all -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-gpu:v0.2.2
Unable to find image 'ghcr.io/remsky/kokoro-fastapi-gpu:v0.2.2' locally
v0.2.2: Pulling from remsky/kokoro-fastapi-gpu
40a1c66fd673: Pulling fs layer
d1da8f0791a1: Download complete
4f4fb700ef54: Already exists
6921f6108f74: Download complete
d27544a87475: Download complete
2b73a021bed5: Download complete
333fa843603f: Download complete
6eabf45d2dc4: Download complete
bf83beda4da9: Download complete
74d90daacc5b: Download complete
36b7d6a9ac01: Download complete
Digest: sha256:4dc5455046339ecbea75f2ddb90721ad8dbeb0ab14c8c828e4b962dac2b80135
Status: Downloaded newer image for ghcr.io/remsky/kokoro-fastapi-gpu:v0.2.2
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.120                Driver Version: 550.120        CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4060 ...    Off |   00000000:01:00.0 Off |                  N/A |
| N/A   45C    P3             15W /  60W  |       9MiB /  8188MiB  |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                                    Usage |
|=========================================================================================|
|    0   N/A  N/A      1221    G     /usr/lib/xorg/Xorg                              4MiB |
+-----------------------------------------------------------------------------------------+

~$ cd Kokoro-FastAPI

~/Kokoro-FastAPI$ cd docker/gpu
~/Kokoro-FastAPI/docker/gpu$ docker compose up --build
[+] Building 2.3s (18/18) FINISHED                                       docker:desktop-linux
 => [kokoro-tts internal] load build definition from Dockerfile  0.0s
 => => transferring dockerfile: 1.97kB  0.0s
 => [kokoro-tts internal] load metadata for docker.io/nvidia/cuda:12.8.0-cudnn-runtime-ubuntu24.04  1.7s
 => [kokoro-tts internal] load .dockerignore  0.0s
 => => transferring context: 367B  0.0s
 => [kokoro-tts stage-0 1/12] FROM docker.io/nvidia/cuda:12.8.0-cudnn-runtime-ubuntu24.04@sha256:c40d1065da90274969f9  0.0s
 => => resolve docker.io/nvidia/cuda:12.8.0-cudnn-runtime-ubuntu24.04@sha256:c40d1065da90274969f9faa7fe1a7fcd1c374d578  0.0s
 => [kokoro-tts internal] load build context  0.0s
 => => transferring context: 7.69kB  0.0s
 => CACHED [kokoro-tts stage-0 2/12] RUN apt-get update && apt-get install -y python3.10 python3-venv esp  0.0s
 => CACHED [kokoro-tts stage-0 3/12] RUN curl -LsSf https://astral.sh/uv/install.sh | sh && mv /root/.local/bin/u  0.0s
 => CACHED [kokoro-tts stage-0 4/12] RUN useradd -m -u 1001 appuser && mkdir -p /app/api/src/models/v1_0 && c  0.0s
 => CACHED [kokoro-tts stage-0 5/12] WORKDIR /app  0.0s
 => CACHED [kokoro-tts stage-0 6/12] COPY --chown=appuser:appuser pyproject.toml ./pyproject.toml  0.0s
 => CACHED [kokoro-tts stage-0 7/12] RUN --mount=type=cache,target=/root/.cache/uv uv venv --python 3.10 && u  0.0s
 => CACHED [kokoro-tts stage-0 8/12] COPY --chown=appuser:appuser api ./api  0.0s
 => CACHED [kokoro-tts stage-0 9/12] COPY --chown=appuser:appuser web ./web  0.0s
 => CACHED [kokoro-tts stage-0 10/12] COPY --chown=appuser:appuser docker/scripts/ ./  0.0s
 => CACHED [kokoro-tts stage-0 11/12] RUN chmod +x ./entrypoint.sh  0.0s
 => CACHED [kokoro-tts stage-0 12/12] RUN if [ "true" = "true" ]; then python download_model.py --output api/src/m  0.0s
 => [kokoro-tts] exporting to image  0.2s
 => => exporting layers  0.0s
 => => exporting manifest sha256:189f732009c521ba500bba819fc7816b608cd832efe063f51a43895f3c1a0ae6  0.0s
 => => exporting config sha256:5a85df4075c6f50607557fb663c1b87637fd418647002dc202fc4ef0a111517b  0.0s
 => => exporting attestation manifest sha256:8821ed04133f360dbbdf8040515bbf8c35bd520eac6f5973ec1aad917e950fe9  0.1s
 => => exporting manifest list sha256:851211ab5d9f52f0531f383a8f7fc38bf3ba42efed2b18d0a8ff2d6af2016885  0.0s
 => => naming to docker.io/library/kokoro-tts-gpu-kokoro-tts:latest  0.0s
 => => unpacking to docker.io/library/kokoro-tts-gpu-kokoro-tts:latest  0.0s
 => [kokoro-tts] resolving provenance for metadata file  0.0s
[+] Running 3/3
 ✔ kokoro-tts                              Built    0.0s
 ✔ Network kokoro-tts-gpu_default          Created  0.1s
 ✔ Container kokoro-tts-gpu-kokoro-tts-1   Created  0.2s
Attaching to kokoro-tts-1
Gracefully stopping... (press Ctrl+C again to force)
Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]

gitchat1 · Feb 20 '25

Does nvidia-smi work? Also, do you have this installed? https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
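
For reference, a quick way to check both from a terminal (the nvidia-ctk binary only exists once the container toolkit is installed):

nvidia-smi                          # driver side: should print the usual GPU table
nvidia-ctk --version                # toolkit side: fails if the container toolkit is missing
dpkg -l nvidia-container-toolkit    # shows whether the package is installed at all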

fireblade2534 · Feb 20 '25

Also experiencing this just now.

Ubuntu 24 in a VM, with a GTX 1650 in PCI passthrough.

I've confirmed separately that nvidia-smi and nvcc are both working well.


EDIT: Resolved with the following commands...

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list |   sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' |   sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
apt update && apt-get install -y nvidia-container-toolkit
nvidia-ctk runtime configure --runtime=docker
systemctl restart docker
docker run --rm --runtime=nvidia --gpus all nvidia/cuda:12.1.1-base-ubuntu22.04 nvidia-smi

I hope this is helpful.


Now I'm seeing a different error, probably unrelated:

root@kokoro-tts:~/Kokoro-FastAPI/docker/gpu# docker compose up
[+] Running 1/1
 ✔ Container kokoro-tts-gpu-kokoro-tts-1  Created                                                                                                                              
Attaching to kokoro-tts-1
kokoro-tts-1  |
kokoro-tts-1  | ==========
kokoro-tts-1  | == CUDA ==
kokoro-tts-1  | ==========
kokoro-tts-1  |
kokoro-tts-1  | CUDA Version 12.8.0
kokoro-tts-1  |
kokoro-tts-1  | Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
kokoro-tts-1  |
kokoro-tts-1  | This container image and its contents are governed by the NVIDIA Deep Learning Container License.
kokoro-tts-1  | By pulling and using the container, you accept the terms and conditions of this license:
kokoro-tts-1  | https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
kokoro-tts-1  |
kokoro-tts-1  | A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
kokoro-tts-1  |
kokoro-tts-1  | 2025-02-20 19:45:21.722 | INFO     | __main__:download_model:63 - Downloading Kokoro v1.0 model files
kokoro-tts-1  | 2025-02-20 19:45:21.722 | INFO     | __main__:download_model:71 - Downloading model file...
kokoro-tts-1  | 2025-02-20 19:45:21.956 | ERROR    | __main__:download_model:84 - Failed to download model: [Errno 13] Permission denied: 'api/src/models/v1_0/kokoro-v1_0.pth'
kokoro-tts-1  | Traceback (most recent call last):
kokoro-tts-1  |   File "/app/download_model.py", line 104, in <module>
kokoro-tts-1  |     main()
kokoro-tts-1  |   File "/app/download_model.py", line 100, in main
kokoro-tts-1  |     download_model(args.output)
kokoro-tts-1  |   File "/app/download_model.py", line 72, in download_model
kokoro-tts-1  |     urlretrieve(model_url, model_path)
kokoro-tts-1  |   File "/home/appuser/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/urllib/request.py", line 251, in urlretrieve
kokoro-tts-1  |     tfp = open(filename, 'wb')
kokoro-tts-1  | PermissionError: [Errno 13] Permission denied: 'api/src/models/v1_0/kokoro-v1_0.pth'
kokoro-tts-1 exited with code 1

EDIT again: Resolved by running:

apt install -y python3 python3-pip
python3 -m pip install loguru --break-system-packages
cd /root/Kokoro-FastAPI/
python3 docker/scripts/download_model.py --output api/src/models/v1_0
docker compose up
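
An untested alternative, assuming the models directory is bind-mounted from the host (the Dockerfile layers above create appuser with UID 1001, so that user needs write access to the target directory):

# hypothetical: make the mounted models directory writable by the container's appuser (UID 1001)
sudo chown -R 1001:1001 ~/Kokoro-FastAPI/api/src/models/v1_0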

Jefferderp · Feb 20 '25

Do you think this solution might also work for me? If so, how would I best go about it? I would really appreciate step-by-step instructions.

gitchat1 · Feb 20 '25

@gitchat1 Does the command nvidia-smi work? Also, do you have this installed? https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

fireblade2534 · Feb 21 '25

I tried that already, but for some reason it couldn't find the Docker daemon. I'm using Docker Desktop, and as far as I can tell that program apparently does not support GPU acceleration on Linux. Is there anything left to try?

gitchat1 · Feb 21 '25

@gitchat1 Try using the command-line version of Docker (Docker Engine): https://docs.docker.com/engine/install/ and install https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
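
For what it's worth, the docker:desktop-linux tag in the build output above suggests the CLI was talking to Docker Desktop. A quick way to check which backend is active and whether the nvidia runtime is registered:

docker context ls               # Docker Desktop shows up as the desktop-linux context
docker info | grep -i runtime   # should list "nvidia" among the runtimes once the toolkit is configured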

fireblade2534 · Feb 21 '25

Okay, I was able to make some progress. I configured the NVIDIA toolkit for administrator use. When I tried to configure rootless mode I got the following error: "Failed to restart docker.service: Unit docker.service not found." So I reverted the changes and simply put sudo in front of docker run --gpus all -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-gpu:v0.2.2. Unfortunately, the first time I made the mistake of running the command without the version number specified.

In any case, I got the following error both times:

docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running prestart hook #0: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: requirement error: unsatisfied condition: cuda>=12.8, please update your driver to a newer version, or use an earlier cuda container: unknown

I don't know if there is a newer driver Linux Mint supports. I'm also unsure whether to add the PPA with the latest NVIDIA drivers for Ubuntu, since I don't know what that might do and whether it would conflict with the repository Linux Mint uses.

gitchat1 · Feb 21 '25

@gitchat1 The problem is that v0.2.2 now requires CUDA 12.8, and from the looks of it you have 12.4. I just went through the process of upgrading CUDA and the drivers myself on Ubuntu for my GTX 1080, and made this guide. It worked for me, and you can probably use something similar; check the link for other Linux OSes. You may have to reboot your system without a driver installed, but it may also work without doing that (I was trying to figure things out, so I did).
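
A quick way to confirm this yourself (the CUDA version shown in the nvidia-smi header is the highest CUDA runtime the installed driver supports, 12.4 in the output above):

nvidia-smi --query-gpu=driver_version --format=csv,noheader   # prints just the driver version
nvidia-smi | head -n 4                                        # the header also shows the supported CUDA version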

My Ubuntu 24.04 guide to upgrade to CUDA 12.8 and latest NVIDIA Drivers (Risky)

Remove existing CUDA:

sudo apt-get --purge remove "*cuda*" "*cublas*" "*cufft*" "*cufile*" "*curand*" "*cusolver*" "*cusparse*" "*gds-tools*" "*npp*" "*nvjpeg*" "nsight*" "*nvvm*"

Remove existing NVIDIA drivers:

sudo apt-get --purge remove "*nvidia*" "libxnvctrl*"

Cleanup:

sudo apt-get autoremove
sudo apt-get clean

Remove CUDA folders:

sudo rm -rf /usr/local/cuda*

Re-install CUDA (Only for Ubuntu 24.04, look here for other versions):

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-ubuntu2404.pin
sudo mv cuda-ubuntu2404.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.8.0/local_installers/cuda-repo-ubuntu2404-12-8-local_12.8.0-570.86.10-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2404-12-8-local_12.8.0-570.86.10-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2404-12-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-8

Check ~/.bashrc (or whatever you use) for existing CUDA export paths and remove them (examples of what to look for are shown after the command):

nano ~/.bashrc
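
The stale entries to look for typically resemble the following (example paths pointing at an older install):

# examples of old lines that may need removing from ~/.bashrc
export PATH=/usr/local/cuda-12.4/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.4/lib64:$LD_LIBRARY_PATH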

Link the /usr/local/cuda-12.8 folder to /usr/local/cuda:

sudo ln -s /usr/local/cuda-12.8 /usr/local/cuda

Re-add to ~/.bashrc (or whatever you use):

echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc

Re-install NVIDIA drivers:

sudo apt-get install -y nvidia-open
sudo apt-get install -y nvidia-driver-570

Reboot:

sudo reboot

Re-install the NVIDIA Container Toolkit for Docker:

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
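
Once the toolkit is back in place, a sanity check along the lines of the earlier test should confirm the new driver satisfies CUDA 12.8 (the exact base-image tag here is just an example):

docker run --rm --runtime=nvidia --gpus all nvidia/cuda:12.8.0-base-ubuntu24.04 nvidia-smi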

richardr1126 · Feb 24 '25

If anyone feels like trying: add docker to the video group in docker/gpu/docker-compose.yml and see if that does the trick? https://github.com/remsky/Kokoro-FastAPI/pull/198/files#diff-6b65a4e9ffad3492f9e7ea20df64a768176ba404e3a98231f3879db6acbe9107

blakkd · Feb 24 '25

I tried everything suggested on Ubuntu 22.04 and gave up using Docker. The start-gpu.sh script works.
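
For reference, assuming the script sits at the repository root:

cd Kokoro-FastAPI
./start-gpu.sh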

sky-cake · Jun 11 '25