
Cog doesn't know if CUDA is compatible with PyTorch / Docker is missing required device driver

Open lukestanley opened this issue 1 year ago • 2 comments

Cog says up front that it's not sure about the compatibility, then (after a lot of downloads) it fails with: "Docker is missing required device driver". I figured this is an issue since Cog pitches itself as:

  • "📦 Docker containers without the pain."

  • "🤬️ No more CUDA hell. Cog knows which CUDA/cuDNN/PyTorch/Tensorflow/Python combos are compatible and will set it all up correctly for you."

This is my log:

cog-stable-diffusion$ sudo cog run script/download-weights hf_******************************
⚠ Cog doesn't know if CUDA 11.6.2 is compatible with PyTorch 1.12.1 --extra-index-url=https://download.pytorch.org/whl/cu116. This might cause CUDA problems.
Building Docker image from environment in cog.yaml...
[+] Building 2.0s (16/16) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 1.67kB 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> resolve image config for docker.io/docker/dockerfile:1.2 0.9s
=> CACHED docker-image://docker.io/docker/dockerfile:1.2@sha256:e2a8561e419ab1ba6b2fe6cbdf49fd92b95912df1cf7d313c3e2230a333fdbcc 0.0s
=> [internal] load metadata for docker.io/nvidia/cuda:11.6.2-cudnn8-devel-ubuntu20.04 0.6s
=> [stage-0 1/8] FROM docker.io/nvidia/cuda:11.6.2-cudnn8-devel-ubuntu20.04@sha256:55211df43bf393d3393559d5ab53283d4ebc3943d802b04 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 31.63kB 0.0s
=> CACHED [stage-0 2/8] RUN rm -f /etc/apt/sources.list.d/cuda.list && rm -f /etc/apt/sources.list.d/nvidia-ml.list && apt 0.0s
=> CACHED [stage-0 3/8] RUN --mount=type=cache,target=/var/cache/apt apt-get update -qq && apt-get install -qqy --no-install-recom 0.0s
=> CACHED [stage-0 4/8] RUN curl -s -S -L https://raw.githubusercontent.com/pyenv/pyenv-installer/master/bin/pyenv-installer | bas 0.0s
=> CACHED [stage-0 5/8] COPY .cog/tmp/build1496174735/cog-0.0.1.dev-py3-none-any.whl /tmp/cog-0.0.1.dev-py3-none-any.whl 0.0s
=> CACHED [stage-0 6/8] RUN --mount=type=cache,target=/root/.cache/pip pip install /tmp/cog-0.0.1.dev-py3-none-any.whl 0.0s
=> CACHED [stage-0 7/8] RUN --mount=type=cache,target=/root/.cache/pip pip install diffusers==0.2.4 torch==1.12.1 --extra-index- 0.0s
=> CACHED [stage-0 8/8] WORKDIR /src 0.0s
=> exporting to image 0.1s
=> => exporting layers 0.0s
=> => writing image sha256:1c81aeabd3aa4357e1eda8a0c8ea7add1172a525b025079f2361d745f88beb33 0.0s
=> => naming to docker.io/library/cog-cog-stable-diffusion-base 0.0s
=> exporting cache 0.0s
=> => preparing build cache for export 0.0s

Running 'script/download-weights hf_******************************' in Docker with the current directory mounted as a volume...
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
ⅹ Docker is missing required device driver

nvidia-smi
Wed Aug 31 14:08:41 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.141.03   Driver Version: 470.141.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 33%   35C    P8     1W /  38W |      5MiB /  2002MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      3517      G   /usr/lib/xorg/Xorg                  2MiB |
+-----------------------------------------------------------------------------+

docker -v
Docker version 20.10.12, build 20.10.12-0ubuntu2~20.04.1

cat /etc/issue
Ubuntu 20.04.5 LTS
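
For anyone triaging this: the outputs above mention two different CUDA versions, 11.4 in the nvidia-smi header and 11.6.2 in the base image tag. Below is a minimal sketch that pulls both out of the pasted text and compares them. The function names are hypothetical, the input strings are copied from this report, and it only flags the difference. It does not establish whether that difference matters, and it does not diagnose the Docker device-driver error.

```python
import re

# Hypothetical helpers for illustration only; not part of Cog or Docker.

def driver_cuda_version(nvidia_smi_text):
    """Return the 'CUDA Version' shown in an nvidia-smi header as a tuple of ints, or None."""
    match = re.search(r"CUDA Version:\s*(\d+(?:\.\d+)*)", nvidia_smi_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def image_cuda_version(image_ref):
    """Return the CUDA version embedded in an nvidia/cuda image tag as a tuple of ints, or None."""
    match = re.search(r"nvidia/cuda:(\d+(?:\.\d+)*)", image_ref)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


# Strings copied from the log and nvidia-smi output in this report.
header = "| NVIDIA-SMI 470.141.03 Driver Version: 470.141.03 CUDA Version: 11.4 |"
image = "docker.io/nvidia/cuda:11.6.2-cudnn8-devel-ubuntu20.04"

driver = driver_cuda_version(header)
wanted = image_cuda_version(image)

# Compare major.minor only; the driver header does not report a patch level.
image_is_newer = wanted[:2] > driver[:2]
print("driver reports CUDA", driver, "| image uses CUDA", wanted, "| image newer:", image_is_newer)
```

The comparison is limited to major.minor because that is all the nvidia-smi header provides.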

lukestanley, Aug 31 '22 13:08