nvidia-docker
Error - RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
1. Issue or feature description
When I run my code (just basic code from the PyTorch tutorial), it always fails with one of two similar errors: "RuntimeError: CUDA error: no kernel image is available for execution on the device" or "RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED". However, the same code runs fine on my host system and in a container built from an image with "cu113".
So I guess the CUDA 10.x series Docker images are not compatible with the NVIDIA GeForce RTX 3070 Ti?
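For reference, a minimal sketch of the kind of tutorial code involved (my assumption, not the exact script): inside the CUDA 10.x containers, a plain CUDA tensor op triggers the "no kernel image" error and a cuDNN-backed convolution triggers CUDNN_STATUS_EXECUTION_FAILED (whichever runs first fails).

```python
import torch

# Plain CUDA tensor math -> "RuntimeError: CUDA error: no kernel image is
# available for execution on the device" on the RTX 3070 Ti in the cu10.x images
x = torch.randn(8, 3, 32, 32, device="cuda")
y = x * 2

# cuDNN-backed convolution -> "RuntimeError: cuDNN error:
# CUDNN_STATUS_EXECUTION_FAILED" in the same containers
conv = torch.nn.Conv2d(3, 16, kernel_size=3).cuda()
out = conv(x)
print(out.shape)
```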
2. Steps to reproduce the issue
My host Linux environment:
Ubuntu 20.04, GPU: NVIDIA GeForce RTX 3070 Ti,
driver: NVIDIA-SMI 470.103.01, Driver Version: 470.103.01, CUDA Version: 11.4.
The Docker images that produce the errors above are:
"10.1-cudnn8-devel-ubuntu18.04",
"10.1-cudnn7-devel-ubuntu18.04",
"10.0-cudnn7-devel-ubuntu18.04", all from https://gitlab.com/nvidia/container-images/cuda/blob/master/doc/supported-tags.md#cuda-101.
To rule out the PyTorch version as a factor, I also tried an image from https://hub.docker.com/r/pytorch/pytorch: "pytorch/pytorch:1.4-cuda10.1-cudnn7-devel", which already ships a compatible PyTorch build, but it fails with the same error.
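As an additional check inside each container (a hypothetical diagnostic, not part of the original report), the GPU's compute capability can be compared against the CUDA/cuDNN build the image ships; the RTX 3070 Ti reports capability 8.6, which the CUDA 10.x toolkits predate.

```python
import torch

# Versions the container's PyTorch build was compiled against
print("torch:", torch.__version__)
print("built with CUDA:", torch.version.cuda)
print("cuDNN:", torch.backends.cudnn.version())

# Compute capability of the installed GPU (expected (8, 6) for the RTX 3070 Ti)
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
    print("capability:", torch.cuda.get_device_capability(0))
```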
3. Information to attach (optional if deemed irrelevant)
- [ ] Some nvidia-container information: nvidia-container-cli -k -d /dev/tty info
- [ ] Kernel version from uname -a
Linux walker-dev 5.13.0-39-generic #44~20.04.1-Ubuntu SMP Thu Mar 24 16:43:35 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
- [ ] Any relevant kernel output lines from dmesg
- [ ] Driver information from nvidia-smi -a
NVIDIA-SMI 470.103.01, Driver Version: 470.103.01, CUDA Version: 11.4
- [ ] Docker version from docker version
- [ ] NVIDIA packages version from dpkg -l '*nvidia*' or rpm -qa '*nvidia*'
- [ ] NVIDIA container library version from nvidia-container-cli -V
- [ ] NVIDIA container library logs (see troubleshooting)
- [ ] Docker command, image and tag used