NeMo
Can't launch NeMo containers with CUDA support
Describe the bug
After pulling the image and starting the container, I get the following error:
ERROR: The NVIDIA Driver is present, but CUDA failed to initialize. GPU functionality will not be available.
[[ Forward compatibility was attempted on non supported HW (error 804) ]]
Steps/Code to reproduce bug
The command I use to start the container:
docker run -it --gpus '"device=1,2"' --ulimit stack=67108864 --runtime nvidia nvcr.io/nvidia/nemo:24.03.01.framework
I've also tried the nvcr.io/nvidia/nemo:dev.framework image.
Expected behavior
CUDA should initialize properly.
Environment overview
GPU: RTX 2080 Ti
NVIDIA driver version: 535.154.05
CUDA version: 12.2
OS: Ubuntu 20.04.5 LTS (amd64)
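For context, error 804 is what the CUDA runtime reports when forward-compatibility libraries are used on a GPU that does not support them. NVIDIA's compatibility package is aimed mainly at data center GPUs, and the RTX 2080 Ti is a GeForce card. The following is a minimal sketch of the version comparison that seems relevant here, not a confirmed diagnosis. It assumes the container exposes its toolkit version in a CUDA_VERSION environment variable (an assumption about the image, not verified in this report). cuda_version_ge is a hypothetical helper name, and 12.4 is only a placeholder, not the actual CUDA version shipped in the NeMo image.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is greater than or equal to version $2.
cuda_version_ge() {
    # sort -V orders version strings; if $2 sorts first (or ties), then $1 >= $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# CUDA version reported for the host driver in the environment overview (driver 535.154.05).
host_cuda="12.2"
# ASSUMPTION: CUDA_VERSION is set inside the container; 12.4 is a placeholder fallback.
container_cuda="${CUDA_VERSION:-12.4}"

if cuda_version_ge "$host_cuda" "$container_cuda"; then
    echo "driver CUDA $host_cuda covers container CUDA $container_cuda"
else
    echo "container CUDA $container_cuda is newer than driver CUDA $host_cuda; forward compatibility would likely be attempted"
fi
```

If the second message appears when this is run inside the container, the likely options are a newer host driver or an older image whose CUDA version the 535 driver already supports.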