Can't launch NeMo containers with CUDA support

Open • drunkinlove opened this issue 9 months ago • 1 comment

Describe the bug

After pulling the image and starting the container, I get the following error:

ERROR: The NVIDIA Driver is present, but CUDA failed to initialize.  GPU functionality will not be available.
   [[ Forward compatibility was attempted on non supported HW (error 804) ]]

Steps/Code to reproduce bug

The command I use to start the container:

docker run -it --gpus '"device=1,2"' --ulimit stack=67108864 --runtime nvidia nvcr.io/nvidia/nemo:24.03.01.framework

I've also tried the nvcr.io/nvidia/nemo:dev.framework image.

Expected behavior

CUDA should initialize properly.

Environment overview (please complete the following information)

GPU: RTX 2080 Ti
NVIDIA driver version: 535.154.05
CUDA version: 12.2
OS: Ubuntu 20.04.5 LTS (amd64)
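
For anyone trying to reproduce this, the host details listed in the environment overview can be gathered with a short script. This is only a minimal sketch, assuming a POSIX shell on the host. `collect_env` is a hypothetical helper name, not part of NeMo or the issue template. It does not print the CUDA version; that value appears in the header of a plain `nvidia-smi` call.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not from NeMo): prints the GPU name,
# driver version, and OS of the host in a form suitable for a bug report.
collect_env() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        # name and driver_version are standard nvidia-smi query fields;
        # one line is printed per GPU.
        nvidia-smi --query-gpu=name,driver_version --format=csv,noheader \
            | sed 's/^/GPU, driver: /'
    else
        echo "GPU, driver: nvidia-smi not found"
    fi

    if [ -r /etc/os-release ]; then
        # Source in a subshell so the os-release variables do not leak.
        ( . /etc/os-release; echo "OS: ${PRETTY_NAME:-unknown} ($(uname -m))" )
    else
        echo "OS: $(uname -sr) ($(uname -m))"
    fi
}

collect_env
```

Run on the host rather than inside the container, this prints the driver that the container's CUDA libraries have to work with.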

drunkinlove, May 21 '24 19:05