isaac_ros_visual_slam

Docker Error - Unknown or Invalid Runtime Name: Nvidia

Open YuminosukeSato opened this issue 2 years ago • 17 comments

I am encountering a runtime error with Docker when trying to use the Nvidia runtime. This issue arises despite having a successful output with an initial Docker command and making subsequent edits to the Docker configuration.

Steps to Reproduce

  1. Run the following Docker command, which executes successfully:

    sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
    
  2. Edit /etc/docker/daemon.json as follows:

    {
        "runtimes": {
            "nvidia": {
                "path": "nvidia-container-runtime",
                "runtimeArgs": []
            }
        },
        "default-runtime": "nvidia"
    }
    
  3. After making these changes, attempt to execute a script with the command:

    scripts/run_dev.sh ~/workspaces/isaac_ros-dev/
    

    This results in the following error:

    docker: Error response from daemon: unknown or invalid runtime name: nvidia.
    

Expected Behavior

The Docker container should recognize the Nvidia runtime without errors, especially since the initial command runs without issues.

Actual Behavior

The system throws an error stating "unknown or invalid runtime name: nvidia" when trying to run a script that utilizes Docker with the Nvidia runtime.

Environment

  • Docker version: 24.0.7, build afdd53b
  • Operating System: Ubuntu 22.04

Attempts to Resolve

  • Verified that the initial Docker command runs successfully.
  • Checked the syntax and paths in the daemon.json file.
  • Searched for similar issues in forums and GitHub Issues.

Request for Help

Could anyone provide insights or suggest potential solutions to resolve this runtime error? Any advice or guidance would be greatly appreciated.

YuminosukeSato avatar Nov 15 '23 05:11 YuminosukeSato

It looks like you may not have nvidia-container-toolkit installed. See here for instructions on how to install on your x86_64 system running Jammy.
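
For reference, a typical apt-based install on Jammy looks roughly like the sketch below; these commands follow NVIDIA's public Container Toolkit install guide at the time of writing, so treat the repository URLs and keyring path as subject to change:

# Add NVIDIA's apt repository and signing key (keyring path per the current install guide)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install the toolkit, register the nvidia runtime in /etc/docker/daemon.json, and restart Docker
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker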

hemalshahNV avatar Nov 17 '23 01:11 hemalshahNV

We installed nvidia-container-toolkit and then started Docker, but we still get this error.

YuminosukeSato avatar Nov 17 '23 05:11 YuminosukeSato

I am experiencing the same issue; nvidia-container-toolkit is also installed.

solix avatar Nov 17 '23 08:11 solix

I am experiencing the same issue; nvidia-container-toolkit is also installed.

weirdsim14 avatar Nov 22 '23 03:11 weirdsim14

We're looking into this but haven't been able to reproduce this yet with the same OS and Docker version. We're still running a few more experiments on freshly provisioned machines to see if we can narrow it down.

Our theory is that the setup instructions for nvidia-container-toolkit differ from what our machine provisioning scripts do (listed below):

# Install Nvidia Docker runtime
curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
sudo apt-get update
sudo apt-get install -y nvidia-container-runtime
sudo systemctl restart docker

sudo gpasswd -a $USER docker
sudo usermod -a -G docker $(whoami)
newgrp docker
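
After the restart, one generic way to confirm that the daemon actually registered the nvidia runtime (not part of the provisioning script above) is to inspect docker info:

# Lists the runtimes the daemon currently knows about; "nvidia" should appear here
sudo docker info --format '{{json .Runtimes}}'
# Or, without a format template:
sudo docker info | grep -i runtimes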

hemalshahNV avatar Nov 30 '23 04:11 hemalshahNV

Hi,

Is there any update regarding this issue? I'm experiencing the same on Ubuntu 22.04, Docker v4.30.0.

mrlreable avatar May 24 '24 14:05 mrlreable

I was facing the same issue... SOLVED by following the steps below.

Editing the file /etc/docker/daemon.json to include:

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

and then running:

sudo systemctl daemon-reload
sudo systemctl restart docker

The error stops showing and we are able to see the GPUs inside the containers when we run:

sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

Prior to all this, we followed this tutorial (the NVIDIA Container Toolkit instructions). However, it did not require editing the file as described above.

sid-isq avatar May 28 '24 10:05 sid-isq

The previous solution did not solve my problem. My original daemon.json was:

{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}

I changed it to the one above and it did not solve the problem. I have already installed nvidia-container-toolkit. I am using Ubuntu 22.04.3 LTS.

EmanuelCastanho avatar May 28 '24 11:05 EmanuelCastanho

Happened to run across this thread, so will give my experience:

I had the same problem a couple of weeks ago, also with Ubuntu 22.04. I had Docker installed via snap, and that caused some of the paths to be different from what the NVIDIA tools expect. I'm sure it should be fixable for the snap installation as well, but for me the easiest solution was to remove Docker entirely and reinstall it via apt-get as instructed in the Docker guides. I tried to make it work with the snap version but quickly ran out of patience and decided to just reinstall Docker entirely.

So if you haven't already, you might want to check how your docker is installed.
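
If it helps, a quick way to tell how Docker was installed (assuming snapd and dpkg are available on the system):

snap list docker 2>/dev/null    # prints an entry if Docker came from snap
dpkg -l docker-ce 2>/dev/null   # prints an entry if Docker came from Docker's apt repository (Ubuntu's docker.io package would show under dpkg -l docker.io instead)
command -v docker               # /snap/bin/docker indicates the snap package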

tanelikor avatar May 28 '24 12:05 tanelikor

(Quoting tanelikor's comment above about reinstalling Docker via apt instead of snap.)

This solution worked for me. Thank you very much. I had Docker installed on the host machine (Windows Docker Desktop). I did not need to uninstall it; instead I installed Docker in WSL Ubuntu and kept Docker Desktop closed so WSL wouldn't use it, which solved the nvidia runtime problem. I don't know yet whether having both Docker installations will have other side effects, but the problem would also be solved by having Docker installed in WSL only.
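
For anyone in a similar WSL setup, the docker CLI can show which engine it is actually talking to. docker context ls and docker context show are standard commands; "desktop-linux" is the context name Docker Desktop typically registers:

docker context ls      # the active context (marked with *) shows which engine the CLI uses
docker context show    # prints just the active context name, e.g. "default" or "desktop-linux"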

caio-swdev avatar Jul 17 '24 14:07 caio-swdev

I faced the same problem with the NVIDIA Container Toolkit installed. What helped me was reconfiguring the runtime with sudo nvidia-ctk runtime configure --runtime=docker and then restarting the daemon with sudo systemctl restart docker.

It worked for me. Manual editing of /etc/docker/daemon.json and then restarting the daemon didn't, for some reason.
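
For reference, the full sequence from the comment above; nvidia-ctk runtime configure writes the nvidia runtime entry into /etc/docker/daemon.json itself, which avoids hand-editing mistakes:

# Let the toolkit write the runtime entry into /etc/docker/daemon.json, then restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker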

evstigneevnm avatar Jul 26 '24 10:07 evstigneevnm

I encountered the same problem and successfully solved it. My environment is "Win10 + WSL + Docker Desktop". The daemon.json file exists not only at /etc/docker/daemon.json but also in the Windows host directory: C:\Users\XXX\.docker\daemon.json.

So, edit the file C:\Users\XXX\.docker\daemon.json to include:

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

and then run in WSL:

sudo systemctl daemon-reload
sudo systemctl restart docker
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

I hope this can help you.

hipop avatar Aug 08 '24 07:08 hipop

I ran into this corner case several times:

env:

  • Ubuntu 24.04
  • Docker (CE): Docker version 27.0.2, build 912c1dd
  • CUDA: 12.x
  • Toolkit: 1_1.13.5-1
  • Installation: install-guide

Bad case

If one enables a non-root user to use Docker like this:

sudo groupadd docker
sudo gpasswd -a $USER docker

and manually configures /etc/docker/daemon.json like this:

{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}
  1. After sudo systemctl restart docker, one may:
  • Get the correct output with sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi;
  • Get the docker: Error response from daemon: unknown or invalid runtime name: nvidia. error message with docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi (non-root user).
  2. If one follows the Rootless mode section in the install-guide:
  • If one used sudo groupadd docker & sudo gpasswd -a $USER docker to set up rootless mode:
    • Then the first step will fail (there should be no $HOME/.config/docker/daemon.json or $HOME/.config/docker).
    • One should set up $HOME/.config/docker/daemon.json manually (copying /etc/docker/daemon.json might work);
    • After this, one may get the following error message when executing sudo nvidia-ctk config --set nvidia-container-cli.no-cgroups --in-place:
No help topic for 'config'

How to fix it

  1. Check your rootless configuration file. The manual uses $HOME/.config/docker/daemon.json, but you may find it somewhere else, e.g. $HOME/.docker/daemon.json. In my environment it is the latter.
  2. Restart the rootless dockerd: systemctl --user restart docker
  3. If sudo nvidia-ctk config --set nvidia-container-cli.no-cgroups --in-place fails, just add no-cgroups = true in the [nvidia-container-cli] section of /etc/nvidia-container-runtime/config.toml:
[nvidia-container-cli]
# some .....
no-cgroups = true

⚠️ Failing to set up Step 3 leads to a legacy OCI runtime error:

docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: mount error: failed to add device rules: unable to find any existing device filters attached to the cgroup: bpf_prog_query(BPF_CGROUP_DEVICE) failed: operation not permitted: unknown.
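
Assuming Step 3 is in place, a rough rootless verification looks like this (note: no sudo, since the rootless daemon runs as your user; commands as used elsewhere in this thread):

systemctl --user restart docker
docker info --format '{{json .Runtimes}}'                       # "nvidia" should be listed
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi   # should print the GPU table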

Hints for potential reasons

  1. sudo docker and docker use different Docker Root Dirs and configurations (see the sketch below);
  2. For rootless users, the configuration file path may differ from the one in the install-guide.
  3. If nvidia-ctk misbehaves, users should adjust /etc/nvidia-container-runtime/config.toml manually.
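
A quick way to see hint 1 in action is to compare what the two daemons report; DockerRootDir is a standard docker info field, and the example paths are typical defaults, not guaranteed:

sudo docker info --format '{{.DockerRootDir}}'   # rootful daemon, e.g. /var/lib/docker
docker info --format '{{.DockerRootDir}}'        # rootless daemon, e.g. ~/.local/share/docker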

yfyang86 avatar Aug 12 '24 14:08 yfyang86