Insufficient permission error for Nvidia container runtime using Podman v5
Issue Description
Running nvidia-smi inside the container fails with Failed to initialize NVML: Insufficient Permissions when the image is built from the following Containerfile:
FROM docker.io/library/ubuntu:22.04 as non-root-test
ARG USER=user
ARG USERGROUP=${USER}
ARG UID=1000
ARG GID=${UID}
RUN groupadd ${USERGROUP} -g ${GID} && useradd -ms /bin/bash ${USER} -g ${USERGROUP} -u ${UID} -G video
USER ${USER}
WORKDIR ${HOME}
Nvidia container runtime has been set up using the CDI, as Nvidia suggests.
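For anyone reproducing this, a minimal sketch of how the CDI setup could be sanity-checked before running the container. The directories /etc/cdi and /var/run/cdi are assumed to be the CDI spec locations on this machine, and find_cdi_specs is a hypothetical helper name, not part of any tool. The nvidia-ctk step only runs if the NVIDIA Container Toolkit CLI is installed.

```shell
#!/bin/sh
# Assumption: CDI specs live in the usual default directories.
cdi_spec_dirs="/etc/cdi /var/run/cdi"

# find_cdi_specs DIR...: print every YAML/JSON spec file directly inside
# the given directories, skipping directories that do not exist.
find_cdi_specs() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" -maxdepth 1 -type f \
            \( -name '*.yaml' -o -name '*.yml' -o -name '*.json' \)
    done
}

# Word splitting of the directory list is intentional here.
find_cdi_specs $cdi_spec_dirs

# List the CDI device names the toolkit can resolve, if the CLI is present.
if command -v nvidia-ctk >/dev/null 2>&1; then
    nvidia-ctk cdi list || true
fi
```

If no spec file is printed, the device name nvidia.com/gpu=all used later in this report would probably not resolve either.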
Steps to reproduce the issue
- Build the Containerfile with --tag non-root-test
- Execute the following command using podman v5.1.1:
podman run --gpus=all --rm non-root-test:latest nvidia-smi -L
Describe the results you received
This command exits with Failed to initialize NVML: Insufficient Permissions.
Describe the results you expected
The command should print the video card information.
The same container works as expected with podman v4.6.2 on another machine:
podman run --device nvidia.com/gpu=all --rm non-root-test:latest nvidia-smi -L
podman info output
host:
  arch: amd64
  buildahVersion: 1.36.0
  cgroupControllers:
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.12-1.1.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: unknown'
  cpuUtilization:
    idlePercent: 91.28
    systemPercent: 1.88
    userPercent: 6.84
  cpus: 12
  databaseBackend: sqlite
  distribution:
    distribution: opensuse-tumbleweed
    version: "20240613"
  eventLogger: journald
  freeLocks: 2030
  hostname: workstation
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.9.3-1-default
  linkmode: dynamic
  logDriver: journald
  memFree: 7649325056
  memTotal: 33220648960
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.11.0-1.1.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.11.0
    package: netavark-1.11.0-1.1.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.11.0
  ociRuntime:
    name: crun
    package: crun-1.15-1.1.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.15
      commit: e6eacaf4034e84185fd8780ac9262bbf57082278
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-20240523.765eb0b-1.1.x86_64
    version: |
      pasta 20240523.765eb0b-1.1
      Copyright Red Hat
      GNU General Public License, version 2 or later
      <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: false
    path: /run/user/1000/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /etc/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.3.1-1.1.x86_64
    version: |-
      slirp4netns version 1.3.1
      commit: unknown
      libslirp: 4.8.0
      SLIRP_CONFIG_VERSION_MAX: 5
      libseccomp: 2.5.5
  swapFree: 1245118464
  swapTotal: 2148204544
  uptime: 28h 13m 49.00s (Approximately 1.17 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.opensuse.org
  - registry.suse.com
  - docker.io
store:
  configFile: /home/dogan/.config/containers/storage.conf
  containerStore:
    number: 5
    paused: 0
    running: 0
    stopped: 5
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/dogan/.local/share/containers/storage
  graphRootAllocated: 983351140352
  graphRootUsed: 335927275520
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 316
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/dogan/.local/share/containers/storage/volumes
version:
  APIVersion: 5.1.1
  Built: 1717565069
  BuiltTime: Wed Jun 5 08:24:29 2024
  GitCommit: ""
  GoVersion: go1.21.11
  Os: linux
  OsArch: linux/amd64
  Version: 5.1.1
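As a side note on the idMappings section of the podman info output above, here is a small worked sketch of which host ID a container ID ends up as under that mapping. map_id is a hypothetical helper written only for this illustration, and the two ranges are hard-coded from the output above. Whether this mapping is what triggers the NVML error is an assumption; nothing in this report confirms it.

```shell
#!/bin/sh
# map_id CONTAINER_ID: translate a container UID/GID to the host ID using
# the two ranges reported under idMappings:
#   container 0        -> host 1000              (size 1)
#   container 1..65536 -> host 100000..165535    (size 65536)
map_id() {
    cid=$1
    if [ "$cid" -eq 0 ]; then
        echo 1000
    elif [ "$cid" -ge 1 ] && [ "$cid" -le 65536 ]; then
        echo $((100000 + cid - 1))
    else
        echo unmapped
    fi
}

map_id 0      # prints 1000   (container root is the invoking host user)
map_id 1000   # prints 100999 (the USER from the Containerfile, UID=1000)
```

So with the default rootless mapping, the non-root user from the Containerfile is not the invoking host user but a subordinate ID, while container root is. That would at least be consistent with the root container user working.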
Podman in a container
No
Privileged Or Rootless
Rootless
Upstream Latest Release
Yes
Additional environment details
Additional information
- Root container user works as expected.
- Using --userns=keep-id works as expected. However, this was not required with Podman version 4.6.2.
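A minimal sketch for comparing the two cases side by side on the affected machine. run_gpu_check, IMAGE and PODMAN are names made up for this sketch; the podman flags are the ones already used in this report. The last command, which prints the container user and the owners of the device nodes, is only a suggested diagnostic, and the /dev/nvidia* path is an assumption about where the devices appear.

```shell
#!/bin/sh
# Image tag from the reproduction steps; override via the environment.
IMAGE=${IMAGE:-non-root-test:latest}

# run_gpu_check [EXTRA_FLAGS...]: run nvidia-smi -L in the test image.
# Setting PODMAN=echo turns this into a dry run that prints the arguments.
run_gpu_check() {
    "${PODMAN:-podman}" run "$@" --gpus=all --rm "$IMAGE" nvidia-smi -L
}

# Only touch podman if it is installed and the image has been built.
if command -v podman >/dev/null 2>&1 && podman image exists "$IMAGE"; then
    run_gpu_check || echo "default user namespace: nvidia-smi failed"
    run_gpu_check --userns=keep-id || echo "keep-id: nvidia-smi failed"
    # Show the container user and who owns the device nodes.
    podman run --userns=keep-id --gpus=all --rm "$IMAGE" \
        sh -c 'id; ls -l /dev/nvidia*' || true
fi
```

Running the same script on the 4.6.2 machine (with --device nvidia.com/gpu=all in place of --gpus=all, as in the command above) would show whether the user namespace setup differs between the two hosts.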
I don't see any change from 4.6.2 that could explain a different behavior. Is there a way to try the two versions on the same machine? Is there any customization in /etc/containers/containers.conf on the two machines?
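To make that comparison easier, here is a rough sketch that prints only the active (non-comment, non-blank) lines of the containers.conf files on each machine. show_active_settings is a hypothetical helper; the three paths are the usual system, distribution and per-user locations and are assumed to apply here. Drop-in directories are not covered.

```shell
#!/bin/sh
# show_active_settings FILE...: for each readable file, print a header line
# followed by every line that is neither blank nor a comment.
show_active_settings() {
    for conf in "$@"; do
        [ -r "$conf" ] || continue
        echo "== $conf"
        grep -Ev '^[[:space:]]*(#|$)' "$conf" || true
    done
}

show_active_settings \
    /usr/share/containers/containers.conf \
    /etc/containers/containers.conf \
    "${XDG_CONFIG_HOME:-$HOME/.config}/containers/containers.conf"
```

Diffing the output from the two machines should show quickly whether a setting such as the default user namespace mode differs.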
A friendly reminder that this issue had no activity for 30 days.
As there was no reply, closing