Insufficient permission error for Nvidia container runtime using Podman v5

Open doganulus opened this issue 1 year ago • 2 comments

Issue Description

Running nvidia-smi inside the container fails with Failed to initialize NVML: Insufficient Permissions when the image is built from the following Containerfile:

FROM docker.io/library/ubuntu:22.04 as non-root-test

ARG USER=user
ARG USERGROUP=${USER}
ARG UID=1000
ARG GID=${UID}

RUN groupadd ${USERGROUP} -g ${GID} && useradd -ms /bin/bash ${USER} -g ${USERGROUP} -u ${UID} -G video

USER ${USER}
WORKDIR ${HOME}

The NVIDIA container runtime has been set up using CDI, as NVIDIA recommends.

Steps to reproduce the issue

  1. Build the Containerfile with --tag non-root-test
  2. Execute the following command using podman v5.1.1
podman run --gpus=all --rm non-root-test:latest nvidia-smi -L

Describe the results you received

This command exits with Failed to initialize NVML: Insufficient Permissions.

Describe the results you expected

The command should list the GPU information.

The same container works as expected with podman v4.6.2 on another machine:

podman run --device nvidia.com/gpu=all --rm non-root-test:latest nvidia-smi -L

podman info output

host:
  arch: amd64
  buildahVersion: 1.36.0
  cgroupControllers:
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.12-1.1.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: unknown'
  cpuUtilization:
    idlePercent: 91.28
    systemPercent: 1.88
    userPercent: 6.84
  cpus: 12
  databaseBackend: sqlite
  distribution:
    distribution: opensuse-tumbleweed
    version: "20240613"
  eventLogger: journald
  freeLocks: 2030
  hostname: workstation
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.9.3-1-default
  linkmode: dynamic
  logDriver: journald
  memFree: 7649325056
  memTotal: 33220648960
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.11.0-1.1.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.11.0
    package: netavark-1.11.0-1.1.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.11.0
  ociRuntime:
    name: crun
    package: crun-1.15-1.1.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.15
      commit: e6eacaf4034e84185fd8780ac9262bbf57082278
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-20240523.765eb0b-1.1.x86_64
    version: |
      pasta 20240523.765eb0b-1.1
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: false
    path: /run/user/1000/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /etc/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.3.1-1.1.x86_64
    version: |-
      slirp4netns version 1.3.1
      commit: unknown
      libslirp: 4.8.0
      SLIRP_CONFIG_VERSION_MAX: 5
      libseccomp: 2.5.5
  swapFree: 1245118464
  swapTotal: 2148204544
  uptime: 28h 13m 49.00s (Approximately 1.17 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.opensuse.org
  - registry.suse.com
  - docker.io
store:
  configFile: /home/dogan/.config/containers/storage.conf
  containerStore:
    number: 5
    paused: 0
    running: 0
    stopped: 5
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/dogan/.local/share/containers/storage
  graphRootAllocated: 983351140352
  graphRootUsed: 335927275520
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 316
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/dogan/.local/share/containers/storage/volumes
version:
  APIVersion: 5.1.1
  Built: 1717565069
  BuiltTime: Wed Jun  5 08:24:29 2024
  GitCommit: ""
  GoVersion: go1.21.11
  Os: linux
  OsArch: linux/amd64
  Version: 5.1.1

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

Additional information

  • Root container user works as expected.
  • Using --userns=keep-id works as expected, but this was not required with Podman 4.6.2.
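
For reference, the uidmap in the podman info output above implies that, without --userns=keep-id, the image's non-root user (UID 1000) is not the host user but a subordinate UID on the host. The sketch below only illustrates that arithmetic, assuming the default rootless mapping is in effect for the container. The function name map_id is hypothetical, and whether this mapping causes the NVML error is not confirmed in this thread.

```shell
#!/bin/sh
# map_id CONTAINER_UID: translate a container UID to a host UID using the
# two uidmap ranges copied from the podman info output above.
# (map_id is a hypothetical helper, not part of podman.)
map_id() {
  cid=$1
  # Each input line: container_start host_start size
  while read -r cstart hstart size; do
    if [ "$cid" -ge "$cstart" ] && [ "$cid" -lt $((cstart + size)) ]; then
      echo $((hstart + cid - cstart))
      return 0
    fi
  done <<EOF
0 1000 1
1 100000 65536
EOF
  # No range covers this container UID.
  echo "unmapped"
  return 1
}

map_id 0      # container root: prints 1000
map_id 1000   # the image's non-root user: prints 100999
```

With --userns=keep-id, Podman instead maps the invoking user's UID and GID to the same values inside the container, so container UID 1000 would correspond to host UID 1000 on this machine.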

doganulus, Jun 17 '24 12:06

I don't see any change from 4.6.2 that could explain a different behavior. Is there a way to try the two versions on the same machine? Is there any customization in /etc/containers/containers.conf on the two machines?
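
One way to approach the configuration question is to compare the two files with comments and blank lines stripped. This is a minimal sketch, assuming each machine's /etc/containers/containers.conf has been copied to a local file. The function names and the file names in the usage line are hypothetical.

```shell
#!/bin/sh
# strip_conf FILE: print FILE without comment lines and blank lines.
strip_conf() {
  # grep exits non-zero when every line is filtered out; treat that as success.
  grep -v -E '^[[:space:]]*(#|$)' "$1" || true
}

# compare_conf A B: show a unified diff of the effective settings in A and B.
# Returns 0 when they match and non-zero when they differ.
compare_conf() {
  a=$(mktemp)
  b=$(mktemp)
  strip_conf "$1" > "$a"
  strip_conf "$2" > "$b"
  status=0
  diff -u "$a" "$b" || status=$?
  rm -f "$a" "$b"
  return "$status"
}

# Usage (hypothetical file names, one copy per machine):
#   compare_conf podman-4.6.2-containers.conf podman-5.1.1-containers.conf
```

An empty diff would suggest the difference lies elsewhere, for example in the host setup rather than in containers.conf.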

giuseppe, Jun 18 '24 09:06

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot], Jul 19 '24 00:07

As there was no reply, closing.

Luap99, Oct 29 '24 17:10