zhoaxiaohu
Run container: `ctr -n k8s.io run --runc-binary /usr/bin/nvidia-container-runtime --rm --tty --env NVIDIA_VISIBLE_DEVICES=3 --env KGPU_MEM_DEV=23028 --env KGPU_SCHD_WEIGHT=0 --env KGPU_MEM_CONTAINER=10000 org.gpu.com/vgpu/video-worker-faas:1.0-release test_gpu_9 /bin/bash`
runc: https://github.com/opencontainers/runc, tag v1.3.0. If I comment out the code that generates the systemd device properties, so that systemd no longer takes over device permissions, the issue is resolved.
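To make the mechanism under discussion easier to check, here is a minimal sketch of how names in `/proc/devices` can be turned into systemd `DeviceAllow=` group entries of the form `char-<name>`. This is not runc's code. The function names `parse_proc_devices` and `device_allow_lines`, the `rwm` permissions, and the sample major numbers are assumptions made for illustration only.

```python
def parse_proc_devices(text):
    """Parse /proc/devices-style text into (kind, major, name) tuples."""
    entries = []
    kind = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "Character devices:":
            kind = "char"
        elif line == "Block devices:":
            kind = "block"
        elif line and kind is not None:
            parts = line.split(None, 1)
            # Expect "<major> <name>"; skip anything else.
            if len(parts) == 2 and parts[0].isdigit():
                entries.append((kind, int(parts[0]), parts[1]))
    return entries


def device_allow_lines(entries, name_prefix="nvidia", perms="rwm"):
    """Build DeviceAllow= lines for every device group whose name starts with name_prefix."""
    return [
        f"DeviceAllow={kind}-{name} {perms}"
        for kind, _major, name in entries
        if name.startswith(name_prefix)
    ]


if __name__ == "__main__":
    # Read the real file only when run directly on a Linux host.
    with open("/proc/devices") as f:
        for line in device_allow_lines(parse_proc_devices(f.read())):
            print(line)
```

Running this on the affected host and comparing its output with the contents of the unit's `50-DeviceAllow.conf` drop-in is one way to see whether the NVIDIA groups that the kernel reports are missing from the rules systemd received.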
```
# systemctl status cri-containerd-4bcc7895834609ab9bc2a04be1c3def1b7fde15fff8f1db7edabf09bd0e3121e.scope
● cri-containerd-4bcc7895834609ab9bc2a04be1c3def1b7fde15fff8f1db7edabf09bd0e3121e.scope - libcontainer container 4bcc7895834609ab9bc2a04be1c3def1b7fde15fff8f1db7edabf09bd>
     Loaded: loaded (/run/systemd/transient/cri-containerd-4bcc7895834609ab9bc2a04be1c3def1b7fde15fff8f1db7edabf09bd0e3121e.scope; transient)
  Transient: yes
    Drop-In: /run/systemd/transient/cri-containerd-4bcc7895834609ab9bc2a04be1c3def1b7fde15fff8f1db7edabf09bd0e3121e.scope.d
             └─50-CPUShares.conf, 50-DeviceAllow.conf, 50-DevicePolicy.conf
     Active: active (running) since Fri 2025-08-08 11:51:18 CST;...
```
My understanding is that for /dev/nvidia devices, either runc does not append the corresponding NVIDIA entries to the deviceAllowList after parsing /proc/devices, or the device rules it propagates are incorrect, resulting in systemd...