FFmpeg NVENC fails in pods unless the `/dev/nvidia#` index matches the GPU index from `nvidia-smi` (with `deviceListStrategy: volume-mounts`)
🐛 Describe the bug
When deploying GPU-bound pods using the NVIDIA device plugin (`nvidia-device-plugin` Helm chart v0.17.1), FFmpeg NVENC fails inside the container unless the assigned GPU is mounted at the path `/dev/nvidiaN`, where N matches its index in `nvidia-smi`.

This issue occurs only when using `deviceListStrategy: volume-mounts`, which is required for secure GPU isolation in our multi-tenant environment. Using `envvar` is not an option, as users can override `NVIDIA_VISIBLE_DEVICES` in untrusted Docker images.

As a result, only pods where the assigned GPU's `nvidia-smi` index matches the container path `/dev/nvidiaN` succeed. All others fail with "unsupported device" errors in FFmpeg.
🛠️ Helm values

```yaml
deviceIDStrategy: uuid
deviceListStrategy: volume-mounts
runtimeClassName: nvidia
```
🔧 Root cause

NVENC appears to rely on the assumption that:

`/dev/nvidiaN` <-> GPU with index N from `nvidia-smi`

If this alignment is broken (e.g. the GPU with `index: 0` is mounted as `/dev/nvidia5`), the encoder fails:

```
[h264_nvenc @ 0x637317ea8e80] OpenEncodeSessionEx failed: unsupported device (2): (no details)
[h264_nvenc @ 0x637317ea8e80] No capable devices found
```
This behavior is reproducible and consistent across all tested environments.
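To make the assumed mapping visible inside a pod, a rough sketch along these lines prints the three views side by side. The `nvidia-smi` query fields and the `/proc` path are the same ones referenced elsewhere in this report; the script itself is only illustrative:

```bash
#!/usr/bin/env bash
# Illustrative sketch: show the NVML view, the driver's Device Minor,
# and the device nodes actually mounted in this container.

echo "== nvidia-smi view (index, UUID, PCI bus id) =="
nvidia-smi --query-gpu=index,gpu_uuid,pci.bus_id --format=csv,noheader

echo "== driver view (Device Minor per GPU) =="
grep -H "Device Minor" /proc/driver/nvidia/gpus/*/information

echo "== device nodes visible in this container =="
ls -l /dev/nvidia[0-9]*
```

In the failing example below, the first command reports index `0` while the only device node present is `/dev/nvidia5`.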
🖥️ Host configuration

- 6× NVIDIA RTX 4090 (UUID-assigned, known-good hardware)
- Host `/dev/nvidia[0-5]` layout matches `nvidia-smi` output
- `nvidia-smi`, CUDA, and NVENC work fine directly on the host
- Issue only occurs inside the container when the mount path/index diverge from `nvidia-smi`
✅ Working pod example

- GPU UUID: `GPU-46b5dd79-...`
- `nvidia-smi` index: `0`
- Mounted as: `/dev/nvidia0`
- ✅ `ffmpeg -c:v h264_nvenc` works
❌ Failing pod example

- GPU UUID: `GPU-dada647b-...`
- `nvidia-smi` index: `0`
- Mounted as: `/dev/nvidia5`
- ❌ `ffmpeg -c:v h264_nvenc` fails with:

```
[h264_nvenc @ 0x637317ea8e80] OpenEncodeSessionEx failed: unsupported device (2): (no details)
[h264_nvenc @ 0x637317ea8e80] No capable devices found
```
🔍 Additional observations

- All expected character devices (`nvidia[0-9]`, `nvidiactl`, `uvm`, etc.) are present inside the pod.
- The mounted `/dev/nvidiaX` files have the correct major/minor numbers.
- The issue depends only on the alignment between the `nvidia-smi` index and the mounted path.
- The `Device Minor:` in `/proc/driver/nvidia/gpus/.../information` does not determine NVENC success; only the mount path does.
✅ Expected behavior

All GPUs assigned to a container should be fully usable via NVENC, regardless of physical or logical index, as long as the device is properly mounted.

The device plugin should ensure that `/dev/nvidiaN` always maps to the GPU with `nvidia-smi` index N, or NVENC workloads will fail.
📋 Environment
- Host OS: Ubuntu 22.04
- GPUs: 6ร NVIDIA RTX 4090
- Container runtime: containerd
- Kubernetes: v1.32.x (K3s)
- NVIDIA Driver: 570.133.20 (also tested with 575)
- NVIDIA device plugin: v0.17.1 (Helm)
- nvidia-container-runtime: 3.14.0-1
- nvidia-container-toolkit: 1.17.6-1
- NVIDIA_DRIVER_CAPABILITIES: `compute,video,utility,graphics,display` (set in the deployment image)
- FFmpeg: NVENC-enabled build (confirmed working directly on host)
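For completeness, the same facts can be confirmed from inside a pod with a few standard commands; nothing here is specific to our setup:

```bash
# Sketch: confirm driver version, NVENC-capable encoders, and driver
# capabilities from inside the pod.
nvidia-smi --query-gpu=driver_version,name --format=csv,noheader
ffmpeg -hide_banner -encoders | grep -i nvenc
echo "NVIDIA_DRIVER_CAPABILITIES=${NVIDIA_DRIVER_CAPABILITIES:-unset}"
```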
🧪 Steps to reproduce

1. Deploy multiple pods with:

   ```yaml
   resources:
     limits:
       nvidia.com/gpu: 1
   ```

2. Inside each pod, run:

   ```bash
   nvidia-smi --query-gpu=gpu_uuid,index,name --format=csv,noheader
   ls -l /dev/nvidia[0-9]
   ffmpeg -hide_banner -f lavfi -i testsrc=duration=3:size=1280x720:rate=30 -c:v h264_nvenc -y /tmp/test.mp4
   ```

3. Observe:
   - If `/dev/nvidiaN` matches the `index: N` reported by `nvidia-smi`, encoding works.
   - If not, FFmpeg fails.
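For step 1, a pod spec along these lines is enough to reproduce; the pod and image names are placeholders (any NVENC-enabled FFmpeg image will do), and `runtimeClassName: nvidia` matches the Helm values above:

```bash
# Sketch of a repro pod; metadata.name and image are placeholders.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: nvenc-repro
spec:
  runtimeClassName: nvidia
  restartPolicy: Never
  containers:
    - name: ffmpeg
      image: my-registry/ffmpeg-nvenc:latest  # placeholder NVENC-enabled build
      command: ["sleep", "infinity"]
      resources:
        limits:
          nvidia.com/gpu: 1
EOF
```

Deploying several copies of this pod (under different names) is what surfaces the index/path divergence, since different pods are handed different physical GPUs.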
💡 Suggested improvement

Ensure the device plugin mounts GPU devices inside the pod at the `/dev/nvidiaN` path, where N is the GPU's index as reported by `nvidia-smi`.
This will restore NVENC compatibility and likely benefit other workloads that rely on this path/index alignment.
🚫 Partial workaround
None identified.
Detecting the mismatch inside user space (via `nvidia-smi` + `ls -l /dev/nvidia*`) lets us fail fast, but does not resolve the root problem: NVENC will still fail to initialize.
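A minimal sketch of that fail-fast check, using only the commands mentioned above; it detects the problem but does not make NVENC work:

```bash
#!/usr/bin/env bash
# Fail-fast sketch: for every GPU index reported by nvidia-smi inside the
# pod, require a matching /dev/nvidiaN device node. Detection only.
set -euo pipefail

status=0
while IFS=',' read -r idx uuid; do
  idx="${idx// /}"    # strip CSV padding
  uuid="${uuid// /}"
  if [ ! -e "/dev/nvidia${idx}" ]; then
    echo "MISMATCH: ${uuid} has nvidia-smi index ${idx}, but /dev/nvidia${idx} is not mounted" >&2
    status=1
  fi
done < <(nvidia-smi --query-gpu=index,gpu_uuid --format=csv,noheader)

exit "${status}"
```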