[Feature]: CDI generator for universal container support

Open Scapal opened this issue 10 months ago • 2 comments

Suggestion Description

To make it easier to expose AMD GPUs in containers, ROCm should embrace the Container Device Interface (CDI).

It should be quite straightforward to implement: it mostly comes down to listing the device nodes for each GPU in a YAML file, plus any hooks that are needed.

cdiVersion: 0.5.0
kind: amd.com/gpu
devices:
- name: "0"
  containerEdits:
    deviceNodes:
    - path: /dev/kfd
    - path: /dev/dri/renderD128
- name: "1"
  containerEdits:
    deviceNodes:
    - path: /dev/kfd
    - path: /dev/dri/renderD130
- name: "all"
  containerEdits:
    deviceNodes:
    - path: /dev/kfd
    - path: /dev/dri/renderD128
    - path: /dev/dri/renderD130

CDI is now supported by Docker (v25), containerd, CRI-O and Podman.

For example, with Docker or Podman you can then specify --device amd.com/gpu=1 instead of --device /dev/kfd --device /dev/dri/renderD130.
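To make the mapping concrete: CDI device names are fully qualified as `vendor/class=name`, so `amd.com/gpu=1` selects the device named "1" from the spec whose kind is `amd.com/gpu`. The sketch below shows that lookup against a small in-memory spec. It is not how any particular runtime implements it, and the helper names `parse_cdi_device` and `device_nodes_for` are hypothetical.

```python
def parse_cdi_device(qualified_name):
    """Split a fully qualified CDI name 'vendor/class=name' into its three parts."""
    kind, eq, name = qualified_name.partition("=")
    vendor, slash, device_class = kind.partition("/")
    if not (eq and slash and vendor and device_class and name):
        raise ValueError(f"not a fully qualified CDI device name: {qualified_name!r}")
    return vendor, device_class, name


def device_nodes_for(spec, qualified_name):
    """Return the device node paths a spec declares for the requested device."""
    vendor, device_class, name = parse_cdi_device(qualified_name)
    if spec["kind"] != f"{vendor}/{device_class}":
        raise LookupError(f"spec kind {spec['kind']!r} does not match {qualified_name!r}")
    for device in spec["devices"]:
        if device["name"] == name:
            return [node["path"] for node in device["containerEdits"]["deviceNodes"]]
    raise LookupError(f"no device named {name!r} in spec")


# Trimmed-down copy of the spec from this issue, as a Python dict.
EXAMPLE_SPEC = {
    "cdiVersion": "0.5.0",
    "kind": "amd.com/gpu",
    "devices": [
        {"name": "1", "containerEdits": {"deviceNodes": [
            {"path": "/dev/kfd"},
            {"path": "/dev/dri/renderD130"},
        ]}},
    ],
}
```

With this spec, `device_nodes_for(EXAMPLE_SPEC, "amd.com/gpu=1")` yields the same two paths that would otherwise be passed as separate `--device` flags.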

Operating System

Linux

GPU

No response

ROCm Component

No response

Scapal avatar Apr 17 '24 11:04 Scapal