[Feature]: expose gpu model name as resource

Open baddoub opened this issue 7 months ago • 3 comments

Suggestion Description

Hello AMD team,

We would like to request an enhancement to the device plugin's StrategyMixed feature to allow GPU resources to be exposed using their model names (e.g., amd.com/mi250x, amd.com/mi210x) instead of the generic amd.com/gpu.

We’re aware of the existing resource_naming_strategy option, but it currently exposes partition types rather than the actual GPU model names.
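For illustration only, the requested mapping from a GPU model to a resource name might look roughly like the Go sketch below. The helper `modelResourceName`, the assumption that the last word of the product name is the model token, and the fallback to `amd.com/gpu` are all hypothetical; none of this is taken from the plugin's code.

```go
package main

import (
	"fmt"
	"strings"
)

// genericResource is the resource name mentioned in this issue.
const genericResource = "amd.com/gpu"

// modelResourceName is a hypothetical helper (not part of the plugin).
// It turns a product name such as "AMD Instinct MI250X" into a
// resource name such as "amd.com/mi250x".
func modelResourceName(productName string) string {
	fields := strings.Fields(strings.ToLower(productName))
	if len(fields) == 0 {
		return genericResource // nothing to derive a model from
	}
	// Assumption: the last word of the product name is the model token.
	token := fields[len(fields)-1]

	// Keep only lowercase letters, digits, and '-' so the result stays
	// a plausible resource name segment.
	var b strings.Builder
	for _, r := range token {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') || r == '-' {
			b.WriteRune(r)
		}
	}
	name := strings.Trim(b.String(), "-")
	if name == "" {
		return genericResource
	}
	return "amd.com/" + name
}

func main() {
	fmt.Println(modelResourceName("AMD Instinct MI250X")) // amd.com/mi250x
	fmt.Println(modelResourceName(""))                    // amd.com/gpu
}
```

A real implementation would need to read the model from whatever the plugin already uses to discover devices, which this sketch does not attempt.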

Is there any plan to extend the plugin to support this level of resource granularity? FYI, the NVIDIA device plugin has the same capability.

If this aligns with your roadmap, we’d be happy to help implement it.

Thanks for your help.

Operating System

No response

GPU

No response

ROCm Component

Device plugin

baddoub, May 17 '25 08:05

Good issue.

Monokaix, May 20 '25 09:05

Good issue +1

zhoushuke, Sep 03 '25 08:09

Hi @baddoub, thanks for raising this feature request. This is a good idea for clusters that have a mix of different GPU models.

Currently we don't have this functionality because:

  • most customers use amd.com/gpu, and some occasionally need GPU-partitioning-specific names
  • many ecosystem partners and projects look for amd.com/gpu by default as the GPU resource name in their integrations.

A PR is welcome to enable this feature. I'd suggest keeping amd.com/gpu as the default resource name and letting users optionally turn on the feature you requested through resource_naming_strategy.

yansun1996, Dec 12 '25 08:12
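As a rough sketch of the suggestion above (keep amd.com/gpu as the default, opt in to model names through the naming strategy), the selection logic might look something like this in Go. The strategy value `"model"`, the function `resourceFor`, and the treatment of an empty strategy as the default are assumptions made for illustration; they are not existing plugin options.

```go
package main

import "fmt"

// defaultResource is the name most users and integrations expect.
const defaultResource = "amd.com/gpu"

// resourceFor picks the resource name to advertise for one GPU.
// The "model" strategy is hypothetical; an empty strategy is treated
// here as "use the default".
func resourceFor(strategy, model string) (string, error) {
	switch strategy {
	case "":
		// No opt-in: keep the generic name so existing integrations work.
		return defaultResource, nil
	case "model":
		if model == "" {
			// Unknown model: fall back rather than advertise "amd.com/".
			return defaultResource, nil
		}
		return "amd.com/" + model, nil
	default:
		return "", fmt.Errorf("unknown resource naming strategy %q", strategy)
	}
}

func main() {
	for _, s := range []string{"", "model", "bogus"} {
		name, err := resourceFor(s, "mi250x")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(name)
	}
}
```

Keeping the default branch unchanged means clusters that never set the option would continue to see only amd.com/gpu.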