
NVIDIA GPU Operator - Cluster policy

Open jnirmalraj opened this issue 4 months ago • 1 comments

Hi Team,

We are running an OpenShift 4.16 cluster at our company and are using the certified operator catalog (NVIDIA GPU Operator 25.3.1).

When we configure the custom ClusterPolicy (gpu-cluster-policy) so that dcgmExporter uses a custom ConfigMap, the nvidia-dcgm-exporter pod gets stuck in the init container.

Can you help us resolve this issue?

Note:

dcgmExporter:
  config:
    name: dcp-metrics-included   (ConfigMap that contains dcp-metrics-included.csv)
  enabled: true

When we remove the name (dcp-metrics-included), the nvidia-dcgm-exporter pod works fine.

Thanks, Nirmalraj

jnirmalraj, Aug 21 '25 21:08

@jnirmalraj Can you please provide some details about the dcgm-exporter pod, such as the output of describing the pod and its logs?
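
Something along these lines should collect that information. This is only a sketch: the namespace `nvidia-gpu-operator` and the label selector `app=nvidia-dcgm-exporter` are assumptions about a typical install, and the helper name `collect_dcgm_diagnostics` is made up for illustration, so adjust them to match your cluster. The ConfigMap name is the one from the issue description.

```shell
# Assumed namespace for the operator on OpenShift; override with NS=... if yours differs.
NS="${NS:-nvidia-gpu-operator}"
# Assumed pod label; check with: oc get pods -n "$NS" --show-labels
SELECTOR="${SELECTOR:-app=nvidia-dcgm-exporter}"

# Hypothetical helper that gathers pod status, events, logs, and the custom ConfigMap.
collect_dcgm_diagnostics() {
  # Pod status, including which init container the pod is waiting on
  oc get pods -n "$NS" -l "$SELECTOR" -o wide
  # Events and container states for the matching pods
  oc describe pods -n "$NS" -l "$SELECTOR"
  # Logs per pod; containers that have not started yet return an error, so keep going
  for pod in $(oc get pods -n "$NS" -l "$SELECTOR" -o name); do
    oc logs -n "$NS" "$pod" --all-containers --prefix || true
  done
  # The custom ConfigMap referenced by dcgmExporter.config.name
  oc get configmap dcp-metrics-included -n "$NS" -o yaml
}
```

Running `collect_dcgm_diagnostics > dcgm-diag.txt 2>&1` and attaching the resulting file here would be enough to start with.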

visheshtanksale, Nov 14 '25 22:11