gpu-operator
NVIDIA GPU Operator - Cluster policy
Hi Team,
Our company runs an OpenShift 4.16 cluster. We are using the NVIDIA GPU Operator 25.3.1 from the certified operator catalog.
When I try to configure a custom ConfigMap for dcgmExporter in the custom ClusterPolicy (gpu-cluster-policy), the nvidia-dcgm-exporter pod gets stuck in the init container.
Can you help us resolve the issue?
Note:
dcgmExporter:
  config:
    name: dcp-metrics-included
  enabled: true
(The dcp-metrics-included ConfigMap contains dcp-metrics-included.csv.)
When I remove the name (dcp-metrics-included), the nvidia-dcgm-exporter pod works fine.
Thanks, Nirmalraj
@jnirmalraj Can you please provide some details about the dcgm-exporter pod, such as the output of describing the pod and its logs?
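Something along these lines should collect the requested details. This is only a sketch: the `nvidia-gpu-operator` namespace and the `app=nvidia-dcgm-exporter` label are assumptions about a typical install, and `collect_dcgm_debug` is a hypothetical helper name, so adjust them to match your cluster.

```shell
# Assumption: the operator is installed in this namespace and the exporter
# pods carry this label. Change both if your cluster differs.
NAMESPACE="nvidia-gpu-operator"
SELECTOR="app=nvidia-dcgm-exporter"

# Hypothetical helper that gathers pod state, events, and logs in one go.
collect_dcgm_debug() {
  # Pod status, showing whether the pod is still in an Init phase
  oc get pods -n "$NAMESPACE" -l "$SELECTOR" -o wide
  # Container states and events (for example, volume mount failures)
  oc describe pods -n "$NAMESPACE" -l "$SELECTOR"
  # Recent logs from every container in the matching pods
  oc logs -n "$NAMESPACE" -l "$SELECTOR" --all-containers --prefix --tail=200
  # Check that the ConfigMap named in the ClusterPolicy exists in this namespace
  oc get configmap dcp-metrics-included -n "$NAMESPACE" -o yaml
}
```

Run `collect_dcgm_debug > dcgm-debug.txt 2>&1` and attach the file. The last command is there because a pod that references a ConfigMap which is missing from its namespace typically cannot mount it and never gets past initialization.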