Why is the GPU drain state not included in dcgm-exporter?

Open lordofire opened this issue 1 year ago • 0 comments

Hi there,

I'm curious why the GPU drain state, as reported by the following command, is not included in dcgm-exporter:

Linux:~$ sudo nvidia-smi drain -p 0000:3f:00.0 -q
The current drain state of GPU 00000000:3F:00.0 is: not draining.

It is also absent from the dcgm-api-field-ids list, which I believe is why it cannot be exported here. Are there any plans to add support for it?

Specifically, our use case is to avoid sending alerts when a problematic GPU has already been administratively drained and is waiting to be physically replaced. Any other official way to identify a drained GPU would therefore also fit our requirements.
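
To make the use case concrete, here is a rough sketch of the kind of workaround we could run outside dcgm-exporter: it parses the `nvidia-smi drain -q` output shown above and renders a Prometheus-style line that an alert rule could use for suppression. The helper names and the metric name `gpu_drain_state` are hypothetical, and I am assuming a drained GPU is reported as "draining" (only the "not draining" wording is confirmed by the output above).

```python
import re
import subprocess

# Matches the line format shown in the nvidia-smi output above, e.g.
# "The current drain state of GPU 00000000:3F:00.0 is: not draining."
_DRAIN_RE = re.compile(
    r"drain state of GPU\s+(\S+)\s+is:\s*(.+?)\.?\s*$", re.MULTILINE
)


def parse_drain_state(output):
    """Return (pci_bus_id, state) from `nvidia-smi drain -q` output, or None."""
    match = _DRAIN_RE.search(output)
    if match is None:
        return None
    return match.group(1), match.group(2).strip().lower()


def is_drained(output):
    """True only if the reported state is exactly "draining".

    Assumption: a drained GPU is reported as "draining". Anything else,
    including unparseable output, counts as not drained so that alerts
    are never suppressed by mistake.
    """
    parsed = parse_drain_state(output)
    return parsed is not None and parsed[1] == "draining"


def query_drain_output(pci_bus_id):
    """Run the query from the post; it typically needs root privileges."""
    result = subprocess.run(
        ["nvidia-smi", "drain", "-p", pci_bus_id, "-q"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def to_textfile_metric(pci_bus_id, drained):
    """Render one exposition line; `gpu_drain_state` is a hypothetical name."""
    return 'gpu_drain_state{pci_bus_id="%s"} %d' % (pci_bus_id, int(drained))
```

For example, `to_textfile_metric("00000000:3F:00.0", is_drained(query_drain_output("0000:3f:00.0")))` could be written to a file that a separate collector picks up, and alert rules could then exclude GPUs whose value is 1. This is only a stopgap; a field exposed by DCGM itself would be much cleaner.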

Thanks

lordofire, Dec 01 '23 01:12