Why are GPU resources not shown on the dashboard?
I want to know why GPU resource information is not shown on the dashboard. When I use the kubectl describe nodes CLI I get detailed GPU information, but I don't see any GPU information on the dashboard. Is this planned?
Environment
Dashboard version:
Kubernetes version:
Operating system:
Node.js version:
Go version:
Steps to reproduce
Observed result
Expected result
Comments
Is this planned?
Yes it is.
@wjdfx I assume that the GPU resource would be some extra information on the node details view. I have never tried this setup, so I don't know what it looks like.
@maciaszczykm
Do you mean that it is planned to add this information sometime in the future, or that the plan is not to show this information? If so, why?
Some relevant docs: https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/
The key (under resources and limits) is alpha.kubernetes.io/nvidia-gpu (also alpha.kubernetes.io/nvidia-gpu-name can be specified with the --node-labels='alpha.kubernetes.io/nvidia-gpu-name=xxx' kubelet option).
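For reference, a pod requesting a GPU under that alpha key might look like the sketch below. All names and the image are placeholders, and note that this alpha key was later superseded by device-plugin resources such as nvidia.com/gpu:

```yaml
# Illustrative only; pod name, container name, and image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-demo
spec:
  containers:
    - name: cuda-container
      image: nvidia/cuda:10.0-base   # placeholder image
      resources:
        limits:
          alpha.kubernetes.io/nvidia-gpu: 1   # alpha-era GPU resource key
  nodeSelector:
    # Assumes the node was labeled via the --node-labels kubelet option above.
    alpha.kubernetes.io/nvidia-gpu-name: tesla-k80
```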
This is an alpha feature at the moment. It might make sense to wait until it enters beta at least?

This is an alpha feature at the moment. It might make sense to wait until it enters beta at least?
Yes, we should wait for at least beta.
Any follow-up on showing GPU stats in the dashboard?
We have been focused on more important topics lately, such as security and the login mechanism. This feature is rather low on our priority list for now. No ETA.
Any update?
Any update on showing GPU stats on Kubernetes Dashboard?
Is there any update on supporting GPU info in the dashboard?
any update?
This has low priority for us at the moment. If you are willing to contribute then let us know.
It would be great to have at least an indication that a pod has limits/requests set on any device that is compatible with the device plugin framework (so not only GPUs), and how much of that resource is requested. For example, if a node has 4 TPUs and there are 3 pods each consuming one TPU, it should be visible somewhere, ideally right next to CPU/memory. It would really help in debugging scheduling issues, if nothing else. At this point in time, devices exposed by the device plugin framework are treated like third-class citizens. CPU/memory is not enough.
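To illustrate what the dashboard would need to surface: a pod consuming a device-plugin resource declares it as an extended resource in its spec, and the dashboard could sum such requests against the node's allocatable total. The resource name example.com/tpu below is made up for the sketch:

```yaml
# Illustrative: a pod requesting one unit of a device-plugin resource.
# "example.com/tpu" is a hypothetical extended-resource name.
apiVersion: v1
kind: Pod
metadata:
  name: tpu-consumer
spec:
  containers:
    - name: worker
      image: busybox   # placeholder image
      resources:
        requests:
          example.com/tpu: 1
        limits:
          example.com/tpu: 1   # extended resources require requests == limits
```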
I am not a contributor, but I am looking into this issue. I found this on Nvidia's website https://docs.nvidia.com/datacenter/cloud-native/gpu-telemetry/dcgm-exporter.html#gpu-telemetry
Which is configured to export to this grafana dashboard https://grafana.com/grafana/dashboards/12239-nvidia-dcgm-exporter-dashboard/
If we can replicate this effort, we could then set up the metrics-scraper to consume metrics with the same pattern that Nvidia uses to build that Grafana dashboard. We would want to provide information at the cluster level, with node- and namespace-level metrics.
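As a rough sketch of the scraping side, dcgm-exporter exposes its metrics over HTTP (port 9400 by default), so a Prometheus-style job could discover and keep those endpoints. The job name and the endpoints name dcgm-exporter below are assumptions that would need to match the actual deployment:

```yaml
# Sketch of a Prometheus scrape job for dcgm-exporter; names are assumptions.
scrape_configs:
  - job_name: dcgm-exporter
    kubernetes_sd_configs:
      - role: endpoints        # discover pods behind the exporter's Service
    relabel_configs:
      - source_labels: [__meta_kubernetes_endpoints_name]
        regex: dcgm-exporter   # keep only the exporter's endpoints
        action: keep
```

The metrics-scraper would then consume series such as per-GPU utilization the same way it consumes CPU/memory today.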
@maciaszczykm is there anyone from the contributors working on this that we could help?