Allow adding custom labels to the "gpu-operator" ServiceMonitor

Open peihsuant opened this issue 1 year ago • 12 comments

feature description

There is a ServiceMonitor named gpu-operator that was automatically created and is owned by the ClusterPolicy. I would like to add custom labels to it so the Prometheus instance with a serviceMonitorSelector can scrape it.
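To illustrate why the labels matter, here is a minimal sketch of `matchLabels`-style selection, assuming the Prometheus instance selects ServiceMonitors by exact label match only. The label keys and values are hypothetical and are not taken from any real cluster.

```python
from typing import Mapping


def selector_matches(match_labels: Mapping[str, str],
                     object_labels: Mapping[str, str]) -> bool:
    """Return True when every key/value pair in match_labels is present in object_labels."""
    return all(object_labels.get(key) == value
               for key, value in match_labels.items())


# Hypothetical labels, for illustration only.
prometheus_selector = {"release": "kube-prometheus-stack"}
gpu_operator_labels = {"app": "gpu-operator"}

# Without the custom label, the ServiceMonitor is not selected.
print(selector_matches(prometheus_selector, gpu_operator_labels))

# With the custom label added, it is selected.
labelled = {**gpu_operator_labels, **prometheus_selector}
print(selector_matches(prometheus_selector, labelled))
```

Because the ServiceMonitor is owned by the ClusterPolicy, labels added by hand may be reverted by the operator, which is why a configurable field is requested here.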

Thanks.

peihsuant avatar Apr 30 '24 03:04 peihsuant

@peihsuant this is already possible today with the dcgmExporter.serviceMonitor.additionalLabels field in the ClusterPolicy.
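As a rough sketch of what setting that field could look like, the snippet below builds a JSON merge patch for the ClusterPolicy. It assumes the field sits under `spec` at the path named above; the helper name and the label value are hypothetical. Note that this targets the dcgm-exporter ServiceMonitor only.

```python
import json
from typing import Mapping


def dcgm_service_monitor_label_patch(labels: Mapping[str, str]) -> dict:
    """Build a merge-patch body that sets dcgmExporter.serviceMonitor.additionalLabels.

    Hypothetical helper: the nesting under "spec" is an assumption based on the
    field path quoted in this thread.
    """
    return {
        "spec": {
            "dcgmExporter": {
                "serviceMonitor": {
                    "additionalLabels": dict(labels),
                }
            }
        }
    }


# Hypothetical label, for illustration only.
patch = dcgm_service_monitor_label_patch({"release": "kube-prometheus-stack"})
print(json.dumps(patch, indent=2))
```

The printed JSON could then be passed to something like `kubectl patch clusterpolicy <name> --type merge -p '<json>'`, where `<name>` is whatever your ClusterPolicy is called.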

cdesiniotis avatar Apr 30 '24 23:04 cdesiniotis

Hi @cdesiniotis, what I need to modify is the ServiceMonitor of gpu-operator, not the one for dcgmExporter. Thanks.

peihsuant avatar May 02 '24 07:05 peihsuant

@peihsuant apologies, I misread the description. I see we do not have any fields for configuring the gpu-operator ServiceMonitor, unlike for dcgmExporter.

If you are interested in working on this, PRs against our GitLab repository are always welcome: https://gitlab.com/nvidia/kubernetes/gpu-operator

cdesiniotis avatar May 02 '24 17:05 cdesiniotis

@cdesiniotis I would like to give this a try if it's still needed.

csauoss avatar Jul 05 '24 01:07 csauoss

@csauoss yes, this is still open. PRs are welcome here: https://github.com/NVIDIA/gpu-operator/pulls

cdesiniotis avatar Jul 12 '24 16:07 cdesiniotis

@cdesiniotis thank you. Should I create an MR in GitLab (since the documentation mentions it), or is a PR here fine as well?

csauoss avatar Jul 15 '24 05:07 csauoss

@cdesiniotis I created MR 1099 in GitLab for this issue for now. Please review it when you get a chance.

csauoss avatar Jul 17 '24 04:07 csauoss

@csauoss we have recently migrated to GitHub and now perform development here. Can you open your PR against https://github.com/NVIDIA/gpu-operator?

(since documentation mentions it)

Can you point me to where our documentation states this?

cdesiniotis avatar Jul 18 '24 15:07 cdesiniotis

Thanks @cdesiniotis! Sounds good, I will open a PR here in that case. The CONTRIBUTING.md file is where I saw the references to GitLab.

csauoss avatar Jul 18 '24 17:07 csauoss

Actually, the PR template on GitHub references GitLab too.

csauoss avatar Jul 18 '24 18:07 csauoss

Thank you for pointing this out. I have filed https://github.com/NVIDIA/gpu-operator/pull/851 to update our docs.

cdesiniotis avatar Jul 18 '24 18:07 cdesiniotis

@cdesiniotis I created PR #850 on GitHub to address this issue. Please review it when you get a chance. Thanks!

csauoss avatar Jul 19 '24 00:07 csauoss

Hello,

Is there any progress on this feature?

awrel34 avatar Feb 28 '25 13:02 awrel34

@awrel34 thanks for bumping it. PR #850 is awaiting review.

csauoss avatar May 18 '25 00:05 csauoss

This issue is stale because it has been open 90 days with no activity. This issue will be closed in 30 days unless new comments are made or the stale label is removed. To skip these checks, apply the "lifecycle/frozen" label.

github-actions[bot] avatar Nov 05 '25 00:11 github-actions[bot]