controller-runtime

metrics.init() calls metrics.Register() of client-go, so other components cannot register their own client-go metrics adapters

Open shaofeng66 opened this issue 1 year ago • 3 comments

https://github.com/kubernetes/client-go/blob/master/tools/metrics/metrics.go#L132

metrics.Register() is wrapped in a sync.Once.Do(), so registration can only take effect once.

My program imports controller-runtime to use its client, and I want to track the latency of all client-go requests along with the other metrics defined in client-go. But I cannot register any other metrics adapters with client-go, because controller-runtime has already done so with a single "RequestResult".

Or is there any other method I can use to get the metrics from client-go?

shaofeng66 avatar Sep 23 '24 04:09 shaofeng66

A workaround would be to call my own registering code before "importing" controller-runtime, by adding something like

import _ "github.com/myorg/myproj/pkg/metrics"

in main.go before any controller-runtime initialization.

But wouldn't an explicit, named initialization func and its invocation be better than the init()?

shaofeng66 avatar Sep 23 '24 06:09 shaofeng66

As a workaround, it should be possible to provide your own adapter to the client-go tools/metrics package and use it with your metrics collector.

import (
	"context"
	"net/url"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	restmetrics "k8s.io/client-go/tools/metrics"
)

var requestLatencySeconds = prometheus.NewHistogramVec(
	prometheus.HistogramOpts{Name: "rest_client_request_duration_seconds"},
	[]string{"verb", "path"},
)

// exposeRestClientMetrics registers the histogram and installs the adapter
// as client-go's request-latency observer, bypassing metrics.Register().
func exposeRestClientMetrics(r *prometheus.Registry) {
	r.MustRegister(requestLatencySeconds)
	restmetrics.RequestLatency = &latencyAdapter{collector: requestLatencySeconds}
}

type latencyAdapter struct {
	collector *prometheus.HistogramVec
}

func (la *latencyAdapter) Observe(ctx context.Context, verb string, u url.URL, latency time.Duration) {
	la.collector.WithLabelValues(verb, u.Path).Observe(latency.Seconds())
}

f41gh7 avatar Sep 27 '24 15:09 f41gh7

But wouldn't an explicit, named initialization func and its invocation be better than the init()?

The issue is that this means it will break for the majority of users who don't want to do anything and just get metrics by default. Retaining both that and avoiding this issue will be difficult. IMHO the best solution for this would be for upstream not to make this a sync.Once.

We could change downstream to register them right before first use rather than in an init. That would slightly improve things, in that you could register yours before constructing controllers, but it is still a bit arcane and not obvious that it will not work if you register them too late.

I opened https://github.com/kubernetes/kubernetes/issues/127739 in upstream, because I don't think it's possible to provide a "good" solution for this downstream.

alvaroaleman avatar Sep 29 '24 15:09 alvaroaleman

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Dec 28 '24 16:12 k8s-triage-robot

/remove-lifecycle stale

sbueringer avatar Dec 30 '24 08:12 sbueringer

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Mar 30 '25 09:03 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Apr 29 '25 10:04 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar May 29 '25 10:05 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar May 29 '25 10:05 k8s-ci-robot