java [Query] Client metrics/monitoring

We have been developing our operators with the Kubernetes Java client and want to define some metrics for them, e.g. a health metric, a lag metric, etc.

The Kubernetes Java client already uses Prometheus metrics (example). Are there any metrics already defined by the client that we can use directly (e.g. the number of requests in the work queue)?

Thanks! Weiqing

Apr 13 '22 07:04 weiqingy

controller-runtime in Go seems to export a set of default metrics: https://github.com/kubernetes-sigs/kubebuilder/blob/c0a0bb6dc05e04332cb7b460bef5f155b53770cc/docs/book/src/reference/metrics-reference.md

Does the Java client export default metrics as well?

Apr 15 '22 21:04 weiqingy

All of the metrics that we have are defined here:

https://github.com/kubernetes-client/java/blob/master/util/src/main/java/io/kubernetes/client/monitoring/PrometheusInterceptor.java

I think that we would be open to adding more.

Apr 18 '22 21:04 brendandburns

Thanks for the reply, @brendandburns! The metrics described at https://github.com/kubernetes-sigs/kubebuilder/blob/c0a0bb6dc05e04332cb7b460bef5f155b53770cc/docs/book/src/reference/metrics-reference.md are useful. Could they be added to the Java client?

Apr 21 '22 05:04 weiqingy

Hi @brendandburns,

It seems the following metrics are defined in https://github.com/kubernetes-client/java/blob/master/util/src/main/java/io/kubernetes/client/monitoring/PrometheusInterceptor.java:

k8s_java_requests_total
k8s_java_response_code_total
k8s_java_resource_request_latency_seconds
k8s_java__non_resource_request_latency_seconds

Besides the messages passed to help(), is there a wiki or doc with more detail on the metric definitions? For example, is "k8s_java_requests_total" the total count of requests in the work queue, or the total number of requests the controller has reconciled so far? Is k8s_java_resource_request_latency_seconds defined as the time from when a user change is received to when it propagates out of the controller? What is a non-resource request? What does the response code total count?

Thanks!

May 12 '22 06:05 weiqingy
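
One way to reason about the questions above: the linked class is named as an interceptor, which suggests the four metrics describe individual HTTP calls the client makes to the API server rather than work-queue depth or reconcile latency. In general Kubernetes API terms, non-resource requests are paths such as /healthz or /version; whether the interceptor uses exactly that split is an assumption here. The following is a minimal sketch of the idea using only the JDK. The names RequestMetrics and record are hypothetical, and this is not the library's implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.IntSupplier;

// Hypothetical stand-in for a request-level metrics interceptor.
// It only illustrates what "requests total", "response code total"
// and "request latency" would mean if measured per HTTP call.
final class RequestMetrics {
    private final LongAdder requestsTotal = new LongAdder();
    private final Map<Integer, LongAdder> responseCodeTotal = new ConcurrentHashMap<>();
    private final LongAdder latencyNanosSum = new LongAdder();

    // Wraps one HTTP call: counts it, times it, and counts its status code.
    // The supplier stands in for the real call and returns the HTTP status.
    int record(IntSupplier httpCall) {
        requestsTotal.increment();
        long start = System.nanoTime();
        try {
            int status = httpCall.getAsInt();
            responseCodeTotal.computeIfAbsent(status, k -> new LongAdder()).increment();
            return status;
        } finally {
            // Latency is recorded even if the call throws.
            latencyNanosSum.add(System.nanoTime() - start);
        }
    }

    long requestsTotal() {
        return requestsTotal.sum();
    }

    long responseCodeTotal(int status) {
        LongAdder counter = responseCodeTotal.get(status);
        return counter == null ? 0 : counter.sum();
    }

    double latencySecondsSum() {
        return latencyNanosSum.sum() / 1e9;
    }

    public static void main(String[] args) {
        RequestMetrics metrics = new RequestMetrics();
        metrics.record(() -> 200); // e.g. a successful list call
        metrics.record(() -> 200);
        metrics.record(() -> 404); // e.g. a get for a missing object
        System.out.println("requests_total=" + metrics.requestsTotal());
        System.out.println("response_code_total{code=200}=" + metrics.responseCodeTotal(200));
        System.out.println("response_code_total{code=404}=" + metrics.responseCodeTotal(404));
    }
}
```

Under this reading, none of the counters would say anything about items waiting in a controller's work queue; that kind of metric would have to be added separately.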

Hey @brendandburns and @yue9944882,

Could you please take a look at the questions above when you get a chance? I'd really appreciate it.

Thanks, Weiqing

Jun 13 '22 20:06 weiqingy

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Sep 11 '22 21:09 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

Oct 11 '22 21:10 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue with /reopen
Mark this issue as fresh with /remove-lifecycle rotten
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Nov 10 '22 21:11 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Nov 10 '22 21:11 k8s-ci-robot

@weiqingy were you able to find answers to these questions and use the metrics?

Apr 21 '23 09:04 prash-kr-meena
