
Improve podEvictor statistics

[Open] damemi opened this issue 4 years ago • 39 comments

As suggested in https://github.com/kubernetes-sigs/descheduler/issues/501#issuecomment-781967812, it would be nice to improve the pod evictor type to report eviction statistics for individual strategies. Some suggestions were:

Number of evicted pods (in this strategy): XX
Number of evicted pods in this run: XX
Total number of evicted pods in all strategies: XX

x-ref this could also be reported as Prometheus metrics (https://github.com/kubernetes-sigs/descheduler/issues/348)

damemi avatar Feb 19 '21 21:02 damemi

I'd prefer to report the statistics as metrics, so we don't have to accumulate much state in the pod evictor itself.

ingvagabund avatar Feb 22 '21 11:02 ingvagabund

I am just suggesting that, since those metrics will have to be calculated somewhere, doing it in podEvictor makes sense because it already has access to the information. Metrics can then use the podEvictor instance to report them when requested.

damemi avatar Feb 22 '21 13:02 damemi
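[Editor's note: as a rough illustration of the division of responsibility discussed above, here is a minimal sketch in which the evictor accumulates per-strategy counts as evictions happen and a reporting layer reads them on demand. All names here (StatsEvictor, RecordEviction, etc.) are hypothetical, not the actual descheduler PodEvictor API.]

package main

import (
	"fmt"
	"sync"
)

// EvictionStats holds the per-strategy counters suggested in the issue.
type EvictionStats struct {
	Evicted int // successful evictions by this strategy
	Failed  int // eviction attempts that returned an error
}

// StatsEvictor accumulates counts as strategies request evictions;
// a metrics endpoint could read them from here when scraped.
type StatsEvictor struct {
	mu         sync.Mutex
	byStrategy map[string]*EvictionStats
	total      int
}

func NewStatsEvictor() *StatsEvictor {
	return &StatsEvictor{byStrategy: map[string]*EvictionStats{}}
}

// RecordEviction is called by a strategy after each eviction attempt.
func (e *StatsEvictor) RecordEviction(strategy string, err error) {
	e.mu.Lock()
	defer e.mu.Unlock()
	s := e.byStrategy[strategy]
	if s == nil {
		s = &EvictionStats{}
		e.byStrategy[strategy] = s
	}
	if err != nil {
		s.Failed++
		return
	}
	s.Evicted++
	e.total++
}

// Summary mirrors the statistics suggested in the issue description.
func (e *StatsEvictor) Summary() {
	e.mu.Lock()
	defer e.mu.Unlock()
	for name, s := range e.byStrategy {
		fmt.Printf("Number of evicted pods (in strategy %s): %d (failed: %d)\n", name, s.Evicted, s.Failed)
	}
	fmt.Printf("Total number of evicted pods in all strategies: %d\n", e.total)
}

func main() {
	e := NewStatsEvictor()
	e.RecordEviction("RemoveDuplicates", nil)
	e.RecordEviction("LowNodeUtilization", nil)
	e.Summary()
}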

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale

fejta-bot avatar May 23 '21 13:05 fejta-bot

@damemi I would be happy to contribute to this. Any docs highlighting the decisions made to date?

a7i avatar Jun 15 '21 20:06 a7i

@a7i nothing concrete, though if you would like to put some ideas together and share a doc, that would be a great place to start the discussion. Right now we have one metric, pods_evicted, that's reported by the PodEvictor after a run.

As suggested above, it would be good to have some similar reports on a per-strategy basis. From there we could probably even come up with some additional meta metrics that are specific to the different strategies themselves.

damemi avatar Jun 15 '21 20:06 damemi

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten

fejta-bot avatar Jul 17 '21 02:07 fejta-bot

/remove-lifecycle rotten

damemi avatar Jul 30 '21 18:07 damemi

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-triage-robot avatar Aug 29 '21 18:08 k8s-triage-robot

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Aug 29 '21 18:08 k8s-ci-robot

/reopen

ingvagabund avatar Aug 30 '21 09:08 ingvagabund

@ingvagabund: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Aug 30 '21 09:08 k8s-ci-robot

/remove-lifecycle rotten

ingvagabund avatar Aug 30 '21 09:08 ingvagabund

I'd like to work on this issue if no one is working on it already 🙂 @damemi @ingvagabund

pravarag avatar Sep 14 '21 08:09 pravarag

Not aware of anyone working on this atm, although it requires some design and probably starting a discussion (e.g. in a Google doc). @damemi wdyt?

ingvagabund avatar Sep 14 '21 09:09 ingvagabund

Yeah, I think we already have some good patterns started in the code for metrics reporting that could be fleshed out more. @pravarag feel free to take this on if you'd like

damemi avatar Sep 14 '21 18:09 damemi

It would be great to have the following:

  • pods evicted success
  • pods evicted failed
  • pods skipped
  • total pods under consideration

Overall and per strategy.

a7i avatar Sep 15 '21 02:09 a7i
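[Editor's note: a minimal sketch of how counts like those listed above could be registered, following the registration pattern the repo already uses (k8s.io/component-base/metrics with legacyregistry). The metric name pods_processed_total and the Register function are hypothetical, not existing descheduler API, and this assumes every considered pod lands in exactly one result bucket.]

package metrics

import (
	"k8s.io/component-base/metrics"
	"k8s.io/component-base/metrics/legacyregistry"
)

const DeschedulerSubsystem = "descheduler"

// One counter family covers the first three counts above ("result" is
// success / error / skipped, "strategy" gives the per-strategy breakdown);
// "total pods under consideration" then falls out as a sum over all labels,
// and "overall" numbers are sums over the strategy label at query time.
var PodsProcessed = metrics.NewCounterVec(
	&metrics.CounterOpts{
		Subsystem:      DeschedulerSubsystem,
		Name:           "pods_processed_total",
		Help:           "Pods handled by the descheduler, by result, strategy, namespace and node.",
		StabilityLevel: metrics.ALPHA,
	}, []string{"result", "strategy", "namespace", "node"})

// Register hooks the counter into the global legacy registry.
func Register() {
	legacyregistry.MustRegister(PodsProcessed)
}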

/assign

pravarag avatar Sep 15 '21 05:09 pravarag

@damemi @ingvagabund I'm trying to replicate pod evictions in a local cluster to better understand how the statistics are currently represented. I have a 3-node cluster whose resources are not heavily utilized; here are the stats:

NAME            CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
10.177.140.38   161m         4%     3460Mi          26%
10.208.40.245   182m         4%     3849Mi          28%
10.74.193.204   149m         3%     4002Mi          30%

And here are the logs from the descheduler pod:

->  k logs descheduler-7bdbc8f9b7-d9r46 -nkube-system
I0920 14:27:38.995798       1 named_certificates.go:53] "Loaded SNI cert" index=0 certName="self-signed loopback" certDetail="\"apiserver-loopback-client@1632148058\" [serving] validServingFor=[apiserver-loopback-client] issuer=\"apiserver-loopback-client-ca@1632148058\" (2021-09-20 13:27:38 +0000 UTC to 2022-09-20 13:27:38 +0000 UTC (now=2021-09-20 14:27:38.995739889 +0000 UTC))"
I0920 14:27:38.995912       1 secure_serving.go:195] Serving securely on [::]:10258
I0920 14:27:38.996045       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I0920 14:27:40.554774       1 node.go:46] "Node lister returned empty list, now fetch directly"
I0920 14:27:40.973812       1 duplicates.go:99] "Processing node" node="10.177.140.38"
I0920 14:27:41.225473       1 duplicates.go:99] "Processing node" node="10.208.40.245"
I0920 14:27:41.500405       1 duplicates.go:99] "Processing node" node="10.74.193.204"
I0920 14:27:41.717340       1 pod_antiaffinity.go:81] "Processing node" node="10.177.140.38"
I0920 14:27:41.823705       1 pod_antiaffinity.go:81] "Processing node" node="10.208.40.245"
I0920 14:27:41.879063       1 pod_antiaffinity.go:81] "Processing node" node="10.74.193.204"
I0920 14:27:42.198284       1 nodeutilization.go:170] "Node is appropriately utilized" node="10.177.140.38" usage=map[cpu:1172m memory:1327634Ki pods:20] usagePercentage=map[cpu:29.974424552429667 memory:9.74255252448638 pods:18.181818181818183]
I0920 14:27:42.198333       1 nodeutilization.go:170] "Node is appropriately utilized" node="10.208.40.245" usage=map[cpu:1044m memory:1137170Ki pods:12] usagePercentage=map[cpu:26.70076726342711 memory:8.344874004635447 pods:10.909090909090908]
I0920 14:27:42.198354       1 nodeutilization.go:170] "Node is appropriately utilized" node="10.74.193.204" usage=map[cpu:1355m memory:1552914Ki pods:15] usagePercentage=map[cpu:34.65473145780051 memory:11.395720666245547 pods:13.636363636363637]
I0920 14:27:42.198369       1 lownodeutilization.go:100] "Criteria for a node under utilization" CPU=20 Mem=20 Pods=20
I0920 14:27:42.198380       1 lownodeutilization.go:101] "Number of underutilized nodes" totalNumber=0
I0920 14:27:42.198392       1 lownodeutilization.go:114] "Criteria for a node above target utilization" CPU=50 Mem=50 Pods=50
I0920 14:27:42.198403       1 lownodeutilization.go:115] "Number of overutilized nodes" totalNumber=0
I0920 14:27:42.198415       1 lownodeutilization.go:118] "No node is underutilized, nothing to do here, you might tune your thresholds further"
I0920 14:27:42.198439       1 descheduler.go:152] "Number of evicted pods" totalEvicted=0
I0920 14:32:42.198973       1 node.go:46] "Node lister returned empty list, now fetch directly"
I0920 14:32:42.261831       1 pod_antiaffinity.go:81] "Processing node" node="10.177.140.38"
I0920 14:32:42.295166       1 pod_antiaffinity.go:81] "Processing node" node="10.208.40.245"
I0920 14:32:42.336749       1 pod_antiaffinity.go:81] "Processing node" node="10.74.193.204"
I0920 14:32:42.479844       1 nodeutilization.go:170] "Node is appropriately utilized" node="10.177.140.38" usage=map[cpu:1172m memory:1327634Ki pods:20] usagePercentage=map[cpu:29.974424552429667 memory:9.74255252448638 pods:18.181818181818183]
I0920 14:32:42.479892       1 nodeutilization.go:170] "Node is appropriately utilized" node="10.208.40.245" usage=map[cpu:1044m memory:1137170Ki pods:12] usagePercentage=map[cpu:26.70076726342711 memory:8.344874004635447 pods:10.909090909090908]
I0920 14:32:42.479914       1 nodeutilization.go:170] "Node is appropriately utilized" node="10.74.193.204" usage=map[cpu:1355m memory:1552914Ki pods:15] usagePercentage=map[cpu:34.65473145780051 memory:11.395720666245547 pods:13.636363636363637]
I0920 14:32:42.479930       1 lownodeutilization.go:100] "Criteria for a node under utilization" CPU=20 Mem=20 Pods=20
I0920 14:32:42.479941       1 lownodeutilization.go:101] "Number of underutilized nodes" totalNumber=0
I0920 14:32:42.479953       1 lownodeutilization.go:114] "Criteria for a node above target utilization" CPU=50 Mem=50 Pods=50
I0920 14:32:42.479963       1 lownodeutilization.go:115] "Number of overutilized nodes" totalNumber=0
I0920 14:32:42.479982       1 lownodeutilization.go:118] "No node is underutilized, nothing to do here, you might tune your thresholds further"
I0920 14:32:42.480009       1 duplicates.go:99] "Processing node" node="10.177.140.38"
I0920 14:32:42.516420       1 duplicates.go:99] "Processing node" node="10.208.40.245"
I0920 14:32:42.549396       1 duplicates.go:99] "Processing node" node="10.74.193.204"
I0920 14:32:42.595868       1 descheduler.go:152] "Number of evicted pods" totalEvicted=0

I wanted to check: if I decrease the threshold values to 10 here, would that be a good way to trigger pod evictions so that I can look at the current statistics logs?

pravarag avatar Sep 20 '21 14:09 pravarag

@pravarag in your logs you don't have any underutilized nodes, so lowering the thresholds won't help (there are already no nodes with all 3 values below the set thresholds). Instead, you want to raise the thresholds, so that any node with usage under those values counts as underutilized.

You also don't have any overutilized nodes, so you should lower the targetThresholds as well. To replicate evictions, cordoning certain nodes while you create test pods will help create the uneven distribution you want.

damemi avatar Sep 27 '21 13:09 damemi
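[Editor's note: as a concrete sketch of that advice, here is a v1alpha1 DeschedulerPolicy with adjusted values. The numbers are illustrative only, picked so the usage percentages from the logs above fall on both sides of the thresholds.]

apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "LowNodeUtilization":
    enabled: true
    params:
      nodeResourceUtilizationThresholds:
        # A node is underutilized when usage is below ALL of these;
        # CPU raised from the 20/20/20 in the logs so the ~27% CPU node qualifies.
        thresholds:
          cpu: 30
          memory: 20
          pods: 20
        # A node is overutilized when usage exceeds ANY of these;
        # CPU lowered from the 50/50/50 in the logs so the ~35% CPU node qualifies.
        targetThresholds:
          cpu: 32
          memory: 50
          pods: 50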

Thanks @damemi for the suggestions above. I also had a few doubts about adding newer metrics. I've identified that the changes will mainly take place in these files:

  1. metrics.go - which will mainly declare the new metrics we want to add.
  2. evictions.go - where the new metrics will be calculated, just as for pods_evicted.

Now, do we also want to modify the logging w.r.t. the new metrics to be added? Something to include in every strategy, like this log?

And one more question: I could see that for the pods_evicted metric, the help text says we can calculate the number of pods evicted per strategy and per namespace as well. I'm guessing the code for that calculation still needs to be added, so do we need an extra metric per strategy, like pods_evicted_per_strategy?

So far, I'm working on adding a few new metrics: pods_evicted_success, pods_evicted_failed, and pods_skipped.

pravarag avatar Sep 30 '21 11:09 pravarag

@pravarag I think that all sounds good, except we probably don't need a new log line for every new metric. An extra per-strategy breakdown would be good to get, but you can probably just make that one metric with a different label value for each strategy (that might be what you meant, though).

damemi avatar Oct 05 '21 17:10 damemi
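[Editor's note: for illustration, a self-contained sketch of the single-metric-with-labels approach, where pods_evicted_success / pods_evicted_failed / pods_skipped collapse into one counter with a result label. It uses plain github.com/prometheus/client_golang just to keep the example runnable; the descheduler itself uses k8s.io/component-base/metrics, and all names here are hypothetical.]

package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

// One counter family instead of three metric names: the outcome and the
// strategy are label values, so adding a strategy needs no new registration.
var podsEvicted = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "descheduler_pods_evicted",
		Help: "Pod eviction outcomes by result and strategy (sketch).",
	}, []string{"result", "strategy"})

func main() {
	reg := prometheus.NewRegistry()
	reg.MustRegister(podsEvicted)

	// Each strategy reports outcomes under its own label value.
	podsEvicted.WithLabelValues("success", "RemoveDuplicates").Inc()
	podsEvicted.WithLabelValues("error", "LowNodeUtilization").Inc()
	podsEvicted.WithLabelValues("skipped", "PodLifeTime").Inc()

	// Per-strategy totals then come from label aggregation at query time,
	// e.g. sum by (strategy) (descheduler_pods_evicted).
	families, _ := reg.Gather()
	for _, mf := range families {
		fmt.Println(mf.GetName(), "->", len(mf.GetMetric()), "series")
	}
}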

@damemi, I've created a draft pull request while I continue to make a few more changes to it.

pravarag avatar Oct 17 '21 12:10 pravarag

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 15 '22 12:01 k8s-triage-robot

/remove-lifecycle stale

pravarag avatar Jan 15 '22 12:01 pravarag

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Apr 15 '22 12:04 k8s-triage-robot

/remove-lifecycle stale

pravarag avatar Apr 19 '22 12:04 pravarag

It would be great to have the following:

  • pods evicted success
  • pods evicted failed
  • pods skipped
  • total pods under consideration

Overall and per strategy.

We would like to help implement the above metrics! @eminaktas

There is a closed PR https://github.com/kubernetes-sigs/descheduler/pull/648 that was submitted by @pravarag, and I think it was closed in favor of the new descheduling framework, as also discussed in https://github.com/kubernetes-sigs/descheduler/issues/753#issuecomment-1150133689.

How should we help/proceed here to implement these metrics? Are there any examples of creating a plugin from scratch using the new descheduling framework?

Dentrax avatar Jun 16 '22 10:06 Dentrax

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Sep 14 '22 11:09 k8s-triage-robot

/remove-lifecycle stale

Dentrax avatar Sep 17 '22 19:09 Dentrax

Kind ping @pravarag 🤞 Any ongoing work on this?

it sounds like we've come back around to not needing single-run metrics. I think per-strategy metrics could be a good option though. ^1

I think this is a really cool idea! Any implementation ideas in mind? Are we supposed to create metrics per strategy, like pods_evicted_STRATEGY, etc.?

Solution 1: Without duplication (use map)

import (
	"fmt"

	"k8s.io/component-base/metrics"
	"k8s.io/component-base/metrics/legacyregistry"
)

// podsEvicted keeps one CounterVec per strategy, keyed by strategy name.
var podsEvicted = map[string]*metrics.CounterVec{}

func registerForStrategy(strategy string) {
	metric := metrics.NewCounterVec(
		&metrics.CounterOpts{
			Subsystem:      DeschedulerSubsystem,
			Name:           fmt.Sprintf("pods_evicted_%s", strategy),
			Help:           "Number of evicted pods, by the result, by the namespace, by the node name. 'error' result means a pod could not be evicted",
			StabilityLevel: metrics.ALPHA,
		}, []string{"result", "namespace", "node"})

	podsEvicted[strategy] = metric

	legacyregistry.MustRegister(metric)
}

Solution 2: With duplication (create CounterVec for each strategy)

PodsEvictedHighNodeUtilization = metrics.NewCounterVec
PodsEvictedLowNodeUtilization = metrics.NewCounterVec
PodsEvictedPodLifeTime = metrics.NewCounterVec
...

Awaiting your thoughts!

Dentrax avatar Sep 17 '22 21:09 Dentrax