
metrics-server changes with k8s v1.24.0

spowelljr opened this issue 3 years ago · 10 comments

Hello, metrics-server is one of the addons for minikube. I'm in the process of updating our default k8s version to v1.24.0, and all of our metrics-server integration tests were failing. The pod never came up and was repeatedly outputting:

I0520 18:59:00.056614       1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
E0520 18:59:00.464381       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.49.2:10250/metrics/resource\": context deadline exceeded" node="minikube"

We've been passing --metric-resolution=15s with previous versions of k8s, but on k8s v1.24.0 it only started working again once I raised the value to 17s or greater. With --metric-resolution=15s on previous k8s versions the pod took ~20 seconds to come up; now, on k8s v1.24.0 with --metric-resolution=17s, it takes 40-60 seconds.
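
For anyone reproducing the workaround, here's a minimal sketch of applying the flag via a JSON patch. It assumes the stock deployment name and namespace from components.yaml; the minikube addon sets the args through its own manifest instead:

kubectl -n kube-system patch deployment metrics-server --type=json \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--metric-resolution=17s"}]'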

Just reporting our findings in case this is something of significance; let me know if you need more information.

Thanks!

/kind support

spowelljr avatar May 20 '22 23:05 spowelljr

It takes at least two metric-resolution cycles for metrics-server to become ready, so when the value of the --metric-resolution flag is large, it is normal for the metrics-server pod's startup time to grow accordingly.
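
To put rough numbers on that: with --metric-resolution=17s, two scrape cycles take ~34s, and readiness is then gated on the probe schedule on top of that. A sketch of the readiness probe from the stock components.yaml (exact values may differ between releases):

readinessProbe:
  httpGet:
    path: /readyz
    port: https
    scheme: HTTPS
  initialDelaySeconds: 20
  periodSeconds: 10
  failureThreshold: 3

With checks at 20s, 30s, 40s, ..., the first probe that can succeed after ~34s of scraping lands at 40s or later, which matches the 40-60 second startup reported above.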

yangjunmyfm192085 avatar May 21 '22 03:05 yangjunmyfm192085

I'm having exactly the same problem on Docker Desktop after upgrading to 1.24.0. Before that (1.23.x) everything was fine. I'm passing --kubelet-insecure-tls as an arg.

The problem happens both with the latest helm chart of metrics-server (3.8.2) and with the chart included in kubernetes-dashboard (3.5.0). Changing --metric-resolution did not help; the pod never reports ready. The pod comes up and is alive, but always reports that it is not ready:

E0606 14:32:09.659571       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.65.4:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="docker-desktop"
I0606 14:32:18.790252       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"

Jejuni avatar Jun 06 '22 14:06 Jejuni

I expect this is a dupe of https://github.com/kubernetes-sigs/metrics-server/issues/983

serathius avatar Jun 06 '22 14:06 serathius

Is there no direct solution?

zouchengli avatar Jul 04 '22 00:07 zouchengli

Hi @zouchengli, this issue has been resolved by https://github.com/kubernetes-sigs/metrics-server/pull/1009 and a new version will be released.

yangjunmyfm192085 avatar Jul 04 '22 00:07 yangjunmyfm192085

Hi @yangjunmyfm192085 ! Any update on when the new version will be released?

radhikamattoo avatar Jul 14 '22 20:07 radhikamattoo

Hi @yangjunmyfm192085 ! Any update on when the new version will be released?

We are waiting for the release of metrics-server 0.6.2. @serathius I don't have permission to cut a release, right? Could we release version 0.6.2?

yangjunmyfm192085 avatar Jul 15 '22 02:07 yangjunmyfm192085

Hi @yangjunmyfm192085 ! Any update on when the new version will be released?

We are waiting for the release of metrics-server 0.6.2. @serathius I shouldn't have permission to release the version, right? Could we release version 0.6.2?

I use k8s v1.21.4 and chart version metrics-server-3.7.0.tgz. After installing the chart I got this error:

I1008 09:00:42.457563       1 serving.go:341] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
E1008 09:00:43.175692       1 scraper.go:139] "Failed to scrape node" err="Get \"https://10.0.2.112:10250/stats/summary?only_cpu_and_memory=true\": x509: certificate signed by unknown authority" node="10.0.2.112"
I1008 09:00:43.179983       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I1008 09:00:43.180033       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I1008 09:00:43.179989       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1008 09:00:43.180052       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1008 09:00:43.179991       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1008 09:00:43.180131       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1008 09:00:43.180476       1 secure_serving.go:202] Serving securely on [::]:4443
I1008 09:00:43.180542       1 dynamic_serving_content.go:130] Starting serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key
I1008 09:00:43.180566       1 tlsconfig.go:240] Starting DynamicServingCertificateController
I1008 09:00:43.280084       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
I1008 09:00:43.280082       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController 
I1008 09:00:43.280179       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 
E1008 09:00:58.172317       1 scraper.go:139] "Failed to scrape node" err="Get \"https://10.0.2.112:10250/stats/summary?only_cpu_and_memory=true\": x509: certificate signed by unknown authority" node="10.0.2.112"
I1008 09:01:10.508496       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
E1008 09:01:13.168492       1 scraper.go:139] "Failed to scrape node" err="Get \"https://10.0.2.112:10250/stats/summary?only_cpu_and_memory=true\": x509: certificate signed by unknown authority" node="10.0.2.112"

When I add the arg --kubelet-insecure-tls, metrics-server runs well with no errors. So when will metrics-server 0.6.2 be released? Do all the charts have this problem before v0.6.2?
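
For reference, a sketch of wiring that arg through the chart, assuming the kubernetes-sigs chart (which exposes an args list in its values; other charts may name it differently, e.g. extraArgs):

helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
helm upgrade --install metrics-server metrics-server/metrics-server \
  --namespace kube-system \
  --set 'args={--kubelet-insecure-tls}'

Note that --kubelet-insecure-tls disables verification of the kubelet serving certificate, so it is only appropriate for development clusters; the proper fix is serving kubelet certificates signed by the cluster CA.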

Huimintai avatar Oct 08 '22 09:10 Huimintai

Yours is not the same issue. Yeah, releases v0.6.0 and v0.6.1 have this issue: https://github.com/kubernetes-sigs/metrics-server/issues/983. I will ask sig-instrumentation about the release plan for metrics-server.

yangjunmyfm192085 avatar Oct 08 '22 13:10 yangjunmyfm192085

What is the workaround until 0.6.2 is available?

symedley avatar Oct 21 '22 01:10 symedley

We're facing the same issue in our K8s clusters. Is there an ETA for the release of v0.6.2? Nothing has been released since Feb 2022 :/

shovelend avatar Nov 23 '22 11:11 shovelend

We are currently working on the release, it should be out soon

dgrisonnet avatar Nov 23 '22 12:11 dgrisonnet

metrics-server v0.6.2 has been released. Please update your metrics-server version. Closing this issue.
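
For anyone upgrading, a sketch using the stock manifest from the release assets (the URL follows the project's usual per-release pattern; chart users can bump the chart version instead):

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.2/components.yaml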

yangjunmyfm192085 avatar Nov 25 '22 22:11 yangjunmyfm192085

/close

yangjunmyfm192085 avatar Nov 25 '22 22:11 yangjunmyfm192085

@yangjunmyfm192085: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Nov 25 '22 22:11 k8s-ci-robot