metrics-server icon indicating copy to clipboard operation
metrics-server copied to clipboard

Metrics Server not working as expected in EKS 1.24 version when we are using New Relic adapter for HPA

Open Ashik01 opened this issue 2 years ago • 3 comments

Hi ,

We own a large EKS cluster around 100 nodes , we observed that metrics server was crashing when API call is high so we where not able to perform our scaling test via ASG. This has happened to us twice

Also Control Plane is controlled by AWS we are not getting full grab on troubleshooting API logs

What we did is we turned off ie we set our NR adapter deployment which we are using for HPA to Zero replica for it to not have any interrupt during our scaling test with ASG

Please find the logs below

3-09-20T03:21:23.943154313-04:00 I0920 07:21:23.943050 1 serving.go:342] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key) 2023-09-20T03:21:23.943507186-04:00 I0920 07:21:23.943424 1 dynamic_serving_content.go:112] "Loaded a new cert/key pair" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key" 2023-09-20T03:21:24.248532892-04:00 I0920 07:21:24.248426 1 round_trippers.go:463] GET https://10.100.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication 2023-09-20T03:21:24.248557900-04:00 I0920 07:21:24.248441 1 round_trippers.go:469] Request Headers: 2023-09-20T03:21:24.248562398-04:00 I0920 07:21:24.248450 1 round_trippers.go:473] Accept: application/json, / 2023-09-20T03:21:24.248566587-04:00 I0920 07:21:24.248475 1 round_trippers.go:473] Authorization: Bearer 2023-09-20T03:21:24.261804996-04:00 I0920 07:21:24.261739 1 round_trippers.go:574] Response Status: 200 OK in 13 milliseconds 2023-09-20T03:21:24.268749220-04:00 I0920 07:21:24.268677 1 requestheader_controller.go:244] Loaded a new request header values for RequestHeaderAuthRequestController 2023-09-20T03:21:24.269742401-04:00 I0920 07:21:24.269692 1 config.go:700] Not requested to run hook priority-and-fairness-config-consumer 2023-09-20T03:21:24.273609090-04:00 I0920 07:21:24.273546 1 reflector.go:219] Starting reflector *v1.PartialObjectMetadata (0s) from pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167 2023-09-20T03:21:24.273618856-04:00 I0920 07:21:24.273563 1 reflector.go:255] Listing and watching *v1.PartialObjectMetadata from pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167 2023-09-20T03:21:24.273623033-04:00 I0920 07:21:24.273567 1 reflector.go:219] Starting reflector *v1.Node (0s) from pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167 2023-09-20T03:21:24.273626734-04:00 I0920 07:21:24.273579 1 reflector.go:255] Listing and watching *v1.Node from pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167 2023-09-20T03:21:24.274011815-04:00 I0920 07:21:24.273943 1 round_trippers.go:463] GET https://10.100.0.1:443/api/v1/pods?fieldSelector=status.phase%3DRunning&limit=500&resourceVersion=0 2023-09-20T03:21:24.274029657-04:00 I0920 07:21:24.273962 1 round_trippers.go:469] Request Headers: 2023-09-20T03:21:24.274035822-04:00 I0920 07:21:24.273972 1 round_trippers.go:473] Accept: application/vnd.kubernetes.protobuf;as=PartialObjectMetadataList;g=meta.k8s.io;v=v1,application/json;as=PartialObjectMetadataList;g=meta.k8s.io;v=v1,application/json 2023-09-20T03:21:24.274052170-04:00 I0920 07:21:24.273987 1 round_trippers.go:473] User-Agent: metrics-server/v0.6.2 (linux/amd64) kubernetes/8cabbee 2023-09-20T03:21:24.274058156-04:00 I0920 07:21:24.274000 1 round_trippers.go:473] Authorization: Bearer 2023-09-20T03:21:24.274124603-04:00 I0920 07:21:24.274074 1 round_trippers.go:463] GET https://10.100.0.1:443/api/v1/nodes?limit=500&resourceVersion=0 2023-09-20T03:21:24.274133626-04:00 I0920 07:21:24.274089 1 round_trippers.go:469] Request Headers: 2023-09-20T03:21:24.274138941-04:00 I0920 07:21:24.274099 1 round_trippers.go:473] Accept: application/vnd.kubernetes.protobuf, / 2023-09-20T03:21:24.274148008-04:00 I0920 07:21:24.274115 1 round_trippers.go:473] User-Agent: metrics-server/v0.6.2 (linux/amd64) kubernetes/8cabbee 2023-09-20T03:21:24.274156067-04:00 I0920 07:21:24.274126 1 round_trippers.go:473] Authorization: Bearer 2023-09-20T03:21:24.289567522-04:00 I0920 07:21:24.289502 1 round_trippers.go:574] Response Status: 200 OK in 15 milliseconds 2023-09-20T03:21:24.323401269-04:00 I0920 07:21:24.323322 1 round_trippers.go:463] GET https://10.100.0.1:443/api/v1/nodes?allowWatchBookmarks=true&resourceVersion=1015156217&timeout=8m1s&timeoutSeconds=481&watch=true 2023-09-20T03:21:24.323447697-04:00 I0920 07:21:24.323339 1 round_trippers.go:469] Request Headers: 2023-09-20T03:21:24.323452457-04:00 I0920 07:21:24.323347 1 round_trippers.go:473] User-Agent: metrics-server/v0.6.2 (linux/amd64) kubernetes/8cabbee 2023-09-20T03:21:24.323468803-04:00 I0920 07:21:24.323355 1 round_trippers.go:473] Authorization: Bearer 2023-09-20T03:21:24.323474211-04:00 I0920 07:21:24.323361 1 round_trippers.go:473] Accept: application/vnd.kubernetes.protobuf, / 2023-09-20T03:21:24.325384188-04:00 I0920 07:21:24.325323 1 round_trippers.go:574] Response Status: 200 OK in 1 milliseconds 2023-09-20T03:21:24.374531516-04:00 I0920 07:21:24.374476 1 shared_informer.go:270] caches populated 2023-09-20T03:21:24.383797481-04:00 I0920 07:21:24.383749 1 round_trippers.go:574] Response Status: 200 OK in 109 milliseconds 2023-09-20T03:21:24.828588942-04:00 I0920 07:21:24.828475 1 round_trippers.go:463] GET https://10.100.0.1:443/api/v1/pods?allowWatchBookmarks=true&fieldSelector=status.phase%3DRunning&resourceVersion=1015156716&timeout=9m42s&timeoutSeconds=582&watch=true 2023-09-20T03:21:24.828633836-04:00 I0920 07:21:24.828502 1 round_trippers.go:469] Request Headers: 2023-09-20T03:21:24.828638248-04:00 I0920 07:21:24.828511 1 round_trippers.go:473] Accept: application/vnd.kubernetes.protobuf;as=PartialObjectMetadata;g=meta.k8s.io;v=v1,application/json;as=PartialObjectMetadata;g=meta.k8s.io;v=v1,application/json 2023-09-20T03:21:24.828642148-04:00 I0920 07:21:24.828518 1 round_trippers.go:473] User-Agent: metrics-server/v0.6.2 (linux/amd64) kubernetes/8cabbee 2023-09-20T03:21:24.828645877-04:00 I0920 07:21:24.828528 1 round_trippers.go:473] Authorization: Bearer 2023-09-20T03:21:24.831049757-04:00 I0920 07:21:24.830390 1 round_trippers.go:574] Response Status: 200 OK in 1 milliseconds 2023-09-20T03:21:24.875155019-04:00 I0920 07:21:24.875079 1 shared_informer.go:270] caches populated 2023-09-20T03:21:24.875287193-04:00 I0920 07:21:24.875229 1 server.go:136] "Scraping metrics" 2023-09-20T03:21:24.875758924-04:00 I0920 07:21:24.875707 1 scraper.go:115] "Scraping metrics from nodes" nodeCount=320 2023-09-20T03:21:24.876494410-04:00 I0920 07:21:24.876431 1 healthz.go:174] Installing health checkers for (/healthz): "ping","log","poststarthook/generic-apiserver-start-informers","poststarthook/max-in-flight-filter","metadata-informer-sync" 2023-09-20T03:21:24.876825234-04:00 I0920 07:21:24.876774 1 healthz.go:174] Installing health checkers for (/livez): "ping","log","poststarthook/generic-apiserver-start-informers","poststarthook/max-in-flight-filter","metric-collection-timely","metadata-informer-sync" 2023-09-20T03:21:24.877147198-04:00 I0920 07:21:24.877108 1 healthz.go:174] Installing health checkers for (/readyz): "ping","log","poststarthook/generic-apiserver-start-informers","informer-sync","poststarthook/max-in-flight-filter","metric-storage-ready","metadata-informer-sync","shutdown" 2023-09-20T03:21:24.877624644-04:00 I0920 07:21:24.877574 1 genericapiserver.go:406] MuxAndDiscoveryComplete has all endpoints registered and discovery information is complete 2023-09-20T03:21:24.877680909-04:00 I0920 07:21:24.877649 1 round_trippers.go:463] GET https://10.100.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication 2023-09-20T03:21:24.877685946-04:00 I0920 07:21:24.877660 1 round_trippers.go:469] Request Headers: 2023-09-20T03:21:24.877704299-04:00 I0920 07:21:24.877669 1 round_trippers.go:473] Authorization: Bearer 2023-09-20T03:21:24.877710217-04:00 I0920 07:21:24.877675 1 round_trippers.go:473] Accept: application/json, / 2023-09-20T03:21:24.882971150-04:00 I0920 07:21:24.882924 1 round_trippers.go:574] Response Status: 200 OK in 5 milliseconds 2023-09-20T03:21:24.884396435-04:00 I0920 07:21:24.884348 1 scraper.go:137] "Scraping node" node="ip-63-20-71-83.us-west-2.compute.internal" 2023-09-20T03:21:24.884492272-04:00 I0920 07:21:24.884447 1 round_trippers.go:463] GET https://63.20.71.83:10250/metrics/resource 2023-09-20T03:21:24.884502054-04:00 I0920 07:21:24.884473 1 round_trippers.go:469] Request Headers: 2023-09-20T03:21:24.884509270-04:00 I0920 07:21:24.884482 1 round_trippers.go:473] User-Agent: metrics-server/v0.6.2 (linux/amd64) kubernetes/8cabbee 2023-09-20T03:21:24.884514469-04:00 I0920 07:21:24.884490 1 round_trippers.go:473] Authorization: Bearer 2023-09-20T03:21:24.889846438-04:00 I0920 07:21:24.889779 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController 2023-09-20T03:21:24.889863867-04:00 I0920 07:21:24.889794 1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController 2023-09-20T03:21:24.889896215-04:00 I0920 07:21:24.889832 1 reflector.go:219] Starting reflector *v1.ConfigMap (12h0m0s) from pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167 2023-09-20T03:21:24.889911731-04:00 I0920 07:21:24.889831 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file" 2023-09-20T03:21:24.889918589-04:00 I0920 07:21:24.889860 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 2023-09-20T03:21:24.889924444-04:00 I0920 07:21:24.889828 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file" 2023-09-20T03:21:24.889936405-04:00 I0920 07:21:24.889886 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 2023-09-20T03:21:24.889942327-04:00 I0920 07:21:24.889916 1 reflector.go:219] Starting reflector *v1.ConfigMap (12h0m0s) from pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167 2023-09-20T03:21:24.889946283-04:00 I0920 07:21:24.889924 1 reflector.go:255] Listing and watching *v1.ConfigMap from pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167 2023-09-20T03:21:24.889979386-04:00 I0920 07:21:24.889842 1 reflector.go:255] Listing and watching *v1.ConfigMap from pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167 2023-09-20T03:21:24.889988217-04:00 I0920 07:21:24.889951 1 reflector.go:219] Starting reflector *v1.ConfigMap (12h0m0s) from pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167 2023-09-20T03:21:24.890036585-04:00 I0920 07:21:24.889988 1 reflector.go:255] Listing and watching *v1.ConfigMap from pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167 2023-09-20T03:21:24.890042131-04:00 I0920 07:21:24.890011 1 tlsconfig.go:200] "Loaded serving cert" certName="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key" certDetail=""localhost@1695194483" [serving] validServingFor=[127.0.0.1,localhost,localhost] issuer="localhost-ca@1695194483" (2023-09-20 06:21:23 +0000 UTC to 2024-09-19 06:21:23 +0000 UTC (now=2023-09-20 07:21:24.889975584 +0000 UTC))" 2023-09-20T03:21:24.890051302-04:00 I0920 07:21:24.890031 1 round_trippers.go:463] GET https://10.100.0.1:443/api/v1/namespaces/kube-system/configmaps?fieldSelector=metadata.name%3Dextension-apiserver-authentication&limit=500&resourceVersion=0 2023-09-20T03:21:24.890055466-04:00 I0920 07:21:24.890038 1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key" 2023-09-20T03:21:24.890070780-04:00 I0920 07:21:24.890039 1 round_trippers.go:469] Request Headers: 2023-09-20T03:21:24.890075426-04:00 I0920 07:21:24.890055 1 round_trippers.go:473] Accept: application/json, / 2023-09-20T03:21:24.890079012-04:00 I0920 07:21:24.890056 1 round_trippers.go:463] GET https://10.100.0.1:443/api/v1/namespaces/kube-system/configmaps?fieldSelector=metadata.name%3Dextension-apiserver-authentication&limit=500&resourceVersion=0 2023-09-20T03:21:24.890086025-04:00 I0920 07:21:24.890063 1 round_trippers.go:473] Authorization: Bearer 2023-09-20T03:21:24.890089541-04:00 I0920 07:21:24.890065 1 round_trippers.go:469] Request Headers: 2023-09-20T03:21:24.890096169-04:00 I0920 07:21:24.890076 1 round_trippers.go:473] Accept: application/json, / 2023-09-20T03:21:24.890099847-04:00 I0920 07:21:24.890086 1 round_trippers.go:473] Authorization: Bearer 2023-09-20T03:21:24.890171187-04:00 I0920 07:21:24.890133 1 named_certificates.go:53] "Loaded SNI cert" index=0 certName="self-signed loopback" certDetail=""apiserver-loopback-client@1695194484" [serving] validServingFor=[apiserver-loopback-client] issuer="apiserver-loopback-client-ca@1695194484" (2023-09-20 06:21:23 +0000 UTC to 2024-09-19 06:21:23 +0000 UTC (now=2023-09-20 07:21:24.890120523 +0000 UTC))" 2023-09-20T03:21:24.890176473-04:00 I0920 07:21:24.890135 1 round_trippers.go:463] GET https://10.100.0.1:443/api/v1/namespaces/kube-system/configmaps?fieldSelector=metadata.name%3Dextension-apiserver-authentication&limit=500&resourceVersion=0 2023-09-20T03:21:24.890187857-04:00 I0920 07:21:24.890159 1 round_trippers.go:469] Request Headers: 2023-09-20T03:21:24.890192094-04:00 I0920 07:21:24.890161 1 tlsconfig.go:240] "Starting DynamicServingCertificateController" 2023-09-20T03:21:24.890196179-04:00 I0920 07:21:24.890166 1 round_trippers.go:473] Accept: application/json, / 2023-09-20T03:21:24.890206733-04:00 I0920 07:21:24.890178 1 round_trippers.go:473] Authorization: Bearer 2023-09-20T03:21:24.890221855-04:00 I0920 07:21:24.890152 1 secure_serving.go:266] Serving securely on [::]:4443 2023-09-20T03:21:24.890230323-04:00 I0920 07:21:24.890211 1 genericapiserver.go:462] [graceful-termination] waiting for shutdown to be initiated 2023-09-20T03:21:24.890280214-04:00 W0920 07:21:24.890239 1 shared_informer.go:372] The sharedIndexInformer has started, run more than once is not allowed 2023-09-20T03:21:24.892687814-04:00 I0920 07:21:24.892603 1 round_trippers.go:574] Response Status: 200 OK in 2 milliseconds 2023-09-20T03:21:24.893181310-04:00 I0920 07:21:24.893123 1 round_trippers.go:574] Response Status: 200 OK in 3 milliseconds 2023-09-20T03:21:24.893329706-04:00 I0920 07:21:24.893279 1 round_trippers.go:574] Response Status: 200 OK in 3 milliseconds 2023-09-20T03:21:24.893729652-04:00 I0920 07:21:24.893679 1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful 2023-09-20T03:21:24.893741739-04:00 I0920 07:21:24.893699 1 pathrecorder.go:240] metrics-server: "/readyz" satisfied by exact match 2023-09-20T03:21:24.893798741-04:00 I0920 07:21:24.893760 1 shared_informer.go:270] caches populated 2023-09-20T03:21:24.893815659-04:00 I0920 07:21:24.893780 1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve" 2023-09-20T03:21:24.893823308-04:00 I0920 07:21:24.893796 1 healthz.go:257] metric-storage-ready check failed: readyz 2023-09-20T03:21:24.893827341-04:00 [-]metric-storage-ready failed: no metrics to serve 2023-09-20T03:21:24.893927752-04:00 I0920 07:21:24.893885 1 httplog.go:129] "HTTP" verb="GET" URI="/readyz" latency="310.592µs" userAgent="kube-probe/1.24+" audit-ID="b67d8c26-44e6-47de-8183-f113b274cb88" srcIP="63.20.71.243:48156" resp=0 2023-09-20T03:21:24.900222095-04:00 I0920 07:21:24.900161 1 round_trippers.go:463] GET https://10.100.0.1:443/api/v1/namespaces/kube-system/configmaps?allowWatchBookmarks=true&fieldSelector=metadata.name%3Dextension-apiserver-authentication&resourceVersion=1015156742&timeout=9m9s&timeoutSeconds=549&watch=true 2023-09-20T03:21:24.900237462-04:00 I0920 07:21:24.900176 1 round_trippers.go:469] Request Headers: 2023-09-20T03:21:24.900241566-04:00 I0920 07:21:24.900185 1 round_trippers.go:473] Accept: application/json, / 2023-09-20T03:21:24.900244964-04:00 I0920 07:21:24.900193 1 round_trippers.go:473] Authorization: Bearer 2023-09-20T03:21:24.900524820-04:00 I0920 07:21:24.900474 1 round_trippers.go:463] GET https://10.100.0.1:443/api/v1/namespaces/kube-system/configmaps?allowWatchBookmarks=true&fieldSelector=metadata.name%3Dextension-apiserver-authentication&resourceVersion=1015156742&timeout=7m28s&timeoutSeconds=448&watch=true 2023-09-20T03:21:24.900540550-04:00 I0920 07:21:24.900491 1 round_trippers.go:469] Request Headers: 2023-09-20T03:21:24.900544774-04:00 I0920 07:21:24.900499 1 round_trippers.go:473] Accept: application/json, / 2023-09-20T03:21:24.900554751-04:00 I0920 07:21:24.900506 1 round_trippers.go:473] Authorization: Bearer 2023-09-20T03:21:24.901573370-04:00 I0920 07:21:24.901530 1 round_trippers.go:574] Response Status: 200 OK in 1 milliseconds 2023-09-20T03:21:24.901695816-04:00 I0920 07:21:24.901656 1 round_trippers.go:574] Response Status: 200 OK in 1 milliseconds 2023-09-20T03:21:24.903436368-04:00 I0920 07:21:24.903348 1 round_trippers.go:463] GET https://10.100.0.1:443/api/v1/namespaces/kube-system/configmaps?allowWatchBookmarks=true&fieldSelector=metadata.name%3Dextension-apiserver-authentication&resourceVersion=1015156742&timeout=9m16s&timeoutSeconds=556&watch=true 2023-09-20T03:21:24.903451387-04:00 I0920 07:21:24.903369 1 round_trippers.go:469] Request Headers: 2023-09-20T03:21:24.903478227-04:00 I0920 07:21:24.903393 1 round_trippers.go:473] Accept: application/json, / 2023-09-20T03:21:24.903485943-04:00 I0920 07:21:24.903408 1 round_trippers.go:473] Authorization: Bearer 2023-09-20T03:21:24.904973591-04:00 I0920 07:21:24.904931 1 round_trippers.go:574] Response Status: 200 OK in 1 milliseconds 2023-09-20T03:21:24.906084922-04:00 I0920 07:21:24.906033 1 scraper.go:137] "Scraping node" node="ip-63-20-75-160.us-west-2.compute.internal" 2023-09-20T03:21:24.906115335-04:00 I0920 07:21:24.906086 1 round_trippers.go:463] GET https://63.20.75.160:10250/metrics/resource 2023-09-20T03:21:24.906121004-04:00 I0920 07:21:24.906095 1 round_trippers.go:469] Request Headers: 2023-09-20T03:21:24.906129361-04:00 I0920 07:21:24.906107 1 round_trippers.go:473] User-Agent: metrics-server/v0.6.2 (linux/amd64) kubernetes/8cabbee 2023-09-20T03:21:24.906136969-04:00 I0920 07:21:24.906119 1 round_trippers.go:473] Authorization: Bearer 2023-09-20T03:21:24.923101729-04:00 I0920 07:21:24.923050 1 scraper.go:137] "Scraping node" node="ip-63-20-70-209.us-west-2.compute.internal" 2023-09-20T03:21:24.923118597-04:00 I0920 07:21:24.923085 1 round_trippers.go:463] GET https://63.20.70.209:10250/metrics/resource 2023-09-20T03:21:24.923123867-04:00 I0920 07:21:24.923091 1 round_trippers.go:469] Request Headers: 2023-09-20T03:21:24.923127685-04:00 I0920 07:21:24.923098 1 round_trippers.go:473] User-Agent: metrics-server/v0.6.2 (linux/amd64) kubernetes/8cabbee 2023-09-20T03:21:24.923131041-04:00 I0920 07:21:24.923106 1 round_trippers.go:473] Authorization: Bearer 2023-09-20T03:21:24.924820844-04:00 I0920 07:21:24.924774 1 scraper.go:137] "Scraping node" node="ip-63-20-64-53.us-west-2.compute.internal"

Ashik01 avatar Sep 27 '23 08:09 Ashik01

Additional Logs

Metrics server logs

In the Metrics server pod logs we could see the below error. 2023-09-20T03:26:28.000516312-04:00 I0920 07:26:28.000466 1 secure_serving.go:352] http: TLS handshake error from 63.20.72.85:49364: EOF 2023-09-20T03:26:28.000561264-04:00 I0920 07:26:28.000520 1 secure_serving.go:352] http: TLS handshake error from 63.20.72.85:49346: EOF

Ashik01 avatar Sep 27 '23 09:09 Ashik01

Additional Information

When we set the NR adapter deployment to zero we are also seeing the metrics server top command is giving UNKNOWN values in output

Ashik01 avatar Sep 27 '23 09:09 Ashik01

/kind bug /assign @serathius /triage accepted

dashpole avatar Oct 05 '23 16:10 dashpole