Support default/secure metrics-server installation

Open timoreimann opened this issue 5 years ago • 17 comments

As of today, installing metrics-server requires tweaking the configuration since the default is to reach out to nodes by DNS and use TLS. A fair number of users have asked to support the default setup including TLS, which is a highly reasonable request.

The issue was originally discussed in https://github.com/digitalocean/digitalocean-cloud-controller-manager/issues/150. Several comments there describe how to run metrics-server in TLS-less mode as a workaround for now.

timoreimann avatar Jul 11 '19 11:07 timoreimann

Required work items are referenced above.

timoreimann avatar Jul 11 '19 12:07 timoreimann

With secure TLS usage of the kubelet API now possible (see the update to #6), the --kubelet-insecure-tls parameter is not needed anymore.

Users are down to having to specify --kubelet-preferred-address-types=InternalIP at this point, which is the last item to tackle prior to the default metrics-server configuration working out of the box.

timoreimann avatar Sep 12 '19 13:09 timoreimann

I just set up the metrics-server with the components.yaml from the v0.3.6 release and made sure to inject the --kubelet-preferred-address-types=InternalIP flag as mentioned here. (Using kustomize and a JSON patch for this)
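For anyone wanting to reproduce this: the exact patch is not shown in this thread, so the following is only a minimal sketch of what a JSON patch (RFC 6902) that appends the flag could look like, together with a tiny helper that illustrates its effect. The patch path assumes that metrics-server is the first container in the Deployment and that it already has an args list; the `apply_add_ops` helper and the trimmed-down `deployment` dict are hypothetical stand-ins, not the real manifest or any kustomize API.

```python
import copy
import json

# Hypothetical RFC 6902 patch. The path assumes metrics-server is the first
# container in the Deployment and that an args list already exists.
PATCH = [
    {
        "op": "add",
        "path": "/spec/template/spec/containers/0/args/-",
        "value": "--kubelet-preferred-address-types=InternalIP",
    }
]


def apply_add_ops(doc, patch):
    """Apply the 'add' operations of a JSON patch to a copy of doc.

    Only the subset needed for this illustration is handled: object keys,
    list indices, and '-' (append) as the final path segment. JSON Pointer
    escapes (~0, ~1) are not supported.
    """
    result = copy.deepcopy(doc)
    for op in patch:
        if op["op"] != "add":
            raise ValueError(f"unsupported op: {op['op']}")
        *parents, last = op["path"].lstrip("/").split("/")
        target = result
        for key in parents:
            target = target[int(key)] if isinstance(target, list) else target[key]
        if isinstance(target, list):
            if last == "-":
                target.append(op["value"])
            else:
                target.insert(int(last), op["value"])
        else:
            target[last] = op["value"]
    return result


# Trimmed-down stand-in for the metrics-server Deployment (not the real manifest).
deployment = {
    "spec": {"template": {"spec": {"containers": [
        {"name": "metrics-server", "args": ["--cert-dir=/tmp", "--secure-port=4443"]}
    ]}}}
}

patched = apply_add_ops(deployment, PATCH)
print(json.dumps(PATCH, indent=2))
print(patched["spec"]["template"]["spec"]["containers"][0]["args"])
```

The printed patch document is the part that would go into a kustomize JSON patch targeting the metrics-server Deployment; the helper only demonstrates that the flag ends up appended to the existing args.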

The cluster is running on 1.17.5-do.0 currently.

Everything seems to be running just fine and I get no errors. With kubectl top node I see some metrics. However, I see no change in the DOKS dashboard, contrary to what the official dashboard docs suggest:

image

This is my view instead:

image

Am I missing something?

mbrodala avatar Jun 02 '20 14:06 mbrodala

@mbrodala is this the dashboard we integrate in the DigitalOcean cloud control panel, or a separate deployment you manage on your own?

timoreimann avatar Jun 03 '20 13:06 timoreimann

@timoreimann this is the original dashboard provided by DOKS.

mbrodala avatar Jun 03 '20 13:06 mbrodala

@mbrodala this may indeed be an issue on our end. I filed an internal bug report so that we can look into the matter more closely.

Thanks for bringing this to our attention. I'm going to report back to this issue once we've identified and fixed the problem.

timoreimann avatar Jun 03 '20 20:06 timoreimann

@timoreimann Any news on this? I'm also facing the exact same issue - metrics server works properly (with the mentioned modifications) & I can see the stats using the top command, but the (built-in) dashboard isn't showing them in the UI.

TonyBogdanov avatar Jul 14 '20 20:07 TonyBogdanov

The dashboard metrics are served by a separate sidecar, which we have yet to integrate. I created #21 to track the effort.

timoreimann avatar Jul 15 '20 16:07 timoreimann

I just installed metrics-server with doctl kubernetes cluster create --1-clicks="metrics-server" .... When I run kubectl top pods --all-namespaces I get:

W1031 04:06:49.790056 1123183 top_pod.go:265] Metrics not available for pod default/external-dns-68cf9b5c56-8bjqs, age: 40m32.790034709s
error: Metrics not available for pod default/external-dns-68cf9b5c56-8bjqs, age: 40m32.790034709s

The metrics pod logs something like this:

I1031 03:03:25.598399       1 manager.go:148] ScrapeMetrics: time: 5.606118ms, nodes: 1, pods: 0
E1031 03:03:26.647337       1 reststorage.go:160] unable to fetch pod metrics for pod cert-manager/cert-manager-cainjector-6d59c8d4f7-7l274: no metrics known for pod
E1031 03:03:26.647589       1 reststorage.go:160] unable to fetch pod metrics for pod kube-system/cilium-operator-6cb976fbcf-rc8lv: no metrics known for pod
E1031 03:03:26.647701       1 reststorage.go:160] unable to fetch pod metrics for pod kube-system/cilium-rl67k: no metrics known for pod
E1031 03:03:26.647793       1 reststorage.go:160] unable to fetch pod metrics for pod cert-manager/cert-manager-webhook-578954cdd-9vwph: no metrics known for pod
E1031 03:03:26.647843       1 reststorage.go:160] unable to fetch pod metrics for pod kube-system/kube-proxy-4f2vb: no metrics known for pod
E1031 03:03:26.647900       1 reststorage.go:160] unable to fetch pod metrics for pod kube-system/coredns-76bcfddf46-rkfvt: no metrics known for pod
E1031 03:03:26.647957       1 reststorage.go:160] unable to fetch pod metrics for pod kube-system/do-node-agent-wx74s: no metrics known for pod
E1031 03:03:26.648048       1 reststorage.go:160] unable to fetch pod metrics for pod kube-system/csi-do-node-skgr8: no metrics known for pod
E1031 03:03:26.648137       1 reststorage.go:160] unable to fetch pod metrics for pod default/external-dns-68cf9b5c56-8bjqs: no metrics known for pod
E1031 03:03:26.648196       1 reststorage.go:160] unable to fetch pod metrics for pod kube-system/cilium-operator-6cb976fbcf-z5vmk: no metrics known for pod
E1031 03:03:26.648288       1 reststorage.go:160] unable to fetch pod metrics for pod cert-manager/cert-manager-86548b886-j655n: no metrics known for pod
E1031 03:03:26.648387       1 reststorage.go:160] unable to fetch pod metrics for pod kube-system/coredns-76bcfddf46-lsvm5: no metrics known for pod
E1031 03:03:26.648433       1 reststorage.go:160] unable to fetch pod metrics for pod kube-system/metrics-server-5b8f47666-m6sn5: no metrics known for pod
E1031 03:03:26.648487       1 reststorage.go:160] unable to fetch pod metrics for pod kube-system/kube-state-metrics-6f4b6669f9-zhgw2: no metrics known for pod
E1031 03:03:26.648544       1 reststorage.go:160] unable to fetch pod metrics for pod ingress-nginx/ingress-nginx-controller-98cb87fb7-bfvvr: no metrics known for pod

feluxe avatar Oct 31 '20 03:10 feluxe

@feluxe I have exactly the same issue as you.

johnwook avatar Nov 03 '20 09:11 johnwook

@feluxe @johnwook and I have the same issue as you guys too.

kyranb avatar Nov 04 '20 04:11 kyranb

1.19 users are presumably affected by an incompatibility with Docker 18 as described in kubernetes/kubernetes#94281.

We plan to release a DOKS 1.19 update (ideally today) that is going to address the problem by moving to Docker 19.03.

timoreimann avatar Nov 04 '20 06:11 timoreimann

@timoreimann as a 1.19 user I can confirm I'm affected by that (or something else) and currently don't have pod metrics through metrics-server

WyriHaximus avatar Nov 05 '20 07:11 WyriHaximus

FYI: 1.19.3-do.2 was just released and should fix the problem. Please report back if that's not the case.

Originally posted by @timoreimann in https://github.com/digitalocean/digitalocean-cloud-controller-manager/issues/150#issuecomment-722604873

Can confirm this is now fixed for me

WyriHaximus avatar Nov 05 '20 19:11 WyriHaximus

Same here. kubectl top ... works with the new update, but the stats still don't show within the dashboard web UI.

feluxe avatar Nov 05 '20 22:11 feluxe

> the stats still don't show within the dashboard web UI.

Unfortunately, that's unrelated to the latest release. It's still on the agenda to get it fixed as well.

timoreimann avatar Nov 05 '20 22:11 timoreimann

Hi, this is still biting me on 1.22. I have to change the endpoints to be InternalIP and enable the apiService.

Nuxij avatar Jul 19 '22 23:07 Nuxij