
Resource Metrics API (metrics-server)

Open hjacobs opened this issue 5 years ago • 42 comments

The metrics-server should be installed out of the box to enable kubectl top ... and tools like kube-ops-view. I tried to install the metrics-server, but it did not work for me; it just logs errors like:

E0322 20:17:29.205246       1 reststorage.go:129] unable to fetch node metrics for node "kind-control-plane": no metrics known for node
E0322 20:17:29.212924       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/kube-proxy-gwhkb: no metrics known for pod
E0322 20:17:29.212948       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/weave-net-w2lsb: no metrics known for pod
E0322 20:17:29.212954       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/metrics-server-fc6d4999b-6mz4l: no metrics known for pod
E0322 20:17:29.212960       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/coredns-86c58d9df4-xl5rm: no metrics known for pod
E0322 20:17:29.212966       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/kube-scheduler-kind-control-plane: no metrics known for pod
E0322 20:17:29.212971       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/kube-controller-manager-kind-control-plane: no metrics known for pod
E0322 20:17:29.212977       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/kube-apiserver-kind-control-plane: no metrics known for pod
E0322 20:17:29.212982       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/etcd-kind-control-plane: no metrics known for pod
E0322 20:17:29.212988       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/coredns-86c58d9df4-c8nxt: no metrics known for pod

hjacobs avatar Mar 22 '19 20:03 hjacobs

Our out of the box install is currently kubeadm + cni. I'd want to know why kubeadm doesn't configure it by default first 🤔

BenTheElder avatar Mar 22 '19 20:03 BenTheElder

I really want to replace my local Minikube environment with kind and the Metrics API is essential for this to happen :smile:

I would also be fine with the Metrics API being an "addon" which can be easily enabled via the kind CLI. Minikube does it like this with the old (deprecated) "heapster" name: https://github.com/kubernetes/minikube/blob/master/docs/addons.md

hjacobs avatar Mar 22 '19 20:03 hjacobs

try the manifests from here: https://github.com/luxas/kubeadm-workshop/tree/master/demos/monitoring

but kubeadm is not bundling metrics support, because the kubeadm cluster is a minimal viable one - i.e. no addons except kube-proxy and a DNS server.

/triage support

neolit123 avatar Mar 22 '19 22:03 neolit123

@neolit123 thanks for the hint, but that manifest (which runs the old v0.2.1 of metrics server btw) also did not work for me. The error from metrics server:

E0323 09:05:05.004941       1 summary.go:97] error while getting metrics summary from Kubelet kind-control-plane(172.17.0.2:10255): Get http://172.17.0.2:10255/stats/summary/: dial tcp 172.17.0.2:10255: getsockopt: connection refused
E0323 09:06:05.006613       1 summary.go:97] error while getting metrics summary from Kubelet kind-control-plane(172.17.0.2:10255): Get http://172.17.0.2:10255/stats/summary/: dial tcp 172.17.0.2:10255: getsockopt: connection refused
E0323 09:07:05.007664       1 summary.go:97] error while getting metrics summary from Kubelet kind-control-plane(172.17.0.2:10255): Get http://172.17.0.2:10255/stats/summary/: dial tcp 172.17.0.2:10255: getsockopt: connection refused
E0323 09:08:05.003351       1 summary.go:97] error while getting metrics summary from Kubelet kind-control-plane(172.17.0.2:10255): Get http://172.17.0.2:10255/stats/summary/: dial tcp 172.17.0.2:10255: getsockopt: connection refused

I'm confused about what the correct way to fix this is: apparently the old version of metrics server (v0.2.1) is still the way to go? The flags differ between the current (v0.3.1) and the old version (v0.2.1), and I'm also not sure whether I need to additionally deploy cadvisor (e.g. like here)?

Hoping for someone more knowledgeable than me to help here :smile:

hjacobs avatar Mar 23 '19 09:03 hjacobs

sorry, I'm not particularly familiar with this, I'll see if I can find someone who is

BenTheElder avatar Mar 29 '19 19:03 BenTheElder

those manifests are old, try these for 0.3.1: https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/metrics-server/metrics-server-deployment.yaml

neolit123 avatar Mar 29 '19 20:03 neolit123

@neolit123 I tried the 0.3.1 manifests before and did not succeed so far :disappointed:

hjacobs avatar Mar 30 '19 10:03 hjacobs

@hjacobs please add these flags to your metrics-server deployment

    args:
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP
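
If metrics-server is already running, a JSON Patch is one way to append these flags without editing the manifest. This is only a sketch: the helper name `metrics_server_flags_patch` is made up, and it assumes the Deployment is called `metrics-server` in `kube-system`, that metrics-server is the first container, and that the container already has an `args` list (appending to `args/-` fails if it does not).

```shell
# Hypothetical helper: prints a JSON Patch that appends both flags
# to the args list of the first container in the pod template.
metrics_server_flags_patch() {
  cat <<'EOF'
[
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"},
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-preferred-address-types=InternalIP"}
]
EOF
}

# Usage (needs a running cluster, so it is left commented out):
# kubectl -n kube-system patch deployment metrics-server --type=json -p "$(metrics_server_flags_patch)"
```

If the container has no `args` yet, adding the two lines to the manifest by hand as shown above is the simpler route.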

bugbuilder avatar Mar 31 '19 04:03 bugbuilder

@bugbuilder yeah! That (adding the two args) worked with the official metrics server manifests in https://github.com/kubernetes-incubator/metrics-server/tree/master/deploy/1.8%2B :tada:

$ kubectl top nodes
NAME                 CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
kind-control-plane   374m         4%     1104Mi          6%        
$ kubectl top pod --all-namespaces
NAMESPACE     NAME                                         CPU(cores)   MEMORY(bytes)   
kube-system   coredns-86c58d9df4-cmlzl                     6m           8Mi             
kube-system   coredns-86c58d9df4-jzr2n                     6m           9Mi             
kube-system   etcd-kind-control-plane                      46m          32Mi            
kube-system   kube-apiserver-kind-control-plane            75m          357Mi           
kube-system   kube-controller-manager-kind-control-plane   89m          66Mi            
kube-system   kube-proxy-vfxds                             11m          15Mi            
kube-system   kube-scheduler-kind-control-plane            27m          12Mi            
kube-system   metrics-server-76db6db868-mp8tb              2m           13Mi            
kube-system   weave-net-srjkt                              4m           120Mi           

hjacobs avatar Mar 31 '19 08:03 hjacobs

kube-ops-view now also works fine on kind :tada:

hjacobs avatar Mar 31 '19 09:03 hjacobs

So how can we get the Resource Metrics API deployment packaged into the default kind cluster creation (or as an addon)?

hjacobs avatar Mar 31 '19 09:03 hjacobs

@hjacobs I just created kind-dev (WIP), which will help me with addons like metallb, ingress-nginx, metrics, etc.

bugbuilder avatar Mar 31 '19 09:03 bugbuilder

I created a gist for the working Metrics Server API deployment manifests, so anybody can just try it out:

kubectl apply -f https://gist.githubusercontent.com/hjacobs/69b6844ba8442fcbc2007da316499eb4/raw/5b8678ac5e11d6be45aa98ca40d17da70dcb974f/kind-metrics-server.yaml

hjacobs avatar Mar 31 '19 09:03 hjacobs

@bugbuilder thanks, but ideally standard addons should be part of the kind CLI (like kind enable addon metrics-server or similar). This is how Minikube does it.

hjacobs avatar Mar 31 '19 09:03 hjacobs

Yep, but in the meantime I need something to start working with Kind. #253

bugbuilder avatar Mar 31 '19 09:03 bugbuilder

@hjacobs thanks for your manifest

TommyLike avatar Apr 24 '19 09:04 TommyLike

Hello, thank you for your manifest! Does anyone know if it is possible to create a ServiceMonitor to scrape the metrics from metrics-server? So far, when I launch your manifest and a ServiceMonitor like the following one, I get "server returned HTTP status 403 Forbidden":


---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  endpoints:
  - interval: 10s
    port: https-metrics
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
  jobLabel: k8s-app
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      k8s-app: metrics-server


Ferdinanddb avatar May 21 '19 19:05 Ferdinanddb

HPA tests require the metrics server and are now in conformance. I want to see if kubeadm should be shipping this by default, but one way or another kind needs to ship the metrics APIs :-) /assign cc @neolit123

BenTheElder avatar Jul 17 '19 00:07 BenTheElder

@BenTheElder my personal vote is -1 on including the metrics server addon as part of kind or kubeadm deployments by default.

neolit123 avatar Jul 17 '19 17:07 neolit123

@neolit123 why? We need metrics APIs to run Kubernetes tests at parity with the existing test setups, and it seems these are commonly available / depended on.

BenTheElder avatar Jul 17 '19 18:07 BenTheElder

Note: we do not aim for mere conformance, but conformance is definitely the minimum bar.

BenTheElder avatar Jul 17 '19 18:07 BenTheElder

definitely out of scope for kubeadm. a metrics system is not an essential dependency. kind(er) might enable it if conformance demands it.

> We need metrics APIs to run Kubernetes tests at parity with the existing test setups, and it seems these are commonly available / depended on

can you show examples?

> HPA tests require the metrics server and are now in conformance,

latest comments suggest that this will be revised, possibly metrics tests will be skipped if no metrics system is present. i think that creating a mock metrics system in the framework is better but someone needs the bandwidth to work on that.

neolit123 avatar Jul 17 '19 18:07 neolit123

> kind(er) might enable it if conformance demands it.

Conformance OR reasonable testing / user usage, including HPA / dashboard / ...

> can you show examples?

The test that was promoted to conformance was already one of the default presubmit tests for Kubernetes (see also https://github.com/kubernetes-sigs/kind/pull/701), eventually we should be able to run ~all of these (relatively trivial things block this currently, such as node exec mechanisms, we're working on it).

> latest comments suggest that this will be revised, possibly metrics tests will be skipped if no metrics system is present.

That means we test less. A) currently skipping is banned in conformance and B) we don't want to do less testing with kind because we can't be bothered to configure something that enables a default-enabled v1 core API to function. We only want to skip tests that we can't technically accomplish.

I gather some people think this should not be a core API, but it already is and v1 / default available at this point so that seems like a moot argument.

> i think that creating a mock metrics system in the framework is better but someone needs the bandwidth to work on that.

Metrics can't be in the framework as it's HPA reading them? I don't think we should HPA with fake metrics.

BenTheElder avatar Jul 17 '19 18:07 BenTheElder

these are topics for sig-arch(conformance), testing commons and possibly sig-instrumentation. i personally do not like that HPA requires a metrics system to begin with.

if we have to add the metrics server for kind(er) deployments so be it, but for kubeadm we cannot, as neither HPA nor the metrics server is essential for the "minimal viable cluster" concept.

neolit123 avatar Jul 17 '19 18:07 neolit123

> these are topics for sig-arch(conformance), testing commons and possibly sig-instrumentation. i personally do not like that HPA requires a metrics system to begin with.

HPA requiring a metrics system or not is at best sig-instrumentation, but realistically I don't see that changing. (also unclear that there's anything actually wrong with that).

> if we have to add the metrics server for kind(er) deployments so be it, but for kubeadm we cannot, as neither HPA nor the metrics server is essential for the "minimal viable cluster" concept.

Sure, I can agree that kubeadm does not necessarily need to handle this, but I doubt we're going to significantly change HPA at this point and kind not supporting it seems less useful than kind supporting it, regardless of how individual test cases are handled.

As a kubernetes dev I should be able to legitimately test everything and as an end user I should be able to leverage common APIs when testing / developing my application.

Technically a "minimum viable cluster" for some form of "viable" (conformance?) does not need dynamic PVC either, but in practice not having it has been problematic for lots of real, otherwise relatively portable usage. We'll likely need to fix that for kind as well (though I'd expect it to also be out of scope for kubeadm).

BenTheElder avatar Jul 17 '19 18:07 BenTheElder

> also unclear that there's anything actually wrong with that

reading the docs it seems quite coupled to metrics, so i also don't expect this to change.

> As a kubernetes dev I should be able to legitimately test everything and as an end user I should be able to leverage common APIs when testing / developing my application.

that is true. HPA on its own as a feature seems great to be part of conformance, but i don't think the metrics-server addon requirement should be part of conformance. if yes, then all deployers have to enable it (if not already) - kops, kubeadm, kubespray, etc. i guess this will be decided before 1.16.

> does not need dynamic PVC either

interestingly i have not seen kubeadm feature requests to enable PVC, while we do get the occasional "please enable {PSP|PDB|metric-server|dashboard}".

neolit123 avatar Jul 17 '19 19:07 neolit123

tentatively revisiting this during the next release. might slip to later.

BenTheElder avatar Apr 25 '20 09:04 BenTheElder

If you came here because you want to deploy metrics-server to your kind cluster:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
kubectl patch deployment metrics-server -n kube-system -p '{"spec":{"template":{"spec":{"containers":[{"name":"metrics-server","args":["--cert-dir=/tmp", "--secure-port=4443", "--kubelet-insecure-tls","--kubelet-preferred-address-types=InternalIP"]}]}}}}'

This will deploy metrics-server 0.3.6 and patch the deployment to fix https://github.com/kubernetes-sigs/metrics-server/issues/131. It worked for me using kind 0.7.0.
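
To check whether the patched deployment is actually serving metrics, something along these lines may help. It is a sketch, not part of the upstream instructions: `check_metrics_server` is a made-up helper name and the 120s timeout is arbitrary. The first scrape can take a minute or so, so `kubectl top` may still report missing metrics right after the rollout finishes.

```shell
# Hypothetical helper: wait for the patched Deployment, confirm the
# Metrics API is registered, then query it.
check_metrics_server() {
  # Block until the new metrics-server pod is rolled out (or time out).
  kubectl -n kube-system rollout status deployment/metrics-server --timeout=120s || return 1
  # The APIService should eventually show AVAILABLE=True.
  kubectl get apiservice v1beta1.metrics.k8s.io || return 1
  # Fails with a "metrics not available" style error until the first scrape lands.
  kubectl top nodes
}

# Usage (needs a running cluster, so it is left commented out):
# check_metrics_server
```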

rabenhorst avatar Apr 29 '20 11:04 rabenhorst

I tried the above commands suggested by @rabenhorst but they don't work for me :(

commands:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
kubectl patch deployment metrics-server -n kube-system -p '{"spec":{"template":{"spec":{"containers":[{"name":"metrics-server","args":["--cert-dir=/tmp", "--secure-port=4443", "--kubelet-insecure-tls","--kubelet-preferred-address-types=InternalIP"]}]}}}}'

logs:

I0502 13:59:19.147717       1 serving.go:312] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0502 13:59:19.620092       1 secure_serving.go:116] Serving securely on [::]:4443

After I run kubectl top nodes, these additional logs appear:

E0502 14:00:08.367258       1 reststorage.go:135] unable to fetch node metrics for node "kind-control-plane": no metrics known for node
E0502 14:00:08.367286       1 reststorage.go:135] unable to fetch node metrics for node "kind-worker": no metrics known for node
E0502 14:00:08.367292       1 reststorage.go:135] unable to fetch node metrics for node "kind-worker2": no metrics known for node
E0502 14:00:09.866095       1 reststorage.go:135] unable to fetch node metrics for node "kind-control-plane": no metrics known for node
E0502 14:00:09.866125       1 reststorage.go:135] unable to fetch node metrics for node "kind-worker": no metrics known for node
E0502 14:00:09.866130       1 reststorage.go:135] unable to fetch node metrics for node "kind-worker2": no metrics known for node

I have a simple KinD configuration:

apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
  - role: control-plane
    image: kindest/node:v1.18.0@sha256:0e20578828edd939d25eb98496a685c76c98d54084932f76069f886ec315d694
  - role: worker
    image: kindest/node:v1.18.0@sha256:0e20578828edd939d25eb98496a685c76c98d54084932f76069f886ec315d694
  - role: worker
    image: kindest/node:v1.18.0@sha256:0e20578828edd939d25eb98496a685c76c98d54084932f76069f886ec315d694

KinD version: kind v0.7.0 go1.13.6 darwin/amd64

bygui86 avatar May 02 '20 14:05 bygui86

Just leaving this here since I (and anyone else who might be using Ansible with KinD to deploy metrics-server) will probably find it in search results again at some point: to add the two required options to the official metrics-server manifest prior to deploying it into the cluster, here are a few Ansible tasks you can use. It's hacky, but I like to rely on the official upstream manifest for my testing:

---
- name: Download metrics-server manifest.
  get_url:
    url: https://github.com/kubernetes-sigs/metrics-server/releases/download/{{ metrics_server_version }}/components.yaml
    dest: /tmp/metrics-server.yaml
    mode: 0644

- name: Modify the manifest to allow insecure TLS for testing.
  lineinfile:
    path: /tmp/metrics-server.yaml
    state: present
    regexp: "^.+{{ item }}$"
    line: "          - --{{ item }}"
    insertafter: "^.+args:$"
  with_items:
    - kubelet-preferred-address-types=InternalIP
    - kubelet-insecure-tls

- name: Deploy metrics-server into the cluster.
  community.kubernetes.k8s:
    state: present
    src: /tmp/metrics-server.yaml
    wait: true
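
For anyone not using Ansible, roughly the same manifest edit can be sketched in plain shell. The function name `add_kind_flags` is hypothetical, and it assumes the manifest has an `args:` line on its own with the list items indented two spaces deeper (the tasks above hard-code that indentation instead).

```shell
# Hypothetical helper: insert the two kind-specific flags right after the
# "args:" line of a downloaded metrics-server manifest.
add_kind_flags() {
  local manifest="$1" flag tmp
  for flag in kubelet-insecure-tls kubelet-preferred-address-types=InternalIP; do
    # Skip flags that are already present so the edit can be re-run safely.
    if grep -q -- "--${flag}\$" "$manifest"; then
      continue
    fi
    tmp="$(mktemp)"
    # Copy every line; after an "args:" line, add the flag as a list item
    # indented two spaces deeper than "args:" itself.
    awk -v flag="--${flag}" '
      { print }
      /^[ \t]*args:[ \t]*$/ {
        match($0, /^[ \t]*/)
        printf "%s  - %s\n", substr($0, 1, RLENGTH), flag
      }
    ' "$manifest" > "$tmp" && mv "$tmp" "$manifest"
  done
}

# Usage (needs network access and a cluster, so it is left commented out):
# curl -sSL -o /tmp/metrics-server.yaml https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
# add_kind_flags /tmp/metrics-server.yaml
# kubectl apply -f /tmp/metrics-server.yaml
```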

geerlingguy avatar Aug 27 '20 16:08 geerlingguy