Document securing connection between Metrics Server <-> Kubelet

Open thanos1983 opened this issue 4 years ago • 23 comments

What would you like to be added: Detailed, step-by-step instructions for beginners on how to configure TLS between the master and the workers.

Why is this needed: All the tickets that I have found, e.g. "x509: certificate signed by unknown authority metrics-server" or "Metrics server issue with hostname resolution of kubelet and apiserver unable to communicate with metric-server clusterIP" (#131), use the --kubelet-insecure-tls flag.

I have spent 2 days now trying to figure out how to set it up, but with no luck so far.

I think a tutorial with detailed steps would be a good addition.

/kind feature

thanos1983 avatar Aug 20 '20 09:08 thanos1983

Same question! I tried this and this.

I spent 3 days. Result:

 x509: certificate signed by unknown authority

MatthewPattell avatar Oct 21 '20 16:10 MatthewPattell

Hello @MatthewPattell ,

I downloaded the latest patch and the problem seems to be fixed. I am also running Calico as the network plugin (I do not know whether this affects the solution, but keep it in mind).

Give it a try and let us know if this works for you as well.

BR / Thanos

thanos1983 avatar Oct 22 '20 06:10 thanos1983

@thanos1983 could you explain in more detail how to download the latest patch? I tried to use the container image gcr.io/k8s-staging-metrics-server/metrics-server:master, but it is still not working :( Part of my deployment:

      volumes:
        - name: tmp-dir
          emptyDir: {}
        - configMap:
            defaultMode: 420
            name: ca-certs
          name: ca-dir
      containers:
        - name: metrics-server
          image: k8s.gcr.io/metrics-server/metrics-server:v0.3.7
          imagePullPolicy: IfNotPresent
          args:
            - --cert-dir=/tmp
            - --secure-port=4443
#            - --kubelet-insecure-tls
            - --kubelet-preferred-address-types=Hostname,InternalIP,ExternalIP
            - --kubelet-certificate-authority=/ca/ca.crt
            - --tls-cert-file=/ca/apiserver.crt
            - --tls-private-key-file=/ca/apiserver.key
          ports:
            - name: main-port
              containerPort: 4443
              protocol: TCP
          securityContext:
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 1000
          volumeMounts:
            - name: tmp-dir
              mountPath: /tmp
            - mountPath: /ca
              name: ca-dir

My logs:

server.go:132] unable to fully scrape metrics: [unable to fully scrape metrics from node kubernetes.sample.com: unable to fetch metrics from node kubernetes.sample.com: Get "https://kubernetes.sample.com:10250/stats/summary?only_cpu_and_memory=true": x509: certificate signed by unknown authority, unable to fully scrape metrics from node master-kubernetes.sample.com: unable to fetch metrics from node master-kubernetes.sample.com: Get "https://master-kubernetes.sample.com:10250/stats/summary?only_cpu_and_memory=true": x509: certificate signed by unknown authority]

MatthewPattell avatar Oct 22 '20 07:10 MatthewPattell

Hello @MatthewPattell ,

The new version 0.3.7 seems to be working fine for me out of the box. I do not need to pass all those parameters:

args:
  - --cert-dir=/tmp
  - --secure-port=4443
#            - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=Hostname,InternalIP,ExternalIP
  - --kubelet-certificate-authority=/ca/ca.crt
  - --tls-cert-file=/ca/apiserver.crt
  - --tls-private-key-file=/ca/apiserver.key
- mountPath: /ca # also this
  name: ca-dir

I use the file as it comes in the default sample:

args:
  - --cert-dir=/tmp
  - --secure-port=4443

What version of kubectl are you running?

BR / Thanos

thanos1983 avatar Oct 22 '20 08:10 thanos1983

@thanos1983 my kubernetes version:

kubelet --version
Kubernetes v1.19.3

kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.8", GitCommit:"9f2892aab98fe339f3bd70e3c470144299398ace", GitTreeState:"clean", BuildDate:"2020-08-13T16:12:48Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:41:49Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}

MatthewPattell avatar Oct 22 '20 09:10 MatthewPattell

I am not experienced enough with Kubernetes to give advice, but I am not sure whether you should run different versions of the client and the server. Take a look at this; it might end up being a problem in the future.

Regarding the image, did it work for you with the proposed configuration?

thanos1983 avatar Oct 22 '20 09:10 thanos1983

@thanos1983 I don't think the different client version is the problem. I tried the proposed configuration; it does not work for me :(

MatthewPattell avatar Oct 22 '20 09:10 MatthewPattell

TLS configuration depends on how the Kubernetes distribution you're using sets its defaults and on which options you have overridden (the K8s version and the Metrics Server version have no impact). Some distributions use self-signed certificates in the kubelet, some use a different CA than the apiserver, and on some, TLS for Metrics Server works out of the box.

I don't think the Metrics Server documentation can do anything better than asking its users to understand how the CA is configured in their cluster and adapt their configuration accordingly. Trying to document how to fix those problems would require separate documentation per K8s distribution, which would not be maintainable.

Currently we try to list the requirements that users should look into at https://github.com/kubernetes-sigs/metrics-server#requirements. Users should consult the documentation of their K8s distribution to find out how to configure their cluster to fulfill those requirements.

Please let me know if you have any ideas on how we can improve it and please let's not do debugging in feature request issues.

serathius avatar Nov 15 '20 13:11 serathius

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale

fejta-bot avatar Feb 13 '21 14:02 fejta-bot

/remove-lifecycle stale

serathius avatar Feb 13 '21 14:02 serathius

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale

fejta-bot avatar May 14 '21 14:05 fejta-bot

/remove-lifecycle stale

serathius avatar Jun 08 '21 16:06 serathius

/remove-lifecycle frozen

serathius avatar Jun 08 '21 16:06 serathius

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Sep 06 '21 17:09 k8s-triage-robot

/remove-lifecycle stale /lifecycle frozen

serathius avatar Oct 02 '21 09:10 serathius

I would like to know this as well. I'm wondering whether a clean approach is even possible: the kubelet-generated certificates have 0600 permissions, so only the user running the kubelet daemon is able to read them.

https://github.com/kubernetes/kubernetes/blob/v1.23.1/staging/src/k8s.io/client-go/util/certificate/certificate_store.go#L196

avoidik avatar Jan 17 '22 09:01 avoidik

@avoidik that's a good question, I think it would be good to create a list of steps required to secure Metrics Server and verify it on some popular K8s distro.

cc @yangjunmyfm192085 @dgrisonnet

In a proper K8s setup, the certificates served by the kubelet are signed by the cluster's main CA. Metrics Server doesn't need access to them. When creating a TLS connection to the kubelet, it should be able to confirm that the served certificates are properly signed.
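
A minimal sketch of that check, using a throwaway CA created with openssl in place of a real cluster CA. All file names and subject names (demo-cluster-ca, demo-node) and the check_against_ca helper are hypothetical; the point is only to show why a certificate signed by the trusted CA passes verification while a self-signed one does not.

```shell
#!/bin/sh
# Sketch only: a throwaway CA stands in for the cluster CA; nothing here touches a cluster.
set -e
work=$(mktemp -d)

# Throwaway "cluster CA" (hypothetical name).
openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj "/CN=demo-cluster-ca" \
  -keyout "$work/ca.key" -out "$work/ca.crt" 2>/dev/null

# Serving certificate signed by that CA, as when kubelet certificates come from the cluster CA.
openssl req -new -newkey rsa:2048 -nodes -subj "/CN=demo-node" \
  -keyout "$work/signed.key" -out "$work/signed.csr" 2>/dev/null
openssl x509 -req -in "$work/signed.csr" -CA "$work/ca.crt" -CAkey "$work/ca.key" \
  -CAcreateserial -days 1 -out "$work/signed.crt" 2>/dev/null

# Self-signed serving certificate, as when no certificate was issued by the CA.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj "/CN=demo-node" \
  -keyout "$work/self.key" -out "$work/self.crt" 2>/dev/null

# Hypothetical helper: does the certificate ($2) chain up to the CA file ($1)?
check_against_ca() {
  if openssl verify -CAfile "$1" "$2" >/dev/null 2>&1; then
    echo "trusted"
  else
    echo "untrusted"
  fi
}

echo "CA-signed:   $(check_against_ca "$work/ca.crt" "$work/signed.crt")"
echo "self-signed: $(check_against_ca "$work/ca.crt" "$work/self.crt")"
```

The second case corresponds to the "x509: certificate signed by unknown authority" situation: the client holds a CA file, but the certificate it is shown was not issued by that CA.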

serathius avatar Jan 17 '22 10:01 serathius

I just wanted to reuse the same certificates specifically issued for/by kubelet, but it seems I would need to have another pair of certificates for metrics-server

avoidik avatar Jan 17 '22 10:01 avoidik

Generally, we do not need to add flags manually to configure the certificate that metrics-server needs to access the kubelet. As @serathius said, in a proper K8s setup the certificates served by the kubelet are signed by the cluster's main CA, and Metrics Server doesn't need access to them. When creating a TLS connection to the kubelet, it should be able to confirm that the served certificates are properly signed.

If you do want to configure the certificate manually, you run into the issue you described. My question is: do you really need to configure the certificate manually? It is not recommended.

/cc @sanwishe, can you help analyze this? If the metrics-server pod accesses the certificate, what permissions are required?

yangjunmyfm192085 avatar Jan 18 '22 12:01 yangjunmyfm192085

OK, I am working on this.

sanwishe avatar Jan 19 '22 10:01 sanwishe

Has any progress been made, here or elsewhere, since @sanwishe commented on January 19?

mprimeaux avatar Sep 17 '22 23:09 mprimeaux

I cannot make this work without using --kubelet-insecure-tls

This is what shows up in the API server logs.

2023-04-16 21:18:10.865Z E0416 21:18:10.865645       1 controller.go:116] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: error trying to reach service: x509: certificate signed by unknown authority

I created a root certificate authority and placed it in /etc/kubernetes/pki/ca.crt and a key /etc/kubernetes/pki/ca.key

I then run kubeadm init --config kubeadm-init.yml

kubeadm-init.yml:

apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
bootstrapTokens:
  - token: "<my token>"
    description: "kubeadm bootstrap token"
    ttl: "1h"
    groups: 
      - system:bootstrappers:kubeadm:default-node-token
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: 1.26.0
certificatesDir: /etc/kubernetes/pki
networking:
  podSubnet: 10.244.0.0/16
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
serverTLSBootstrap: true

I created a metrics-server.crt and metrics-server.key that I mount on the metrics server pod, and I run it with the following arguments. Note that --cert-dir=/tmp and --kubelet-insecure-tls are commented out.

      - args:
        # - --cert-dir=/tmp
        - --client-ca-file=/etc/kubernetes/pki/ca.crt
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        # - --kubelet-insecure-tls
        - --kubelet-certificate-authority=/etc/kubernetes/pki/ca.crt
        - --tls-cert-file=/etc/kubernetes/pki/metrics-server.crt
        - --tls-private-key-file=/etc/kubernetes/pki/metrics-server.key

Here is the metrics-server CSR configuration (cfssl):

{
    "CN": "metrics-server",
    "hosts": [
        "metrics-server.kube-system",
        "metrics-server.kube-system.svc",
        "metrics-server.kube-system.svc.cluster.local",
        "localhost"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    }
}

I don't understand why I'm getting Body: error trying to reach service: x509: certificate signed by unknown authority in the API server. I don't get it when running certigo (after installing the certificate to /usr/local/share/ca-certificates/):

./certigo connect metrics-server.kube-system.svc.cluster.local:443 # no issues with this

Which process is reaching the metrics server? The api server directly? If so, which certificate is it complaining about? The one sent by the metrics server to the api server when it acts as a client? Or the other way around? We need docs about this, maybe some network schema that shows the certificates on it.
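
On the questions above: in the Kubernetes aggregation layer, the kube-apiserver is the client that connects to Metrics Server, and it checks the certificate Metrics Server serves against the caBundle field of the v1beta1.metrics.k8s.io APIService (unless insecureSkipTLSVerify is set there). A hedged sketch of reproducing that check by hand follows. The verify_with_ca_bundle helper and the served.pem file name are hypothetical, and the kubectl and s_client lines in the comments are assumed usage that has not been run against this cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl or Metrics Server): checks whether a PEM
# certificate chains up to a base64-encoded CA bundle, the encoding used by the
# caBundle field of an APIService object.
verify_with_ca_bundle() {
  bundle_b64=$1   # base64-encoded PEM CA bundle
  cert_file=$2    # PEM certificate presented by the server
  ca_file=$(mktemp)
  # Decode the bundle into a temporary PEM file (-d as in GNU coreutils base64).
  printf '%s' "$bundle_b64" | base64 -d > "$ca_file"
  if openssl verify -CAfile "$ca_file" "$cert_file" >/dev/null 2>&1; then
    result="trusted"
  else
    result="unknown authority"
  fi
  rm -f "$ca_file"
  echo "$result"
}

# Possible inputs on a real cluster (assumed commands, not run here):
#   kubectl get apiservice v1beta1.metrics.k8s.io -o jsonpath='{.spec.caBundle}'
#   openssl s_client -connect metrics-server.kube-system.svc:443 </dev/null 2>/dev/null \
#     | openssl x509 -out served.pem
```

If the helper reports "unknown authority" for the certificate Metrics Server serves, the CA in the APIService and the CA that signed metrics-server.crt are probably not the same, regardless of which CAs are installed in the system trust store.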

shellwhale avatar Apr 16 '23 21:04 shellwhale

Hi there. I want to share my story. TL;DR: I did it xD

I have a bare-metal cluster running k8s v1.28.2. It was set up using kubeadm. My goal was to set up MetalLB, which requires the metrics API on k8s. I found that there are 2 main solutions, MS and Prometheus-adapter. I chose the 1st as the simpler one.

After applying the helm chart I saw the error mentioned above. After trying some random stuff, I wanted to check the HTTPS certs myself and even verify them.

I tried to verify the kubelet HTTPS server cert with the CA taken from the ConfigMap:

kubectl get cm -n kube-system extension-apiserver-authentication -o json | jq -r ".data[\"client-ca-file\"]" | openssl x509 > ../client-ca.pem

openssl verify -verbose -CAfile client-ca.pem  master-4-cluster1682686205-chain.pem
verification failed

Then I found this

A kubelet also can use serving certificates. The kubelet itself exposes an https endpoint for certain features. To secure these, the kubelet can do one of:

  • use provided key and certificate, via the --tls-private-key-file and --tls-cert-file flags
  • create self-signed key and certificate, if a key and certificate are not provided
  • request serving certificates from the cluster server, via the CSR API

The client certificate provided by TLS bootstrapping is signed, by default, for client auth only, and thus cannot be used as serving certificates, or server auth.

However, you can enable its server certificate, at least partially, via certificate rotation.

So I updated the kubelet ConfigMap with kubectl edit cm -n kube-system kubelet-config:

...
serverTLSBootstrap: true
...

and updated all the nodes with:

sudo kubeadm upgrade node phase kubelet-config
sudo systemctl restart kubelet.service

After that I found new certificate requests with k get csr -n kube-system

I approved all of them using kubectl certificate approve ... command, and then....

... magic happened, MS reached node metrics endpoints with SSL certs signed by CA that MS took from extension-apiserver-authentication config.
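
The listing and approval steps above could be sketched as follows. This is a hedged illustration, not a tested procedure for any particular cluster: the pending_kubelet_serving_csrs helper is hypothetical, the sample rows are invented, and it assumes the kubelet serving requests carry the kubernetes.io/kubelet-serving signer name and that the first column is the CSR name and the last column is its condition.

```shell
#!/bin/sh
# Hypothetical helper: reads rows in the layout of `kubectl get csr --no-headers`
# from stdin and prints the names of pending kubelet serving-certificate requests.
pending_kubelet_serving_csrs() {
  awk '{
    is_serving = 0
    # The signer column position differs between kubectl versions, so scan all fields.
    for (i = 1; i <= NF; i++) if ($i == "kubernetes.io/kubelet-serving") is_serving = 1
    if (is_serving && $NF == "Pending") print $1
  }'
}

# Made-up sample rows for illustration only (names, ages, and nodes are invented).
sample='csr-aaaaa   12s   kubernetes.io/kubelet-serving                 system:node:worker-1   <none>   Pending
csr-bbbbb   3d    kubernetes.io/kube-apiserver-client-kubelet   system:node:worker-1   <none>   Approved,Issued
csr-ccccc   9s    kubernetes.io/kubelet-serving                 system:node:master-1   <none>   Pending'

# Prints the two pending serving requests from the sample.
printf '%s\n' "$sample" | pending_kubelet_serving_csrs

# On a real cluster (assumed usage, not run here):
#   for name in $(kubectl get csr --no-headers | pending_kubelet_serving_csrs); do
#     kubectl certificate approve "$name"
#   done
```

Approving requests in a loop skips the manual review, so it is worth checking the requestor and the requested names of each CSR first, for example with kubectl describe csr, before approving anything.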

5n00p4eg avatar Nov 21 '23 09:11 5n00p4eg