metrics-server
Document securing connection between Metrics Server <-> Kubelet
What would you like to be added: Step-by-step instructions for beginners on how to configure TLS from Master to Workers.
Why is this needed: All the tickets that I have found, e.g. "x509: certificate signed by unknown authority metrics-server" or "Metrics server issue with hostname resolution of kubelet and apiserver unable to communicate with metric-server clusterIP #131", all use the --kubelet-insecure-tls flag.
I have spent 2 days now trying to figure out how to set it up, with no luck so far.
I think a tutorial with step-by-step instructions would be a good addition.
/kind feature
Same question! I tried this and this.
I spent 3 days. Result:
x509: certificate signed by unknown authority
Hello @MatthewPattell ,
I downloaded the latest patch and the problem seems to be fixed. I am also running Calico as the network plugin (I do not know if this affects the solution, but keep it in mind).
Give it a try and let us know if this works for you as well.
BR / Thanos
@thanos1983 could you explain more about how to download the latest patch? I tried the container image gcr.io/k8s-staging-metrics-server/metrics-server:master, but it is still not working :(
Part of my deployment:
volumes:
- name: tmp-dir
  emptyDir: {}
- configMap:
    defaultMode: 420
    name: ca-certs
  name: ca-dir
containers:
- name: metrics-server
  image: k8s.gcr.io/metrics-server/metrics-server:v0.3.7
  imagePullPolicy: IfNotPresent
  args:
    - --cert-dir=/tmp
    - --secure-port=4443
    # - --kubelet-insecure-tls
    - --kubelet-preferred-address-types=Hostname,InternalIP,ExternalIP
    - --kubelet-certificate-authority=/ca/ca.crt
    - --tls-cert-file=/ca/apiserver.crt
    - --tls-private-key-file=/ca/apiserver.key
  ports:
  - name: main-port
    containerPort: 4443
    protocol: TCP
  securityContext:
    readOnlyRootFilesystem: true
    runAsNonRoot: true
    runAsUser: 1000
  volumeMounts:
  - name: tmp-dir
    mountPath: /tmp
  - mountPath: /ca
    name: ca-dir
My logs:
server.go:132] unable to fully scrape metrics: [unable to fully scrape metrics from node kubernetes.sample.com: unable to fetch metrics from node kubernetes.sample.com: Get "https://kubernetes.sample.com:10250/stats/summary?only_cpu_and_memory=true": x509: certificate signed by unknown authority, unable to fully scrape metrics from node master-kubernetes.sample.com: unable to fetch metrics from node master-kubernetes.sample.com: Get "https://master-kubernetes.sample.com:10250/stats/summary?only_cpu_and_memory=true": x509: certificate signed by unknown authority]
Hello @MatthewPattell ,
The new version 0.3.7 seems to be working fine for me out of the box. I do not need to pass all those parameters:
args:
  - --cert-dir=/tmp
  - --secure-port=4443
  # - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=Hostname,InternalIP,ExternalIP
  - --kubelet-certificate-authority=/ca/ca.crt
  - --tls-cert-file=/ca/apiserver.crt
  - --tls-private-key-file=/ca/apiserver.key
- mountPath: /ca # also this
  name: ca-dir
My file is the default sample as shipped:
args:
  - --cert-dir=/tmp
  - --secure-port=4443
What version of kubectl are you running?
BR / Thanos
@thanos1983 my kubernetes version:
kubelet --version
Kubernetes v1.19.3
kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.8", GitCommit:"9f2892aab98fe339f3bd70e3c470144299398ace", GitTreeState:"clean", BuildDate:"2020-08-13T16:12:48Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:41:49Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
I am not experienced enough with Kubernetes to give advice, but I am not sure you should be running different client and server versions. Take a look at this; it might become a problem in the future.
Regarding the image, did it work for you after the proposed configuration?
@thanos1983 I don't think the different client version is the problem. I tried the proposed configuration and it does not work for me :(
TLS configuration depends on the defaults set by the Kubernetes distribution you are using and on which options you have overridden (neither the K8s version nor the Metrics Server version has an impact). Some distributions use self-signed certificates in the kubelet, some use a separate CA from the apiserver, and on some TLS for Metrics Server works out of the box.
I don't think the Metrics Server documentation can do anything better than asking its users to understand how the CA is configured in their cluster and adapt their configuration accordingly. Trying to document how to fix those problems would require separate documentation per K8s distribution, which would not be maintainable.
Currently we try to list the requirements that users should look into at https://github.com/kubernetes-sigs/metrics-server#requirements. Users should consult the documentation of their K8s distribution to find out how to configure their cluster to fulfill those requirements.
Please let me know if you have any ideas on how we can improve it, and please let's not do debugging in feature request issues.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
/remove-lifecycle frozen
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale /lifecycle frozen
I would like to know this as well. I'm wondering whether a clean approach is even possible: the kubelet-generated certificates have 0600 permissions, so only the user running the kubelet daemon is able to read them.
https://github.com/kubernetes/kubernetes/blob/v1.23.1/staging/src/k8s.io/client-go/util/certificate/certificate_store.go#L196
@avoidik that's a good question, I think it would be good to create a list of steps required to secure Metrics Server and verify it on some popular K8s distro.
cc @yangjunmyfm192085 @dgrisonnet
In a proper K8s setup, the certificates served by the kubelet are signed by the cluster's main CA. Metrics Server doesn't need access to them. When creating a TLS connection to the kubelet, it should be able to confirm that the served certificates are properly signed.
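To make the "signed by the cluster's main CA" point concrete, here is a minimal, self-contained sketch that needs only the openssl CLI and no cluster. It creates two throwaway CAs and a stand-in "kubelet" serving certificate signed by one of them, then checks the certificate against each CA. All file names and CNs (cluster-ca, other-ca, node-1) are made up for the illustration; this is not how any particular distribution issues its certificates.

```shell
#!/bin/sh
# Illustrative only: throwaway CAs and a leaf certificate in a temp dir.
# File names and CNs are hypothetical and exist just for this demo.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# signed_by CA_FILE CERT_FILE: succeeds only if CERT_FILE chains to CA_FILE.
signed_by() {
    openssl verify -CAfile "$1" "$2" >/dev/null 2>&1
}

# Two unrelated self-signed CAs.
for ca in cluster-ca other-ca; do
    openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
        -subj "/CN=$ca" -keyout "$ca.key" -out "$ca.crt" 2>/dev/null
done

# A stand-in "kubelet" serving certificate, signed by cluster-ca only.
openssl req -newkey rsa:2048 -nodes -subj "/CN=node-1" \
    -keyout kubelet.key -out kubelet.csr 2>/dev/null
openssl x509 -req -in kubelet.csr -days 1 \
    -CA cluster-ca.crt -CAkey cluster-ca.key -CAcreateserial \
    -out kubelet.crt 2>/dev/null

# Verify the same certificate against each CA.
if signed_by cluster-ca.crt kubelet.crt; then
    with_cluster_ca=trusted
else
    with_cluster_ca=untrusted
fi
if signed_by other-ca.crt kubelet.crt; then
    with_other_ca=trusted
else
    with_other_ca=untrusted
fi

echo "against cluster-ca: $with_cluster_ca"
echo "against other-ca:   $with_other_ca"
```

The second check should fail, which is the same situation a client is in when the kubelet serves a self-signed certificate or one issued by a CA the client was not given: the TLS handshake ends with "x509: certificate signed by unknown authority".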
I just wanted to reuse the same certificates specifically issued for/by kubelet, but it seems I would need to have another pair of certificates for metrics-server
Generally, we do not need to add flags manually to configure the certificate that metrics-server requires to access the kubelet. As @serathius said, in a proper K8s setup the certificates served by the kubelet are signed by the cluster's main CA. Metrics Server doesn't need access to them. When creating a TLS connection to the kubelet, it should be able to confirm that the served certificates are properly signed.
But if you do want to configure the certificate manually, you run into the issue you described.
My question is, do you really need to configure the certificate manually? It is not recommended.
/cc @sanwishe, can you help analyze this: if the metrics-server pod accesses the certificate, what permissions are required?
OK, I am working on this.
Has any progress been made, here or elsewhere, since @sanwishe commented on January 19?
I cannot make this work without using --kubelet-insecure-tls.
This is what shows up in the API server logs.
2023-04-16 21:18:10.865Z E0416 21:18:10.865645 1 controller.go:116] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: error trying to reach service: x509: certificate signed by unknown authority
I created a root certificate authority and placed its certificate in /etc/kubernetes/pki/ca.crt and its key in /etc/kubernetes/pki/ca.key.
I then ran:
kubeadm init --config kubeadm-init.yml
kubeadm-init.yml:
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
bootstrapTokens:
- token: "<my token>"
  description: "kubeadm bootstrap token"
  ttl: "1h"
  groups:
  - system:bootstrappers:kubeadm:default-node-token
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: 1.26.0
certificatesDir: /etc/kubernetes/pki
networking:
  podSubnet: 10.244.0.0/16
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
serverTLSBootstrap: true
I created a metrics-server.crt and metrics-server.key that I mount in the metrics-server pod, and I run it with the following arguments. Note that --cert-dir=/tmp and --kubelet-insecure-tls are commented out.
- args:
  # - --cert-dir=/tmp
  - --client-ca-file=/etc/kubernetes/pki/ca.crt
  - --secure-port=4443
  - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
  - --kubelet-use-node-status-port
  - --metric-resolution=15s
  # - --kubelet-insecure-tls
  - --kubelet-certificate-authority=/etc/kubernetes/pki/ca.crt
  - --tls-cert-file=/etc/kubernetes/pki/metrics-server.crt
  - --tls-private-key-file=/etc/kubernetes/pki/metrics-server.key
Here is the metrics-server CSR configuration (cfssl):
{
  "CN": "metrics-server",
  "hosts": [
    "metrics-server.kube-system",
    "metrics-server.kube-system.svc",
    "metrics-server.kube-system.svc.cluster.local",
    "localhost"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  }
}
I don't understand why I'm getting "Body: error trying to reach service: x509: certificate signed by unknown authority" in the API server. I don't get this error when running certigo (after installing the certificate to /usr/local/share/ca-certificates/):
./certigo connect metrics-server.kube-system.svc.cluster.local:443 # no issues with this
Which process is reaching the metrics server? The api server directly? If so, which certificate is it complaining about? The one sent by the metrics server to the api server when it acts as a client? Or the other way around? We need docs about this, maybe some network schema that shows the certificates on it.
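One hedged pointer on the direction of that connection: with the Kubernetes aggregation layer, kube-apiserver itself proxies requests for v1beta1.metrics.k8s.io to the metrics-server Service, and how it verifies the certificate metrics-server presents is governed by the APIService object's spec.insecureSkipTLSVerify and spec.caBundle fields (with no caBundle, the API server's system trust roots are used). The sketch below only classifies those two values; the function name and the mode labels are made up for illustration, and whether this explains the error above is an assumption, not a diagnosis.

```shell
#!/bin/sh
# apiservice_tls_mode INSECURE_SKIP CA_BUNDLE
# Classifies how kube-apiserver would verify the serving certificate of an
# aggregated API, given two fields of its APIService spec.
# The function name and the printed labels are hypothetical.
apiservice_tls_mode() {
    if [ "$1" = "true" ]; then
        # spec.insecureSkipTLSVerify: true -> serving cert is not verified
        echo "skip-verify"
    elif [ -n "$2" ]; then
        # spec.caBundle set -> serving cert must chain to this bundle
        echo "ca-bundle"
    else
        # neither set -> system trust roots on the API server are used
        echo "system-roots"
    fi
}

# Usage against a live cluster (not executed here):
#   skip=$(kubectl get apiservice v1beta1.metrics.k8s.io \
#       -o jsonpath='{.spec.insecureSkipTLSVerify}')
#   bundle=$(kubectl get apiservice v1beta1.metrics.k8s.io \
#       -o jsonpath='{.spec.caBundle}')
#   apiservice_tls_mode "$skip" "$bundle"
```

If the result is "system-roots" or "ca-bundle" and the certificate passed via --tls-cert-file was issued by a CA that is not in that trust set, an "unknown authority" error on the API server side would be the expected outcome, even though a client on another machine with the CA installed connects fine.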
Hi there. I want to share my story. TL;DR: I did it xD
I have a bare-metal cluster running k8s v1.28.2. It was set up using kubeadm.
My goal was to set up MetalLB, which requires the metrics API on k8s. I found that there are 2 main solutions, Metrics Server (MS) and Prometheus-adapter. I chose the first as the simpler one.
After applying the Helm chart I saw the error mentioned above. After trying some random stuff I decided to check the HTTPS certs myself and verify them.
I tried to verify the kubelet HTTPS server cert against the CA taken from the ConfigMap:
kubectl get cm -n kube-system extension-apiserver-authentication -o json | jq -r ".data[\"client-ca-file\"]" | openssl x509 > ../client-ca.pem
openssl verify -verbose -CAfile client-ca.pem master-4-cluster1682686205-chain.pem
verification failed
Then I found this
A kubelet also can use serving certificates. The kubelet itself exposes an https endpoint for certain features. To secure these, the kubelet can do one of:
- use provided key and certificate, via the --tls-private-key-file and --tls-cert-file flags
- create self-signed key and certificate, if a key and certificate are not provided
- request serving certificates from the cluster server, via the CSR API
The client certificate provided by TLS bootstrapping is signed, by default, for client auth only, and thus cannot be used as serving certificates, or server auth.
However, you can enable its server certificate, at least partially, via certificate rotation.
So I updated kubelet cm
kubectl edit cm -n kube-system kubelet-config
...
serverTLSBootstrap: true
...
and updated all the nodes with
sudo kubeadm upgrade node phase kubelet-config
sudo systemctl restart kubelet.service
After that I found new certificate signing requests:
k get csr -n kube-system
I approved all of them using the kubectl certificate approve ... command, and then....
... magic happened: MS reached the node metrics endpoints, whose SSL certs are now signed by the CA that MS took from the extension-apiserver-authentication config.
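The approval step above can be sketched as a small helper that picks the pending kubelet serving requests out of the `kubectl get csr` table. This is a sketch under assumptions: the function name is made up, and it relies on the table layout of recent kubectl versions (NAME, AGE, SIGNERNAME, ..., CONDITION as the last column). CSRs are cluster-scoped, so no namespace flag is needed.

```shell
#!/bin/sh
# pending_kubelet_serving_csrs: reads `kubectl get csr` table output on stdin
# and prints the names of Pending requests for the kubelet-serving signer.
# Assumes SIGNERNAME is the third column and CONDITION is the last one.
# The function name is hypothetical.
pending_kubelet_serving_csrs() {
    awk 'NR > 1 && $3 == "kubernetes.io/kubelet-serving" && $NF == "Pending" { print $1 }'
}

# Usage against a live cluster (not executed here). Review the list first:
#   kubectl get csr | pending_kubelet_serving_csrs
# Then approve each listed request (xargs -r is a GNU extension):
#   kubectl get csr | pending_kubelet_serving_csrs \
#       | xargs -r -n 1 kubectl certificate approve
```

Approving a serving CSR vouches for the node's identity, so it is worth checking that each requestor is a node you expect before approving in bulk.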