
Shows apiserver client certificate expiring, kube controller-manager and scheduler down on fresh 1.20 cluster

What happened?

Originally posted at https://github.com/kubernetes/kops/issues/11211

The Prometheus dashboard shows 5 alerts firing:

name: Watchdog
expr: vector(1)
labels:
  severity: none
annotations:
  message: This is an alert meant to ensure that the entire alerting pipeline is functional.
  This alert is always firing, therefore it should always be firing in Alertmanager
  and always fire against a receiver. There are integrations with various notification
  mechanisms that send a notification when this alert is not firing. For example the
  "DeadMansSnitch" integration in PagerDuty.
  runbook_url: https://github.com/prometheus-operator/kube-prometheus/wiki/watchdog
name: KubeClientCertificateExpiration
expr: apiserver_client_certificate_expiration_seconds_count{job="apiserver"} > 0 and on(job) histogram_quantile(0.01, sum by(job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job="apiserver"}[5m]))) < 604800
labels:
  severity: warning
annotations:
  description: A client certificate used to authenticate to the apiserver is expiring in less than 7.0 days.
  runbook_url: https://github.com/prometheus-operator/kube-prometheus/wiki/kubeclientcertificateexpiration
  summary: Client certificate is about to expire.
name: KubeClientCertificateExpiration
expr: apiserver_client_certificate_expiration_seconds_count{job="apiserver"} > 0 and on(job) histogram_quantile(0.01, sum by(job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job="apiserver"}[5m]))) < 86400
labels:
  severity: critical
annotations:
  description: A client certificate used to authenticate to the apiserver is expiring in less than 24.0 hours.
  runbook_url: https://github.com/prometheus-operator/kube-prometheus/wiki/kubeclientcertificateexpiration
  summary: Client certificate is about to expire.
name: KubeControllerManagerDown
expr: absent(up{job="kube-controller-manager"} == 1)
for: 15m
labels:
  severity: critical
annotations:
  description: KubeControllerManager has disappeared from Prometheus target discovery.
  runbook_url: https://github.com/prometheus-operator/kube-prometheus/wiki/kubecontrollermanagerdown
  summary: Target disappeared from Prometheus target discovery.
name: KubeSchedulerDown
expr: absent(up{job="kube-scheduler"} == 1)
for: 15m
labels:
  severity: critical
annotations:
  description: KubeScheduler has disappeared from Prometheus target discovery.
  runbook_url: https://github.com/prometheus-operator/kube-prometheus/wiki/kubeschedulerdown
  summary: Target disappeared from Prometheus target discovery.

Did you expect to see something different?

The Prometheus dashboard should not show any pending or firing alerts, only inactive ones.

How to reproduce it (as minimally and precisely as possible):

Create a fresh AWS cluster using kops.

# Create cluster.
$ kops create cluster \
    --node-count 2 \
    --master-count 1 \
    --zones eu-central-1a \
    --master-zones eu-central-1a \
    --networking calico \
    ${NAME}
$ kops update cluster --name XXX --yes --admin
# Install kube-prometheus.
$ git clone https://github.com/prometheus-operator/kube-prometheus.git
$ cd kube-prometheus
$ kubectl create -f manifests/setup
$ kubectl create -f manifests/
$ kubectl --namespace monitoring port-forward svc/prometheus-k8s 9090
=> navigate to `http://localhost:9090/alerts`

Environment

kops: 1.20 k8s: 1.20 provider: AWS

  • Prometheus Operator version:

quay.io/prometheus-operator/prometheus-operator:v0.46.0

  • Kubernetes version information:
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-08T21:16:14Z", GoVersion:"go1.16.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.5", GitCommit:"6b1d87acf3c8253c123756b9e61dac642678305f", GitTreeState:"clean", BuildDate:"2021-03-18T01:02:01Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes cluster kind:

See above instructions.

  • Manifests:

Using master branch manifests. See above instructions.

  • Prometheus Operator Logs:
  • Prometheus Logs:

Anything else we need to know?:

xpepermint avatar Apr 12 '21 11:04 xpepermint

Going through the list:

  • Watchdog should be firing and that is expected. The alert description explains why and what the purpose of this alert is.
  • KubeControllerManagerDown firing means you need to configure Prometheus access to the k8s Controller Manager or disable it (use this platform addon: https://github.com/prometheus-operator/kube-prometheus/blob/main/jsonnet/kube-prometheus/platforms/kops.libsonnet)
  • KubeSchedulerDown - same as above but for the k8s Scheduler
  • both KubeClientCertificateExpiration alerts mean you have client certificates that need to be renewed. This looks like something on the kops side of things. If those certificates are auto-refreshed, then the alert should be removed/silenced.

paulfantom avatar Apr 12 '21 11:04 paulfantom

I followed the README section on customization and managed to build the manifest files. I redeployed the manifests but I don't see any changes. Did I include the kops platform addon correctly?

local kp =
  (import 'kube-prometheus/main.libsonnet') +
  // Uncomment the following imports to enable its patches
  // (import 'kube-prometheus/addons/anti-affinity.libsonnet') +
  // (import 'kube-prometheus/addons/managed-cluster.libsonnet') +
  // (import 'kube-prometheus/addons/node-ports.libsonnet') +
  // (import 'kube-prometheus/addons/static-etcd.libsonnet') +
  // (import 'kube-prometheus/addons/custom-metrics.libsonnet') +
  // (import 'kube-prometheus/addons/external-metrics.libsonnet') +
  {
    values+:: {
      kubePrometheus+: {
        platform: 'kops',
      },
      common+: {
        namespace: 'monitoring',
      },
    },
  };

{ 'setup/0namespace-namespace': kp.kubePrometheus.namespace } +
{
  ['setup/prometheus-operator-' + name]: kp.prometheusOperator[name]
  for name in std.filter((function(name) name != 'serviceMonitor' && name != 'prometheusRule'), std.objectFields(kp.prometheusOperator))
} +
// serviceMonitor and prometheusRule are separated so that they can be created after the CRDs are ready
{ 'prometheus-operator-serviceMonitor': kp.prometheusOperator.serviceMonitor } +
{ 'prometheus-operator-prometheusRule': kp.prometheusOperator.prometheusRule } +
{ 'kube-prometheus-prometheusRule': kp.kubePrometheus.prometheusRule } +
{ ['alertmanager-' + name]: kp.alertmanager[name] for name in std.objectFields(kp.alertmanager) } +
{ ['blackbox-exporter-' + name]: kp.blackboxExporter[name] for name in std.objectFields(kp.blackboxExporter) } +
{ ['grafana-' + name]: kp.grafana[name] for name in std.objectFields(kp.grafana) } +
{ ['kube-state-metrics-' + name]: kp.kubeStateMetrics[name] for name in std.objectFields(kp.kubeStateMetrics) } +
{ ['kubernetes-' + name]: kp.kubernetesControlPlane[name] for name in std.objectFields(kp.kubernetesControlPlane) } +
{ ['node-exporter-' + name]: kp.nodeExporter[name] for name in std.objectFields(kp.nodeExporter) } +
{ ['prometheus-' + name]: kp.prometheus[name] for name in std.objectFields(kp.prometheus) } +
{ ['prometheus-adapter-' + name]: kp.prometheusAdapter[name] for name in std.objectFields(kp.prometheusAdapter) }

xpepermint avatar Apr 12 '21 13:04 xpepermint

Your code looks good and I just checked it on my machine. Are you on the latest revision of kube-prometheus?


Replacing example.jsonnet from the repository with your code and running make vendor && ./build.sh should produce the following changes in files:

$ git status
On branch main
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   example.jsonnet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        manifests/kubernetes-kubeControllerManagerPrometheusDiscoveryService.yaml
        manifests/kubernetes-kubeDnsPrometheusDiscoveryService.yaml
        manifests/kubernetes-kubeSchedulerPrometheusDiscoveryService.yaml

no changes added to commit (use "git add" and/or "git commit -a")

paulfantom avatar Apr 12 '21 13:04 paulfantom

I tried several things.

  1. Yes, cloning the directory as you explained above creates 3 new files. If I apply the manifests, I see the same alerts as before.
  2. I followed the README guide and cloned the @main branch. The results are the same.

So you say you don't have problems on your k8s and all alerts are green?

xpepermint avatar Apr 12 '21 14:04 xpepermint

Yes, cloning the directory as you explained above creates 3 new files.

Then it is a cluster bootstrap issue as not all endpoints are configured correctly.
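
One quick way to check this is to see whether the discovery Services generated by the kops addon actually resolve to endpoints. A minimal sketch, assuming the Services land in kube-system (which is where the generated discovery manifests usually place them):

# List the discovery Services and their Endpoints for the control-plane components.
kubectl -n kube-system get svc,endpoints | grep -iE 'controller-manager|scheduler'

If the Endpoints objects are empty, the components are not exposing a scrapeable metrics endpoint and the Down alerts will keep firing.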

So you say you don't have problems on your k8s and all alerts are green?

I don't use kops, but yes, all alerts that are expected to be green are green (Watchdog is always firing).

paulfantom avatar Apr 13 '21 07:04 paulfantom

Can you please confirm the k8s version?

xpepermint avatar Apr 13 '21 07:04 xpepermint

I have multiple clusters (but non-kops ones); it works for me on 1.19 and 1.20 :)

paulfantom avatar Apr 13 '21 08:04 paulfantom

I have the same issue with a newly set up 1.19 cluster. With 1.18 (also managed by kops) I had no issue (same kube-prometheus version).

KlavsKlavsen avatar Apr 13 '21 08:04 KlavsKlavsen

I have also noticed that ever since updating to 1.19 (set up by kOps) our kubectl config goes bad after ~1 day, which is "fixed" by updating it using kops export kubecfg --name=mycluster --admin, and I can see that the client cert is indeed a new one every day. I checked that cert:

Issuer: CN = kubernetes
Validity
    Not Before: Apr 11 07:12:42 2021 GMT
    Not After : Apr 14 01:12:42 2021 GMT
Subject: O = system:masters, CN = kubecfg-Klavs Klavsen
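
For reference, that cert can be inspected straight from the kubeconfig along these lines (a sketch; it assumes the first user entry in the kubeconfig is the one in use, adjust the index otherwise):

# Decode the embedded client certificate and print its issuer, subject and validity window.
kubectl config view --raw -o jsonpath='{.users[0].user.client-certificate-data}' \
  | base64 --decode \
  | openssl x509 -noout -issuer -subject -dates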

So it seems kOps 1.19 uses rotating admin certs, and since we talk to the API server all the time, that's what's triggering the alarm.

KlavsKlavsen avatar Apr 13 '21 08:04 KlavsKlavsen

Is there a setting to adjust the mixin alert? Perhaps using some of the buckets set up in the apiserver to handle this by this commit: https://github.com/kubernetes/kubernetes/commit/f90bbc3d6bfba992831eb216161990eae1098ae5

KlavsKlavsen avatar Apr 13 '21 08:04 KlavsKlavsen

Since this also covers the admin cert we use for kubectl access, this alert should unfortunately probably be disabled, as it's not fine-grained enough and we don't want an alert just because some admin didn't renew their local cert (which is what we get now).

KlavsKlavsen avatar Apr 13 '21 08:04 KlavsKlavsen

At least it should be disabled for those wanting to use such auto-rotating certs.

KlavsKlavsen avatar Apr 13 '21 08:04 KlavsKlavsen

It is tweakable via certExpirationWarningSeconds and certExpirationCriticalSeconds. You can apply this as:

kubernetesControlPlane+: {
  mixin+: {
    _config+: {
      certExpirationWarningSeconds: SOME_VALUE,
      certExpirationCriticalSeconds: OTHER_VALUE,
    }
  }
}
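
For concreteness, a sketch of how that fragment could be merged into the example.jsonnet shown earlier (the warning/critical thresholds below are illustrative placeholders, not recommendations):

local kp =
  (import 'kube-prometheus/main.libsonnet') +
  {
    values+:: {
      common+: { namespace: 'monitoring' },
      kubePrometheus+: { platform: 'kops' },
    },
    // Illustrative thresholds: warn at 14 days remaining, go critical at 2 days.
    kubernetesControlPlane+: {
      mixin+: {
        _config+: {
          certExpirationWarningSeconds: 14 * 24 * 3600,
          certExpirationCriticalSeconds: 2 * 24 * 3600,
        },
      },
    },
  };
// ...followed by the same manifest-generation block as in the example.jsonnet above.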

As for alert removal - there is a closed PR about it in https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/550 and an open issue in https://github.com/prometheus-operator/kube-prometheus/issues/881. Plus you still can remove the alert in jsonnet (we are doing this for OpenShift).
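
A minimal sketch of the jsonnet removal route, following the rule-filtering pattern from the kube-prometheus customization docs (it assumes the alert is rendered into the kubernetesControlPlane PrometheusRule, as in the manifests above):

local kp =
  (import 'kube-prometheus/main.libsonnet') +
  {
    values+:: { common+: { namespace: 'monitoring' } },
    kubernetesControlPlane+: {
      prometheusRule+: {
        spec+: {
          // Keep every rule except the KubeClientCertificateExpiration alert.
          groups: std.map(
            function(group) group {
              rules: std.filter(
                function(rule)
                  !std.objectHas(rule, 'alert')
                  || rule.alert != 'KubeClientCertificateExpiration',
                group.rules
              ),
            },
            super.groups
          ),
        },
      },
    },
  };
// ...followed by the same manifest-generation block as in the example.jsonnet above.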

paulfantom avatar Apr 13 '21 08:04 paulfantom

Thinking about it, it would be worth extending https://github.com/prometheus-operator/kube-prometheus/blob/main/jsonnet/kube-prometheus/platforms/kops.libsonnet to make this work OOTB. Would anyone like to create a PR with a fix?

paulfantom avatar Apr 13 '21 10:04 paulfantom

@paulfantom are you talking only about certificates, or the "kube-down" alerts as well? I would create a PR, but I'm pretty fresh in this space, so I'll be happy if a k8s ninja does it instead :).

In the meantime, @hakman proposed a fix which resolves the "expired certificate" alert issues. This change is actually documented here; however, I still think that kube-prometheus should be aware of this and handle it transparently for you. For now, when using kops, make sure you set --admin=87600h when creating a cluster:

$ kops create cluster ...
$ kops update cluster --name $NAME --yes --admin=87600h

The "kube-down" alerts remain a mystery.

xpepermint avatar Apr 13 '21 11:04 xpepermint

Actually, --admin=87600h is to be used with kops export kubecfg --admin=87600h, which is used to generate the kubeconfig for the cluster.

KlavsKlavsen avatar Apr 13 '21 12:04 KlavsKlavsen

It issues a new "admin" client cert to be used for API server calls.

KlavsKlavsen avatar Apr 13 '21 12:04 KlavsKlavsen

Alerts regarding the controller manager and the scheduler should stay, as those are quite important to have. For those to work, the scheduler and controller manager need to be configured during bootstrap (or later), and there is nothing kube-prometheus can do about it apart from documenting the fact. We have such docs for kubeadm, but not for kops (any volunteers for writing it? :slightly_smiling_face:).

As for alerts regarding certificates, if --admin=87600h fixes it, then it should be documented in the same docs as above. If it doesn't fix it, we should either remove the alert or adjust it in the https://github.com/prometheus-operator/kube-prometheus/blob/main/jsonnet/kube-prometheus/platforms/kops.libsonnet file.

paulfantom avatar Apr 13 '21 12:04 paulfantom

@paulfantom @KlavsKlavsen have you seen the comments by @johngmyers:

Those certs are issued per node when running kops 1.19 or later. They will have expiration within seconds of the expiration of the node's apiserver server certificate. Why would you want to monitor them?

The --admin=87600h is pretty dangerous security-wise. That's a pretty long time for an unrevokable user credential.

Hmm, maybe such a check is obsolete and should really be removed.

Why would you want to monitor kube-controller-manager and kube-scheduler separately?

I don't have a good answer to that. Do you? Are we overengineering?

xpepermint avatar Apr 13 '21 15:04 xpepermint

kOps certificate management changed substantially in 1.19. The release notes state:

The lifetimes of certificates used by various components have been substantially reduced. The certificates on a node will expire sometime between 455 and 485 days after the node's creation. The expiration times vary randomly so that nodes are likely to have their certs expire at different times than other nodes.

I intentionally designed it so that all of a given node's certificates expire around the same time. (And that different nodes are likely to have widely different expirations.) Since the kube-controller-manager and kube-scheduler certificates expire within seconds of the node's apiserver server certificate, it is sufficient to only monitor the node's apiserver server certificate. (Though I think it would be quite negligent to let a node go for over 455 days without an update.)

Since the --admin flag doesn't affect anything that gets provisioned into the cluster, I don't see how it should affect monitoring, unless one deploys the generated local credential into the monitoring system, which would give the monitoring system inadvisably powerful permissions.

johngmyers avatar Apr 13 '21 16:04 johngmyers

@johngmyers thank you very much for the in-depth explanation. @paulfantom I think we have a solution. I vote for removing these, now redundant, alerts.

xpepermint avatar Apr 13 '21 16:04 xpepermint

Since the --admin flag doesn't affect anything that gets provisioned into the cluster, I don't see how it should affect monitoring.

@johngmyers this is not about the monitoring system and its certificates, but about monitoring k8s certificate expiration and notifying users if those are about to expire. If there is no automatic certificate renewal, then those alerts provide a way to notify users about imminent problems and shouldn't be removed.

paulfantom avatar Apr 13 '21 16:04 paulfantom

@paulfantom the --admin flag doesn't affect anything that gets provisioned into the cluster. It just mints an admin certificate into the local kubeconfig. The value of the switch changes the expiration of that local certificate.

johngmyers avatar Apr 13 '21 16:04 johngmyers

Per https://github.com/kubernetes/kubernetes/commit/f90bbc3d6bfba992831eb216161990eae1098ae5, the apiserver_client_certificate_expiration_seconds metrics measure the remaining lifetimes of client certificates that were used to authenticate to the apiserver. This does not appear to be a useful metric to monitor, as clients' use of short-expiration credentials does not indicate an impending problem.
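
For anyone digging into what the alert actually watches: it looks at the low (1st percentile) end of the remaining lifetimes observed recently, which can be queried directly. This is the same expression used in the warning alert quoted at the top of the issue, divided by 86400 so it reads in days:

histogram_quantile(
  0.01,
  sum by (job, le) (
    rate(apiserver_client_certificate_expiration_seconds_bucket{job="apiserver"}[5m])
  )
) / 86400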

johngmyers avatar Apr 13 '21 20:04 johngmyers

Hi everyone! Can you suggest the right way to get rid of these noisy alerts? What is the current workaround for this?

AntonUspishnyi avatar Oct 08 '21 16:10 AntonUspishnyi

Delete the alert definition.

johngmyers avatar Oct 08 '21 17:10 johngmyers

I'm hitting the KubeClientCertificateExpiration warning too with Kubespray 2.17.0, so it is not specific to kops.

The interesting part is that Prometheus shows an increasing value for those metrics.

Also, kubeadm certs check-expiration clearly shows there are no close expirations:

CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Oct 25, 2022 12:52 UTC   364d                                    no      
apiserver                  Sep 28, 2022 12:27 UTC   337d            ca                      no      
apiserver-kubelet-client   Sep 28, 2022 12:27 UTC   337d            ca                      no      
controller-manager.conf    Sep 28, 2022 12:28 UTC   337d                                    no      
front-proxy-client         Sep 28, 2022 12:27 UTC   337d            front-proxy-ca          no      
scheduler.conf             Sep 28, 2022 12:28 UTC   337d                                    no      

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Oct 28, 2030 16:13 UTC   9y              no      
front-proxy-ca          Oct 28, 2030 16:13 UTC   9y              no      

irizzant avatar Oct 25 '21 14:10 irizzant

@irizzant I have the same problem. Is there any solution? I can't find out which certificate is the problem.

infiniteprogres avatar Nov 12 '21 04:11 infiniteprogres
