
K8s API server's cert needs to be renewed each year

Open yiyione opened this issue 3 years ago • 2 comments

The k8s API server's cert expires every year. Once it expires, the OpenPAI cluster becomes unavailable. See Certificate Management with kubeadm: https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/#automatic-certificate-renewal
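
To see how close a cert is to expiring, the remaining days can be computed from its notAfter time. A minimal sketch, assuming the notAfter string was obtained separately (for example from `openssl x509 -enddate -noout -in <cert file>`); the function name and the sample date are hypothetical and not taken from the OpenPAI code:

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Return the whole days between `now` and a certificate's notAfter time."""
    # ssl.cert_time_to_seconds parses the "Mon DD HH:MM:SS YYYY GMT" format
    # that openssl prints after "notAfter=".
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


# Hypothetical values for illustration only.
now = datetime(2021, 3, 2, tzinfo=timezone.utc)
remaining = days_until_expiry("Mar  2 00:00:00 2022 GMT", now)
print(remaining)
```

A negative result means the cert has already expired.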

How to fix

  1. Renew the k8s certs
  2. Update the kube-config on all worker nodes

Todo

  • [ ] Document this requirement in the repo
  • [x] Add a warning for cert expiration

yiyione avatar Mar 02 '21 05:03 yiyione

Reference: https://github.com/kubernetes/kubeadm/issues/581#issuecomment-471575078

Binyang2014 avatar Mar 02 '21 05:03 Binyang2014

Test case:

  1. Set up the alert-manager to enable the email-admin action (change the admin-receiver to a test address)
  2. In services-configuration.yaml, change schedule and alert-residual-days under alert-manager.cert-expiration-checker to trigger the alert:
    cert-expiration-checker:
      schedule: '* * * * *' # every minute
      alert-residual-days: 365 # always trigger
      cert-path: '/etc/kubernetes/ssl' # the k8s cert path in master node
    
  3. Use kubectl get pods to check whether the cert-expiration-checker cronjob has been triggered
  4. Check the alert email or the logs from the alert handler.
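
Why alert-residual-days: 365 always triggers can be shown with a small sketch of the threshold check. This is only an illustration of the idea, not the actual cert-expiration-checker code: the function name is hypothetical, and the `<=` comparison is an assumption (the real checker may use `<`).

```python
def should_alert(residual_days: int, alert_residual_days: int) -> bool:
    """Alert once the cert's remaining days fall to or below the threshold.

    Assumption: the comparison is inclusive; the real checker may differ.
    """
    return residual_days <= alert_residual_days


# The cert is valid for about one year, so its remaining days stay at or
# below 365 and the test threshold matches on every scheduled run.
test_threshold = 365
for residual in (365, 200, 10):
    print(residual, should_alert(residual, test_threshold))
```

With a smaller threshold, the alert would fire only when the cert is close to expiring.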

yiyione avatar Apr 08 '21 03:04 yiyione