Some pods are crashing when I try to upgrade the Linkerd certificates.
What is the issue?
I tried to upgrade the trust anchor certificate and the issuer certificate for Linkerd. I am getting the error below for some pods.
When I run linkerd check --proxy I see this message:
linkerd-control-plane-proxy
---------------------------
The "linkerd-controller-78b96b6f94-sgstq" pod is not running
× viz extension proxies are healthy
Some pods do not have the current trust bundle and must be restarted:
* grafana-848bd95ff-n8bfl
* metrics-api-7fdd9c4776-h9r67
* prometheus-6c665d96d9-pbh9m
* tap-7bf49cd67b-qz8ck
* tap-injector-7d6dfb5698-jtchs
* web-75c64d4849-f7qph
see https://linkerd.io/2.11/checks/#l5d-viz-proxy-healthy for hints
and when I run
kubectl -n linkerd get pods
NAME READY STATUS RESTARTS AGE
linkerd-controller-5cdd4c5c8-gq5sp 2/2 Running 0 48d
linkerd-controller-78b96b6f94-sgstq 0/2 CrashLoopBackOff 331 20h
linkerd-destination-6557cfd654-v257d 4/4 Running 0 20h
linkerd-identity-6567c674c8-cqmwx 2/2 Running 0 20h
linkerd-proxy-injector-8d6dd4bf7-l8v7v 2/2 Running 0 20h
linkerd-sp-validator-6c464858d5-cscrp 0/2 CrashLoopBackOff 334 20h
linkerd-sp-validator-6d9f7d4685-44b2z 2/2 Running 0 48d
The pods are crashing. Can someone guide me on how to resolve this?
How can it be reproduced?
When I run linkerd check --proxy
Logs, error output, etc
included in the issue
output of linkerd check -o short
Linkerd core checks
===================
linkerd-version
---------------
‼ cli is up-to-date
is running version 2.11.1 but the latest stable version is 2.11.2
see https://linkerd.io/2.11/checks/#l5d-version-cli for hints
control-plane-version
---------------------
‼ control plane is up-to-date
is running version 2.11.1 but the latest stable version is 2.11.2
see https://linkerd.io/2.11/checks/#l5d-version-control for hints
linkerd-control-plane-proxy
---------------------------
× control plane proxies are healthy
The "linkerd-controller-78b96b6f94-sgstq" pod is not running
see https://linkerd.io/2.11/checks/#l5d-cp-proxy-healthy for hints
Status check results are ×
Linkerd extensions checks
=========================
linkerd-viz
-----------
× viz extension proxies are healthy
Some pods do not have the current trust bundle and must be restarted:
* grafana-848bd95ff-n8bfl
* metrics-api-7fdd9c4776-h9r67
* prometheus-6c665d96d9-pbh9m
* tap-7bf49cd67b-qz8ck
* tap-injector-7d6dfb5698-jtchs
* web-75c64d4849-f7qph
see https://linkerd.io/2.11/checks/#l5d-viz-proxy-healthy for hints
Status check results are ×
Environment
ST
Possible solution
No response
Additional context
No response
Would you like to work on fixing this bug?
yes
I tried to upgrade the trust anchor certificate and issuer certificate for linkerd.
Can you explain in more detail what steps you took?
linkerd-controller-78b96b6f94-sgstq 0/2 CrashLoopBackOff 331 20h
You can use kubectl describe and kubectl logs to get more information about the reason the pod is crashing.
I generated a new trust anchor certificate and a new issuer certificate and upgraded both of them using the basic commands shown in this document: https://linkerd.io/2.10/tasks/generate-certificates/#generating-the-certificates-with-step
I tried running kubectl describe on that pod, but I am getting this:
$ kubectl describe linkerd-controller-78b96b6f94-sgstq
error: the server doesn't have a resource type "linkerd-controller-78b96b6f94-sgstq"
$ kubectl logs linkerd-controller-78b96b6f94-sgstq
Error from server (NotFound): pods "linkerd-controller-78b96b6f94-sgstq" not found
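For reference, both errors above come from how the commands were invoked rather than from the pod itself: kubectl describe expects a resource type (such as pod) before the name, and without -n linkerd the logs lookup searches the current context's namespace (usually default) instead of linkerd. A minimal sketch of the corrected invocations follows. The debug_pod helper name is hypothetical, and the namespace and pod name are taken from the pod listing earlier in this issue.

```shell
# Hypothetical helper (not part of kubectl or linkerd): collects the usual
# first-pass diagnostics for a crashing pod.
debug_pod() {
  ns="$1"    # namespace, e.g. linkerd
  pod="$2"   # pod name, e.g. linkerd-controller-78b96b6f94-sgstq

  # "pod" is the resource type; a bare name is parsed as a type and rejected.
  kubectl -n "$ns" describe pod "$pod"

  # Logs from every container in the pod; --previous reads the last
  # terminated instance, which is usually the relevant one for CrashLoopBackOff.
  kubectl -n "$ns" logs "$pod" --all-containers=true --previous
}
```

It could be called as debug_pod linkerd linkerd-controller-78b96b6f94-sgstq. The Events section of the describe output and the last lines of the previous container's logs are the usual places to look for the crash reason.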
There are instructions for rotating Linkerd certificates here: https://linkerd.io/2.11/tasks/manually-rotating-control-plane-tls-credentials/
I followed the same steps to upgrade the certificates, but I am getting the errors mentioned above.
Can anyone give me some suggestions on the issue I posted above?
@sreeyadlapati it looks like an issue that I faced and I've updated the docs: https://linkerd.io/2.11/tasks/manually-rotating-control-plane-tls-credentials/#removing-the-old-trust-anchor.
We can now remove the old trust anchor from the trust bundle we created earlier.
NOTE: Before the action, it is necessary to explicitly rollout all deployments in the linkerd namespace:
kubectl -n linkerd rollout restart deployments
Try restoring the old CA and rolling out all the pods in the linkerd namespace. Then you can remove the old CA and roll out once again. That helped me, and it may help you too.
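The recovery sequence described above can be sketched roughly as follows. This is an outline under assumptions, not a verified procedure: the file names old-ca.crt, new-ca.crt, and bundle.crt and the two function names are hypothetical, and it assumes the step CLI is installed, the old trust anchor certificate is still on hand, and the installation is managed with linkerd upgrade as in the rotation guide linked above.

```shell
# Step 1: put the old trust anchor back alongside the new one, then restart
# the control plane so every pod picks up the combined bundle.
restore_old_anchor() {
  # Combine both anchors into one bundle file (file names are hypothetical).
  step certificate bundle new-ca.crt old-ca.crt bundle.crt
  linkerd upgrade --identity-trust-anchors-file=./bundle.crt | kubectl apply -f -
  kubectl -n linkerd rollout restart deployments
}

# Step 2: only once all pods are healthy again, drop the old anchor from the
# trust roots and roll out once more.
remove_old_anchor() {
  linkerd upgrade --identity-trust-anchors-file=./new-ca.crt | kubectl apply -f -
  kubectl -n linkerd rollout restart deployments
}
```

One cautious way to use it is to run restore_old_anchor, wait until linkerd check --proxy reports no failures, and only then run remove_old_anchor. Meshed workloads outside the linkerd namespace, such as the viz pods listed in the check output, would likely need their own restart before the old anchor is removed.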