Webhook Flake on Upgrade
I wonder if we are clearing certificates?
upgrade.go:98: Failed to create Service: Internal error occurred: failed calling webhook "webhook.serving.knative.dev": failed to call webhook: Post "https://webhook.db15bd17-dfe9-41c9-9dfb-dd8115ecfe22.svc:443/?timeout=10s": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "x509: ECDSA verification failure" while trying to verify candidate authority certificate "webhook.db15bd17-dfe9-41c9-9dfb-dd8115ecfe22.svc")
Originally posted by @dprotaso in https://github.com/knative/serving/issues/15141#issuecomment-2066443436
@dprotaso isn't it true that the certificate reconciler fills in the secret with a certificate based on the webhook's service name, and that during the upgrade we overwrite the secret with empty content? I suspect the new webhook controller loads the new certificate secret before the reconciler has filled it in, hence the error. I think we need to either keep the secret around and not update it, or wait for the webhook, or something similar. I am also wondering whether, instead of just presenting the certificate with GetCertificate, we should link readiness to proper certificate content (it happens elsewhere too, to be honest: https://github.com/cert-manager/cert-manager/issues/3045).
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.