prometheus-engine
Switch webhook cert setup from polling to controller
The previous implementation slept before updating the webhooks. This caused a race condition: if the operator touched a resource backed by a webhook before that webhook had been updated, the call failed because the webhook did not yet have a valid certificate.
Error seen:
Internal error occurred: failed calling webhook "abc": failed to call webhook: Post "xyz?timeout=10s": x509: certificate signed by unknown authority
To test this, I created a webhook for a common resource (e.g. secrets). I did not observe the issue after applying my patch.
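For reviewers unfamiliar with the pattern, the idea behind replacing the sleep with a controller can be sketched roughly as follows. This is not the code in this PR: `Webhook`, `WebhookConfig`, and `ReconcileCABundle` are hypothetical, simplified stand-ins for the real Kubernetes types, and the sketch only assumes that the fix amounts to writing the CA bundle into each webhook. The point is that the reconcile step is idempotent, so it can run on every event for the webhook configuration instead of once after a delay.

```go
package main

import "bytes"

// Webhook is a hypothetical, simplified stand-in for one entry in a
// validating or mutating webhook configuration.
type Webhook struct {
	Name     string
	CABundle []byte
}

// WebhookConfig is a hypothetical stand-in for a webhook configuration object.
type WebhookConfig struct {
	Name     string
	Webhooks []Webhook
}

// ReconcileCABundle ensures every webhook in cfg carries caBundle.
// It reports whether cfg was modified, so a caller knows whether an
// update needs to be sent to the API server. Calling it again with the
// same inputs is a no-op, which makes it safe to run on every event.
func ReconcileCABundle(cfg *WebhookConfig, caBundle []byte) bool {
	changed := false
	for i := range cfg.Webhooks {
		if bytes.Equal(cfg.Webhooks[i].CABundle, caBundle) {
			continue // already up to date
		}
		// Copy the bundle so the config does not alias the caller's slice.
		cfg.Webhooks[i].CABundle = append([]byte(nil), caBundle...)
		changed = true
	}
	return changed
}
```

In a real controller, a function like this would be called from the reconcile loop whenever the webhook configuration is created or changed, so the certificate is in place as soon as the object exists rather than after a fixed wait.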