
Add external issuance monitoring tool

Open jsha opened this issue 4 years ago • 0 comments

Right now we do end-to-end issuance testing with Certbot in a cron job, and analyze the resulting output to generate alerts. This has a few weaknesses:

  • The DNS challenge type uses an external DNS provider, which sometimes throttles us. That limits how often we can do test issuances, and also increases flakiness. The flakiness means we require multiple failures before raising an alert, to avoid alert fatigue.
  • We generate alerts based on command exit status codes rather than Prometheus metrics.

Instead we should write a Go binary that speaks the ACME API and does various types of issuances (and revocations) on a configurable schedule, exporting Prometheus metrics. This binary should also be able to act as its own authoritative DNS server, so that it's not limited by external DNS providers.

Types of requests it should make:

  • Single-SAN issuance for DNS-01, TLS-ALPN-01, and HTTP-01.
  • 100-SAN issuance for each of the above.
  • Revocation (and subsequent OCSP request to verify).
  • Single-SAN issuance for a non-authorized domain name (verify failure).

Issuances should happen on random subdomains of a configurable registered domain. Random subdomains avoid rate limiting based on duplicate certificates.

jsha, Oct 08 '21 21:10