
Automatically Pause Zombie Clients

Open beautifulentropy opened this issue 1 year ago • 0 comments

Manual Pausing Background

In #7406, we deployed all the necessary code and infrastructure to manually pause specific account-identifier pairs. Two batches of manual pauses were conducted based on 90 days of authorization logs:

  • Batch 1: Pairs averaging 50 authorization failures per day over 90 days, with no successful attempts.
  • Batch 2: Pairs averaging 40 authorization failures per day over 90 days, with no successful attempts.

After a few weeks with no complaints and very few unpauses, it seems reasonable to move forward with automated detection and pausing of account-identifier pairs that meet the criteria established in our second batch.

Automatic Pausing Requirements

To efficiently identify pairs for pausing, we'll implement a new rate limit within our existing key-value rate limit system. This limit will be similar to our current FailedAuthorizationsPerDomainPerAccount limit and will use the same bucket key format of enum:regId:domain.
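As a rough illustration of the enum:regId:domain bucket key format, here is a minimal Go sketch. The `limitName` type, the `failedAuthzForPausing` constant, its numeric value, and the `bucketKey` helper are all hypothetical names chosen for this example; they are not taken from the existing rate limit code.

```go
package main

import "fmt"

// limitName is a hypothetical stand-in for the rate limit enum.
type limitName int

// failedAuthzForPausing is an assumed enum value for the new limit;
// the real value would be whatever the rate limit package assigns.
const failedAuthzForPausing limitName = 7

// bucketKey joins the limit enum, registration ID, and domain into
// the enum:regId:domain format described above.
func bucketKey(name limitName, regID int64, domain string) string {
	return fmt.Sprintf("%d:%d:%s", name, regID, domain)
}

func main() {
	fmt.Println(bucketKey(failedAuthzForPausing, 12345, "example.com"))
}
```

Because the key has the same shape as the FailedAuthorizationsPerDomainPerAccount key, only the enum prefix would distinguish buckets belonging to the two limits.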

However, there are some differences:

  1. The configured period will match our longest issuance time, 90 days.
  2. The configured count will be our period (90) * acceptable failures per day (40), or 3600.
  3. The bucket will always be reset to 0 if the subscriber successfully validates an authorization for that identifier.
  4. When the limit is reached, the account and identifier will be added to our paused table by calling SA.PauseIdentifiers().
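The four differences above can be sketched as a simplified in-memory counter. This is not the key-value rate limit system itself: it ignores how capacity refills over the 90-day period and only shows the count, the reset on success, and the pause at the threshold. The `pauser` interface, its `PauseIdentifiers` signature, `tracker`, and `fakeSA` are hypothetical names for illustration; the real SA.PauseIdentifiers() call may take different arguments.

```go
package main

import "fmt"

const (
	periodDays     = 90                          // configured period, in days
	failuresPerDay = 40                          // acceptable failures per day
	maxFailures    = periodDays * failuresPerDay // 90 * 40 = 3600
)

// pauser is a hypothetical stand-in for the storage authority.
type pauser interface {
	PauseIdentifiers(regID int64, identifiers []string) error
}

// tracker counts failed authorizations per account-identifier pair.
type tracker struct {
	counts map[string]int
	sa     pauser
}

func newTracker(sa pauser) *tracker {
	return &tracker{counts: make(map[string]int), sa: sa}
}

func pairKey(regID int64, domain string) string {
	return fmt.Sprintf("%d:%s", regID, domain)
}

// RecordFailure increments the pair's count and pauses the pair once
// the count reaches maxFailures. It reports whether a pause occurred.
func (tr *tracker) RecordFailure(regID int64, domain string) (bool, error) {
	k := pairKey(regID, domain)
	tr.counts[k]++
	if tr.counts[k] < maxFailures {
		return false, nil
	}
	if err := tr.sa.PauseIdentifiers(regID, []string{domain}); err != nil {
		return false, err
	}
	delete(tr.counts, k)
	return true, nil
}

// RecordSuccess resets the pair's count to 0 after a successful validation.
func (tr *tracker) RecordSuccess(regID int64, domain string) {
	delete(tr.counts, pairKey(regID, domain))
}

// fakeSA records paused pairs in memory for the example.
type fakeSA struct{ paused []string }

func (f *fakeSA) PauseIdentifiers(regID int64, identifiers []string) error {
	for _, id := range identifiers {
		f.paused = append(f.paused, pairKey(regID, id))
	}
	return nil
}

func main() {
	sa := &fakeSA{}
	tr := newTracker(sa)
	for i := 0; i < maxFailures; i++ {
		tr.RecordFailure(1, "example.com")
	}
	fmt.Println(sa.paused)
}
```

In this sketch a single successful validation clears the pair's whole history, so only clients that fail continuously for the full period ever reach the threshold.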

Any subsequent new-order requests from this account for certificates containing this identifier will then be rate limited. The rate limit notice will include a URL they can use to automatically unpause all paused identifiers associated with their account.

beautifulentropy • Oct 02 '24 19:10