Cache getRegistrations in memory
https://github.com/letsencrypt/boulder/blob/1bed7405757fa283a14e5cb7a58236113c485c88/sa/model.go#L56-L64
This code path is hot, and the data here does not change often for a given ID. We can also tolerate stale data here.
Caching this would be a significant win in reads and connections to the database.
Given that an account lookup is the first step in any ACME request, it should be a good candidate for caching. We need to be clear about the possible impacts of caching account information, though. Once created, an account can go through these state transitions:
- deactivation
- key change
- contact change
Contact change is only really relevant to the expiration mailer, and on a long-ish time scale at that, so we're not too worried about it.
For deactivation and key change, users would expect that once the request is successful, authenticating with the old key (or the old account) should fail. The typical approach to this is a write-through cache, which we don't have.
One possible workaround: if a caching duration of around ten seconds is enough to get a performance improvement, we could cache for a very short period and then artificially slow down account deactivation / key change requests, so that any cached entries have expired by the time those requests return.
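A minimal sketch of that workaround, assuming a ten-second TTL. The cache type, entry shape, and `deactivateRegistration` helper are illustrative only, not taken from Boulder:

```go
package main

import (
	"sync"
	"time"
)

// regCache is a minimal TTL cache keyed by registration ID.
// Entries are considered stale once ttl has elapsed since they were stored.
type regCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[int64]cacheEntry
}

type cacheEntry struct {
	reg      []byte // placeholder for the cached registration
	storedAt time.Time
}

func newRegCache(ttl time.Duration) *regCache {
	return &regCache{ttl: ttl, entries: make(map[int64]cacheEntry)}
}

func (c *regCache) get(id int64) ([]byte, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[id]
	if !ok || time.Since(e.storedAt) > c.ttl {
		return nil, false
	}
	return e.reg, true
}

func (c *regCache) set(id int64, reg []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[id] = cacheEntry{reg: reg, storedAt: time.Now()}
}

// deactivateRegistration writes the deactivation to the database (elided
// here) and then sleeps for the cache TTL before returning, so that by the
// time the client sees a successful response, every cached copy of the old
// registration has already expired.
func deactivateRegistration(c *regCache, id int64) error {
	// ... perform the actual database update here ...
	time.Sleep(c.ttl)
	return nil
}

func main() {
	c := newRegCache(10 * time.Second)
	c.set(1, []byte(`{"status":"valid"}`))
	_ = deactivateRegistration(c, 1) // returns only after cached entries are stale
}
```

The invariant is that the mutation handler only returns success after the cache TTL has elapsed, so no cached copy of the pre-mutation registration can outlive a successful deactivation or key change.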
What's the best place for a cache of account objects to live? A few options: the database, ProxySQL, the SA, Redis, or the WFE. The database is out because we're trying to reduce load on it. ProxySQL is out because it can't cache prepared statements. Redis is out because it's not ready yet.
For choosing between the SA and the WFE: caching in the WFE would save a little SA work and some RPC traffic. The only other place in the issuance path where we call GetRegistration appears to be the RA, where we check that a certificate key is not the same as the key of the account issuing it. That's a fairly infrequent call.
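If we did cache in the WFE, one way to slot it in would be a wrapper around whatever interface the WFE uses to fetch registrations from the SA. This is only a sketch: `registrationGetter`, `Registration`, and `cachingGetter` are hypothetical names, not Boulder's real types.

```go
package wfecache

import (
	"context"
	"sync"
	"time"
)

// Registration is a placeholder for the real registration message type.
type Registration struct {
	ID     int64
	Status string
}

// registrationGetter stands in for the WFE's view of the SA client.
type registrationGetter interface {
	GetRegistration(ctx context.Context, id int64) (*Registration, error)
}

// cachingGetter answers GetRegistration from a short-lived in-memory cache,
// falling back to the wrapped SA client on a miss or an expired entry.
type cachingGetter struct {
	inner registrationGetter
	ttl   time.Duration

	mu    sync.RWMutex
	cache map[int64]cachedReg
}

type cachedReg struct {
	reg      *Registration
	storedAt time.Time
}

func newCachingGetter(inner registrationGetter, ttl time.Duration) *cachingGetter {
	return &cachingGetter{inner: inner, ttl: ttl, cache: make(map[int64]cachedReg)}
}

func (g *cachingGetter) GetRegistration(ctx context.Context, id int64) (*Registration, error) {
	g.mu.RLock()
	e, ok := g.cache[id]
	g.mu.RUnlock()
	if ok && time.Since(e.storedAt) <= g.ttl {
		return e.reg, nil
	}

	reg, err := g.inner.GetRegistration(ctx, id)
	if err != nil {
		return nil, err
	}

	g.mu.Lock()
	g.cache[id] = cachedReg{reg: reg, storedAt: time.Now()}
	g.mu.Unlock()
	return reg, nil
}
```

Because the wrapper satisfies the same interface it wraps, the rest of the WFE wouldn't need to know whether a lookup was served from cache or from the SA.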
For a long-term caching solution, the SA and WFE are not the best options, because increasing the number of shards would decrease our cache hit ratio. Ideally we want something that provides sharding by account and write-through caching. So if we go forward with this, we should treat it as a short-term patch.
We should estimate how much memory we need for a cache of a given size (that is, how much memory an account object takes up, including everything it points to). From that we can figure out what size of cache we can reasonably support on each of our (SA|WFE) instances, and how much database offload we can expect from it.
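A rough way to get that per-entry number is to allocate a large number of representative entries and compare heap usage before and after. The `regModel` struct below is only a guess at the shape of a registration row (ID, serialized JWK, contacts, a few scalars); the real model in sa/model.go will differ.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// regModel is a stand-in with roughly the shape of a registration row.
type regModel struct {
	ID        int64
	Key       []byte
	Contact   []string
	Agreement string
	CreatedAt time.Time
	Status    string
}

// heapAlloc returns the bytes of live heap objects after forcing a GC.
func heapAlloc() uint64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

func main() {
	const n = 100_000

	before := heapAlloc()

	cache := make(map[int64]*regModel, n)
	for i := int64(0); i < n; i++ {
		cache[i] = &regModel{
			ID:        i,
			Key:       make([]byte, 700), // a JSON-encoded RSA JWK is on the order of several hundred bytes
			Contact:   []string{"mailto:admin@example.com"},
			Agreement: "https://example.com/terms",
			CreatedAt: time.Now(),
			Status:    "valid",
		}
	}

	after := heapAlloc()
	fmt.Printf("approx bytes per cached entry: %d\n", (after-before)/n)
	runtime.KeepAlive(cache)
}
```

Multiplying the per-entry figure by the intended number of cached accounts (plus headroom for map overhead and GC) gives a first-order bound on the memory cost per instance.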
It's worth noting that we already cache accounts in the WFE, and that doing so causes problems that external folks have noticed.
We're closing this in favor of https://github.com/letsencrypt/boulder/issues/6744.