Signature caching in the VC

Open paulhauner opened this issue 3 years ago • 2 comments

Description

@jmcruz1983 (Juan) has pointed out that Lighthouse sends orders of magnitude more signing requests to Web3Signer than Teku does. At scale (e.g., thousands of validators), this can overload infrastructure and cause real problems.

Based on some data from Juan, I suspect this is caused by duplicate signing of selection proofs (I'm not sure if it's for attestations or sync messages).

I have a proposed solution that should be simple to implement and will:

  • Definitely be useful for Juan to test on his infra to see if it resolves the issue.
  • Probably be a tenable long-term solution.

Proposed Solution

In the validator_store, create a signature_cache: SignatureCache<T>(HashMap<(T, SigningContext), Signature>) struct.

  1. Attached to the ValidatorStore is one signature_cache (probably wrapped in an RwLock) for produce_selection_proof and one for produce_sync_selection_proof.
  2. When a selection proof is requested, we check the cache to see if it already exists. If so, we return early with that signature.
  3. After we create a signature (because it wasn't in the cache), we add it to the cache.
    • If, during the cache add, we discover that the cache is over a certain size (64?) then we prune the entry with the lowest slot.
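
The steps above could be sketched roughly as follows. This is only an illustration, not the code from Lighthouse: `Slot`, `Signature` and `SigningContext` are simplified stand-ins for the real types, and the `HasSlot` trait and `get_or_sign` method are hypothetical names introduced here so the cache knows which entry has the lowest slot.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::RwLock;

// Simplified stand-ins (assumptions); the real Lighthouse types differ.
type Slot = u64;
type Signature = Vec<u8>;

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct SigningContext {
    domain: u32,
    epoch: u64,
}

/// Hypothetical helper trait: a cached message must expose its slot so that
/// pruning can find the entry with the lowest slot.
trait HasSlot {
    fn slot(&self) -> Slot;
}

impl HasSlot for Slot {
    fn slot(&self) -> Slot {
        *self
    }
}

struct SignatureCache<T> {
    max_size: usize,
    entries: RwLock<HashMap<(T, SigningContext), Signature>>,
}

impl<T: HasSlot + Hash + Eq + Clone> SignatureCache<T> {
    fn new(max_size: usize) -> Self {
        Self {
            max_size,
            entries: RwLock::new(HashMap::new()),
        }
    }

    fn len(&self) -> usize {
        self.entries.read().unwrap().len()
    }

    /// Returns the cached signature if present; otherwise calls `sign`,
    /// stores the result and prunes the lowest-slot entry if over capacity.
    fn get_or_sign<F>(&self, message: T, context: SigningContext, sign: F) -> Signature
    where
        F: FnOnce(&T, &SigningContext) -> Signature,
    {
        let key = (message, context);

        // Step 2: return early on a cache hit (read lock is released at the end of this statement).
        let cached = self.entries.read().unwrap().get(&key).cloned();
        if let Some(signature) = cached {
            return signature;
        }

        // Step 3: sign, then add to the cache.
        let signature = sign(&key.0, &key.1);
        let mut entries = self.entries.write().unwrap();
        entries.insert(key, signature.clone());

        // Prune the entry with the lowest slot once the cache is over its size limit.
        if entries.len() > self.max_size {
            let lowest = entries.keys().min_by_key(|key| key.0.slot()).cloned();
            if let Some(lowest) = lowest {
                entries.remove(&lowest);
            }
        }
        signature
    }
}
```

In this sketch, two calls to `get_or_sign` with the same message and context invoke the signer only once. Two threads that miss at the same moment could still both sign, which seems acceptable here because the result is the same signature either way.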

Whenever the signature cache reaches a certain size (4?) it will prune the entry

paulhauner • May 25 '22 00:05

Oh wait, I just realised that those signature_caches need to be per validator. Perhaps attaching them to the SigningMethod would be more appropriate.
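
One way to picture the per-validator requirement is a map from validator public key to a small slot-ordered cache. This is a sketch under assumptions, not how `SigningMethod` is actually structured: `PerValidatorCache` and its methods are hypothetical names, and `PublicKeyBytes`, `Slot` and `Signature` are simplified stand-ins.

```rust
use std::collections::{BTreeMap, HashMap};

// Simplified stand-ins (assumptions); the real Lighthouse types differ.
type Slot = u64;
type Signature = Vec<u8>;
type PublicKeyBytes = [u8; 48];

/// Hypothetical container: one small cache per validator, keyed by slot.
struct PerValidatorCache {
    max_per_validator: usize,
    caches: HashMap<PublicKeyBytes, BTreeMap<Slot, Signature>>,
}

impl PerValidatorCache {
    fn new(max_per_validator: usize) -> Self {
        Self {
            max_per_validator,
            caches: HashMap::new(),
        }
    }

    /// Looks up a signature for this validator only.
    fn get(&self, pubkey: &PublicKeyBytes, slot: Slot) -> Option<&Signature> {
        self.caches.get(pubkey)?.get(&slot)
    }

    /// Stores a signature and prunes this validator's lowest slots if needed.
    fn insert(&mut self, pubkey: PublicKeyBytes, slot: Slot, signature: Signature) {
        let cache = self.caches.entry(pubkey).or_default();
        cache.insert(slot, signature);
        while cache.len() > self.max_per_validator {
            // BTreeMap iterates in ascending key order, so the first key is the lowest slot.
            let lowest = *cache.keys().next().expect("cache is non-empty");
            cache.remove(&lowest);
        }
    }
}
```

Keying by public key means one validator's entries never evict or shadow another's, which is the property a single shared cache would lack.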

paulhauner • May 25 '22 00:05

We already pre-compute all the sync selection proof signatures. It could be that this burst of signing is what shows up on Web3Signer's end.

Alternatively, if we do adopt a cache, we can probably drop the signature pre-compute, since I think the cache would make it obsolete.

michaelsproul • May 25 '22 00:05

Closing since #3223 implemented this feature and saw little to no benefit.

paulhauner • Sep 12 '22 08:09