VC using excessive memory when using Web3Signer

Open macladson opened this issue 2 years ago • 2 comments

Credit to @jmcruz1983 for discovering.

Description

Running a validator client using Web3Signer uses significantly more memory than an equivalent local signer. The problem becomes more pronounced at higher numbers of validator clients.

I ran a local network using custom scripts (based on the existing scripts at scripts/local_testnet) which create 4 validator clients: 3 using local signers and 1 using Web3Signer. The scripts can be found here in case anyone wants to run this themselves: https://github.com/macladson/lighthouse/tree/web3signer-local-test

Even with 200 validators, the effect was pronounced: ~200MB was being used for the VC using the web3signer, but only ~30MB was being used for the other VCs.

I ran heaptrack and analyzed the memory profile with help from @michaelsproul. The memory allocations appear to be attributed to the reqwest::Client which is part of SigningMethod.

Each validator gets allocated a SigningMethod so once validator counts reach the 1000s, a significant amount of memory is being used.

Note: It is currently unknown why @jmcruz1983 was seeing reduced memory when building Lighthouse from source (rather than using a pre-built docker image). On my machine, I saw the increase in memory usage even though I was building locally. At the very least, the issue seems to be related to the specifics of your configuration when building the binary.

Steps to resolve

As per the documentation for reqwest::Client:

The Client holds a connection pool internally, so it is advised that you create one and reuse it.

If we share the reqwest::Client between all the validator instances, we should see a large memory reduction. This seems like the simplest solution to me.
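
As a rough illustration of the idea (not the actual Lighthouse code), the sketch below contrasts one client per validator with a single shared client. `HttpClient` is a dependency-free stand-in for reqwest::Client, and the simplified `SigningMethod` enum, its field names, and the helper functions are all assumptions made for this example.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical stand-in for `reqwest::Client`, so the sketch needs no dependencies.
/// The buffer models the connection pool and TLS state that make a client expensive.
pub struct HttpClient {
    pub pool: Vec<u8>,
}

impl HttpClient {
    pub fn new() -> Self {
        HttpClient { pool: vec![0u8; 64 * 1024] }
    }
}

/// Simplified, assumed shape of a per-validator signing method.
pub enum SigningMethod {
    LocalKeystore,
    Web3Signer { signing_url: String, http_client: Arc<HttpClient> },
}

/// The pattern described in the issue: every validator allocates its own client.
pub fn build_per_validator(urls: &[String]) -> Vec<SigningMethod> {
    urls.iter()
        .map(|url| SigningMethod::Web3Signer {
            signing_url: url.clone(),
            http_client: Arc::new(HttpClient::new()),
        })
        .collect()
}

/// The proposed pattern: create one client and hand out cheap handles to it.
pub fn build_shared(urls: &[String]) -> Vec<SigningMethod> {
    let client = Arc::new(HttpClient::new());
    urls.iter()
        .map(|url| SigningMethod::Web3Signer {
            signing_url: url.clone(),
            http_client: Arc::clone(&client),
        })
        .collect()
}

/// Count how many distinct client allocations back a set of signing methods.
pub fn distinct_clients(methods: &[SigningMethod]) -> usize {
    let mut seen = HashSet::new();
    for method in methods {
        if let SigningMethod::Web3Signer { http_client, .. } = method {
            seen.insert(Arc::as_ptr(http_client) as usize);
        }
    }
    seen.len()
}
```

With 200 validators, `build_per_validator` yields 200 separate client allocations while `build_shared` yields one. The real reqwest::Client already uses an Arc internally, so with reqwest a plain `.clone()` of one client gives the same sharing without the explicit `Arc` wrapper used here.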

macladson, Jul 01 '22 05:07

Great finding, guys! Any idea when this can make it to the unstable branch?

jmcruz1983, Jul 01 '22 08:07

Hi @jmcruz1983, an optimization for the memory usage is now present in unstable. You will be able to test the changes in the latest-unstable or latest-unstable-modern docker images.

It is also worth noting that the optimization works best when your validators all share a single client identity. If your validators all have unique client identities then you likely won't see any improvements.
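
One way to picture why the client identity matters is a cache that hands out one client per distinct identity. The sketch below is an assumption about the general shape of such a cache, not the code that landed in unstable; `ClientIdentity`, `ClientCache`, and the `HttpClient` stand-in are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical key describing the TLS client identity a validator uses.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct ClientIdentity {
    pub cert_path: String,
}

/// Hypothetical stand-in for `reqwest::Client`.
pub struct HttpClient;

/// Cache that reuses one client for all validators with the same identity.
#[derive(Default)]
pub struct ClientCache {
    clients: HashMap<Option<ClientIdentity>, Arc<HttpClient>>,
}

impl ClientCache {
    /// Return the existing client for this identity, or create and store a new one.
    pub fn get_or_create(&mut self, identity: Option<ClientIdentity>) -> Arc<HttpClient> {
        Arc::clone(
            self.clients
                .entry(identity)
                .or_insert_with(|| Arc::new(HttpClient)),
        )
    }

    /// Number of distinct clients currently held.
    pub fn len(&self) -> usize {
        self.clients.len()
    }
}
```

Under this scheme, validators that share an identity all map to one entry, whereas a unique identity per validator produces one client per validator, which is the original memory pattern.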

macladson, Jul 20 '22 02:07

Hey @jmcruz1983 are you satisfied with the VC memory improvements? Any more issues since updating?

michaelsproul, Aug 26 '22 02:08

Hi @michaelsproul, yes, we are happy with the current memory footprint. It is a great improvement over the memory usage in v2.5.1 and v3.0.0.

FYI, this chart includes tests with internal versions + latest public releases

Thanks! /Juan

jmcruz1983, Aug 26 '22 10:08

Awesome, I am going to close this issue as resolved!

@macladson I think we're probably also safe to close https://github.com/sigp/lighthouse/pull/3223, unless you think we might want to merge it anyway?

michaelsproul, Sep 08 '22 06:09