Network thread performance issue due to async randomness

Open twoeths opened this issue 8 months ago • 0 comments

Describe the bug

This is a review of the metrics monitored on our test mainnet node running #7761. It's very likely we'll merge that PR, since the issue only happens on a test mainnet node subscribed to all subnets and the PR improved the main thread a lot, so I'm opening this issue for later reference.

In general, that PR improves the main thread a lot, which puts more pressure on the network thread.

over the last 8 days, scavenge GC keeps going up

the event loop lag keeps increasing

because of that, request I/O time increased, especially for ping, status, and metadata

the node has so many peers that it has to disconnect a lot of them

peer manager heartbeat also increased

on the main thread, it improved a lot

The issue does not happen on other nodes
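
To compare the two leading symptoms (event loop lag and scavenge GC) between nodes, something like the following could be run inside the thread being examined. This is only a minimal sketch built on Node's `perf_hooks` module and is not necessarily how Lodestar collects these metrics. The names `summarizeLag`, `isScavenge`, and `startThreadMonitor` are hypothetical helpers introduced for illustration.

```typescript
import { constants, monitorEventLoopDelay, PerformanceObserver } from "node:perf_hooks";

/** Subset of the histogram returned by monitorEventLoopDelay (values in nanoseconds). */
interface LagHistogram {
  mean: number;
  max: number;
  percentile(p: number): number;
}

/** Convert a nanosecond lag histogram into millisecond summary values. */
export function summarizeLag(h: LagHistogram): { meanMs: number; p99Ms: number; maxMs: number } {
  return { meanMs: h.mean / 1e6, p99Ms: h.percentile(99) / 1e6, maxMs: h.max / 1e6 };
}

/** True when a GC entry kind is a minor (scavenge) collection. */
export function isScavenge(kind: number): boolean {
  return kind === constants.NODE_PERFORMANCE_GC_MINOR;
}

/**
 * Hypothetical helper: start sampling event loop delay and scavenge GC
 * in the current thread. Returns a function that stops sampling and
 * reports what was collected.
 */
export function startThreadMonitor(resolutionMs = 20) {
  const histogram = monitorEventLoopDelay({ resolution: resolutionMs });
  histogram.enable();

  let scavengeCount = 0;
  let scavengeMs = 0;
  const observer = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      // For "gc" entries, recent Node versions expose the GC kind on entry.detail.
      const kind = (entry as unknown as { detail?: { kind?: number } }).detail?.kind;
      if (kind !== undefined && isScavenge(kind)) {
        scavengeCount += 1;
        scavengeMs += entry.duration; // duration is in milliseconds
      }
    }
  });
  observer.observe({ entryTypes: ["gc"] });

  return function stop() {
    histogram.disable();
    observer.disconnect();
    return { lag: summarizeLag(histogram), scavengeCount, scavengeMs };
  };
}
```

Calling `const stop = startThreadMonitor()` at startup and `stop()` after a fixed window gives one lag summary and one scavenge total per window, which can then be compared between the affected node and the others.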

Expected behavior

Event loop lag on the network thread is the same as before

Steps to reproduce

No response

Additional context

No response

Operating system

Linux

Lodestar version or commit hash

mkeil/aggregate-with-randomness-async-again

May 07 '25 08:05 twoeths