Network thread performance issue due to async randomness

Open twoeths opened this issue 8 months ago • 0 comments

Describe the bug

This is a review of the metrics monitored on our test mainnet node running #7761. We will very likely merge that PR, since the issue only happens on a test mainnet node subscribed to all subnets and the PR improved the main thread a lot, so I am opening this issue for later reference.

  • in general, that PR improves the main thread a lot, which puts more pressure on the network thread
Image
  • over the last 8 days, scavenge GC has kept going up
Image
  • the event loop lag keeps increasing
Image
  • because of that, request I/O time increased, especially for ping, status, and metadata
Image
  • the node has so many peers that it has to disconnect a lot of them
Image Image
  • the peer manager heartbeat also increased
Image
  • on the main thread, performance improved a lot
Image Image

The issue does not happen on other nodes.
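
For anyone who wants to check event loop lag on a worker thread outside the dashboards, here is a minimal sketch using Node's built-in `perf_hooks` module. It is not Lodestar's metrics code: `sampleEventLoopLag`, `nsToMs`, and `LagSummary` are hypothetical names made up for illustration, and the window and resolution values are arbitrary assumptions.

```typescript
import { monitorEventLoopDelay } from "node:perf_hooks";

/** Event loop delay over one sampling window, in milliseconds (hypothetical type). */
export interface LagSummary {
  meanMs: number;
  p99Ms: number;
  maxMs: number;
}

/** The perf_hooks histogram reports nanoseconds; convert to milliseconds. */
export function nsToMs(ns: number): number {
  return ns / 1e6;
}

/**
 * Hypothetical helper: sample event loop delay for `windowMs` milliseconds
 * on the thread that calls it, then resolve with a summary.
 * `resolutionMs` is the histogram sampling interval (assumed default: 10 ms).
 */
export function sampleEventLoopLag(windowMs: number, resolutionMs = 10): Promise<LagSummary> {
  const histogram = monitorEventLoopDelay({ resolution: resolutionMs });
  histogram.enable();
  return new Promise((resolve) => {
    setTimeout(() => {
      // Stop sampling before reading the values so the summary is stable.
      histogram.disable();
      resolve({
        meanMs: nsToMs(histogram.mean),
        p99Ms: nsToMs(histogram.percentile(99)),
        maxMs: nsToMs(histogram.max),
      });
    }, windowMs);
  });
}
```

Calling `sampleEventLoopLag(60_000)` from inside the network worker, once on the PR branch and once on the base branch, would give a rough before/after comparison. The numbers will not match the dashboard panels exactly, since those may be computed differently.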

Expected behavior

Event loop lag on the network thread stays the same as before.

Steps to reproduce

No response

Additional context

No response

Operating system

Linux

Lodestar version or commit hash

mkeil/aggregate-with-randomness-async-again

twoeths · May 07 '25 08:05