Add better IO metrics

Open willhickey opened this issue 2 years ago • 2 comments

Problem

Our disk metrics lack granularity, making it hard to diagnose IO bottlenecks.

Proposed Solution

Add IOPS and bandwidth metrics for each of the configurable paths:

  • ledger
  • snapshots
  • incremental snapshots
  • accounts
  • accounts index
  • accounts hash cache
  • logs

The underlying system data is reported at the volume level, so paths that share a volume will report identical values, which means the data will be duplicated in most configurations. Even so, this granularity will let us narrow down the source of IO activity, and it will also show how many volumes each node has and which directories are grouped together on the same volume.
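
As a rough illustration of the idea (a sketch, not the validator's actual metrics code), the snippet below derives IOPS and bandwidth from two samples of Linux `/proc/diskstats` text and reports them per configured path. All type and function names (`DiskCounters`, `IoRates`, `parse_diskstats`, `rates`, `rates_per_path`) are hypothetical, and it assumes the path-to-device mapping has already been resolved elsewhere. It also assumes the Linux convention that diskstats sector counts are in 512-byte units.

```rust
use std::collections::HashMap;

/// Cumulative counters for one block device, taken from a /proc/diskstats line.
#[derive(Debug, Clone, Copy, PartialEq)]
struct DiskCounters {
    reads_completed: u64,
    sectors_read: u64,
    writes_completed: u64,
    sectors_written: u64,
}

/// Rates derived from two samples of the cumulative counters.
#[derive(Debug, Clone, Copy, PartialEq)]
struct IoRates {
    read_iops: f64,
    write_iops: f64,
    read_bytes_per_sec: f64,
    write_bytes_per_sec: f64,
}

/// /proc/diskstats reports sectors in 512-byte units.
const SECTOR_BYTES: f64 = 512.0;

/// Parse the text of /proc/diskstats into a map keyed by device name.
/// Lines that are too short or contain non-numeric counters are skipped.
fn parse_diskstats(contents: &str) -> HashMap<String, DiskCounters> {
    let mut devices = HashMap::new();
    for line in contents.lines() {
        let fields: Vec<&str> = line.split_whitespace().collect();
        if fields.len() < 10 {
            continue;
        }
        // Field layout: major minor name reads merged sectors ms writes merged sectors ...
        let num = |i: usize| fields[i].parse::<u64>().ok();
        if let (Some(r), Some(sr), Some(w), Some(sw)) = (num(3), num(5), num(7), num(9)) {
            devices.insert(
                fields[2].to_string(),
                DiskCounters {
                    reads_completed: r,
                    sectors_read: sr,
                    writes_completed: w,
                    sectors_written: sw,
                },
            );
        }
    }
    devices
}

/// Convert two cumulative samples into per-second rates.
/// Returns None when the elapsed time is not positive.
fn rates(prev: &DiskCounters, cur: &DiskCounters, elapsed_secs: f64) -> Option<IoRates> {
    if elapsed_secs <= 0.0 {
        return None;
    }
    // saturating_sub guards against counter resets between samples.
    let delta = |before: u64, after: u64| after.saturating_sub(before) as f64;
    Some(IoRates {
        read_iops: delta(prev.reads_completed, cur.reads_completed) / elapsed_secs,
        write_iops: delta(prev.writes_completed, cur.writes_completed) / elapsed_secs,
        read_bytes_per_sec: delta(prev.sectors_read, cur.sectors_read) * SECTOR_BYTES
            / elapsed_secs,
        write_bytes_per_sec: delta(prev.sectors_written, cur.sectors_written) * SECTOR_BYTES
            / elapsed_secs,
    })
}

/// Report rates for each (path label, device name) pair.
/// Labels that share a device get identical values; unknown devices are skipped.
fn rates_per_path(
    paths: &[(&str, &str)],
    prev: &HashMap<String, DiskCounters>,
    cur: &HashMap<String, DiskCounters>,
    elapsed_secs: f64,
) -> Vec<(String, String, IoRates)> {
    let mut out = Vec::new();
    for (label, device) in paths {
        if let (Some(p), Some(c)) = (prev.get(*device), cur.get(*device)) {
            if let Some(r) = rates(p, c, elapsed_secs) {
                out.push((label.to_string(), device.to_string(), r));
            }
        }
    }
    out
}
```

With a mapping such as `[("ledger", "sda"), ("snapshots", "sda"), ("accounts", "nvme0n1")]`, the ledger and snapshots entries would carry the same numbers. That is the duplication described above, and it is also what reveals which directories share a volume.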

willhickey, Jan 11 '24 18:01

An additional configurable path is the accounts hash cache too. Can that be added to the list here?

brooksprumo, Jan 11 '24 18:01

> An additional configurable path is the accounts hash cache too. Can that be added to the list here?

Added

willhickey, Jan 11 '24 18:01