Add better IO metrics
Problem
Our disk metrics lack granularity, which makes it hard to diagnose IO bottlenecks.
Proposed Solution
Add IOPS and bandwidth metrics for each of the configurable paths:
- ledger
- snapshots
- incremental snapshots
- accounts
- accounts index
- accounts hash cache
- logs
The underlying system data is reported at the volume level, so paths that share a volume will report the same values, and in most configurations the data will be duplicated across several paths. Even so, per-path reporting lets us narrow down the source of IO activity, and it also shows how many volumes each node has and which directories are grouped together on the same volume.
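As a rough illustration of the idea (not a proposed implementation), the sketch below derives IOPS and bandwidth from two samples of volume-level counters and groups the configured paths by the volume they sit on. It assumes the counters come from Linux `/proc/diskstats`-style lines. The names `DiskSample`, `parse_diskstats_line`, `rates`, and `group_paths_by_volume` are hypothetical, as are the device names and the path-to-volume mapping; how paths would actually be resolved to volumes is not specified here.

```python
from collections import defaultdict
from dataclasses import dataclass

# /proc/diskstats reports sector counts in 512-byte units.
SECTOR_BYTES = 512


@dataclass(frozen=True)
class DiskSample:
    """Cumulative counters for one volume (hypothetical container type)."""
    reads: int
    writes: int
    sectors_read: int
    sectors_written: int


def parse_diskstats_line(line):
    """Parse one /proc/diskstats-style line into (device_name, DiskSample)."""
    f = line.split()
    # Fields: major minor name reads_completed reads_merged sectors_read
    #         ms_reading writes_completed writes_merged sectors_written ...
    return f[2], DiskSample(
        reads=int(f[3]),
        writes=int(f[7]),
        sectors_read=int(f[5]),
        sectors_written=int(f[9]),
    )


def rates(prev, curr, interval_secs):
    """Turn two cumulative samples into per-second IOPS and bandwidth."""
    return {
        "read_iops": (curr.reads - prev.reads) / interval_secs,
        "write_iops": (curr.writes - prev.writes) / interval_secs,
        "read_bytes_per_sec":
            (curr.sectors_read - prev.sectors_read) * SECTOR_BYTES / interval_secs,
        "write_bytes_per_sec":
            (curr.sectors_written - prev.sectors_written) * SECTOR_BYTES / interval_secs,
    }


def group_paths_by_volume(path_to_volume):
    """Invert a path -> volume mapping to show which paths share a volume."""
    groups = defaultdict(list)
    for path, volume in path_to_volume.items():
        groups[volume].append(path)
    return dict(groups)
```

With a hypothetical mapping such as `{"ledger": "nvme0n1", "snapshots": "nvme0n1", "accounts": "nvme1n1"}`, `group_paths_by_volume` shows that `ledger` and `snapshots` share a volume, so their reported values would be identical.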
An additional configurable path is the accounts hash cache too. Can that be added to the list here?
Added