aptos-core
[cli][consensus] Add analyze-validator-performance to node cli
Adds utilities to analyze validator performance, based on live on-chain data.
It is fastest to run a full node locally and connect the CLI to it.
Currently this fetches all transactions. If that is too slow for people, we can create an API for just metadata transactions and/or fetch from BigQuery. When we run a load test and BlockMetadata transactions are not the majority, it does slow down.
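To illustrate why fetching every transaction gets slower under load, here is a minimal sketch of tallying per-validator stats from block metadata only. Everything in it is an assumption for illustration: `Txn`, `ValidatorStats`, `tally`, and the field names are hypothetical and are not the aptos-core types or the code in this PR.

```rust
use std::collections::HashMap;

// Hypothetical, simplified transaction type (not the aptos-core type).
enum Txn {
    // Assumed shape: one metadata transaction per block, naming the
    // proposer and any proposers that failed before it.
    BlockMetadata {
        proposer: String,
        failed_proposers: Vec<String>,
    },
    // Any other transaction; irrelevant to the analysis.
    User,
}

// Hypothetical per-validator counters.
#[derive(Debug, Default, PartialEq)]
struct ValidatorStats {
    proposal_successes: u32,
    proposal_failures: u32,
}

// Walk all fetched transactions and count proposals per validator.
fn tally(txns: &[Txn]) -> HashMap<String, ValidatorStats> {
    let mut stats: HashMap<String, ValidatorStats> = HashMap::new();
    for txn in txns {
        // Non-metadata transactions are fetched and then skipped here,
        // which is the overhead that grows during a load test.
        if let Txn::BlockMetadata { proposer, failed_proposers } = txn {
            stats.entry(proposer.clone()).or_default().proposal_successes += 1;
            for failed in failed_proposers {
                stats.entry(failed.clone()).or_default().proposal_failures += 1;
            }
        }
    }
    stats
}

fn main() {
    let txns = vec![
        Txn::BlockMetadata { proposer: "a".into(), failed_proposers: vec![] },
        Txn::User,
        Txn::BlockMetadata { proposer: "b".into(), failed_proposers: vec!["a".into()] },
    ];
    let stats = tally(&txns);
    println!("{:?}", stats.get("a"));
}
```

In this sketch the cost of the loop scales with the total number of transactions, while the useful work scales only with the number of blocks.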
Test Plan: cargo run -p aptos -- node analyze-validator-performance --analyze-mode=all
@ikabiljo - Possible to split this into two PRs for easier review? I see it's addressing two different functionalities, one for each commit?
sorry, github issues with stacked PRs.
Corrected it now, it's now showing only one change.
Other one is a separate PR already.
Sorry, smoke_test was accidentally added; that's how I used it before extracting it and putting it into the CLI. I'll remove it.
Other changes on main allowed me to simplify the fetching logic (by fetching events), so this should also be faster now.
@sitalkedia, @zekun000 this is ready for review
crates/aptos/src/node/mod.rs
line 779 at r2 (raw file):
}
let all_validators: Vec<_> = total_stats.validator_stats.keys().cloned().collect();
if self.analyze_mode == AnalyzeMode::ValidatorHealthOverTime
nit: A match statement would be cleaner to replace this if statement and the statement below.
With a match statement, I would need to duplicate the calls instead (because ALL calls both summaries), so it seems like similar complexity.
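The trade-off in this exchange can be sketched roughly as follows. This is an illustration only, not the code under review: `EpochSummary` is a hypothetical variant name standing in for the other summary, and the two summary functions are placeholders that just record that they ran.

```rust
// Hypothetical stand-in for the CLI's analyze mode. Only `All` and
// `ValidatorHealthOverTime` are named in this thread; `EpochSummary`
// is an assumed name for the second summary.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum AnalyzeMode {
    All,
    EpochSummary,
    ValidatorHealthOverTime,
}

// Placeholder summaries that record which one ran.
fn epoch_summary(out: &mut Vec<&'static str>) {
    out.push("epoch_summary");
}

fn validator_health_over_time(out: &mut Vec<&'static str>) {
    out.push("validator_health_over_time");
}

// `if` form: each summary appears once, guarded by "this mode or All".
fn run_with_if(mode: AnalyzeMode) -> Vec<&'static str> {
    let mut out = Vec::new();
    if mode == AnalyzeMode::EpochSummary || mode == AnalyzeMode::All {
        epoch_summary(&mut out);
    }
    if mode == AnalyzeMode::ValidatorHealthOverTime || mode == AnalyzeMode::All {
        validator_health_over_time(&mut out);
    }
    out
}

// `match` form: exhaustive, but the `All` arm repeats both calls.
fn run_with_match(mode: AnalyzeMode) -> Vec<&'static str> {
    let mut out = Vec::new();
    match mode {
        AnalyzeMode::EpochSummary => epoch_summary(&mut out),
        AnalyzeMode::ValidatorHealthOverTime => validator_health_over_time(&mut out),
        AnalyzeMode::All => {
            epoch_summary(&mut out);
            validator_health_over_time(&mut out);
        }
    }
    out
}

fn main() {
    for mode in [
        AnalyzeMode::All,
        AnalyzeMode::EpochSummary,
        AnalyzeMode::ValidatorHealthOverTime,
    ] {
        // Both forms run the same summaries in the same order.
        assert_eq!(run_with_if(mode), run_with_match(mode));
        println!("{:?}: {:?}", mode, run_with_match(mode));
    }
}
```

The `match` form gets compiler-checked exhaustiveness when a new mode is added, while the `if` form keeps each summary call in a single place. Which one reads better is a judgment call, as the thread suggests.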
:white_check_mark: Forge test success on a501cc65a1f63f84a42a5a5420295e4d3e1b9117
performance benchmark : 7430 TPS, 3993 ms latency, 6000 ms p99 latency, no expired txns
- Grafana dashboard
- Validator 0 logs
- Humio Logs
- Test runner output
- Test run 1 is land-blocking