rusk
rusk copied to clipboard
Investigate block acceptance timing
Summary
Block acceptance is taking ~200ms
{"timestamp":"2023-11-22T11:26:14.936995Z","level":"DEBUG","event":"committee_generated","members":" [0]=pk:npbMNQM1xKYsxzrB w:1,","target":"dusk_consensus::phase","spans":[{"name":"sel","pk":"212Eszgyh7GqMTws","round":7161,"step":0,"name":"main"}]}
{"timestamp":"2023-11-22T11:26:14.937213Z","level":"DEBUG","event":"run event_loop","target":"dusk_consensus::execution_ctx","spans":[{"name":"sel","pk":"212Eszgyh7GqMTws","round":7161,"step":0,"name":"main"}]}
{"timestamp":"2023-11-22T11:26:14.937428Z","level":"DEBUG","event":"iter completed","len":0,"round":7161,"iter":0,"target":"dusk_consensus::execution_ctx"}
{"timestamp":"2023-11-22T11:26:14.937558Z","level":"INFO","message":"Received finalize request","target":"rusk::vm"}
{"timestamp":"2023-11-22T11:26:14.952122Z","level":"INFO","event":"block accepted","height":7161,"iter":0,"hash":"378c44abb97a11e8...dc5600b71a975224","txs":0,"state_hash":"f5f251a24cb8c986...30f22ded1801d8f4","fsv_bitset":129482526,"ssv_bitset":259355773,"block_time":7,"dur_ms":"14","target":"node::chain::acceptor"}
{"timestamp":"2023-11-22T11:26:14.952335Z","level":"INFO","event":"spawn consensus","id":6641,"round":7162,"all":30,"eligible":30,"target":"node::chain::consensus"}
{"timestamp":"2023-11-22T11:26:14.952470Z","level":"WARN","message":"consensus err: Canceled","target":"node::chain"}
{"timestamp":"2023-11-22T11:26:14.952610Z","level":"INFO","event":"block received","src":"wire","blk_height":7162,"blk_hash":"f04878b7118190a7...56fa7c4b59ec9ca8","target":"node::chain"}
{"timestamp":"2023-11-22T11:26:15.151301Z","level":"DEBUG","event":"consensus canceled","round":7162,"target":"dusk_consensus::consensus"}
{"timestamp":"2023-11-22T11:26:15.151761Z","level":"DEBUG","event":"init","name":"sel","round":7162,"step":0,"timeout":5000,"target":"dusk_consensus::selection::step"}
{"timestamp":"2023-11-22T11:26:15.152009Z","level":"DEBUG","event":"execute_step","timeout":5000,"target":"dusk_consensus::phase","spans":[{"name":"sel","pk":"212Eszgyh7GqMTws","round":7162,"step":0,"name":"main"}]}
{"timestamp":"2023-11-22T11:26:15.152194Z","level":"DEBUG","event":"committee_generated","members":" [0]=pk:23YLXnqMDSS7Dts6 w:1,","target":"dusk_consensus::phase","spans":[{"name":"sel","pk":"212Eszgyh7GqMTws","round":7162,"step":0,"name":"main"}]}
{"timestamp":"2023-11-22T11:26:15.152329Z","level":"DEBUG","event":"run event_loop","
Confirmed!
Running it locally showed the bottleneck is probably in verify_block_header
but also abort_and_wait
should be taken into account. This needs to be cpu-profileed and eventually optimized.
Bootstrapping of 30 provisioners in a single machine:
{"level":"INFO","event":"after verify_block_header","height":2,"dur_ms":"4567","target":"node::chain::acceptor"}
{"level":"INFO","event":"after abort_and_wait","height":2,"dur_ms":"780","target":"node::chain::acceptor"}
{"level":"INFO","event":"block accepted","height":2,"iter":0,"hash":"26b1c44d781aafe2...0d49ae87a09d8efa","txs":0,"state_hash":"7be435993b353a3d...f87703e9703c12cb","fsv_bitset":129941471,"ssv_bitset":64928975,"block_time":44,"generator":"22GNXdXKRebsro2F","dur_ms":"16","tar
CPU-profiling data attached proves that the bottleneck is in APK::aggregate.
Perfect, going to bench it on bls-sign and see what can be done
I've added benches that aggregate 30 pks on bls_sign:
However, timing seems to be different, since it result only 26ms to aggregate instead of 200ms as per your profiler
cargo bench bench_aggregate_pk
test benches::bench_aggregate_pk ... bench: 882,331 ns/iter (+/- 30,354)
test benches::bench_aggregate_pk_30 ... bench: 26,540,112 ns/iter (+/- 682,946)
test benches::bench_aggregate_pk_30_each ... bench: 26,623,183 ns/iter (+/- 654,939)
Timing seems to be consistent.
Indeed we perform 2 aggregation per certificate.
And latest main
verifies at least 2 certificate per block (current+prev)
Addressing the performance with a parallel
feature on BLS library
Certainly. Yes, each certificate contains two signatures, and a block can have anywhere from two to (RELAX_ITERATION_THRESHOLD+2) certificates. In simpler terms, we can anticipate a block having between 4 and 24 signatures, with RELAX_ITERATION_THRESHOLD set to 10.
Certainly. Yes, each certificate contains two signatures, and a block can have anywhere from two to (RELAX_ITERATION_THRESHOLD+2) certificates. In simpler terms, we can anticipate a block having between 4 and 24 signatures, with RELAX_ITERATION_THRESHOLD set to 10.
note that failed iterations will typically have around half signatures to aggregate (NilQuorum is 22 sigs) than the agreement quorum (43 sigs), so the impact of their certificate should be aprox 2 failed-iteration certificate = 1 quorum certificate