StackerDB benchmark
Description
The new "ping mode" for the signer requests round-trip latency messages through an extra dedicated slot in the StackerDB contract. Ping/Pong event handling is always enabled: a signer answers incoming Pings and processes incoming Pongs through StackerDB regardless of whether it runs in "ping mode." Enabling "ping mode" spawns an extra thread that periodically sends a Ping command to the RunLoop, which then broadcasts a Ping through StackerDB. To enable it, use the Run subcommand with the option ping-in-millis. The current implementation supports arbitrarily large payloads through the option ping-payload-size.
From the moment it is spawned, the pinger thread sends Ping commands at periodic intervals. Choose a sensible ping cadence, especially when the signer takes a long time to initialize, because Ping commands keep being sent during that time.
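The periodic pinger can be pictured as a plain thread that pushes commands into a channel. The following is a minimal sketch, not the code from this PR: the `Command` enum, `spawn_pinger`, and the `max_pings` bound are hypothetical names introduced for illustration, and it assumes the first Ping is sent immediately on spawn.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical command type; the signer's real command enum may differ.
#[derive(Debug, PartialEq)]
pub enum Command {
    Ping { id: u64, payload_size: usize },
}

/// Spawn a thread that pushes one `Ping` command per `interval`.
/// Assumption: the first Ping goes out right away, then the thread sleeps.
/// The thread stops after `max_pings` pings or once the receiver is dropped.
pub fn spawn_pinger(
    tx: Sender<Command>,
    interval: Duration,
    payload_size: usize,
    max_pings: u64,
) -> JoinHandle<()> {
    thread::spawn(move || {
        for id in 0..max_pings {
            // A send error means the receiving run loop has gone away.
            if tx.send(Command::Ping { id, payload_size }).is_err() {
                break;
            }
            thread::sleep(interval);
        }
    })
}
```

In this sketch the receiving side would be the run loop, draining the channel and broadcasting one Ping per command. The `max_pings` bound exists only to keep the example finite.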
The "benchmark" measures the time from the request to insert a StackerDB chunk until signers process the resulting event. It does not measure the time it takes for an observer to get/receive chunks. As a result, the signer's workload affects the variance of the measured RTTs.
The event handling happens before parsing and handling Signer Packets. The first "tick" Instant is captured as soon as the RunCommand is processed and before the RPC call to the stacks-node. The "tock" Instant is calculated once the first "tick" is retrieved from the bookkeeping map.
There is no RTT post-processing; once the RTT is calculated, it is dumped into the logger.
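The tick/tock bookkeeping described above could look roughly like the sketch below. It is an illustration under assumptions, not the PR's implementation: `RttBook`, `tick`, and `tock` are hypothetical names, and the tick is kept in the map after a lookup so that every incoming Pong for the same id yields its own RTT.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical bookkeeping map from ping id to its "tick" instant.
#[derive(Default)]
pub struct RttBook {
    ticks: HashMap<u64, Instant>,
}

impl RttBook {
    /// Capture the "tick" when the Ping command is processed,
    /// before any RPC call to the node is made.
    pub fn tick(&mut self, id: u64) {
        self.ticks.insert(id, Instant::now());
    }

    /// On a Pong, look up the matching tick and return the elapsed time.
    /// Returns `None` when the id is unknown (for example, another signer's ping).
    /// Assumption: the entry is not removed, so each Pong produces an RTT.
    pub fn tock(&self, id: u64) -> Option<Duration> {
        self.ticks.get(&id).map(|tick| tick.elapsed())
    }
}
```

A caller would log the returned `Duration` directly, matching the "no post-processing" behaviour. Evicting old ids is left out of the sketch.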
sequenceDiagram
participant PeriodicPinger
participant Signer0
participant StackerDB
participant Signer1
PeriodicPinger->>Signer0: Push Command(Ping)
Signer0->>+Signer0: Store tick instant
Signer0->>-StackerDB: Broadcast ping
par
Note right of StackerDB: slot_id: 0, signer_id: 0, slot_version: 0
StackerDB->>+Signer0: Ping { id }
alt self is signer_id
Signer0-->>-Signer0: end
end
StackerDB->>+Signer1: Ping { id }
activate Signer1
alt self is not signer_id
Signer1-->>-StackerDB: Broadcast pong
end
end
par
Note right of StackerDB: slot_id: 1, signer_id: 1, slot_version: 0
StackerDB->>+Signer0: Pong { id }
alt Fetch ping with pong.id
Signer0-->>Signer0: calculate tick.elapsed()
Signer0->>-PeriodicPinger: store RTT
end
StackerDB->>+Signer1: Pong { id }
alt self is signer_id or pong.id is not found
Signer1-->>-Signer1: Skip
end
end
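The branching in the diagram can be summarized as a small decision function. This is a hedged sketch only: `Message`, `Action`, and `handle_event` are hypothetical names, and `pending` stands in for the ids currently held in the bookkeeping map.

```rust
/// Hypothetical message type mirroring the diagram.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Message {
    Ping { id: u64 },
    Pong { id: u64 },
}

/// What a signer does in response to an incoming StackerDB event.
#[derive(Debug, PartialEq)]
pub enum Action {
    BroadcastPong { id: u64 },
    RecordRtt { id: u64 },
    Skip,
}

/// Decide how `self_id` reacts to `msg` written by `sender_id`.
/// `pending` lists the ping ids this signer is still tracking.
pub fn handle_event(self_id: u32, sender_id: u32, msg: Message, pending: &[u64]) -> Action {
    match msg {
        // A signer never answers its own Ping.
        Message::Ping { .. } if sender_id == self_id => Action::Skip,
        // Any other signer's Ping is answered with a Pong carrying the same id.
        Message::Ping { id } => Action::BroadcastPong { id },
        // A Pong from another signer that matches a tracked ping yields an RTT.
        Message::Pong { id } if sender_id != self_id && pending.contains(&id) => {
            Action::RecordRtt { id }
        }
        // Own Pongs and Pongs with unknown ids are ignored.
        Message::Pong { .. } => Action::Skip,
    }
}
```

For the two-signer flow in the diagram, Signer1 maps Signer0's Ping to `BroadcastPong`, and Signer0 maps the resulting Pong to `RecordRtt`.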
Applicable issues
Additional info (benefits, drawbacks, caveats)
Checklist
- [X] Test coverage for new or modified code paths
- [ ] Changelog is updated
- [ ] Required documentation changes (e.g., docs/rpc/openapi.yaml and rpc-endpoints.md for v2 endpoints, event-dispatcher.md for new events)
- [ ] New clarity functions have corresponding PR in clarity-benchmarking repo
- [ ] New integration test(s) added to bitcoin-tests.yml
Codecov Report
Attention: 82 lines in your changes are missing coverage. Please review.
Comparison is base (26d4833) 80.82% compared to head (decf2f5) 83.38%. Report is 9 commits behind head on next.
Additional details and impacted files
@@ Coverage Diff @@
## next #4167 +/- ##
==========================================
+ Coverage 80.82% 83.38% +2.56%
==========================================
Files 435 437 +2
Lines 309211 309787 +576
==========================================
+ Hits 249918 258327 +8409
+ Misses 59293 51460 -7833
@CAGS295 can you share the minimum setup needed to test this in a real-world environment? For example, 2 VMs with public IPs running your fork in this PR?
2 VMs with public IPs running your fork in this PR
- Each VM runs as a seed node with a signer bin.
- At least one signer must be run with pinging options:
  cargo run -p stacks-signer -- run --config=/path/to/config.toml --ping-in-millis 10000 --ping-payload-size=4096
- The payload size option should be the max message size allowed in the pox contract.
- ~~The slots per signer in the contract must also be increased by one~~.
- The slots per signer in the contract should match this variable.
- If everything goes well and once the signers start processing events, you should expect lines like "New RTT for id 1234: 7.7s".
This is the time it took from sending a command to the run loop until the Ping requester processes each incoming Pong.
One last thing - I just saw that this is based against develop. It should be based against next since that's where Nakamoto development is happening.
Otherwise LGTM. I'll approve once it's based on next.
@CAGS295 can you resolve the conflicts here?