Keyless prover VK rotation with zero downtime
Description
- New on-chain config: `SetupsVKs` to store (Setup, VK) pairs.
- New keyless transaction field that takes a `setup_id`.
- In keyless txn validation, if a `setup_id` is specified, use the corresponding VK from `SetupsVKs` config.
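To make the intended validation behavior easier to review, here is a minimal sketch of the VK selection step. It is not the code in this PR: the `u64` type for `setup_id`, the `select_vk` function, the `VkSelectionError` type, the byte-vector `VerificationKey` placeholder, and the fallback to a default VK when no `setup_id` is given are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Placeholder for a verification key (assumption: the real type is not
/// shown in this description, so raw bytes stand in for it).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct VerificationKey(pub Vec<u8>);

/// Hypothetical stand-in for the `SetupsVKs` on-chain config,
/// modeled here as a map from setup id to VK.
#[derive(Default)]
pub struct SetupsVKs {
    vks: HashMap<u64, VerificationKey>,
}

impl SetupsVKs {
    /// Registers (or replaces) the VK for a given setup id.
    pub fn insert(&mut self, setup_id: u64, vk: VerificationKey) {
        self.vks.insert(setup_id, vk);
    }
}

/// Hypothetical error for a transaction that names an unknown setup.
#[derive(Debug, PartialEq, Eq)]
pub enum VkSelectionError {
    UnknownSetup(u64),
}

/// Picks the VK used to validate a keyless transaction.
/// - `Some(id)`: use the VK registered under `id` in `SetupsVKs`.
/// - `None`: fall back to `default_vk` (assumed behavior).
pub fn select_vk<'a>(
    setups: &'a SetupsVKs,
    default_vk: &'a VerificationKey,
    setup_id: Option<u64>,
) -> Result<&'a VerificationKey, VkSelectionError> {
    match setup_id {
        Some(id) => setups
            .vks
            .get(&id)
            .ok_or(VkSelectionError::UnknownSetup(id)),
        None => Ok(default_vk),
    }
}
```

Under these assumptions, the old and new VKs can both be accepted while a prover rotation is in progress, which is what would allow the rotation to happen without downtime.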
Type of Change
- [x] Refactoring
Which Components or Systems Does This Change Impact?
- [x] Move/Aptos Virtual Machine
- [x] Aptos Framework
How Has This Been Tested?
New smoke test.
Key Areas to Review
- Does this refactoring really NOT need a feature flag?
Checklist
- [x] I have read and followed the CONTRIBUTING doc
- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I identified and added all stakeholders and component owners affected by this change as reviewers
- [x] I tested both happy and unhappy path of the functionality
- [x] I have made corresponding changes to the documentation
⏱️ 4h 40m total CI duration on this PR
🚨 1 job on the last run was significantly faster/slower than expected
| Job | Duration | vs 7d avg | Delta |
|---|---|---|---|
| forge-compat-test / forge | 19m | 15m | |
Codecov Report
Attention: Patch coverage is 0% with 197 lines in your changes missing coverage. Please review.
Project coverage is 59.0%. Comparing base (f2a427f) to head (a4bb6ff). Report is 201 commits behind head on main.
:exclamation: There is a different number of reports uploaded between BASE (f2a427f) and HEAD (a4bb6ff).
HEAD has 1 upload less than BASE
| Flag | BASE (f2a427f) | HEAD (a4bb6ff) |
|---|---|---|
| | 2 | 1 |
Additional details and impacted files
@@ Coverage Diff @@
## main #14154 +/- ##
===========================================
- Coverage 70.7% 59.0% -11.7%
===========================================
Files 2338 827 -1511
Lines 466716 201089 -265627
===========================================
- Hits 330314 118830 -211484
+ Misses 136402 82259 -54143
Forge is running suite realistic_env_max_load on a4bb6ff307d58267d16b318f391ed1724bcd3eaa
- Grafana dashboard (auto-refresh)
- Humio Logs
- Axiom Logs
- Validator CPU Profile
- Fullnode CPU Profile
- Test runner output
- Test run is land-blocking
Forge is running suite compat on 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> a4bb6ff307d58267d16b318f391ed1724bcd3eaa
- Test run is land-blocking
:x: Forge suite realistic_env_max_load failure on a4bb6ff307d58267d16b318f391ed1724bcd3eaa
two traffics test: inner traffic : committed: 9323.590141455545 txn/s, latency: 4292.5837488434545 ms, (p50: 3600 ms, p90: 6300 ms, p99: 17500 ms), latency samples: 3545040
two traffics test : committed: 100.05358883322234 txn/s, latency: 3352.8186170212766 ms, (p50: 2100 ms, p90: 5100 ms, p99: 20000 ms), latency samples: 1880
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.261, avg: 0.224", "QsPosToProposal: max: 1.279, avg: 1.138", "ConsensusProposalToOrdered: max: 0.313, avg: 0.291", "ConsensusOrderedToCommit: max: 0.413, avg: 0.400", "ConsensusProposalToCommit: max: 0.704, avg: 0.691"]
Test Failed: check for success
Caused by:
Failed latency check, for ["P90 latency is 5.1s and exceeds limit of 4.5s"]
Stack backtrace:
0: anyhow::error::<impl anyhow::Error>::msg
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/anyhow-1.0.79/src/error.rs:83:36
1: aptos_forge::success_criteria::SuccessCriteriaChecker::check_latency
at ./testsuite/forge/src/success_criteria.rs:574:13
2: aptos_forge::success_criteria::SuccessCriteriaChecker::check_for_success::{{closure}}
at ./testsuite/forge/src/success_criteria.rs:298:9
3: aptos_forge::interface::network::NetworkContext::check_for_success::{{closure}}
at ./testsuite/forge/src/interface/network.rs:112:10
4: <dyn aptos_testcases::NetworkLoadTest as aptos_forge::interface::network::NetworkTest>::run::{{closure}}
at ./testsuite/testcases/src/lib.rs:321:14
5: <core::pin::Pin<P> as core::future::future::Future>::poll
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/future/future.rs:123:9
6: <aptos_testcases::two_traffics_test::TwoTrafficsTest as aptos_forge::interface::network::NetworkTest>::run::{{closure}}
at ./testsuite/testcases/src/two_traffics_test.rs:76:47
7: <core::pin::Pin<P> as core::future::future::Future>::poll
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/future/future.rs:123:9
8: <aptos_testcases::CompositeNetworkTest as aptos_forge::interface::network::NetworkTest>::run::{{closure}}
at ./testsuite/testcases/src/lib.rs:617:37
9: <core::pin::Pin<P> as core::future::future::Future>::poll
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/future/future.rs:123:9
10: tokio::runtime::park::CachedParkThread::block_on::{{closure}}
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/park.rs:282:63
11: tokio::runtime::coop::with_budget
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/coop.rs:107:5
12: tokio::runtime::coop::budget
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/coop.rs:73:5
13: tokio::runtime::park::CachedParkThread::block_on
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/park.rs:282:31
14: tokio::runtime::context::blocking::BlockingRegionGuard::block_on
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context/blocking.rs:66:9
15: tokio::runtime::handle::Handle::block_on::{{closure}}
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/handle.rs:310:22
16: tokio::runtime::context::runtime::enter_runtime
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context/runtime.rs:65:16
17: tokio::runtime::handle::Handle::block_on
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/handle.rs:309:9
18: aptos_forge::runner::Forge<F>::run::{{closure}}
at ./testsuite/forge/src/runner.rs:611:49
19: aptos_forge::runner::run_test
at ./testsuite/forge/src/runner.rs:684:11
20: aptos_forge::runner::Forge<F>::run
at ./testsuite/forge/src/runner.rs:611:30
21: forge::run_forge
at ./testsuite/forge-cli/src/main.rs:429:11
22: forge::main
at ./testsuite/forge-cli/src/main.rs:355:21
23: core::ops::function::FnOnce::call_once
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/ops/function.rs:250:5
24: std::sys_common::backtrace::__rust_begin_short_backtrace
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/sys_common/backtrace.rs:155:18
25: std::rt::lang_start::{{closure}}
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/rt.rs:166:18
26: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/ops/function.rs:284:13
27: std::panicking::try::do_call
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:552:40
28: std::panicking::try
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:516:19
29: std::panic::catch_unwind
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panic.rs:146:14
30: std::rt::lang_start_internal::{{closure}}
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/rt.rs:148:48
31: std::panicking::try::do_call
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:552:40
32: std::panicking::try
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:516:19
33: std::panic::catch_unwind
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panic.rs:146:14
34: std::rt::lang_start_internal
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/rt.rs:148:20
35: main
36: __libc_start_main
37: _start
Trailing Log Lines:
Swarm logs can be found here: See fgi output for more information.
{"level":"INFO","source":{"package":"aptos_forge","file":"testsuite/forge/src/backend/k8s/cluster_helper.rs:292"},"thread_name":"main","hostname":"forge-e2e-pr-14154-1722299752-a4bb6ff307d58267d16b318f391ed1724","timestamp":"2024-07-30T00:51:18.137202Z","message":"Deleting namespace forge-e2e-pr-14154: Some(NamespaceStatus { conditions: None, phase: Some(\"Terminating\") })"}
{"level":"INFO","source":{"package":"aptos_forge","file":"testsuite/forge/src/backend/k8s/cluster_helper.rs:400"},"thread_name":"main","hostname":"forge-e2e-pr-14154-1722299752-a4bb6ff307d58267d16b318f391ed1724","timestamp":"2024-07-30T00:51:18.137229Z","message":"aptos-node resources for Forge removed in namespace: forge-e2e-pr-14154"}
failures:
CompositeNetworkTest
test result: FAILED. 0 passed; 1 failed; 0 filtered out
Failed to run tests:
Tests Failed
Error: Tests Failed
Stack backtrace:
0: anyhow::error::<impl anyhow::Error>::msg
at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/anyhow-1.0.79/src/error.rs:83:36
1: aptos_forge::runner::Forge<F>::run
at ./testsuite/forge/src/runner.rs:636:13
2: forge::run_forge
at ./testsuite/forge-cli/src/main.rs:429:11
3: forge::main
at ./testsuite/forge-cli/src/main.rs:355:21
4: core::ops::function::FnOnce::call_once
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/ops/function.rs:250:5
5: std::sys_common::backtrace::__rust_begin_short_backtrace
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/sys_common/backtrace.rs:155:18
6: std::rt::lang_start::{{closure}}
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/rt.rs:166:18
7: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/ops/function.rs:284:13
8: std::panicking::try::do_call
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:552:40
9: std::panicking::try
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:516:19
10: std::panic::catch_unwind
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panic.rs:146:14
11: std::rt::lang_start_internal::{{closure}}
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/rt.rs:148:48
12: std::panicking::try::do_call
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:552:40
13: std::panicking::try
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:516:19
14: std::panic::catch_unwind
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panic.rs:146:14
15: std::rt::lang_start_internal
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/rt.rs:148:20
16: main
17: __libc_start_main
18: _start
Debugging output:
NAME READY STATUS RESTARTS AGE
aptos-node-0-fullnode-eforge233-0 1/1 Running 0 14m
aptos-node-0-validator-0 1/1 Running 0 14m
aptos-node-1-fullnode-eforge233-0 1/1 Running 0 14m
aptos-node-1-validator-0 1/1 Running 0 14m
aptos-node-2-fullnode-eforge233-0 1/1 Running 0 14m
aptos-node-2-validator-0 1/1 Running 0 14m
aptos-node-3-fullnode-eforge233-0 1/1 Running 0 14m
aptos-node-3-validator-0 1/1 Running 0 14m
aptos-node-4-fullnode-eforge233-0 1/1 Running 0 14m
aptos-node-4-validator-0 1/1 Running 0 14m
aptos-node-5-validator-0 1/1 Running 0 14m
aptos-node-6-validator-0 1/1 Running 0 14m
genesis-aptos-genesis-eforge233-jwh8t 0/1 Completed 0 15m
- Test run is land-blocking
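For readers unfamiliar with the failure reported above ("P90 latency is 5.1s and exceeds limit of 4.5s"), the following is a rough sketch of how a percentile latency check of this kind can work. It is not the implementation in `testsuite/forge/src/success_criteria.rs`; the `check_latency` signature and the tuple layout are hypothetical and chosen only to reproduce the shape of the message.

```rust
use std::time::Duration;

/// Compares observed percentile latencies against their limits.
/// Each entry is (percentile label, observed latency, allowed limit).
/// Returns one message per violated limit, or Ok(()) if all pass.
/// This is an illustrative sketch, not the Forge implementation.
pub fn check_latency(entries: &[(&str, Duration, Duration)]) -> Result<(), Vec<String>> {
    let failures: Vec<String> = entries
        .iter()
        // A limit is violated only when the observed value is strictly above it.
        .filter(|(_, observed, limit)| observed > limit)
        .map(|(label, observed, limit)| {
            format!(
                "{} latency is {:.1}s and exceeds limit of {:.1}s",
                label,
                observed.as_secs_f64(),
                limit.as_secs_f64()
            )
        })
        .collect();

    if failures.is_empty() {
        Ok(())
    } else {
        Err(failures)
    }
}
```

With the p90 of 5100 ms reported for the two traffics test and the 4.5s limit from the error, this sketch yields the same message as the log. Whether the failure is caused by this change or by unrelated variance cannot be determined from the check alone.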
:white_check_mark: Forge suite compat success on 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> a4bb6ff307d58267d16b318f391ed1724bcd3eaa
Compatibility test results for 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> a4bb6ff307d58267d16b318f391ed1724bcd3eaa (PR)
1. Check liveness of validators at old version: 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5
compatibility::simple-validator-upgrade::liveness-check : committed: 7049.722701141375 txn/s, latency: 4029.6089231720725 ms, (p50: 3300 ms, p90: 5100 ms, p99: 25300 ms), latency samples: 291040
2. Upgrading first Validator to new version: a4bb6ff307d58267d16b318f391ed1724bcd3eaa
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 7516.632390717834 txn/s, latency: 3614.660722995266 ms, (p50: 4000 ms, p90: 4400 ms, p99: 4500 ms), latency samples: 139420
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 6407.407149864283 txn/s, latency: 4634.957230968998 ms, (p50: 4400 ms, p90: 7000 ms, p99: 7600 ms), latency samples: 246440
3. Upgrading rest of first batch to new version: a4bb6ff307d58267d16b318f391ed1724bcd3eaa
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 6142.443795864756 txn/s, latency: 4293.309752547307 ms, (p50: 4800 ms, p90: 5200 ms, p99: 5600 ms), latency samples: 123660
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 6797.869585565258 txn/s, latency: 4629.830172119972 ms, (p50: 4800 ms, p90: 6400 ms, p99: 7000 ms), latency samples: 234720
4. upgrading second batch to new version: a4bb6ff307d58267d16b318f391ed1724bcd3eaa
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 10993.233803277955 txn/s, latency: 2546.2926749379653 ms, (p50: 2800 ms, p90: 3100 ms, p99: 3300 ms), latency samples: 201500
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 9354.59274853686 txn/s, latency: 3610.889398715356 ms, (p50: 3000 ms, p90: 8700 ms, p99: 10300 ms), latency samples: 336280
5. check swarm health
Compatibility test for 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> a4bb6ff307d58267d16b318f391ed1724bcd3eaa passed
Test Ok
- Test run is land-blocking