aptos-core icon indicating copy to clipboard operation
aptos-core copied to clipboard

[quorum store] filter txns seen in remote batches on batch creation

Open bchocho opened this issue 1 year ago • 16 comments

Description

We filter txns seen from a remote batch for a short time (~500ms). This reduces duplicate txns created in an adversarial client/PFN scenario -- when transactions are duplicated to all VFNs.

Builds on #13007

Type of Change

  • [ ] New feature
  • [ ] Bug fix
  • [ ] Breaking change
  • [x] Performance improvement
  • [ ] Refactoring
  • [ ] Dependency update
  • [ ] Documentation update
  • [ ] Tests

Which Components or Systems Does This Change Impact?

  • [x] Validator Node
  • [ ] Full Node (API, Indexer, etc.)
  • [ ] Move/Aptos Virtual Machine
  • [ ] Aptos Framework
  • [ ] Aptos CLI/SDK
  • [ ] Developer Infrastructure
  • [ ] Other (specify)

How Has This Been Tested?

Added the new forge test pfn_spam_duplicates and observed performance

Key Areas to Review

Checklist

  • [ ] I have read and followed the CONTRIBUTING doc
  • [ ] I have performed a self-review of my own code
  • [ ] I have commented my code, particularly in hard-to-understand areas
  • [ ] I identified and added all stakeholders and component owners affected by this change as reviewers
  • [ ] I tested both happy and unhappy path of the functionality
  • [ ] I have made corresponding changes to the documentation

bchocho avatar Apr 24 '24 22:04 bchocho

⏱️ 10h 32m total CI duration on this PR
Job Cumulative Duration Recent Runs
forge-e2e-test / forge 2h 3m 🟥🟥🟩🟩🟩 (+1 more)
rust-targeted-unit-tests 1h 50m 🟩🟩🟩🟩 (+2 more)
rust-images / rust-all 1h 14m 🟩🟩🟩🟩🟩 (+1 more)
rust-move-tests 57m 🟩🟩🟩🟩 (+3 more)
execution-performance / single-node-performance 50m 🟩🟩
rust-smoke-tests 43m 🟥
rust-lints 33m 🟥🟩🟩🟩 (+2 more)
windows-build 30m 🟩
rust-unit-tests 27m 🟩
run-tests-main-branch 26m 🟩🟩🟩🟩 (+2 more)
forge-compat-test / forge 13m 🟩
general-lints 11m 🟩🟩🟩🟩 (+2 more)
check-dynamic-deps 10m 🟩🟩🟩🟩🟩 (+3 more)
cli-e2e-tests / run-cli-tests 6m 🟩
rust-build-cached-packages 4m 🟩
check 4m 🟩
semgrep/ci 3m 🟩🟩🟩🟩🟩 (+3 more)
file_change_determinator 1m 🟩🟩🟩🟩🟩 (+3 more)
file_change_determinator 1m 🟩🟩🟩🟩🟩 (+2 more)
file_change_determinator 1m 🟩🟩🟩🟩🟩 (+1 more)
node-api-compatibility-tests / node-api-compatibility-tests 50s 🟩
permission-check 29s 🟩🟩🟩🟩🟩 (+3 more)
permission-check 24s 🟩🟩🟩🟩🟩 (+3 more)
permission-check 24s 🟩🟩🟩🟩🟩 (+3 more)
permission-check 20s 🟩🟩🟩🟩🟩 (+3 more)
permission-check 15s 🟩🟩🟩🟩🟩 (+1 more)
determine-docker-build-metadata 13s 🟩🟩🟩🟩🟩 (+1 more)
execution-performance / file_change_determinator 10s 🟩

🚨 2 jobs on the last run were significantly faster/slower than expected

Job Duration vs 7d avg Delta
rust-smoke-tests 43m 34m +25%
rust-move-tests 3m 9m -65%

settingsfeedbackdocs ⋅ learn more about trunk.io

trunk-io[bot] avatar Apr 24 '24 22:04 trunk-io[bot]

Forge is running suite realistic_env_max_load on 36c2f64a700155805b5ad42a0fcdd1f2f79d3752

github-actions[bot] avatar Apr 24 '24 22:04 github-actions[bot]

:white_check_mark: Forge suite realistic_env_max_load success on 36c2f64a700155805b5ad42a0fcdd1f2f79d3752

PFNPerformance : committed: 3236 txn/s, latency: 12306 ms, (p50: 11800 ms, p90: 13800 ms, p99: 25900 ms), latency samples: 1381977
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.178, avg: 0.177", "QsPosToProposal: max: 0.453, avg: 0.280", "ConsensusProposalToOrdered: max: 0.406, avg: 0.367", "ConsensusOrderedToCommit: max: 0.253, avg: 0.245", "ConsensusProposalToCommit: max: 0.648, avg: 0.612"]
Max round gap was 2 [limit 4] at version 957301. Max no progress secs was 7.006265 [limit 10] at version 947412.
Test Ok

github-actions[bot] avatar Apr 24 '24 22:04 github-actions[bot]

Forge is running suite realistic_env_max_load on 1b66af8683f3fbd20012443e04c57fa65760b133

github-actions[bot] avatar Apr 25 '24 02:04 github-actions[bot]

:x: Forge suite realistic_env_max_load failure on 1b66af8683f3fbd20012443e04c57fa65760b133

PFNPerformance : committed: 2060 txn/s, latency: 19329 ms, (p50: 19000 ms, p90: 20400 ms, p99: 26500 ms), latency samples: 879840
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.178, avg: 0.177", "QsPosToProposal: max: 0.624, avg: 0.568", "ConsensusProposalToOrdered: max: 0.436, avg: 0.428", "ConsensusOrderedToCommit: max: 0.259, avg: 0.255", "ConsensusProposalToCommit: max: 0.695, avg: 0.683"]
Test Failed: check for success

Caused by:
    TPS requirement failed. Average TPS 2060, minimum TPS requirement 2500. Full stats: committed: 2060 txn/s, latency: 19329 ms, (p50: 19000 ms, p90: 20400 ms, p99: 26500 ms), latency samples: 879840

Stack backtrace:
   0: anyhow::error::<impl anyhow::Error>::msg
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/anyhow-1.0.79/src/error.rs:83:36
   1: aptos_forge::success_criteria::SuccessCriteriaChecker::check_tps
             at ./testsuite/forge/src/success_criteria.rs:436:13
   2: aptos_forge::success_criteria::SuccessCriteriaChecker::check_throughput
             at ./testsuite/forge/src/success_criteria.rs:489:9
   3: aptos_forge::success_criteria::SuccessCriteriaChecker::check_for_success::{{closure}}
             at ./testsuite/forge/src/success_criteria.rs:271:9
   4: tokio::runtime::park::CachedParkThread::block_on::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/park.rs:282:63
   5: tokio::runtime::coop::with_budget
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/coop.rs:107:5
   6: tokio::runtime::coop::budget
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/coop.rs:73:5
   7: tokio::runtime::park::CachedParkThread::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/park.rs:282:31
   8: tokio::runtime::context::blocking::BlockingRegionGuard::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context/blocking.rs:66:9
   9: tokio::runtime::scheduler::multi_thread::MultiThread::block_on::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/mod.rs:87:13
  10: tokio::runtime::context::runtime::enter_runtime
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context/runtime.rs:65:16
  11: tokio::runtime::scheduler::multi_thread::MultiThread::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/mod.rs:86:9
  12: tokio::runtime::runtime::Runtime::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/runtime.rs:350:50
  13: aptos_forge::interface::network::NetworkContext::check_for_success
             at ./testsuite/forge/src/interface/network.rs:71:9
  14: <dyn aptos_testcases::NetworkLoadTest as aptos_forge::interface::network::NetworkTest>::run
             at ./testsuite/testcases/src/lib.rs:229:13
  15: aptos_forge::runner::Forge<F>::run::{{closure}}
             at ./testsuite/forge/src/runner.rs:598:42
  16: aptos_forge::runner::run_test
             at ./testsuite/forge/src/runner.rs:666:11
  17: aptos_forge::runner::Forge<F>::run
             at ./testsuite/forge/src/runner.rs:598:30
  18: forge::run_forge
             at ./testsuite/forge-cli/src/main.rs:427:11
  19: forge::main
             at ./testsuite/forge-cli/src/main.rs:353:21
  20: core::ops::function::FnOnce::call_once
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/function.rs:250:5
  21: std::sys_common::backtrace::__rust_begin_short_backtrace
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:154:18
  22: std::rt::lang_start::{{closure}}
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:167:18
  23: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/function.rs:284:13
  24: std::panicking::try::do_call
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:552:40
  25: std::panicking::try
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:516:19
  26: std::panic::catch_unwind
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panic.rs:142:14
  27: std::rt::lang_start_internal::{{closure}}
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:148:48
  28: std::panicking::try::do_call
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:552:40
  29: std::panicking::try
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:516:19
  30: std::panic::catch_unwind
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panic.rs:142:14
  31: std::rt::lang_start_internal
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:148:20
  32: std::rt::lang_start
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:166:17
  33: __libc_start_main
  34: _start
Trailing Log Lines:
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:516:19
  30: std::panic::catch_unwind
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panic.rs:142:14
  31: std::rt::lang_start_internal
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:148:20
  32: std::rt::lang_start
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:166:17
  33: __libc_start_main
  34: _start


Swarm logs can be found here: See fgi output for more information.
{"level":"INFO","source":{"package":"aptos_forge","file":"testsuite/forge/src/backend/k8s/cluster_helper.rs:292"},"thread_name":"main","hostname":"forge-e2e-pr-13019-1714012621-1b66af8683f3fbd20012443e04c57fa65","timestamp":"2024-04-25T02:59:37.567318Z","message":"Deleting namespace forge-e2e-pr-13019: Some(NamespaceStatus { conditions: None, phase: Some(\"Terminating\") })"}
{"level":"INFO","source":{"package":"aptos_forge","file":"testsuite/forge/src/backend/k8s/cluster_helper.rs:400"},"thread_name":"main","hostname":"forge-e2e-pr-13019-1714012621-1b66af8683f3fbd20012443e04c57fa65","timestamp":"2024-04-25T02:59:37.567335Z","message":"aptos-node resources for Forge removed in namespace: forge-e2e-pr-13019"}
Failed to run tests:
Tests Failed

failures:
    PFNPerformance

test result: FAILED. 0 passed; 1 failed; 0 filtered out

Error: Tests Failed

Stack backtrace:
   0: anyhow::error::<impl anyhow::Error>::msg
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/anyhow-1.0.79/src/error.rs:83:36
   1: aptos_forge::runner::Forge<F>::run
             at ./testsuite/forge/src/runner.rs:618:13
   2: forge::run_forge
             at ./testsuite/forge-cli/src/main.rs:427:11
   3: forge::main
             at ./testsuite/forge-cli/src/main.rs:353:21
   4: core::ops::function::FnOnce::call_once
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/function.rs:250:5
   5: std::sys_common::backtrace::__rust_begin_short_backtrace
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:154:18
   6: std::rt::lang_start::{{closure}}
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:167:18
   7: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/function.rs:284:13
   8: std::panicking::try::do_call
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:552:40
   9: std::panicking::try
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:516:19
  10: std::panic::catch_unwind
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panic.rs:142:14
  11: std::rt::lang_start_internal::{{closure}}
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:148:48
  12: std::panicking::try::do_call
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:552:40
  13: std::panicking::try
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:516:19
  14: std::panic::catch_unwind
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panic.rs:142:14
  15: std::rt::lang_start_internal
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:148:20
  16: std::rt::lang_start
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:166:17
  17: __libc_start_main
  18: _start
Debugging output:
NAME                                READY   STATUS    RESTARTS   AGE
aptos-node-0-fullnode-eforge11-0    1/1     Running   0          21m
aptos-node-0-validator-0            1/1     Running   0          21m
aptos-node-1-fullnode-eforge11-0    1/1     Running   0          21m
aptos-node-1-validator-0            1/1     Running   0          21m
aptos-node-10-fullnode-eforge11-0   1/1     Running   0          21m
aptos-node-10-validator-0           1/1     Running   0          21m
aptos-node-11-fullnode-eforge11-0   1/1     Running   0          21m
aptos-node-11-validator-0           1/1     Running   0          21m
aptos-node-12-fullnode-eforge11-0   1/1     Running   0          21m
aptos-node-12-validator-0           1/1     Running   0          21m
aptos-node-13-fullnode-eforge11-0   1/1     Running   0          21m
aptos-node-13-validator-0           1/1     Running   0          21m
aptos-node-14-fullnode-eforge11-0   1/1     Running   0          21m
aptos-node-14-validator-0           1/1     Running   0          21m
aptos-node-15-fullnode-eforge11-0   1/1     Running   0          21m
aptos-node-15-validator-0           1/1     Running   0          21m
aptos-node-16-fullnode-eforge11-0   1/1     Running   0          21m
aptos-node-16-validator-0           1/1     Running   0          21m
aptos-node-17-fullnode-eforge11-0   1/1     Running   0          21m
aptos-node-17-validator-0           1/1     Running   0          21m
aptos-node-18-fullnode-eforge11-0   1/1     Running   0          21m
aptos-node-18-validator-0           1/1     Running   0          21m
aptos-node-19-fullnode-eforge11-0   1/1     Running   0          21m
aptos-node-19-validator-0           1/1     Running   0          21m
aptos-node-2-fullnode-eforge11-0    1/1     Running   0          21m
aptos-node-2-validator-0            1/1     Running   0          21m
aptos-node-3-fullnode-eforge11-0    1/1     Running   0          21m
aptos-node-3-validator-0            1/1     Running   0          21m
aptos-node-4-fullnode-eforge11-0    1/1     Running   0          21m
aptos-node-4-validator-0            1/1     Running   0          21m
aptos-node-5-fullnode-eforge11-0    1/1     Running   0          21m
aptos-node-5-validator-0            1/1     Running   0          21m
aptos-node-6-fullnode-eforge11-0    1/1     Running   0          21m
aptos-node-6-validator-0            1/1     Running   0          21m
aptos-node-7-fullnode-eforge11-0    1/1     Running   0          21m
aptos-node-7-validator-0            1/1     Running   0          21m
aptos-node-8-fullnode-eforge11-0    1/1     Running   0          21m
aptos-node-8-validator-0            1/1     Running   0          21m
aptos-node-9-fullnode-eforge11-0    1/1     Running   0          21m
aptos-node-9-validator-0            1/1     Running   0          21m
public-fullnode-20-7bd42cb5-0       1/1     Running   0          19m
public-fullnode-21-f2ac2e77-0       1/1     Running   0          18m
public-fullnode-22-16eb6ceb-0       1/1     Running   0          17m
public-fullnode-23-b834795d-0       1/1     Running   0          17m
public-fullnode-24-9d4e494e-0       1/1     Running   0          15m
public-fullnode-25-16dc6f94-0       1/1     Running   0          13m
public-fullnode-26-0cde15fe-0       1/1     Running   0          12m
public-fullnode-27-82e6bdb4-0       1/1     Running   0          12m
public-fullnode-28-c619593b-0       1/1     Running   0          10m
public-fullnode-29-7dbc8d3c-0       1/1     Running   0          9m53s

github-actions[bot] avatar Apr 25 '24 02:04 github-actions[bot]

Forge is running suite realistic_env_max_load on 36eeac84d97d364cfcbbdc20122097b599a21b00

github-actions[bot] avatar Apr 25 '24 18:04 github-actions[bot]

:x: Forge suite realistic_env_max_load failure on 36eeac84d97d364cfcbbdc20122097b599a21b00

PFNPerformance : committed: 2081 txn/s, latency: 19264 ms, (p50: 19000 ms, p90: 20700 ms, p99: 26600 ms), latency samples: 888760
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.179, avg: 0.177", "QsPosToProposal: max: 0.563, avg: 0.534", "ConsensusProposalToOrdered: max: 0.429, avg: 0.423", "ConsensusOrderedToCommit: max: 0.261, avg: 0.255", "ConsensusProposalToCommit: max: 0.685, avg: 0.678"]
Test Failed: check for success

Caused by:
    TPS requirement failed. Average TPS 2081, minimum TPS requirement 2500. Full stats: committed: 2081 txn/s, latency: 19264 ms, (p50: 19000 ms, p90: 20700 ms, p99: 26600 ms), latency samples: 888760

Stack backtrace:
   0: anyhow::error::<impl anyhow::Error>::msg
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/anyhow-1.0.79/src/error.rs:83:36
   1: aptos_forge::success_criteria::SuccessCriteriaChecker::check_tps
             at ./testsuite/forge/src/success_criteria.rs:436:13
   2: aptos_forge::success_criteria::SuccessCriteriaChecker::check_throughput
             at ./testsuite/forge/src/success_criteria.rs:489:9
   3: aptos_forge::success_criteria::SuccessCriteriaChecker::check_for_success::{{closure}}
             at ./testsuite/forge/src/success_criteria.rs:271:9
   4: tokio::runtime::park::CachedParkThread::block_on::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/park.rs:282:63
   5: tokio::runtime::coop::with_budget
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/coop.rs:107:5
   6: tokio::runtime::coop::budget
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/coop.rs:73:5
   7: tokio::runtime::park::CachedParkThread::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/park.rs:282:31
   8: tokio::runtime::context::blocking::BlockingRegionGuard::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context/blocking.rs:66:9
   9: tokio::runtime::scheduler::multi_thread::MultiThread::block_on::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/mod.rs:87:13
  10: tokio::runtime::context::runtime::enter_runtime
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context/runtime.rs:65:16
  11: tokio::runtime::scheduler::multi_thread::MultiThread::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/mod.rs:86:9
  12: tokio::runtime::runtime::Runtime::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/runtime.rs:350:50
  13: aptos_forge::interface::network::NetworkContext::check_for_success
             at ./testsuite/forge/src/interface/network.rs:71:9
  14: <dyn aptos_testcases::NetworkLoadTest as aptos_forge::interface::network::NetworkTest>::run
             at ./testsuite/testcases/src/lib.rs:229:13
  15: aptos_forge::runner::Forge<F>::run::{{closure}}
             at ./testsuite/forge/src/runner.rs:598:42
  16: aptos_forge::runner::run_test
             at ./testsuite/forge/src/runner.rs:666:11
  17: aptos_forge::runner::Forge<F>::run
             at ./testsuite/forge/src/runner.rs:598:30
  18: forge::run_forge
             at ./testsuite/forge-cli/src/main.rs:427:11
  19: forge::main
             at ./testsuite/forge-cli/src/main.rs:353:21
  20: core::ops::function::FnOnce::call_once
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/function.rs:250:5
  21: std::sys_common::backtrace::__rust_begin_short_backtrace
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:154:18
  22: std::rt::lang_start::{{closure}}
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:167:18
  23: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/function.rs:284:13
  24: std::panicking::try::do_call
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:552:40
  25: std::panicking::try
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:516:19
  26: std::panic::catch_unwind
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panic.rs:142:14
  27: std::rt::lang_start_internal::{{closure}}
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:148:48
  28: std::panicking::try::do_call
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:552:40
  29: std::panicking::try
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:516:19
  30: std::panic::catch_unwind
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panic.rs:142:14
  31: std::rt::lang_start_internal
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:148:20
  32: std::rt::lang_start
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:166:17
  33: __libc_start_main
  34: _start
Trailing Log Lines:
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:516:19
  30: std::panic::catch_unwind
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panic.rs:142:14
  31: std::rt::lang_start_internal
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:148:20
  32: std::rt::lang_start
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:166:17
  33: __libc_start_main
  34: _start


Swarm logs can be found here: See fgi output for more information.
{"level":"INFO","source":{"package":"aptos_forge","file":"testsuite/forge/src/backend/k8s/cluster_helper.rs:292"},"thread_name":"main","hostname":"forge-e2e-pr-13019-1714068083-36eeac84d97d364cfcbbdc20122097b59","timestamp":"2024-04-25T18:30:56.874560Z","message":"Deleting namespace forge-e2e-pr-13019: Some(NamespaceStatus { conditions: None, phase: Some(\"Terminating\") })"}
{"level":"INFO","source":{"package":"aptos_forge","file":"testsuite/forge/src/backend/k8s/cluster_helper.rs:400"},"thread_name":"main","hostname":"forge-e2e-pr-13019-1714068083-36eeac84d97d364cfcbbdc20122097b59","timestamp":"2024-04-25T18:30:56.874606Z","message":"aptos-node resources for Forge removed in namespace: forge-e2e-pr-13019"}

failures:
    PFNPerformance

test result: FAILED. 0 passed; 1 failed; 0 filtered out
Failed to run tests:
Tests Failed

Error: Tests Failed

Stack backtrace:
   0: anyhow::error::<impl anyhow::Error>::msg
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/anyhow-1.0.79/src/error.rs:83:36
   1: aptos_forge::runner::Forge<F>::run
             at ./testsuite/forge/src/runner.rs:618:13
   2: forge::run_forge
             at ./testsuite/forge-cli/src/main.rs:427:11
   3: forge::main
             at ./testsuite/forge-cli/src/main.rs:353:21
   4: core::ops::function::FnOnce::call_once
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/function.rs:250:5
   5: std::sys_common::backtrace::__rust_begin_short_backtrace
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:154:18
   6: std::rt::lang_start::{{closure}}
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:167:18
   7: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/function.rs:284:13
   8: std::panicking::try::do_call
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:552:40
   9: std::panicking::try
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:516:19
  10: std::panic::catch_unwind
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panic.rs:142:14
  11: std::rt::lang_start_internal::{{closure}}
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:148:48
  12: std::panicking::try::do_call
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:552:40
  13: std::panicking::try
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:516:19
  14: std::panic::catch_unwind
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panic.rs:142:14
  15: std::rt::lang_start_internal
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:148:20
  16: std::rt::lang_start
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:166:17
  17: __libc_start_main
  18: _start
Debugging output:
NAME                                READY   STATUS    RESTARTS   AGE
aptos-node-0-fullnode-eforge84-0    1/1     Running   0          24m
aptos-node-0-validator-0            1/1     Running   0          24m
aptos-node-1-fullnode-eforge84-0    1/1     Running   0          24m
aptos-node-1-validator-0            1/1     Running   0          24m
aptos-node-10-fullnode-eforge84-0   1/1     Running   0          24m
aptos-node-10-validator-0           1/1     Running   0          24m
aptos-node-11-fullnode-eforge84-0   1/1     Running   0          24m
aptos-node-11-validator-0           1/1     Running   0          24m
aptos-node-12-fullnode-eforge84-0   1/1     Running   0          24m
aptos-node-12-validator-0           1/1     Running   0          24m
aptos-node-13-fullnode-eforge84-0   1/1     Running   0          24m
aptos-node-13-validator-0           1/1     Running   0          24m
aptos-node-14-fullnode-eforge84-0   1/1     Running   0          24m
aptos-node-14-validator-0           1/1     Running   0          24m
aptos-node-15-fullnode-eforge84-0   1/1     Running   0          24m
aptos-node-15-validator-0           1/1     Running   0          24m
aptos-node-16-fullnode-eforge84-0   1/1     Running   0          24m
aptos-node-16-validator-0           1/1     Running   0          24m
aptos-node-17-fullnode-eforge84-0   1/1     Running   0          24m
aptos-node-17-validator-0           1/1     Running   0          24m
aptos-node-18-fullnode-eforge84-0   1/1     Running   0          24m
aptos-node-18-validator-0           1/1     Running   0          24m
aptos-node-19-fullnode-eforge84-0   1/1     Running   0          24m
aptos-node-19-validator-0           1/1     Running   0          24m
aptos-node-2-fullnode-eforge84-0    1/1     Running   0          24m
aptos-node-2-validator-0            1/1     Running   0          24m
aptos-node-3-fullnode-eforge84-0    1/1     Running   0          24m
aptos-node-3-validator-0            1/1     Running   0          24m
aptos-node-4-fullnode-eforge84-0    1/1     Running   0          24m
aptos-node-4-validator-0            1/1     Running   0          24m
aptos-node-5-fullnode-eforge84-0    1/1     Running   0          24m
aptos-node-5-validator-0            1/1     Running   0          24m
aptos-node-6-fullnode-eforge84-0    1/1     Running   0          24m
aptos-node-6-validator-0            1/1     Running   0          24m
aptos-node-7-fullnode-eforge84-0    1/1     Running   0          24m
aptos-node-7-validator-0            1/1     Running   0          24m
aptos-node-8-fullnode-eforge84-0    1/1     Running   0          24m
aptos-node-8-validator-0            1/1     Running   0          24m
aptos-node-9-fullnode-eforge84-0    1/1     Running   0          24m
aptos-node-9-validator-0            1/1     Running   0          24m
public-fullnode-20-5bcae513-0       1/1     Running   0          20m
public-fullnode-21-3d49d21e-0       1/1     Running   0          19m
public-fullnode-22-509213c2-0       1/1     Running   0          18m
public-fullnode-23-98e3e2eb-0       1/1     Running   0          18m
public-fullnode-24-c7e15214-0       1/1     Running   0          16m
public-fullnode-25-742247fc-0       1/1     Running   0          15m
public-fullnode-26-cace0a43-0       1/1     Running   0          12m
public-fullnode-27-9fab26fa-0       1/1     Running   0          11m
public-fullnode-28-04a163ee-0       1/1     Running   0          11m
public-fullnode-29-5e78a3ff-0       1/1     Running   0          10m

github-actions[bot] avatar Apr 25 '24 18:04 github-actions[bot]

Forge is running suite realistic_env_max_load on 7341796fb3c6ca46dbcc36a6a76fce5eece502d2

github-actions[bot] avatar Apr 25 '24 23:04 github-actions[bot]

:white_check_mark: Forge suite realistic_env_max_load success on 7341796fb3c6ca46dbcc36a6a76fce5eece502d2

two traffics test: inner traffic : committed: 7931 txn/s, latency: 4941 ms, (p50: 4800 ms, p90: 5700 ms, p99: 10400 ms), latency samples: 3426400
two traffics test : committed: 100 txn/s, latency: 1866 ms, (p50: 1800 ms, p90: 2100 ms, p99: 2300 ms), latency samples: 1760
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.206, avg: 0.202", "QsPosToProposal: max: 0.236, avg: 0.226", "ConsensusProposalToOrdered: max: 0.449, avg: 0.435", "ConsensusOrderedToCommit: max: 0.358, avg: 0.352", "ConsensusProposalToCommit: max: 0.806, avg: 0.787"]
Max round gap was 1 [limit 4] at version 1318147. Max no progress secs was 4.666697 [limit 15] at version 1318147.
Test Ok

github-actions[bot] avatar Apr 26 '24 00:04 github-actions[bot]

Forge is running suite realistic_env_max_load on c92c092764da91271bea4e340e2b17fe6399ac34

github-actions[bot] avatar Apr 29 '24 18:04 github-actions[bot]

:white_check_mark: Forge suite realistic_env_max_load success on c92c092764da91271bea4e340e2b17fe6399ac34

two traffics test: inner traffic : committed: 7888 txn/s, latency: 4971 ms, (p50: 4800 ms, p90: 5700 ms, p99: 10200 ms), latency samples: 3407940
two traffics test : committed: 100 txn/s, latency: 1995 ms, (p50: 1800 ms, p90: 2200 ms, p99: 7500 ms), latency samples: 1760
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.218, avg: 0.207", "QsPosToProposal: max: 0.236, avg: 0.227", "ConsensusProposalToOrdered: max: 0.470, avg: 0.423", "ConsensusOrderedToCommit: max: 0.397, avg: 0.384", "ConsensusProposalToCommit: max: 0.818, avg: 0.807"]
Max round gap was 1 [limit 4] at version 1680126. Max no progress secs was 4.728026 [limit 15] at version 1680126.
Test Ok

github-actions[bot] avatar Apr 29 '24 18:04 github-actions[bot]

Forge is running suite compat on 01b24e7e3548382dd25440b39a0438a993387f12 ==> e4983863c1f8865dc00945a6909b29eb798e68a3

github-actions[bot] avatar Apr 29 '24 21:04 github-actions[bot]

Forge is running suite realistic_env_max_load on e4983863c1f8865dc00945a6909b29eb798e68a3

github-actions[bot] avatar Apr 29 '24 21:04 github-actions[bot]

:white_check_mark: Forge suite compat success on 01b24e7e3548382dd25440b39a0438a993387f12 ==> e4983863c1f8865dc00945a6909b29eb798e68a3

Compatibility test results for 01b24e7e3548382dd25440b39a0438a993387f12 ==> e4983863c1f8865dc00945a6909b29eb798e68a3 (PR)
1. Check liveness of validators at old version: 01b24e7e3548382dd25440b39a0438a993387f12
compatibility::simple-validator-upgrade::liveness-check : committed: 6193 txn/s, latency: 5039 ms, (p50: 4800 ms, p90: 8700 ms, p99: 10500 ms), latency samples: 235340
2. Upgrading first Validator to new version: e4983863c1f8865dc00945a6909b29eb798e68a3
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 1048 txn/s, submitted: 1054 txn/s, expired: 5 txn/s, latency: 15833 ms, (p50: 18600 ms, p90: 22000 ms, p99: 22600 ms), latency samples: 90205
3. Upgrading rest of first batch to new version: e4983863c1f8865dc00945a6909b29eb798e68a3
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 1770 txn/s, latency: 16067 ms, (p50: 19500 ms, p90: 23800 ms, p99: 24300 ms), latency samples: 88540
4. upgrading second batch to new version: e4983863c1f8865dc00945a6909b29eb798e68a3
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 3606 txn/s, latency: 8815 ms, (p50: 9600 ms, p90: 12600 ms, p99: 12900 ms), latency samples: 144260
5. check swarm health
Compatibility test for 01b24e7e3548382dd25440b39a0438a993387f12 ==> e4983863c1f8865dc00945a6909b29eb798e68a3 passed
Test Ok

github-actions[bot] avatar Apr 29 '24 21:04 github-actions[bot]

:white_check_mark: Forge suite realistic_env_max_load success on e4983863c1f8865dc00945a6909b29eb798e68a3

two traffics test: inner traffic : committed: 8193 txn/s, latency: 4793 ms, (p50: 4500 ms, p90: 5700 ms, p99: 10500 ms), latency samples: 3531360
two traffics test : committed: 100 txn/s, latency: 1960 ms, (p50: 1900 ms, p90: 2100 ms, p99: 3300 ms), latency samples: 1820
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.204, avg: 0.203", "QsPosToProposal: max: 0.229, avg: 0.213", "ConsensusProposalToOrdered: max: 0.426, avg: 0.403", "ConsensusOrderedToCommit: max: 0.396, avg: 0.380", "ConsensusProposalToCommit: max: 0.794, avg: 0.783"]
Max round gap was 1 [limit 4] at version 1142104. Max no progress secs was 4.761405 [limit 15] at version 1142104.
Test Ok

github-actions[bot] avatar Apr 29 '24 22:04 github-actions[bot]

Smoke tests revealed some bugs. Will update

bchocho avatar May 01 '24 17:05 bchocho

Forge is running suite compat on 01b24e7e3548382dd25440b39a0438a993387f12 ==> 6cd08b1434758ca00be56ac64816b391b0630c27

github-actions[bot] avatar May 04 '24 01:05 github-actions[bot]

Forge is running suite realistic_env_max_load on 6cd08b1434758ca00be56ac64816b391b0630c27

github-actions[bot] avatar May 04 '24 01:05 github-actions[bot]

:white_check_mark: Forge suite compat success on 01b24e7e3548382dd25440b39a0438a993387f12 ==> 6cd08b1434758ca00be56ac64816b391b0630c27

Compatibility test results for 01b24e7e3548382dd25440b39a0438a993387f12 ==> 6cd08b1434758ca00be56ac64816b391b0630c27 (PR)
1. Check liveness of validators at old version: 01b24e7e3548382dd25440b39a0438a993387f12
compatibility::simple-validator-upgrade::liveness-check : committed: 5649 txn/s, latency: 5895 ms, (p50: 5000 ms, p90: 9900 ms, p99: 10600 ms), latency samples: 225980
2. Upgrading first Validator to new version: 6cd08b1434758ca00be56ac64816b391b0630c27
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 1752 txn/s, latency: 15781 ms, (p50: 18700 ms, p90: 21900 ms, p99: 22900 ms), latency samples: 92900
3. Upgrading rest of first batch to new version: 6cd08b1434758ca00be56ac64816b391b0630c27
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 1793 txn/s, latency: 15900 ms, (p50: 19300 ms, p90: 23200 ms, p99: 23700 ms), latency samples: 89660
4. upgrading second batch to new version: 6cd08b1434758ca00be56ac64816b391b0630c27
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 3593 txn/s, latency: 8672 ms, (p50: 9800 ms, p90: 12300 ms, p99: 12600 ms), latency samples: 143740
5. check swarm health
Compatibility test for 01b24e7e3548382dd25440b39a0438a993387f12 ==> 6cd08b1434758ca00be56ac64816b391b0630c27 passed
Test Ok

github-actions[bot] avatar May 04 '24 01:05 github-actions[bot]

:white_check_mark: Forge suite realistic_env_max_load success on 6cd08b1434758ca00be56ac64816b391b0630c27

two traffics test: inner traffic : committed: 8122 txn/s, latency: 4830 ms, (p50: 4500 ms, p90: 5700 ms, p99: 10200 ms), latency samples: 3508860
two traffics test : committed: 100 txn/s, latency: 1916 ms, (p50: 1800 ms, p90: 2100 ms, p99: 4400 ms), latency samples: 1780
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.205, avg: 0.203", "QsPosToProposal: max: 0.235, avg: 0.212", "ConsensusProposalToOrdered: max: 0.479, avg: 0.422", "ConsensusOrderedToCommit: max: 0.373, avg: 0.354", "ConsensusProposalToCommit: max: 0.790, avg: 0.776"]
Max round gap was 1 [limit 4] at version 1661834. Max no progress secs was 4.88016 [limit 15] at version 1661834.
Test Ok

github-actions[bot] avatar May 04 '24 01:05 github-actions[bot]

Forge is running suite realistic_env_max_load on 6cd08b1434758ca00be56ac64816b391b0630c27

github-actions[bot] avatar May 04 '24 01:05 github-actions[bot]

Forge is running suite framework_upgrade on 01b24e7e3548382dd25440b39a0438a993387f12 ==> 6cd08b1434758ca00be56ac64816b391b0630c27

github-actions[bot] avatar May 04 '24 01:05 github-actions[bot]

Forge is running suite compat on 01b24e7e3548382dd25440b39a0438a993387f12 ==> 6cd08b1434758ca00be56ac64816b391b0630c27

github-actions[bot] avatar May 04 '24 01:05 github-actions[bot]

:white_check_mark: Forge suite compat success on 01b24e7e3548382dd25440b39a0438a993387f12 ==> 6cd08b1434758ca00be56ac64816b391b0630c27

Compatibility test results for 01b24e7e3548382dd25440b39a0438a993387f12 ==> 6cd08b1434758ca00be56ac64816b391b0630c27 (PR)
1. Check liveness of validators at old version: 01b24e7e3548382dd25440b39a0438a993387f12
compatibility::simple-validator-upgrade::liveness-check : committed: 5580 txn/s, latency: 5578 ms, (p50: 4800 ms, p90: 9600 ms, p99: 11800 ms), latency samples: 234380
2. Upgrading first Validator to new version: 6cd08b1434758ca00be56ac64816b391b0630c27
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 1837 txn/s, latency: 15742 ms, (p50: 18600 ms, p90: 22300 ms, p99: 22600 ms), latency samples: 91860
3. Upgrading rest of first batch to new version: 6cd08b1434758ca00be56ac64816b391b0630c27
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 1683 txn/s, latency: 17716 ms, (p50: 19000 ms, p90: 24500 ms, p99: 24700 ms), latency samples: 90920
4. upgrading second batch to new version: 6cd08b1434758ca00be56ac64816b391b0630c27
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 3674 txn/s, latency: 8698 ms, (p50: 9800 ms, p90: 12300 ms, p99: 12600 ms), latency samples: 143320
5. check swarm health
Compatibility test for 01b24e7e3548382dd25440b39a0438a993387f12 ==> 6cd08b1434758ca00be56ac64816b391b0630c27 passed
Test Ok

github-actions[bot] avatar May 04 '24 02:05 github-actions[bot]

:white_check_mark: Forge suite realistic_env_max_load success on 6cd08b1434758ca00be56ac64816b391b0630c27

two traffics test: inner traffic : committed: 7874 txn/s, latency: 4961 ms, (p50: 4600 ms, p90: 6000 ms, p99: 10400 ms), latency samples: 3417520
two traffics test : committed: 100 txn/s, latency: 1897 ms, (p50: 1800 ms, p90: 2100 ms, p99: 2300 ms), latency samples: 1800
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.202, avg: 0.201", "QsPosToProposal: max: 0.245, avg: 0.223", "ConsensusProposalToOrdered: max: 0.465, avg: 0.427", "ConsensusOrderedToCommit: max: 0.405, avg: 0.386", "ConsensusProposalToCommit: max: 0.831, avg: 0.813"]
Max round gap was 1 [limit 4] at version 1727717. Max no progress secs was 4.58751 [limit 15] at version 1727717.
Test Ok

github-actions[bot] avatar May 04 '24 02:05 github-actions[bot]

:white_check_mark: Forge suite framework_upgrade success on 01b24e7e3548382dd25440b39a0438a993387f12 ==> 6cd08b1434758ca00be56ac64816b391b0630c27

Compatibility test results for 01b24e7e3548382dd25440b39a0438a993387f12 ==> 6cd08b1434758ca00be56ac64816b391b0630c27 (PR)
Upgrade the nodes to version: 6cd08b1434758ca00be56ac64816b391b0630c27
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1149 txn/s, submitted: 1152 txn/s, failed submission: 2 txn/s, expired: 2 txn/s, latency: 2583 ms, (p50: 2100 ms, p90: 4400 ms, p99: 6600 ms), latency samples: 104620
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1178 txn/s, submitted: 1183 txn/s, failed submission: 4 txn/s, expired: 4 txn/s, latency: 2679 ms, (p50: 2100 ms, p90: 4500 ms, p99: 6300 ms), latency samples: 107240
5. check swarm health
Compatibility test for 01b24e7e3548382dd25440b39a0438a993387f12 ==> 6cd08b1434758ca00be56ac64816b391b0630c27 passed
Upgrade the remaining nodes to version: 6cd08b1434758ca00be56ac64816b391b0630c27
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1051 txn/s, submitted: 1053 txn/s, failed submission: 1 txn/s, expired: 1 txn/s, latency: 2812 ms, (p50: 2400 ms, p90: 4800 ms, p99: 6800 ms), latency samples: 95660
Test Ok

github-actions[bot] avatar May 04 '24 03:05 github-actions[bot]