substrate
substrate copied to clipboard
Performance regression in block import after upgrading to polkadot 0.9.24
We noticed a huge drop in performance when trying to sync an archive node on our existing networks with a node that has polkadot-0.9.24 depedencies. With around 5-8 peers the speed comes down to 25-35 blocks per second.
On substrate V3 the block import speed was around 100-200 blocks per second.
V3 node output:
2022-09-21 15:01:51 ⚙️ Syncing 138.6 bps, target=#2418956 (6 peers), best: #5110 (0x5df0…b1a8), finalized #4608 (0x9d01…9736), ⬇ 109.2kiB/s ⬆ 2.6kiB/s
2022-09-21 15:01:56 ⚙️ Syncing 135.8 bps, target=#2418957 (6 peers), best: #5789 (0x5a8b…fcfe), finalized #5632 (0xc521…f20f), ⬇ 0.9kiB/s ⬆ 0.3kiB/s
2022-09-21 15:02:01 ⚙️ Syncing 130.2 bps, target=#2418958 (6 peers), best: #6440 (0xced8…f40e), finalized #6144 (0x77b4…0c34), ⬇ 5.0kiB/s ⬆ 4.4kiB/s
2022-09-21 15:02:06 ⚙️ Syncing 126.6 bps, target=#2418959 (6 peers), best: #7073 (0xfcb6…5719), finalized #6656 (0xa670…97ed), ⬇ 65.0kiB/s ⬆ 2.9kiB/s
2022-09-21 15:02:11 ⚙️ Syncing 121.4 bps, target=#2418959 (6 peers), best: #7680 (0x667e…52e0), finalized #7680 (0x667e…52e0), ⬇ 0.4kiB/s ⬆ 0.7kiB/s
2022-09-21 15:02:16 ⚙️ Syncing 114.6 bps, target=#2418960 (6 peers), best: #8253 (0x6418…49ab), finalized #8192 (0x9c83…c5fd), ⬇ 0.6kiB/s ⬆ 0.2kiB/s
2022-09-21 15:02:21 ⚙️ Syncing 111.8 bps, target=#2418961 (6 peers), best: #8812 (0x076b…da76), finalized #8704 (0x1676…2515), ⬇ 524.2kiB/s ⬆ 17.8kiB/s
2022-09-21 15:02:26 ⚙️ Syncing 108.2 bps, target=#2418962 (6 peers), best: #9353 (0x40ac…82e4), finalized #9216 (0xac74…eec9), ⬇ 3.0kiB/s ⬆ 1.4kiB/s
2022-09-21 15:02:31 ⚙️ Syncing 105.4 bps, target=#2418963 (6 peers), best: #9880 (0xfd60…a549), finalized #9728 (0xc71f…ec9b), ⬇ 4.5kiB/s ⬆ 3.7kiB/s
Polkadot 0.9.24 output:
2022-09-21 15:21:57 ⚙️ Syncing 23.6 bps, target=#2419157 (6 peers), best: #4913 (0x3ccd…2eb0), finalized #4608 (0x9d01…9736), ⬇ 2.8kiB/s ⬆ 4.0kiB/s
2022-09-21 15:22:02 ⚙️ Syncing 20.0 bps, target=#2419158 (6 peers), best: #5013 (0xece6…de66), finalized #4608 (0x9d01…9736), ⬇ 2.4kiB/s ⬆ 2.3kiB/s
2022-09-21 15:22:07 ⚙️ Syncing 26.0 bps, target=#2419159 (6 peers), best: #5143 (0x1dc8…8b3a), finalized #5120 (0x766f…661f), ⬇ 1.5kiB/s ⬆ 1.8kiB/s
2022-09-21 15:22:12 ⚙️ Syncing 27.6 bps, target=#2419160 (6 peers), best: #5281 (0x457a…d010), finalized #5120 (0x766f…661f), ⬇ 3.4kiB/s ⬆ 4.9kiB/s
2022-09-21 15:22:17 ⚙️ Syncing 31.2 bps, target=#2419160 (6 peers), best: #5437 (0xa503…c895), finalized #5120 (0x766f…661f), ⬇ 33.1kiB/s ⬆ 3.4kiB/s
2022-09-21 15:22:22 ⚙️ Syncing 29.2 bps, target=#2419161 (6 peers), best: #5583 (0x444d…f831), finalized #5120 (0x766f…661f), ⬇ 2.7kiB/s ⬆ 1.9kiB/s
2022-09-21 15:22:27 ⚙️ Syncing 27.4 bps, target=#2419162 (6 peers), best: #5720 (0xaa40…1234), finalized #5632 (0xc521…f20f), ⬇ 2.7kiB/s ⬆ 2.6kiB/s
2022-09-21 15:22:32 ⚙️ Syncing 27.6 bps, target=#2419163 (6 peers), best: #5858 (0xc038…9b78), finalized #5632 (0xc521…f20f), ⬇ 2.4kiB/s ⬆ 2.0kiB/s
2022-09-21 15:22:37 ⚙️ Syncing 26.6 bps, target=#2419164 (6 peers), best: #5991 (0xf9ff…f28d), finalized #5632 (0xc521…f20f), ⬇ 41.1kiB/s ⬆ 4.0kiB/s
After making our node compatible with polkadot-0.9.29 depedencies, block import speed improved by 50% to 45-60 blocks per second
2022-09-21 17:33:42 ⚙️ Syncing 60.2 bps, target=#2420475 (4 peers), best: #2751 (0xdb9e…6a7e), finalized #2560 (0xb0c5…1cab), ⬇ 14.9kiB/s ⬆ 1.8kiB/s
2022-09-21 17:33:47 ⚙️ Syncing 59.8 bps, target=#2420475 (4 peers), best: #3050 (0xb616…665b), finalized #2560 (0xb0c5…1cab), ⬇ 24.0kiB/s ⬆ 2.4kiB/s
2022-09-21 17:33:52 ⚙️ Syncing 50.8 bps, target=#2420476 (4 peers), best: #3304 (0x26d3…23cb), finalized #3072 (0x72cf…d65f), ⬇ 2.4kiB/s ⬆ 1.7kiB/s
2022-09-21 17:33:57 ⚙️ Syncing 53.4 bps, target=#2420477 (4 peers), best: #3571 (0x64bd…7724), finalized #3072 (0x72cf…d65f), ⬇ 16.7kiB/s ⬆ 0.5kiB/s
2022-09-21 17:34:02 ⚙️ Syncing 51.0 bps, target=#2420478 (6 peers), best: #3826 (0x9df4…54ae), finalized #3584 (0x7ae9…5c86), ⬇ 55.5kiB/s ⬆ 13.1kiB/s
2022-09-21 17:34:07 ⚙️ Syncing 53.6 bps, target=#2420479 (6 peers), best: #4094 (0x06ad…a051), finalized #3584 (0x7ae9…5c86), ⬇ 0.6kiB/s ⬆ 0.3kiB/s
2022-09-21 17:34:12 ⚙️ Syncing 55.0 bps, target=#2420480 (6 peers), best: #4369 (0xb180…3e0e), finalized #4096 (0x23ba…8ae8), ⬇ 28.2kiB/s ⬆ 2.0kiB/s
2022-09-21 17:34:17 ⚙️ Syncing 57.0 bps, target=#2420480 (6 peers), best: #4654 (0xcc55…301d), finalized #4608 (0x9d01…9736), ⬇ 1.3kiB/s ⬆ 1.7kiB/s
2022-09-21 17:34:22 ⚙️ Syncing 56.4 bps, target=#2420481 (6 peers), best: #4936 (0x9d08…5e97), finalized #4608 (0x9d01…9736), ⬇ 34.0kiB/s ⬆ 1.9kiB/s
Maybe related issue:https://substrate.stackexchange.com/questions/2002/database-corruption-on-syncing-parachain-to-rococo
On substrate V3 the block import speed was around 100-200 blocks per second.
Can you give an exact commit? And can you also link to the different versions of your node you are speaking about?
Can you give an exact commit? And can you also link to the different versions of your node you are speaking about?
Substrate V3 node: https://github.com/threefoldtech/tfchain/tree/fix_devnet/substrate-node (compile with cargo +nightly-2021-06-09 build --release)
Substrate polkadot-0.9.24 node: https://github.com/threefoldtech/tfchain/tree/2.1.0/substrate-node
And your V3 wasn't just faster because you used the native runtime?
No I don't think so.
Okay. @koute maybe something you could take a look at some point? The regression seems to be relative high.
@DylanVerstraete it would be nice to post the exact commands you used to launch your node.
@DylanVerstraete it would be nice to post the exact commands you used to launch your node.
./target/release/tfchain --chain chainspecs/dev/chainSpecRaw.json --pruning=archive -d /tmp/dev
Sure, I can take a look at this.
@DylanVerstraete Here's a patch which will speed up your blocks/sec when syncing by 10x from ~50 bps to ~500 bps:
diff --git a/substrate-node/node/Cargo.toml b/substrate-node/node/Cargo.toml
index f3ebc12..1fac895 100644
--- a/substrate-node/node/Cargo.toml
+++ b/substrate-node/node/Cargo.toml
@@ -37,7 +37,7 @@ frame-benchmarking-cli = {git = "https://github.com/paritytech/substrate.git", b
pallet-transaction-payment-rpc = {git = "https://github.com/paritytech/substrate", branch = "polkadot-v0.9.24"}
pallet-transaction-payment = {git = "https://github.com/paritytech/substrate", branch = "polkadot-v0.9.24"}
sc-basic-authorship = {git = "https://github.com/paritytech/substrate", branch = "polkadot-v0.9.24"}
-sc-cli = {git = "https://github.com/paritytech/substrate", branch = "polkadot-v0.9.24"}
+sc-cli = {git = "https://github.com/paritytech/substrate", branch = "polkadot-v0.9.24", features = ["wasmtime"]}
sc-client-api = {git = "https://github.com/paritytech/substrate", branch = "polkadot-v0.9.24"}
sc-executor = {git = "https://github.com/paritytech/substrate", branch = "polkadot-v0.9.24"}
sc-service = {git = "https://github.com/paritytech/substrate", branch = "polkadot-v0.9.24"}
Basically, you were using the interpreted WASM executor. If you switch to the compiled executor (which is the default, as long as it's enabled at compile time) you can not only regain all of that lost performance but also significantly speed up your syncing compared to what it was.
It is true that there is a performance regression when running with the interpreted executor; compared to the old node (which is using wasmi 0.6.2) the new node (which is using wasmi 0.9.1) is spending the majority of its time (~82%) erasing the linear memory with memset (which the old node did not). There's not much point in investigating this any further as even the "newer" version of wasmi is over a year old at this point, and you're not really supposed to run your chain on the interpreted executor anymore anyway. (This reminds me, we should make the wasmtime-based executor enabled by default; there's no point in it being disabled by default anymore. I'll whip up a PR to prevent these kinds of cases in the future.)
@koute verified, thank!
This issue has been mentioned on Polkadot Forum. There might be relevant details there:
https://forum.polkadot.network/t/polkadot-release-analysis/1026/2