substrate
State-db refactoring
Remove the "pending" state in statedb, which greatly simplifies the implementation. Now, if there is a backend error, the in-memory state is reverted by simply reloading from disk.
Also fixes an issue with #11980. After warp sync, if the node is restarted before any block is pruned, it would not be able to start again in a consistent state.
try_commit failures result in node termination anyway, so the reset is there only to provide data consistency guarantees at the statedb API level. The way statedb is currently used in substrate, it does not really matter.
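To make the "revert by reloading from disk" idea concrete, here is a minimal sketch. It is not the statedb code from this PR: `Disk`, `StateView`, `fail_next_write` and this `try_commit` signature are all hypothetical names invented for illustration. The assumption is only that the in-memory view can always be rebuilt from what is on disk, so no separate "pending" set is needed to undo a failed commit.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the on-disk database (not a substrate type).
#[derive(Default)]
pub struct Disk {
    pub rows: HashMap<Vec<u8>, Vec<u8>>,
    /// When set, the next write fails once; this simulates a backend error.
    pub fail_next_write: bool,
}

impl Disk {
    pub fn write(&mut self, changes: &[(Vec<u8>, Vec<u8>)]) -> Result<(), String> {
        if self.fail_next_write {
            self.fail_next_write = false;
            return Err("backend error".to_string());
        }
        for (k, v) in changes {
            self.rows.insert(k.clone(), v.clone());
        }
        Ok(())
    }
}

/// Hypothetical in-memory view that mirrors the data on disk.
pub struct StateView {
    pub cache: HashMap<Vec<u8>, Vec<u8>>,
}

impl StateView {
    /// Build the in-memory view from whatever is currently on disk.
    pub fn load(disk: &Disk) -> Self {
        StateView { cache: disk.rows.clone() }
    }

    /// Apply changes to memory right away, then write them to disk.
    /// There is no "pending" set: if the write fails, the in-memory
    /// view is discarded and rebuilt from disk.
    pub fn try_commit(
        &mut self,
        disk: &mut Disk,
        changes: Vec<(Vec<u8>, Vec<u8>)>,
    ) -> Result<(), String> {
        for (k, v) in &changes {
            self.cache.insert(k.clone(), v.clone());
        }
        if let Err(e) = disk.write(&changes) {
            // Revert by reloading; nothing was written, so disk is the truth.
            *self = StateView::load(disk);
            return Err(e);
        }
        Ok(())
    }
}
```

In this sketch the trade-off is a full reload on the error path in exchange for no extra bookkeeping on the success path, which is acceptable when a failed commit is rare and usually fatal anyway.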
Hey, is anyone still working on this? Due to the inactivity this issue has been automatically marked as stale. It will be closed if no further activity occurs. Thank you for your contributions.
Waiting for second review, please don't close.
@bkchr Could you please take a look?
Recently I got a few reports from Khala node operators that, when syncing a new node from block 0, they randomly hit
[Block import error: Backend error: Can't canonicalize missing block number #{BLOCK_NUMBER} when importing {BLOCK_HASH}]
The log looks like this:
2022-11-01 07:08:54 [Parachain] Block import error: Backend error: Can't canonicalize missing block number #2482945 when importing 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead (#2487041)
2022-11-01 07:08:54 [Parachain] 💔 Error importing block 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead: consensus error: Import failed: Backend error: Can't canonicalize missing block number #2482945 when importing 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead (#2487041)
2022-11-01 07:08:55 [Parachain] ⚙️ Syncing 0.0 bps, target=#2629792 (11 peers), best: #2482944 (0x7272…aac6), finalized #405357 (0x2ba4…5c69), ⬇ 1.8MiB/s ⬆ 2.7kiB/s
2022-11-01 07:08:56 [Relaychain] ⚙️ Syncing 48.7 bps, target=#15134037 (30 peers), best: #9247717 (0x6478…a2cc), finalized #9247232 (0xd3a2…417d), ⬇ 1.2MiB/s ⬆ 132.9kiB/s
2022-11-01 07:09:00 [Parachain] ⚙️ Syncing 0.0 bps, target=#2629792 (11 peers), best: #2482944 (0x7272…aac6), finalized #405357 (0x2ba4…5c69), ⬇ 8.8MiB/s ⬆ 3.3kiB/s
2022-11-01 07:09:01 [Relaychain] ⚙️ Syncing 43.5 bps, target=#15134037 (30 peers), best: #9247935 (0x9a21…f6fe), finalized #9247744 (0xc1c1…b50c), ⬇ 1.0MiB/s ⬆ 139.8kiB/s
2022-11-01 07:09:05 [Parachain] Block import error: Backend error: Can't canonicalize missing block number #2482945 when importing 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead (#2487041)
2022-11-01 07:09:05 [Parachain] 💔 Error importing block 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead: consensus error: Import failed: Backend error: Can't canonicalize missing block number #2482945 when importing 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead (#2487041)
2022-11-01 07:09:05 [Parachain] ⚙️ Syncing 0.0 bps, target=#2629792 (11 peers), best: #2482944 (0x7272…aac6), finalized #405593 (0x89a5…6ff9), ⬇ 9.0MiB/s ⬆ 4.2kiB/s
2022-11-01 07:09:06 [Relaychain] ⚙️ Syncing 48.3 bps, target=#15134037 (30 peers), best: #9248177 (0xb440…3ff2), finalized #9247744 (0xc1c1…b50c), ⬇ 1.0MiB/s ⬆ 121.6kiB/s
2022-11-01 07:09:10 [Parachain] ⚙️ Syncing 0.0 bps, target=#2629792 (13 peers), best: #2482944 (0x7272…aac6), finalized #405593 (0x89a5…6ff9), ⬇ 9.7MiB/s ⬆ 4.6kiB/s
2022-11-01 07:09:11 [Relaychain] ⚙️ Syncing 42.5 bps, target=#15134037 (30 peers), best: #9248390 (0xdbc3…aad9), finalized #9248256 (0xb48f…6a96), ⬇ 953.1kiB/s ⬆ 120.9kiB/s
2022-11-01 07:09:15 [Parachain] ⚙️ Syncing 0.0 bps, target=#2629793 (11 peers), best: #2482944 (0x7272…aac6), finalized #405593 (0x89a5…6ff9), ⬇ 9.0MiB/s ⬆ 4.2kiB/s
2022-11-01 07:09:15 [Parachain] Block import error: Backend error: Can't canonicalize missing block number #2482945 when importing 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead (#2487041)
2022-11-01 07:09:15 [Parachain] 💔 Error importing block 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead: consensus error: Import failed: Backend error: Can't canonicalize missing block number #2482945 when importing 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead (#2487041)
2022-11-01 07:09:16 [Relaychain] ⚙️ Syncing 47.7 bps, target=#15134045 (30 peers), best: #9248629 (0x0da6…9e37), finalized #9248256 (0xb48f…6a96), ⬇ 1.0MiB/s ⬆ 130.5kiB/s
2022-11-01 07:09:20 [Parachain] ⚙️ Syncing 0.0 bps, target=#2629793 (13 peers), best: #2482944 (0x7272…aac6), finalized #405829 (0x92bb…2cd7), ⬇ 5.7MiB/s ⬆ 7.7kiB/s
2022-11-01 07:09:21 [Relaychain] ⚙️ Syncing 43.3 bps, target=#15134046 (30 peers), best: #9248846 (0x4dff…ab5b), finalized #9248769 (0xeb85…0f4a), ⬇ 956.7kiB/s ⬆ 123.9kiB/s
2022-11-01 07:09:25 [Parachain] ⚙️ Syncing 0.0 bps, target=#2629795 (13 peers), best: #2482944 (0x7272…aac6), finalized #405829 (0x92bb…2cd7), ⬇ 6.9MiB/s ⬆ 5.7kiB/s
2022-11-01 07:09:26 [Relaychain] ⚙️ Syncing 43.1 bps, target=#15134046 (30 peers), best: #9249062 (0xf167…12a9), finalized #9248769 (0xeb85…0f4a), ⬇ 889.9kiB/s ⬆ 140.6kiB/s
2022-11-01 07:09:30 [Parachain] ⚙️ Syncing 0.0 bps, target=#2629798 (14 peers), best: #2482944 (0x7272…aac6), finalized #406070 (0x49b1…c151), ⬇ 9.4MiB/s ⬆ 5.5kiB/s
2022-11-01 07:09:31 [Relaychain] ⚙️ Syncing 45.6 bps, target=#15134046 (30 peers), best: #9249290 (0x88d5…5cf9), finalized #9249281 (0x2c85…36a1), ⬇ 1.1MiB/s ⬆ 146.0kiB/s
2022-11-01 07:09:35 [Parachain] ⚙️ Syncing 0.0 bps, target=#2629798 (15 peers), best: #2482944 (0x7272…aac6), finalized #406070 (0x49b1…c151), ⬇ 5.8MiB/s ⬆ 3.6kiB/s
2022-11-01 07:09:36 [Relaychain] ⚙️ Syncing 45.9 bps, target=#15134046 (30 peers), best: #9249520 (0x9522…56f7), finalized #9249281 (0x2c85…36a1), ⬇ 1000.6kiB/s ⬆ 112.3kiB/s
2022-11-01 07:09:38 [Parachain] Block import error: Backend error: Can't canonicalize missing block number #2482945 when importing 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead (#2487041)
2022-11-01 07:09:38 [Parachain] 💔 Error importing block 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead: consensus error: Import failed: Backend error: Can't canonicalize missing block number #2482945 when importing 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead (#2487041)
2022-11-01 07:09:38 [Parachain] 💔 Error importing block 0x27939286e6722c962d6c964e4bfc4a3371b2ac6d78f6b2ee534a6a1ad7543786: block has an unknown parent
The node gets stuck there and does not seem to advance anymore. Restarting the node does not help; the only fix is to delete the DB and resync, but the error may occur again.
I saw that the error message comes from force_delayed_canonicalize, which is called by try_commit_operation.
Do you think this refactor can help?
@jasl does not seem to be related. Please file a separate issue. Collecting logs with -l db=trace would help there.
https://github.com/paritytech/substrate/issues/12613 I also attached the log there.
And sorry for the delay. :see_no_evil:
One general question: could we not simplify DeathRowQueue even further if we made the Mem backend also keep only the inserted keys in memory? Otherwise there is no further difference, or is there? We could also use batch loading etc. Or am I missing something?
The main difference for Mem vs DbBacked is that Mem does its own tracking of key references (death_index). Mem is currently only used when the backend database does not support reference counting (i.e. rocksdb). Once we remove rocksdb, we can remove Mem as well.
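As a rough illustration of what "its own tracking of key references" can look like, here is a minimal sketch of an in-memory death-row queue with a `death_index`. It is a simplified model, not the actual statedb implementation: `MemQueue`, `import`, `prune_one`, the `base` field and the exact data layout are assumptions made for this example. The point it shows is that, without reference counting in the database, the queue itself has to cancel a scheduled deletion when a later block re-inserts the same key.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

pub type Key = Vec<u8>;

/// Keys scheduled for deletion by one block (hypothetical layout).
pub struct DeathRow {
    pub deleted: HashSet<Key>,
}

/// Hypothetical in-memory queue that tracks key references itself.
#[derive(Default)]
pub struct MemQueue {
    /// Number of the oldest block still in the queue.
    pub base: u64,
    pub death_rows: VecDeque<DeathRow>,
    /// key -> number of the block that scheduled its deletion.
    pub death_index: HashMap<Key, u64>,
}

impl MemQueue {
    /// Record the next block: the keys it inserted and the keys it deleted.
    pub fn import(&mut self, inserted: Vec<Key>, deleted: Vec<Key>) {
        // A re-inserted key must survive, so cancel any earlier scheduled deletion.
        for k in &inserted {
            if let Some(block) = self.death_index.remove(k) {
                let idx = (block - self.base) as usize;
                self.death_rows[idx].deleted.remove(k);
            }
        }
        let block = self.base + self.death_rows.len() as u64;
        for k in &deleted {
            self.death_index.insert(k.clone(), block);
        }
        self.death_rows.push_back(DeathRow {
            deleted: deleted.into_iter().collect(),
        });
    }

    /// Prune the oldest block and return the keys that may now be
    /// deleted from the database, in sorted order.
    pub fn prune_one(&mut self) -> Vec<Key> {
        let pruned_block = self.base;
        match self.death_rows.pop_front() {
            Some(row) => {
                self.base += 1;
                for k in &row.deleted {
                    // Drop the index entry only if it still points at this block.
                    if self.death_index.get(k) == Some(&pruned_block) {
                        self.death_index.remove(k);
                    }
                }
                let mut keys: Vec<Key> = row.deleted.into_iter().collect();
                keys.sort();
                keys
            }
            None => Vec::new(),
        }
    }
}
```

With a backend that does reference counting, the cancellation step in `import` would not be needed, which is the simplification the comment above anticipates once rocksdb support is dropped.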
bot merge