polkadot-runtime-template icon indicating copy to clipboard operation
polkadot-runtime-template copied to clipboard

🐞 [Bug]: Aura panics with `Slot must increase`

Open ozgunozerk opened this issue 1 year ago • 4 comments

What happened?

We bootstrapped a custom experimental relay chain, and then registered a new parachain to it.

Steps to reproduce for creating a new parachain can be found here: https://github.com/KitHat/sub0-oz-workshop/

When the nodes for parachain starts (Alice and Bob in this case) the following happens:

  1. starts syncing
  2. finishes syncing
  3. starts producing blocks
  4. Error is printed to CLI: Thread 'tokio-runtime-worker' panicked at 'Slot must increase', /Users/ozgunozerk/.cargo/git/checkouts/polkadot-sdk-cff69157b985ed76/2fe3145/substrate/frame/aura/src/lib.rs:139 (details can be found below)
  5. node does not crash, and continues to produce blocks, everything works as if nothing is wrong

Below is the detailed logs:

2024-03-07 15:21:00 [Parachain] 🎁 Prepared block for proposing at 1 (10 ms) [hash: 0x36cc7723f8596d0c7d71edec712333a885081489d945d1f6faf77963453df532; parent_hash: 0x6280…af9f; extrinsics (2): [0x3fe8…2e1c, 0x76e2…627e]
2024-03-07 15:21:00 [Parachain] 🔖 Pre-sealed block for proposal at 1. Hash now 0x85d8949ea424bb1851727c396b91864e212b46046e59997fa5f0b56cbd8edfff, previously 0x36cc7723f8596d0c7d71edec712333a885081489d945d1f6faf77963453df532.
2024-03-07 15:21:00 [Parachain] ✨ Imported #1 (0x85d8…dfff)
2024-03-07 15:21:00 [Parachain] PoV size { header: 0.2177734375kb, extrinsics: 3.1728515625kb, storage_proof: 2.2626953125kb }
2024-03-07 15:21:00 [Parachain] Compressed PoV size: 4.7470703125kb
2024-03-07 15:21:04 [Relaychain] 💤 Idle (3 peers), best: #177 (0xee41…7c56), finalized #175 (0xbe24…988e), ⬇ 1.8kiB/s ⬆ 2.4kiB/s
2024-03-07 15:21:04 [Parachain] 💤 Idle (1 peers), best: #1 (0x85d8…dfff), finalized #0 (0x6280…af9f), ⬇ 0.2kiB/s ⬆ 0.9kiB/s
2024-03-07 15:21:04 [Relaychain] Trying to remove unknown reserved node 12D3KooWQCkBm1BYtkHpocxCwMgR8yjitEeHGx8spzcDLGt2gkBm from SetId(3).
2024-03-07 15:21:06 [Relaychain] ✨ Imported #178 (0x2a01…73c6)
2024-03-07 15:21:06 [Parachain] 🙌 Starting consensus session on top of parent 0x85d8949ea424bb1851727c396b91864e212b46046e59997fa5f0b56cbd8edfff

====================

Version: 0.1.2-26778e090f2

   0: backtrace::capture::Backtrace::new
   1: sp_panic_handler::set::{{closure}}
   2: std::panicking::rust_panic_with_hook
   3: std::panicking::begin_panic_handler::{{closure}}
   4: std::sys_common::backtrace::__rust_end_short_backtrace
   5: _rust_begin_unwind
   6: core::panicking::panic_fmt
   7: <pallet_aura::pallet::Pallet<T> as frame_support::traits::hooks::OnInitialize<<<<T as frame_system::pallet::Config>::Block as sp_runtime::traits::HeaderProvider>::HeaderT as sp_runtime::traits::Header>::Number>>::on_initialize
   8: <(TupleElement0,TupleElement1,TupleElement2,TupleElement3,TupleElement4,TupleElement5,TupleElement6,TupleElement7,TupleElement8,TupleElement9,TupleElement10,TupleElement11,TupleElement12,TupleElement13,TupleElement14,TupleElement15,TupleElement16,TupleElement17,TupleElement18,TupleElement19,TupleElement20,TupleElement21,TupleElement22,TupleElement23,TupleElement24,TupleElement25,TupleElement26) as frame_support::traits::hooks::OnInitialize<BlockNumber>>::on_initialize
   9: frame_executive::Executive<System,Block,Context,UnsignedValidator,AllPalletsWithSystem,COnRuntimeUpgrade>::initialize_block
  10: parachain_template_runtime::api::dispatch
  11: environmental::using
  12: sc_executor::executor::WasmExecutor<H>::with_instance::{{closure}}
  13: sc_executor::wasm_runtime::RuntimeCache::with_instance
  14: <sc_executor::executor::NativeElseWasmExecutor<D> as sp_core::traits::CodeExecutor>::call
  15: sp_state_machine::execution::StateMachine<B,H,Exec>::execute
  16: <sc_service::client::call_executor::LocalCallExecutor<Block,B,E> as sc_client_api::call_executor::CallExecutor<Block>>::contextual_call
  17: <sc_service::client::client::Client<B,E,Block,RA> as sp_api::CallApiAt<Block>>::call_api_at
  18: <parachain_template_runtime::RuntimeApiImpl<__SrApiBlock__,RuntimeApiImplCall> as sp_api::Core<__SrApiBlock__>>::__runtime_api_internal_call_api_at::{{closure}}
  19: <parachain_template_runtime::RuntimeApiImpl<__SrApiBlock__,RuntimeApiImplCall> as sp_consensus_aura::AuraApi<__SrApiBlock__,sp_consensus_aura::sr25519::app_sr25519::Public>>::__runtime_api_internal_call_api_at
  20: sc_block_builder::BlockBuilderBuilderStage2<B,C>::build
  21: <sc_basic_authorship::basic_authorship::Proposer<Block,C,A,PR> as sp_consensus::Proposer<Block>>::propose::{{closure}}
  22: <sc_service::task_manager::prometheus_future::PrometheusFuture<T> as core::future::future::Future>::poll
  23: <tracing_futures::Instrumented<T> as core::future::future::Future>::poll
  24: tokio::runtime::park::CachedParkThread::block_on
  25: tokio::runtime::context::runtime::enter_runtime
  26: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
  27: tokio::runtime::task::core::Core<T,S>::poll
  28: tokio::runtime::task::harness::Harness<T,S>::poll
  29: tokio::runtime::blocking::pool::Inner::run
  30: std::sys_common::backtrace::__rust_begin_short_backtrace
  31: core::ops::function::FnOnce::call_once{{vtable.shim}}
  32: std::sys::pal::unix::thread::Thread::new::thread_start
  33: __pthread_joiner_wake


Thread 'tokio-runtime-worker' panicked at 'Slot must increase', /Users/ozgunozerk/.cargo/git/checkouts/polkadot-sdk-cff69157b985ed76/2fe3145/substrate/frame/aura/src/lib.rs:139

This is a bug. Please report it at:

	https://github.com/paritytech/polkadot-sdk/issues/new

2024-03-07 15:21:06 [Parachain] Evicting failed runtime instance error=Runtime panicked: Slot must increase
2024-03-07 15:21:06 [Parachain] err=Error { inner: Proposing

Caused by:
    0: Error at calling runtime api: Execution failed: Runtime panicked: Slot must increase
    1: Execution failed: Runtime panicked: Slot must increase }
2024-03-07 15:21:09 [Relaychain] 💤 Idle (3 peers), best: #178 (0x2a01…73c6), finalized #175 (0xbe24…988e), ⬇ 1.8kiB/s ⬆ 0.8kiB/s
2024-03-07 15:21:09 [Parachain] 💤 Idle (1 peers), best: #1 (0x85d8…dfff), finalized #0 (0x6280…af9f), ⬇ 0.2kiB/s ⬆ 0.2kiB/s
2024-03-07 15:21:13 [Relaychain] ✨ Imported #179 (0x6982…185c)
2024-03-07 15:21:14 [Relaychain] 💤 Idle (3 peers), best: #179 (0x6982…185c), finalized #177 (0xee41…7c56), ⬇ 1.4kiB/s ⬆ 1.0kiB/s
2024-03-07 15:21:14 [Parachain] 💤 Idle (1 peers), best: #1 (0x85d8…dfff), finalized #0 (0x6280…af9f), ⬇ 0.1kiB/s ⬆ 59 B/s

platform

  • [ ] linux
  • [ ] windows
  • [X] macos (I've encountered this on macos, but this error is probably not OS related)

Expected behavior

This error should not be present

Contribution Guidelines

  • [X] I agree to follow this project's Contribution Guidelines

ozgunozerk avatar Mar 07 '24 12:03 ozgunozerk

As additional notes:

  • async-backing feature was not enabled
  • thus, AllowMultipleBlocksPerSlot is set to false

and the code piece that causes the error is from lib.rs of aura pallet:

#[pallet::hooks]
impl<T: Config> Hooks<BlockNumberFor<T>> for Pallet<T> {
fn on_initialize(_: BlockNumberFor<T>) -> Weight {
    if let Some(new_slot) = Self::current_slot_from_digests() {
        let current_slot = CurrentSlot::<T>::get();
        
        if T::AllowMultipleBlocksPerSlot::get() {
            assert!(current_slot <= new_slot, "Slot must not decrease");
        } else {
            assert!(current_slot < new_slot, "Slot must increase");  // THIS LINE CAUSES THE ERROR
        }
        
        CurrentSlot::<T>::put(new_slot);
        
        if let Some(n_authorities) = <Authorities<T>>::decode_len() {
            let authority_index = *new_slot % n_authorities as u64;
            if T::DisabledValidators::is_disabled(authority_index as u32) {
                panic!(
	                "Validator with index {:?} is disabled and should not be attempting to author blocks.",
	                authority_index,
                );
            }
        }
        
        // TODO [#3398] Generate offence report for all authorities that skipped their
        // slots.
        
        T::DbWeight::get().reads_writes(2, 1)
    } else {
        T::DbWeight::get().reads(1)
    }
}

ozgunozerk avatar Mar 07 '24 13:03 ozgunozerk

Hello @ozgunozerk

I'm using the evm-template with our custom fork of polkadot-sdk where the version is downgraded to release-polkadot-v1.7.2. I encounter a similar error here when running the node on Zombienet (Slot must increase).

Is this the same issue as above?

2024-12-10 10:11:06 [Parachain] 🙌 Starting consensus session on top of parent 0x25b7774afbb6a05802a39e6cb10710b1edcb7b67a667272f1aa95debc5c32b54    
2024-12-10 10:11:06 [Parachain] panicked at /Users/rizal/.cargo/git/checkouts/polkadot-sdk-91c59d351e7f851d/b8f1cb0/substrate/frame/aura/src/lib.rs:139:21:
Slot must increase    
2024-12-10 10:11:06 [Parachain] err=Error { inner: Proposing

Caused by:
    0: Error at calling runtime api: Execution failed: Execution aborted due to trap: wasm trap: wasm `unreachable` instruction executed
       WASM backtrace:
       error while executing at wasm backtrace:
           0: 0x51ddf8 - evm_runtime_template.wasm!rust_begin_unwind
           1: 0xc5c5 - evm_runtime_template.wasm!core::panicking::panic_fmt::he18b4f03cee20687
           2: 0x26fb2c - evm_runtime_template.wasm!<pallet_aura::pallet::Pallet<T> as frame_support::traits::hooks::OnInitialize<<<<T as frame_system::pallet::Config>::Block as sp_runtime::traits::HeaderProvider>::HeaderT as sp_runtime::traits::Header>::Number>>::on_initialize::h20a78cba5358bf06
           3: 0x1a271b - evm_runtime_template.wasm!<(TupleElement0,TupleElement1,TupleElement2,TupleElement3,TupleElement4,TupleElement5,TupleElement6,TupleElement7,TupleElement8,TupleElement9,TupleElement10,TupleElement11,TupleElement12,TupleElement13,TupleElement14,TupleElement15,TupleElement16,TupleElement17,TupleElement18,TupleElement19,TupleElement20,TupleElement21,TupleElement22,TupleElement23,TupleElement24,TupleElement25,TupleElement26,TupleElement27,TupleElement28,TupleElement29,TupleElement30,TupleElement31) as frame_support::traits::hooks::OnInitialize<BlockNumber>>::on_initialize::hbb37758212c29152
           4: 0x1f2e58 - evm_runtime_template.wasm!frame_executive::Executive<System,Block,Context,UnsignedValidator,AllPalletsWithSystem,COnRuntimeUpgrade>::initialize_block::hc2929e5fe2afade4
           5: 0x148e6d - evm_runtime_template.wasm!Core_initialize_block
    1: Execution failed: Execution aborted due to trap: wasm trap: wasm `unreachable` instruction executed
       WASM backtrace:
       error while executing at wasm backtrace:
           0: 0x51ddf8 - evm_runtime_template.wasm!rust_begin_unwind
           1: 0xc5c5 - evm_runtime_template.wasm!core::panicking::panic_fmt::he18b4f03cee20687
           2: 0x26fb2c - evm_runtime_template.wasm!<pallet_aura::pallet::Pallet<T> as frame_support::traits::hooks::OnInitialize<<<<T as frame_system::pallet::Config>::Block as sp_runtime::traits::HeaderProvider>::HeaderT as sp_runtime::traits::Header>::Number>>::on_initialize::h20a78cba5358bf06
           3: 0x1a271b - evm_runtime_template.wasm!<(TupleElement0,TupleElement1,TupleElement2,TupleElement3,TupleElement4,TupleElement5,TupleElement6,TupleElement7,TupleElement8,TupleElement9,TupleElement10,TupleElement11,TupleElement12,TupleElement13,TupleElement14,TupleElement15,TupleElement16,TupleElement17,TupleElement18,TupleElement19,TupleElement20,TupleElement21,TupleElement22,TupleElement23,TupleElement24,TupleElement25,TupleElement26,TupleElement27,TupleElement28,TupleElement29,TupleElement30,TupleElement31) as frame_support::traits::hooks::OnInitialize<BlockNumber>>::on_initialize::hbb37758212c29152
           4: 0x1f2e58 - evm_runtime_template.wasm!frame_executive::Executive<System,Block,Context,UnsignedValidator,AllPalletsWithSystem,COnRuntimeUpgrade>::initialize_block::hc2929e5fe2afade4
           5: 0x148e6d - evm_runtime_template.wasm!Core_initialize_block }
2024-12-10 10:11:07 [Relaychain] 💤 Idle (2 peers), best: #68 (0x0308…5c8e), finalized #65 (0x6458…3838), ⬇ 1.1kiB/s ⬆ 0.6kiB/s    
2024-12-10 10:11:07 [Parachain] 💤 Idle (0 peers), best: #2 (0x25b7…2b54), finalized #1 (0x07a0…908f), ⬇ 31 B/s ⬆ 70 B/s

Another thing: I notice that while the parachain block is progressing, the block time always exceeds 12s (see video below).

https://github.com/user-attachments/assets/3e14aaf7-9216-4d87-bfa9-9690c0d74d40

Is there any way to fix this? I'm thinking of adding async backing feature when compiling, but I'm unsure where to start given that we really can't upgrade polkadot-sdk dependencies for now.

Please let me know your thoughts. Thanks.

ical10 avatar Dec 10 '24 02:12 ical10

Hello @ical10! I remember fixing this issue in some of the debuggings, however I do not remember how right now. I will try to reconcile it tomorrow and get back to you.

KitHat avatar Dec 11 '24 14:12 KitHat

@KitHat we're having the same issue. Could you double check if you have the resolution for that?

th7nder avatar Dec 18 '24 16:12 th7nder