
Running a fullnode fails with the error "authenticator_state disabled"

zilongliang opened this issue 1 year ago • 6 comments

authenticator_state disabled
thread 'sui-node-runtime' panicked at crates/sui-core/src/authority/authority_per_epoch_store.rs:903:13:
2024-10-09T06:50:35.204262Z ERROR AuthorityPerEpochStore::new{epoch=0}: telemetry_subscribers: panicked at crates/sui-core/src/authority/authority_per_epoch_store.rs:903:13:

zilongliang avatar Oct 09 '24 08:10 zilongliang

Hi @zilongliang

Thanks for the report. Could you provide a few more details about your setup? Which sui-node version are you using?

stefan-mysten avatar Oct 09 '24 15:10 stefan-mysten

Also can you show the full backtrace please? Btw it is a lot faster to restore from a snapshot instead of starting a fullnode from epoch 0.

mwtian avatar Oct 09 '24 15:10 mwtian

> Also can you show the full backtrace please? Btw it is a lot faster to restore from a snapshot instead of starting a fullnode from epoch 0.

2024-10-10T05:03:37.410911Z  INFO AuthorityPerEpochStore::new{epoch=0}: sui_core::authority::authority_per_epoch_store: authenticator_state disabled
thread 'sui-node-runtime' panicked at crates/sui-core/src/authority/authority_per_epoch_store.rs:903:13:
assertion failed: s.randomness_state_enabled()
2024-10-10T05:03:37.411095Z ERROR AuthorityPerEpochStore::new{epoch=0}: telemetry_subscribers: panicked at crates/sui-core/src/authority/authority_per_epoch_store.rs:903:13: assertion failed: s.randomness_state_enabled() panic.file="crates/sui-core/src/authority/authority_per_epoch_store.rs" panic.line=903 panic.column=13
stack backtrace:
   0:     0x557aa552c375 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h1b9dad2a88e955ff
   1:     0x557aa555a07b - core::fmt::write::h4b5a1270214bc4a7
   2:     0x557aa55282df - std::io::Write::write_fmt::hd04af345a50c312d
   3:     0x557aa552d831 - std::panicking::default_hook::{{closure}}::h96ab15e9936be7ed
   4:     0x557aa552d50c - std::panicking::default_hook::h3cacb9c27561ad33
   5:     0x557aa338de3c - telemetry_subscribers::set_panic_hook::{{closure}}::hf23417b43f726af0
   6:     0x557aa552e0cf - std::panicking::rust_panic_with_hook::hfe205f6954b2c97b
   7:     0x557aa552dcc3 - std::panicking::begin_panic_handler::{{closure}}::h6cb44b3a50f28c44
   8:     0x557aa552c839 - std::sys::backtrace::__rust_end_short_backtrace::hf1c1f2a92799bb0e
   9:     0x557aa552d984 - rust_begin_unwind
  10:     0x557aa2031ce3 - core::panicking::panic_fmt::h3d8fc78294164da7
  11:     0x557aa2031d6c - core::panicking::panic::hec978767ec2d35ff
  12:     0x557aa29e5aec - sui_core::authority::authority_per_epoch_store::AuthorityPerEpochStore::new::h5cae570cce507c2f
  13:     0x557aa20f0da4 - sui_node::SuiNode::start_async::{{closure}}::hd50c2ff992c18b3c
  14:     0x557aa2118c62 - sui_node::main::{{closure}}::h43138b264ff25e18
  15:     0x557aa2148a25 - tokio::runtime::task::harness::Harness<T,S>::poll::h9d16e3d8afcc7b2c
  16:     0x557aa54a5a92 - tokio::runtime::scheduler::multi_thread::worker::Context::run_task::h4e1500d559c922d7
  17:     0x557aa54a4d2b - tokio::runtime::scheduler::multi_thread::worker::Context::run::h5aeb48907b1057f6
  18:     0x557aa54a2e4b - tokio::runtime::context::scoped::Scoped<T>::set::hf1ab9c0a5dffcdc3
  19:     0x557aa549ab8c - tokio::runtime::context::runtime::enter_runtime::hf1482e86d2baf2e1
  20:     0x557aa5493ff0 - tokio::runtime::task::core::Core<T,S>::poll::h683671ea95b9ba2d
  21:     0x557aa548bc48 - tokio::runtime::task::harness::Harness<T,S>::poll::hcbbb51e3abd7498c
  22:     0x557aa549f054 - tokio::runtime::blocking::pool::Inner::run::h458bf7b6efb9f280
  23:     0x557aa5492739 - std::sys::backtrace::__rust_begin_short_backtrace::hb3c7e098c6f820f4
  24:     0x557aa5492c41 - core::ops::function::FnOnce::call_once{{vtable.shim}}::h9e0e446d9e9aa512
  25:     0x557aa5533e7b - std::sys::pal::unix::thread::Thread::new::thread_start::ha8af9c992ef0b208
  26:     0x7f5943828ac3 - <unknown>
  27:     0x7f59438ba850 - <unknown>
  28:                0x0 - <unknown>
Aborted

zilongliang avatar Oct 10 '24 05:10 zilongliang

> Hi @zilongliang
>
> Thanks for the report. Could you provide a few more details about your setup? Which sui-node version are you using?

How can I get it?

zilongliang avatar Oct 10 '24 05:10 zilongliang

sui-node -V should work. Thanks!

stefan-mysten avatar Oct 10 '24 05:10 stefan-mysten

> sui-node -V should work. Thanks!

I haven't run it yet, so this command doesn't exist.

zilongliang avatar Oct 10 '24 05:10 zilongliang

I'm getting the same error when trying to start a node for the first time.

$ ./sui-node -V
sui-node 1.35.1-f176dc2599f0

Error gist: https://gist.github.com/StephenFluin/81c81d6d7e259709729c08c81bc4b1d6

StephenFluin avatar Oct 18 '24 23:10 StephenFluin

Getting the same errors on macOS (Darwin) with sui-node and sui-tool installed from binaries, as well as on Ubuntu 24.04 with binaries. Also getting this error when trying to restore a snapshot:

$ RUST_BACKTRACE=full ./sui-tool download-formal-snapshot --latest --genesis genesis.blob      --network testnet      --path /data/sui-testnet/db/ --num-parallel-downloads 50 --no-sign-request
Beginning formal snapshot restore to end of epoch 525, network: Testnet, verification mode: Normal                                                        thread 'tokio-runtime-worker' panicked at crates/sui-archival/src/lib.rs:183:17:
assertion failed: summary_files.windows(2).all(|w|
        w[1].checkpoint_seq_range.start == w[0].checkpoint_seq_range.end)
stack backtrace:
   0:     0x57455de06c85 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h1b9dad2a88e955ff
   1:     0x57455de34ceb - core::fmt::write::h4b5a1270214bc4a7
   2:     0x57455de032df - std::io::Write::write_fmt::hd04af345a50c312d
   3:     0x57455de08141 - std::panicking::default_hook::{{closure}}::h96ab15e9936be7ed
   4:     0x57455de07e1c - std::panicking::default_hook::h3cacb9c27561ad33
   5:     0x57455c1c2d2c - telemetry_subscribers::set_panic_hook::{{closure}}::h7a002485fb2438fb
   6:     0x57455de089df - std::panicking::rust_panic_with_hook::hfe205f6954b2c97b
   7:     0x57455de085d3 - std::panicking::begin_panic_handler::{{closure}}::h6cb44b3a50f28c44
   8:     0x57455de07149 - std::sys::backtrace::__rust_end_short_backtrace::hf1c1f2a92799bb0e
   9:     0x57455de08294 - rust_begin_unwind
  10:     0x57455b6aaec3 - core::panicking::panic_fmt::h3d8fc78294164da7
  11:     0x57455b6aaf4c - core::panicking::panic::hec978767ec2d35ff
  12:     0x57455cc6caa2 - sui_archival::Manifest::next_checkpoint_after_epoch::hf19ff9523306018e
  13:     0x57455ba3969d - <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter::h5e27863e3621f50a
  14:     0x57455ba2d715 - sui_tool::start_summary_sync::{{closure}}::h54236521fedfeefa
  15:     0x57455ba2645c - tokio::runtime::task::core::Core<T,S>::poll::h5408f5e090aa4ace
  16:     0x57455ba46390 - tokio::runtime::task::raw::poll::hf0858243381f4d7e
  17:     0x57455dcffa92 - tokio::runtime::scheduler::multi_thread::worker::Context::run_task::hf66891fd9ee6bacb
  18:     0x57455dcfed5b - tokio::runtime::scheduler::multi_thread::worker::Context::run::h4082b3aa5a88b0da
  19:     0x57455dcf719b - tokio::runtime::context::scoped::Scoped<T>::set::hd8546a7146c024cd
  20:     0x57455dd065bc - tokio::runtime::context::runtime::enter_runtime::h945f6a7224a9471b
  21:     0x57455dd02720 - tokio::runtime::task::core::Core<T,S>::poll::h3ed6f38fdccaab95
  22:     0x57455dcf2688 - tokio::runtime::task::harness::Harness<T,S>::poll::hea5fe1595049d9e9
  23:     0x57455dd051b4 - tokio::runtime::blocking::pool::Inner::run::hcbfdd4b1a61ae127
  24:     0x57455dd012f9 - std::sys::backtrace::__rust_begin_short_backtrace::h3b0feebf3e96ff28
  25:     0x57455dd017e1 - core::ops::function::FnOnce::call_once{{vtable.shim}}::h61fcab04a9c59de2
  26:     0x57455de0f03b - std::sys::pal::unix::thread::Thread::new::thread_start::ha8af9c992ef0b208
  27:     0x7c5495c9ca94 - <unknown>
  28:     0x7c5495d29c3c - <unknown>
  29:                0x0 - <unknown>
Aborted (core dumped)

StephenFluin avatar Oct 18 '24 23:10 StephenFluin
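The sui-tool panic in the trace above fires on an assertion that consecutive downloaded checkpoint summary files must cover back-to-back sequence ranges. A minimal standalone Rust sketch of that invariant follows; the `SummaryFile` type and `ranges_are_contiguous` helper are illustrative stand-ins, not the actual sui-archival structs.

```rust
use std::ops::Range;

// Illustrative stand-in for a checkpoint summary file entry; the real
// sui-archival manifest entry has more fields, but only the sequence
// range matters for the assertion shown in the trace.
struct SummaryFile {
    checkpoint_seq_range: Range<u64>,
}

// The invariant the panic reports: each file's range must start exactly
// where the previous file's range ends, i.e. the summary files must
// cover checkpoint sequence numbers with no gaps or overlaps.
fn ranges_are_contiguous(summary_files: &[SummaryFile]) -> bool {
    summary_files
        .windows(2)
        .all(|w| w[1].checkpoint_seq_range.start == w[0].checkpoint_seq_range.end)
}

fn main() {
    let contiguous = [
        SummaryFile { checkpoint_seq_range: 0..10 },
        SummaryFile { checkpoint_seq_range: 10..25 },
        SummaryFile { checkpoint_seq_range: 25..40 },
    ];
    let gapped = [
        SummaryFile { checkpoint_seq_range: 0..10 },
        SummaryFile { checkpoint_seq_range: 12..25 }, // checkpoints 10 and 11 missing
    ];
    assert!(ranges_are_contiguous(&contiguous));
    // A gap like this is the condition that aborts the snapshot restore.
    assert!(!ranges_are_contiguous(&gapped));
}
```

In other words, the restore aborts when the downloaded summary files leave a hole in the checkpoint sequence, which can happen with an incomplete or corrupted download.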

> I'm getting the same error when trying to start a node for the first time.
>
> $ ./sui-node -V
> sui-node 1.35.1-f176dc2599f0
>
> Error gist: https://gist.github.com/StephenFluin/81c81d6d7e259709729c08c81bc4b1d6

This might happen if sui-node is upgrading from an older version... I will pass it on to the team.

stefan-mysten avatar Oct 18 '24 23:10 stefan-mysten

Having the same problem when running everything from scratch (Ubuntu 24.04):

2024-10-19T11:07:58.992401Z  INFO sui_core::authority::authority_store: Cur epoch: 0
2024-10-19T11:07:59.001691Z  INFO sui_core::execution_cache::proxy_cache: using cache impl WritebackCache
2024-10-19T11:07:59.003212Z  INFO mysten_network::client: DISABLE_CACHING_RESOLVER: false
2024-10-19T11:07:59.169215Z  INFO AuthorityPerEpochStore::new{epoch=0}: sui_core::authority::authority_per_epoch_store: epoch flags: [ExecutedInEpochTable, StateAccumulatorV2EnabledTestnet, StateAccumulatorV2EnabledMainnet, WritebackCacheEnabled]
2024-10-19T11:07:59.169382Z  INFO AuthorityPerEpochStore::new{epoch=0}: sui_core::authority::authority_per_epoch_store: authenticator_state disabled
thread 'sui-node-runtime' panicked at crates/sui-core/src/authority/authority_per_epoch_store.rs:903:13:
assertion failed: s.randomness_state_enabled()
2024-10-19T11:07:59.169474Z ERROR AuthorityPerEpochStore::new{epoch=0}: telemetry_subscribers: panicked at crates/sui-core/src/authority/authority_per_epoch_store.rs:903:13:
assertion failed: s.randomness_state_enabled() panic.file="crates/sui-core/src/authority/authority_per_epoch_store.rs" panic.line=903 panic.column=13
stack backtrace:
   0:     0x55846bf90655 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h1b9dad2a88e955ff
   1:     0x55846bfbe35b - core::fmt::write::h4b5a1270214bc4a7
   2:     0x55846bf8c5bf - std::io::Write::write_fmt::hd04af345a50c312d
   3:     0x55846bf91b11 - std::panicking::default_hook::{{closure}}::h96ab15e9936be7ed
   4:     0x55846bf917ec - std::panicking::default_hook::h3cacb9c27561ad33
   5:     0x558469feb36c - telemetry_subscribers::set_panic_hook::{{closure}}::h7a002485fb2438fb
   6:     0x55846bf923af - std::panicking::rust_panic_with_hook::hfe205f6954b2c97b
   7:     0x55846bf91fa3 - std::panicking::begin_panic_handler::{{closure}}::h6cb44b3a50f28c44
   8:     0x55846bf90b19 - std::sys::backtrace::__rust_end_short_backtrace::hf1c1f2a92799bb0e
   9:     0x55846bf91c64 - rust_begin_unwind
  10:     0x558468ab4a43 - core::panicking::panic_fmt::h3d8fc78294164da7
  11:     0x558468ab4acc - core::panicking::panic::hec978767ec2d35ff
  12:     0x558469b620cc - sui_core::authority::authority_per_epoch_store::AuthorityPerEpochStore::new::h27aa0ef8397ce352
  13:     0x558468b6ad94 - sui_node::SuiNode::start_async::{{closure}}::h44febe2967fc3535
  14:     0x558468b927d2 - sui_node::main::{{closure}}::hd0a0e86c500553c2
  15:     0x558468ed1cf5 - tokio::runtime::task::harness::Harness<T,S>::poll::h03975ac7274d743c
  16:     0x55846befeef2 - tokio::runtime::scheduler::multi_thread::worker::Context::run_task::hf66891fd9ee6bacb
  17:     0x55846befe1bb - tokio::runtime::scheduler::multi_thread::worker::Context::run::h4082b3aa5a88b0da
  18:     0x55846bef5e2b - tokio::runtime::context::scoped::Scoped<T>::set::hd8546a7146c024cd
  19:     0x55846bf06b9c - tokio::runtime::context::runtime::enter_runtime::h945f6a7224a9471b
  20:     0x55846bf01cd0 - tokio::runtime::task::core::Core<T,S>::poll::h3ed6f38fdccaab95
  21:     0x55846bef0bb8 - tokio::runtime::task::harness::Harness<T,S>::poll::hea5fe1595049d9e9
  22:     0x55846bf054a4 - tokio::runtime::blocking::pool::Inner::run::hcbfdd4b1a61ae127
  23:     0x55846bf00759 - std::sys::backtrace::__rust_begin_short_backtrace::h3b0feebf3e96ff28
  24:     0x55846bf00c41 - core::ops::function::FnOnce::call_once{{vtable.shim}}::h61fcab04a9c59de2
  25:     0x55846bf9815b - std::sys::pal::unix::thread::Thread::new::thread_start::ha8af9c992ef0b208
  26:     0x7f022ca3bac3 - <unknown>
  27:     0x7f022cacd850 - <unknown>
  28:                0x0 - <unknown>
Aborted (core dumped)

sky93 avatar Oct 19 '24 11:10 sky93

I think this failure is currently expected when starting a mainnet or testnet fullnode from epoch 0:

assertion failed: s.randomness_state_enabled()

The solution for now is to restore from a formal snapshot, as @mwtian suggested. Sorry for the inconvenience; we should be able to remove that assert soon.

aschran avatar Oct 21 '24 16:10 aschran

https://github.com/MystenLabs/sui/pull/19935 should fix this once it gets released.

aschran avatar Oct 21 '24 16:10 aschran