
stop nodeos on ship error

Open cc32d9 opened this issue 4 years ago • 1 comments

If state history plugin has problems writing the history, nodeos must stop immediately.

related issue: #9483

Another failure condition: the node starts from a snapshot that is newer than the last block in the state history log. The node then keeps processing blocks and does not write any history.

Currently it only prints a warning and carries on.

In case of an error, the operator needs to restart nodeos from a snapshot, and this condition is sometimes difficult to detect. Normally, if someone runs a node with state history enabled, delivering history is that node's primary job.
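To make the requested behaviour concrete, here is a minimal sketch of a "fail hard on a gap" check. It is not nodeos code: `HistoryGapError` and `require_contiguous` are hypothetical names, and the sketch only assumes that the last block in the history log and the number of the block about to be written are both known.

```python
class HistoryGapError(RuntimeError):
    """Raised when a block would be written past a hole in the history log."""


def require_contiguous(last_logged_block: int, next_block: int) -> None:
    """Fail hard if writing next_block would leave a gap in the log.

    Hypothetical helper. Only gaps are checked; a next_block at or below
    last_logged_block is out of scope for this sketch.
    """
    if next_block > last_logged_block + 1:
        missing = next_block - last_logged_block - 1
        raise HistoryGapError(
            f"history log ends at block {last_logged_block}, "
            f"next block is {next_block}: {missing} block(s) would be missing"
        )


# Contiguous write: no exception is raised.
require_contiguous(100, 101)

# Snapshot newer than the log: stop instead of printing a warning.
try:
    require_contiguous(100, 250)
except HistoryGapError as err:
    print(f"stopping: {err}")
```

The point of raising instead of logging is that the operator sees the process exit at the moment the gap appears, rather than discovering missing history later.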

cc32d9 · Mar 07 '21 16:03

A 2.0.13 WAX node encountered this error, and stopped as expected. But it can't restart, as state history is missing the block.

Failure where it stopped:

Sep 11 22:25:11 eosio nodeos[37573]: error 2021-09-11T22:25:11.972 nodeos    state_history_log.hpp:126     write_entry          ] chain::plugin_exception: 3110000 plugin_exception: Plugin exception
Sep 11 22:25:11 eosio nodeos[37573]: missed a block in trace_history.log
Sep 11 22:25:11 eosio nodeos[37573]:     {"name":"trace_history"}
Sep 11 22:25:11 eosio nodeos[37573]:     nodeos  state_history_log.hpp:93 write_entry
Sep 11 22:25:12 eosio nodeos[37573]: error 2021-09-11T22:25:12.011 nodeos    controller.cpp:1962           apply_block          ] e.to_detail_string(): 3140002 state_history_write_exception: State history write error
Sep 11 22:25:12 eosio nodeos[37573]: State history encountered an Error which it cannot recover from.  Please resolve the error and relaunch the process
Sep 11 22:25:12 eosio nodeos[37573]:     {}
Sep 11 22:25:12 eosio nodeos[37573]:     nodeos  state_history_log.hpp:134 write_entry
Sep 11 22:25:12 eosio nodeos[37573]:  
Sep 11 22:25:12 eosio nodeos[37573]: error 2021-09-11T22:25:12.079 nodeos    producer_plugin.cpp:361       on_incoming_block    ] 3140002 state_history_write_exception: State history write error
Sep 11 22:25:12 eosio nodeos[37573]: State history encountered an Error which it cannot recover from.  Please resolve the error and relaunch the process
Sep 11 22:25:12 eosio nodeos[37573]:     {}
Sep 11 22:25:12 eosio nodeos[37573]:     nodeos  state_history_log.hpp:134 write_entry
Sep 11 22:25:12 eosio nodeos[37573]:     {}
Sep 11 22:25:12 eosio nodeos[37573]:     nodeos  controller.cpp:1966 apply_block
Sep 11 22:25:12 eosio nodeos[37573]: rethrow
Sep 11 22:25:12 eosio nodeos[37573]:     {}
Sep 11 22:25:12 eosio nodeos[37573]:     nodeos  controller.cpp:2030 push_block
Sep 11 22:25:12 eosio nodeos[37573]: error 2021-09-11T22:25:12.175 nodeos    net_plugin.cpp:3021           process_signed_block ] ["37.58.52.123:9876 - bce67fa" 37.58.52.123:9101]bad block exception #139897766 39fb347266779bb7...: State history write error (3140002)
Sep 11 22:25:12 eosio nodeos[37573]: State history encountered an Error which it cannot recover from.  Please resolve the error and relaunch the process
Sep 11 22:25:12 eosio nodeos[37573]: rethrow
Sep 11 22:25:12 eosio nodeos[37573]: error 2021-09-11T22:25:12.211 net-0     net_plugin.cpp:2253           operator()           ] connection failed to peer1.wax.blacklusion.io:4646: Operation canceled

Failure after restart:

Sep 12 10:50:47 eosio nodeos[39816]: info  2021-09-12T10:50:47.424 nodeos    state_history_log.hpp:228     open_log             ] trace_history.log has blocks 2-139897764
Sep 12 10:50:47 eosio nodeos[39816]: info  2021-09-12T10:50:47.476 nodeos    state_history_log.hpp:228     open_log             ] chain_state_history.log has blocks 2-139897764
Sep 12 10:50:47 eosio nodeos[39816]: error 2021-09-12T10:50:47.636 nodeos    state_history_log.hpp:126     write_entry          ] chain::plugin_exception: 3110000 plugin_exception: Plugin exception
Sep 12 10:50:47 eosio nodeos[39816]: missed a block in trace_history.log
Sep 12 10:50:47 eosio nodeos[39816]:     {"name":"trace_history"}
Sep 12 10:50:47 eosio nodeos[39816]:     nodeos  state_history_log.hpp:93 write_entry
Sep 12 10:50:47 eosio nodeos[39816]: error 2021-09-12T10:50:47.645 nodeos    controller.cpp:1962           apply_block          ] e.to_detail_string(): 3140002 state_history_write_exception: State history write error
Sep 12 10:50:47 eosio nodeos[39816]: State history encountered an Error which it cannot recover from.  Please resolve the error and relaunch the process
Sep 12 10:50:47 eosio nodeos[39816]:     {}
Sep 12 10:50:47 eosio nodeos[39816]:     nodeos  state_history_log.hpp:134 write_entry
Sep 12 10:50:47 eosio nodeos[39816]:  
Sep 12 10:50:47 eosio nodeos[39816]: error 2021-09-12T10:50:47.721 nodeos    main.cpp:125                  main                 ] 3140002 state_history_write_exception: State history write error
Sep 12 10:50:47 eosio nodeos[39816]: State history encountered an Error which it cannot recover from.  Please resolve the error and relaunch the process
Sep 12 10:50:47 eosio nodeos[39816]:     {}
Sep 12 10:50:47 eosio nodeos[39816]:     nodeos  state_history_log.hpp:134 write_entry
Sep 12 10:50:47 eosio nodeos[39816]:     {}
Sep 12 10:50:47 eosio nodeos[39816]:     nodeos  controller.cpp:1966 apply_block
Sep 12 10:50:47 eosio nodeos[39816]: rethrow
Sep 12 10:50:47 eosio nodeos[39816]:     {}
Sep 12 10:50:47 eosio nodeos[39816]:     nodeos  controller.cpp:2081 replay_push_block
Sep 12 10:50:47 eosio nodeos[39816]:     {}
Sep 12 10:50:47 eosio nodeos[39816]:     nodeos  chain_plugin.cpp:1193 plugin_startup
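The gap can be read straight out of the two log excerpts. The sketch below parses the `open_log` lines in the format shown above and lists the blocks between the end of the log and the block that was rejected in the first failure. The function names are hypothetical, and treating the rejected block (#139897766) as the one the plugin tried to write is an assumption drawn from these logs, not from the nodeos source.

```python
import re

# Matches lines such as "trace_history.log has blocks 2-139897764".
OPEN_LOG_RE = re.compile(r"(\w+\.log) has blocks (\d+)-(\d+)")


def logged_ranges(lines):
    """Return {log file name: (first_block, last_block)} from open_log lines."""
    ranges = {}
    for line in lines:
        match = OPEN_LOG_RE.search(line)
        if match:
            ranges[match.group(1)] = (int(match.group(2)), int(match.group(3)))
    return ranges


def missing_blocks(last_logged, rejected_block):
    """Blocks that lie strictly between the log's end and the rejected block."""
    return list(range(last_logged + 1, rejected_block))


sample = [
    "open_log ] trace_history.log has blocks 2-139897764",
    "open_log ] chain_state_history.log has blocks 2-139897764",
]
ranges = logged_ranges(sample)
last = ranges["trace_history.log"][1]
print(missing_blocks(last, 139897766))  # prints [139897765]
```

Under that assumption, block 139897765 never reached the log, which is consistent with the node failing again on restart.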

I'll restart it from a snapshot and I'm sure it will work. Just a note that the node leaves an inconsistent state after the failure.

cc32d9 · Sep 12 '21 10:09