celo-blockchain
Full node on mainnet stops syncing with BAD BLOCK
The full node stopped syncing at block number 1952640
with the error message below.
Logs
Are there any logs?
ERROR[08-17|09:38:21.814] The header retrieved from the chain is nil block num=1952640
ERROR[08-17|09:38:21.814]
########## BAD BLOCK #########
Chain config: {ChainID: 42220 Homestead: 0 DAO: <nil> DAOSupport: false EIP150: 0 EIP155: 0 EIP158: 0 Byzantium: 0 Constantinople: 0 Petersburg: 0 Istanbul: 0 Engine: istanbul}
Number: 1966203
Hash: 0xc778e99dfd1dee0a97df2c5e68a3a933d5a1ae2022b1e3953290060e8b8425b0
Error: unknown block
##############################
System information
Run geth version
Celo
Version: 1.0.1-stable
Architecture: amd64
Protocol Versions: [65 64]
Go Version: go1.13.14
Operating System: linux
GOPATH=
GOROOT=/usr/local/go
Hi @kwunyeung. Sorry for taking so long to address this issue. Are you still seeing it, even after deleting your on-disk chain data?
I think that the root cause of this issue is this: https://github.com/celo-org/celo-blockchain/issues/1107
What's happening is that on the last block of each epoch (specifically, block numbers where number mod 17280 == 0), the node calculates validator rewards for all of the validators in the ending epoch.
As part of that calculation, your node needs to calculate the uptime scores of all of the epoch's validators, which uses data saved in your node's local LevelDB. However, there are rare corner cases where the data in LevelDB is incorrect, which leads to your node calculating incorrect uptime scores (and hence to this BAD BLOCK error).
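To check whether a given log line points at an epoch boundary, the condition above can be sketched in a few lines of Go. This is only an illustration of the "mod 17280 == 0" rule as stated in this comment, not the code celo-blockchain itself uses; the names `epochSize` and `isLastBlockOfEpoch` are hypothetical, and block 0 is not treated specially.

```go
package main

import "fmt"

// epochSize is the epoch length quoted in this thread (17280 blocks).
// Assumption: it is hard-coded here for illustration only.
const epochSize uint64 = 17280

// isLastBlockOfEpoch applies the condition described above:
// a block is the last block of an epoch when its number is an
// exact multiple of epochSize. (Hypothetical helper, not the node's API.)
func isLastBlockOfEpoch(blockNum uint64) bool {
	return blockNum%epochSize == 0
}

func main() {
	// Block numbers taken from the "header retrieved from the chain is nil"
	// log lines in this issue, plus one block a node reported being stuck on.
	for _, n := range []uint64{1952640, 1002240, 2350080, 978688} {
		fmt.Printf("block %d: last block of epoch = %v\n", n, isLastBlockOfEpoch(n))
	}
}
```

The three block numbers in the "header retrieved from the chain is nil" messages reported in this issue (1952640, 1002240 and 2350080) are all exact multiples of 17280, which is consistent with the failure happening during end-of-epoch processing.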
I'm going to close this for now. If you encounter it again even after deleting your chain data and resyncing, then please re-open.
@kevjue we have synced the node from scratch and it's currently working fine. Thanks!
I saw this now on a full node (on a recent version syncing to mainnet):
ERROR[03-23|19:23:33.610] The header retrieved from the chain is nil block num=1002240
ERROR[03-23|19:23:33.610]
########## BAD BLOCK #########
Chain config: {ChainID: 42220 Homestead: 0 DAO: <nil> DAOSupport: true EIP150: 0 EIP155: 0 EIP158: 0 Byzantium: 0 Constantinople: 0 Petersburg: 0 Istanbul: 0 Churrito: <nil>, Donut: <nil>, Engine: istanbul}
Number: 1004268
Hash: 0x10b43d1b808c0f1e1be98f9f36b84bb591e07051f0341cc7af30d9cb87a9b6d7
Error: unknown block
##############################
The node is stuck on block 978688, and reports an error on block 1004268. Not clear why it got stuck syncing, though.
Presumably deleting the chain data would work, but then it'd have to sync from the beginning again, and it seems the issue is not really resolved.
I just hit this on mainnet. The latest block in Geth is 2333423.
ERROR[04-29|12:19:53.531] The header retrieved from the chain is nil block num=2350080
ERROR[04-29|12:19:53.531]
########## BAD BLOCK #########
Chain config: {ChainID: 42220 Homestead: 0 DAO: <nil> DAOSupport: true EIP150: 0 EIP155: 0 EIP158: 0 Byzantium: 0 Constantinople: 0 Petersburg: 0 Istanbul: 0 Churrito: 6774000, Donut: 6774000, Engine: istanbul}
Number: 3360065
Hash: 0x058bcdb82cef7af601306c1a6ef2ce4139beb85ba4332e6308b65e3db15ec8fd
Error: unknown block
##############################
WARN [04-29|12:19:53.535] Error in sending message func=AsyncMulticastCeloMsg msgCode=18 peer="Peer 7a2c1573a9b944c0 [eth/65]" ethMsgCode=18 err="shutting down"
WARN [04-29|12:19:53.539] Synchronisation failed, dropping peer peer=167d06a00d57b861 err="retrieved hash chain is invalid: unknown block"
[Edit: the node was on v1.0.0-stable]
@trianglesphere @gastonponti please check whether this still happens on the new release version. If not, close.