cronos icon indicating copy to clipboard operation
cronos copied to clipboard

"UnmarshalJSON cannot decode empty bytes" during statesync

Open zanglang opened this issue 3 years ago • 7 comments

Describe the bug

Whilst starting a new node with statesync and applying snapshot chunks, multiple chunks will fail with the following errors:

6:05AM INF Applied snapshot chunk to ABCI app chunk=27 format=1 height=1154000 module=statesync server=node total=29
6:05AM ERR ante handler panicked error="UnmarshalJSON cannot decode empty bytes" stack trace="goroutine 3868 [running]:\nruntime/debug.Stack()\n\t/usr/local/go/src/runtime/debug/stack.go:24
 +0x65\ngithub.com/tharsis/ethermint/app/ante.Recover({0x27cace0, 0xc0000a3740}, 0xc143dee440)\n\t/go/pkg/mod/github.com/crypto-org-chain/[email protected]/app/ante/ante.go:132 +0xf
f\npanic({0x1b1dea0, 0xc000f20cb0})\n\t/usr/local/go/src/runtime/panic.go:1038 +0x215\ngithub.com/cosmos/cosmos-sdk/x/params/types.Subspace.Get({{0x27f4520, 0xc001278020}, 0xc000212398, {0x
27936b8, 0xc001198470}, {0x2793708, 0xc001198510}, {0xc0002c80c0, 0x4, 0x1a}, ...}, ...)\n\t/go/pkg/mod/github.com/cosmos/[email protected]/x/params/types/subspace.go:109 +0x319\ngithub.co
m/cosmos/cosmos-sdk/x/params/types.Subspace.GetParamSet({{0x27f4520, 0xc001278020}, 0xc000212398, {0x27936b8, 0xc001198470}, {0x2793708, 0xc001198510}, {0xc0002c80c0, 0x4, 0x1a}, ...}, ...)
\n\t/go/pkg/mod/github.com/cosmos/[email protected]/x/params/types/subspace.go:222 +0x145\ngithub.com/cosmos/cosmos-sdk/x/auth/keeper.AccountKeeper.GetParams(...)\n\t/go/pkg/mod/github.com
/cosmos/[email protected]/x/auth/keeper/params.go:15\ngithub.com/cosmos/cosmos-sdk/x/auth/ante.ValidateMemoDecorator.AnteHandle({{_, _}}, {{0x27c9ce8, 0xc000052090}, {0x27fa3a0, 0xc0d1a934
00}, {{0x0, 0x0}, {0x0, 0x0}, ...}, ...}, ...)\n\t/go/pkg/mod/github.com/cosmos/[email protected]/x/auth/ante/basic.go:56 +0x104\ngithub.com/cosmos/cosmos-sdk/types.ChainAnteDecorators.fun
c1({{0x27c9ce8, 0xc000052090}, {0x27fa3a0, 0xc0d1a93400}, {{0x0, 0x0}, {0x0, 0x0}, 0x0, {0x0, ...}, ...}, ...}, ...)\n\t/go/pkg/mod/github.com/cosmos/[email protected]/types/handler.go:40
+0x158\ngithub.com/cosmos/cosmos-sdk/x/auth/ante.TxTimeoutHeightDecorator.AnteHandle({}, {{0x27c9ce8, 0xc000052090}, {0x27fa3a0, 0xc0d1a93400}, {{0x0, 0x0}, {0x0, 0x0}, 0x0, ...}, ...}, ...
)\n\t/go/pkg/mod/github.com/cosmos/[email protected]/x/auth/ante/basic.go:205 +0x28e\ngithub.com/cosmos/cosmos-sdk/types.ChainAnteDecorators.func1({{0x27c9ce8, 0xc000052090}, {0x27fa3a0, 0
xc0d1a93400}, {{0x0, 0x0}, {0x0, 0x0}, 0x0, {0x0, ...}, ...}, ...}, ...)\n\t/go/pkg/mod/github.com/cosmos/[email protected]/types/handler.go:40 +0x158\ngithub.com/cosmos/cosmos-sdk/x/auth/
ante.MempoolFeeDecorator.AnteHandle({}, {{0x27c9ce8, 0xc000052090}, {0x27fa3a0, 0xc0d1a93400}, {{0x0, 0x0}, {0x0, 0x0}, 0x0, ...}, ...}, ...)\n\t/go/pkg/mod/github.com/cosmos/cosmos-sdk@v0.
44.2/x/auth/ante/fee.go:54 +0x306\ngithub.com/cosmos/cosmos-sdk/types.ChainAnteDecorators.func1({{0x27c9ce8, 0xc000052090}, {0x27fa3a0, 0xc0d1a93400}, {{0x0, 0x0}, {0x0, 0x0}, 0x0, {0x0, ..
.}, ...}, ...}, ...)\n\t/go/pkg/mod/github.com/cosmos/[email protected]/types/handler.go:40 +0x158\ngithub.com/tharsis/ethermint/app/ante.EthSetupContextDecorator.AnteHandle({}, {{0x27c9ce
8, 0xc000052090}, {0x27fa3a0, 0xc0d1a93400}, {{0x0, 0x0}, {0x0, 0x0}, 0x0, ...}, ...}, ...)\n\t/go/pkg/mod/github.com/crypto-org-chain/[email protected]/app/ante/eth.go:635 +0x1b2\n
github.com/cosmos/cosmos-sdk/types.ChainAnteDecorators.func1({{0x27c9ce8, 0xc000052090}, {0x27fa3a0, 0xc0d1a93400}, {{0x0, 0x0}, {0x0, 0x0}, 0x0, {0x0, ...}, ...}, ...}, ...)\n\t/go/pkg/mod
/github.com/cosmos/[email protected]/types/handler.go:40 +0x158\ngithub.com/tharsis/ethermint/app/ante.NewAnteHandler.func1({{0x27c9ce8, 0xc000052090}, {0x27fa3a0, 0xc0d1a93400}, {{0x0, 0x
0}, {0x0, 0x0}, 0x0, {0x0, ...}, ...}, ...}, ...)\n\t/go/pkg/mod/github.com/crypto-org-chain/[email protected]/app/ante/ante.go:77 +0x796\ngithub.com/cosmos/cosmos-sdk/baseapp.(*Bas
eApp).runTx(0xc0010c69c0, 0x0, {0xc0fe5eab60, 0x194, 0x194})\n\t/go/pkg/mod/github.com/cosmos/[email protected]/baseapp/baseapp.go:647 +0x63f\ngithub.com/cosmos/cosmos-sdk/baseapp.(*BaseAp
p).CheckTx(0xc0010c69c0, {{0xc0fe5eab60, 0xc0030bdd40, 0x38bb7e0}, 0xb0cc1970})\n\t/go/pkg/mod/github.com/cosmos/[email protected]/baseapp/abci.go:243 +0x185\ngithub.com/tendermint/tenderm
int/abci/client.(*localClient).CheckTxAsync(0xc000cfbe60, {{0xc0fe5eab60, 0xcc42a0, 0xc0002d27e0}, 0xb0cc1b70})\n\t/go/pkg/mod/github.com/tendermint/[email protected]/abci/client/local_cl
ient.go:104 +0x118\ngithub.com/tendermint/tendermint/proxy.(*appConnMempool).CheckTxAsync(0xc0002d27e0, {{0xc0fe5eab60, 0x0, 0x0}, 0xb0cc1c68})\n\t/go/pkg/mod/github.com/tendermint/tendermi
[email protected]/proxy/app_conn.go:126 +0x26\ngithub.com/tendermint/tendermint/mempool.(*CListMempool).CheckTx(0xc001050000, {0xc0fe5eab60, 0x194, 0x194}, 0xc0b0cc1d28, {0x357d, {0xc0b0d94360, 0
x1c9a260}})\n\t/go/pkg/mod/github.com/tendermint/[email protected]/mempool/clist_mempool.go:288 +0x490\ngithub.com/tendermint/tendermint/mempool.(*Reactor).Receive(0xc0010c3a40, 0x30, {0x
28073d0, 0xc0b0d9a6c0}, {0xc0b0dca000, 0x2795fa8, 0xc000f84bd0})\n\t/go/pkg/mod/github.com/tendermint/[email protected]/mempool/reactor.go:183 +0x476\ngithub.com/tendermint/tendermint/p2p
.createMConnection.func1(0x30, {0xc0b0dca000, 0x19a, 0x1000})\n\t/go/pkg/mod/github.com/tendermint/[email protected]/p2p/peer.go:390 +0x20e\ngithub.com/tendermint/tendermint/p2p/conn.(*MC
onnection).recvRoutine(0xc002fb46e0)\n\t/go/pkg/mod/github.com/tendermint/[email protected]/p2p/conn/connection.go:647 +0x3bd\ncreated by github.com/tendermint/tendermint/p2p/conn.(*MConn
ection).OnStart\n\t/go/pkg/mod/github.com/tendermint/[email protected]/p2p/conn/connection.go:231 +0x231\n"

After a while the statesync fails with:

panic: failed to process committed block (1154002:3A292C12712B3AFECCEAB5255EC6384906AED7C2BD5048091670F773A2CDBDC7): wrong Block.Header.AppHash.  Expected 24B3AEB57BF785B227ED9C8D2FB86A9465
ED843076E4F39DBC1A652CB0D4060F, got 9498BFE00ECCA3A16C7174AA4166A54E744A6991FE11F0C0ED81AED16821CA65 [recovered]
        panic: failed to process committed block (1154002:3A292C12712B3AFECCEAB5255EC6384906AED7C2BD5048091670F773A2CDBDC7): wrong Block.Header.AppHash.  Expected 24B3AEB57BF785B227ED9C8D2F
B86A9465ED843076E4F39DBC1A652CB0D4060F, got 9498BFE00ECCA3A16C7174AA4166A54E744A6991FE11F0C0ED81AED16821CA65

To Reproduce

I am using the following trust settings for statesync:

trust_height = 1155168
trust_hash = "74246A38B289169AEF74DD4E970BB0EB449F67045196595563E53157954EA78C"
trust_period = "168h0m0s"

Expected behavior

Statesync should apply chunks normally.

Additional context

Tested on v0.6.5 with version:

name: cronos
server_name: cronosd
version: 0.6.5
commit: fb7db2ecfd4c78ef31ab9887d4d2728c0473357e
build_tags: ""
go: go version go1.17.1 linux/amd64

zanglang avatar Jan 23 '22 14:01 zanglang

Managed to replicate this issue here. Including the stack trace to see if it helps:

goroutine 3495 [running]:

github.com/tendermint/tendermint/blockchain/v0.(*BlockchainReactor).poolRoutine(0xc0012be700, 0x1)

	/go/pkg/mod/github.com/tendermint/[email protected]/blockchain/v0/reactor.go:401 +0x1265

created by github.com/tendermint/tendermint/blockchain/v0.(*BlockchainReactor).SwitchToFastSync

	/go/pkg/mod/github.com/tendermint/[email protected]/blockchain/v0/reactor.go:125 +0xef

Terra854 avatar Jan 23 '22 16:01 Terra854

@zanglang @Terra854 thanks for the report, This error actually is not related to the statesync, this error happens when the node is receiving a transaction from the connecting peer, and then executing the CheckTxAsync, but the state of the app is not ready yet (because still syncing).

Are your node both got stuck during the statesync? @Terra854 do you have more logs can provide? thanks!

JayT106 avatar Apr 01 '22 20:04 JayT106

@zanglang @Terra854 thanks for the report, This error actually is not related to the statesync, this error happens when the node is receiving a transaction from the connecting peer, and then executing the CheckTxAsync, but the state of the app is not ready yet (because still syncing).

Are your node both got stuck during the statesync? @Terra854 do you have more logs can provide? thanks!

I am able to reproduce @Terra854 's status. Looking into it.

JayT106 avatar Apr 05 '22 20:04 JayT106

Update some logs from reproducing the issue, the snapshot was able to restore, but the status of the next block executing looks incorrect, it has 23 invalid tx, no valid tx into the block (the explorer has 19 txs). So the next block executing will mismatch for sure.

6:55PM INF Verified ABCI app appHash="��I��q\x12�b,,3\u007f\x04��C�VZ\x1ake��W�mDk�\v" height=2243000 module=statesync server=node
6:55PM INF Snapshot restored format=1 hash="\x06[h\x11��p�M\x02i�\x0e 1��#\x04\x03��\x02�)�H�l\x02��" height=2243000 module=statesync server=node
6:55PM INF Starting BlockPool service impl=BlockPool module=blockchain server=node
6:55PM INF executed block height=2243001 module=state num_invalid_txs=23 num_valid_txs=0 server=node
6:55PM INF commit synced commit=436F6D6D697449447B5B37332031302038332035312031323320313632203235302031312031353020313938203136332031392031353420313432203133342034312032372031353220323331203136342033352032303620313735203136352031313120323030203430203232312031343920333020323134203230375D3A3232333942397D
6:55PM INF committed state app_hash=490A53337BA2FA0B96C6A3139A8E86291B98E7A423CEAFA56FC828DD951ED6CF height=2243001 module=state num_txs=23 server=node
6:55PM INF indexed block height=2243001 module=txindex server=node
panic: Failed to process committed block (2243002:2B4FCE3BDF999B6A9BCACA04057A734189EF3AA12F6EEFF9F4EFE895BBEE092E): wrong Block.Header.AppHash.  Expected 490A53337BA2FA0B96C6A3139A8E86291B98E7A423CEAFA56FC828DD951ED6CF, got 19938979E4AA523DE01075D1B7C0A05D707DFDC205CED64761F5179B68DDE50A

goroutine 3117 [running]:
github.com/tendermint/tendermint/blockchain/v0.(*BlockchainReactor).poolRoutine(0xc00145c000, 0x1)
	/go/pkg/mod/github.com/tendermint/[email protected]/blockchain/v0/reactor.go:401 +0x1265
created by github.com/tendermint/tendermint/blockchain/v0.(*BlockchainReactor).SwitchToFastSync
	/go/pkg/mod/github.com/tendermint/[email protected]/blockchain/v0/reactor.go:125 +0xef

JayT106 avatar Apr 07 '22 19:04 JayT106

resync the snapshot at height 2255000, and it can continue to the block sync.

looks like once the node receives a TX and runs CheckTx, the statesync will always fail.

JayT106 avatar Apr 09 '22 00:04 JayT106

This problem occurred after the v0.7.0 upgrade

jsvisa avatar May 07 '22 23:05 jsvisa

Problem: some transaction's receipt status is wrong #462

What problem you are talking about? UnmarshalJSON cannot decode empty bytes or failed to process committed block (or both)?

JayT106 avatar May 09 '22 14:05 JayT106

close issue due to no discussion recently.

JayT106 avatar Jan 10 '23 16:01 JayT106