The node shows it catches up but it's not on the latest block height.

Open kuoweilai opened this issue 2 years ago • 9 comments

Describe the bug

Sometimes my node falls behind the network, but it still reports that it has caught up.

To Reproduce

# curl -s https://rpc.cronos.org/commit | jq "{height: .result.signed_header.header.height}"

{
  "height": "1742910"
}
# ./cronos/bin/cronosd status 2>&1 | jq '.SyncInfo.catching_up'
false
# ./cronos/bin/cronosd status 2>&1 | jq '.SyncInfo.latest_block_height'
"1742900"

My node is 10 blocks behind, but it still shows catching_up as false.

Sometimes even more:

# ./cronos/bin/cronosd status 2>&1 | jq '.SyncInfo.latest_block_height'
"1742900"
# curl -s https://rpc.cronos.org/commit | jq "{height: .result.signed_header.header.height}"
{
  "height": "1742963"
}
# ./cronos/bin/cronosd status 2>&1 | jq '.SyncInfo.catching_up'
false
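The comparison above can be automated. The following is a minimal sketch, not part of the original report: it takes the parsed JSON from `cronosd status` and from the `/commit` endpoint of a reference RPC, using the same field paths as the jq commands above, and flags a node that reports catching_up false while trailing the reference. The function names and the `maxLag` threshold of 5 blocks are assumptions chosen for illustration.

```javascript
// Sketch: measure how far the local node trails a reference RPC.
// Field paths match the jq filters used in the reproduction steps.

/** Height from parsed `cronosd status` output. */
function localHeight(status) {
  return Number(status.SyncInfo.latest_block_height);
}

/** Height from a parsed /commit response of a reference RPC. */
function referenceHeight(commit) {
  return Number(commit.result.signed_header.header.height);
}

/** Number of blocks the local node is behind (0 if level or ahead). */
function blockLag(status, commit) {
  return Math.max(0, referenceHeight(commit) - localHeight(status));
}

/**
 * True when the node claims to be synced but lags by more than maxLag blocks.
 * maxLag is a hypothetical threshold; tune it for your setup.
 */
function isSilentlyBehind(status, commit, maxLag = 5) {
  return status.SyncInfo.catching_up === false && blockLag(status, commit) > maxLag;
}

// Example using the heights from the first reproduction above.
const exampleStatus = { SyncInfo: { latest_block_height: "1742900", catching_up: false } };
const exampleCommit = { result: { signed_header: { header: { height: "1742910" } } } };
console.log(blockLag(exampleStatus, exampleCommit));         // 10
console.log(isSilentlyBehind(exampleStatus, exampleCommit)); // true
```

Keeping the helpers free of network calls means they can be fed from any source, for example the output of the two shell commands piped into a small Node script.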

I am running a websocket application against my node 24/7. Could this be the reason, or is there another possible cause? Thanks.

Desktop (please complete the following information): Ubuntu 20.04, version 0.6.7


Update: It has suddenly caught up now. Is it because I use fast sync, so it may not follow the chain closely and instead catches up in chunks?

kuoweilai avatar Mar 04 '22 01:03 kuoweilai

I guess there are some slow RPC queries going on?

yihuang avatar Mar 09 '22 04:03 yihuang

My application is basically a websocket subscription that monitors every block, and I only read particular tx data. I'm not sure whether the websocket subscription is what slowed down the node.

kuoweilai avatar Mar 09 '22 04:03 kuoweilai

> My application is basically a websocket subscription on monitoring every block, and I read particular tx data only. I ain't very sure it's because of the websocket subscription slowed down the node.

Then it doesn't sound heavy. Is CPU or disk I/O the bottleneck? LevelDB does compaction periodically; not sure if that's the case here.

yihuang avatar Mar 09 '22 06:03 yihuang

Sure, let me check and I'll report back later. I need to roll back to 0.6.5 first. Thank you so much!

kuoweilai avatar Mar 09 '22 06:03 kuoweilai

Is there a good way to find the bottleneck or check whether this is a setup issue? I had this issue even when I only ran the following:

const Web3 = require('web3');

// Websocket endpoint of the local node.
const main_provider = "ws://localhost:8546";
const web3 = new Web3(main_provider);

// Subscribe to new block headers and log each block number.
const monitor = () => {
	web3.eth.subscribe('newBlockHeaders', (error, block_info) => {
		if (error) {
			console.error('[Subscription Error] ' + error.message);
			return;
		}
		console.log('[Block Number] ' + block_info.number.toString());
	});
};
monitor();

My node loses sync, but ./cronosd status 2>&1 | jq '.SyncInfo.catching_up' still shows false.

If I restart the node without the websocket queries, it gradually catches up again. During that time ./cronosd status 2>&1 | jq '.SyncInfo.catching_up' shows true, and once the node has caught up it becomes false.

kuoweilai avatar Mar 16 '22 18:03 kuoweilai

I also noticed that even without running any queries, the node still loses sync, but ./cronosd status 2>&1 | jq '.SyncInfo.catching_up' shows false.

kuoweilai avatar Mar 17 '22 15:03 kuoweilai

@gap370 thanks for reporting. I think this relates to an issue in Tendermint: catching_up only shows the correct status while the node is in block_sync mode.
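Since catching_up cannot be relied on outside block sync, one workaround is to judge freshness from the timestamp of the latest block instead. The sketch below is an illustration, not something proposed in this thread. It assumes the status output includes `SyncInfo.latest_block_time` as an RFC 3339 timestamp (a standard Tendermint status field), that the local clock is reasonably accurate, and that a 60-second threshold is acceptable. The function names, the threshold, and the sample timestamps are all hypothetical.

```javascript
// Sketch: detect a stale node from the age of its latest block,
// independent of the catching_up flag.

/** Seconds between the node's latest block time and nowMs. */
function latestBlockAgeSeconds(status, nowMs = Date.now()) {
  // Tendermint timestamps can carry nanoseconds; keep only milliseconds
  // so that Date.parse handles them consistently.
  const raw = String(status.SyncInfo.latest_block_time);
  const blockMs = Date.parse(raw.replace(/(\.\d{3})\d+/, "$1"));
  if (Number.isNaN(blockMs)) {
    throw new Error("SyncInfo.latest_block_time is missing or not a timestamp");
  }
  return (nowMs - blockMs) / 1000;
}

/** True when the latest block is older than maxAgeSeconds (assumed threshold). */
function looksStale(status, maxAgeSeconds = 60, nowMs = Date.now()) {
  return latestBlockAgeSeconds(status, nowMs) > maxAgeSeconds;
}

// Example with made-up timestamps: the latest block is five minutes old.
const sampleStatus = {
  SyncInfo: { latest_block_time: "2022-03-04T01:03:00.123456789Z", catching_up: false },
};
const sampleNow = Date.parse("2022-03-04T01:08:00.123Z");
console.log(latestBlockAgeSeconds(sampleStatus, sampleNow)); // 300
console.log(looksStale(sampleStatus, 60, sampleNow));        // true
```

The threshold should be chosen relative to the chain's block interval; a value that is too tight will raise false alarms during normal block-time jitter.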

JayT106 avatar Mar 22 '22 21:03 JayT106

Hi @JayT106, thank you for looking into this. Is it possible to use blocksync mode?

kuoweilai avatar Mar 22 '22 21:03 kuoweilai

Ah, blocksync and statesync are the mechanisms a node uses after it starts, in order to catch up to the network height. They only run at node startup; once the node has caught up to the network height, it will not go back to this mode.

JayT106 avatar Mar 23 '22 01:03 JayT106

Closing the issue due to no reply.

JayT106 avatar Jan 10 '23 16:01 JayT106