
[Feature]: Allow automatic restart when the data node is behind the core but a local snapshot is present on disk

Open daniel1302 opened this issue 1 year ago • 0 comments

Feature Overview

The data node often falls behind the core when the system is unhealthy for some reason (e.g. high traffic, disk IOPS issues, etc.). When that happens, we see the following error:


Jan 08 14:45:29 api0.vega.community visor[3749951]: vega data node stopped with error: block height on begin block, 30391214, is too high, the height of the last processed block is 30391119

It says the data node's last processed block is 30391119, while the core block was 30391214.
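As a rough illustration, both heights can be pulled out of the error line with a regular expression. This is only a sketch: the pattern is copied from the log line quoted above, and the function name `parse_height_error` is hypothetical, not something that exists in Vega.

```python
import re

# Pattern built from the error text quoted in this issue.
HEIGHT_ERROR = re.compile(
    r"block height on begin block, (\d+), is too high, "
    r"the height of the last processed block is (\d+)"
)


def parse_height_error(line):
    """Return (core_height, last_processed_height), or None if the line does not match."""
    match = HEIGHT_ERROR.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


log_line = (
    "Jan 08 14:45:29 api0.vega.community visor[3749951]: "
    "vega data node stopped with error: "
    "block height on begin block, 30391214, is too high, "
    "the height of the last processed block is 30391119"
)
print(parse_height_error(log_line))  # (30391214, 30391119)
```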

The data node was only 95 blocks behind (30391214 - 30391119), and a snapshot had been created at block 30391120.

To restart the node, we have to specify the core start height, either with the --snapshot.load-from-block-height flag or with the corresponding config parameter in the <vega_home>/config/node/config.toml file.

This is not very practical. We could add a config option that, when this error happens, finds the latest available local snapshot, if there is one, and restarts from it automatically.
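The selection step could look roughly like the sketch below. It is not the Vega implementation. The function names and the list of snapshot heights are hypothetical, and the rule for which snapshot is usable (height at most one block past the data node's last processed block) is an assumption inferred from the example in this issue. The `--flag=value` form is also assumed.

```python
def pick_snapshot_height(available_heights, last_processed_height):
    """Pick the newest local snapshot the data node could resume from.

    ASSUMPTION: a snapshot is usable when its height is at most one block
    past the data node's last processed block, which matches the example
    in this issue (snapshot 30391120, last processed block 30391119).
    """
    limit = last_processed_height + 1
    candidates = [h for h in available_heights if h <= limit]
    return max(candidates) if candidates else None


def restart_flag(available_heights, last_processed_height):
    """Build the existing flag with the chosen height, or None if no snapshot fits."""
    height = pick_snapshot_height(available_heights, last_processed_height)
    if height is None:
        return None
    return f"--snapshot.load-from-block-height={height}"


# Hypothetical snapshot heights found on disk; only 30391120 comes from the issue.
local_snapshots = [30390520, 30390820, 30391120]
print(restart_flag(local_snapshots, 30391119))
```

When no snapshot satisfies the rule, the sketch returns `None`, so the node would keep today's behaviour and wait for a manual restart.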

daniel1302 avatar Jan 08 '24 14:01 daniel1302