iotex-core
improve blocksync speed
What would you like to be added
as title
Why is this needed
Currently a mainnet node syncs at a rate of 3-5 blocks per second, which is not satisfactory. When running a full node, the following can sometimes be observed:
- this log
dispatcher/dispatcher.go:372 dispatcher block channel is full, drop an event.
is printed, showing that the receiving channel fills up faster than blocks are drained (committed)
- sometimes on testnet, blocks are committed in batches (visible in the log) of about 20 blocks, followed by a pause of a couple of seconds before the next batch is committed; the blocksync goroutine seems to have a bottleneck/hiccup somewhere
How important you think this is for IoTeX
- [x] must have
- [ ] should have
- [ ] nice to have
Additional information
Let us know any background or context that would help us better understand the request (for example the particular use-case that prompted this request)
The pause of a couple of seconds may be caused by this code:
// blocksync/blocksync.go:179
func (bs *blockSyncer) sync() {
updateTime, targetHeight := bs.flushInfo()
if updateTime.Add(bs.cfg.Interval).After(time.Now()) {
return
}
...
}
In effect, blocks are requested only after the block buffer is empty, regardless of the blocksync interval config. Between the moment the block buffer becomes empty and the next sync, there are no blocks to commit.
The actual blocksync timeline may look like this:
assume: sync_interval=30s block_buffer_size=200 speed=5 blocks/s
00:00:00 sync
commit blocks
00:00:30 sync skipped because the block buffer is not empty
commit blocks
00:00:40 block buffer empty
no blocks to commit and no sync (pause)
00:01:00 sync
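The length of the pause in this timeline can be estimated with a few lines of Go. This is only a sketch of the simplified model above (a sync fires on a fixed tick, and only once the buffer has drained); the function and parameter names are hypothetical and do not come from iotex-core.

```go
package main

import "fmt"

// idleSeconds estimates how long no block is committed between two syncs
// under the simplified model of the timeline: a sync fires only on a tick of
// intervalSec, and only if the buffer has already drained.
// All names here are hypothetical and not taken from iotex-core.
func idleSeconds(intervalSec, bufferSize, blocksPerSec int) int {
	if intervalSec <= 0 || blocksPerSec <= 0 {
		return 0
	}
	// Time needed to commit a full buffer, rounded up to whole seconds.
	drain := (bufferSize + blocksPerSec - 1) / blocksPerSec
	// First sync tick at or after the moment the buffer is empty.
	ticks := (drain + intervalSec - 1) / intervalSec
	nextSync := ticks * intervalSec
	return nextSync - drain
}

func main() {
	// Values from the timeline: interval 30s, buffer 200, 5 blocks/s.
	fmt.Println(idleSeconds(30, 200, 5)) // buffer drains at 40s, next tick at 60s
}
```

With the assumed values the buffer drains at 00:00:40 and the next tick is at 00:01:00, which matches the pause shown in the timeline.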
Block channel is full
This may happen when the block at the next height has not been received for a long time during the blocksync process.
As we know, the mainnet config is:
dispatcher:
  blockChanSize: 1000
blockSync:
  interval: 10s
  bufferSize: 400
  maxRepeat: 3
  repeatDecayStep: 3
  intervalSize: 20
According to the following code, a node requests about 860 (400x2+20x3) blocks from its neighbours in one sync. The blockChanSize is large enough for that.
// blocksync/blocksync.go:193
bs.requestBlock(context.Background(), interval.Start, interval.End, bs.cfg.MaxRepeat-i/bs.cfg.RepeatDecayStep)
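The estimate can be checked against the channel size with a small sketch. Mapping the factors of 400x2+20x3 to bufferSize, intervalSize and maxRepeat is an assumption made here for illustration; the issue only gives the numbers, and the helper names are hypothetical.

```go
package main

import "fmt"

// estimateRequested reproduces the 400x2+20x3 estimate.
// Assumption: 400 is bufferSize, 20 is intervalSize and 3 is maxRepeat;
// the issue itself does not name the factors.
func estimateRequested(bufferSize, intervalSize, maxRepeat int) int {
	return bufferSize*2 + intervalSize*maxRepeat
}

// fitsChannel reports whether one sync's worth of requested blocks fits
// into a dispatcher block channel of the given capacity.
func fitsChannel(requested, blockChanSize int) bool {
	return requested <= blockChanSize
}

func main() {
	requested := estimateRequested(400, 20, 3)
	fmt.Println(requested, fitsChannel(requested, 1000))
}
```

A single sync therefore stays below blockChanSize, but a second sync started while earlier blocks are still queued can exceed it.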
The next sync must wait for the interval time after the last block commit. So there are two possible situations in which the next sync starts:
- Situation 1: All blocks received from the last sync have been committed. The next sync will not fill up the block channel.
- Situation 2: Some blocks in the block channel are still waiting to be committed, but the block at the next height has not been received for an interval time. The next sync may then fill up the block channel.
Two principles we want to adhere to when syncing blocks are:
- New blocks are continuously put into the statedb
- The block channel in the dispatcher should not become full while syncing (the current request strategy with fixed time intervals might need to be improved)
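One possible direction, sketched below, is to trigger requests from the buffer level and the free channel space instead of a fixed timer. This is not the existing iotex-core logic; shouldRequest, lowWater and batch are hypothetical names used only to illustrate the two principles.

```go
package main

import "fmt"

// shouldRequest decides whether to request another batch of blocks.
// Hypothetical sketch, not iotex-core code:
//   buffered - blocks waiting in the block buffer to be committed
//   lowWater - request more once the buffer drops to this level or below
//   chanLen  - blocks currently queued in the dispatcher block channel
//   chanCap  - capacity of the dispatcher block channel (blockChanSize)
//   batch    - number of blocks one request round may deliver
func shouldRequest(buffered, lowWater, chanLen, chanCap, batch int) bool {
	// Principle 1: keep committing, so refill before the buffer runs dry.
	if buffered > lowWater {
		return false
	}
	// Principle 2: never request more than the channel can still hold.
	return chanLen+batch <= chanCap
}

func main() {
	fmt.Println(shouldRequest(0, 50, 0, 1000, 860))   // buffer empty, channel free
	fmt.Println(shouldRequest(100, 50, 0, 1000, 860)) // enough blocks buffered
	fmt.Println(shouldRequest(10, 50, 200, 1000, 860)) // batch would overflow channel
}
```

The low-water mark and batch size would need tuning against the real commit speed; the values above are placeholders.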