
Get Block Sync First Block Height

Open • anthonyra opened this issue 2 years ago • 0 comments

PR Description

This is a first step toward exposing the lowest height at which a node could create a block sync batch. I've wanted this for syncing because a node currently picks peers at random from its peerbook and attempts to sync from them. Since most nodes start from snapshots, the range of blocks each one holds varies widely, and Garbage Collection now amplifies this. It can take 50-100 attempts to find a peer with the blocks you need. The goal is to filter the peerbook list by the lowest height from which each peer could serve a sync.
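As a rough illustration of the intended peer filtering, here is a minimal Python sketch (the project itself is Erlang). `PeerInfo`, its fields, and `peers_worth_trying` are hypothetical names, and the sketch assumes each peer advertises both its first batch height and its current height.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PeerInfo:
    # Hypothetical record: these fields are illustrative, not blockchain-core's.
    address: str
    first_batch_height: int  # lowest height the peer could start a sync batch at
    height: int              # peer's current chain height


def peers_worth_trying(peers: List[PeerInfo], local_height: int) -> List[PeerInfo]:
    """Keep only peers whose advertised range covers the next block we need."""
    needed = local_height + 1
    return [p for p in peers if p.first_batch_height <= needed <= p.height]
```

Note that this only checks the advertised lowest height; a peer could still have gaps above it.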

get_block_sync_first_block_height/1 - Takes the blockchain as input. It seeks to the next available block after the Genesis block, then checks whether a block sync batch could be created starting at that height by verifying that the node holds a run of consecutive blocks at least as long as the block_sync_batch_size env variable. This ensures there is at least one batch worth of blocks starting at that height. If a gap is detected, it seeks forward until it finds a height that meets this requirement.

get_block_sync_first_block_height/2 - Exposed mostly for testing, but could be useful in future features. It takes an initial height and the blockchain and finds the next block sync batch height.
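The search performed by the two functions above can be modeled in a few lines. This is a Python sketch over a sorted list of stored heights, not the Erlang code in this PR; `first_batch_height` is a hypothetical name, and the genesis height of 1 is an assumption.

```python
from bisect import bisect_right
from typing import Optional, Sequence

GENESIS_HEIGHT = 1  # assumption: the genesis block is stored at height 1


def first_batch_height(heights: Sequence[int], batch_size: int,
                       after: int = GENESIS_HEIGHT) -> Optional[int]:
    """Lowest height > `after` that starts `batch_size` consecutive blocks.

    `heights` must be sorted ascending and unique, mimicking ordered keys.
    Returns None when no such run exists.
    """
    i = bisect_right(heights, after)  # first stored height strictly after `after`
    while i + batch_size <= len(heights):
        start = heights[i]
        # Sorted unique ints are gap-free iff the last one is start + size - 1.
        if heights[i + batch_size - 1] == start + batch_size - 1:
            return start
        # A gap lies inside this window: walk to it and resume just past it.
        j = i
        while heights[j + 1] == heights[j] + 1:
            j += 1
        i = j + 1
    return None
```

Calling it with the default `after` corresponds to the /1 variant; passing an explicit starting height corresponds to /2.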

The function names are subject to change; I couldn't come up with an elegant name that describes the functions' purpose.

Notes

  • The first attempt used find_first_height_after recursively, building the list of expected blocks and comparing it to the blocks actually found. This worked and was fast, but it cost block_sync_batch_size + 1 full rocksdb reads/gets, and I figured that if the database were heavily fragmented, the resulting random reads could hinder overall performance. Instead, this method reads rocksdb once to get the lowest height after genesis, then reads it again to fetch the potential block sync batch as an iterator. It then iterates over this "snapshot" (the iterator) and checks it for gaps. If it finds a gap, it repeats the process at the cost of 2 more reads. This results in far fewer full DB reads.
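To make the read-count trade-off concrete, here is a toy Python model of the two strategies. `ToyStore` and both functions are hypothetical; the model counts one read per seek, point lookup, or range scan, and treats iterator steps inside a scan as sequential. It is not a measurement of rocksdb.

```python
from bisect import bisect_left, bisect_right


class ToyStore:
    """Ordered set of block heights that counts read operations."""

    def __init__(self, heights):
        self.heights = sorted(set(heights))
        self.present = set(self.heights)
        self.reads = 0

    def first_after(self, height):
        self.reads += 1  # one seek
        i = bisect_right(self.heights, height)
        return self.heights[i] if i < len(self.heights) else None

    def has(self, height):
        self.reads += 1  # one point lookup
        return height in self.present

    def scan_from(self, height, limit):
        self.reads += 1  # one seek, then sequential iteration
        i = bisect_left(self.heights, height)
        return self.heights[i:i + limit]


def by_point_lookups(store, batch_size, after=1):
    """First approach: probe every expected height individually."""
    start = store.first_after(after)
    while start is not None:
        expected = range(start + 1, start + batch_size)
        missing = next((h for h in expected if not store.has(h)), None)
        if missing is None:
            return start
        start = store.first_after(missing)
    return None


def by_range_scan(store, batch_size, after=1):
    """Second approach: two reads per attempt (one seek, one scan)."""
    start = store.first_after(after)
    while start is not None:
        run = store.scan_from(start, batch_size)
        if len(run) < batch_size:
            return None  # too few blocks remain for any batch
        if run[-1] == start + batch_size - 1:
            return start
        # Find where the run breaks and retry from the block after the gap.
        gap = next(k for k in range(1, len(run)) if run[k] != run[k - 1] + 1)
        start = store.first_after(run[gap - 1])
    return None
```

In this model the gap between the two strategies widens with the batch size, since the first approach pays one lookup per expected block.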

TODO

  • [ ] Update the height in a cache after a snapshot is loaded or after Garbage Collection
  • [ ] Add this height to the signed_metadata_fun, which should read it from the cache above
  • [ ] Testing Suite?

Additional Notes:

  • While building and testing this feature, I noticed an edge case in find_first_height_after: calling next when the iterator is at its end or outside its range. I looked for this in the rocksdb documentation but came up short. If you iterate outside the range after a seek, you don't get an {error, _}; instead you get what appears to be {ok, _Hash, _NextBlock}, a tuple in the reverse order from what is expected. _NextBlock is the first key when the seek targets a key outside the range, or the very last key when calling next from the last key.
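One way to defend against the behavior reported above is to validate whatever the iterator returns against the height that was asked for. This is a minimal Python sketch of that guard; `height_after` is a hypothetical helper and does not call rocksdb.

```python
from typing import Optional


def height_after(requested: int, returned: Optional[int]) -> Optional[int]:
    """Accept an iterator result only if it is strictly past `requested`.

    Rejects the two cases described above: the last key handed back again,
    and a wrap-around to the first key.
    """
    if returned is None or returned <= requested:
        return None
    return returned
```

A caller would treat `None` as "no further block" instead of trusting the returned key.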

anthonyra, Mar 01 '22 14:03