dcrd icon indicating copy to clipboard operation
dcrd copied to clipboard

Multi-peer Checklist

Open davecgh opened this issue 7 years ago • 27 comments

This issue is intended to be a running checklist of several of the planned and necessary refactors in order to properly support efficient multi-peer parallel downloads. It will be a multi-month effort and is being provided in order to help prevent duplication of effort and to avoid merge conflicts if anyone was also thinking of working in the referenced areas.

Given how drastic the overall changes will be, some of these will very likely need to be updated as the work proceeds since unexpected things invariably come up during large development efforts such as this.

  • [X] Refactor CheckConnectBlock to CheckConnectBlockTemplate -- Implemented by PR #1086
    • More accurately reflects its purpose of testing block template proposals and uses this fact to be more restrictive about the allowed inputs and to avoid performing the proof of work check. In particular, the provided block template must only build from the current tip or its parent or an error should be returned.
  • [X] Convert to a full block index in memory -- Implemented by PR #1229
    • Since every block node will be in memory, the code which reconstructs headers from block nodes means that all headers will always be served from memory which will be important since the network will be moving to header-based semantics
    • Several of the error paths can be removed since they will no longer be necessary
    • It will no longer be expensive to calculate CSV sequence locks or median times of blocks way in the past
    • It will be much less expensive to calculate the initial states for the various intervals such as the stake and voter version
    • It will be possible to create much more efficient iteration and simplified views of the overall index
  • [X] Refactor and optimize checkpoint handling to use nodes instead of full blocks -- Implemented by PR #1230
    • Since the entire block index will be in memory, this code can be significantly optimized to use a node that is already in memory versus requiring the entire block
  • [X] Refactor and optimize block locator generation -- Implemented by PR #1237
    • The block locator code can be significantly optimized once the full block index is in memory since it will no longer be necessary to consult the database
  • [X] Refactor and optimize inventory discovery -- Implemented by PR #1239
    • The code to discover inventory can be significantly optimized once the full block index is in memory since it will no longer be necessary to consult the database
    • This will also fix issue #427
  • [X] Refactor and optimize exported header access -- Implemented by PR #1273
    • Since headers can be reconstructed from a block index node and the full block index will be in memory, it will no longer be necessary to consult the database
  • [X] Refactor all code that relies on the main chain index in the database to use the new memory block index -- Implemented by PR #1332
    • This will be a major optimization for several functions because they will no longer have to first consult the database (which is incredibly slow in many cases due to the way leveldb splits all of the information across files) to perform lookups and determine if blocks are in the main chain
  • [X] Perform a database migration to remove the main chain index which will no longer be used -- Implemented by PR #1334
  • [X] Implement an efficient and well-tested chain view -- Implemented by PR #1337
    • Chain views can take advantage of the fact that all block nodes will be in memory to provide a flat view of a specific chain of blocks (a specific branch of the overall block tree) from a given tip all the way back to the genesis block
    • An example of some benefits are:
      • Efficiently comparing two views
      • Quickly finding the fork point (if any) between two views
      • O(1) lookups of nodes at a specific height
      • Possibility for more efficient block locator code through the use of the aforementioned O(1) height lookups
      • Possibility for efficient skip lists for ancestor iteration
  • [X] Refactor code to use the new chain view instead of manually tracking the best node -- Implemented by PR #1344
    • This will further speed up areas such as best chain selection, best chain membership testing, fork finding, and block locator generation
    • It will also simplify some of the more challenging parts of the code since it will more cleanly separate the chain-specific logic from the block index logic
  • [X] Refactor reorganization logic to make use of known validation status -- Implemented by PR #1367
    • This will help ensure that no more than one attempt is made to reorganize to an invalid chain and valid blocks are not evaluated more than once during reorganizations
    • It will avoid a lot of potential unnecessary extra work when moving to parallel insertion in the future
    • It will allow more accurate reporting on the order in which historical side chains tips (aka orphaned blocks) were handled
  • [X] Add support to the block index for marking dirty nodes and flushing them to the database -- Implemented by PR #1375
    • This will allow validation states to be set on arbitrary nodes and persisted
  • [X] Refactor logic not specifically related to inputs out of CheckTransactionInputs -- Implemented by PRs #1452, #1453, #1457, and #1468
    • This will allow all sanity checks to be performed on blocks that do not yet connect due to arriving out of order in parallel
    • It will also reduce the time required to connect the block to the best chain
  • [X] Change the UTXO view semantics to include the tip block unless it is disallowed by voters -- Implemented by PRs #1471 and #1520
    • The logic is currently backwards in that the current UTXO set does not include the tip block until it is approved by voters which leads to a plethora of undesirable behavior
    • This will allow all UTXO-related code to be significantly simplified
    • It optimizes for the typical case
    • It will fix issue #618
    • It will provide a path to more easily supporting a UTXO cache
  • [X] Convert chain to perform direct single-step reorganizations with reversion to original tip in case of failure -- Implemented by PR #1500
    • This will significantly reduce he memory consumption of large reorgs
    • It is a much more cache-friendly approach
    • It will provide a path to more easily decoupling the chain processing and connection code from the download logic
  • [X] Remove all chain state from block manager -- Implemented by PR #1416 and #2498
    • This will help ensure all of the chain-related state logic lives in the blockchain package
    • It will also provide a nice performance boost for various chain-state query functions since the block manager acts as a big lock
  • [X] Decouple the chain processing and connection code from the download logic -- Implemented by PR #2518
    • This will ultimately allow blocks to be downloaded and stored based on the known headers even when all of the ancestors are not yet known
    • It will provide a path to remove orphan (unknown parents) handling from chain
    • It will allow the code related to the live tickets and UTXO set to be further simplified and optimized
  • [X] Convert the UTXO view to store and work on a per-output basis instead of at a transaction level -- Implemented by #2540
    • This simplifies the code and paves the way for a UTXO cache
    • Optimizes runtime performance.
  • [X] Implement a UTXO cache -- Implemented by PR #2591
    • Since connection is necessarily linear and all inputs reference previous outputs, performing the updates in memory with periodic writes to the database will allow intermediate states to effectively be skipped
  • [X] Refactor the block manager to a netsync package -- Implemented by PR #2500
    • This will make it easier to add unit tests and paves the way for modifying the existing syncing logic
    • It will require several steps as well
    • [X] Reduce the shared state between block manager and server -- Implemented by PR #1735
    • [X] Introduce a config struct for block manager to decouple it from the main package config -- Implemented by PR #1728
    • [X] Remove the dependency on serverPeer -- Implemented by PR #1735
    • [X] Remove the dependency on global config variable -- Implemented by PR #2497
    • [X] Split progress logging to internal package -- Implemented by PR #2499
  • [X] Rework the existing headers-first logic to download all of the headers instead of alternating based on checkpoints -- Implemented by PR #2555
    • This will allow the headers to be stored independently and ensure they all connect as well as providing all of the information needed to determine exactly which blocks comprise the chain with the most work without relying on checkpoints
  • [X] Rework the block download logic to use the newly available full header information while still being linear -- Implemented by PR #2555
    • This will break the reliance on blindly requesting newer blocks from peers
    • It will also pave the way towards implementing the logic necessary to request blocks from multiple peers
  • [ ] Modify the block download logic to request blocks from multiple peers in parallel
    • [X] Track the best known block announced by each peer as an efficient mechanism to discover which blocks are available to download from each peer -- Implemented by PR #3443
    • [X] Update tracking of inflight blocks to include the requested peer -- Implemented by PR #3444
    • [ ] Refactor the header sync peer to be part of the header sync state
    • [ ] Request blocks from multiple peers in parallel using the information added by the previous steps
    • [ ] Deprecate all code related to the notion of a sync peer
    • This will also involve several other steps such re-requesting any data that has not been delivered after a certain time, weighting towards faster peers, and blacklisting misbehaving peers

davecgh avatar Mar 10 '18 19:03 davecgh

Updated issue description for PR #1229.

davecgh avatar May 27 '18 02:05 davecgh

Update issue description for PR #1230.

davecgh avatar May 27 '18 12:05 davecgh

Updated issue description for PR #1237.

davecgh avatar May 29 '18 22:05 davecgh

Updated issue description for PR #1239.

davecgh avatar May 29 '18 23:05 davecgh

Updated issue description for PR #1273.

davecgh avatar Jun 09 '18 02:06 davecgh

Updated issue description for PR #1332.

davecgh avatar Jul 02 '18 20:07 davecgh

Updated issue description for PR #1334.

davecgh avatar Jul 03 '18 16:07 davecgh

Updated issue description for PR #1337.

davecgh avatar Jul 06 '18 18:07 davecgh

Updated issue description for PR #1344.

davecgh avatar Jul 09 '18 20:07 davecgh

Updated issue description for PR #1367.

davecgh avatar Jul 21 '18 09:07 davecgh

Updated issue description for PR #1375.

davecgh avatar Jul 25 '18 08:07 davecgh

Updated issue description for PR #1416.

davecgh avatar Aug 28 '18 19:08 davecgh

Updated issue description for PRs #1452 and #1453.

davecgh avatar Sep 14 '18 03:09 davecgh

Updated issue description for PR #1457.

davecgh avatar Sep 14 '18 07:09 davecgh

Updated issue description for PR #1468.

davecgh avatar Sep 22 '18 00:09 davecgh

Updated issue description for PR #1500.

davecgh avatar Oct 16 '18 19:10 davecgh

Updated issue description for PRs #1471 and #1520.

davecgh avatar Nov 09 '18 23:11 davecgh

Updated issue description for PR #1728 and #1735.

davecgh avatar Dec 09 '20 21:12 davecgh

Updated issue description for PR #2497.

davecgh avatar Dec 09 '20 21:12 davecgh

Updated issue description for PR #2498.

davecgh avatar Dec 10 '20 07:12 davecgh

Updated issue description for PR #2499 and #2500.

davecgh avatar Dec 10 '20 21:12 davecgh

Updated issue description for PR #2518.

davecgh avatar Dec 23 '20 22:12 davecgh

Updated issue description for PR #2540.

davecgh avatar Jan 01 '21 16:01 davecgh

Updated issue description for PR #2555.

davecgh avatar Jan 17 '21 10:01 davecgh

Updated issue description for PR #2591.

davecgh avatar Feb 12 '21 05:02 davecgh