[DNM] PIBD Task / Issue Tracker

yeastplume opened this issue on Feb 21, 2022 · 3 comments

This issue is a task tracker for PIBD work, and also handily displays all PIBD implementation-related changes on the pibd_impl branch thus far. Outstanding tasks and issues that need investigation are outlined below, followed by a list of recently completed (and thus far unreviewed) PRs on the pibd_impl branch. Note that considerable work was done on the segmentation and network messaging side prior to this round of work.

This posting will be kept up to date as work progresses; feel free to discuss, ask questions, or raise issues in the comments.

General Progress

  • Code for segmentation and desegmentation based on an archive block header is in place, as well as a first version of all related peer and network functions (see the sketch after this list for the general segmentation idea).
  • Sync code is in place that performs a PIBD reconstruction of the txhashset instead of the txhashset.zip download when running on testnet only.
  • As of #3694, a first rough-and-ready version is contained on the pibd_impl branch, and a node has successfully synced from scratch on testnet via PIBD.
  • As of #3704, I'm happy enough with the state of it and think it has progressed enough for review and merge into master (with PIBD enabled on testnet only for the moment, or possibly behind a config flag that defaults to running PIBD on testnet only).
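
As a rough illustration of the segmentation scheme mentioned above, here's a minimal Rust sketch. The type and function here are hypothetical stand-ins rather than grin's actual segment API: a PMMR is carved into segments identified by a height and an index, where a segment of height h covers 2^h leaves, and the desegmenter requests each one and reassembles them, verifying each segment's Merkle proof against the root committed to in the archive block header.

```rust
/// Identifies one segment of a PMMR (hypothetical; mirrors the general
/// (height, index) scheme described in the PIBD RFC material).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct SegmentIdentifier {
    /// log2 of the number of leaves the segment covers
    height: u8,
    /// zero-based index of the segment within the MMR
    idx: u64,
}

/// Enumerate all segment identifiers needed to cover `n_leaves` leaves
/// at a fixed segment height.
fn required_segments(n_leaves: u64, height: u8) -> Vec<SegmentIdentifier> {
    let per_segment = 1u64 << height; // 2^height leaves per segment
    let n_segments = (n_leaves + per_segment - 1) / per_segment; // ceiling division
    (0..n_segments)
        .map(|idx| SegmentIdentifier { height, idx })
        .collect()
}

fn main() {
    // e.g. 1000 leaves at segment height 5 (32 leaves per segment) => 32 segments
    let segs = required_segments(1_000, 5);
    println!("{} segments required", segs.len());
}
```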

RFCs

Outstanding issues

  • [x] The segmenter code (all work performed before I picked this up) is currently sending a number of spent outputs within segments that need to be manually removed from the leaf set when reconstructing the tree. Determine why this is happening and whether this should be adjusted.
    • See first comment below for a history of this
  • [x] Related to the above, determine if the segmenter is also sending spent rangeproofs, which would be extremely wasteful in terms of network activity.
    • See first comment below, short answer: Yes it is.
  • [ ] Determine the earlier reasoning behind why a seemingly redundant list of output positions needs to be sent as part of a segment, as this information should be available via the bitmap combined with segment-size calculations.
  • [ ] Based on the above, decide if the existing protocol needs to be changed; note this would delay deployment, as (rust) nodes on the existing network already respond to PIBD segment requests according to the existing protocol outlined in the RFC. Note that Grin++ nodes don't have PIBD messages implemented, nor do they do any pruning.
  • [x] Check that the process is able to properly continue if a segment request times out or doesn't complete. At the moment the networking layer keeps track of which segments it has already requested in order to prevent them being requested again; as far as I can tell, this list isn't currently being updated on a timeout or other request error (see the request-tracking sketch after this list).
    • #3696 partially covers this; we still need a mechanism to remove segments from the list of requested segments in sync_state if a peer does not respond with a segment for any reason.
    • #3698 covers the rest of this; any remaining issues should (hopefully) be shaken out during testing.
  • [x] Check and document threading assumptions. At the moment the entire process is simply continued via continue_pibd in the main sync thread, and each sync loop compiles a list of required PIBD segments and requests those that haven't yet been requested. This keeps things simple and within the existing design, which is good. However, a separate thread is also started to periodically check how 'complete' the trees are and to kick off validation when they're complete. This will be updated as per the issue below, but we still want to ensure this model will suit our needs.
    • In keeping with the existing sync design, this extra thread was removed and all requests + validation now happen within the sync thread (with the actual kernel checking + application in peer response threads). All seems to work well enough at the moment, and there's likely no need to complicate the threading further unless a reason to do so is identified.
  • [x] At the moment all 3 PMMRs are reconstructed, and only when they match the archive header is the whole thing validated. Much of the validation can obviously occur in a separate thread while waiting for segments (particularly rangeproof + kernel validation), and the body head can be updated as the process continues, allowing for partial rollbacks if errors occur or the horizon window rolls over. The DB will need to keep track of the state of the validation process.
    • To clarify, PMMR (i.e. segment Merkle-proof) validation happens during PMMR reconstruction, while rangeproof + kernel sig validation + kernel sum validation happen after the txhashset download is complete. After some experimentation, the continual locks on PMMRs during PIBD mean there doesn't appear to be much time regained by trying to multithread the rangeproof validation, so unless we run into severe performance issues it's probably best to leave the final rangeproof + kernel sig validation as a separate step, as we currently do.
  • [x] The validation process thread isn't killed properly on server shutdown (note this is true for the txhashset validation method as well). Stopping the server during the (long) validation process requires the entire thing to be restarted on the next startup. The validation process should be made more granular, in keeping with the issue above.
    • Longer-running validation processes, such as rangeproof + kernel sig checking, now pay attention to the stop state, mitigating this somewhat. The entire validation still restarts if the process is killed, but I think it's usable in its current state.
  • [ ] Determine behaviour when the horizon window changes during the PIBD process. At the moment the thinking is to roll back everything and start again, but there might be something more clever we can do here, particularly if we validate the txhashset state up to a certain block height as PIBD is occurring (i.e., we could continue the process using the updated archive header so long as the PMMRs reconstructed thus far are valid).
  • [x] The bitmap accumulator downloaded by the PIBD process is currently not stored anywhere, and is re-requested on every startup. This doesn't actually appear to be particularly undesirable, as it's only 2 or 3 segments at the moment, but determine if we want to cache it for use between invocations (may have to consider future-proofing).
    • I think this is fine as is; the time added to the process is marginal, and it helps ensure the node is using the correct bitmap on startup without any added complexity, which helps if someone stops and restarts their node over a horizon boundary change.
  • [ ] At the moment when any error is encountered while reconstructing the txhashset, all PMMRs are rolled back to the genesis entry and the entire process starts again. Determine if this is desirable or whether a partial rollback can be implemented.
  • [ ] Rollback of the prune list does not appear to have ever been implemented, the assumption being that it never needed to be, as rollbacks never happened beyond the horizon. At the moment, when an error occurs in PIBD and the state of the txhashset is rolled back to block 0, the prune list state is retained. An extra function has been implemented to reset the prune list under these circumstances, but we need to determine whether a proper rollback is required, particularly if we want to support partial rollbacks when errors are found while reconstructing the txhashset.
  • [x] Determine how/whether and under what conditions we want to 'give up' on PIBD during a sync, and revert to a txhashset.zip download
    • see #3704
  • [x] Verify behaviour when syncing the body after a node has been offline for a while: whereas previously it might have reconstructed the entire txhashset via txhashset.zip, it can now simply request the segments it needs and validate up to the new archive header. Investigate this.
  • [ ] Think about / determine the number of peers we want to request segments from at a time
  • [ ] Peers are just chosen at random at the moment; determine whether this is okay
  • [ ] Determine how and whether to ban a peer on receiving a bad segment
  • [x] Determine how we want to display PIBD progress in the TUI. Right now it displays a notion of 'segments downloaded up to representative block x'; would it make more sense to display a total segment count here instead?
    • Currently displays the total number of leaves, which is the only thing that makes sense in all cases, given that segment sizes can change.
  • [ ] Determine if segment height defaults can be left as is, or whether they need to be dynamically adjusted
    • Note, from comments in https://github.com/mimblewimble/grin/pull/3453: "There is an optimization in the above, if the requester can specify the segment size/height. Given the output bitmap they can determine which segments are empty and request larger segments (output and rangeproof) that cover multiple empty segments (or combination of empty and adjacent non-empty segments) of the MMR."
  • [ ] Do we need flags in config file to turn off/on PIBD and/or force it?
  • [ ] Determination of whether and how NRD kernels need to be considered in the context of PIBD
  • [ ] Many TODOs throughout the code to deal with unwraps or clean up redundant code, etc
  • [ ] Much testing on testnet
  • [ ] Documentation / RFC of sync process when progress of all above is satisfactory
  • [ ] Much reviewing of all code before merging into master and turning on PIBD sync on mainnet
  • [ ] Much testing on mainnet
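
As referenced in the segment-timeout item above, here's a minimal Rust sketch of the requested-segment bookkeeping. The names (SegmentId, RequestTracker) are hypothetical stand-ins rather than grin's actual sync_state API; the point is just that in-flight entries must be swept on timeout so those segments become eligible for re-request from another peer.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Stand-in for a (tree, height, idx) segment identifier.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct SegmentId(u64);

/// Tracks which segments have been requested and when (hypothetical type).
struct RequestTracker {
    in_flight: HashMap<SegmentId, Instant>,
    timeout: Duration,
}

impl RequestTracker {
    fn new(timeout: Duration) -> Self {
        RequestTracker { in_flight: HashMap::new(), timeout }
    }

    /// True if the segment is not currently in flight and may be requested.
    fn should_request(&self, id: SegmentId) -> bool {
        !self.in_flight.contains_key(&id)
    }

    /// Record that a request for `id` has just been sent to a peer.
    fn mark_requested(&mut self, id: SegmentId) {
        self.in_flight.insert(id, Instant::now());
    }

    /// Called when a valid segment response arrives.
    fn mark_received(&mut self, id: SegmentId) {
        self.in_flight.remove(&id);
    }

    /// Drop stale entries so the next sync-loop pass re-requests them.
    fn sweep_timeouts(&mut self) {
        let timeout = self.timeout;
        self.in_flight.retain(|_, sent| sent.elapsed() < timeout);
    }
}

fn main() {
    let mut tracker = RequestTracker::new(Duration::from_secs(10));
    let seg = SegmentId(42);
    if tracker.should_request(seg) {
        tracker.mark_requested(seg); // request sent; now in flight
    }
    tracker.sweep_timeouts(); // nothing stale yet
    assert!(!tracker.should_request(seg));
    tracker.mark_received(seg); // response arrived and was applied
    assert!(tracker.should_request(seg));
}
```

In this model the sync loop would call sweep_timeouts once per pass, before compiling the list of segments still needed, so a peer that never responds can't permanently block a segment.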

Completed (Mostly unreviewed but merged into pibd_impl branch)

  • #3665
  • #3667
  • #3685
  • #3686
  • #3688
  • #3689
  • #3690
  • #3691
  • #3692
  • #3694
  • #3696
  • #3698
  • #3699
  • #3700
  • #3702
  • #3703
  • #3704
  • #3705
  • #3711

yeastplume commented on Feb 21, 2022

Some thinking and GitHub archaeology around the 'pruning coupling' issue above:

  • Compaction originally removed all pruned leaves

  • Merkle proofs were originally desired to allow spending of timelocked coinbase outputs without having access to the full block:

    • https://github.com/mimblewimble/grin/pull/699
  • Merkle proof implementation here

    • https://github.com/mimblewimble/grin/pull/716
    • Spending a coinbase output requires the block hash and merkle proof to be provided in the input (to verify coinbase maturity without requiring full block)
  • In the original implementation, where all pruned leaves were removed, it was impossible to reliably generate a Merkle proof for a coinbase output after a rewind.

    • It was decided that the solution was to change the check_compact function to maintain leaf siblings in the remove log and underlying file, only purging them when their parents are removed
    • https://github.com/mimblewimble/grin/pull/720
    • https://github.com/mimblewimble/grin/pull/753#issuecomment-372130599
  • It seems that the alternate solution here could have been to store hashes of pruned leaves with siblings, instead of retaining all the data.

  • In the current code, the wallet is not required to provide Merkle proofs to spend coinbase outputs; the node checks their maturity by looking up the height in the chain instead (made possible by output_mmr_size now being committed to in the block header).

    • https://github.com/mimblewimble/grin/pull/1203
  • So the Merkle proof and 'prune coupling' changes are no longer needed at this stage.

  • However, Merkle proofs are needed for PIBD, though these could also be generated from hashes instead of stored siblings. They are also beyond the rollback horizon, so there shouldn't be any rewind concerns.

  • Other (possibly relevant) PRs:

    • https://github.com/mimblewimble/grin/pull/1015
    • https://github.com/mimblewimble/grin/pull/3453
    • https://github.com/mimblewimble/grin/pull/3487 (Merkle proofs not actively used by anyone)
  • Questions are:

    • Will Merkle proofs be required in the 'rewindable' portion of the chain (for which original blocks are stored in the DB anyhow, and from which hashes can be reconstructed if absolutely required)?
    • Are there any properties of a rewind, or cases, in which proofs can't be generated from hashes (or, in a pinch, reconstructed from the block DB)?
  • So to summarize:

    • Coupling behaviour was implemented to allow for Merkle proofs in all cases, with the intention that Merkle proofs would be used to validate the maturity of coinbase outputs.
    • It's not clear to me whether the decision to keep sibling data around, as opposed to storing hashes for pruned siblings, was made for a reason other than ease of implementation.
    • With output_mmr_size added to the block header, coinbase maturity could be validated by nodes without wallets supplying a Merkle proof.
    • Merkle proofs went unused by anything until the advent of PIBD.
    • Rollback or rewind concerns won't be an issue with PIBD, as it all happens beyond the horizon, so it's difficult to see why we can't just store and transmit a hash instead of complete leaf data for a pruned node with an unpruned sibling (see the sketch after this list).
  • Based on the above, I can only really conclude that the 'pruning coupling' behaviour is legacy behaviour, which may have had less of an impact when the only adverse implication was (cheap) local node storage. However, with PIBD this choice affects the amount of network data that has to be sent, particularly unneeded rangeproofs.

  • The PIBD segmenter sends these unpruned siblings in its segment data, and the recipient needs to apply and then immediately prune them based on their presence in the output bitmap.

  • The coupling behaviour is fairly deep-rooted in the implementation, and changing it would be very time consuming, with an RFC likely needed.

  • Caveat that I could be missing something.
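
To illustrate the 'store hashes instead of sibling data' point referenced in the summary above, here's a minimal Rust sketch. The hash function is a stand-in (std's DefaultHasher), not grin's actual MMR hashing, and the positions and helper functions are illustrative only: rebuilding the parent of a pruned leaf, and hence any Merkle proof passing through it, needs only the pruned sibling's hash, never its full output + rangeproof data.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in node hash type; grin's real MMR uses a proper cryptographic hash.
type NodeHash = u64;

/// Hash a leaf's raw data together with its MMR position (illustrative).
fn leaf_hash(pos: u64, data: &[u8]) -> NodeHash {
    let mut h = DefaultHasher::new();
    pos.hash(&mut h);
    data.hash(&mut h);
    h.finish()
}

/// Combine two child hashes into their parent's hash (illustrative).
fn parent_hash(pos: u64, left: NodeHash, right: NodeHash) -> NodeHash {
    let mut h = DefaultHasher::new();
    pos.hash(&mut h);
    left.hash(&mut h);
    right.hash(&mut h);
    h.finish()
}

fn main() {
    // Sibling leaves at illustrative positions 1 and 2, with parent at 3.
    let unspent = leaf_hash(1, b"unspent output + rangeproof");
    let spent = leaf_hash(2, b"spent output + rangeproof");

    // The parent is reachable from the spent leaf's *hash* alone; the spent
    // leaf's full data never needs to be stored or transmitted for this.
    let parent = parent_hash(3, unspent, spent);

    // A Merkle proof for the unspent leaf therefore needs only `spent` (a
    // hash), which is why sending spent leaf data in PIBD segments looks
    // avoidable.
    println!("parent hash: {:x}", parent);
}
```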

yeastplume commented on Feb 23, 2022

Is this ready for review, or should I wait until the outstanding issues are resolved?

DavidBurkett commented on Feb 27, 2022

> Is this ready for review, or should I wait until the outstanding issues are resolved?

The code as it stands is nowhere near ready for proper review, but everything about PIBD and all issues listed above are ripe for discussion and comment.

yeastplume commented on Feb 27, 2022

I think this is ready for merging into master, with PIBD used only for testnet (unless custom compiled). We'll encourage more people to run testnet nodes, and to stop them from time to time over a few weeks, maybe with regular forum posts to remind people.

tromp commented on Oct 17, 2022