ouroboros-consensus icon indicating copy to clipboard operation
ouroboros-consensus copied to clipboard

Simplify time conversions in Consensus

Open dnadales opened this issue 1 year ago • 5 comments

This issue captures some changes that can be implemented to reduce or localize slot-to-wallclock conversions in our code.

  • [ ] 1) We could replaceblockForgeUTCTime with headerForgeUTCTime, since the equivalent header is in scope.
  • [ ] 2) We could annotate headers that have been validated by ChainSync with their UTC Time, which is calculated here.
  • [ ] 3) We take the following two independent paths:
    • [ ] a) Remove HeaderStateWithTime and use the header's annotation for the historicity check instead. The only caveat here is that the MsgAwaitReply part of the historicity check currently relies on having a slot time for the anchor in case the fragment is empty. Possible approaches are:

      1. Stop relying on this and always disengage (or even disconnect) a peer that sends MsgAwaitReply when their candidate fragment is empty (remember that candidate fragments are anchored in a recent immutable tip). This would be more zeleaous than the current historicity check, so it requires proper justification. It seems very plausible to us, but it doesn't seem to immediately follow from any of the "usual" properties like Chain Growth or Existential Chain Quality.
      2. Introduce wrapper on top of AnchoredFragment that also stores extra data (the slot time of the anchor) for use for the candidate fragments. We would need to implement functions like intersect for this that properly update the extra anchor data.[^1]
    • [ ] b) Propagate HeaderWithTime into the Diffusion Layer's interface so that headerForgeUTCTime simply returns the annotation from the header. This will remove an invocation of slotToWallclock, instead reusing the header's time annotation.

Observation: Steps 2 and 3b are not really independent. 2 has a significant performance cost without 3b (because discarding the headers requires a lot of allocation). So, both probably need to be in a PR together, but they could be different commits.

[^1]: With infinite time, it would be neat to generalize the AnchoredFragment API to allow custom extra data for the anchor; but it is rather questionable that this is a useful thing to pursue right now.

dnadales avatar Nov 06 '24 12:11 dnadales

Regarding

We'll either need to also separately store the slot onset of the candidate fragment's anchor, to support the historicity check on an empty fragment OR --- at first blush --- it seems harmless to just skip the historicity check when the current candidate is the empty fragment.

Esgen wrote:

When the current candidate is empty, a rollback can't actually roll back a header, so it will either be the trivial empty rollback (already not checked today) or it will roll back to before the anchor, in which case we disconnect either way. However, for MsgAwaitReply, the historicity check is still necessary to perform even if the current candidate is empty (which can easily happen when an adversary connects to us by accepting the intersection to our immutable tip).

nfrisby avatar Nov 06 '24 13:11 nfrisby

After our discussion in the call just now, I have a few higher-level thoughts to record/emphasize.

  • AnchoredFragment is a specialization of AnchoredSeq. It constrains what the anchor could be (type AnchoredFragment block = AnchoredSeq (WithOrigin SlotNo) (Anchor block) block), and it offers some useful functions, espeically AF.intersect.

  • The node's selection and each ChainSync client's candidate fragment are both AnchoredFragments. Notably, AF.intersect is used to update the candidate fragment when the selection has changed (not necessarily every time it changes).

  • This makes it somewhat awkward to change the contents of the candidate fragment without also changing the contents of the node's selection, which is an important part of the API. "Just map forget over the internal TVar to hide the extra data" might be OK, but it also seems like it could incur inefficiency.

  • The existing HeaderStateHistory uses the selection's available ledger states to compute the necessary times when the candidate fragment is intersected with the selection. We could do the same when the candidate fragment itself carries the times. This also seems inefficient (but only per-MsgFindIntersect).

So, in terms of a final target: I suspect we should be adding times to the ChainDB's internal TVar in addition to the ChainSync clients' candidate fragments. Perhaps the API for the selection should be exists x. (AnchoredFragment x, x -> blk), for the sake of extensibility. (I have not considered the best series of steps to reach that point, which is a focus of this Issue.)

nfrisby avatar Nov 06 '24 14:11 nfrisby

  • This makes it somewhat awkward to change the contents of the candidate fragment without also changing the contents of the node's selection, which is an important part of the API. "Just map forget over the internal TVar to hide the extra data" might be OK, but it also seems like it could incur inefficiency.

Yeah, changing the selection to also track the slot times would make that cleaner (and would be an overall nice goal as you say).

However, it requires changes to ChainSel (in particular: how/where exactly do we get the slot time for blocks we load from disk, both on startup and when switching forks), so it seems potentially nice to have an incremental path ("best series of steps to reach that point" as you say) where the candidate fragments are already annotated with time, but the selection isn't (yet).

#1288 is currently written in that style, and @dnadales also made sure that the BlockFetchConsensusInterface can be adapted accordingly (see https://github.com/IntersectMBO/ouroboros-network/pull/5010).

amesgen avatar Nov 06 '24 15:11 amesgen

FTR: See the PR description of https://github.com/IntersectMBO/ouroboros-consensus/pull/1288#issue-2604967619 for a concrete work item.

amesgen avatar Dec 02 '24 16:12 amesgen

In PR #1288, we are computing slot times in ChainSel for the new blocks.

Under nice and simple circumstances, those calculations are redundant, having already happened in the ChainSync clients.

However, a few things prevent that slot time from being available.

  • If the blocks arrive out of order, then the "currently processed block" in ChainSel might not be the tip of the new blocks. The only way such younger blocks could have the annotation readily available in the ChainSel logic is if the VolDB recorded those annotations (maybe on disk, maybe just in mem with recalc on init, etc). That's plausible, but that's a more disruptive change than we want to start with.

  • Blocks arrive in three ways: via ChainSync client, via init, and via our local block mint. The first two already do time calculations, by this Issue's design. The third does not currently, but it could.

nfrisby avatar Jan 06 '25 15:01 nfrisby