consensus-specs
Restrict best LC update collection to canonical blocks
Currently, the best LC update for a sync committee period may refer to blocks that have later been orphaned, if they rank better than canonical blocks according to is_better_update. This was done because the most important task of the light client sync protocol is to track the correct next_sync_committee. However, practical implementation is quite tricky because existing infrastructure such as fork choice modules can only be reused in limited form when collecting light client data. Furthermore, it becomes impossible to deterministically obtain the absolute best LC update available for any given sync committee period, because orphaned blocks may become unavailable.
For these reasons, a `LightClientUpdate` should only be served if it refers to data from the canonical chain as selected by fork choice. This also assists future efforts toward reliable backward sync.
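As a rough sketch of the restriction (not actual spec code; the `Store` shape and helper names here are illustrative assumptions), a server would check that the update's attested block is the block fork choice selects at that slot before serving it:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Header:
    root: str
    slot: int

@dataclass
class LightClientUpdate:
    attested_header: Header

@dataclass
class Store:
    # Hypothetical view of fork choice output: slot -> canonical block root.
    canonical_roots: Dict[int, str]
    best_update_by_period: Dict[int, LightClientUpdate] = field(default_factory=dict)

def is_canonical(store: Store, root: str, slot: int) -> bool:
    # Canonical means fork choice selects this block at its slot.
    return store.canonical_roots.get(slot) == root

def serve_best_update(store: Store, period: int) -> Optional[LightClientUpdate]:
    update = store.best_update_by_period.get(period)
    if update is None:
        return None
    h = update.attested_header
    # Only serve updates whose attested block is on the canonical chain;
    # an update referring to an orphaned block is withheld.
    return update if is_canonical(store, h.root, h.slot) else None
```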
- consensus-spec-tests (based on
v1.5.0-alpha.8)
Could you please elaborate on the complication aspect? It led to quite a bit of simplification in the case of Nimbus:
- https://github.com/status-im/nimbus-eth2/pull/5613
One notable aspect is that even with the old system, a proper implementation needs to track separate branches; they are just on a per-period basis. That is, it tracks the best `LightClientUpdate` for each `(period, current_sync_committee, next_sync_committee)`. Only once finality advances can this be simplified to tracking the best `LightClientUpdate` per `period`. This can be tested with the minimal preset, where non-finality across an entire sync committee period is feasible.
With the new system, that remains the same, but you track the best `LightClientUpdate` for each non-finalized block, the same way many other aspects are tracked for the purpose of fork choice.
So, similar to regular fork choice (which is already present):
- When a new block is added, compute the data and attach it to the memory structure.
- When a new head is selected, read from the memory structure and persist to database.
- On finality, purge from the memory structure.
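The three steps above can be sketched as a small collector class (a minimal illustration, not the Nimbus implementation; the method and field names are assumptions):

```python
from typing import Any, Dict, Set

class LightClientDataCollector:
    """Tracks the best LightClientUpdate per non-finalized block,
    mirroring the fork-choice-like lifecycle described above."""

    def __init__(self) -> None:
        self.cached: Dict[str, Any] = {}     # block_root -> update data (memory)
        self.persisted: Dict[int, Any] = {}  # period -> update data ("database")

    def on_block(self, block_root: str, update_data: Any) -> None:
        # New block added: compute the data and attach it to the memory structure.
        self.cached[block_root] = update_data

    def on_head(self, head_root: str, period: int) -> None:
        # New head selected: read from the memory structure and persist.
        if head_root in self.cached:
            self.persisted[period] = self.cached[head_root]

    def on_finality(self, retained_roots: Set[str]) -> None:
        # Finality: purge memory entries for blocks that are no longer viable
        # (everything not descending from the finalized checkpoint).
        self.cached = {r: d for r, d in self.cached.items() if r in retained_roots}
```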
And, because the best `LightClientUpdate` doesn't change that often, you can deduplicate the memory using a reference count (or just use a `ref object` and have the language runtime itself deal with the count).
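In a language with managed references, the runtime plays the role of Nim's `ref object` automatically. A toy Python illustration of the deduplication idea (the dicts here are stand-ins, not real spec types): many per-block entries can point at one shared update object, and the object is freed only once the last entry is purged.

```python
# One shared best-update object referenced by many block entries: the
# per-block map stores references, not copies, so memory stays
# deduplicated, and Python's reference counting reclaims the object
# once every entry holding it has been purged.
best_update = {"period": 42}  # stands in for a LightClientUpdate
per_block = {f"block_{i}": best_update for i in range(100)}

# All 100 entries alias the same single object.
assert all(u is best_update for u in per_block.values())
```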
Regarding "little to no benefit", I think having canonical data made available on the network allows better reasoning.
- No other API exposes orphaned data (except possibly when explicitly requested via a by-root request).
- It also avoids complications when feeding the data into the Portal network, because different nodes won't end up storing different versions of the data in the regular case.
- Furthermore, it unlocks future backfill protocols for syncing the canonical history without recomputing it from the local database. Such a backfill protocol can include proofs of canonicity with the data, ensuring that, for example, no one can serve an arbitrary history that merely ends at the same head sync committee and then have your node redistribute that possibly malicious early history (leading to the verifiable head sync committee) to others.
- Finally, it allows providing a reference implementation with pyspecs, to ensure that most BNs are computing the same history for the same chain.
- Other implementations are not disallowed; it's a "should not", not a "shall not".
From offline chat, would be great to define a direction for a backfill spec to make the motivation for this PR stronger
https://hackmd.io/@etan-status/electra-lc
minimal.zip
Extra test vectors based on v1.4.0-beta.7
✅ Nimbus 24.2.2 passing the additional tests.
@hwwhww anything still blocking this?