lighthouse icon indicating copy to clipboard operation
lighthouse copied to clipboard

Shift subnet backbone structure

Open AgeManning opened this issue 4 years ago • 4 comments

Description

As described in https://github.com/ethereum/consensus-specs/issues/2749 we may need to shift our subnet backbone structure to having one subnet per node.

This will simplify the attestation service in the network crate. This should not be merged to unstable as the issue is still in an experimental phase. It just might be nice to have a PR ready to participate in new testnets formed.

AgeManning avatar Dec 01 '21 06:12 AgeManning

I can work on this guy.

pawanjay176 avatar Dec 01 '21 20:12 pawanjay176

Yeah cool. The mapping between node-id/public-key has not been spec'd yet, so we can't actually build this yet, so this issue is more of a tracking issue for the time being. We could put in a stub for the mapping and make the apporiate code changes in the mean-time if we wanted

AgeManning avatar Dec 02 '21 01:12 AgeManning

There is some duplicate code between sync committees and the attestation service. In this update, we should try and group all this logic as the entire attestation service will need to be modified.

Things we need to consider:

  • Peer count and peer management: This change will mean that each peer will be subscribed to one long-lived subnet. We would like to avoid peer-churn, so as we cycle through required subnets, it would be nice to try and maintain 1 peer on each subnet. I think we should increase our default target-peer count to around 70 or 80. When we are pruning peers I think we should prune by score, then by how crowded a subnet is. I.e we have a choice of equal peers but have 2 or 3 on a given subnet, we prune one of those 2 or three peers. Over-time this should help us balance peers uniformly accross subnets.
  • Attestation service: We need to remove the logic for long-lived subnets.
  • ENR: Remove the ENR mapping field. For backwards compatibility we may temporarily leave it but update it based on our node-id.
  • Discovery searches: Based on the node-id mapping we should update our target searches so search for specific node ids. This will remove the ability to do "grouped" subnet queries as we are now looking for a specific node-id. We can limit the target peers tho to speed up discovery queries
  • Group the sync-comittee logic with the attestation subnet logic

AgeManning avatar Jan 24 '22 01:01 AgeManning

Update with more specific details:

The spec issue contains the latest updates on what we are trying to achieve. From an implementation side:

We have two files: https://github.com/sigp/lighthouse/blob/stable/beacon_node/network/src/subnet_service/attestation_subnets.rs and https://github.com/sigp/lighthouse/blob/stable/beacon_node/network/src/subnet_service/sync_subnets.rs

which handle attestation and sync subnet logic. There a quite a bit of code duplication going on here. My thoughts were to unify these two. I was thinking we could be either:

  • Build a generic struct that can be used by both sync subnets and attnet subnets
  • Make a single struct that can handle both

The above changes are more of a nice to have, figured we could build this in whilst we are making these modifications.

The actual issue involves subscribing to a fixed constant amount of subnets (hopefully 1 for mainnet) per node-id (i.e beacon node). The exact mapping is given in the specs and uses the shuffling functions we already have in the beacon chain.

To do this, we need to modify the AttestationService. So that is only subscribes to one long-lived subnet.

Some points to consider:

  • We should make this backwards compatible. This means that although its not strictly required moving into this new regime, we should continue to update our ENR to indicate which subnets we are subscribed to (long term).
  • We should implement a new discovery function which searches for target node-ids which should correspond to nodes that are subscribed to subnets we want. The issue here is that old-notes won't abide by this, so potentially we search for random targets also in parallel.
  • We may need to increase the number of parallel discovery searches (in lighthouse, and maybe the parallelization parameter in discovery config) if we increase the number of "target-subnets" that a node can exist in for any given subnet.
  • We do not know how the network will respond to these changes. So this feature should not be set by default. We should either:
    • Hide it behind a compilation flag
    • Hide it behind a hidden CLI to enable this
    • Keep it in a fork to be maintained against unstable (this is less ideal because it requires more work)

AgeManning avatar Jun 01 '22 06:06 AgeManning

Most of this is closed with #3453 I've created https://github.com/sigp/lighthouse/issues/3648 with the remaining work

divagant-martian avatar Oct 20 '22 18:10 divagant-martian