Detecting changes in duties from a head event
Issue
There has been some discussion in #96 regarding how to determine if some event on the BN will invalidate the cached duties in a VC. To quote @ajsutton (mainly because the contained link has some useful background):
The key reasoning behind getting an event for empty slots is to provide a guarantee from the beacon node that it will notify the Validator Client if a later chain reorg affects that slot. The vc may need to recalculate duties because of that switch from empty to filled block. https://hackmd.io/kutQ7smJRZ-sJNSuY1WWVA#Why-head-must-be-published-for-empty-slots sets out the scenario.
We've faced this issue several times in Lighthouse and I believe we have a concise solution, which I detail here.
Thanks to @ajsutton for letting me run this past him and giving feedback.
Proposed solution (OUTDATED, SEE v2)
First define two new terms which refer to the head block and state of a beacon chain:
- `proposer_shuffling_pivot_root`: just like an attestation `target_root`, this is the `block_root` at the start of the epoch which contains the head block (or the genesis root in the case of underflow).
  - Formally: `get_block_root_at_slot(state, compute_start_slot_at_epoch(get_current_epoch(state)))`.
- `attester_shuffling_pivot_root`: this is the `block_root` in the last slot of the penultimate epoch, relative to the head block (or the genesis root in the case of underflow).
  - Formally: `get_block_root_at_slot(state, max(compute_start_slot_at_epoch(get_current_epoch(state) - MIN_SEED_LOOKAHEAD) - 1, GENESIS_SLOT))`.
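To make the formal definitions concrete, here is a minimal sketch of how a BN might derive both roots from its head state. It assumes spec-style helpers (`get_block_root_at_slot`, `compute_start_slot_at_epoch`, `get_current_epoch`) and u64-style `Slot`/`Epoch` aliases are in scope; it is illustrative only, not any client's actual code.

```rust
// Sketch only: assumes spec-style helpers, constants and types are in scope.
fn proposer_shuffling_pivot_root(state: &BeaconState) -> Hash256 {
    // The block root at the first slot of the epoch containing the head.
    get_block_root_at_slot(state, compute_start_slot_at_epoch(get_current_epoch(state)))
}

fn attester_shuffling_pivot_root(state: &BeaconState) -> Hash256 {
    // One slot before the start of `current_epoch - MIN_SEED_LOOKAHEAD`,
    // clamped so that underflow falls back to the genesis slot.
    let lookahead_epoch = get_current_epoch(state).saturating_sub(MIN_SEED_LOOKAHEAD);
    let slot = compute_start_slot_at_epoch(lookahead_epoch)
        .saturating_sub(1)
        .max(GENESIS_SLOT);
    get_block_root_at_slot(state, slot)
}
```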
Then, include them in the API:
- Add `proposer_shuffling_pivot_root` and `attester_shuffling_pivot_root` to the `head` event.
- Add a `proposer_shuffling_pivot_root` to the `.../validator/duties/proposer/{epoch}` return result.
- Add an `attester_shuffling_pivot_root` to the `.../validator/duties/attester/{epoch}` return result.
Why this solution?
The (target_root, epoch) tuple uniquely identifies the list of proposers for epoch, whilst the (shuffling_pivot_root, epoch) uniquely identifies the shuffled list of attesters for epoch.
With these two new pieces of information a VC is able to cheaply determine if any head event has invalidated their proposer/attester shuffling.
Examples
- Examples are in Rust, just to make life easier for me. I'm pretty sure it's C-like enough to be readable.
- These examples are just markdown examples, so they might have some bugs. We use these methods in Lighthouse, though, so I'm confident that the concept is valid.
Example: Attesters
There is some VC which caches duties like this:
```rust
struct AttesterDuty {
    // The epoch to which this shuffling pertains. I.e., all attestation slots are within this epoch.
    epoch: Epoch,
    // The root which identifies this shuffling.
    shuffling_pivot_root: Hash256,
    validator_index: ValidatorIndex,
    attestation_slot: Option<Slot>,
    // ... other fields related to attesting.
}
```
It's currently epoch current_epoch and we store all the AttesterDuty for this epoch in a list called attester_duties.
When we get a head event we run this code:
```rust
let head_epoch = epoch(head.slot);

if head_epoch > current_epoch {
    panic!("probably a clock sync issue?")
}

let shuffling_pivot_root = if head_epoch + 1 >= current_epoch {
    head.attester_shuffling_pivot_root
} else {
    head.block
};

let mut invalidated_validators = vec![]; // empty list

for duty in &attester_duties {
    if (duty.shuffling_pivot_root, duty.epoch) != (shuffling_pivot_root, head_epoch) {
        invalidated_validators.push(duty.validator_index)
    }
}

update_duties(invalidated_validators);
```
Example: Proposers
Just like the previous example, we store proposer duties like this:
```rust
struct ProposerDuty {
    // The epoch to which this shuffling pertains. I.e., all proposal slots are within this epoch.
    epoch: Epoch,
    // The root which identifies this shuffling.
    shuffling_pivot_root: Hash256,
    validator_index: ValidatorIndex,
    // The list of slots where the proposer should propose.
    proposal_slots: Vec<Slot>,
}
```
It's currently epoch current_epoch and we store all the ProposerDuty for this epoch in a list called proposer_duties.
When we get a head event we run this code:
```rust
let head_epoch = epoch(head.slot);

if head_epoch > current_epoch {
    panic!("probably a clock sync issue?")
}

let shuffling_pivot_root = if head_epoch == current_epoch {
    head.proposer_shuffling_pivot_root
} else {
    head.block
};

let mut invalidated_validators = vec![]; // empty list

for duty in &proposer_duties {
    if (duty.shuffling_pivot_root, duty.epoch) != (shuffling_pivot_root, head_epoch) {
        invalidated_validators.push(duty.validator_index)
    }
}

update_duties(invalidated_validators);
```
Impact
I believe this should have a fairly minimal impact on BN implementations, apart from perhaps having to thread these new pivot roots through their caches. Unsurprisingly, Lighthouse's shuffling caches are keyed by this exact scheme.
The reason I think this is low-impact is because it's trivial to compute the pivot roots from the BeaconState which was used to generate the shuffling. I expect that clients have access to a state when they're determining the shuffling for some epoch.
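As a purely hypothetical illustration (not Lighthouse's or any other client's actual code), a BN-side shuffling cache keyed by this scheme might look like the sketch below; a cache hit means the shuffling, and therefore any duties derived from it, is still valid.

```rust
use std::collections::HashMap;

// Hypothetical sketch: a shuffling cache keyed by (pivot_root, epoch).
// `Hash256`, `Epoch` and `ValidatorIndex` are the same placeholder types
// used in the examples above, with `Hash256` assumed to be `Eq + Hash`.
struct ShufflingCache {
    attester_shufflings: HashMap<(Hash256, Epoch), Vec<ValidatorIndex>>,
}

impl ShufflingCache {
    /// Returns the cached shuffling for `(pivot_root, epoch)`, computing and
    /// storing it on a miss.
    fn get_or_compute<F>(&mut self, pivot_root: Hash256, epoch: Epoch, compute: F) -> &[ValidatorIndex]
    where
        F: FnOnce() -> Vec<ValidatorIndex>,
    {
        self.attester_shufflings
            .entry((pivot_root, epoch))
            .or_insert_with(compute)
            .as_slice()
    }
}
```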
I understand that this appears rather complex, but I think it's a concise solution and therefore ultimately less cumbersome and more accurate/efficient than existing solutions.
You might argue that this is leaking information about caches into the API, but I'm not convinced this is the case. I believe this is the fundamental "source of truth" method to detecting re-orgs that affect proposer/attester shuffling.
I quite like this approach. I'll have a go at implementing it with Teku but I'm pretty sure it will actually wind up being simpler both for the beacon node and the validator client than what we have to do to produce and handle the head and chain_reorg events currently.
Another option is to have `proposer_shuffling_changed: bool` and `attester_shuffling_changed: bool` and compute all this stuff on the BN side.
It's simpler but less powerful, since you can miss a `changed = true` event and fall out of sync.
Oh, I realised I made a mistake on `attester_shuffling_pivot_root`; it needs to be slightly different to how we do it for caching internally.
We could add two extra fields (the roots of last slot of the previous and penultimate epochs) and gain the ability to support re-orgs for attestation duties of 32-64 slots instead of the current 31-63. This didn't seem worth it, so I decided not to.
Proposed solution v2
First define two new terms which refer to the head block and state of a beacon chain:
- `current_target_root`: just like an attestation `target_root`; the `block_root` from the first slot in the head epoch (or genesis).
  - Formally: `get_block_root_at_slot(state, compute_start_slot_at_epoch(get_current_epoch(state)))`.
- `previous_target_root`: the `block_root` from the first slot in the epoch previous to the head (or genesis).
  - Formally: `get_block_root_at_slot(state, compute_start_slot_at_epoch(get_previous_epoch(state)))`.
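As with v1, a rough sketch of deriving both roots from the head state, assuming the same spec-style helpers are in scope (illustrative only):

```rust
// Sketch only: mirrors the formal definitions above.
fn current_target_root(state: &BeaconState) -> Hash256 {
    get_block_root_at_slot(state, compute_start_slot_at_epoch(get_current_epoch(state)))
}

fn previous_target_root(state: &BeaconState) -> Hash256 {
    // `get_previous_epoch` already clamps to the genesis epoch, so this
    // falls back to the genesis root in the underflow case.
    get_block_root_at_slot(state, compute_start_slot_at_epoch(get_previous_epoch(state)))
}
```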
Then, include both of these fields in:

- The `head` event.
- The `.../validator/duties/proposer/{epoch}` return result.
- The `.../validator/duties/attester/{epoch}` return result.
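For illustration, a `head` event under this proposal might look something like the following. The two root fields are the proposed additions; the other fields follow the existing `head` event and the values are placeholders.

```
{
  "slot": "64",
  "block": "0x...",
  "state": "0x...",
  "epoch_transition": false,
  "current_target_root": "0x...",
  "previous_target_root": "0x..."
}
```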
Why this solution?
The (current_target_root, epoch) tuple uniquely identifies the list of proposers for epoch, whilst the (previous_target_root, epoch) uniquely identifies* the shuffled list of attesters for epoch.
With these two new pieces of information a VC is able to cheaply determine if any head event has invalidated their proposer/attester shuffling.
*: The `previous_target_root` does not perfectly identify it, but it's only one slot off. We use that value to avoid adding another field.
Examples
- Examples are in Rust, just to make life easier for me. I'm pretty sure it's C-like enough to be readable.
- These examples are just markdown examples, so they might have some bugs. We use these methods in Lighthouse, though, so I'm confident that the concept is valid.
Example: Attesters
There is some VC which caches duties like this:
```rust
struct AttesterDuty {
    // The epoch to which this shuffling pertains. I.e., all attestation slots are within this epoch.
    epoch: Epoch,
    // The root which identifies this shuffling.
    previous_target_root: Hash256,
    validator_index: ValidatorIndex,
    attestation_slot: Option<Slot>,
    // ... other fields related to attesting.
}
```
It's currently epoch current_epoch and we store all the AttesterDuty for this epoch in a list called attester_duties.
When we get a head event we run this code:
```rust
let head_epoch = epoch(head.slot);

if head_epoch > current_epoch {
    panic!("probably a clock sync issue?")
}

let previous_target_root = if head_epoch == current_epoch {
    head.previous_target_root
} else if head_epoch + 1 == current_epoch {
    head.current_target_root
} else {
    head.block
};

let mut invalidated_validators = vec![]; // empty list

for duty in &attester_duties {
    if (duty.previous_target_root, duty.epoch) != (previous_target_root, head_epoch) {
        invalidated_validators.push(duty.validator_index)
    }
}

update_duties(invalidated_validators);
```
Example: Proposers
Just like the previous example, we store proposer duties like this:
```rust
struct ProposerDuty {
    // The epoch to which this shuffling pertains. I.e., all proposal slots are within this epoch.
    epoch: Epoch,
    // The root which identifies this shuffling.
    current_target_root: Hash256,
    validator_index: ValidatorIndex,
    // The list of slots where the proposer should propose.
    proposal_slots: Vec<Slot>,
}
```
It's currently epoch current_epoch and we store all the ProposerDuty for this epoch in a list called proposer_duties.
When we get a head event we run this code:
```rust
let head_epoch = epoch(head.slot);

if head_epoch > current_epoch {
    panic!("probably a clock sync issue?")
}

let current_target_root = if head_epoch == current_epoch {
    head.current_target_root
} else {
    head.block
};

let mut invalidated_validators = vec![]; // empty list

for duty in &proposer_duties {
    if (duty.current_target_root, duty.epoch) != (current_target_root, head_epoch) {
        invalidated_validators.push(duty.validator_index)
    }
}

update_duties(invalidated_validators);
```
> The (`target_root`, `epoch`) tuple uniquely identifies the list of proposers for `epoch`, whilst the (`shuffling_pivot_root`, `epoch`) uniquely identifies the shuffled list of attesters for `epoch`.
I assume you mean previous_target_root in the latter.
I'm on board with this, especially if it makes beacon node developers' lives easier.
> I assume you mean previous_target_root in the latter.
You're right, thanks! I fixed it.
This approach generally looks like it will work well. A very rough implementation of this in Teku is at https://github.com/ConsenSys/teku/pull/3210. Some notes from going through that:
Is there a reason that `current_target_root` and `previous_target_root` are added to the proposer and attester duties? It looks like only `current_target_root` is needed for proposer duties and only `previous_target_root` for attester duties.
`get_block_root_at_slot(state, compute_start_slot_at_epoch(get_current_epoch(state)))` doesn't work if the state is from the first slot of the epoch: the historical block roots don't contain the root for the current slot. But I think we actually want the last slot of the prior epoch for both proposers and attesters - you have to be able to calculate proposer duties prior to the first block in an epoch so you know who should create that first block. I've made that change in my implementation. Not sure if `target_root` is still the right name or if we should go back to `shuffling_root`.
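A rough sketch of the adjusted computation being described, i.e. the root of the last block before the start of the epoch, using the same assumed spec-style helpers as the earlier sketches (the `duty_dependent_root` name is purely illustrative):

```rust
// Sketch only: the block root at the last slot before `epoch` starts, i.e.
// the latest block that could possibly change the duties for `epoch`.
// (The genesis-slot corner case is discussed further below.)
fn duty_dependent_root(state: &BeaconState, epoch: Epoch) -> Hash256 {
    let slot = compute_start_slot_at_epoch(epoch)
        .saturating_sub(1)
        .max(GENESIS_SLOT);
    get_block_root_at_slot(state, slot)
}
```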
The responses for getting proposer and attestation duties will have to change from data being a list to an object. I've made both:
```
{
  data: {
    current_target_root: "...",
    previous_target_root: "...",
    duties: [ <same as previous data content> ]
  }
}
```
Unfortunately this means the change isn't backwards compatible on the API.
For the `.../validator/duties/attester/{epoch}` endpoint, there's some ambiguity about what the current and previous target roots refer to, as there are two epochs which could have been used to calculate the duties (i.e. to calculate duties for epoch 3, I could have used a state from epoch 2 or 3, and that changes the resulting target roots). I think it should just contain a single `target_root` which is the last block before epoch 2 (i.e. the last block that could possibly change the shuffling).
To give a few pointers to the Teku changes required: there's a fair bit of boilerplate. The key point where we check which duties to invalidate is https://github.com/ConsenSys/teku/pull/3210/files#diff-cc3852e06c0d37c4735ad47c0fe503a63f56b83a2dd29edf46ffa44d4ff0326e. https://github.com/ConsenSys/teku/pull/3210/files#diff-4ff41183c9b1c7fbf6cb04178e7dd65bc3577769d88fe4e484c134cfae3bfe99 has the code to select the target root for attestation duties and https://github.com/ConsenSys/teku/pull/3210/files#diff-f3fea5c99cee00d154dbde5e9e06a08f2e88242b451402f9a2f79762d107d214 has it for block production duties. Duties are already grouped by epoch, so we just need to iterate through the duties that could be affected and check that the target root used to calculate them matches the one we selected from the head update.
https://github.com/ConsenSys/teku/pull/3210/files#diff-b4374db262b5a5a6020174ebebcad32bf0990864b70803f35614e5d2e0c8990f has the calculation of the current and previous target roots.
There is a bunch of code I'll be able to remove because of this change as well but I haven't done that yet in this PR. Plus I need to actually test it properly etc...
> Is there a reason that `current_target_root` and `previous_target_root` are added to the proposer and attester duties?
I assumed that current_target_root would be for proposer duties and previous_target_root for attester duties.
> The responses for getting proposer and attestation duties will have to change from `data` being a list to an object.
An alternative would be to add the root to each duty. It's a little bit of duplication, but maintains compatibility (and we already do this elsewhere, I believe).
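For comparison, a sketch of that alternative shape for the attester duties response, in the same style as the example above (the per-duty root field name is illustrative):

```
{
  data: [
    {
      previous_target_root: "...",
      <same fields as an existing duty>
    },
    ...
  ]
}
```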
I'll leave the other items to Paul, as I'm unsure of the answers.
> I assumed that current_target_root would be for proposer duties and previous_target_root for attester duties.
Yes, I think that's all that's required, but the current text suggests that both fields are added to both the attester and proposer duty responses.
> An alternative would be to add the root to each duty. It's a little bit of duplication, but maintains compatibility (and we already do this elsewhere, I believe).
I could live with this as long as there was a requirement that they are all the same. Otherwise validator clients have to check every single duty individually, which makes life more difficult. Having different roots would also mean that some of those duties are immediately invalid and should be recalculated, so I think it's a reasonable take.
It does make it harder to deal with the case of having no duties though. That would mean no validators are active, and that shouldn't be affected by forks as far as I know, but it's still a weird corner case to have to handle in validator clients.
Oh, and there's one other interesting corner case that bit me: when calculating duties at the genesis slot you should return the genesis block root, but it's not available from the state. In every other case the target root can be found purely from the state, but genesis would require the state and the block. I've punted and, in the case of genesis, I return the root from the finalized checkpoint in the genesis state, which will be all 0s. The logic works but it's a slightly surprising corner case.
> I've punted and, in the case of genesis, I return the root from the finalized checkpoint in the genesis state, which will be all 0s. The logic works but it's a slightly surprising corner case.
So genesis blocks are a lot weirder than I remembered, but fortunately that makes it easy to create the genesis block root from the state. So I've switched to that and now always return the right block root.
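For reference, a sketch of one way to derive the genesis block root purely from the genesis state; it relies on the fact that the genesis `latest_block_header` is fully determined by the state, with its `state_root` defined to be filled with the genesis state root before hashing. This assumes an SSZ `hash_tree_root` helper and is not necessarily how Teku does it.

```rust
// Sketch only: recover the genesis block root from the genesis state.
fn genesis_block_root(genesis_state: &BeaconState) -> Hash256 {
    let mut header = genesis_state.latest_block_header.clone();
    // At genesis the header's state_root is zeroed; fill it with the
    // genesis state root before hashing, per the spec's slot processing.
    header.state_root = hash_tree_root(genesis_state);
    hash_tree_root(&header)
}
```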
On a small local network it now appears to work correctly. Going to deploy a build to one of our Medalla validating nodes to monitor it in the real world.
`dependent_root` added to all relevant routes.