
reorg depth 0 seen on event stream

rolfyone opened this issue 3 years ago

It looks like we got both blocks, set 775 as head before 774 was made head, then went back to 774 being head...

https://beaconcha.in/block/4683774 (new head) https://beaconcha.in/block/4683775 (old head)

{
  slot: '4683774',
  depth: '0',
  old_head_block: '0xb942665c2eb6093ce53e46db6f458fbd32f5ec44f320ad63b569854592cbd26e',
  new_head_block: '0xeed75ccb5201c46614121324c7203e23b22ca166741fdb59bb426482b76369c4',
  old_head_state: '0x5b3363a7d5d7e2d38400742402e0b7b9be2fe8ce57c814aa7c035b983a715570',
  new_head_state: '0xe73d572c268d568284c919bb4e25951d27dc0bca48ce6e99ccde3db386ea8603',
  epoch: '146367',
  execution_optimistic: false
}

rolfyone avatar Sep 13 '22 21:09 rolfyone

side note - depth should also probably be using minusMinZero...
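For readers unfamiliar with it, minusMinZero is Teku's clamped subtraction on its unsigned UInt64 type: instead of underflowing when the subtrahend is larger, the result floors at zero. A minimal sketch of the idea using plain longs (illustrative only, not Teku's implementation):

```java
// Sketch of clamped subtraction, in the spirit of UInt64.minusMinZero.
// Computing the reorg depth this way would prevent an underflowed value
// when the new head's slot is not ahead of the common ancestor's slot.
public class DepthMath {
  /** Returns a - b, clamped to zero when b > a. */
  public static long minusMinZero(final long a, final long b) {
    return b > a ? 0L : a - b;
  }
}
```

e.g. minusMinZero(4683774, 4683775) yields 0 rather than wrapping around.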

rolfyone avatar Sep 13 '22 21:09 rolfyone

I've tested the following ideas and am now stuck:

  1. blocks could be imported in reverse order - impossible, the parent execution root is checked during import
  2. updateHead could be called not only on import (the ForkChoice.processHead() path):
  • it's called on invalid payloads
  • it's called when preparing for block production - couldn't be raced there, it runs on the ForkChoiceThread
    (updateHead is also called on init - not our case)
  3. 774, 775, 774 could be imported in that order for some reason, which could give a reorg like the one in the issue. Could we import a block that is already imported? Made a test:
  @Test
  void onBlockReImport_shouldNotSendForkChoiceUpdatedNotification() {
    final SignedBlockAndState blockAndState = chainBuilder.generateBlockAtSlot(ONE);
    final SignedBlockAndState blockAndState2 = chainBuilder.generateBlockAtSlot(UInt64.valueOf(2));
    storageSystem.chainUpdater().advanceCurrentSlotToAtLeast(blockAndState.getSlot());
    final SafeFuture<BlockImportResult> importResult =
        forkChoice.onBlock(blockAndState.getBlock(), Optional.empty(), executionLayer);
    storageSystem.chainUpdater().advanceCurrentSlotToAtLeast(blockAndState2.getSlot());
    final SafeFuture<BlockImportResult> importResult2 =
        forkChoice.onBlock(blockAndState2.getBlock(), Optional.empty(), executionLayer);
    final SafeFuture<BlockImportResult> importResult3 =
        forkChoice.onBlock(blockAndState.getBlock(), Optional.empty(), executionLayer);
    assertBlockImportedSuccessfully(importResult, false);
    assertBlockImportedSuccessfully(importResult2, false);
    assertBlockImportedSuccessfully(importResult3, false);

    assertForkChoiceUpdateNotification(blockAndState, false);
    assertForkChoiceUpdateNotification(blockAndState2, false);
  }

It passes assertForkChoiceUpdateNotification(blockAndState, false), so that shouldn't be the issue. And even if we had a race between the 2nd and 3rd imports before importBlockAndState is called (which only runs on the ForkChoiceThread), we would not get a reorg like the one in the issue.

  4. there was a bug in the code at the moment the issue was created that has since been fixed:
  • ForkChoice.java - no changes which could affect this behaviour
  • ForkChoiceNotifierImpl.java - some changes related to block production preparation, shouldn't affect this
  • RecentChainData.java - identical

So, a miss again.
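Point 3 can be illustrated with a toy model (hypothetical names, not Teku's actual import path): if import short-circuits on an already-known block root, the third import of 774 is a no-op and never re-triggers a head update, which is consistent with the passing test above.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of an import path that short-circuits on known blocks, so a
// re-import never produces a second fork-choice-updated notification.
public class ImportGuard {
  private final Set<String> knownRoots = new HashSet<>();
  private int headUpdateCount = 0;

  /** Returns true if the block was newly imported (and the head updated),
      false if it was already known and the import was a no-op. */
  public boolean onBlock(final String blockRoot) {
    if (!knownRoots.add(blockRoot)) {
      return false; // already imported: no notification fired
    }
    headUpdateCount++;
    return true;
  }

  public int getHeadUpdateCount() {
    return headUpdateCount;
  }
}
```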

zilm13 avatar Nov 05 '22 11:11 zilm13

the format is from EventSubscription in json, which is where the user reported seeing a reorg with depth 0...

I'm not sure if that code is generating a reorg for the above test case, but I'd suggest that's where we're interested in probing - maybe a log for when a reorg gets triggered with depth 0, or something...


Reorg Event *** New Head: b836ab11206d22c406e22e928242a79fcf0b8f911438ff7f558b47dd489197ea (4246125), Previous Head: 78029545df7a1c7105e9b38ae2f810e859809ef5149093566f9bdf2484769ea9 (4246126), Common Ancestor: 3a530a25df5ef52546d37f68a8d838d86cf9f0f2342b81587ed0cb0139b38a31 (4246123)

Nov 3, 2022 @ 17:11:16.257 | Reorg Event *** New Head: cf137ba0673f3013c8c071e37db0c80c9435ce71759fc9c71bef5b7635105c54 (4245956), Previous Head: d2e6c2c15111a057a58048371fee9f1c23220a4702849d4ff801299eeb15a2d3 (4245955), Common Ancestor: 941cf6629e2542dc159a0c240ad5651dc4d0dc0376e3edc7ba8980036df01e90 (4245954)

wouldn't be surprised if that reorg event came through the event subscriptions poorly... but we don't subscribe, so it's hard to know...

rolfyone avatar Nov 09 '22 10:11 rolfyone

This kind of event could be triggered by calling

RecentChainData.updateHead(774);
RecentChainData.updateHead(775);
RecentChainData.updateHead(774);

on the last call, updateHead will call OperationsReOrgManager.chainHeadUpdated with a context which will trigger a similar event, but I haven't found a way we could call it in this order, as all callers are on the ForkChoiceThread, so the last call couldn't have happened. I've traced from updateHead up through all its callers, higher and higher, and am stuck with no evidence of how this could happen.
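To make the hypothesis concrete, here is a toy model (not Teku's code) of why that call order could produce exactly a depth-0 event: if the head moves back to a block that is itself an ancestor of the old head, the common ancestor equals the new head, and a depth computed as new-head slot minus common-ancestor slot comes out to zero - matching the reported event at slot 4683774.

```java
import java.util.HashMap;
import java.util.Map;

// Toy chain: slot -> parent slot, modelling 4683775's parent being 4683774.
public class ReorgModel {
  private static final Map<Long, Long> PARENT = new HashMap<>();
  static {
    PARENT.put(4683775L, 4683774L);
    PARENT.put(4683774L, 4683773L);
  }

  /** Walks the old head's ancestry until it meets the new head (or runs out). */
  static long commonAncestorSlot(final long oldHead, final long newHead) {
    long cursor = oldHead;
    while (cursor != newHead && PARENT.containsKey(cursor)) {
      cursor = PARENT.get(cursor);
    }
    return cursor;
  }

  /** One plausible depth computation: new head slot minus common ancestor slot. */
  static long reorgDepth(final long oldHead, final long newHead) {
    return newHead - commonAncestorSlot(oldHead, newHead);
  }
}
```

With updateHead(775) then updateHead(774), the new head 4683774 is the old head's parent, so the "reorg" has depth 0.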

zilm13 avatar Nov 10 '22 17:11 zilm13

stuck with no evidence how this could happen

I think it is time to close this ticket. We don't really have a way of reproducing it and we haven't received any new reports about it.

lucassaldanha avatar Nov 10 '22 22:11 lucassaldanha

Agreed, closing as can't reproduce. :(

ajsutton avatar Nov 10 '22 22:11 ajsutton