go-ethereum
New blocks are rejected because of setHead operation
System information
Geth version: all versions
Expected behaviour
Whenever SetHead is performed, the chain should be rewound to the specified position and then import new blocks on top of it smoothly.
Actual behaviour
SetHead may take a long time, during which the blockchain lock is held inside the setHeadBeyondRoot function. In the meantime, the consensus layer keeps feeding us new blocks via the engine API. Specifically, in the newPayload method, the newly provided block passes all checks (e.g. the parent block exists, the parent state is available, etc.) and then blocks at InsertBlockWithoutSetHead, which requires the blockchain lock.
When SetHead finishes, the chain segment above the specified target has been removed entirely, including the parent block of the newly arrived payload in the engine API. Eventually an ErrUnknownAncestor = errors.New("unknown ancestor") error is returned, which marks the new payload as invalid.
What's more, the engine API has a mechanism that remembers bad blocks to avoid handling them over and over again. Fortunately, it gives a "bad block" another chance after some threshold. Currently the threshold is 128, meaning that after 128 attempts the bad block is given another chance to be imported. That is still too long in this case: the node has to wait a long time to recover.
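To illustrate why recovery is slow, here is a minimal sketch of a hit-counting bad block cache. The names (badBlockCache, markBad, isBad, hitEvictionThreshold) are hypothetical and the exact counting in geth may differ; only the threshold value of 128 comes from the description above.

```go
package main

import "fmt"

// hitEvictionThreshold mirrors the threshold of 128 mentioned in the issue.
const hitEvictionThreshold = 128

// badBlockCache is a hypothetical cache that counts how often a known-bad
// block hash has been seen again.
type badBlockCache struct {
	hits map[string]int
}

func newBadBlockCache() *badBlockCache {
	return &badBlockCache{hits: make(map[string]int)}
}

// markBad records a hash as bad with zero hits so far.
func (c *badBlockCache) markBad(hash string) {
	c.hits[hash] = 0
}

// isBad reports whether the hash should be rejected without reprocessing.
// Once the hit count reaches the threshold, the entry is evicted and the
// block gets another chance.
func (c *badBlockCache) isBad(hash string) bool {
	n, ok := c.hits[hash]
	if !ok {
		return false
	}
	n++
	if n >= hitEvictionThreshold {
		delete(c.hits, hash)
		return false
	}
	c.hits[hash] = n
	return true
}

func main() {
	cache := newBadBlockCache()
	cache.markBad("0xabc") // hypothetical block hash
	rejected := 0
	for cache.isBad("0xabc") {
		rejected++
	}
	fmt.Println("rejections before retry:", rejected) // prints: rejections before retry: 127
}
```

In this sketch a wrongly flagged block is turned away 127 times before it is reprocessed, which matches the long recovery time described in the issue if the consensus layer resends the payload only occasionally.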
Steps to reproduce the behaviour
Run debug.SetHead() when the node is already synced.
There are two possible directions for fixing this issue:
- Avoid the error in the first place: the new payload should somehow be told that its parent block no longer exists, and it should be queued in the future block queue
- Relax the restrictions on bad blocks for a faster recovery
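The first direction could look roughly like the sketch below: the parent is re-checked while the lock is held, and a payload whose parent has disappeared is parked and answered with a non-final status instead of being marked invalid. This is an illustration under assumptions, not the actual fix: the chain type, the pending queue and the choice of SYNCING as the response are all hypothetical here.

```go
package main

import (
	"fmt"
	"sync"
)

// payloadStatus models a small subset of engine API response statuses.
type payloadStatus string

const (
	statusValid   payloadStatus = "VALID"
	statusSyncing payloadStatus = "SYNCING"
)

// chain is a hypothetical stand-in for the blockchain.
type chain struct {
	mu      sync.Mutex
	blocks  map[uint64]bool
	pending []uint64 // payloads parked until their parent is available
}

func newChain(head uint64) *chain {
	c := &chain{blocks: make(map[uint64]bool)}
	for n := uint64(0); n <= head; n++ {
		c.blocks[n] = true
	}
	return c
}

// rewind removes every block above target while holding the lock.
func (c *chain) rewind(target uint64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for n := range c.blocks {
		if n > target {
			delete(c.blocks, n)
		}
	}
}

// newPayload validates the parent only after taking the lock, so a rewind
// that finished in the meantime is observed. A payload without a parent is
// queued rather than rejected as invalid.
func (c *chain) newPayload(number uint64) payloadStatus {
	c.mu.Lock()
	defer c.mu.Unlock()
	if !c.blocks[number-1] {
		c.pending = append(c.pending, number)
		return statusSyncing
	}
	c.blocks[number] = true
	return statusValid
}

func main() {
	c := newChain(100)
	c.rewind(90)
	fmt.Println(c.newPayload(101)) // parent 100 was removed by the rewind
	fmt.Println(c.newPayload(91))  // parent 90 survived the rewind
}
```

Because the payload is never reported as invalid, neither the local bad block cache nor the consensus layer has a reason to blacklist it.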
The bigger issue might not be our own bad block cache, but rather the bad block cache of the consensus layer. I've noticed that they might resend a block a handful of times (5) and stop sending it afterwards. So they seem to mark it as bad as well.
The bigger issue might not be our own bad block cache
True, but that's not within scope of this specific ticket. The scenario described sounds like an error on our part. After a setHead, is it correct to return an ErrUnknownAncestor?
IMO, these two situations are semantically equivalent:
- Node A syncs to M, then does setHead back to N (say two weeks back).
- Node B syncs to N, then is shut off. It is restarted two weeks later.
So whatever one of them does, the other should do too.
After a setHead, is it correct to return an ErrUnknownAncestor?
Honestly, I think it's the correct behavior. We are trying to import a future block in this case.
Am I not going to get paid? Um yeh I've heard about all the scams people do Frances Reid
On Wed, 15 Feb 2023, 09:54 Martin Holst Swende, @.***> wrote:
The bigger issue might not be our own bad block cache
True, but that's not within scope of this specific ticket. The scenario described sounds like an error on our part. After a setHead, is it correct to return an ErrUnknownAncestor?
IMO, these two situations are semantically equivalent:
- Node A syncs to M, then does setHead back to N (say two weeks back).
- Node B syncs to N, then is shut off. It is restarted two weeks later.
So whatever one of them does, the other should do too.