erigon icon indicating copy to clipboard operation
erigon copied to clipboard

Valid block loses this status

Open xrchz opened this issue 1 year ago • 3 comments

System information

Erigon version: ./erigon --version

2.56.2

OS & Version: Windows/Linux/OSX

Linux

Commit hash:

c9216ce10451a235af1cdde572642eac3640fadf (2.56.2 release)

Erigon Command (with flags/config):

erigon --datadir=/home/ramana/9be5/erigon --nat=none --db.size.limit=8TB \
--http --http.addr=0.0.0.0 --http.api=engine,erigon,eth,net,web3,ots --p2p.allowed-ports=30303,30304,30306 \
--rpc.returndata.limit 1048576 \
--http.corsdomain=<redacted> \
--authrpc.addr=0.0.0.0 --authrpc.jwtsecret=/home/ramana/9be5/erigon/jwt.hex \
--torrent.download.rate=512mb --torrent.upload.rate=128mb

Consensus Layer: Nimbus

Consensus Layer Command (with flags/config):

nimbus_beacon_node \
 --el=http://<redacted> \
 --data-dir=/home/ramana/lh/nimbus \
 --external-beacon-api-url=http://<redacted> \
 --nat=none --history=archive \
 --jwt-secret=/home/ramana/jwt.hex \
 --rest=true --rest-address=:: --rest-statecache-size=16 \
 --payload-builder=true --payload-builder-url=http://<redacted>

Chain/Network: mainnet

Expected behaviour

Nimbus should not get a change of state of a block from VALID to ACCEPTED/SYNCING. Once VALID the block should stay VALID.

Actual behaviour

Here are the relevant logs on Erigon and Nimbus:

INFO[01-24|15:05:37.826] RPC Daemon notified of new headers       from=19077277 to=19077278 hash=0xde5df95011d5fd2f97ff0ab7f75c03101559dd93f5c11556f68122a9cd649422 header sending=13.66µs log sending=200ns
INFO[01-24|15:05:37.826] head updated                             hash=0xde5df95011d5fd2f97ff0ab7f75c03101559dd93f5c11556f68122a9cd649422 number=19077278
INFO[01-24|15:05:37.826] [snapshots] Prune Blocks                 to=18987000 limit=100
INFO[01-24|15:05:49.361] [NewPayload] Handling new payload        height=19077279 hash=0x3a5c925499ac4488957114d082279eff79b53fdd550befc1143524429fcdf37a
INFO[01-24|15:05:52.408] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload)
INFO[01-24|15:05:54.306] RPC Daemon notified of new headers       from=19077278 to=19077279 hash=0x3a5c925499ac4488957114d082279eff79b53fdd550befc1143524429fcdf37a header sending=24.41µs log sending=300ns
INFO[01-24|15:05:54.306] head updated                             hash=0x3a5c925499ac4488957114d082279eff79b53fdd550befc1143524429fcdf37a number=19077279
INFO[01-24|15:05:54.314] [snapshots] Prune Blocks                 to=18987000 limit=100
INFO[01-24|15:06:14.301] [NewPayload] Handling new payload        height=19077280 hash=0x4ac0e3137708d4305dc6d13e670fa3c494122c6114886eb91eaa60fc6b9dd70c
INFO[01-24|15:06:16.024] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload)
INFO[01-24|15:06:16.842] RPC Daemon notified of new headers       from=19077279 to=19077280 hash=0x4ac0e3137708d4305dc6d13e670fa3c494122c6114886eb91eaa60fc6b9dd70c header sending=12.05µs log sending=330ns
INFO[01-24|15:06:16.842] head updated                             hash=0x4ac0e3137708d4305dc6d13e670fa3c494122c6114886eb91eaa60fc6b9dd70c number=19077280
INFO[01-24|15:06:16.842] [snapshots] Prune Blocks                 to=18987000 limit=100
INF 2024-01-24 15:05:35.000+00:00 Slot start                                 topics="beacnde" slot=8273726 epoch=258553 sync=synced peers=159 head=94636007:8273725 finalized=258551:d9e8b666 delay=547us747ns
INF 2024-01-24 15:05:45.071+00:00 Slot end                                   topics="beacnde" slot=8273726 nextActionWait=1s928ms814us276ns nextAttestationSlot=8273727 nextProposalSlot=-1 syncCommitteeDuties=none head=8796d06d:8273726
INF 2024-01-24 15:05:47.001+00:00 Slot start                                 topics="beacnde" slot=8273727 epoch=258553 sync=synced peers=159 head=8796d06d:8273726 finalized=258551:d9e8b666 delay=1ms69us609ns
INF 2024-01-24 15:05:52.347+00:00 State replayed                             topics="chaindag" blocks=0 slots=1 current=ae4a6b80:8273695@8273696 ancestor=729238b0:8273727 target=729238b0:8273727@8273728 ancestorStateRoot=0b845ed3 targetStateRoot=bf28dbc5 found=true assignDur=17ms738us232ns replayDur=244ms639us227ns
INF 2024-01-24 15:05:53.409+00:00 execution payload forkChoiceUpdated status ACCEPTED/SYNCING, but was previously VALID topics="gossip_blocks" payloadExecutionStatus=SYNCING headBlockHash=3a5c9254 safeBlockHash=cd747946 finalizedBlockHash=73b837b5 receivedBlock="(blck: (slot: 8273727, proposer_index: 580283, parent_root: \"8796d06d\", state_root: \"0b845ed3\", eth1data: (deposit_root: cec9c99bb4070b00b46b85702dfbd9120256328f283da5dfa4dc11c0f41cc134, deposit_count: 1190475, block_hash: ffc06d0611d54b4773d5189768f056e318eb40149634959d837ef4eaab6d98d3), graffiti: \"Lighthouse/v4.4.1-2841f60\", proposer_slashings_len: 0, attester_slashings_len: 0, attestations_len: 64, deposits_len: 0, voluntary_exits_len: 0, sync_committee_participants: 504, block_number: 19077279, block_hash: \"0x3a5c925499ac4488957114d082279eff79b53fdd550befc1143524429fcdf37a\", parent_hash: \"0xde5df95011d5fd2f97ff0ab7f75c03101559dd93f5c11556f68122a9cd649422\", fee_recipient: \"0x95222290dd7278aa3ddd389cc1e1d165cc4bafe5\", bls_to_execution_changes_len: 0, blob_kzg_commitments_len: 0), signature: \"93c16d25\")"
INF 2024-01-24 15:05:57.068+00:00 Slot end                                   topics="beacnde" slot=8273727 nextActionWait=1s931ms294us4ns nextAttestationSlot=8273728 nextProposalSlot=-1 syncCommitteeDuties=none head=729238b0:8273727

Attached are full logs for the past hour or so before these snippets: erilogs.txt nimlogs.txt

Steps to reproduce the behaviour

It happens on several blocks - just run both clients and wait.

Backtrace

N/A

xrchz avatar Jan 24 '24 15:01 xrchz

I am seeing the same problem with Nimbus.

Nimbus beacon node v24.2.2-24.2.2-stateofus
erigon version 2.58.1

catwith1hat avatar Mar 12 '24 18:03 catwith1hat

Please re-open if it still happens with v2.60.2

yperbasis avatar Jun 28 '24 11:06 yperbasis

This was still occuring with v2.60.0. I am unable to test v2.60.2 as I deleted my Erigon deployment because of this issue.

catwith1hat avatar Jun 28 '24 17:06 catwith1hat

Hey, so this is a complaint I got before, the reason why this happens is that Erigon 2.0 is not optimized for the chain tip, so when it calls forkchoice it goes in timeout and returns syncing/accepted. Now, FCU saying syncing just means that the node is syncing. NewPayload on the other hand return VALID if the block is valid, these functions have same statuses but do different things. We clearly cannot return valid on FCU if we cannot finish the job on time and we also cannot remove the timeout. so I will be closing this issue as this will be fixed with Erigon 3.0 where we are pouring a lot of care in optimization at chain tip

Giulio2002 avatar Jul 20 '24 20:07 Giulio2002