OP node fails to derive the **finalized** state from L1
Bug Description
- Ever since we upgraded a month ago to the version that added --l1.beacon, we've noticed that some instances get stuck while deriving the finalized state from L1. They appear to stay in sync, but only because they keep receiving the sequencer's unsafe blocks. They recover after a restart.
- --l1.beacon points to a Lighthouse beacon archive node that is in sync and shows no issues.
Steps to Reproduce
- When a fork occurs on the L1 beacon chain (https://beaconscan.com/slot/8727479), some OP node instances fail with:
t=2024-03-28T03:58:12+0000 lvl=warn msg="Derivation process temporary error" attempts=1 err="engine stage failed: temp: failed to fetch blobs: failed to get blob sidecars for L1BlockRef 0xe31a552f337deb8cb4857d3acc94c98524841543a6cc6aa71f67bc1b8b222e02:19526568: failed to fetch blob sidecars for slot 8727479 block 0xe31a552f337deb8cb4857d3acc94c98524841543a6cc6aa71f67bc1b8b222e02:19526568: failed request with status 404: {\"code\":404,\"message\":\"NOT_FOUND: beacon block at slot 8727479\",\"stacktraces\":[]}"
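To check whether the beacon node itself returns the 404, the slot can be pulled out of the error text and queried directly. The following is a minimal sketch, not op-node's own logic: it assumes the standard Beacon API route `/eth/v1/beacon/blob_sidecars/{block_id}`, and the helper names `parse_blob_error`, `blob_sidecars_url`, and `probe` are hypothetical. It queries by slot because the hash in the log is an L1 block reference (block number 19526568).

```python
import re
import urllib.error
import urllib.request

# Matches the "failed to fetch blob sidecars for slot ..." part of the error above.
BLOB_ERR_RE = re.compile(
    r"failed to fetch blob sidecars for slot (\d+) block (0x[0-9a-fA-F]{64}):(\d+)"
)


def parse_blob_error(line):
    """Return (slot, l1_block_hash, l1_block_number), or None if the line does not match."""
    m = BLOB_ERR_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), m.group(2), int(m.group(3))


def blob_sidecars_url(beacon_base, block_id):
    """Build the Beacon API blob sidecars URL for a slot number or block root."""
    return f"{beacon_base.rstrip('/')}/eth/v1/beacon/blob_sidecars/{block_id}"


def probe(url, timeout=10.0):
    """Issue a GET and return the HTTP status code, including error statuses."""
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code
```

For example, `probe(blob_sidecars_url("http://lighthouse:5052", 8727479))` asks the beacon node from the configuration in this report for the same slot. If it also answers 404 there, the slot has no canonical beacon block on that node, and the failure is not specific to op-node's request.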
Expected behavior
- OP node should handle the forked blocks gracefully and continue to derive them from L1.
Environment Information:
- Operating System: NAME="Ubuntu" VERSION="20.04.5 LTS (Focal Fossa)" ID=ubuntu ID_LIKE=debian PRETTY_NAME="Ubuntu 20.04.5 LTS" VERSION_ID="20.04"
- Package Version (or commit hash): OP node v1.7.2, Lighthouse v5.1.2
Configurations:
/op-node --l1 http://erigon:8545 --l1.beacon http://lighthouse:5052 --l1.rpckind erigon --l1.trustrpc --l1.max-concurrency 130 --l1.rpc-max-batch-size 160 --l2 http://localhost:8551 --l2.jwt-secret /somesecret --network mainnet --rpc.addr 0.0.0.0 --rpc.port 5052 --metrics.enabled --metrics.addr 0.0.0.0 --metrics.port 5054 --rollup.load-protocol-versions --l1.http-poll-interval 3s
Logs:
l2_finalized=0xb7d30640c31b920e54100135ebf9164caf351b729d059038725e1f8f0923c90e:117977219 l2_safe=0xf9d5d7fe82cf10d55f330ddf484d923c7d91a74fe52a8951abadcacc5bedc483:117977353 l2_pending_safe=0xf9d5d7fe82cf10d55f330ddf484d923c7d91a74fe52a8951abadcacc5bedc483:117977353 l2_unsafe=0x423a5d64bf964c74c05c28d77b2bf9e6d08e8209e15533900b0641232e4dd54a:117999757 l2_backup_unsafe=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_time=1711598291 l1_derived=0xe31a552f337deb8cb4857d3acc94c98524841543a6cc6aa71f67bc1b8b222e02:19526568
t=2024-03-28T03:58:12+0000 lvl=warn msg="Derivation process temporary error" attempts=1 err="engine stage failed: temp: failed to fetch blobs: failed to get blob sidecars for L1BlockRef 0xe31a552f337deb8cb4857d3acc94c98524841543a6cc6aa71f67bc1b8b222e02:19526568: failed to fetch blob sidecars for slot 8727479 block 0xe31a552f337deb8cb4857d3acc94c98524841543a6cc6aa71f67bc1b8b222e02:19526568: failed request with status 404: {\"code\":404,\"message\":\"NOT_FOUND: beacon block at slot 8727479\",\"stacktraces\":[]}"
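The head references in the log above show how far derivation has fallen behind the unsafe head. As a rough sketch for spotting this condition in logs, the block numbers can be extracted and compared. The helper name `parse_refs` is hypothetical, and the regular expression only assumes the `key=0xhash:number` layout seen in this report.

```python
import re

# Matches fields such as l2_safe=0x<64 hex chars>:<block number>.
REF_RE = re.compile(r"\b(l2_\w+|l1_derived)=(0x[0-9a-fA-F]{64}):(\d+)")


def parse_refs(line):
    """Map each head name (l2_safe, l2_unsafe, ...) to its block number."""
    return {name: int(number) for name, _hash, number in REF_RE.findall(line)}


def head_gaps(refs):
    """Return (unsafe - safe, safe - finalized) in L2 blocks."""
    return (
        refs["l2_unsafe"] - refs["l2_safe"],
        refs["l2_safe"] - refs["l2_finalized"],
    )
```

Applied to the values in this report (l2_unsafe 117999757, l2_safe 117977353, l2_finalized 117977219), the safe head is 22404 blocks behind the unsafe head and 134 blocks ahead of the finalized head. That matches the description of a node that follows unsafe blocks while derivation is stalled.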
As I understand it, your beacon node does not have the blobs. You need to run Lighthouse with the --prune-blobs=false parameter.
https://docs.optimism.io/builders/node-operators/management/blobs
Nope, --l1.beacon and --l1.beacon-archiver both point to an archive node (Lighthouse is running with --prune-blobs=false and --reconstruct-historic-states).
Recent changes likely fixed this issue.