
Increased frequency of missed attestations post-merge, errors in consensus client

Open PanosChtz opened this issue 3 years ago • 12 comments

Description

I am experiencing frequent missed attestations since the merge. I have 7 so far, which is definitely abnormal. I suspect that these might be correlated with errors I see in the log below.

Version

Lighthouse v3.1.0 aarch64 binary (NOT portable)

Present Behaviour

I examined the beacon node logs around my last missed attestation, at slot 4703169. Here is what I see:

3:57.580 ERRO Execution engine call failed            error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(127.0.0.1)), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
Sep 15 17:13:57 validator lighthouse[927]: Sep 15 17:13:57.582 ERRO Error fetching block for peer           error: ExecutionLayerErrorPayloadReconstruction(0x7452239120415b44302afd7c8c5b3bc31b09838c253d8ca5f2d6cf4400e30399, EngineError(Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, >
...
Sep 15 17:14:08 validator lighthouse[927]: Sep 15 17:14:08.003 WARN Did not advance head state              reason: Err(HeadMissingFromSnapshotCache(0x22a340948469b7448d8a7d5c96b23ea0e05d51f2bb95f1b2a9f30da55ee1198f)), service: state_advance
Sep 15 17:14:15 validator lighthouse[927]: Sep 15 17:14:15.227 INFO New block received                      root: 0xe09d15f3c06aede951698f5166b41835c677f519cfba93a314de45e9c8fefa95, slot: 4703169
Sep 15 17:14:17 validator lighthouse[927]: Sep 15 17:14:17.002 INFO Synced                                  slot: 4703169, block:    …  empty, epoch: 146974, finalized_epoch: 146972, finalized_root: 0x4ab8…ac6c, exec_hash: 0x4cfb…3a96 (verified), peers: 88, service: slot_notifier
Sep 15 17:14:20 validator lighthouse[927]: Sep 15 17:14:20.148 WARN Did not advance head state              reason: Err(HeadMissingFromSnapshotCache(0x1fc17f075f6ca7b260cd17d278776c05390c7966daa6ce386d994fe9fcdf14d0)), service: state_advance
...
Sep 15 17:14:57 validator lighthouse[927]: Sep 15 17:14:57.006 ERRO Failed to advance head state            error: SnapshotCacheLockTimeout, service: state_advance>

The validator logs for that missed attestation seem normal:

Sep 15 17:14:15 validator lighthouse[928]: Sep 15 17:14:15.355 INFO Successfully published attestations type: unaggregated, slot: 4703169, committee_index: 46, head_block: 0x1fc17f075f6ca7b260cd17d278776c05390c7966daa6ce386d994fe9fcdf14d0, validator_indices: [109040], count: 1, service: attestation
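
One way to check whether an attestation like this raced a late block is to compare the log timestamps against the slot clock. Below is a minimal Python sketch, not part of Lighthouse. It uses the public mainnet genesis time and 12-second slots, and it assumes that the journal timestamps above are UTC and that the year is 2022. Validators attest as soon as a block for the slot arrives, or one third of the way into the slot (4 s) if none has.

```python
from datetime import datetime, timezone

# Mainnet beacon chain parameters (public constants).
GENESIS_TIME = 1606824023      # 2020-12-01 12:00:23 UTC
SECONDS_PER_SLOT = 12
ATTESTATION_DEADLINE = SECONDS_PER_SLOT / 3  # 4 s into the slot


def slot_start(slot: int) -> datetime:
    """Wall-clock start of a slot, in UTC."""
    return datetime.fromtimestamp(GENESIS_TIME + slot * SECONDS_PER_SLOT, tz=timezone.utc)


def seconds_into_slot(slot: int, event: datetime) -> float:
    """How far into the slot an event was logged."""
    return (event - slot_start(slot)).total_seconds()


# "New block received ... slot: 4703169" was logged at Sep 15 17:14:15.227.
# Assumption: journal timestamps are UTC and the year is 2022.
block_logged = datetime(2022, 9, 15, 17, 14, 15, 227000, tzinfo=timezone.utc)
offset = seconds_into_slot(4703169, block_logged)

print("slot start:", slot_start(4703169).isoformat())
print(f"block logged {offset:.3f} s into the slot; "
      f"after the 4 s mark: {offset > ATTESTATION_DEADLINE}")
```

Under those assumptions the slot starts at 17:14:11 UTC, so the block was logged roughly 4.2 s into the slot, just past the point at which the validator would already be preparing its attestation. The log time is when the beacon node reported the block, which is not necessarily when it first arrived over the network.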

Expected Behaviour

The beacon node should not produce these errors, and the validator should not miss attestations.

Steps to resolve

N/A

PanosChtz avatar Sep 15 '22 18:09 PanosChtz

The issue keeps recurring. I missed an attestation at slot 4703670. Here are the logs from a few minutes before:

Sep 15 18:43:32 validator lighthouse[927]: Sep 15 18:43:32.015 WARN Did not advance head state              reason: Err(HeadMissingFromSnapshotCache(0x15b0e195218e5fcf103e21f56fb8cd12700266b307b667b4923583e6d7ed2745)), service: state_advance
Sep 15 18:43:33 validator lighthouse[927]: Sep 15 18:43:33.293 ERRO Execution engine call failed            error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(127.0.0.1)), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
Sep 15 18:43:33 validator lighthouse[927]: Sep 15 18:43:33.296 WARN Error whilst processing payload status  error: Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(127.0.0.1)), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }) }, service: exec
Sep 15 18:43:35 validator lighthouse[927]: Sep 15 18:43:35.990 ERRO Execution engine call failed            error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(127.0.0.1)), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
Sep 15 18:43:36 validator lighthouse[927]: Sep 15 18:43:36.010 ERRO Error fetching block for peer           error: ExecutionLayerErrorPayloadReconstruction(0x3f1fbb383d33dc291516c43efb202485452149a1a16a62f5e61ff07052ac431b, EngineError(Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(127.0.0.1)), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }) })), block_root: 0x8c4c9efd507ef3d338f1704e8147a2f8677b63be97ee9c09187ce5084729bf6f
Sep 15 18:43:37 validator lighthouse[927]: Sep 15 18:43:37.459 ERRO Execution engine call failed            error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(127.0.0.1)), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
Sep 15 18:43:37 validator lighthouse[927]: Sep 15 18:43:37.473 ERRO Error fetching block for peer           error: ExecutionLayerErrorPayloadReconstruction(0x51b247d76882559207c12f750a942847304cdd122dc85ee3973a2d5768ebfe8a, EngineError(Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(127.0.0.1)), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }) })), block_root: 0x3c2e6e2cde7771416dd70b33e879e390a7d54c8adc70b26a80ab2a48bb0ae853
Sep 15 18:43:41 validator lighthouse[927]: Sep 15 18:43:41.001 INFO Synced                                  slot: 4703616, block:    …  empty, epoch: 146988, finalized_epoch: 146986, finalized_root: 0x3b13…301c, exec_hash: 0x0ac3…eb7b (verified), peers: 87, service: slot_notifier
Sep 15 18:43:44 validator lighthouse[927]: Sep 15 18:43:44.000 WARN Did not advance head state              reason: Err(HeadMissingFromSnapshotCache(0x15b0e195218e5fcf103e21f56fb8cd12700266b307b667b4923583e6d7ed2745)), service: state_advance
Sep 15 18:43:49 validator lighthouse[927]: Sep 15 18:43:49.064 INFO New RPC block received                  hash: 0x9315…4333, slot: 4703615
Sep 15 18:43:53 validator lighthouse[927]: Sep 15 18:43:53.009 INFO Synced                                  slot: 4703617, block:    …  empty, epoch: 146988, finalized_epoch: 146986, finalized_root: 0x3b13…301c, exec_hash: 0x86d6…91f9 (verified), peers: 87, service: slot_notifier
Sep 15 18:43:56 validator lighthouse[927]: Sep 15 18:43:56.006 WARN Did not advance head state              reason: Err(HeadMissingFromSnapshotCache(0x93155e56db7c9337a2aa65507e2c845d5d858ad0d2deea45001d63ec78354333)), service: state_advance
Sep 15 18:43:57 validator lighthouse[927]: Sep 15 18:43:57.778 INFO New RPC block received                  hash: 0x23bc…b5a0, slot: 4703616
Sep 15 18:44:00 validator lighthouse[927]: Sep 15 18:44:00.868 INFO Sending metrics to remote endpoint      endpoint: https://beaconcha.in/, service: monitoring_client
Sep 15 18:44:02 validator lighthouse[927]: Sep 15 18:44:02.866 INFO New RPC block received                  hash: 0x0e9f…5556, slot: 4703617
Sep 15 18:44:05 validator lighthouse[927]: Sep 15 18:44:05.001 INFO Synced                                  slot: 4703618, block:    …  empty, epoch: 146988, finalized_epoch: 146986, finalized_

And from the validator:

Sep 15 18:54:17 validator lighthouse[928]: Sep 15 18:54:17.001 INFO All validators active                   slot: 4703669, epoch: 146989, total_validators: 1, active_validators: 1, current_epoch_proposers: 0, service: notifier
Sep 15 18:54:27 validator lighthouse[928]: Sep 15 18:54:27.100 INFO Successfully published attestations     type: unaggregated, slot: 4703670, committee_index: 56, head_block: 0xd2b85031ec59f83050fbb9fa2c5d9def1862c6f22f34a00856491684e5cbf66d, validator_indices: [109040], count: 1, service: attestation
Sep 15 18:54:29 validator lighthouse[928]: Sep 15 18:54:29.001 INFO Connected to beacon node(s)             synced: 1, available: 1, total: 1, service: notifier

PanosChtz avatar Sep 15 '22 19:09 PanosChtz

Sorry about the missed attestations, we'll be working on several optimisations to make all of this a bit smoother.

What hardware are you running on, and which execution client are you using?

We've seen a few reports of issues like this which are often related to running Besu as the EL. Our theory is that it gets overwhelmed by the requests to reconstruct blocks which has knock-on effects on Lighthouse and causes attestation misses.

michaelsproul avatar Sep 15 '22 22:09 michaelsproul

I'm using an RPi4 and my EL client is Geth. The problems have now become worse. I am getting these errors all the time:

Sep 16 00:27:15 validator lighthouse[130547]: Sep 16 00:27:15.439 ERRO Execution engine call failed            error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(127.0.0.1)), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
Sep 16 00:27:15 validator lighthouse[130547]: Sep 16 00:27:15.439 WARN Error whilst processing payload status  error: Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(127.0.0.1)), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }) }, service: exec
Sep 16 00:27:16 validator lighthouse[130547]: Sep 16 00:27:16.588 ERRO Execution engine call failed            error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(127.0.0.1)), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
Sep 16 00:27:16 validator lighthouse[130547]: Sep 16 00:27:16.592 WARN Error whilst processing payload status  error: Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(127.0.0.1)), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }) }, service: exec
Sep 16 00:27:16 validator lighthouse[130547]: Sep 16 00:27:16.593 CRIT Failed to update execution head         error: ExecutionForkChoiceUpdateFailed(EngineError(Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv4(127.0.0.1)), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }) })), service: beacon

I restarted the Lighthouse service, and now Lighthouse seems to be consistently about 70 slots behind and cannot catch up:


Sep 16 00:29:41 validator lighthouse[130547]: Sep 16 00:29:41.002 INFO Syncing                                 est_time: 26 mins, speed: 0.04 slots/sec, distance: 67 slots (13 mins), peers: 81, service: slot_notifier
Sep 16 00:29:43 validator lighthouse[130547]: Sep 16 00:29:43.277 INFO Sending metrics to remote endpoint      endpoint: https://beaconcha.in/, service: monitoring_client
Sep 16 00:29:53 validator lighthouse[130547]: Sep 16 00:29:53.001 INFO Syncing                                 est_time: 27 mins, speed: 0.04 slots/sec, distance: 68 slots (13 mins), peers: 87, service: slot_notifier
Sep 16 00:29:56 validator lighthouse[130547]: Sep 16 00:29:56.657 CRIT Beacon block processing error           error: ValidatorPubkeyCacheLockTimeout, service: beacon
Sep 16 00:29:56 validator lighthouse[130547]: Sep 16 00:29:56.657 WARN BlockProcessingFailure                  outcome: ValidatorPubkeyCacheLockTimeout, msg: unexpected condition in processing block.
Sep 16 00:30:05 validator lighthouse[130547]: Sep 16 00:30:05.016 INFO Syncing                                 est_time: 27 mins, speed: 0.04 slots/sec, distance: 69 slots (13 mins), peers: 87, service: slot_notifier
Sep 16 00:30:15 validator lighthouse[130547]: Sep 16 00:30:15.807 CRIT Beacon block processing error           error: ValidatorPubkeyCacheLockTimeout, service: beacon
Sep 16 00:30:15 validator lighthouse[130547]: Sep 16 00:30:15.814 WARN BlockProcessingFailure                  outcome: ValidatorPubkeyCacheLockTimeout, msg: unexpected condition in processing block.
Sep 16 00:30:17 validator lighthouse[130547]: Sep 16 00:30:17.003 INFO Syncing                                 est_time: 55 mins, speed: 0.02 slots/sec, distance: 69 slots (13 mins), peers: 87, service: slot_notifier
Sep 16 00:30:17 validator lighthouse[130547]: Sep 16 00:30:17.037 INFO Sync state updated                      new_state: Syncing Head Chain, old_state: Syncing Finalized Chain, service: sync

Any ideas?

PanosChtz avatar Sep 16 '22 00:09 PanosChtz

Second data point. I am observing the same issues with Geth. I've restarted both Geth and Lighthouse and will see what happens from there. Does not appear to be CPU or memory bound, as far as I can tell.

This is on an i5-10210U with 32 GB RAM and a 2 TB SSD, running Ubuntu on bare metal with 1 Gb fiber internet.

onyxrev avatar Sep 16 '22 00:09 onyxrev

@PanosChtz You can try to improve your situation slightly by adding --disable-lock-timeouts to your beacon node flags. This will stop the Pi from getting stuck in a retry loop as it times out on the same block over and over.

michaelsproul avatar Sep 16 '22 00:09 michaelsproul

Does not appear to be CPU or memory bound, as far as I can tell.

@onyxrev It appears that I/O is the biggest bottleneck in cases like this, although your 2TB SSD should be able to keep up. How often are you getting timeouts, and do you see the CRIT with ValidatorPubkeyCacheLockTimeout?
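
To put a number on "how often", one option is to tally the relevant messages per hour from the beacon node logs. The following Python sketch is only an illustration: the message substrings are copied from the log excerpts in this thread, the labels and the `count_by_hour` helper are made up for this example, and only `CRIT` and `ERRO` lines are counted so that the `WARN` line that accompanies each `CRIT` is not counted twice.

```python
import re
from collections import Counter

# Substrings copied from the log excerpts in this thread; the labels are arbitrary.
PATTERNS = {
    "engine_timeout": "Execution engine call failed",
    "payload_reconstruction": "ExecutionLayerErrorPayloadReconstruction",
    "pubkey_cache_lock_timeout": "ValidatorPubkeyCacheLockTimeout",
    "snapshot_cache_lock_timeout": "SnapshotCacheLockTimeout",
}

# Lighthouse's own timestamp plus level, e.g. "Sep 16 00:29:56.657 CRIT".
# Group 1 keeps the date and hour only ("Sep 16 00").
HOUR_RE = re.compile(r"([A-Z][a-z]{2} \d{1,2} \d{2}):\d{2}:\d{2}\.\d{3} (?:CRIT|ERRO)")


def count_by_hour(lines):
    """Count CRIT/ERRO lines per (hour, label)."""
    counts = Counter()
    for line in lines:
        m = HOUR_RE.search(line)
        if not m:
            continue  # INFO/WARN lines, or lines without a Lighthouse timestamp
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[(m.group(1), label)] += 1
    return counts


# Sample lines taken from the logs posted earlier in this thread.
sample = [
    "Sep 15 17:14:57 validator lighthouse[927]: Sep 15 17:14:57.006 ERRO Failed to advance head state            error: SnapshotCacheLockTimeout, service: state_advance",
    "Sep 16 00:29:56 validator lighthouse[130547]: Sep 16 00:29:56.657 CRIT Beacon block processing error           error: ValidatorPubkeyCacheLockTimeout, service: beacon",
    "Sep 16 00:29:56 validator lighthouse[130547]: Sep 16 00:29:56.657 WARN BlockProcessingFailure                  outcome: ValidatorPubkeyCacheLockTimeout, msg: unexpected condition in processing block.",
    "Sep 16 00:30:15 validator lighthouse[130547]: Sep 16 00:30:15.807 CRIT Beacon block processing error           error: ValidatorPubkeyCacheLockTimeout, service: beacon",
]

counts = count_by_hour(sample)
for (hour, label), n in sorted(counts.items()):
    print(hour, label, n)
```

In practice the `sample` list would be replaced by the lines of an exported log file. Grouping by hour is an arbitrary choice; a finer bucket may be more useful when the timeouts come in short bursts.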

michaelsproul avatar Sep 16 '22 00:09 michaelsproul

@michaelsproul After chatting with you on Discord, the verdict was that my Pi just could not keep up, since it kept falling further behind in slots. Feel free to close this issue once the other users here have resolved theirs. Thanks!

PanosChtz avatar Sep 16 '22 13:09 PanosChtz

@michaelsproul Not seeing that ValidatorPubkeyCacheLockTimeout error, but I am seeing some EL <> CL communication issues via the ExecutionLayerErrorPayloadReconstruction error that is floating around in the Discord channels. The system was configured and ready for the merge and Geth appears to be listening properly on 8551.

Sep 16 14:40:10 ethereum lighthouse[2555816]: Sep 16 14:40:10.968 ERRO Error fetching block for peer           error: ExecutionLayerErrorPayloadReconstruction(0x14f0502dc27bc049e179b6e2bc8f3dff09b2c8b3804d0b544adfbfae3e78a06b, EngineError(Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("localhost")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }) })), block_root: 0x9066a89d61bf16984ab3eaf023d3260a1f45a81db7555903bef4997d1b34dba3

and

Sep 16 14:45:30 ethereum lighthouse[2555816]: Sep 16 14:45:30.446 ERRO Execution engine call failed            error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("localhost")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
Sep 16 14:45:30 ethereum lighthouse[2555816]: Sep 16 14:45:30.446 WARN Error whilst processing payload status  error: Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("localhost")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }) }, service: exec
Sep 16 14:45:30 ethereum lighthouse[2555816]: Sep 16 14:45:30.727 ERRO Execution engine call failed            error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("localhost")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
Sep 16 14:45:30 ethereum lighthouse[2555816]: Sep 16 14:45:30.727 WARN Error whilst processing payload status  error: Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("localhost")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }) }, service: exec
Sep 16 14:45:30 ethereum lighthouse[2555816]: Sep 16 14:45:30.727 CRIT Failed to update execution head         error: ExecutionForkChoiceUpdateFailed(EngineError(Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("localhost")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }) })), service: beacon

Geth
Version: 1.10.25-stable
Git Commit: 69568c554880b3567bace64f8848ff1be27d084d
Architecture: amd64
Go Version: go1.18.5
Operating System: linux

Lighthouse v3.1.0-aa022f4
BLS library: blst-modern
SHA256 hardware acceleration: false
Specs: mainnet (true), minimal (false), gnosis (true)

onyxrev avatar Sep 16 '22 14:09 onyxrev

Referring to OP's validator logs: I believe the attestation fails because the validator attests to the parent block instead of the block it actually needs to attest to. So the head_block in

Successfully published attestations type: unaggregated, slot: 4703169, committee_index: 46, head_block: 0x1fc17f075f6ca7b260cd17d278776c05390c7966daa6ce386d994fe9fcdf14d0, validator_indices: [XXXXXX], count: 1, service: attestation

should have been:

Successfully published attestations type: unaggregated, slot: 4703169, committee_index: 46, head_block: 0xe09d15f3c06aede951698f5166b41835c677f519cfba93a314de45e9c8fefa95, validator_indices: [XXXXXX], count: 1, service: attestation
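
This comparison can be automated across a longer log. Below is a minimal Python sketch, assuming the log formats shown in this thread. The regular expressions and the `stale_head_attestations` helper are hypothetical names for this example, not part of Lighthouse. It pairs each published attestation with the block the beacon node logged for the same slot and reports the slots where the two roots differ. A block that was received is not guaranteed to be the canonical head, so a mismatch is a hint rather than proof.

```python
import re

# Patterns based on the log lines quoted in this thread.
BLOCK_RE = re.compile(r"New block received\s+root: (0x[0-9a-f]{64}), slot: (\d+)")
ATTEST_RE = re.compile(
    r"Successfully published attestations\s+type: \w+, slot: (\d+), "
    r"committee_index: \d+, head_block: (0x[0-9a-f]{64})"
)


def stale_head_attestations(beacon_lines, validator_lines):
    """Return (slot, attested_head, received_block) where the roots differ."""
    blocks = {}
    for line in beacon_lines:
        m = BLOCK_RE.search(line)
        if m:
            blocks[int(m.group(2))] = m.group(1)

    stale = []
    for line in validator_lines:
        m = ATTEST_RE.search(line)
        if not m:
            continue
        slot, head = int(m.group(1)), m.group(2)
        received = blocks.get(slot)
        if received is not None and received != head:
            stale.append((slot, head, received))
    return stale


# The two lines from OP's first post.
beacon = [
    "Sep 15 17:14:15 validator lighthouse[927]: Sep 15 17:14:15.227 INFO New block received                      root: 0xe09d15f3c06aede951698f5166b41835c677f519cfba93a314de45e9c8fefa95, slot: 4703169",
]
validator = [
    "Sep 15 17:14:15 validator lighthouse[928]: Sep 15 17:14:15.355 INFO Successfully published attestations type: unaggregated, slot: 4703169, committee_index: 46, head_block: 0x1fc17f075f6ca7b260cd17d278776c05390c7966daa6ce386d994fe9fcdf14d0, validator_indices: [109040], count: 1, service: attestation",
]

stale = stale_head_attestations(beacon, validator)
for slot, head, received in stale:
    print(f"slot {slot}: attested to {head[:10]}, block received was {received[:10]}")
```

Slots with no "New block received" line are skipped, since an attestation to the previous head is the expected behaviour when no block is seen for the slot.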

ronaldjmaas avatar Sep 16 '22 15:09 ronaldjmaas

Furthermore, the same issue has been reported on the Besu GitHub:

https://github.com/hyperledger/besu/issues/4398

ronaldjmaas avatar Sep 16 '22 15:09 ronaldjmaas

I also missed an attestation. There were no errors in the LH BN or VC at the time, but Besu has been having JSON streaming errors. Not sure if they are related. See the issue over at the Besu GitHub.

Example of LH errors seen:

Lighthouse sometimes complains about "Error Execution engine call failed" and "Error during execution engine upcheck":

Sep 15 15:18:53  lighthouse[127013]: Sep 15 12:18:53.000 INFO Synced                                  slot: 4701692, block: 0x9f3a…4bde, epoch: 146927, finalized_epoch: 146925, fina>
Sep 15 15:19:00  lighthouse[127013]: Sep 15 12:19:00.808 INFO New block received                      root: 0xdfdc116c6683f54d0b6321f9b58e008bad1ec05961808f76c9070ecf6234b90a, slot:>
Sep 15 15:19:01  lighthouse[127013]: Sep 15 12:19:01.914 ERRO Execution engine call failed            error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", canno>
Sep 15 15:19:01  lighthouse[127013]: Sep 15 12:19:01.914 ERRO Error fetching block for peer           error: ExecutionLayerErrorPayloadReconstruction(0x56183a657fa1b7498f4ec197ce456>
Sep 15 15:19:01  lighthouse[127013]: Sep 15 12:19:01.924 ERRO Execution engine call failed            error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", canno>
Sep 15 15:19:01  lighthouse[127013]: Sep 15 12:19:01.924 ERRO Error fetching block for peer           error: ExecutionLayerErrorPayloadReconstruction(0x988600f495a40defc8127eda5b59c>
Sep 15 15:19:05  lighthouse[127013]: Sep 15 12:19:05.000 INFO Synced                                  slot: 4701693, block: 0xdfdc…b90a, epoch: 146927, finalized_epoch: 146925, fina>
Sep 15 15:19:12  lighthouse[127013]: Sep 15 12:19:12.084 INFO New block received                      root: 0xcdbefd7e58594d6cfdd8ed336a7adc63335e1146b92c5b20cd50c2cec597294a, slot:>
Sep 15 15:19:12  lighthouse[127013]: Sep 15 12:19:12.247 INFO New RPC block received                  hash: 0xcdbe…294a, slot: 4701694

maninthecryptosuit avatar Sep 16 '22 17:09 maninthecryptosuit

Happy to report that attestations have stabilized on my end. I suspect that misbehaving clients on the network added load on my execution layer, causing it to fall behind.

onyxrev avatar Sep 20 '22 20:09 onyxrev

I am seeing a similar issue, which is causing missed attestations. I am running Geth v1.10.25-stable with Lighthouse v3.1.0-aa022f4, and I get these error logs in my beacon node process around once a day:

lighthouse[256023]: Sep 25 13:50:13.243 ERRO Execution engine call failed            error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("localhost")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
lighthouse[256023]: Sep 25 13:50:13.243 ERRO Execution engine call failed            error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("localhost")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
lighthouse[256023]: Sep 25 13:50:13.243 ERRO Execution engine call failed            error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("localhost")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
lighthouse[256023]: Sep 25 13:50:13.243 ERRO Error fetching block for peer           error: ExecutionLayerErrorPayloadReconstruction(0xdd156559d6afd1e86d04a65d85243b35848968fe18ddab61a0b613a73df18801, EngineError(Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("localhost")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }) })), block_root: 0xf4056981c9378902073163b682dc0eb90c69183b01027207f6a69e0b4ef656e7
lighthouse[256023]: Sep 25 13:50:13.244 ERRO Error fetching block for peer           error: ExecutionLayerErrorPayloadReconstruction(0x78c9b95744f2468355cbc11caf3984091f720b4f38a7a41f25891b9c8ae2efc4, EngineError(Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("localhost")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }) })), block_root: 0x6af5d9d6f4676b12eadbaa2a7e4ce71f106978ba313b410fc454a74f8f3b9c9a
lighthouse[256023]: Sep 25 13:50:13.244 ERRO Error fetching block for peer           error: ExecutionLayerErrorPayloadReconstruction(0xea85533fe18557ceb63dae7896d3f0833ac255fd30b9760eb49ffab6cf101407, EngineError(Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("localhost")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }) })), block_root: 0xa6f9ac76346c1e0eacfb520f95006f57fc82d9bebcd8c4b0d9ccee68ae712a72
lighthouse[256023]: Sep 25 13:50:13.247 ERRO Execution engine call failed            error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("localhost")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
lighthouse[256023]: Sep 25 13:50:13.248 ERRO Error fetching block for peer           error: ExecutionLayerErrorPayloadReconstruction(0xab301646ff7aaf806e4cf9e714a35f70edadec8ba712607549bc03d16c05578e, EngineError(Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("localhost")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }) })), block_root: 0x3d951eea50f265d469ef85a756dfef93db3a7622f209b484de81ebf089dbde4f

There are no error or warning logs at all on the Geth side, and the issue goes away without taking any action, until it happens again.

I confirmed that CPU, Memory and network activity looked fine at the time these errors occurred.

gamell avatar Sep 26 '22 04:09 gamell

@gamell Please try upgrading to Lighthouse v3.1.2 and setting --prune-payloads false. This will alleviate some of the load on your execution node.

michaelsproul avatar Sep 26 '22 05:09 michaelsproul