Lighthouse with beaconmock sometimes fails to download proposer duties
🐞 Bug Report
Description
Running Lighthouse against beaconmock exhibits connection problems, mostly connection resets and (presumably) early EOFs.
🔬 Minimal Reproduction
Create a compose cluster, and watch any of the Lighthouse instance logs with:
docker-compose logs -f vc0-lighthouse | grep download
🔥 Error
ERRO Failed to download proposer duties err: Some endpoints failed, num_failed: 1 http://node0:3600/ => RequestFailed(Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("node0")), port: Some(3600), path: "/eth/v1/validator/duties/proposer/1817407", query: None, fragment: None }, source: hyper::Error(Io, Os { code: 104, kind: ConnectionReset, message: "Connection reset by peer" }) })), service: duties
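To help read the log line above: on Linux, OS error code 104 is `ECONNRESET`, which Rust's standard library reports as `std::io::ErrorKind::ConnectionReset`. The following is a minimal sketch of how a test client could group the failure kinds mentioned in this report. The helper name `is_peer_disconnect` is hypothetical, and the choice to also count `UnexpectedEof`, `ConnectionAborted` and `BrokenPipe` is an assumption, not something taken from Lighthouse or Charon.

```rust
use std::io::{Error, ErrorKind};

/// Hypothetical helper: returns true when an I/O error suggests that the
/// remote side reset or closed the connection before the exchange finished.
/// Which kinds to include here is an assumption made for illustration.
fn is_peer_disconnect(err: &Error) -> bool {
    matches!(
        err.kind(),
        ErrorKind::ConnectionReset      // e.g. ECONNRESET (code 104 on Linux)
            | ErrorKind::ConnectionAborted
            | ErrorKind::BrokenPipe
            | ErrorKind::UnexpectedEof  // the "(presumed) early EOF" case
    )
}
```

A caller could use such a check to count resets separately from other request failures while stress-testing.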
🌍 Your Environment
Operating System:
Ubuntu Linux 22.10
What version of Charon are you running? (Which release)
v0.13.0
Anything else relevant (validator index / public key)?
Tried a slightly different version of the MVP HTTP stresser I wrote in #860, modified to unmarshal data returned from beaconmock:
use reqwest::{ClientBuilder, StatusCode};
use serde::Deserialize;
use serde::Serialize;
use std::time::Duration;

#[derive(Default, Debug, Clone, PartialEq, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct Root {
    #[serde(rename = "dependent_root")]
    pub dependent_root: String,
    pub data: Vec<Daum>,
    #[serde(rename = "execution_optimistic")]
    pub execution_optimistic: bool,
}

#[derive(Default, Debug, Clone, PartialEq, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct Daum {
    pub pubkey: String,
    pub slot: String,
    #[serde(rename = "validator_index")]
    pub validator_index: String,
}

#[derive(Debug)]
enum Error {
    Reqwest(reqwest::Error),
    MissingURL,
}

impl From<reqwest::Error> for Error {
    fn from(e: reqwest::Error) -> Self {
        Self::Reqwest(e)
    }
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    // Some simple CLI args requirements...
    let url = match std::env::args().nth(1) {
        Some(url) => url,
        None => return Err(Error::MissingURL),
    };

    eprintln!("Fetching {:?}...", url);

    let client = ClientBuilder::new()
        .tcp_keepalive(Duration::from_secs(1))
        .build()
        .unwrap();

    let mut idx: u64 = 0;
    loop {
        let res = client.post(url.clone()).send().await?;
        if res.status() != StatusCode::OK {
            println!("status code not OK!: {}", res.status())
        }

        let j: Root = res.json().await?;
        _ = j;

        idx += 1;
        println!("Done call #{}, data: {}", idx, j.dependent_root);
    }
}
I never managed to reproduce the error this way.
Disabling HTTP keep-alive seems to alleviate the issue inside the Docker compose cluster though.
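For reference, here is a minimal sketch of what "no HTTP keep-alive" means on the wire, using only the Rust standard library. It assumes a plain HTTP/1.1 endpoint, and the helper names `build_get_request` and `fetch_once` are hypothetical. It is not how the compose cluster was actually reconfigured, which this report does not specify. Each call opens a fresh connection and sends `Connection: close`, so no socket is ever reused between requests.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Hypothetical helper: builds a minimal HTTP/1.1 GET request.
/// With `keep_alive == false`, the `Connection: close` header asks the
/// server to close the socket once the response has been sent.
fn build_get_request(host: &str, path: &str, keep_alive: bool) -> String {
    let connection = if keep_alive { "keep-alive" } else { "close" };
    format!(
        "GET {} HTTP/1.1\r\nHost: {}\r\nAccept: application/json\r\nConnection: {}\r\n\r\n",
        path, host, connection
    )
}

/// Hypothetical helper: opens a new TCP connection for a single request and
/// reads until the server closes it. `addr` is a "host:port" string such as
/// "node0:3600". Assumes the response is valid UTF-8 (e.g. JSON).
fn fetch_once(addr: &str, path: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(build_get_request(addr, path, false).as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

With the reqwest-based stresser above, a comparable effect can probably be had by adding `.pool_max_idle_per_host(0)` to the `ClientBuilder` chain, which stops idle connections from being kept for reuse. Whether that matches what Lighthouse does internally is an assumption.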