erigon
Erigon OOM Killed - (currently trying 2.53.4)
System information
Erigon version: 2.53.4
OS & Version: Linux / Ubuntu on AWS with 64 GB RAM
Commit hash: tag - v2.53.4
Erigon Service:
[Unit]
Description=Erigon Execution Layer Client service (Mainet)
Wants=network-online.target
After=network-online.target
[Service]
Environment="GOGC=50 GOMEMLIMIT=24GiB GOMAXPROCS=2"
MemoryLimit=24G
OOMScoreAdjust=-100
Type=simple
User=root
Restart=allways
RestartSec=5
KillSignal=SIGINT
TimeoutStopSec=300
ExecStart=/opt/erigon/build/bin/erigon
--datadir /opt/data/erigon
--chain mainnet
--port "30303"
--metrics
--pprof
--authrpc.jwtsecret "/opt/secrets/jwt.hex"
--http
--ws
--http.vhosts=""
--http.corsdomain=""
--http.addr="0.0.0.0"
--http.port "8545"
--http.api "eth,erigon,personal,db,admin,web3,net,trace,rpc,debug,txpool"
--txpool.api.addr "0.0.0.0:9094"
--private.api.addr "0.0.0.0:9090"
--batchSize=1G
[Install]
WantedBy=multi-user.target
Consensus Layer: Lighthouse v4.5.0-441fc16
Consensus Service:
[Unit]
Description=Lighthouse Consensus Layer Client BN (Mainet)
Wants=network-online.target
After=network-online.target
[Service]
Type=simple
User=root
Restart=allways
RestartSec=5
KillSignal=SIGINT
TimeoutStopSec=300
ExecStart=/usr/local/bin/lighthouse bn
--network mainnet
--datadir "/opt/data/lighthouse"
--execution-endpoint http://localhost:8551
--execution-jwt "/opt/secrets/jwt.hex"
--checkpoint-sync-url https://mainnet.checkpoint.sigp.io
--disable-deposit-contract-sync
--reconstruct-historic-states
--metrics
[Install]
WantedBy=multi-user.target
Chain/Network: mainnet
Expected behaviour: The node syncs properly after the version upgrade.
Actual behaviour: After a couple of hours in sync, Erigon gets killed by the OOM killer.
Steps to reproduce the behaviour: Full sync on v2.51.0, then upgrade to v2.53.4.
Backtrace: N/A
Executed: go tool pprof -inuse_space -png http://127.0.0.1:6060/debug/pprof/heap > mem.png
The resulting mem.png looks healthy: the heap is using the expected 3 GB.
OK, but the OOM kills are still happening. Is there anything else I can do to prevent them?
dmesg shows:
[210146.815414] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=eth1.service,mems_allowed=0,oom_memcg=/system.slice/eth1.service,task_memcg=/system.slice/eth1.service,task=erigon,pid=7926,uid=0
[210146.815570] Memory cgroup out of memory: Killed process 7926 (erigon) total-vm:5312414528kB, anon-rss:20685544kB, file-rss:2650224kB, shmem-rss:0kB, UID:0 pgtables:4081092kB oom_score_adj:-100
[210148.956419] oom_reaper: reaped process 7926 (erigon), now anon-rss:0kB, file-rss:1958520kB, shmem-rss:0kB
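To make the kernel's numbers easier to compare with the 24G MemoryLimit and the roughly 3 GB heap profile, the kB counters in the "Killed process" line can be converted to GiB. This is only a rough sketch: the helper names `parse_oom_kill` and `kib_to_gib` are made up for illustration, and it assumes the kernel's "kB" means 1024 bytes. Keep in mind that the pprof heap profile only covers memory allocated through the Go runtime, so it need not match anon-rss.

```python
import re

KIB_PER_GIB = 1024 * 1024  # assumption: the kernel's "kB" is 1024 bytes


def parse_oom_kill(line: str) -> dict:
    """Return every `name:<number>kB` counter found in a kernel OOM log line."""
    return {name: int(value) for name, value in re.findall(r"([\w-]+):(\d+)kB", line)}


def kib_to_gib(kib: int) -> float:
    """Convert a kB counter to GiB."""
    return kib / KIB_PER_GIB


# The "Killed process" entry from the dmesg output above.
line = (
    "Memory cgroup out of memory: Killed process 7926 (erigon) "
    "total-vm:5312414528kB, anon-rss:20685544kB, file-rss:2650224kB, "
    "shmem-rss:0kB, UID:0 pgtables:4081092kB oom_score_adj:-100"
)

counters = parse_oom_kill(line)
for name in ("anon-rss", "file-rss", "pgtables"):
    print(f"{name}: {kib_to_gib(counters[name]):.2f} GiB")
```

For this entry, anon-rss comes to about 19.7 GiB, file-rss to about 2.5 GiB and the page tables to about 3.9 GiB.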
And what does alloc show in the logs before the kill? Try to capture a profile when alloc > 5g.
[txpool] stat pending=9964 baseFee=0 queued=5125 alloc=3.1GB sys=7.5GB
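If it helps to catch the moment alloc crosses 5 GB, a small log filter could flag matching lines so that a profile can be captured right then. This is a minimal sketch, not part of Erigon: the names `parse_mem_stat` and `should_profile` are hypothetical, and it assumes the `alloc=` and `sys=` fields always carry a KB, MB or GB suffix with 1024-based units.

```python
import re

# Assumption: the log suffixes are 1024-based.
UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3}


def parse_mem_stat(line: str) -> dict:
    """Extract alloc= and sys= from a stat log line, converted to bytes."""
    stats = {}
    for key, number, unit in re.findall(r"\b(alloc|sys)=([\d.]+)(KB|MB|GB)", line):
        stats[key] = float(number) * UNITS[unit]
    return stats


def should_profile(line: str, threshold_bytes: float = 5 * 1024**3) -> bool:
    """True when the line reports alloc above the threshold (5 GB by default)."""
    return parse_mem_stat(line).get("alloc", 0.0) > threshold_bytes


sample = "[txpool] stat pending=9964 baseFee=0 queued=5125 alloc=3.1GB sys=7.5GB"
print(should_profile(sample))
```

Each log line can be passed through `should_profile`, and the go tool pprof command from earlier in the thread run whenever it returns True.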
Unfortunately, this picture looks healthy.
Just to clarify, is it normal that 64 GB is not enough to run Erigon?