erigon
erigon copied to clipboard
Memory limits? (how to prevent OOM kill?)
Hi again,
I've noticed Erigon swapping at times so I tried disabling my swap entirely. However, this has resulted in Erigon being OOM killed after a few days of syncing. I then tried disabling memory overcommitment system-wide (sysctl vm.overcommit_memory=2
) but this is much worse still since it results in Erigon crashing after one or two minutes with the message fatal error: runtime: out of memory
(and some very long stack traces). This is during initial sync, during the Execution phase (7/16).
I have 16GB of RAM + 8GB of swap and am running a Linux 5 kernel.
So I have a few questions:
- How does Erigon detect how much memory it can use? I have tried running it in a docker container with mem_limit and memswap_limit set to 8g but this just results in Erigon being OOM-killed a few minutes into block execution once total system memory usage approaches 9GB (I only have Erigon running, excepting system daemons). So it almost seems like Erigon tries to detect hardware memory limits.
- Would it make sense to add a command-line option to explicitly limit memory usage?
- Are there any practical steps I can take right now to limit Erigon's memory usage so that it doesn't get OOM-killed?
probably fixed by https://github.com/ledgerwatch/erigon/pull/5253
- yes, we can do it in 90% of cases, but not in 100%
- you can:
- update to by https://github.com/ledgerwatch/erigon/pull/5253
- hardcode to
cmd/erigon.go
next linecompress.SetDecompressionTableCondensity(6)
Thank you, @AskAlexSharov.
- yes, we can do it in 90% of cases, but not in 100%
That would probably still help! At least that way I could prevent Erigon and my CL client from eating each other's RAM space and swapping.
you can:
- update to by OOM fix for bodies stage #5253
I tried this. I'm now on the latest devel version (erigon/v2022.09.3-dev-fb0f811f/linux-amd64/go1.19.1) with that commit, but with mem_limit and memswap_limit set to 7g, Erigon still gets OOM-killed after a few minutes in the Execution phase.
* hardcode to `cmd/erigon.go` next line `compress.SetDecompressionTableCondensity(6)`
Unfortunately, I don't understand what this is supposed to do and where in the code to put it :/
- show your logs
- add --pprof flag, when you see high mem usage - do
go tool pprof -inuse_space -png http://127.0.0.1:6060/debug/pprof/heap > mem.png
- show your logs
There's nothing special in the logs. Here's the full log:
[INFO] [09-22|09:23:46.305] Build info git_branch=HEAD git_tag=v2021.10.03-1902-gfb0f811f1-dirty git_commit=fb0f811f19c41d2f005fc6173e4af1e6f01f6319
[INFO] [09-22|09:23:46.305] Starting Erigon on Ethereum mainnet...
[INFO] [09-22|09:23:46.315] Maximum peer count ETH=100 total=100
[INFO] [09-22|09:23:46.315] starting HTTP APIs APIs=eth,erigon,engine,web3,net,debug,trace,txpool
[INFO] [09-22|09:23:46.315] torrent verbosity level=WRN
[INFO] [09-22|09:23:46.570] [torrent] Public IP ip=[...]
[INFO] [09-22|09:23:46.570] Set global gas cap cap=50000000
[INFO] [09-22|09:23:46.611] Opening Database label=chaindata path=/home/erigon/.local/share/erigon/chaindata
[INFO] [09-22|09:23:46.647] Initialised chain configuration config="{ChainID: 1, Homestead: 1150000, DAO: 1920000, DAO Support: true, Tangerine Whistle: 2463000, Spurious Dragon: 2675000, Byzantium: 4370000, Constan
00, Petersburg: 7280000, Istanbul: 9069000, Muir Glacier: 9200000, Berlin: 12244000, London: 12965000, Arrow Glacier: 13773000, Gray Glacier: 15050000, Terminal Total Difficulty: 58750000000000000000000, Merge Netsplit: <nil
nil>, Cancun: <nil>, Engine: ethash}" genesis=0xd4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3
[INFO] [09-22|09:23:46.647] Effective prune_flags= snapshot_flags="--snapshots=true" history.v2=false
[INFO] [09-22|09:23:49.798] Initialising Ethereum protocol network=1
[INFO] [09-22|09:23:49.798] Disk storage enabled for ethash DAGs dir=/home/erigon/.local/share/erigon/ethash-dags count=2
[INFO] [09-22|09:23:50.547] Starting private RPC server on=[...]
[INFO] [09-22|09:23:50.547] new subscription to logs established
[INFO] [09-22|09:23:50.547] rpc filters: subscribing to Erigon events
[INFO] [09-22|09:23:50.547] New txs subscriber joined
[INFO] [09-22|09:23:50.547] new subscription to newHeaders established
[INFO] [09-22|09:23:50.547] Reading JWT secret path=/home/erigon/.local/share/erigon/jwt.hex
[INFO] [09-22|09:23:50.547] HTTP endpoint opened for Engine API url=[...] ws=true ws.compression=true
[INFO] [09-22|09:23:50.548] HTTP endpoint opened url=[...] ws=false ws.compression=true grpc=false
[INFO] [09-22|09:23:50.562] [1/16 Snapshots] Fetching torrent files metadata
[INFO] [09-22|09:23:50.581] Started P2P networking version=66 self=[...] name=erigon/v2022.09.3-dev-fb0f811f/linux-amd64/go1.19.1
[INFO] [09-22|09:23:53.595] [Snapshots] Stat blocks=15000k indices=15000k alloc=2.5GB sys=2.6GB
[INFO] [09-22|09:23:53.596] [7/16 Execution] Blocks execution from=13081565 to=15537393
[INFO] [09-22|09:23:54.220] [txpool] Started
[INFO] [09-22|09:24:13.613] [7/16 Execution] Executed blocks number=13081942 blk/s=18.8 tx/s=3838.2 Mgas/s=293.8 gasState=0.01 batch=10.9MB alloc=2.4GB sys=4.4GB
[INFO] [09-22|09:24:33.642] [7/16 Execution] Executed blocks number=13082393 blk/s=22.5 tx/s=4573.9 Mgas/s=350.9 gasState=0.02 batch=25.3MB alloc=3.2GB sys=4.4GB
[INFO] [09-22|09:24:50.548] [txpool] stat block=15537393 pending=10000 baseFee=0 queued=1335 alloc=3.6GB sys=4.5GB
[INFO] [09-22|09:24:53.654] [7/16 Execution] Executed blocks number=13082877 blk/s=24.2 tx/s=4905.9 Mgas/s=373.5 gasState=0.04 batch=39.7MB alloc=4.0GB sys=4.5GB
[INFO] [09-22|09:25:13.617] [7/16 Execution] Executed blocks number=13083396 blk/s=26.0 tx/s=4875.8 Mgas/s=403.1 gasState=0.05 batch=55.4MB alloc=2.9GB sys=4.6GB
[INFO] [09-22|09:25:33.599] [7/16 Execution] Executed blocks number=13083908 blk/s=25.6 tx/s=4717.7 Mgas/s=398.3 gasState=0.07 batch=70.6MB alloc=3.9GB sys=4.6GB
[INFO] [09-22|09:25:49.799] [p2p] GoodPeers eth66=9
[INFO] [09-22|09:25:50.549] [txpool] stat block=15537393 pending=10000 baseFee=0 queued=2710 alloc=2.4GB sys=4.7GB
[INFO] [09-22|09:25:53.623] [7/16 Execution] Executed blocks number=13084438 blk/s=26.5 tx/s=4602.9 Mgas/s=408.0 gasState=0.08 batch=85.4MB alloc=2.9GB sys=4.7GB
[INFO] [09-22|09:26:13.598] [7/16 Execution] Executed blocks number=13085057 blk/s=31.0 tx/s=5410.8 Mgas/s=478.5 gasState=0.10 batch=104.0MB alloc=4.4GB sys=4.8GB
[INFO] [09-22|09:26:33.605] [7/16 Execution] Executed blocks number=13085677 blk/s=31.0 tx/s=6393.9 Mgas/s=481.0 gasState=0.12 batch=122.7MB alloc=4.0GB sys=4.9GB
[INFO] [09-22|09:26:50.549] [txpool] stat block=15537393 pending=10000 baseFee=0 queued=3883 alloc=2.8GB sys=4.9GB
[INFO] [09-22|09:26:53.613] [7/16 Execution] Executed blocks number=13086301 blk/s=31.2 tx/s=5814.5 Mgas/s=480.4 gasState=0.13 batch=142.0MB alloc=3.4GB sys=4.9GB
[INFO] [09-22|09:27:13.602] [7/16 Execution] Executed blocks number=13086913 blk/s=30.6 tx/s=6224.8 Mgas/s=476.2 gasState=0.15 batch=161.5MB alloc=2.6GB sys=5.0GB
[INFO] [09-22|09:27:33.645] [7/16 Execution] Executed blocks number=13087592 blk/s=33.9 tx/s=7034.0 Mgas/s=518.7 gasState=0.17 batch=183.8MB alloc=4.6GB sys=5.1GB
[INFO] [09-22|09:27:49.799] [p2p] GoodPeers eth66=11
[INFO] [09-22|09:27:50.551] [txpool] stat block=15537393 pending=10000 baseFee=0 queued=5831 alloc=3.2GB sys=5.2GB
[INFO] [09-22|09:27:53.627] [7/16 Execution] Executed blocks number=13088276 blk/s=34.2 tx/s=6771.8 Mgas/s=534.1 gasState=0.19 batch=206.1MB alloc=3.9GB sys=5.2GB
[INFO] [09-22|09:28:13.644] [7/16 Execution] Executed blocks number=13088953 blk/s=33.8 tx/s=6580.3 Mgas/s=523.4 gasState=0.21 batch=229.3MB alloc=3.3GB sys=5.2GB
[INFO] [09-22|09:28:33.623] [7/16 Execution] Executed blocks number=13089668 blk/s=35.8 tx/s=6767.1 Mgas/s=554.2 gasState=0.23 batch=251.1MB alloc=2.9GB sys=5.3GB
[INFO] [09-22|09:28:50.547] [txpool] stat block=15537393 pending=10000 baseFee=0 queued=8280 alloc=3.0GB sys=5.3GB
[INFO] [09-22|09:28:53.625] [7/16 Execution] Executed blocks number=13090179 blk/s=25.5 tx/s=4340.5 Mgas/s=392.1 gasState=0.24 batch=264.4MB alloc=3.4GB sys=5.3GB
[INFO] [09-22|09:29:13.616] [7/16 Execution] Executed blocks number=13090682 blk/s=25.2 tx/s=4514.6 Mgas/s=389.8 gasState=0.26 batch=279.8MB alloc=4.2GB sys=5.3GB
[INFO] [09-22|09:29:33.624] [7/16 Execution] Executed blocks number=13091173 blk/s=24.5 tx/s=4380.7 Mgas/s=376.7 gasState=0.27 batch=294.9MB alloc=4.8GB sys=5.3GB
[INFO] [09-22|09:29:49.799] [p2p] GoodPeers eth66=15
[INFO] [09-22|09:29:50.550] [txpool] stat block=15537393 pending=10000 baseFee=0 queued=10609 alloc=4.4GB sys=5.3GB
[INFO] [09-22|09:29:53.686] [7/16 Execution] Executed blocks number=13091595 blk/s=21.0 tx/s=3754.1 Mgas/s=324.7 gasState=0.28 batch=308.0MB alloc=4.9GB sys=5.3GB
[INFO] [09-22|09:30:13.631] [7/16 Execution] Executed blocks number=13092089 blk/s=24.8 tx/s=4787.4 Mgas/s=385.5 gasState=0.30 batch=322.6MB alloc=3.0GB sys=5.4GB
[INFO] [09-22|09:30:33.655] [7/16 Execution] Executed blocks number=13092555 blk/s=23.3 tx/s=4164.0 Mgas/s=357.0 gasState=0.31 batch=336.4MB alloc=3.4GB sys=5.4GB
[INFO] [09-22|09:30:50.548] [txpool] stat block=15537393 pending=10000 baseFee=0 queued=14204 alloc=3.5GB sys=5.4GB
[INFO] [09-22|09:30:53.605] [7/16 Execution] Executed blocks number=13093029 blk/s=23.8 tx/s=4280.9 Mgas/s=366.0 gasState=0.32 batch=351.2MB alloc=3.9GB sys=5.4GB
erigon exited with code 137
Here's the relevant part of the kernel log:
erigon invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
CPU: 1 PID: 2138533 Comm: erigon Not tainted 5.10.0-18-amd64 #1 Debian 5.10.140-1
Hardware name: [...]
Call Trace:
dump_stack+[...]
dump_header+[...]
oom_kill_process.cold+[...]
out_of_memory+[...]
mem_cgroup_out_of_memory+[...]
try_charge+[...]
? __alloc_pages_nodemask+[...]
mem_cgroup_charge+[...]
handle_mm_fault+[...]
do_user_addr_fault+[...]
exc_page_fault+[...]
? asm_exc_page_fault+[...]
asm_exc_page_fault+[...]
[...]
memory: usage 7340032kB, limit 7340032kB, failcnt 557044
[...]
kernel: Memory cgroup out of memory: Killed process 2138343 (erigon) total-vm:15531473176kB, anon-rss:6987456kB, file-rss:4186596kB, shmem-rss:0kB, UID:1000 pgtables:291232kB oom_score_adj:0
add --pprof flag, when you see high mem usage - do go tool pprof -inuse_space -png http://127.0.0.1:6060/debug/pprof/heap > mem.png
I can try this but the memory usage isn't unusually high so not sure what it will show.
To reiterate: I did restrict the container's memory to 7GB for this test, otherwise I would have to wait potentially days to be able to reproduce the OOM kill on my full 16GB of RAM.
Reduce —batchSize
Reduce —batchSize
Thank you very much. Although this does not appear to provide strict guarantees as to max memory usage, it does help a lot in practice. I set --batchSize=32M (instead of the default of 256M Ggas) and Erigon is now consuming less than 9GB at all times whereas with the default setting, it could regularly consume up to 12GB (executing blocks near 13,500,000).
An option to finalise the batch at a certain memory threshold (as opposed to gas threshold) might be even better, but --batchSize will probably satisfy my needs :)
I seem to be getting an average of around 19.8 blocks/s with --batchSize=256M and 19.5 blocks/s with --batchSize=32M (and of course, I realise this tiny difference between the two speeds is not statistically meaningful). I calculated those speeds myself over a 1 to 3 hour period, based on timestamps and number=
in the logs; this isn't the blk/s=
log output. So I'm wondering if there really is that great of an advantage to using a larger batch size vs 32M (which results in a batch being written every few minutes). If you don't mind me asking, was there any particular reason for choosing 256M as the default?
- It depends on your disk speed
- It also may depend on - current state size (progress). bigger state - more random reads.
- Yes, we need choose batchSize depend on available memory by default.
I see, thanks. I don't know if the current OOM kill situation is expected behaviour (I don't really know how it could be prevented in the general case, but you know better than me). So I won't close this issue but feel free to do so yourself of course.
btw, I'm seeing what seems like a similar situation when trying to run erigon on a RasberryPi with 8gb ram. I have swap setup so erigon never gets killed, but it goes so far into using swap that it crawls to an halt.
Params:
--chain mainnet --blockDownloaderWindow 32768 --prune=htc --prune.r.before=11052984 --authrpc.jwtsecret=/etc/ethereum/jwtsecret --datadir /home/ethereum/.erigon --private.api.addr= --authrpc.addr 0.0.0.0 --authrpc.vhosts=xxx
Memory over 24 hours looks like this:
Logs are generally un-interesting, it gets stuck in 7/16 executing I'm running with --pprof
right now and going to see if I can catch high memory usage.
I tried --batchSize 32M
but it didn't seem to make a difference, maybe I need a smaller value for 8gb ram?
Try smaller
Also you can try add to cmd/erigon/main.go
line compress.SetDecompressionTableCondensity(6)
and add --txpool.disable
Also you can try add to
cmd/erigon/main.go
linecompress.SetDecompressionTableCondensity(6)
and add --txpool.disable
hmm, I don't see compress.SetDecompressionTableCondensity(6)
in
https://github.com/ledgerwatch/erigon/blob/stable/cmd/erigon/main.go
or
https://github.com/ledgerwatch/erigon/blob/devel/cmd/erigon/main.go
there is none. you can add this line.
got it, does it matter where? I'd probably add it above these statements:
https://github.com/ledgerwatch/erigon/blob/c6a000997e39b4bb89a2b73403dbdf9fb16780c0/cmd/erigon/main.go#L56-L58
Try smaller
Tried --batchSize 24M
doesn't seem like it slowed down the time it takes to dip into swap at all. Idk if there is a way to confirm the setting got set properly. If I get time I can try the other approaches later
Here is the pprof output for --batchSize 24M @ 70% swap usage
Logs at that time were:
Sep 28 04:04:34 ethereumonarm-9a6eeddd5 erigon[256413]: [INFO] [09-28|04:04:34.082] [7/16 Execution] Executed blocks number=2436561 blk/s=15.7 tx/s=91.1 Mgas/s=4.0 gasState=0.41 batch=65.2MB alloc=9.3GB sys=9.8GB
Sep 28 04:04:53 ethereumonarm-9a6eeddd5 erigon[256413]: [INFO] [09-28|04:04:53.910] [7/16 Execution] Executed blocks number=2437003 blk/s=22.4 tx/s=162.6 Mgas/s=6.1 gasState=0.42 batch=81.4MB alloc=10.0GB sys=10.6GB
Sep 28 04:05:11 ethereumonarm-9a6eeddd5 erigon[256413]: [INFO] [09-28|04:05:11.110] [p2p] GoodPeers eth66=84
Sep 28 04:05:11 ethereumonarm-9a6eeddd5 erigon[256413]: [INFO] [09-28|04:05:11.167] [txpool] stat block=14999999 pending=559 baseFee=0 queued=2827 alloc=10.8GB sys=11.4GB
Sep 28 04:05:13 ethereumonarm-9a6eeddd5 erigon[256413]: [INFO] [09-28|04:05:13.962] [7/16 Execution] Executed blocks number=2437451 blk/s=22.2 tx/s=177.2 Mgas/s=7.1 gasState=0.42 batch=101.4MB alloc=10.9GB sys=11.5GB
Sep 28 04:05:33 ethereumonarm-9a6eeddd5 erigon[256413]: [INFO] [09-28|04:05:33.937] [7/16 Execution] Executed blocks number=2437947 blk/s=24.8 tx/s=192.1 Mgas/s=7.8 gasState=0.42 batch=121.0MB alloc=11.9GB sys=12.6GB
Sep 28 04:05:53 ethereumonarm-9a6eeddd5 erigon[256413]: [INFO] [09-28|04:05:53.783] [7/16 Execution] Executed blocks number=2438139 blk/s=9.7 tx/s=96.5 Mgas/s=3.4 gasState=0.42 batch=126.9MB alloc=8.3GB sys=12.9GB
Sep 28 04:06:11 ethereumonarm-9a6eeddd5 erigon[256413]: [INFO] [09-28|04:06:11.135] [txpool] stat block=14999999 pending=559 baseFee=0 queued=2830 alloc=8.6GB sys=12.9GB
Sep 28 04:06:14 ethereumonarm-9a6eeddd5 erigon[256413]: [INFO] [09-28|04:06:14.638] [7/16 Execution] Executed blocks number=2438274 blk/s=6.5 tx/s=61.2 Mgas/s=2.2 gasState=0.42 batch=132.3MB alloc=8.6GB sys=12.9GB
Sep 28 04:06:33 ethereumonarm-9a6eeddd5 erigon[256413]: [INFO] [09-28|04:06:33.910] [7/16 Execution] Executed blocks number=2438433 blk/s=8.3 tx/s=70.0 Mgas/s=2.2 gasState=0.43 batch=136.2MB alloc=9.0GB sys=12.9GB
- one red square on picture can be improved by
compress.SetDecompressionTableCondensity(6)
- another it's --batchSize, try reduce it more. from this part of logs I don't see value of
batch=136.2MB
before commit
and add --txpool.disable
Just for my understanding, the --txpool.disable will free up some memory during the execution phase, but needs to be turned back on once the node is sycned? Is this correct?
compress.SetDecompressionTableCondensity(6)
is working much better now (using alpha
branch, devel
seems to be broken)
with --batchSize 16M I'm seeing batch hit 64MB before committing memory is staying much lower, I'll increase batchSize and see what I get.
Sep 28 15:55:01 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:55:01.406] [Snapshots] Stat blocks=15000k indices=15000k alloc=602.8MB sys=910.1MB
Sep 28 15:55:06 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:55:06.678] [7/16 Execution] Blocks execution from=4235123 to=14999999
Sep 28 15:55:26 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:55:26.719] [7/16 Execution] Executed blocks number=4235921 blk/s=39.8 tx/s=2765.6 Mgas/s=138.1 gasState=0.08 batch=4.9MB alloc=846.4MB sys=1.3GB
Sep 28 15:55:46 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:55:46.723] [7/16 Execution] Executed blocks number=4236496 blk/s=28.7 tx/s=3308.6 Mgas/s=126.7 gasState=0.15 batch=9.7MB alloc=1.1GB sys=1.3GB
Sep 28 15:55:53 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:55:53.428] [txpool] stat block=14999999 pending=1209 baseFee=0 queued=6948 alloc=783.2MB sys=1.3GB
Sep 28 15:56:06 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:56:06.683] [7/16 Execution] Executed blocks number=4237082 blk/s=29.4 tx/s=3577.2 Mgas/s=139.8 gasState=0.24 batch=14.7MB alloc=788.5MB sys=1.3GB
Sep 28 15:56:26 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:56:26.695] [7/16 Execution] Executed blocks number=4237596 blk/s=25.7 tx/s=4156.7 Mgas/s=155.9 gasState=0.33 batch=20.4MB alloc=1.1GB sys=1.3GB
Sep 28 15:56:46 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:56:46.722] [7/16 Execution] Executed blocks number=4238137 blk/s=27.0 tx/s=4094.5 Mgas/s=165.7 gasState=0.42 batch=26.6MB alloc=903.2MB sys=1.3GB
Sep 28 15:56:53 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:56:53.412] [p2p] GoodPeers eth66=5
Sep 28 15:56:53 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:56:53.433] [txpool] stat block=14999999 pending=1209 baseFee=0 queued=6981 alloc=1.2GB sys=1.3GB
Sep 28 15:57:06 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:57:06.707] [7/16 Execution] Executed blocks number=4238762 blk/s=31.3 tx/s=3947.4 Mgas/s=195.8 gasState=0.54 batch=35.0MB alloc=700.5MB sys=1.3GB
Sep 28 15:57:26 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:57:26.682] [7/16 Execution] Executed blocks number=4239534 blk/s=38.6 tx/s=4206.2 Mgas/s=193.2 gasState=0.65 batch=42.7MB alloc=1.2GB sys=1.4GB
Sep 28 15:57:46 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:57:46.680] [7/16 Execution] Executed blocks number=4240216 blk/s=34.1 tx/s=5248.3 Mgas/s=190.9 gasState=0.76 batch=49.9MB alloc=1.0GB sys=1.5GB
Sep 28 15:57:53 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:57:53.426] [txpool] stat block=14999999 pending=1206 baseFee=0 queued=7097 alloc=751.1MB sys=1.5GB
Sep 28 15:58:06 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:58:06.685] [7/16 Execution] Executed blocks number=4240988 blk/s=38.6 tx/s=4716.8 Mgas/s=221.3 gasState=0.89 batch=57.7MB alloc=849.4MB sys=1.5GB
Sep 28 15:58:22 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:58:22.152] Committed State gas reached=34366195645 gasTarget=34359738368
Sep 28 15:58:41 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:58:41.735] [7/16 Execution] Executed blocks number=4241609 blk/s=17.7 tx/s=2304.0 Mgas/s=109.1 gasState=0.00 batch=0B alloc=980.8MB sys=1.8GB
Sep 28 15:58:46 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:58:46.686] [7/16 Execution] Executed blocks number=4241788 blk/s=36.2 tx/s=4387.8 Mgas/s=208.2 gasState=0.03 batch=1.5MB alloc=1.2GB sys=1.8GB
Sep 28 15:58:53 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:58:53.413] [p2p] GoodPeers eth66=12
Sep 28 15:58:53 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:58:53.430] [txpool] stat block=14999999 pending=1206 baseFee=0 queued=7103 alloc=730.2MB sys=1.8GB
Sep 28 15:59:06 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:59:06.699] [7/16 Execution] Executed blocks number=4242663 blk/s=43.7 tx/s=4682.0 Mgas/s=267.7 gasState=0.19 batch=12.8MB alloc=953.5MB sys=1.8GB
Sep 28 15:59:26 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:59:26.681] [7/16 Execution] Executed blocks number=4243384 blk/s=36.1 tx/s=5239.9 Mgas/s=214.8 gasState=0.31 batch=20.2MB alloc=957.2MB sys=1.8GB
Sep 28 15:59:46 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:59:46.696] [7/16 Execution] Executed blocks number=4244113 blk/s=36.4 tx/s=5765.0 Mgas/s=233.3 gasState=0.45 batch=28.2MB alloc=961.1MB sys=1.8GB
Sep 28 15:59:54 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|15:59:54.572] [txpool] stat block=14999999 pending=1206 baseFee=0 queued=7117 alloc=806.8MB sys=1.8GB
Sep 28 16:00:06 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|16:00:06.687] [7/16 Execution] Executed blocks number=4244836 blk/s=36.2 tx/s=5302.1 Mgas/s=231.6 gasState=0.58 batch=36.3MB alloc=897.6MB sys=1.8GB
Sep 28 16:00:26 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|16:00:26.688] [7/16 Execution] Executed blocks number=4245590 blk/s=37.7 tx/s=4829.3 Mgas/s=233.6 gasState=0.72 batch=45.7MB alloc=855.0MB sys=1.8GB
Sep 28 16:00:46 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|16:00:46.697] [7/16 Execution] Executed blocks number=4246391 blk/s=40.0 tx/s=5129.1 Mgas/s=247.2 gasState=0.86 batch=54.6MB alloc=1.3GB sys=1.8GB
Sep 28 16:00:53 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|16:00:53.412] [p2p] GoodPeers eth66=19
Sep 28 16:00:53 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|16:00:53.435] [txpool] stat block=14999999 pending=1206 baseFee=0 queued=7188 alloc=1.1GB sys=1.8GB
Sep 28 16:01:06 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|16:01:06.698] [7/16 Execution] Executed blocks number=4247112 blk/s=36.0 tx/s=4882.3 Mgas/s=225.2 gasState=0.99 batch=63.0MB alloc=1.2GB sys=1.8GB
Sep 28 16:01:07 ethereumonarm-9a6eeddd5 erigon[283197]: [INFO] [09-28|16:01:07.826] Committed State gas reached=34359911584 gasTarget=34359738368
Hmm, once I increased to --batchSize 256M
and it started writing out to the DB I got this error:
Staged Sync err="runtime error: invalid memory address or nil pointer dereference, snapshot: 4000000-4500000, trace: [block_reader.go:609 panic.go:884 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:524 block_reader.go:634 block_reader.go:467 block_snapshots.go:319 block_snapshots.go:745 block_reader.go:466 stage_execute.go:462 default_stages.go:101 sync.go:352 sync.go:255 stageloop.go:169 stageloop.go:90 asm_arm64.s:1172], trace: [stageloop.go:135 panic.go:890 block_reader.go:609 panic.go:884 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:524 block_reader.go:634 block_reader.go:467 block_snapshots.go:319 block_snapshots.go:745 block_reader.go:466 stage_execute.go:462 default_stages.go:101 sync.go:352 sync.go:255 stageloop.go:169 stageloop.go:90 asm_arm64.s:1172]
Logs:
Sep 28 16:43:37 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:43:37.981] [7/16 Execution] Executed blocks number=4349182 blk/s=34.9 tx/s=3439.3 Mgas/s=195.5 gasState=0.99 batch=1.1GB alloc=2.4GB sys=5.1GB
Sep 28 16:43:57 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:43:57.982] [7/16 Execution] Executed blocks number=4350048 blk/s=43.3 tx/s=5085.9 Mgas/s=253.4 gasState=1.00 batch=1.1GB alloc=3.7GB sys=5.1GB
Sep 28 16:44:09 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:44:09.670] Committed State gas reached=549756019262 gasTarget=549755813888
Sep 28 16:44:29 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:44:29.482] [p2p] GoodPeers eth66=56
Sep 28 16:44:29 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:44:29.510] [txpool] stat block=14999999 pending=3375 baseFee=0 queued=9351 alloc=2.4GB sys=6.5GB
Sep 28 16:44:32 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:44:32.749] Flushed buffer file name=/home/ethereum/.erigon/etl-temp/erigon-sortable-buf-3214123939 alloc=2.5GB sys=6.5GB
Sep 28 16:44:50 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:44:50.916] Flushed buffer file name=/home/ethereum/.erigon/etl-temp/erigon-sortable-buf-450471080 alloc=2.8GB sys=6.5GB
Sep 28 16:44:50 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:44:50.917] Write to db progress=6.0M/19.0M current table=PlainState
Sep 28 16:44:59 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:44:59.828] Flushed buffer file name=/home/ethereum/.erigon/etl-temp/erigon-sortable-buf-741370933 alloc=2.9GB sys=6.5GB
Sep 28 16:45:29 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:45:29.498] [txpool] stat block=14999999 pending=3375 baseFee=0 queued=9369 alloc=3.5GB sys=6.5GB
Sep 28 16:45:29 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:45:29.830] [] ETL [2/2] Loading into=PlainState current key=6090a6e4... alloc=3.5GB sys=6.5GB
Sep 28 16:45:59 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:45:59.830] [] ETL [2/2] Loading into=PlainState current key=8d12a197... alloc=4.2GB sys=6.5GB
Sep 28 16:46:29 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:46:29.482] [p2p] GoodPeers eth66=59
Sep 28 16:46:29 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:46:29.504] [txpool] stat block=14999999 pending=3375 baseFee=0 queued=9578 alloc=2.4GB sys=6.5GB
Sep 28 16:46:29 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:46:29.831] [] ETL [2/2] Loading into=PlainState current key=ca8a9168... alloc=2.5GB sys=6.5GB
Sep 28 16:46:59 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:46:59.831] [] ETL [2/2] Loading into=PlainState current key=f598e316... alloc=3.0GB sys=6.5GB
Sep 28 16:47:04 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:47:04.174] [] etl: temp files removed total size=432.3MB
Sep 28 16:47:04 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:47:04.176] Write to db progress=7.6M/19.0M current table=Code
Sep 28 16:47:21 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:47:21.941] [7/16 Execution] Executed blocks number=4350455 blk/s=2.0 tx/s=204.4 Mgas/s=11.6 gasState=0.00 batch=0B alloc=3.6GB sys=6.5GB
Sep 28 16:47:29 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:47:29.499] [txpool] stat block=14999999 pending=3373 baseFee=0 queued=9939 alloc=3.9GB sys=6.5GB
Sep 28 16:47:37 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:47:37.971] [7/16 Execution] Executed blocks number=4350799 blk/s=21.5 tx/s=2183.8 Mgas/s=121.8 gasState=0.00 batch=4.3MB alloc=4.2GB sys=6.5GB
Sep 28 16:47:58 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:47:58.012] [7/16 Execution] Executed blocks number=4351239 blk/s=22.0 tx/s=2711.0 Mgas/s=125.9 gasState=0.01 batch=9.6MB alloc=1.4GB sys=6.5GB
Sep 28 16:48:11 ethereumonarm-9a6eeddd5 erigon[283293]: [EROR] [09-28|16:48:11.977] RPC method eth_getBlockByNumber crashed: runtime error: invalid memory address or nil pointer dereference, snapshot: 4000000-4500000, trace: [block_reader.go:609 panic.go:884 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:544 block_reader.go:634 block_reader.go:467 block_snapshots.go:319 block_snapshots.go:745 block_reader.go:466 eth_api.go:169 eth_api.go:149 eth_api.go:246 eth_block.go:359 eth_block.go:194 value.go:584 value.go:368 service.go:222 handler.go:494 handler.go:444 handler.go:392 handler.go:223 handler.go:316 asm_arm64.s:1172]
Sep 28 16:48:11 ethereumonarm-9a6eeddd5 erigon[283293]: [service.go:217 panic.go:884 block_reader.go:609 panic.go:884 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:544 block_reader.go:634 block_reader.go:467 block_snapshots.go:319 block_snapshots.go:745 block_reader.go:466 eth_api.go:169 eth_api.go:149 eth_api.go:246 eth_block.go:359 eth_block.go:194 value.go:584 value.go:368 service.go:222 handler.go:494 handler.go:444 handler.go:392 handler.go:223 handler.go:316 asm_arm64.s:1172]
Sep 28 16:48:11 ethereumonarm-9a6eeddd5 erigon[283293]: [EROR] [09-28|16:48:11.978] Staged Sync err="runtime error: invalid memory address or nil pointer dereference, snapshot: 4000000-4500000, trace: [block_reader.go:609 panic.go:884 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:524 block_reader.go:634 block_reader.go:467 block_snapshots.go:319 block_snapshots.go:745 block_reader.go:466 stage_execute.go:462 default_stages.go:101 sync.go:352 sync.go:255 stageloop.go:169 stageloop.go:90 asm_arm64.s:1172], trace: [stageloop.go:135 panic.go:890 block_reader.go:609 panic.go:884 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:524 block_reader.go:634 block_reader.go:467 block_snapshots.go:319 block_snapshots.go:745 block_reader.go:466 stage_execute.go:462 default_stages.go:101 sync.go:352 sync.go:255 stageloop.go:169 stageloop.go:90 asm_arm64.s:1172]"
Sep 28 16:48:11 ethereumonarm-9a6eeddd5 erigon[283293]: [WARN] [09-28|16:48:11.979] Served conn=192.168.7.233:38730 method=eth_getBlockByNumber reqid=1 t=8.794201ms err="method handler crashed"
Sep 28 16:48:12 ethereumonarm-9a6eeddd5 erigon[283293]: [INFO] [09-28|16:48:12.538] [7/16 Execution] Blocks execution from=4350455 to=14999999
Sep 28 16:48:12 ethereumonarm-9a6eeddd5 erigon[283293]: [EROR] [09-28|16:48:12.541] Staged Sync err="runtime error: invalid memory address or nil pointer dereference, snapshot: 4000000-4500000, trace: [block_reader.go:609 panic.go:884 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:524 block_reader.go:634 block_reader.go:467 block_snapshots.go:319 block_snapshots.go:745 block_reader.go:466 stage_execute.go:462 default_stages.go:101 sync.go:352 sync.go:255 stageloop.go:169 stageloop.go:90 asm_arm64.s:1172], trace: [stageloop.go:135 panic.go:890 block_reader.go:609 panic.go:884 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:524 block_reader.go:634 block_reader.go:467 block_snapshots.go:319 block_snapshots.go:745 block_reader.go:466 stage_execute.go:462 default_stages.go:101 sync.go:352 sync.go:255 stageloop.go:169 stageloop.go:90 asm_arm64.s:1172]"
Update: I see this with --batchSize 32M
as well. Happens right after state gets committed
more logs
Sep 28 17:03:41 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:03:41.191] [txpool] stat block=14999999 pending=4051 baseFee=0 queued=8453 alloc=1.4GB sys=2.3GB
Sep 28 17:03:59 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:03:59.411] [7/16 Execution] Executed blocks number=4371527 blk/s=39.2 tx/s=3612.9 Mgas/s=211.6 gasState=0.75 batch=98.2MB alloc=1.1GB sys=2.3GB
Sep 28 17:04:19 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:04:19.417] [7/16 Execution] Executed blocks number=4372520 blk/s=49.6 tx/s=4454.6 Mgas/s=235.2 gasState=0.81 batch=107.3MB alloc=1.5GB sys=2.3GB
Sep 28 17:04:39 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:04:39.419] [7/16 Execution] Executed blocks number=4373742 blk/s=61.1 tx/s=4636.8 Mgas/s=232.5 gasState=0.88 batch=115.3MB alloc=1.2GB sys=2.3GB
Sep 28 17:04:41 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:04:41.185] [txpool] stat block=14999999 pending=4051 baseFee=0 queued=8483 alloc=1.3GB sys=2.3GB
Sep 28 17:04:59 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:04:59.422] [7/16 Execution] Executed blocks number=4374853 blk/s=55.5 tx/s=4687.8 Mgas/s=226.1 gasState=0.95 batch=122.2MB alloc=943.8MB sys=2.3GB
Sep 28 17:05:15 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:05:15.155] Committed State gas reached=68723182638 gasTarget=68719476736
Sep 28 17:05:41 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:05:41.178] [p2p] GoodPeers eth66=26
Sep 28 17:05:41 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:05:41.258] [txpool] stat block=14999999 pending=4049 baseFee=0 queued=9536 alloc=1.4GB sys=2.4GB
Sep 28 17:05:46 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:05:46.906] [7/16 Execution] Executed blocks number=4375975 blk/s=23.6 tx/s=1551.5 Mgas/s=74.8 gasState=0.00 batch=0B alloc=1.5GB sys=2.4GB
Sep 28 17:05:59 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:05:59.411] [7/16 Execution] Executed blocks number=4376697 blk/s=57.7 tx/s=4807.7 Mgas/s=222.4 gasState=0.04 batch=4.4MB alloc=1.1GB sys=2.4GB
Sep 28 17:06:19 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:06:19.408] [7/16 Execution] Executed blocks number=4377697 blk/s=50.0 tx/s=4859.9 Mgas/s=221.2 gasState=0.10 batch=11.9MB alloc=1.1GB sys=2.4GB
Sep 28 17:06:39 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:06:39.422] [7/16 Execution] Executed blocks number=4378669 blk/s=48.6 tx/s=4728.4 Mgas/s=203.5 gasState=0.16 batch=19.4MB alloc=1.1GB sys=2.4GB
Sep 28 17:06:41 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:06:41.185] [txpool] stat block=14999999 pending=4049 baseFee=0 queued=9617 alloc=1.2GB sys=2.4GB
Sep 28 17:06:48 ethereumonarm-9a6eeddd5 erigon[284629]: [EROR] [09-28|17:06:48.590] Staged Sync err="runtime error: invalid memory address or nil pointer dereference, trace: [stageloop.go:135 panic.go:890 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:544 block_reader.go:510 block_reader.go:320 block_snapshots.go:275 block_snapshots.go:733 block_reader.go:319 stage_execute.go:148 evm.go:103 instructions.go:483 interpreter.go:326 evm.go:67 evm.go:236 instructions.go:723 interpreter.go:326 evm.go:67 evm.go:236 state_transition.go:418 state_transition.go:232 state_processor.go:48 state_processor.go:105 blockchain.go:274 stage_execute.go:172 stage_execute.go:477 default_stages.go:101 sync.go:352 sync.go:255 stageloop.go:169 stageloop.go:90 asm_arm64.s:1172]"
Sep 28 17:06:49 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:06:49.148] [7/16 Execution] Blocks execution from=4375975 to=14999999
Sep 28 17:06:49 ethereumonarm-9a6eeddd5 erigon[284629]: [EROR] [09-28|17:06:49.176] Staged Sync err="runtime error: invalid memory address or nil pointer dereference, trace: [stageloop.go:135 panic.go:890 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:524 block_reader.go:510 block_reader.go:432 block_snapshots.go:275 block_snapshots.go:733 block_reader.go:431 stage_execute.go:462 default_stages.go:101 sync.go:352 sync.go:255 stageloop.go:169 stageloop.go:90 asm_arm64.s:1172]"
Sep 28 17:06:49 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:06:49.677] [7/16 Execution] Blocks execution from=4375975 to=14999999
Sep 28 17:06:49 ethereumonarm-9a6eeddd5 erigon[284629]: [EROR] [09-28|17:06:49.706] Staged Sync err="runtime error: invalid memory address or nil pointer dereference, trace: [stageloop.go:135 panic.go:890 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:524 block_reader.go:510 block_reader.go:432 block_snapshots.go:275 block_snapshots.go:733 block_reader.go:431 stage_execute.go:462 default_stages.go:101 sync.go:352 sync.go:255 stageloop.go:169 stageloop.go:90 asm_arm64.s:1172]"
Sep 28 17:06:50 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:06:50.207] [7/16 Execution] Blocks execution from=4375975 to=14999999
Sep 28 17:06:50 ethereumonarm-9a6eeddd5 erigon[284629]: [EROR] [09-28|17:06:50.236] Staged Sync err="runtime error: invalid memory address or nil pointer dereference, trace: [stageloop.go:135 panic.go:890 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:524 block_reader.go:510 block_reader.go:432 block_snapshots.go:275 block_snapshots.go:733 block_reader.go:431 stage_execute.go:462 default_stages.go:101 sync.go:352 sync.go:255 stageloop.go:169 stageloop.go:90 asm_arm64.s:1172]"
Sep 28 17:06:50 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:06:50.737] [7/16 Execution] Blocks execution from=4375975 to=14999999
Sep 28 17:06:50 ethereumonarm-9a6eeddd5 erigon[284629]: [EROR] [09-28|17:06:50.765] Staged Sync err="runtime error: invalid memory address or nil pointer dereference, trace: [stageloop.go:135 panic.go:890 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:524 block_reader.go:510 block_reader.go:432 block_snapshots.go:275 block_snapshots.go:733 block_reader.go:431 stage_execute.go:462 default_stages.go:101 sync.go:352 sync.go:255 stageloop.go:169 stageloop.go:90 asm_arm64.s:1172]"
Sep 28 17:06:51 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:06:51.266] [7/16 Execution] Blocks execution from=4375975 to=14999999
Sep 28 17:06:51 ethereumonarm-9a6eeddd5 erigon[284629]: [EROR] [09-28|17:06:51.296] Staged Sync err="runtime error: invalid memory address or nil pointer dereference, trace: [stageloop.go:135 panic.go:890 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:524 block_reader.go:510 block_reader.go:432 block_snapshots.go:275 block_snapshots.go:733 block_reader.go:431 stage_execute.go:462 default_stages.go:101 sync.go:352 sync.go:255 stageloop.go:169 stageloop.go:90 asm_arm64.s:1172]"
Sep 28 17:06:51 ethereumonarm-9a6eeddd5 erigon[284629]: [INFO] [09-28|17:06:51.796] [7/16 Execution] Blocks execution from=4375975 to=14999999
Update:
perhaps this is part of the issue: https://github.com/ledgerwatch/erigon-lib/blob/ec49625cd9b5e230aed091fd1da4330cd9c60ba4/compress/decompress.go#L431-L432
cw := table.condensedTableSearch(code)
l = cw.len
table.condensedTableSearch(...)
can return nil, then the next line immediately dereferences it
try add --downloader.verify flag
try add --downloader.verify flag
adding --downloader.verify
did not resolve the problem, still throws an exception eventually. It doesn't look like it's a particular bad block, it always gets a bit of distance before it throws, usually after it saves the state.
Yep, it made it awhile this time before crashing, but eventually died
Sep 29 17:24:42 ethereumonarm-9a6eeddd5 erigon[315830]: [INFO] [09-29|17:24:42.091] [p2p] GoodPeers eth66=87
Sep 29 17:24:42 ethereumonarm-9a6eeddd5 erigon[315830]: [INFO] [09-29|17:24:42.135] [txpool] stat block=14999999 pending=9993 baseFee=0 queued=22537 alloc=1.6GB sys=4.0GB
Sep 29 17:24:59 ethereumonarm-9a6eeddd5 erigon[315830]: [INFO] [09-29|17:24:59.738] [] ETL [2/2] Loading into=PlainState current key=80fb784b... alloc=1.9GB sys=4.0GB
Sep 29 17:25:29 ethereumonarm-9a6eeddd5 erigon[315830]: [INFO] [09-29|17:25:29.739] [] ETL [2/2] Loading into=PlainState current key=8d12a197... alloc=1.5GB sys=4.0GB
Sep 29 17:25:42 ethereumonarm-9a6eeddd5 erigon[315830]: [INFO] [09-29|17:25:42.120] [txpool] stat block=14999999 pending=9993 baseFee=0 queued=22551 alloc=1.8GB sys=4.0GB
Sep 29 17:25:59 ethereumonarm-9a6eeddd5 erigon[315830]: [INFO] [09-29|17:25:59.907] [] ETL [2/2] Loading into=PlainState current key=b1690c08... alloc=2.2GB sys=4.0GB
Sep 29 17:26:29 ethereumonarm-9a6eeddd5 erigon[315830]: [INFO] [09-29|17:26:29.848] [] ETL [2/2] Loading into=PlainState current key=d1c4c294... alloc=2.9GB sys=4.0GB
Sep 29 17:26:42 ethereumonarm-9a6eeddd5 erigon[315830]: [INFO] [09-29|17:26:42.091] [p2p] GoodPeers eth66=88
Sep 29 17:26:42 ethereumonarm-9a6eeddd5 erigon[315830]: [INFO] [09-29|17:26:42.148] [txpool] stat block=14999999 pending=9993 baseFee=0 queued=22570 alloc=1.6GB sys=4.0GB
Sep 29 17:27:00 ethereumonarm-9a6eeddd5 erigon[315830]: [INFO] [09-29|17:27:00.056] [] ETL [2/2] Loading into=PlainState current key=f230b790... alloc=2.0GB sys=4.0GB
Sep 29 17:27:05 ethereumonarm-9a6eeddd5 erigon[315830]: [INFO] [09-29|17:27:05.098] Write to db progress=2.8M/4.9M current table=Code
Sep 29 17:27:14 ethereumonarm-9a6eeddd5 erigon[315830]: [INFO] [09-29|17:27:14.828] Write to db progress=3.0M/4.9M current table=IncarnationMap
Sep 29 17:27:18 ethereumonarm-9a6eeddd5 erigon[315830]: [INFO] [09-29|17:27:18.686] [7/16 Execution] Executed blocks number=4976559 blk/s=1.0 tx/s=159.5 Mgas/s=6.7 gasState=0.00 batch=0B alloc=1.6GB sys=4.0GB
Sep 29 17:27:32 ethereumonarm-9a6eeddd5 erigon[315830]: [INFO] [09-29|17:27:32.566] [7/16 Execution] Executed blocks number=4976667 blk/s=7.8 tx/s=1326.3 Mgas/s=53.9 gasState=0.01 batch=1.3MB alloc=1.2GB sys=4.0GB
Sep 29 17:27:42 ethereumonarm-9a6eeddd5 erigon[315830]: [INFO] [09-29|17:27:42.118] [txpool] stat block=14999999 pending=9993 baseFee=0 queued=22617 alloc=819.3MB sys=4.0GB
Sep 29 17:27:52 ethereumonarm-9a6eeddd5 erigon[315830]: [INFO] [09-29|17:27:52.614] [7/16 Execution] Executed blocks number=4976886 blk/s=10.9 tx/s=1908.8 Mgas/s=74.6 gasState=0.02 batch=4.1MB alloc=1.2GB sys=4.0GB
Sep 29 17:28:11 ethereumonarm-9a6eeddd5 erigon[315830]: [EROR] [09-29|17:28:11.981] RPC method eth_getBlockByNumber crashed: runtime error: invalid memory address or nil pointer dereference, snapshot: 4500000-5000000, trace: [block_reader.go:609 panic.go:884 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:524 block_reader.go:634 block_reader.go:467 block_snapshots.go:319 block_snapshots.go:745 block_reader.go:466 eth_api.go:169 eth_api.go:149 eth_api.go:246 eth_block.go:359 eth_block.go:194 value.go:584 value.go:368 service.go:222 handler.go:494 handler.go:444 handler.go:392 handler.go:223 handler.go:316 asm_arm64.s:1172]
Sep 29 17:28:11 ethereumonarm-9a6eeddd5 erigon[315830]: [service.go:217 panic.go:884 block_reader.go:609 panic.go:884 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:524 block_reader.go:634 block_reader.go:467 block_snapshots.go:319 block_snapshots.go:745 block_reader.go:466 eth_api.go:169 eth_api.go:149 eth_api.go:246 eth_block.go:359 eth_block.go:194 value.go:584 value.go:368 service.go:222 handler.go:494 handler.go:444 handler.go:392 handler.go:223 handler.go:316 asm_arm64.s:1172]
Sep 29 17:28:11 ethereumonarm-9a6eeddd5 erigon[315830]: [EROR] [09-29|17:28:11.981] Staged Sync err="runtime error: invalid memory address or nil pointer dereference, snapshot: 4500000-5000000, trace: [block_reader.go:609 panic.go:884 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:544 block_reader.go:634 block_reader.go:467 block_snapshots.go:319 block_snapshots.go:745 block_reader.go:466 stage_execute.go:462 default_stages.go:101 sync.go:352 sync.go:255 stageloop.go:169 stageloop.go:90 asm_arm64.s:1172], trace: [stageloop.go:135 panic.go:890 block_reader.go:609 panic.go:884 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:544 block_reader.go:634 block_reader.go:467 block_snapshots.go:319 block_snapshots.go:745 block_reader.go:466 stage_execute.go:462 default_stages.go:101 sync.go:352 sync.go:255 stageloop.go:169 stageloop.go:90 asm_arm64.s:1172]"
Sep 29 17:28:11 ethereumonarm-9a6eeddd5 erigon[315830]: [WARN] [09-29|17:28:11.984] Served conn=192.168.7.233:60178 method=eth_getBlockByNumber reqid=1 t=15.863296ms err="method handler crashed"
Sep 29 17:28:12 ethereumonarm-9a6eeddd5 erigon[315830]: [INFO] [09-29|17:28:12.484] [7/16 Execution] Blocks execution from=4976559 to=14999999
Sep 29 17:28:12 ethereumonarm-9a6eeddd5 erigon[315830]: [EROR] [09-29|17:28:12.487] Staged Sync err="runtime error: invalid memory address or nil pointer dereference, snapshot: 4500000-5000000, trace: [block_reader.go:609 panic.go:884 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:524 block_reader.go:634 block_reader.go:467 block_snapshots.go:319 block_snapshots.go:745 block_reader.go:466 stage_execute.go:462 default_stages.go:101 sync.go:352 sync.go:255 stageloop.go:169 stageloop.go:90 asm_arm64.s:1172], trace: [stageloop.go:135 panic.go:890 block_reader.go:609 panic.go:884 panic.go:260 signal_unix.go:835 decompress.go:432 decompress.go:524 block_reader.go:634 block_reader.go:467 block_snapshots.go:319 block_snapshots.go:745 block_reader.go:466 stage_execute.go:462 default_stages.go:101 sync.go:352 sync.go:255 stageloop.go:169 stageloop.go:90 asm_arm64.s:1172]"
Sep 29 17:28:12 ethereumonarm-9a6eeddd5 erigon[315830]: [INFO] [09-29|17:28:12.988] [7/16 Execution] Blocks execution from=4976559 to=14999999
Update here, I finally got through the 7/16 Execution phase, it took 3 weeks 😁 with --batchsize 24M
and --txpool.disable
.
It looks like there is another possible leak in the 13/16 LogIndex
phase. Overall memory consumption hit 12GB or so (on an 8GB RasPI 4B) I assume the heavy use of swap here will kill performance. I've restarted with --pprof
so I can capture a memory profile. In the logs below we can see the 10GB of sys usage 🙁
Logs:
Oct 23 18:03:13 ethereumonarm-9a6eeddd5 erigon[987195]: [INFO] [10-23|18:03:13.415] [13/16 LogIndex] ETL [2/2] Loading into=LogTopicIndex current key=00000000000000000000000003251ae9065a02d50230e02b137265157e101c2000da3d3a alloc=6.9GB sys=10.5GB
Oct 23 18:03:43 ethereumonarm-9a6eeddd5 erigon[987195]: [INFO] [10-23|18:03:43.340] [13/16 LogIndex] ETL [2/2] Loading into=LogTopicIndex current key=000000000000000000000000039ade827d952ba6d99d2ee75b77ee0b4d189bf9ffffffff alloc=4.4GB sys=10.5GB
Oct 23 18:04:13 ethereumonarm-9a6eeddd5 erigon[987195]: [INFO] [10-23|18:04:13.347] [13/16 LogIndex] ETL [2/2] Loading into=LogTopicIndex current key=00000000000000000000000003e14fc8b4e81b9297f071b0b4747959d6d0de23ffffffff alloc=5.0GB sys=10.5GB
Oct 23 18:04:44 ethereumonarm-9a6eeddd5 erigon[987195]: [INFO] [10-23|18:04:44.481] [13/16 LogIndex] ETL [2/2] Loading into=LogTopicIndex current key=00000000000000000000000003f7724180aa6b939894b5ca4314783b0b36b32900cd3080 alloc=6.5GB sys=10.5GB
Oct 23 18:05:08 ethereumonarm-9a6eeddd5 erigon[987195]: [INFO] [10-23|18:05:08.662] [p2p] GoodPeers eth66=90
Oct 23 18:05:13 ethereumonarm-9a6eeddd5 erigon[987195]: [INFO] [10-23|18:05:13.628] [13/16 LogIndex] ETL [2/2] Loading into=LogTopicIndex current key=00000000000000000000000003f7724180aa6b939894b5ca4314783b0b36b32900ce54da alloc=6.6GB sys=10.5GB
Oct 23 18:05:43 ethereumonarm-9a6eeddd5 erigon[987195]: [INFO] [10-23|18:05:43.349] [13/16 LogIndex] ETL [2/2] Loading into=LogTopicIndex current key=000000000000000000000000048d2c3c467e33a57c420dc92cda7012459270bfffffffff alloc=5.6GB sys=10.5GB
Here's the pprof output for LogIndex phase:
CommandLine:
--chain mainnet --batchSize 24M --txpool.disable --pprof --blockDownloaderWindow 32768 --prune=htc --prune.r.before=11052984 --authrpc.jwtsecret=xxx --datadir /home/ethereum/.erigon --private.api.addr= --authrpc.addr 0.0.0.0 --authrpc.vhosts=xxx
Logs:
Oct 23 23:10:55 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:10:55.773] [13/16 LogIndex] Progress number=13333575 alloc=3.4GB sys=7.5GB
Oct 23 23:11:25 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:11:25.775] [13/16 LogIndex] Progress number=13337605 alloc=4.1GB sys=7.5GB
Oct 23 23:11:55 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:11:55.771] [13/16 LogIndex] Progress number=13341522 alloc=4.7GB sys=7.5GB
Oct 23 23:12:25 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:12:25.771] [13/16 LogIndex] Progress number=13345609 alloc=5.4GB sys=7.5GB
Oct 23 23:12:33 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:12:33.065] [p2p] GoodPeers eth66=100
Oct 23 23:12:58 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:12:58.266] [13/16 LogIndex] Progress number=13348532 alloc=3.3GB sys=7.5GB
Oct 23 23:13:36 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:13:36.247] Flushed buffer file name=/home/ethereum/.erigon/etl-temp/erigon-sortable-buf-406585851 alloc=4.0GB sys=7.5GB
Oct 23 23:13:37 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:13:37.834] [13/16 LogIndex] Progress number=13350923 alloc=4.0GB sys=7.5GB
Oct 23 23:13:56 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:13:56.565] [13/16 LogIndex] Progress number=13353535 alloc=4.5GB sys=7.5GB
Oct 23 23:14:25 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:14:25.772] [13/16 LogIndex] Progress number=13358084 alloc=5.3GB sys=7.5GB
Oct 23 23:14:33 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:14:33.065] [p2p] GoodPeers eth66=100
Oct 23 23:14:56 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:14:56.903] [13/16 LogIndex] Progress number=13362308 alloc=6.0GB sys=7.5GB
Oct 23 23:15:25 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:15:25.772] [13/16 LogIndex] Progress number=13365721 alloc=3.6GB sys=7.5GB
Oct 23 23:15:55 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:15:55.772] [13/16 LogIndex] Progress number=13369719 alloc=4.3GB sys=7.5GB
Oct 23 23:16:27 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:16:27.488] [13/16 LogIndex] Progress number=13373575 alloc=5.0GB sys=7.5GB
Oct 23 23:16:33 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:16:33.065] [p2p] GoodPeers eth66=100
Oct 23 23:16:57 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:16:57.600] [13/16 LogIndex] Progress number=13377713 alloc=5.7GB sys=7.5GB
Oct 23 23:17:25 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:17:25.771] [13/16 LogIndex] Progress number=13380617 alloc=3.5GB sys=7.5GB
Oct 23 23:17:55 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:17:55.773] [13/16 LogIndex] Progress number=13383691 alloc=4.3GB sys=7.5GB
Oct 23 23:18:25 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:18:25.774] [13/16 LogIndex] Progress number=13387795 alloc=5.1GB sys=7.5GB
Oct 23 23:18:33 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:18:33.064] [p2p] GoodPeers eth66=100
Oct 23 23:18:55 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:18:55.772] [13/16 LogIndex] Progress number=13391918 alloc=5.9GB sys=7.5GB
Oct 23 23:19:25 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:19:25.773] [13/16 LogIndex] Progress number=13395019 alloc=6.6GB sys=7.5GB
Oct 23 23:19:55 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:19:55.773] [13/16 LogIndex] Progress number=13398816 alloc=4.0GB sys=7.5GB
Oct 23 23:20:25 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:20:25.772] [13/16 LogIndex] Progress number=13402443 alloc=4.8GB sys=7.5GB
Oct 23 23:20:33 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:20:33.064] [p2p] GoodPeers eth66=100
Oct 23 23:20:55 ethereumonarm-9a6eeddd5 erigon[1114328]: [INFO] [10-23|23:20:55.772] [13/16 LogIndex] Progress number=13406000 alloc=5.5GB sys=7.5GB
pprof:
Don't see anything bad in log-index phase.
- left one: is known issue - we will fix it in future
- right one: is just work of the stage - 600Mb is expected (we can reduce if need).
This issue is stale because it has been open for 40 days with no activity. Remove stale label or comment, or this will be closed in 7 days.
This issue was closed because it has been stalled for 7 days with no activity.