full archive caplin restart loses too much processed state
System information
Erigon version: git main
OS & Version: Linux
Commit hash: d780e319cd
Erigon Command (with flags/config): ../../bin/erigon/erigon --datadir ./data --prune.mode archive --http.api=eth,erigon,web3,net,debug,trace,txpool --caplin.archive --beacon.api beacon,builder,config,debug,events,node,validator,lighthouse --diagnostics.disabled --nat none --private.api.addr --beacon.api.port 3500 --http.port 8545 --ws --ws.port 8546 --authrpc.port 8551 --torrent.port 20202 --port 30303 --p2p.protocol 68 --p2p.allowed-ports 30303 --caplin.discovery.tcpport 40404 --caplin.discovery.port 40404 --sentinel.port 50505 --rpc.batch.limit 50000 --db.read.concurrency 8 --rpc.returndata.limit 100000000000
Chain/Network: ethereum mainnet
Expected behaviour
When erigon is restarted, it should be back in normal operation in 1-2 minutes.
Actual behaviour
When erigon is run in full sync mode (with caplin archive also enabled) and is then restarted, caplin sync somehow jumps back by around 20K slots. The historical download then has to be done again, and after that "State processing progress" has to reprocess the downloaded slots.
[INFO] [10-14|01:55:57.856] State processing progress slot=10164630 blk/sec=17.78
This goes on for 15-30 minutes (depending on luck).
Steps to reproduce the behaviour
Run a fully synced erigon node with caplin archive enabled, then restart it.
Discussion
If there are plans to drastically increase the speed of "State processing progress" (by at least 10x), then this becomes a non-issue.
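As a rough sanity check on the 10x figure, the sketch below estimates the reprocessing time for a backlog of about 20K slots at the logged rate of 17.78 blk/sec, and again at ten times that rate. It assumes roughly one block per slot and covers only the state processing phase, not the historical download, so the real wait is longer. The function name catchUpTime is made up for this illustration and is not part of erigon.

```go
package main

import (
	"fmt"
	"time"
)

// catchUpTime estimates how long it takes to reprocess a backlog of slots
// at a constant processing rate given in slots per second.
// Illustrative helper only; not an erigon function.
func catchUpTime(backlogSlots int, slotsPerSec float64) time.Duration {
	seconds := float64(backlogSlots) / slotsPerSec
	return time.Duration(seconds * float64(time.Second))
}

func main() {
	const backlog = 20_000 // approximate rollback after restart, from the report
	const rate = 17.78     // blk/sec, from the "State processing progress" log line

	// Assumption: one block per slot, constant rate, download time ignored.
	fmt.Println("at logged rate:", catchUpTime(backlog, rate).Round(time.Second))
	fmt.Println("at 10x rate:   ", catchUpTime(backlog, 10*rate).Round(time.Second))
}
```

At the logged rate the processing phase alone comes out to a little under 19 minutes, which fits the observed 15-30 minutes. At 10x it drops to just under 2 minutes, in line with the expected 1-2 minute restart.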
If that is not feasible, can we somehow take more frequent snapshots, or save state on an exit signal, so that a restart costs less waiting time?
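To make the two suggestions concrete, here is a minimal generic sketch of the pattern: checkpoint every N slots, and write one final checkpoint when the process receives SIGINT or SIGTERM. This is not how caplin stores its state. The helpers saveCheckpoint, loadCheckpoint and process, the file name, and the interval of 32 slots are all hypothetical, and a real node would have to persist the full beacon state rather than just a slot number.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"path/filepath"
	"strconv"
	"strings"
	"syscall"
	"time"
)

// saveCheckpoint persists the last processed slot by writing a temp file
// and renaming it, so a crash mid-write does not corrupt the old checkpoint.
// Hypothetical helper: stands in for persisting real processed state.
func saveCheckpoint(path string, slot uint64) error {
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, []byte(strconv.FormatUint(slot, 10)), 0o644); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

// loadCheckpoint reads back the slot written by saveCheckpoint.
func loadCheckpoint(path string) (uint64, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return strconv.ParseUint(strings.TrimSpace(string(data)), 10, 64)
}

// process simulates slot processing: one slot per tick, a checkpoint every
// `every` slots, and a final checkpoint when ctx is cancelled.
// It returns the last processed slot.
func process(ctx context.Context, path string, start, every uint64, tick time.Duration) (uint64, error) {
	slot := start
	ticker := time.NewTicker(tick)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			// Exit signal received: save progress before shutting down.
			return slot, saveCheckpoint(path, slot)
		case <-ticker.C:
			slot++
			if slot%every == 0 {
				if err := saveCheckpoint(path, slot); err != nil {
					return slot, err
				}
			}
		}
	}
}

func main() {
	// Cancel the context on Ctrl-C or SIGTERM.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	path := filepath.Join(os.TempDir(), "checkpoint-demo.txt") // hypothetical location
	last, err := process(ctx, path, 10_164_630, 32, 100*time.Millisecond)
	if err != nil {
		fmt.Fprintln(os.Stderr, "checkpoint failed:", err)
		os.Exit(1)
	}
	fmt.Println("stopped at slot", last)
}
```

With this pattern, a clean restart resumes from the slot saved at shutdown, and an unclean one loses at most the checkpoint interval rather than around 20K slots. Whether something like this is practical for caplin depends on how expensive it is to persist the processed state, which this sketch does not model.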
Mmmmhhh - I cannot really help with this. I would just wait for alpha 6 here. Will keep this open