erigon
erigon copied to clipboard
DumpBlocks: DumpBodies: header missed in db: block_num
System information
Erigon version: erigon version 2.58.0-9731f716
OS & Version: Linux
Commit hash: 9731f716
Erigon Command (with flags/config):
--datadir /data/erigon
--authrpc.jwtsecret /data/prysm/jwt_secret
--private.api.addr 127.0.0.1:<grpcport>
--port <wireport>
--authrpc.port=<authrpcport>
--torrent.port <torrentport>
--prune=
--chain mainnet
--torrent.download.rate 200mb
--bodies.cache 4GB
--maxpeers 300
--db.size.limit=7T
Consensus Layer: prysm
Consensus Layer Command (with flags/config):
--datadir=/data/prysm
--mainnet
--rpc-host=0.0.0.0
--p2p-udp-port 30200
--p2p-tcp-port 30200
--grpc-gateway-port 7028
--execution-endpoint=http://localhost:7004
--jwt-secret=/data/prysm/jwt_secret
--accept-terms-of-use
--genesis-state=/data/prysm/genesis/genesis.ssz
--terminal-total-difficulty-override 58750000000000000000000"
Chain/Network: mainnet
Expected behaviour
no errors in log
Actual behaviour
DumpBlocks: DumpBodies: header missed in db: block_num=19000000
[INFO] [02-21|14:00:12.912] [NewPayload] Handling new payload height=19276488 hash=0x6bbbf5fb639db651234beaa2bad8336e619d01590a31149bd63b7520a9521f58
[INFO] [02-21|14:00:13.731] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload)
[INFO] [02-21|14:00:14.042] RPC Daemon notified of new headers from=19276487 to=19276488 hash=0x6bbbf5fb639db651234beaa2bad8336e619d01590a31149bd63b7520a9521f58 header sending=22.054µs log sending=337ns
[INFO] [02-21|14:00:14.042] head updated hash=0x6bbbf5fb639db651234beaa2bad8336e619d01590a31149bd63b7520a9521f58 number=19276488
[INFO] [02-21|14:00:14.043] [snapshots] Prune Blocks to=19000000 limit=100
[WARN] [02-21|14:00:14.047] [snapshots] retire blocks err="DumpBlocks: DumpBodies: header missed in db: block_num=19000000, hash=cf384012b91b081230cdf17a3f7dd370d8e67056058af6b272b3d54aa2714fac"
Identical problem on mine.
Logs sample
Feb 21 19:03:26 erigon[16423]: [INFO] [02-21|19:03:26.435] RPC Daemon notified of new headers from=19279473 to=19279474 hash=0xb5aa8822e5f62e54e2d9b110fe360e358711ead6346b05504c08d521ce4a19b9 header sending=15.913µs log sending=257ns
Feb 21 19:03:26 erigon[16423]: [INFO] [02-21|19:03:26.436] head updated hash=0xb5aa8822e5f62e54e2d9b110fe360e358711ead6346b05504c08d521ce4a19b9 number=19279474
Feb 21 19:03:26 erigon[16423]: [INFO] [02-21|19:03:26.485] [snapshots] Prune Blocks to=19150000 limit=100
Feb 21 19:03:26 erigon[16423]: [WARN] [02-21|19:03:26.486] [snapshots] retire blocks err="DumpBlocks: DumpBodies: header missed in db: block_num=19150000, hash=9c87e585613f7ea1ebd89ec92a7bdc7e2ceaa5f8843c4237aaa8b44030532f19"
Feb 21 19:03:37 erigon[16423]: [INFO] [02-21|19:03:37.591] [NewPayload] Handling new payload height=19279475 hash=0xf9df32dda0808cdbd94652bd036887463d3d1d6c2b41540a92b01d9b1694c94b
Feb 21 19:03:41 erigon[16423]: [INFO] [02-21|19:03:41.866] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload)
Feb 21 19:03:42 erigon[16423]: [INFO] [02-21|19:03:42.201] RPC Daemon notified of new headers from=19279474 to=19279475 hash=0xf9df32dda0808cdbd94652bd036887463d3d1d6c2b41540a92b01d9b1694c94b header sending=16.059µs log sending=493ns
Feb 21 19:03:42 erigon[16423]: [INFO] [02-21|19:03:42.202] head updated hash=0xf9df32dda0808cdbd94652bd036887463d3d1d6c2b41540a92b01d9b1694c94b number=19279475
Feb 21 19:03:42 erigon[16423]: [INFO] [02-21|19:03:42.332] [snapshots] Prune Blocks to=19150000 limit=100
Feb 21 19:03:42 erigon[16423]: [WARN] [02-21|19:03:42.333] [snapshots] retire blocks err="DumpBlocks: DumpBodies: header missed in db: block_num=19150000, hash=9c87e585613f7ea1ebd89ec92a7bdc7e2ceaa5f8843c4237aaa8b44030532f19"
Feb 21 19:03:48 erigon[16423]: [INFO] [02-21|19:03:48.996] [NewPayload] Handling new payload height=19279476 hash=0xe88d3373a340cbb6c951f1f9d75441fc136aa7d61429854d69871607146e4f98
Feb 21 19:03:50 erigon[16423]: [INFO] [02-21|19:03:50.574] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload)
Feb 21 19:03:50 erigon[16423]: [INFO] [02-21|19:03:50.806] RPC Daemon notified of new headers from=19279475 to=19279476 hash=0xe88d3373a340cbb6c951f1f9d75441fc136aa7d61429854d69871607146e4f98 header sending=16.048µs log sending=211ns
Feb 21 19:03:50 erigon[16423]: [INFO] [02-21|19:03:50.806] head updated hash=0xe88d3373a340cbb6c951f1f9d75441fc136aa7d61429854d69871607146e4f98 number=19279476
Feb 21 19:03:50 erigon[16423]: [INFO] [02-21|19:03:50.884] [snapshots] Prune Blocks to=19150000 limit=100
Feb 21 19:03:50 erigon[16423]: [WARN] [02-21|19:03:50.885] [snapshots] retire blocks err="DumpBlocks: DumpBodies: header missed in db: block_num=19150000, hash=9c87e585613f7ea1ebd89ec92a7bdc7e2ceaa5f8843c4237aaa8b44030532f19"
Feb 21 19:04:01 erigon[16423]: [INFO] [02-21|19:04:01.428] [NewPayload] Handling new payload height=19279477 hash=0x8469b4deb0bf718ceae415fcc9557ab14cf744ef98c68d383721adda7cd02593
System information
Erigon version: erigon version 2.58.0-9731f716
Seemingly related
https://github.com/ledgerwatch/erigon/issues/6636
temporary fix: --snap.stop permanent fix: integration stage_header --reset
interesting is, we have two nodes and this affects only one one of them. Both the same data, the same version. Weird.
same problem in my node also i have "can't build missed indices" error.
integration stage_header --reset doesn't fix problem.
Erigon Version: 2.58.1
Logs:
INFO[02-23|00:04:40.528] [1/12 Snapshots] Requesting downloads
INFO[02-23|00:04:42.225] [snapshots:download] Blocks Stat blocks=19150k indices=19150k alloc=4.7GB sys=5.6GB
INFO[02-23|00:04:42.225] [snapshots:missed-idx] Blocks Stat blocks=19150k indices=19150k alloc=4.7GB sys=5.6GB
EROR[02-23|00:04:42.782] Could not start execution service err="[1/12 Snapshots] can't build missed indices: TransactionsIdx: at=19150000-19160000, pre index building, expect: 1494586, got 7297745"
INFO[02-23|00:05:03.608] [Sentinel] Sentinel started app=caplin enr=enr:-Ly4QEczJkMibOYHW1rG2JP6WMCQT2l4kUoogbB55cIOseQSB3kgQ6ZSQpaMO_j8-yAR5qgAjaaUmQGt4HT0rcA6JIMBh2F0dG5ldHOIAAAAAAAAAACEZXRoMpC7pNqWBAAAAAAAAAAABB0AgmlkgnY0gmlwhH8AAAGJc2VjcDI1NmsxoQNaDSrBk_V6BSTZn7J1_ZV8-pzmeV-33HgUthW0umdJtohzeW5jbmV0cwCDdGNwgg-hg3VkcIIPoA
INFO[02-23|00:05:03.608] Static peers len=0
INFO[02-23|00:05:03.608] Started Ethereum 2.0 Gossip Service app=caplin
INFO[02-23|00:05:28.806] [Caplin] starting clstages loop app=caplin
INFO[02-23|00:05:28.806] Starting downloading History app=caplin stage=DownloadHistoricalBlocks from=8485120
INFO[02-23|00:05:30.033] Backfilling finished
INFO[02-23|00:05:33.811] Ready to insert history, waiting for sync cycle to finish app=caplin stage=DownloadHistoricalBlocks
INFO[02-23|00:05:33.815] [Caplin] Forward Sync app=caplin stage=ForwardSync from=8485120 to=8485225
INFO[02-23|00:05:48.839] Adding light client update period=1035
INFO[02-23|00:06:03.608] P2P app=caplin peers=30
INFO[02-23|00:06:06.055] [4/12 Execution] Completed on block=19286611
INFO[02-23|00:06:18.251] RPC Daemon notified of new headers from=19286596 to=19286611 hash=0xccda8961b36855dc8354cbdfa796bde43a9664a3f46f71facac2a6a51a6a019e header sending=32.13µs log sending=220ns
INFO[02-23|00:06:18.251] head updated hash=0xccda8961b36855dc8354cbdfa796bde43a9664a3f46f71facac2a6a51a6a019e number=19286611
INFO[02-23|00:06:18.251] [snapshots] Prune Blocks to=19150000 limit=100
WARN[02-23|00:06:18.252] [snapshots] retire blocks err="DumpBlocks: DumpBodies: header missed in db: block_num=19150000, hash=9c87e585613f7ea1ebd89ec92a7bdc7e2ceaa5f8843c4237aaa8b44030532f19"
INFO[02-23|00:06:43.830] [4/12 Execution] Completed on block=19286614
INFO[02-23|00:06:46.528] RPC Daemon notified of new headers from=19286611 to=19286614 hash=0xcd98e735fce6d47e68c96986c67e01968fefac9815d3b129c04fd19a14d4f7ae header sending=38.012µs log sending=221ns
INFO[02-23|00:06:46.528] head updated hash=0xcd98e735fce6d47e68c96986c67e01968fefac9815d3b129c04fd19a14d4f7ae number=19286614
INFO[02-23|00:06:46.529] [snapshots] Prune Blocks to=19150000 limit=100
WARN[02-23|00:06:46.530] [snapshots] retire blocks err="DumpBlocks: DumpBodies: header missed in db: block_num=19150000, hash=9c87e585613f7ea1ebd89ec92a7bdc7e2ceaa5f8843c4237aaa8b44030532f19"
INFO[02-23|00:07:03.609] P2P app=caplin peers=14
INFO[02-23|00:07:25.947] [p2p] GoodPeers eth68=3 eth66=2 eth67=6
INFO[02-23|00:07:36.540] [txpool] stat pending=729 baseFee=506 queued=2631 alloc=10.1GB sys=12.3GB
INFO[02-23|00:08:03.614] P2P app=caplin peers=17
INFO[02-23|00:09:03.608] P2P app=caplin peers=22
INFO[02-23|00:09:10.881] [4/12 Execution] Completed on block=19286626
I've noticed we get this issue pretty consistently on Goerli when stopping, starting a node. For example on stopping we see this WARN
:
Feb 24 05:19:28 ip-10-0-65-9 erigon[9174]: [WARN] [02-24|05:19:28.340] [snapshots] retire blocks err="DumpBlocks: compress: context canceled"
If we see that warning, then the next startup we always see:
Feb 24 05:38:21 ip-10-0-65-9 erigon[57376]: [WARN] [02-24|05:38:21.672] [1/12 Snapshots] Some blocks are not in snapshots and not in db max_in_snapshots=8499999 min_in_db=10446000
Feb 24 05:38:21 ip-10-0-65-9 erigon[57376]: [INFO] [02-24|05:38:21.672] [snapshots:download] Blocks Stat blocks=8500k indices=8500k alloc=1.1GB sys=1.7GB
Feb 24 05:38:21 ip-10-0-65-9 erigon[57376]: [INFO] [02-24|05:38:21.678] [snapshots] Prune Blocks to=8500000 limit=100
Feb 24 05:38:21 ip-10-0-65-9 erigon[57376]: [WARN] [02-24|05:38:21.679] [snapshots] retire blocks err="DumpBlocks: DumpBodies: header missed in db: block_num=8500000>
Permanent fix:
integration stage_header --reset
From https://github.com/ledgerwatch/erigon/issues/9479#issuecomment-1958530855
This indeed worked for me. It took way longer than I expected—about 5 days, on a non-networked (local) NVMe SSD—but it does appear to have succeeded.
found 1 reason of it https://github.com/ledgerwatch/erigon/pull/9555 If there is a gap in snapshots (for example if file manually deleted): erigon will ignore all snapshots after gap and will try to create snapshots starting from block "before gap" and of-course doesn't have such blocks (erigon already created older snapshots in the past and removed such block). So, there is no bug in prune/retirement code of blocks, but just gap in snap files. I will print it to debug logs now and maybe we will do something else about it in future.
I had the same issue.
Using the -snap.stop
temporary fix seemed to work.
Thank you @AskAlexSharov for your suggestions and also a great project 🙌