erigon

Occasional memory reference errors while handling GET_BLOCK_HEADERS_66 in latest alpha/devel branches

Open mconover opened this issue 2 years ago • 3 comments

System information

Erigon version (./erigon --version): erigon.exe version 2022.08.1-alpha-7dabdc32 (tag v2022.08.01, latest alpha branch)

OS & Version: Windows 11

Commit hash : 7dabdc3269f1e3b098fee5ab4d96ec843823a6fd

Expected behaviour

Handle GET_BLOCK_HEADERS_66 messages in RecvUploadHeadersMessage without invalid memory references

Actual behaviour

I reproduced this behavior with both the latest alpha branch and the latest devel branch, so it isn't specific to the alpha branch. It happens intermittently: not on every block, but a few times per day, and I have seen it on multiple occasions over several days.

INFO[08-11|10:47:35.303] [6/16 Execution] Executed blocks number=11250271 blk/s=9.1 tx/s=1743.3 Mgas/s=111.2 gasState=0.47 batch=722.6MB alloc=4.8GB sys=9.8GB
WARN[08-11|10:47:42.480] Handling incoming message stream=RecvUploadHeadersMessage err="runtime error: invalid memory address or nil pointer dereference, msgID=GET_BLOCK_HEADERS_66, trace: [sentry_multi_client.go:638 panic.go:838 panic.go:220 signal_windows.go:255 decompress.go:324 decompress.go:405 block_reader.go:504 block_reader.go:242 block_snapshots.go:271 block_snapshots.go:717 block_reader.go:241 handlers.go:68 sentry_multi_client.go:521 kv_mdbx.go:669 sentry_multi_client.go:520 sentry_multi_client.go:671 sentry_multi_client.go:642 sentry_multi_client.go:211 asm_amd64.s:1571]"
WARN[08-11|10:47:59.711] Handling incoming message stream=RecvUploadHeadersMessage err="runtime error: invalid memory address or nil pointer dereference, msgID=GET_BLOCK_HEADERS_66, trace: [sentry_multi_client.go:638 panic.go:838 panic.go:220 signal_windows.go:255 decompress.go:324 decompress.go:405 block_reader.go:504 block_reader.go:242 block_snapshots.go:271 block_snapshots.go:717 block_reader.go:241 handlers.go:68 sentry_multi_client.go:521 kv_mdbx.go:669 sentry_multi_client.go:520 sentry_multi_client.go:671 sentry_multi_client.go:642 sentry_multi_client.go:211 asm_amd64.s:1571]"
INFO[08-11|10:48:05.543] [txpool] stat block=15311269 pending=9990 baseFee=28337 queued=30000 alloc=5.5GB sys=9.8GB
INFO[08-11|10:48:09.665] [6/16 Execution] Executed blocks number=11250459 blk/s=5.5 tx/s=1011.6 Mgas/s=66.6 gasState=0.48 batch=729.1MB alloc=5.5GB sys=9.8GB
INFO[08-11|10:48:15.333] [6/16 Execution] Executed blocks number=11250511 blk/s=9.2 tx/s=1827.0 Mgas/s=110.0 gasState=0.48 batch=730.9MB alloc=5.7GB sys=9.8GB
WARN[08-11|10:48:21.361] Handling incoming message stream=RecvUploadHeadersMessage err="runtime error: invalid memory address or nil pointer dereference, msgID=GET_BLOCK_HEADERS_66, trace: [sentry_multi_client.go:638 panic.go:838 panic.go:220 signal_windows.go:255 decompress.go:324 decompress.go:405 block_reader.go:504 block_reader.go:242 block_snapshots.go:271 block_snapshots.go:717 block_reader.go:241 handlers.go:68 sentry_multi_client.go:521 kv_mdbx.go:669 sentry_multi_client.go:520 sentry_multi_client.go:671 sentry_multi_client.go:642 sentry_multi_client.go:211 asm_amd64.s:1571]"

Steps to reproduce the behaviour

I was syncing the full history.

Backtrace

[sentry_multi_client.go:638 panic.go:838 panic.go:220 signal_windows.go:255 decompress.go:324 decompress.go:405 block_reader.go:504 block_reader.go:242 block_snapshots.go:271 block_snapshots.go:717 block_reader.go:241 handlers.go:68 sentry_multi_client.go:521 kv_mdbx.go:669 sentry_multi_client.go:520 sentry_multi_client.go:671 sentry_multi_client.go:642 sentry_multi_client.go:211 asm_amd64.s:1571]

mconover avatar Aug 11 '22 18:08 mconover

Let's do it this way:

  1. start erigon with the flag --downloader.verify and wait for verification to complete
  2. then run ./build/bin/erigon snapshots index --rebuild --datadir=<your_datadir> --chain=<your_chain>
  3. then start erigon as usual

I will also try to add more context info to such errors.
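For readers wondering what "more context" on such an error could look like, here is a minimal Go sketch. It is not Erigon's actual code: `headerRequest`, `segment`, and `handleWithContext` are hypothetical names made up for illustration, and the nil `*segment` only models a snapshot file that is missing or not yet indexed. It shows a handler wrapper that recovers from a nil pointer panic and returns an error carrying the request details alongside the message ID.

```go
package main

import "fmt"

// headerRequest is a hypothetical stand-in for a decoded
// GET_BLOCK_HEADERS_66 query (not Erigon's real type).
type headerRequest struct {
	PeerID string
	Origin uint64 // first block number requested
	Amount uint64 // number of headers requested
}

// segment is a hypothetical stand-in for a snapshot segment.
// A nil *segment models a file that is missing or not indexed.
type segment struct {
	headers map[uint64]string
}

// header dereferences s, so calling it on a nil *segment panics with
// "invalid memory address or nil pointer dereference".
func (s *segment) header(n uint64) string {
	return s.headers[n]
}

// handleWithContext runs fn and, if fn panics, converts the panic into
// an error that records which message and request triggered it.
func handleWithContext(msgID string, req headerRequest, fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v, msgID=%s, peer=%s, origin=%d, amount=%d",
				r, msgID, req.PeerID, req.Origin, req.Amount)
		}
	}()
	return fn()
}

func main() {
	var seg *segment // nil on purpose, to trigger the panic
	req := headerRequest{PeerID: "peer-1", Origin: 11250271, Amount: 192}

	err := handleWithContext("GET_BLOCK_HEADERS_66", req, func() error {
		_ = seg.header(req.Origin)
		return nil
	})
	fmt.Println("WARN Handling incoming message err:", err)
}
```

With the origin block and amount in the message, a report like the one above would show which block range the failing request touched, which in turn hints at which snapshot segment to verify or re-index.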

AskAlexSharov avatar Aug 12 '22 05:08 AskAlexSharov

I'll give that a try and follow up. Thanks!

mconover avatar Aug 12 '22 06:08 mconover

I built two Linux machines to sync mainnet with the same command, and they both panic with the exact same error. When I restart, it continues to work.

algtm avatar Aug 18 '22 14:08 algtm

This issue is stale because it has been open for 40 days with no activity. Remove stale label or comment, or this will be closed in 7 days.

github-actions[bot] avatar Sep 28 '22 04:09 github-actions[bot]

This issue was closed because it has been stalled for 7 days with no activity.

github-actions[bot] avatar Oct 08 '22 03:10 github-actions[bot]