optimism
optimism copied to clipboard
geth crashing with OutOfMemory if request debug_traceTransaction
Describe the bug
If you request debug_traceTransaction for some tx hashed, Optimism client continuously consuming memory until it will be killed by oom-killer.
That happens all the time for (at least) these hashes:
"0xcf6e46a1f41e1678fba10590f9d092690c5e8fd2e85a3614715fb21caa74655d"
"0xacb46bc13597e0125a31af5f4162d5e2929a1049700e6f75f2141a522c08567f"
"0xe3463602219d0ed638f4b8db3f7126e1ee747435501812abf0cd38cf7b526969"
"0x6eeb08f0b8b1ba240430d213800770244257f4844a20addf0590b65e177bd589"
"0x6d6fc578e10d88295accecc0f4eb44a8013cb8d11850426672f821c6892d6798"
"0xff8cfb7966450ec5cdc3d71bac104265a332cf6ccd35c849ada0ddaf1e8683b8"
"0x2d579e19657855a85fee279bff59a313a34b25b4b26848b20216d7d8f5e0c9d1"
"0x815d44bc86298c1bdd936132d99f08f3b891e5efcab8e28a0be8e2440ccd6434"
To Reproduce
Just run this request:
curl http://localhost:8545 -d '{"method":"debug_traceTransaction","params":["0xacb46bc13597e0125a31af5f4162d5e2929a1049700e6f75f2141a522c08567f"],"id":1,"jsonrpc":"2.0"}' -H "Content-Type: application/json"
curl: (56) Recv failure: Connection reset by peer
After the crash, you can find these messages in the dmesg
[Thu Feb 24 14:04:34 2022] geth invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[Thu Feb 24 14:04:34 2022] CPU: 12 PID: 319878 Comm: geth Not tainted 5.4.0-97-generic #110-Ubuntu
[Thu Feb 24 14:04:34 2022] Hardware name: DigitalOcean Droplet, BIOS 20171212 12/12/2017
[Thu Feb 24 14:04:34 2022] Call Trace:
[Thu Feb 24 14:04:34 2022] dump_stack+0x6d/0x8b
[Thu Feb 24 14:04:34 2022] dump_header+0x4f/0x1eb
[Thu Feb 24 14:04:34 2022] oom_kill_process.cold+0xb/0x10
[Thu Feb 24 14:04:34 2022] out_of_memory+0x1cf/0x4d0
[Thu Feb 24 14:04:34 2022] __alloc_pages_slowpath+0xd5e/0xe50
[Thu Feb 24 14:04:34 2022] __alloc_pages_nodemask+0x2d0/0x320
[Thu Feb 24 14:04:34 2022] alloc_pages_current+0x87/0xe0
[Thu Feb 24 14:04:34 2022] __page_cache_alloc+0x72/0x90
[Thu Feb 24 14:04:34 2022] pagecache_get_page+0xbf/0x300
[Thu Feb 24 14:04:34 2022] filemap_fault+0x6b2/0xa50
[Thu Feb 24 14:04:34 2022] ? native_flush_tlb_others+0x105/0x170
[Thu Feb 24 14:04:34 2022] ? xas_load+0xd/0x80
[Thu Feb 24 14:04:34 2022] ? xas_find+0x17f/0x1c0
[Thu Feb 24 14:04:34 2022] ? filemap_map_pages+0x24c/0x380
[Thu Feb 24 14:04:34 2022] ext4_filemap_fault+0x32/0x50
[Thu Feb 24 14:04:34 2022] __do_fault+0x3c/0x130
[Thu Feb 24 14:04:34 2022] do_fault+0x24b/0x640
[Thu Feb 24 14:04:34 2022] __handle_mm_fault+0x4c5/0x7a0
[Thu Feb 24 14:04:34 2022] handle_mm_fault+0xca/0x200
[Thu Feb 24 14:04:34 2022] do_user_addr_fault+0x1f9/0x450
[Thu Feb 24 14:04:34 2022] __do_page_fault+0x58/0x90
[Thu Feb 24 14:04:34 2022] ? exit_to_usermode_loop+0xbf/0x160
[Thu Feb 24 14:04:34 2022] do_page_fault+0x2c/0xe0
[Thu Feb 24 14:04:34 2022] do_async_page_fault+0x39/0x70
[Thu Feb 24 14:04:34 2022] async_page_fault+0x34/0x40
[Thu Feb 24 14:04:34 2022] RIP: 0033:0x4536f8
[Thu Feb 24 14:04:34 2022] Code: Bad RIP value.
[Thu Feb 24 14:04:34 2022] RSP: 002b:000000c0066e71c8 EFLAGS: 00010206
[Thu Feb 24 14:04:34 2022] RAX: 00000000016d2d88 RBX: 000000000195d600 RCX: 000000000195d600
[Thu Feb 24 14:04:34 2022] RDX: 00000000002a29a0 RSI: 000000000166e060 RDI: 0000000000005d40
[Thu Feb 24 14:04:34 2022] RBP: 000000c0066e71d0 R08: 000000000166e060 R09: 000000000000013e
[Thu Feb 24 14:04:34 2022] R10: 000000000043c480 R11: 00007f80eed76fff R12: 000000c0066e7260
[Thu Feb 24 14:04:34 2022] R13: 000000c85c66c280 R14: 000000c001c74680 R15: 00007f80944b9762
[Thu Feb 24 14:04:34 2022] Mem-Info:
[Thu Feb 24 14:04:34 2022] active_anon:16111429 inactive_anon:47 isolated_anon:0
active_file:370 inactive_file:44 isolated_file:0
unevictable:4616 dirty:0 writeback:0 unstable:0
slab_reclaimable:125951 slab_unreclaimable:86997
mapped:2347 shmem:290 pagetables:33515 bounce:0
free:81575 free_pcp:0 free_cma:0
[Thu Feb 24 14:04:34 2022] Node 0 active_anon:64445716kB inactive_anon:188kB active_file:1480kB inactive_file:176kB unevictable:18464kB isolated(anon):0kB isolated(file):0kB mapped:9388kB dirty:0kB writeback:0kB shmem:1160kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 1460224kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[Thu Feb 24 14:04:34 2022] Node 0 DMA free:15908kB min:16kB low:28kB high:40kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Thu Feb 24 14:04:34 2022] lowmem_reserve[]: 0 3477 64265 64265 64265
[Thu Feb 24 14:04:34 2022] Node 0 DMA32 free:246796kB min:3656kB low:7216kB high:10776kB active_anon:3322876kB inactive_anon:0kB active_file:12kB inactive_file:112kB unevictable:0kB writepending:0kB present:3653596kB managed:3588060kB mlocked:0kB kernel_stack:64kB pagetables:6620kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Thu Feb 24 14:04:34 2022] lowmem_reserve[]: 0 0 60787 60787 60787
[Thu Feb 24 14:04:34 2022] Node 0 Normal free:63596kB min:63908kB low:126152kB high:188396kB active_anon:61122376kB inactive_anon:188kB active_file:1108kB inactive_file:0kB unevictable:18464kB writepending:0kB present:63438848kB managed:62254300kB mlocked:18464kB kernel_stack:9600kB pagetables:127440kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Thu Feb 24 14:04:34 2022] lowmem_reserve[]: 0 0 0 0 0
[Thu Feb 24 14:04:34 2022] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15908kB
[Thu Feb 24 14:04:34 2022] Node 0 DMA32: 464*4kB (UME) 298*8kB (UME) 418*16kB (UME) 387*32kB (UME) 287*64kB (UME) 187*128kB (UME) 89*256kB (UME) 36*512kB (UME) 9*1024kB (UE) 2*2048kB (ME) 31*4096kB (ME) = 247120kB
[Thu Feb 24 14:04:34 2022] Node 0 Normal: 708*4kB (UME) 267*8kB (UME) 376*16kB (ME) 1825*32kB (E) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 69384kB
[Thu Feb 24 14:04:34 2022] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[Thu Feb 24 14:04:34 2022] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Thu Feb 24 14:04:34 2022] 3079 total pagecache pages
[Thu Feb 24 14:04:34 2022] 0 pages in swap cache
[Thu Feb 24 14:04:34 2022] Swap cache stats: add 0, delete 0, find 0/0
[Thu Feb 24 14:04:34 2022] Free swap = 0kB
[Thu Feb 24 14:04:34 2022] Total swap = 0kB
[Thu Feb 24 14:04:34 2022] 16777109 pages RAM
[Thu Feb 24 14:04:34 2022] 0 pages HighMem/MovableOnly
[Thu Feb 24 14:04:34 2022] 312542 pages reserved
[Thu Feb 24 14:04:34 2022] 0 pages cma reserved
[Thu Feb 24 14:04:34 2022] 0 pages hwpoisoned
[Thu Feb 24 14:04:34 2022] Tasks state (memory values in pages):
[Thu Feb 24 14:04:34 2022] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[Thu Feb 24 14:04:34 2022] [ 623] 0 623 29706 1054 253952 0 -250 systemd-journal
[Thu Feb 24 14:04:34 2022] [ 825] 0 825 70034 4485 98304 0 -1000 multipathd
[Thu Feb 24 14:04:34 2022] [ 868] 102 868 22558 732 81920 0 0 systemd-timesyn
[Thu Feb 24 14:04:34 2022] [ 934] 100 934 4604 785 77824 0 0 systemd-network
[Thu Feb 24 14:04:34 2022] [ 970] 101 970 6014 1674 86016 0 0 systemd-resolve
[Thu Feb 24 14:04:34 2022] [ 996] 0 996 4730 781 57344 0 -1000 systemd-udevd
[Thu Feb 24 14:04:34 2022] [ 1111] 0 1111 60258 258 106496 0 0 accounts-daemon
[Thu Feb 24 14:04:34 2022] [ 1114] 0 1114 2134 591 53248 0 0 cron
[Thu Feb 24 14:04:34 2022] [ 1118] 103 1118 2007 901 49152 0 -900 dbus-daemon
[Thu Feb 24 14:04:34 2022] [ 1124] 0 1124 20482 530 57344 0 0 irqbalance
[Thu Feb 24 14:04:34 2022] [ 1125] 0 1125 7299 2746 90112 0 0 networkd-dispat
[Thu Feb 24 14:04:34 2022] [ 1127] 104 1127 56087 478 86016 0 0 rsyslogd
[Thu Feb 24 14:04:34 2022] [ 1134] 0 1134 4210 803 73728 0 0 systemd-logind
[Thu Feb 24 14:04:34 2022] [ 1139] 0 1139 98706 1338 126976 0 0 udisksd
[Thu Feb 24 14:04:34 2022] [ 1146] 0 1146 948 521 45056 0 0 atd
[Thu Feb 24 14:04:34 2022] [ 1191] 0 1191 1838 439 45056 0 0 agetty
[Thu Feb 24 14:04:34 2022] [ 1202] 0 1202 1457 386 53248 0 0 agetty
[Thu Feb 24 14:04:34 2022] [ 1214] 0 1214 27027 2725 106496 0 0 unattended-upgr
[Thu Feb 24 14:04:34 2022] [ 1224] 0 1224 59105 661 90112 0 0 polkitd
[Thu Feb 24 14:04:34 2022] [ 1296] 0 1296 3044 834 61440 0 -1000 sshd
[Thu Feb 24 14:04:34 2022] [ 5579] 112 5579 999286 25345 1101824 0 0 agent
[Thu Feb 24 14:04:34 2022] [ 5580] 112 5580 846820 6224 626688 0 0 process-agent
[Thu Feb 24 14:04:34 2022] [ 5581] 112 5581 840052 4468 573440 0 0 trace-agent
[Thu Feb 24 14:04:34 2022] [ 15422] 997 15422 374101 19191 3772416 0 0 node
[Thu Feb 24 14:04:34 2022] [2266772] 107 2266772 2437 170 53248 0 0 uuidd
[Thu Feb 24 14:04:34 2022] [3386667] 0 3386667 361977 353 188416 0 -900 droplet-agent
[Thu Feb 24 14:04:34 2022] [ 319732] 997 319732 2187 513 53248 0 0 start.sh
[Thu Feb 24 14:04:34 2022] [ 319736] 997 319736 17187097 16036736 130777088 0 0 geth
[Thu Feb 24 14:04:34 2022] [1027034] 0 1027034 569154 4478 417792 0 -900 snapd
[Thu Feb 24 14:04:34 2022] [2028985] 0 2028985 3453 911 65536 0 0 sshd
[Thu Feb 24 14:04:34 2022] [2028998] 0 2028998 4588 1147 73728 0 0 systemd
[Thu Feb 24 14:04:34 2022] [2029001] 0 2029001 42494 1163 102400 0 0 (sd-pam)
[Thu Feb 24 14:04:34 2022] [2029130] 0 2029130 2540 977 61440 0 0 bash
[Thu Feb 24 14:04:34 2022] [2029718] 0 2029718 25081 748 98304 0 0 curl
[Thu Feb 24 14:04:34 2022] [2029790] 0 2029790 3451 938 61440 0 0 sshd
[Thu Feb 24 14:04:34 2022] [2029869] 0 2029869 2508 900 53248 0 0 bash
[Thu Feb 24 14:04:34 2022] [2029878] 0 2029878 2153 751 57344 0 0 htop
[Thu Feb 24 14:04:34 2022] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/optimism.service,task=geth,pid=319736,uid=997
[Thu Feb 24 14:04:34 2022] Out of memory: Killed process 319736 (geth) total-vm:68748388kB, anon-rss:64146944kB, file-rss:0kB, shmem-rss:0kB, UID:997 pgtables:127712kB oom_score_adj:0
[Thu Feb 24 14:04:37 2022] oom_reaper: reaped process 319736 (geth), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Expected behavior I expect to receive the properly formed json response.
System Specs:
- OS: Ubuntu 20.06 / 64G RAM / 32core CPU
- Package Version (or commit hash):
@eth-optimism/[email protected]
Additional context Looks like related to geth issue - https://github.com/ethereum/go-ethereum/issues/22244
Hey @McSim85. This is most likely because those transactions touch alot of memory and each slot access is being traced by default. You could try disabling memory traces as in this comment? If you need to observe memory accesses, it'll be better to write a custom tracer script that does traces memory accesses more efficiently.
came across another tx 0x6b25cfeb35d16898b80400b6d3679d76aaca2aa1f9470707b50825ac8502c7ff that's crashing my nodes. I tried with the above mentioned workaround by passing the { "disableMemory": true} but no luck.
Any update on this?