silkworm
silkworm copied to clipboard
Crash at public_key_to_address() (Senders stage)
INFO [09-30|21:24:48.410 UTC] [1/8 Senders] phase=2/3 blocks=10193021 current=20133 transactions=4458589 workers=16
INFO [09-30|21:25:18.410 UTC] [1/8 Senders] phase=2/3 blocks=10193021 current=39108 transactions=8654934 workers=16
INFO [09-30|21:25:48.410 UTC] [1/8 Senders] phase=2/3 blocks=10193021 current=67112 transactions=14686352 workers=16
INFO [09-30|21:26:08.721 UTC] Collector flushed file path=/blockchain/erigon-stable/etl-temp/137438013954776-0.bin size=256.00 MB in=395.503ms
INFO [09-30|21:26:18.410 UTC] [1/8 Senders] phase=2/3 blocks=10193021 current=100813 transactions=20717666 workers=15
Thread 20 "silkworm" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7cffbcffa640 (LWP 1320644)]
0x0000555555807e07 in public_key_to_address () at /home/ipsilon/silkworm/third_party/silkpre/lib/silkpre/ecdsa.c:54
54 memcpy(out, &key_hash.bytes[12], 20);
(gdb) bt
#0 0x0000555555807e07 in public_key_to_address () at /home/ipsilon/silkworm/third_party/silkpre/lib/silkpre/ecdsa.c:54
#1 silkpre_recover_address () at /home/ipsilon/silkworm/third_party/silkpre/lib/silkpre/ecdsa.c:64
#2 0x00005555557a971e in work () at /home/ipsilon/silkworm/node/silkworm/stagedsync/stage_senders/recovery_worker.cpp:65
#3 0x00005555556aed16 in operator() () at /home/ipsilon/silkworm/node/silkworm/concurrency/worker.cpp:43
#4 __invoke<(lambda at /home/ipsilon/silkworm/node/silkworm/concurrency/worker.cpp:36:45)> () at /usr/lib/llvm-14/bin/../include/c++/v1/type_traits:3640
#5 __thread_execute<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, (lambda at /home/ipsilon/silkworm/node/silkworm/concurrency/worker.cpp:36:45)> ()
at /usr/lib/llvm-14/bin/../include/c++/v1/thread:282
#6 __thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, (lambda at /home/ipsilon/silkworm/node/silkworm/concurrency/worker.cpp:36:45)> >(void)
() at /usr/lib/llvm-14/bin/../include/c++/v1/thread:293
#7 0x00007ffff7bc7b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#8 0x00007ffff7c59a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
@chfast Thanks for reporting this! I have some questions:
- is this happening during your test using a database primed by Erigon and starting the staged sync loop from Senders stage (by commenting out first 3 stages as suggested by @mriccobene )?
- what hardware platform and Silkworm commit did you use for running this test?
- did you experiment this once or repeatedly?
This happens after some minutes of running Senders state on Erigon database. This is heap-use-after-free bug.
=================================================================
==3478483==ERROR: AddressSanitizer: heap-use-after-free on address 0x7c295934bc80 at pc 0x55c9a54bd990 bp 0x7c2994dba190 sp 0x7c2994dba188
READ of size 8 at 0x7c295934bc80 thread T17
#0 0x55c9a54bd98f in silkworm::stagedsync::recovery::RecoveryWorker::work() /home/ipsilon/silkworm/node/silkworm/stagedsync/stage_senders/recovery_worker.cpp:61:38
#1 0x55c9a511499e in operator() /home/ipsilon/silkworm/node/silkworm/concurrency/worker.cpp:43:17
#2 0x55c9a511499e in __invoke<(lambda at /home/ipsilon/silkworm/node/silkworm/concurrency/worker.cpp:36:45)> /usr/lib/llvm-14/bin/../include/c++/v1/type_traits:3640
:23
#3 0x55c9a511499e in __thread_execute<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, (lambda at /home/ipsilo
n/silkworm/node/silkworm/concurrency/worker.cpp:36:45)> /usr/lib/llvm-14/bin/../include/c++/v1/thread:282:5
#4 0x55c9a511499e in void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_stru
ct> >, silkworm::Worker::start(bool)::$_0> >(void*) /usr/lib/llvm-14/bin/../include/c++/v1/thread:293:5
#5 0x7f29af5d0b42 in start_thread nptl/./nptl/pthread_create.c:442:8
#6 0x7f29af6629ff misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
0x7c295934bc80 is located 853120 bytes inside of 67108864-byte region [0x7c295927b800,0x7c295d27b800)
freed by thread T8 here:
#0 0x55c9a4f8e5ad in operator delete(void*) (/home/ipsilon/silkworm/build/asan/cmd/silkworm+0x2305ad) (BuildId: a5bf77ccfd350aa48011e40f73f9cc35c0a5777e)
#1 0x55c9a549f7d9 in __libcpp_operator_delete<void *> /usr/lib/llvm-14/bin/../include/c++/v1/new:255:3
#2 0x55c9a549f7d9 in __do_deallocate_handle_size<> /usr/lib/llvm-14/bin/../include/c++/v1/new:279:10
#3 0x55c9a549f7d9 in __libcpp_deallocate /usr/lib/llvm-14/bin/../include/c++/v1/new:295:14
#4 0x55c9a549f7d9 in deallocate /usr/lib/llvm-14/bin/../include/c++/v1/__memory/allocator.h:114:13
#5 0x55c9a549f7d9 in deallocate /usr/lib/llvm-14/bin/../include/c++/v1/__memory/allocator_traits.h:282:13
#6 0x55c9a549f7d9 in ~vector /usr/lib/llvm-14/bin/../include/c++/v1/vector:402:9
#7 0x55c9a549f7d9 in silkworm::stagedsync::recovery::RecoveryFarm::collect_workers_results() /home/ipsilon/silkworm/node/silkworm/stagedsync/stage_senders/recovery_
farm.cpp:249:5
#8 0x55c9a549651d in silkworm::stagedsync::recovery::RecoveryFarm::dispatch_batch() /home/ipsilon/silkworm/node/silkworm/stagedsync/stage_senders/recovery_farm.cpp:
336:30
#9 0x55c9a54945b5 in silkworm::stagedsync::recovery::RecoveryFarm::transform_and_fill_batch(unsigned long, std::__1::vector<silkworm::Transaction, std::__1::allocat
or<silkworm::Transaction> > const&) /home/ipsilon/silkworm/node/silkworm/stagedsync/stage_senders/recovery_farm.cpp:325:14
#10 0x55c9a548bcc6 in silkworm::stagedsync::recovery::RecoveryFarm::recover() /home/ipsilon/silkworm/node/silkworm/stagedsync/stage_senders/recovery_farm.cpp:117:34
#11 0x55c9a54815a1 in silkworm::stagedsync::Senders::forward(silkworm::db::RWTxn&) /home/ipsilon/silkworm/node/silkworm/stagedsync/stage_senders.cpp:33:27
previously allocated by thread T8 here:
#0 0x55c9a4f8dd4d in operator new(unsigned long) (/home/ipsilon/silkworm/build/asan/cmd/silkworm+0x22fd4d) (BuildId: a5bf77ccfd350aa48011e40f73f9cc35c0a5777e)
#1 0x55c9a54a9fba in __libcpp_operator_new<unsigned long> /usr/lib/llvm-14/bin/../include/c++/v1/new:245:10
#2 0x55c9a54a9fba in __libcpp_allocate /usr/lib/llvm-14/bin/../include/c++/v1/new:271:10
#3 0x55c9a54a9fba in allocate /usr/lib/llvm-14/bin/../include/c++/v1/__memory/allocator.h:105:38
#4 0x55c9a54a9fba in allocate /usr/lib/llvm-14/bin/../include/c++/v1/__memory/allocator_traits.h:262:20
#5 0x55c9a54a9fba in __split_buffer /usr/lib/llvm-14/bin/../include/c++/v1/__split_buffer:306:29
#6 0x55c9a54a9fba in void std::__1::vector<silkworm::stagedsync::recovery::RecoveryPackage, std::__1::allocator<silkworm::stagedsync::recovery::RecoveryPackage> >::
__push_back_slow_path<silkworm::stagedsync::recovery::RecoveryPackage>(silkworm::stagedsync::recovery::RecoveryPackage&&) /usr/lib/llvm-14/bin/../include/c++/v1/vector:
1517:49
#7 0x55c9a5493f82 in push_back /usr/lib/llvm-14/bin/../include/c++/v1/vector:1549:9
#8 0x55c9a5493f82 in silkworm::stagedsync::recovery::RecoveryFarm::transform_and_fill_batch(unsigned long, std::__1::vector<silkworm::Transaction, std::__1::allocat
or<silkworm::Transaction> > const&) /home/ipsilon/silkworm/node/silkworm/stagedsync/stage_senders/recovery_farm.cpp:314:16
#9 0x55c9a548bcc6 in silkworm::stagedsync::recovery::RecoveryFarm::recover() /home/ipsilon/silkworm/node/silkworm/stagedsync/stage_senders/recovery_farm.cpp:117:34
#10 0x55c9a54815a1 in silkworm::stagedsync::Senders::forward(silkworm::db::RWTxn&) /home/ipsilon/silkworm/node/silkworm/stagedsync/stage_senders.cpp:33:27
Thread T17 created by T8 here:
#0 0x55c9a4f3c3fc in __interceptor_pthread_create (/home/ipsilon/silkworm/build/asan/cmd/silkworm+0x1de3fc) (BuildId: a5bf77ccfd350aa48011e40f73f9cc35c0a5777e)
#1 0x55c9a511341b in __libcpp_thread_create /usr/lib/llvm-14/bin/../include/c++/v1/__threading_support:375:10
#2 0x55c9a511341b in thread<(lambda at /home/ipsilon/silkworm/node/silkworm/concurrency/worker.cpp:36:45), void> /usr/lib/llvm-14/bin/../include/c++/v1/thread:309:1
6
#3 0x55c9a511341b in make_unique<std::__1::thread, (lambda at /home/ipsilon/silkworm/node/silkworm/concurrency/worker.cpp:36:45)> /usr/lib/llvm-14/bin/../include/c+
+/v1/__memory/unique_ptr.h:725:32
#4 0x55c9a511341b in silkworm::Worker::start(bool) /home/ipsilon/silkworm/node/silkworm/concurrency/worker.cpp:36:15
#5 0x55c9a54a51c1 in silkworm::stagedsync::recovery::RecoveryFarm::initialize_new_worker() /home/ipsilon/silkworm/node/silkworm/stagedsync/stage_senders/recovery_fa
rm.cpp:390:26
#6 0x55c9a5496651 in silkworm::stagedsync::recovery::RecoveryFarm::dispatch_batch() /home/ipsilon/silkworm/node/silkworm/stagedsync/stage_senders/recovery_farm.cpp:
355:17
#7 0x55c9a54945b5 in silkworm::stagedsync::recovery::RecoveryFarm::transform_and_fill_batch(unsigned long, std::__1::vector<silkworm::Transaction, std::__1::allocat
or<silkworm::Transaction> > const&) /home/ipsilon/silkworm/node/silkworm/stagedsync/stage_senders/recovery_farm.cpp:325:14
#8 0x55c9a548bcc6 in silkworm::stagedsync::recovery::RecoveryFarm::recover() /home/ipsilon/silkworm/node/silkworm/stagedsync/stage_senders/recovery_farm.cpp:117:34
#9 0x55c9a54815a1 in silkworm::stagedsync::Senders::forward(silkworm::db::RWTxn&) /home/ipsilon/silkworm/node/silkworm/stagedsync/stage_senders.cpp:33:27
Thread T8 created by T0 here:
#0 0x55c9a4f3c3fc in __interceptor_pthread_create (/home/ipsilon/silkworm/build/asan/cmd/silkworm+0x1de3fc) (BuildId: a5bf77ccfd350aa48011e40f73f9cc35c0a5777e)
#1 0x55c9a511341b in __libcpp_thread_create /usr/lib/llvm-14/bin/../include/c++/v1/__threading_support:375:10
#2 0x55c9a511341b in thread<(lambda at /home/ipsilon/silkworm/node/silkworm/concurrency/worker.cpp:36:45), void> /usr/lib/llvm-14/bin/../include/c++/v1/thread:309:1
6
#3 0x55c9a511341b in make_unique<std::__1::thread, (lambda at /home/ipsilon/silkworm/node/silkworm/concurrency/worker.cpp:36:45)> /usr/lib/llvm-14/bin/../include/c+
+/v1/__memory/unique_ptr.h:725:32
#4 0x55c9a511341b in silkworm::Worker::start(bool) /home/ipsilon/silkworm/node/silkworm/concurrency/worker.cpp:36:15
#5 0x55c9a4f96252 in main /home/ipsilon/silkworm/cmd/silkworm.cpp:143:19
#6 0x7f29af565d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
SUMMARY: AddressSanitizer: heap-use-after-free /home/ipsilon/silkworm/node/silkworm/stagedsync/stage_senders/recovery_worker.cpp:61:38 in silkworm::stagedsync::recovery
::RecoveryWorker::work()
Shadow bytes around the buggy address:
0x0f85ab261740: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0f85ab261750: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0f85ab261760: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0f85ab261770: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0f85ab261780: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
=>0x0f85ab261790:[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0f85ab2617a0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0f85ab2617b0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0f85ab2617c0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0f85ab2617d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0f85ab2617e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==3478483==ABORTING
Recently I have sporadically worked at refactoring the Senders stage to simplify the code and overcome the root cause of this problem (or at least prepare the ground for easier debugging).
@chfast can you please take a look if https://github.com/torquem-ch/silkworm/commit/b30d780121d07e88bfbc4ff1cd2694c911eabed1 solves the issue on your machine?
Closed by b30d780