AddressSanitizer: stack-use-after-scope on address in `RecResponseStats::RecResponseStats`
- [x] This is not a support question, I have read about opensource and will send support questions to the IRC channel, GitHub Discussions or the mailing list.
- [x] I have read and understood the 'out in the open' support policy
- [x] I have read and understood the PowerDNS AI policy
- Program: Recursor
- Issue type: Bug report
Short description
AddressSanitizer reports a stack-use-after-scope in `RecResponseStats::RecResponseStats` in the CI job "build recursor (autotools, asan+ubsan, full)".
Environment
- Operating system: Ubuntu 24.04.3 LTS
- Software version: master
- Software source: PowerDNS repository
Steps to reproduce
- Open #16481
- Trigger CI
- Get this random error
Expected behaviour
No random errors, just errors caused by my own mistakes (there were a handful here)
Actual behaviour
https://github.com/PowerDNS/pdns/actions/runs/19311542898/job/55232765186?pr=16481#step:13:3175
../test-filterpo_cc.cc(286): warning: in "test_filter_policies_wildcard_with_enc": Please fix issue #8231
../test-filterpo_cc.cc(287): warning: in "test_filter_policies_wildcard_with_enc": Please fix issue #8231
../test-filterpo_cc.cc(293): warning: in "test_filter_policies_wildcard_with_enc": Please fix issue #8231
../test-filterpo_cc.cc(294): warning: in "test_filter_policies_wildcard_with_enc": Please fix issue #8231
../test-filterpo_cc.cc(300): warning: in "test_filter_policies_wildcard_with_enc": Please fix issue #8231
../test-filterpo_cc.cc(301): warning: in "test_filter_policies_wildcard_with_enc": Please fix issue #8231
=================================================================
==31205==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7f8c6a5a6f00 at pc 0x5651a8b2f74c bp 0x7f8c6a5a2ff0 sp 0x7f8c6a5a27b8
WRITE of size 2280 at 0x7f8c6a5a6f00 thread T3
#0 0x5651a8b2f74b in __asan_memset (/__w/pdns/pdns/pdns/recursordist/pdns-recursor-0.0.0-git1/testrunner+0xd2074b)
#1 0x5651a8ef45b5 in RecResponseStats::RecResponseStats() /__w/pdns/pdns/pdns/recursordist/pdns-recursor-0.0.0-git1/../rec-responsestats.hh:73:53
#2 0x5651a91cb715 in rec::Counters::Counters() /__w/pdns/pdns/pdns/recursordist/pdns-recursor-0.0.0-git1/../rec-tcounters.hh:223:33
#3 0x5651a91b4725 in pdns::TLocalCounters<rec::Counters>::TLocalCounters(pdns::GlobalCounters<rec::Counters>&, timeval) /__w/pdns/pdns/pdns/recursordist/pdns-recursor-0.0.0-git1/../tcounters.hh:131:3
#4 0x5651a8a8eac6 in __cxx_global_var_init.10 /__w/pdns/pdns/pdns/recursordist/pdns-recursor-0.0.0-git1/../test-rec-tcounters_cc.cc:19:36
#5 0x5651a982cf7e in __tls_init /__w/pdns/pdns/pdns/recursordist/pdns-recursor-0.0.0-git1/../test-rec-tcounters_cc.cc
#6 0x5651a9829a18 in thread-local wrapper routine for tlocal test-rec-tcounters_cc.cc
#7 0x5651a9829dc6 in test_rec_tcounters_cc::update_fast::test_method()::$_2::operator()() const /__w/pdns/pdns/pdns/recursordist/pdns-recursor-0.0.0-git1/../test-rec-tcounters_cc.cc:53:9
#8 0x5651a9829dc6 in void std::__invoke_impl<void, test_rec_tcounters_cc::update_fast::test_method()::$_2>(std::__invoke_other, test_rec_tcounters_cc::update_fast::test_method()::$_2&&) /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/invoke.h:61:14
#9 0x5651a9829c66 in std::__invoke_result<test_rec_tcounters_cc::update_fast::test_method()::$_2>::type std::__invoke<test_rec_tcounters_cc::update_fast::test_method()::$_2>(test_rec_tcounters_cc::update_fast::test_method()::$_2&&) /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/invoke.h:96:14
#10 0x5651a9829c66 in void std::thread::_Invoker<std::tuple<test_rec_tcounters_cc::update_fast::test_method()::$_2> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:252:13
#11 0x5651a9829c66 in std::thread::_Invoker<std::tuple<test_rec_tcounters_cc::update_fast::test_method()::$_2> >::operator()() /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:259:11
#12 0x5651a9829c66 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<test_rec_tcounters_cc::update_fast::test_method()::$_2> > >::_M_run() /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:210:13
#13 0x7f8c71bda4a2 (/lib/x86_64-linux-gnu/libstdc++.so.6+0xd44a2)
#14 0x7f8c718ab1f4 (/lib/x86_64-linux-gnu/libc.so.6+0x891f4)
#15 0x7f8c7192ab3f in clone (/lib/x86_64-linux-gnu/libc.so.6+0x108b3f)
Address 0x7f8c6a5a6f00 is a wild pointer inside of access range of size 0x0000000008e8.
SUMMARY: AddressSanitizer: stack-use-after-scope (/__w/pdns/pdns/pdns/recursordist/pdns-recursor-0.0.0-git1/testrunner+0xd2074b) in __asan_memset
Shadow bytes around the buggy address:
0x0ff20d4acd90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff20d4acda0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff20d4acdb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff20d4acdc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff20d4acdd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0ff20d4acde0:[f8]f8 f8 f8 00 00 00 00 f8 00 00 00 00 00 00 00
0x0ff20d4acdf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff20d4ace00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff20d4ace10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff20d4ace20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff20d4ace30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Thread T3 created by T0 here:
#0 0x5651a8b1a6bc in pthread_create (/__w/pdns/pdns/pdns/recursordist/pdns-recursor-0.0.0-git1/testrunner+0xd0b6bc)
#1 0x7f8c71bda578 in std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) (/lib/x86_64-linux-gnu/libstdc++.so.6+0xd4578)
#2 0x5651a98251ff in test_rec_tcounters_cc::update_fast_invoker() /__w/pdns/pdns/pdns/recursordist/pdns-recursor-0.0.0-git1/../test-rec-tcounters_cc.cc:44:1
#3 0x5651a931a831 in boost::detail::function::void_function_invoker0<void (*)(), void>::invoke(boost::detail::function::function_buffer&) /usr/include/boost/function/function_template.hpp:117:11
==31205==ABORTING
FAIL testrunner (exit status: 1)
Other information
This occasionally happens in the testrunner. Both @rgacogne and I tried to diagnose it, without success so far. Will revisit.
I have not been able to reproduce this locally on either macOS or Debian trixie. I'm trying to replicate the exact runtime, but now I'm running into an issue: the CI base image is Debian bookworm with clang-13, and it then runs an Ubuntu 24.04-based container. Ubuntu 24.04 itself does not have clang-13, so I suppose it used the compiler from the base image?
This makes it harder than needed to reproduce the runtime on a "real" Ubuntu VM. I would like to be able to use such a setup, since debugging in a container is such a pain.
My understanding is that the Docker host runs an Ubuntu 24.04-based container, but we build and run the unit tests in a Docker container based on the CI base image (Debian bookworm) with clang-13. In theory we should get the same behaviour using clang-13 in a bookworm VM, but since the issue seems to happen randomly, this is very annoying to investigate :-/
Ah, thanks, all these virtualization layers got me confused.
Running on a bookworm VM with a clang-13 build, I have also not been able to reproduce it so far. I will let the test loop run for a few more hours.
The test is still running. I also ran it under Valgrind, which did not spot any issue.
I'll stop investigating this now, but I'll leave the issue open, so we have a better chance of remembering that this is a somewhat rare, but still common enough, unit test issue that so far has only been observed in our CI.
@cmouse speculated:
I feel like I could understand the issue: it sounds like the stack-allocated thread local is being used via a reference, probably because of the `operator+=` overload. Then again, I might be totally wrong. It looks like it's caused by `++tlocal(something)`, that is, `++tlocal.at(rec::Counter::servFails);`. The declaration is `RecResponseStats& operator+=(const RecResponseStats&);` and I wonder if this should be `(const RecResponseStats)`.
That does not make a lot of sense to me. The call on the stack is a `+=` of a single counter, not of the whole struct. The issue seems to happen on the first reference to the thread local, during the initialization of the thread local itself.
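For reference, a minimal sketch of the pattern visible in frames #1 to #7 of the trace: a namespace-scope `thread_local` object with a non-trivial constructor, first touched from inside a freshly started `std::thread`. All names here (`FakeStats`, `t_stats`, `constructionsAfterWorker`) are hypothetical stand-ins, and the array layout is only an assumption chosen to match the 2280-byte write in the report; this is not the actual `RecResponseStats` or `TLocalCounters` code, and it is not expected to reproduce the ASan failure.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for a stats struct whose constructor zeroes a large array.
struct FakeStats
{
  FakeStats()
  {
    counters.fill(0); // bulk zeroing of the whole array
    ++constructed;    // record that a constructor ran
  }
  // 285 * 8 = 2280 bytes; the layout itself is an assumption, only the size matches the report.
  std::array<uint64_t, 285> counters;
  static std::atomic<int> constructed;
};
std::atomic<int> FakeStats::constructed{0};

// Namespace-scope thread_local with a dynamic initializer: every thread that
// uses it gets its own instance, constructed before that thread's first use.
thread_local FakeStats t_stats;

// Starts one worker thread that increments a single counter and returns how
// many FakeStats constructors ran while that worker existed.
int constructionsAfterWorker()
{
  const int before = FakeStats::constructed.load();
  std::thread worker([] {
    // First use of t_stats in this thread: its constructor runs here, on the
    // worker's own thread, before the increment.
    ++t_stats.counters.at(0);
  });
  worker.join();
  return FakeStats::constructed.load() - before;
}
```

Calling `constructionsAfterWorker()` should report exactly one construction per worker thread, which matches the reading that the constructor in the trace runs as part of the thread-local initialization (`__tls_init`, the thread-local wrapper routine) rather than as part of any `operator+=` on the whole struct.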