macOS malloc hooking is incomplete
The C++ tests are broken on macOS. At thread exit, _pthread_tsd_cleanup calls find_zone_and_free(), which goes to the system allocator instead of IsoAlloc. The system allocator does not recognize the pointer and aborts with "pointer being freed was not allocated". Other platforms are not affected.
$ lldb build/cxx_tests
(lldb) target create "build/cxx_tests"
Current executable set to '/Users/user/code/mycode/isoalloc/build/cxx_tests' (arm64).
(lldb) r
Process 9083 launched: '/Users/user/code/mycode/isoalloc/build/cxx_tests' (arm64)
cxx_tests(9083,0x16fe87000) malloc: *** error for object 0x2b97f57fb980: pointer being freed was not allocated
cxx_tests(9083,0x16fe87000) malloc: *** set a breakpoint in malloc_error_break to debug
Process 9083 stopped
* thread #2, stop reason = EXC_BREAKPOINT (code=1, subcode=0x18e7bcaf4)
frame #0: 0x000000018e7bcaf4 libsystem_c.dylib`__abort + 168
libsystem_c.dylib`:
-> 0x18e7bcaf4 <+168>: brk #0x1
libsystem_c.dylib`abort_report_np:
0x18e7bcaf8 <+0>: pacibsp
0x18e7bcafc <+4>: sub sp, sp, #0x30
0x18e7bcb00 <+8>: stp x20, x19, [sp, #0x10]
Target 0: (cxx_tests) stopped.
(lldb) bt
* thread #2, stop reason = EXC_BREAKPOINT (code=1, subcode=0x18e7bcaf4)
* frame #0: 0x000000018e7bcaf4 libsystem_c.dylib`__abort + 168
frame #1: 0x000000018e7bca4c libsystem_c.dylib`abort + 192
frame #2: 0x000000018e6d3b08 libsystem_malloc.dylib`malloc_vreport + 908
frame #3: 0x000000018e6d73f4 libsystem_malloc.dylib`malloc_report + 64
frame #4: 0x000000018e6ebebc libsystem_malloc.dylib`find_zone_and_free + 308
frame #5: 0x000000018e8aa978 libsystem_pthread.dylib`_pthread_tsd_cleanup + 488
frame #6: 0x000000018e8ad724 libsystem_pthread.dylib`_pthread_exit + 84
frame #7: 0x000000018e8ad040 libsystem_pthread.dylib`_pthread_start + 148
This seems to be because macOS calls pthread_key_create at startup from a constructor, and that call may be passing in a destructor function that we aren't hooking:
(lldb) bt
* thread #1, stop reason = breakpoint 3.1
* frame #0: 0x000000018e8a7570 libsystem_pthread.dylib`pthread_key_create
frame #1: 0x000000018e53f268 dyld`dyld4::RuntimeState::initialize() + 72
frame #2: 0x000000018e5644e0 dyld`dyld4::APIs::_libdyld_initialize(dyld4::LibSystemHelpers const*) + 400
frame #3: 0x000000019b102684 libSystem.B.dylib`libSystem_initializer + 196
frame #4: 0x000000018e54da24 dyld`invocation function for block in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const::$_0::operator()() const + 168
frame #5: 0x000000018e593328 dyld`invocation function for block in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 340
frame #6: 0x000000018e586668 dyld`invocation function for block in dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 496
frame #7: 0x000000018e52d2fc dyld`dyld3::MachOFile::forEachLoadCommand(Diagnostics&, void (load_command const*, bool&) block_pointer) const + 300
frame #8: 0x000000018e5856a0 dyld`dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 192
frame #9: 0x000000018e592e3c dyld`dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 516
frame #10: 0x000000018e549b38 dyld`dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 524
frame #11: 0x000000018e553928 dyld`dyld4::PrebuiltLoader::runInitializers(dyld4::RuntimeState&) const + 44
frame #12: 0x000000018e56f360 dyld`dyld4::APIs::runAllInitializersForMain() + 84
frame #13: 0x000000018e531fa0 dyld`dyld4::prepare(dyld4::APIs&, dyld3::MachOAnalyzer const*) + 3192
frame #14: 0x000000018e530edc dyld`start + 1844
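For anyone following along, here is a minimal sketch of the mechanism the two backtraces point at: a destructor passed to pthread_key_create is called by the pthread runtime at thread exit with whatever value the thread stored under that key. The names demo_key, demo_destructor, demo_thread and run_tsd_demo are made up for illustration and are not IsoAlloc or dyld code. I am assuming the dyld-registered destructor behaves like this one and ends up freeing the stored pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical names, for illustration only. */
static pthread_key_t demo_key;
static int destructor_calls = 0;

/* Called by the pthread runtime at thread exit with the thread's
 * non-NULL value for demo_key. */
static void demo_destructor(void *value) {
    assert(value != NULL);
    destructor_calls++;
    /* This free() must resolve to the same allocator that produced
     * value. If malloc is hooked but this call path is not, the
     * system allocator sees a pointer it never handed out. */
    free(value);
}

static void *demo_thread(void *arg) {
    (void)arg;
    void *p = malloc(32);
    if (p == NULL) {
        return NULL;
    }
    /* Store the allocation as thread-specific data. */
    if (pthread_setspecific(demo_key, p) != 0) {
        free(p);
    }
    return NULL;
}

/* Returns how many times the destructor ran, or -1 on setup failure. */
int run_tsd_demo(void) {
    pthread_t t;
    destructor_calls = 0;
    if (pthread_key_create(&demo_key, demo_destructor) != 0) {
        return -1;
    }
    if (pthread_create(&t, NULL, demo_thread, NULL) != 0) {
        pthread_key_delete(demo_key);
        return -1;
    }
    /* Thread-specific destructors run before the join completes. */
    pthread_join(t, NULL);
    pthread_key_delete(demo_key);
    return destructor_calls;
}
```

Calling run_tsd_demo() from a small main() shows the destructor firing once for the worker thread. In the failing test the equivalent free happens inside libsystem, which is why it lands in find_zone_and_free rather than in our hooked free.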
Setting -DMALLOC_HOOK=0 manually or in the Makefile will allow the tests to pass for now.
Unfortunately it looks like find_zone_and_free is not exported, so I can't easily hook it.
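A quick way to double-check the export status is to ask the dynamic loader for the symbol. This is a rough sketch, not part of IsoAlloc: symbol_is_exported and report_symbol are hypothetical helper names. dlsym only sees symbols that are visible to the dynamic linker, so a NULL result is consistent with the symbol being private, even though lldb can still display its name.

```c
#define _GNU_SOURCE /* exposes RTLD_DEFAULT on glibc; not needed on macOS */
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if name resolves through the global
 * symbol scope of the loaded images, 0 otherwise. */
int symbol_is_exported(const char *name) {
    assert(name != NULL);
    return dlsym(RTLD_DEFAULT, name) != NULL;
}

/* Hypothetical helper: prints one line per symbol checked. */
void report_symbol(const char *name) {
    printf("%-20s %s\n", name,
           symbol_is_exported(name) ? "exported" : "not exported");
}
```

Running report_symbol("malloc") and report_symbol("find_zone_and_free") side by side makes the difference visible: the first can be interposed by name, the second cannot.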
Stupid question, but why didn't the CI catch this? It's running on macOS :/
Good question, I am not sure. The CI could be running a different version of macOS and libc.
I think we just need to hook the destructor registered for the pthread key and have it invoke the correct free(). I will work on this. Edit: jemalloc implements all of this in tsd.c.