dev3 crash on Windows when High Entropy ASLR is disabled
Hi Daan, we're experiencing a 100% reproducible crash in mimalloc dev3 when High Entropy ASLR is disabled. mimalloc appears to misbehave either specifically when High Entropy ASLR is disabled or, more generally, when the kernel gives it a page at a very low address (<2GB). Call stack:
0:000> .exr -1
ExceptionAddress: 00007ff6a63f0ad4 (App.exe!mi_page_map_set_range+0x00000000000001f4)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 0000000000000001
Parameter[1]: 00000000033019b8
Attempt to write to address 00000000033019b8
0:000> k
# Child-SP RetAddr Call Site
00 00000000`00dff3b0 00007ff6`a63ebcb0 App.exe!mi_page_map_set_range+0x1f4 [mimalloc\src\page-map.c @ 264]
01 00000000`00dff490 00007ff6`a63ebefa App.exe!mi_arenas_page_alloc_fresh+0x380 [mimalloc\src\arena.c @ 709]
02 00000000`00dff570 00007ff6`a63f07b3 App.exe!mi_arenas_page_regular_alloc+0x21a [mimalloc\src\arena.c @ 727]
03 (Inline Function) --------`-------- App.exe!_mi_arenas_page_alloc+0x1b [mimalloc\src\arena.c @ 766]
04 00000000`00dff620 00007ff6`a63f0ee4 App.exe!mi_page_fresh_alloc+0x33 [mimalloc\src\page.c @ 305]
05 (Inline Function) --------`-------- App.exe!mi_page_fresh+0x12 [mimalloc\src\page.c @ 335]
06 00000000`00dff650 00007ff6`a63ee2d7 App.exe!mi_page_queue_find_free_ex+0x2b4 [mimalloc\src\page.c @ 807]
07 (Inline Function) --------`-------- App.exe!mi_find_free_page+0x31 [mimalloc\src\page.c @ 848]
08 00000000`00dff6b0 00007ff6`a63e68d7 App.exe!mi_find_page+0xa7 [mimalloc\src\page.c @ 929]
09 00000000`00dff6e0 00007ff6`a63f2398 App.exe!_mi_malloc_generic+0x157 [mimalloc\src\page.c @ 965]
0a 00000000`00dff730 00007ff6`a671b83e App.exe!mi_calloc+0x118 [mimalloc\src\alloc.c @ 233]
0b (Inline Function) --------`-------- App.exe!internal_get_ptd_head+0x4d [minkernel\crts\ucrt\src\appcrt\internal\per_thread_data.cpp @ 244]
0c (Inline Function) --------`-------- App.exe!internal_getptd_noexit+0x4d [minkernel\crts\ucrt\src\appcrt\internal\per_thread_data.cpp @ 271]
0d (Inline Function) --------`-------- App.exe!internal_getptd_noexit+0x53 [minkernel\crts\ucrt\src\appcrt\internal\per_thread_data.cpp @ 283]
0e 00000000`00dff760 00007ff6`a671b988 App.exe!__acrt_getptd_noexit+0x62 [minkernel\crts\ucrt\src\appcrt\internal\per_thread_data.cpp @ 295]
0f 00000000`00dff790 00007ff6`a672b0e5 App.exe!__acrt_initialize_ptd+0x24 [minkernel\crts\ucrt\src\appcrt\internal\per_thread_data.cpp @ 36]
10 00000000`00dff7c0 00007ff6`a66dc984 App.exe!__acrt_execute_initializers+0x35 [minkernel\crts\ucrt\src\appcrt\internal\shared_initialization.cpp @ 25]
11 00000000`00dff7f0 00007ff6`a66dcf84 App.exe!__scrt_initialize_crt+0x34 [crt\vcstartup\src\utility\utility.cpp @ 199]
12 00000000`00dff820 00007ff8`2761e8d7 App.exe!__scrt_common_main_seh+0x14 [crt\vcstartup\src\startup\exe_common.inl @ 237]
13 00000000`00dff860 00007ff8`285914fc KERNEL32!BaseThreadInitThunk+0x17
14 00000000`00dff890 00000000`00000000 ntdll!RtlUserThreadStart+0x2c
Could you confirm whether mimalloc is expected to work with HE ASLR disabled? Thanks
Thanks @Noxybot -- I cannot repro locally but I think I may have fixed the issue -- can you try with the latest dev3?
Hi Daan, I've tried the latest dev3 and am now observing another crash, with a different call stack and on a NULL page:
Unhandled exception at 0x00007FF721024208 in App.exe: 0xC0000005: Access violation reading location 0x0000000000000000.
App.exe!std::_Atomic_integral<unsigned __int64,8>::fetch_or(const unsigned __int64 _Operand, const std::memory_order _Order) Line 1668 (msvc-2019\include\atomic:1668)
App.exe!std::atomic_fetch_or_explicit<unsigned __int64>(std::atomic<unsigned __int64> * _Mem, const unsigned __int64 _Value, const std::memory_order _Order) Line 2763 (msvc-2019\include\atomic:2763)
App.exe!mi_page_flags_set(mi_page_s * page, bool set, unsigned __int64 newflag) Line 733 (mimalloc\include\mimalloc\internal.h:733)
App.exe!mi_page_set_has_aligned(mi_page_s * page, bool has_aligned) Line 751 (mimalloc\include\mimalloc\internal.h:751)
App.exe!mi_heap_malloc_zero_aligned_at_overalloc(mi_heap_s * const heap, const unsigned __int64 size, const unsigned __int64 alignment, const unsigned __int64 offset, const bool zero) Line 102 (mimalloc\src\alloc-aligned.c:102)
App.exe!mi_heap_malloc_zero_aligned_at_generic(mi_heap_s * const heap, const unsigned __int64 size, const unsigned __int64 alignment, const unsigned __int64 offset, const bool zero) Line 164 (mimalloc\src\alloc-aligned.c:164)
App.exe!mi_heap_malloc_zero_aligned_at(mi_heap_s * const heap, const unsigned __int64 size, const unsigned __int64 alignment, const unsigned __int64 offset, const bool zero) Line 206 (mimalloc\src\alloc-aligned.c:206)
App.exe!mi_heap_malloc_aligned_at(mi_heap_s * heap, unsigned __int64 size, unsigned __int64 alignment, unsigned __int64 offset) Line 216 (mimalloc\src\alloc-aligned.c:216)
App.exe!mi_heap_malloc_aligned(mi_heap_s * heap, unsigned __int64 size, unsigned __int64 alignment) Line 220 (mimalloc\src\alloc-aligned.c:220)
App.exe!mi_malloc_aligned(unsigned __int64 size, unsigned __int64 alignment) Line 249 (mimalloc\src\alloc-aligned.c:249)
It still only happens with HE ASLR disabled. You can disable it via an admin PowerShell like this:
Set-ProcessMitigation -Name App.exe -Disable HighEntropy
But I'm not able to repro it in a minimal app, unfortunately :(
I can tell that mi_page_t* page = _mi_ptr_page(p); in mi_heap_malloc_zero_aligned_at_overalloc returned a NULL page when p=0x0000000003990080, and we then used it here:
if (aligned_p != p) {
mi_page_set_has_aligned(page, true);
With MI_DEBUG_FULL this assertion fails:
mimalloc\include\mimalloc/internal.h":585, _mi_ptr_page
assertion: "p==NULL || mi_is_in_heap_region(p)"
Ah, I think I know what this is. dev3 has a nice _mi_page_map that tracks all valid mimalloc pages in memory and it is committed on demand. However, I reserve only the minimal first part around address 0 to catch NULL pointers .. I need to reserve it fully though for HE. I will try to push a fix soon.
Hi @daanx, I have a Windows program using mimalloc v3.0.3. It works on Windows, but when running under Wine/Linux it crashes right away with debug/asserts enabled.
On Wine, allocated memory has low addresses.
The same program with v2.2.3 works fine on both Windows and Wine/Linux.
I think this is related to this bug. I'm interested in a fix, as 3.0.3 seems a bit faster than 2.2.3.
Hi @pmeerw -- I think the issue is fixed now in the latest dev3 (with commit 4161152 -- I wrote the issue ID wrongly so it didn't show up in this thread). Let me know if it works now for you.
Hi @daanx , is the crash reported in https://github.com/microsoft/mimalloc/issues/1087#issuecomment-2878619915 also supposed to be fixed by https://github.com/microsoft/mimalloc/commit/4161152d17edbfb891733493310ea65278298f76?
I still seem to be hitting it with the latest dev3 (_mi_ptr_page returns nullptr for p=0x00000000036b0080...)
Hi @daanx, the latest dev3 (commit 550b6283) works for me now; it has about the same runtime as v2.2.3 for my workload (a Windows executable under Wine).
I've added more logs for page allocations / placing of pages in _mi_page_map and here is what I can tell:
mi_heap_malloc_zero_aligned_at_overalloc:
mimalloc: ptr 0x04050080 allocated from page 0x04040000, size: 32768
mimalloc: ptr 0x04050080 registered at page with idx: 0, subidx 1029
mimalloc: OOPS page is null for ptr 0x04050080 , size: 16385, alignment: 16384
And here are the logs about creation of 0x04040000 page:
_mi_page_map_register:
mimalloc: page 0x04040000 registered at idx: 0, subidx 1028
mimalloc: page 0x04040000 CREATED
So somehow we registered the page at sub_idx 1028, then allocated pointer 0x04050080 from it (with size 16385 and alignment 16384), and then _mi_ptr_page for that pointer looked up the page at idx=0 and sub_idx=1029 instead of sub_idx=1028.
Seems like off-by-1 error somewhere?
The page in question has blocksize of 32768... Maybe it's too small to fit 16385+16384=32769?
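For reference, the logged indices are consistent with one page-map entry per 64 KiB of address space. Here is a minimal sketch of that arithmetic, assuming a shift of 16; SLICE_SHIFT and sub_index are hypothetical names I made up for illustration, not mimalloc identifiers, and the shift is inferred from the logs rather than taken from the source:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: one page-map entry covers 64 KiB (1 << 16 bytes).
   This value is inferred from the logged indices, not read from mimalloc. */
#define SLICE_SHIFT 16

/* Hypothetical helper: entry index for an address below 4 GiB,
   where the top-level idx is 0 in the logs above. */
static size_t sub_index(uintptr_t addr) {
  return (size_t)(addr >> SLICE_SHIFT);
}
```

Under that assumption, sub_index(0x04040000) is 1028 and sub_index(0x04050080) is 1029, matching the logs. The aligned pointer lands one entry past the page start, so the lookup can only succeed if the entry at 1029 also points to the page.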
OK, after debugging it for a while I think this line is wrong:
static mi_page_t** mi_page_map_ensure_at(size_t idx) {
mi_page_t** sub = mi_page_map_ensure_committed(idx);
if mi_unlikely(sub == NULL || idx == 0 /* low addresses */)
Basically, it makes mi_page_map_ensure_at override the sub page map at index 0 on every call. The result of an earlier mi_page_map_set_range is therefore discarded, and we fail to find the desired page afterwards.
Removing || idx == 0 /* low addresses */ solves the problem.
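To illustrate the effect I am describing, here is a toy two-level map. It is only a sketch of the behavior as I understand it, not mimalloc's code: all toy_ names and sizes are hypothetical, and it ignores the real commit-on-demand and atomic logic. The reset_low flag stands in for the || idx == 0 condition.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_TOP_COUNT 4     /* hypothetical sizes, far smaller than a real map */
#define TOY_SUB_COUNT 2048

typedef struct toy_page_s { int id; } toy_page_t;

/* Top-level table of sub maps; zero-initialized, so all entries start NULL. */
static toy_page_t** toy_map[TOY_TOP_COUNT];

/* Return the sub map for idx, allocating it on first use.
   With reset_low set, idx 0 is treated as "not yet allocated" on every call,
   so a fresh zeroed sub map replaces the previous one. */
static toy_page_t** toy_ensure_at(size_t idx, bool reset_low) {
  toy_page_t** sub = toy_map[idx];
  if (sub == NULL || (reset_low && idx == 0)) {
    /* The old sub map (if any) is simply leaked in this toy. */
    sub = calloc(TOY_SUB_COUNT, sizeof(toy_page_t*));
    if (sub == NULL) return NULL;
    toy_map[idx] = sub;
  }
  return sub;
}

/* Record that the entry (idx, sub_idx) belongs to page. */
static bool toy_register(size_t idx, size_t sub_idx, toy_page_t* page, bool reset_low) {
  toy_page_t** sub = toy_ensure_at(idx, reset_low);
  if (sub == NULL) return false;
  sub[sub_idx] = page;
  return true;
}

/* Look up the page for (idx, sub_idx); NULL if nothing is registered. */
static toy_page_t* toy_lookup(size_t idx, size_t sub_idx) {
  toy_page_t** sub = toy_map[idx];
  return (sub == NULL ? NULL : sub[sub_idx]);
}
```

In this toy, registering sub_idx 1028 and then 1029 at idx 0 with reset_low set leaves only the last registration visible: the lookup for 1028 returns NULL. The same two registrations at idx 1 both survive, which mirrors why only low addresses are affected.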
Thanks @daanx - it works with the recent commit in dev3!