
dev3 crash on Windows when High Entropy ASLR is disabled

Open Noxybot opened this issue 7 months ago • 8 comments

Hi Daan, we're experiencing a 100% reproducible crash in mimalloc dev3 whenever High Entropy ASLR is disabled. mimalloc appears to misbehave either specifically when High Entropy ASLR is disabled or, more generally, when the kernel gives it a page at a very low address (<2GB). Call stack:

0:000> .exr -1
ExceptionAddress: 00007ff6a63f0ad4 (App.exe!mi_page_map_set_range+0x00000000000001f4)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 0000000000000001
   Parameter[1]: 00000000033019b8
Attempt to write to address 00000000033019b8
0:000> k
 # Child-SP          RetAddr               Call Site
00 00000000`00dff3b0 00007ff6`a63ebcb0     App.exe!mi_page_map_set_range+0x1f4 [mimalloc\src\page-map.c @ 264] 
01 00000000`00dff490 00007ff6`a63ebefa     App.exe!mi_arenas_page_alloc_fresh+0x380 [mimalloc\src\arena.c @ 709] 
02 00000000`00dff570 00007ff6`a63f07b3     App.exe!mi_arenas_page_regular_alloc+0x21a [mimalloc\src\arena.c @ 727] 
03 (Inline Function) --------`--------     App.exe!_mi_arenas_page_alloc+0x1b [mimalloc\src\arena.c @ 766] 
04 00000000`00dff620 00007ff6`a63f0ee4     App.exe!mi_page_fresh_alloc+0x33 [mimalloc\src\page.c @ 305] 
05 (Inline Function) --------`--------     App.exe!mi_page_fresh+0x12 [mimalloc\src\page.c @ 335] 
06 00000000`00dff650 00007ff6`a63ee2d7     App.exe!mi_page_queue_find_free_ex+0x2b4 [mimalloc\src\page.c @ 807] 
07 (Inline Function) --------`--------     App.exe!mi_find_free_page+0x31 [mimalloc\src\page.c @ 848] 
08 00000000`00dff6b0 00007ff6`a63e68d7     App.exe!mi_find_page+0xa7 [mimalloc\src\page.c @ 929] 
09 00000000`00dff6e0 00007ff6`a63f2398     App.exe!_mi_malloc_generic+0x157 [mimalloc\src\page.c @ 965] 
0a 00000000`00dff730 00007ff6`a671b83e     App.exe!mi_calloc+0x118 [mimalloc\src\alloc.c @ 233] 
0b (Inline Function) --------`--------     App.exe!internal_get_ptd_head+0x4d [minkernel\crts\ucrt\src\appcrt\internal\per_thread_data.cpp @ 244] 
0c (Inline Function) --------`--------     App.exe!internal_getptd_noexit+0x4d [minkernel\crts\ucrt\src\appcrt\internal\per_thread_data.cpp @ 271] 
0d (Inline Function) --------`--------     App.exe!internal_getptd_noexit+0x53 [minkernel\crts\ucrt\src\appcrt\internal\per_thread_data.cpp @ 283] 
0e 00000000`00dff760 00007ff6`a671b988     App.exe!__acrt_getptd_noexit+0x62 [minkernel\crts\ucrt\src\appcrt\internal\per_thread_data.cpp @ 295] 
0f 00000000`00dff790 00007ff6`a672b0e5     App.exe!__acrt_initialize_ptd+0x24 [minkernel\crts\ucrt\src\appcrt\internal\per_thread_data.cpp @ 36] 
10 00000000`00dff7c0 00007ff6`a66dc984     App.exe!__acrt_execute_initializers+0x35 [minkernel\crts\ucrt\src\appcrt\internal\shared_initialization.cpp @ 25] 
11 00000000`00dff7f0 00007ff6`a66dcf84     App.exe!__scrt_initialize_crt+0x34 [crt\vcstartup\src\utility\utility.cpp @ 199] 
12 00000000`00dff820 00007ff8`2761e8d7     App.exe!__scrt_common_main_seh+0x14 [crt\vcstartup\src\startup\exe_common.inl @ 237] 
13 00000000`00dff860 00007ff8`285914fc     KERNEL32!BaseThreadInitThunk+0x17
14 00000000`00dff890 00000000`00000000     ntdll!RtlUserThreadStart+0x2c

Could you confirm or deny whether mimalloc is expected to work with HE ASLR disabled? Thanks

Noxybot avatar May 08 '25 22:05 Noxybot

Thanks @Noxybot -- I cannot repro locally but I think I may have fixed the issue -- can you try with the latest dev3?

daanx avatar May 14 '25 01:05 daanx

Hi Daan, I've tried the latest dev3 and am now observing another crash, with a different call stack and on a NULL page:

Unhandled exception at 0x00007FF721024208 in App.exe: 0xC0000005: Access violation reading location 0x0000000000000000.
App.exe!std::_Atomic_integral<unsigned __int64,8>::fetch_or(const unsigned __int64 _Operand, const std::memory_order _Order) Line 1668 (msvc-2019\include\atomic:1668)
App.exe!std::atomic_fetch_or_explicit<unsigned __int64>(std::atomic<unsigned __int64> * _Mem, const unsigned __int64 _Value, const std::memory_order _Order) Line 2763 (msvc-2019\include\atomic:2763)
App.exe!mi_page_flags_set(mi_page_s * page, bool set, unsigned __int64 newflag) Line 733 (mimalloc\include\mimalloc\internal.h:733)
App.exe!mi_page_set_has_aligned(mi_page_s * page, bool has_aligned) Line 751 (mimalloc\include\mimalloc\internal.h:751)
App.exe!mi_heap_malloc_zero_aligned_at_overalloc(mi_heap_s * const heap, const unsigned __int64 size, const unsigned __int64 alignment, const unsigned __int64 offset, const bool zero) Line 102 (mimalloc\src\alloc-aligned.c:102)
App.exe!mi_heap_malloc_zero_aligned_at_generic(mi_heap_s * const heap, const unsigned __int64 size, const unsigned __int64 alignment, const unsigned __int64 offset, const bool zero) Line 164 (mimalloc\src\alloc-aligned.c:164)
App.exe!mi_heap_malloc_zero_aligned_at(mi_heap_s * const heap, const unsigned __int64 size, const unsigned __int64 alignment, const unsigned __int64 offset, const bool zero) Line 206 (mimalloc\src\alloc-aligned.c:206)
App.exe!mi_heap_malloc_aligned_at(mi_heap_s * heap, unsigned __int64 size, unsigned __int64 alignment, unsigned __int64 offset) Line 216 (mimalloc\src\alloc-aligned.c:216)
App.exe!mi_heap_malloc_aligned(mi_heap_s * heap, unsigned __int64 size, unsigned __int64 alignment) Line 220 (mimalloc\src\alloc-aligned.c:220)
App.exe!mi_malloc_aligned(unsigned __int64 size, unsigned __int64 alignment) Line 249 (mimalloc\src\alloc-aligned.c:249)

It still only happens with HE ASLR disabled. You can disable it from an admin PowerShell like this:

Set-ProcessMitigation -Name App.exe -Disable HighEntropy

But I'm not able to repro it in a minimal app, unfortunately :(

Noxybot avatar May 14 '25 04:05 Noxybot

I can tell that mi_page_t* page = _mi_ptr_page(p); in mi_heap_malloc_zero_aligned_at_overalloc returned a NULL page when p=0x0000000003990080, and then we used it here:

if (aligned_p != p) {
    mi_page_set_has_aligned(page, true);

Noxybot avatar May 14 '25 04:05 Noxybot

With MI_DEBUG_FULL this assertion fails:

mimalloc\include\mimalloc/internal.h":585, _mi_ptr_page
  assertion: "p==NULL || mi_is_in_heap_region(p)"

Noxybot avatar May 14 '25 04:05 Noxybot

Ah, I think I know what this is. dev3 has a nice _mi_page_map that tracks all valid mimalloc pages in memory and it is committed on demand. However, I reserve only the minimal first part around address 0 to catch NULL pointers .. I need to reserve it fully though for HE. I will try to push a fix soon.

daanx avatar May 21 '25 18:05 daanx

Hi @daanx, I have a Windows program using mimalloc v3.0.3. It works on Windows, but when running under Wine/Linux it crashes right away with debug/asserts enabled.

On Wine, allocated memory has low addresses.

The same program with v2.2.3 is fine on both Windows and Wine/Linux.

I think this is related to this bug. I'm interested in a fix, as 3.0.3 seems a bit faster than 2.2.3.

pmeerw avatar May 26 '25 06:05 pmeerw

Hi @pmeerw -- I think the issue is fixed now in the latest dev3 (with commit 4161152 -- I wrote the issue ID wrongly so it didn't show up in this thread). Let me know if it works now for you.

daanx avatar May 28 '25 16:05 daanx

Hi @daanx, is the crash reported in https://github.com/microsoft/mimalloc/issues/1087#issuecomment-2878619915 also supposed to be fixed by https://github.com/microsoft/mimalloc/commit/4161152d17edbfb891733493310ea65278298f76? I still seem to be hitting it with the latest dev3 (_mi_ptr_page returns nullptr for p=0x00000000036b0080...).

Noxybot avatar May 28 '25 19:05 Noxybot

Hi @daanx, the latest dev3 (commit 550b6283) works for me now; it has about the same runtime as v2.2.3 for my workload (a Windows executable under Wine).

pmeerw avatar May 29 '25 10:05 pmeerw

I've added more logs for page allocations and for placing pages in _mi_page_map, and here is what I can tell. From mi_heap_malloc_zero_aligned_at_overalloc:

mimalloc: ptr 0x04050080 allocated from page 0x04040000, size: 32768
mimalloc: ptr 0x04050080 registered at page with idx: 0, subidx 1029
mimalloc: OOPS page is null for ptr 0x04050080 , size: 16385, alignment: 16384

And here are the logs about the creation of the 0x04040000 page, from _mi_page_map_register:

mimalloc: page 0x04040000 registered at idx: 0, subidx 1028
mimalloc: page 0x04040000 CREATED

So somehow we registered the page at sub_idx 1028, then allocated pointer 0x04050080 from it (with size 16385 and alignment 16384), and then _mi_ptr_page for that pointer looked up idx=0, sub_idx=1029 instead of sub_idx=1028. Seems like an off-by-1 error somewhere?
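
For reference, the idx/subidx values in these logs are consistent with a two-level map indexed by 64 KiB slices. The sketch below is only an illustration: SLICE_SHIFT, SUB_SHIFT, and the helper names are assumptions picked because they reproduce the logged numbers, not values taken from the mimalloc sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed constants (not stated in this thread): 64 KiB slices and
   2^13 entries per sub-map. They reproduce the logged idx/subidx. */
#define SLICE_SHIFT 16
#define SUB_SHIFT   13
#define SUB_COUNT   ((size_t)1 << SUB_SHIFT)

/* Top-level index: which sub-map covers this address. */
static size_t map_idx(uintptr_t addr) {
  return (size_t)(addr >> (SLICE_SHIFT + SUB_SHIFT));
}

/* Second-level index: which slice inside that sub-map. */
static size_t map_sub_idx(uintptr_t addr) {
  return (size_t)(addr >> SLICE_SHIFT) & (SUB_COUNT - 1);
}

/* Number of slices touched by a page of `size` bytes at `start`. */
static size_t slices_covered(uintptr_t start, size_t size) {
  assert(size > 0);
  uintptr_t last = start + (uintptr_t)size - 1;
  return (size_t)((last >> SLICE_SHIFT) - (start >> SLICE_SHIFT)) + 1;
}
```

Under these assumptions the page start 0x04040000 maps to sub-index 1028 and the pointer 0x04050080 to 1029, so a differing sub-index is not an off-by-one by itself: the pointer simply lies in the page's second slice. The lookup can only succeed if every slice the page spans holds an entry for that page.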

The page in question has a block size of 32768... Maybe it's too small to fit 16385+16384=32769?
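
A quick way to check the size question is to redo the alignment arithmetic. This is a sketch under one assumption: that over-allocation only needs size + alignment - 1 bytes (the usual worst-case bound, here 16385 + 16384 - 1 = 32768). The thread does not state what mimalloc actually requests, and align_up and aligned_fits are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Round p up to the next multiple of alignment (a power of two). */
static uintptr_t align_up(uintptr_t p, size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (p + alignment - 1) & ~((uintptr_t)alignment - 1);
}

/* True if an aligned object of `size` bytes fits inside a block of
   `block_size` bytes that starts at `p`. */
static bool aligned_fits(uintptr_t p, size_t block_size,
                         size_t size, size_t alignment) {
  uintptr_t aligned_p = align_up(p, alignment);
  return aligned_p + size <= p + block_size;
}
```

With p = 0x04050080 and alignment 16384, the aligned pointer is 0x04054000, and 16385 bytes from there end at 0x04058001, inside the block that ends at 0x04058080. Under the stated assumption the 32768-byte block is therefore large enough.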

Noxybot avatar May 30 '25 04:05 Noxybot

OK, after debugging it for a while I think this line is wrong:

static mi_page_t** mi_page_map_ensure_at(size_t idx) {
  mi_page_t** sub = mi_page_map_ensure_committed(idx);
  if mi_unlikely(sub == NULL || idx == 0 /* low addresses */)

Basically, it makes mi_page_map_ensure_at CONSTANTLY overwrite the sub page map at index 0. So the result of mi_page_map_set_range is essentially discarded, and we fail to find the desired page afterwards. Removing || idx == 0 /* low addresses */ solves the problem.
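
To make the failure mode concrete, here is a toy model of the behaviour described above. It is not mimalloc's code: toy_map, toy_ensure_at, toy_register, toy_lookup, and the reset_low flag are all hypothetical names, and the model only assumes that the extra idx == 0 condition causes a fresh, empty sub-map to be installed on every call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOP_COUNT 4
#define SUB_COUNT 8

typedef struct toy_page_s { int id; } toy_page_t;

/* Top-level table of lazily created sub-maps. */
static toy_page_t** toy_map[TOP_COUNT];

/* Return the sub-map for idx, creating it when missing. With
   reset_low set, idx 0 gets a fresh zeroed sub-map on EVERY call,
   which models the suspected bug. */
static toy_page_t** toy_ensure_at(size_t idx, bool reset_low) {
  assert(idx < TOP_COUNT);
  toy_page_t** sub = toy_map[idx];
  if (sub == NULL || (reset_low && idx == 0)) {
    free(sub);  /* free(NULL) is a no-op */
    sub = calloc(SUB_COUNT, sizeof(toy_page_t*));
    toy_map[idx] = sub;
  }
  return sub;
}

/* Record that slice (idx, sub_idx) belongs to page. */
static bool toy_register(size_t idx, size_t sub_idx,
                         toy_page_t* page, bool reset_low) {
  toy_page_t** sub = toy_ensure_at(idx, reset_low);
  if (sub == NULL || sub_idx >= SUB_COUNT) return false;
  sub[sub_idx] = page;
  return true;
}

/* Look up the page owning slice (idx, sub_idx), or NULL. */
static toy_page_t* toy_lookup(size_t idx, size_t sub_idx) {
  if (idx >= TOP_COUNT || sub_idx >= SUB_COUNT) return NULL;
  toy_page_t** sub = toy_map[idx];
  return (sub == NULL ? NULL : sub[sub_idx]);
}
```

In this model, registering two slices under idx 0 with reset_low set leaves only the most recent entry visible, while the same sequence under any other idx, or with reset_low cleared, keeps both. That matches the symptom of _mi_ptr_page returning NULL only for low addresses.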

Noxybot avatar May 30 '25 07:05 Noxybot

Thanks @daanx - it works with the recent commit in dev3!

Noxybot avatar May 31 '25 04:05 Noxybot