STL
STL copied to clipboard
<xlocinfo>: Mismatched call to free when malloc is replaced with user implementation
Describe the bug
The VC runtime explicitly supports user replacement of the malloc / free family of heap functions with custom implementations. For example, in malloc_base.cpp in the UCRT source there is this comment at the top:
// Implementation of _malloc_base(). This is defined in a different source file
// from the malloc() function to allow malloc() to be replaced by the user.
There is also this comment before the _malloc_base function:
// This function implements the logic of malloc(). It is called directly by the
// malloc() function in the Release CRT and is called by the debug heap in the
// Debug CRT.
//
// This function must be marked noinline, otherwise malloc and
// _malloc_base will have identical COMDATs, and the linker will fold
// them when calling one from the CRT. This is necessary because malloc
// needs to support users patching in custom implementations.
extern "C" __declspec(noinline) _CRTRESTRICT void* __cdecl _malloc_base(size_t const size)
However, the STL's _Locinfo class in the <xlocinfo> header has a problem where it will call free to free an object that was allocated by the CRT with an explicit call to _malloc_base. If the user has replaced malloc and free with custom implementations, then the STL will call the user's free with a pointer that was not allocated by the user's malloc function.
In particular, the _Getdays function (and other similar functions on that file) calls the UCRT's ::_Getdays export, which ends up doing this:
__crt_unique_heap_ptr<char> buffer(_malloc_crt_t(char, length + 1));
Where _malloc_crt_t is a macro that expands to _malloc_base, i.e. the CRT's default malloc implementation. And then, the STL calls free to free this object.
Instead of calling free, the STL should call the matching _free_crt_t (which is an alias for _free_base).
Command-line test case
C:\Temp>type repro.cpp
#include <Windows.h>
#include <sstream>
#include <iomanip>
#include <cassert>
HANDLE heap = HeapCreate(0, 0, 0);
extern "C" void* malloc(size_t size)
{
void* ptr = HeapAlloc(heap, 0, size);
printf("malloc: %p\n", ptr);
return ptr;
}
extern "C" void free(void* ptr)
{
printf("free: %p\n", ptr);
HeapFree(heap, 0, ptr);
}
int main(int argc, char** argv)
{
std::tm tm;
std::istringstream stream{"Thu, 01 Jan 1970 00:00:00 GMT"};
stream >> std::get_time(&tm, "%a, %d %b %Y %H:%M:%S %Z");
printf("OK\n");
return 0;
}
C:\Temp>cl /EHsc /MT repro.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 19.27.29016 for x86
Copyright (C) Microsoft Corporation. All rights reserved.
repro.cpp
Microsoft (R) Incremental Linker Version 14.27.29016.0
Copyright (C) Microsoft Corporation. All rights reserved.
/out:repro.exe
repro.obj
C:\Temp>repro
malloc: 015B05B8
malloc: 015B05E0
malloc: 015B05F0
malloc: 015B0618
free: 015B0618
malloc: 015B0618
malloc: 015B0628
malloc: 015B0648
malloc: 015B0658
free: 015B0658
free: 015B0648
malloc: 015B0648
malloc: 015B0658
malloc: 015B0668
free: 015B05B8
malloc: 015B0690
malloc: 015B05B8
malloc: 015B05C8
malloc: 015B06E0
free: 012858B8 // <---- NOTE: Not allocated by user malloc
C:\Temp>echo %ERRORLEVEL%
-1073740940 // <---- executable crashed
The callstack for the crash is:
> ntdll.dll!_RtlReportCriticalFailure
ntdll.dll!_RtlpReportHeapFailure
ntdll.dll!_RtlpHpHeapHandleError
ntdll.dll!_RtlpLogHeapFailure
ntdll.dll!_RtlpAnalyzeHeapFailure
ntdll.dll!@RtlpFreeHeap
ntdll.dll!_RtlpFreeHeapInternal
ntdll.dll!RtlFreeHeap
Expected behavior
The expectation is that the STL never uses a mismatched call to free an object. If the object was allocated with malloc then it should use free, and if it was allocated by _malloc_base then it should use _free_base.
STL version
Microsoft Visual Studio Professional 2019
Version 16.6.4
Also tracked by DevCom-246248 and Microsoft-internal VSO-470455 / AB#470455 .
There are other similar mismatches, should I open an issue for each one?
-
<xlocale>:_calloc_crt/freeinclass ctypeAlloc: https://github.com/microsoft/STL/blob/7930d3512f74b3263efc315a06551d406ccd3191/stl/src/_tolower.cpp#L107 Free: https://github.com/microsoft/STL/blob/7930d3512f74b3263efc315a06551d406ccd3191/stl/inc/xlocale#L2501
-
<xlocale>:_calloc_crt/freeinclass ctype<char>Alloc: https://github.com/microsoft/STL/blob/7930d3512f74b3263efc315a06551d406ccd3191/stl/src/_tolower.cpp#L107 Free: https://github.com/microsoft/STL/blob/7930d3512f74b3263efc315a06551d406ccd3191/stl/inc/xlocale#L2778
-
<xlocale>:_calloc_crt/freeinclass ctype<wchar_t>Alloc: https://github.com/microsoft/STL/blob/7930d3512f74b3263efc315a06551d406ccd3191/stl/src/_tolower.cpp#L107 Free: https://github.com/microsoft/STL/blob/7930d3512f74b3263efc315a06551d406ccd3191/stl/inc/xlocale#L2926
-
<xlocale>:_calloc_crt/freeinclass ctype<unsigned short>Alloc: https://github.com/microsoft/STL/blob/7930d3512f74b3263efc315a06551d406ccd3191/stl/src/_tolower.cpp#L107 Free: https://github.com/microsoft/STL/blob/7930d3512f74b3263efc315a06551d406ccd3191/stl/inc/xlocale#L3128
-
<xlocinfo>:_malloc_crt/freeinclass _TimevecAlloc:
_malloc_crt(see UCRT function_W_Gettnamesinwcsftime.cpp) Free: https://github.com/microsoft/STL/blob/7930d3512f74b3263efc315a06551d406ccd3191/stl/inc/xlocinfo#L34, https://github.com/microsoft/STL/blob/7930d3512f74b3263efc315a06551d406ccd3191/stl/inc/xlocinfo#L39
I think one issue is fine, thanks. This mismatch does seem to be improper. It may be possible to fix this without breaking ABI, if the behavior remains identical for users who don't attempt to replace malloc/free.
Thanks STL!
Yeah, I think it will be possible to fix this without changing the ABI. From looking at the CRT source, the _malloc_crt_t macro is defined as follows:
#define _malloc_crt_t(t, n) (__crt_unique_heap_ptr <t>(static_cast<t*>(_malloc_crt ( (n) * sizeof(t)))))
And _malloc_crt is, in essence:
#ifdef _DEBUG
#define _malloc_crt(s) (_malloc_dbg ( s, _CRT_BLOCK, __FILE__, __LINE__))
#define _free_crt(p) (_free_dbg (p, _CRT_BLOCK ))
#else // ^^^ _DEBUG ^^^ // vvv !_DEBUG vvv
#define _malloc_crt _malloc_base
#define _free_crt _free_base
#endif // !_DEBUG
So _Locinfo::_Getdays could be changed to do this:
const char* __CLR_OR_THIS_CALL _Getdays() const {
const char* _Ptr = ::_Getdays();
if (_Ptr) { // capture names and free allocated C string
const_cast<_Locinfo*>(this)->_Days = _Ptr;
- _CSTD free(const_cast<char*>(_Ptr));
+#ifdef _DEBUG
+ _free_dbg(const_cast<char*>(_Ptr), _CRT_BLOCK);
+#else // _DEBUG
+ _free_crt(const_cast<char*>(_Ptr));
+#endif // _DEBUG
}
return !_Days._Empty() ? _Days._C_str()
: ":Sun:Sunday:Mon:Monday:Tue:Tuesday:Wed:Wednesday"
":Thu:Thursday:Fri:Friday:Sat:Saturday";
}
Which is basically inlining the default implementation of free in the CRT, which looks like this:
extern "C" _CRT_HYBRIDPATCHABLE __declspec(noinline) void __cdecl free(void* const block)
{
// Some libraries that hook memory allocation routines (such as libtcmalloc)
// look for an appropriate place inside free to place a patch to its version
// of free. Without these extra instructions padding the call to _free_base,
// some libraries may choose to insert this patch in _free_base instead.
volatile int extra_instructions_for_patching_libraries = 0;
(void) extra_instructions_for_patching_libraries;
#ifdef _DEBUG
_free_dbg(block, _NORMAL_BLOCK);
#else
_free_base(block);
#endif
}
For the majority of users, who don't replace malloc and free, the change would result in the same behavior as before, with the possible exception of that memory hooking workaround mentioned in the comment. But I don't think that actually matters here, since as it stands now a library that hooks malloc and free would only be seeing the call to free from _Locinfo and not the call to _malloc_base in ::_Getdays, so it's probably ignoring that mismatched free anyway.
Is this something you would accept a PR for? Any other important considerations I would need to have in mind if I were to put forward a fix for this?
Apologies for the delay - I fell behind on my GitHub notifications while trying to catch up with PR reviews :joy_cat:
So
_Locinfo::_Getdayscould be changed to do this:- _CSTD free(const_cast<char*>(_Ptr)); +#ifdef _DEBUG + _free_dbg(const_cast<char*>(_Ptr), _CRT_BLOCK); +#else // _DEBUG + _free_crt(const_cast<char*>(_Ptr)); +#endif // _DEBUG
Unless I'm missing something, I think this should just be:
- _CSTD free(const_cast<char*>(_Ptr));
+ _free_crt(const_cast<char*>(_Ptr));
Because, as you quoted, _free_crt is a macro that switches between _free_dbg(p, _CRT_BLOCK) in debug mode, and _free_base in release mode.
Note that there is a difference here - also as you quoted, free frees as _NORMAL_BLOCK, while we're using _malloc_crt and _calloc_crt to allocate _CRT_BLOCK (because that's excluded from the leak tracking infrastructure by default; users are confused when long-lived CRT/STL allocations are reported as leaks). However, the CRT machinery allows CRT blocks to be freed as NORMAL blocks, so either free or _free_crt is acceptable; see C:\Program Files (x86)\Windows Kits\10\Source\10.0.18362.0\ucrt\heap\debug_heap.cpp and search for "freed as NORMAL blocks".
So yes, I think we can accept a PR to change these free calls to _free_crt. (We already have 3 in stl/src.) Ideally, such a PR would have a test case exercising every changed call, to prevent future regressions (especially during the upcoming vNext refactoring). Thanks!
You're right, _free_crt would be all that's needed there. Thanks STL, I'll try to get a PR with this change submitted in the near future.
operator new and operator delete have the same problem: operator new calls malloc unconditionally, but operator delete calls free in release builds and _free_dbg (so it can specify _UNKNOWN_BLOCK) in debug builds. This also explodes when the user has replaced malloc and free.
Presumably the fix here is to unconditionally call free from operator delete, and fixup any vcruntime/STL allocations which we think are important to discriminate from user allocations so they are deallocated with _free_dbg instead of operator delete.
maybe related: https://github.com/microsoft/STL/issues/160
Note that (despite source comments indicating otherwise) we don't actually support malloc/free replacement, which may limit the amount of work we need to do here: https://github.com/microsoft/STL/issues/4564#issuecomment-2037401144
Thanks STL! I was just recently looking at this issue again after a ~4 year hiatus actually.
Maintaining a small STL patch is not a burden for us. We link against it statically and that allows us to easily replace malloc with snmalloc's.
A slightly more complex issue, but which we also have a local fix for, is the fact that while libucrt.lib defines malloc and _malloc_base in different object files (malloc.obj and malloc_base.obj respectively) - which avoids duplicate symbol definitions during linking because our own malloc.obj would be picked over the one in libucrt.lib - it doesn't do that consistently for other memory allocation functions, such as _recalloc/_recalloc_base (both defined in recalloc.obj) and _msize/_msize_base (both defined in msize.obj). We wrote a simple tool that extracts those two object files from libucrt.lib and internalizes the _recalloc and _msize symbols so we can provide our own definition and the one from the CRT is not used.
Leaving this here mainly for others who are interested in memory allocator overrides this way.