MINGW-packages
Exception handling is broken on mingw32 when using static runtime libraries.
If you build ccache >= 4.2 for mingw32, it crashes at runtime. This bug started to occur when static linking was enabled for the gcc runtime libraries. (https://github.com/ccache/ccache/pull/732)
The error almost always occurs in the same place: in Util.cpp, read_file raises an exception:
std::string
read_file(const std::string& path, size_t size_hint)
{
if (size_hint == 0) {
auto stat = Stat::stat(path);
if (!stat) {
>> throw Error(strerror(errno));
}
size_hint = stat.size();
}
But the catch block in the calling function is never reached.
std::string data;
try {
>> data = Util::read_file(path);
} catch (const Error&) {
// Ignore.
return counters;
}
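For reference, the pattern in the two excerpts above can be sketched as a self-contained program fragment. This is only an illustration: the Error class and the simplified read_file and read_or_default functions here are hypothetical stand-ins, not ccache's actual code. With a working unwinder, the catch block is always reached when the file cannot be opened.

```cpp
#include <cerrno>
#include <cstring>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for ccache's Error type.
class Error : public std::runtime_error
{
public:
  using std::runtime_error::runtime_error;
};

// Simplified read_file: throws Error if the file cannot be opened.
std::string
read_file(const std::string& path)
{
  std::ifstream in(path, std::ios::binary);
  if (!in) {
    throw Error(std::strerror(errno));
  }
  return std::string(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
}

// Mirrors the calling function: a failed read is ignored and a
// fallback value is returned instead.
std::string
read_or_default(const std::string& path, const std::string& fallback)
{
  try {
    return read_file(path);
  } catch (const Error&) {
    // With a working unwinder, control arrives here on failure.
    return fallback;
  }
}
```

Calling read_or_default with a path that does not exist should return the fallback; in the broken mingw32 build described here, the equivalent call aborts inside the unwinder instead.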
Here is the callstack of "ccache -s" :
msvcrt.dll!msvcrt!_exit (Unknown source:0)
msvcrt.dll!msvcrt!abort (Unknown source:0)
uw_init_context_1(struct _Unwind_Context * context, void * outer_cfa, void * outer_ra) (c:\_\M\mingw-w64-gcc\src\gcc-10.3.0\libgcc\unwind-dw2.c:1593)
_Unwind_RaiseException(struct _Unwind_Exception * exc) (c:\_\M\mingw-w64-gcc\src\gcc-10.3.0\libgcc\unwind.inc:93)
__cxxabiv1::__cxa_throw(void * obj, std::type_info * tinfo, void (*)(void *) dest) (c:\_\M\mingw-w64-gcc\src\gcc-10.3.0\libstdc++-v3\libsupc++\eh_throw.cc:90)
Util::read_file(const std::string & path, size_t size_hint) (...\cache-4.3\src\Util.cpp:1164)
Statistics::read(const std::string & path) (...\cache-4.3\src\Statistics.cpp:196)
operator()(const struct {...} * const __closure, const std::string & path) (...\cache-4.3\src\Statistics.cpp:102)
std::__invoke_impl<void, collect_counters(const Config&)::<lambda(const string&)>&, const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(std::__invoke_other, struct {...} &)(struct {...} & __f) (c:\msys64\mingw32\include\c++\10.3.0\bits\invoke.h:60)
std::__invoke_r<void, collect_counters(const Config&)::<lambda(const string&)>&, const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(struct {...} &)(struct {...} & __fn) (c:\msys64\mingw32\include\c++\10.3.0\bits\invoke.h:153)
std::_Function_handler<void(const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&), collect_counters(const Config&)::<lambda(const string&)> >::_M_invoke(const std::_Any_data &, const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > &)(const std::_Any_data & __functor, __args#0) (c:\msys64\mingw32\include\c++\10.3.0\bits\std_function.h:291)
std::function<void (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)>::operator()(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const(const std::function<void(const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)> * const this, __args#0) (c:\msys64\mingw32\include\c++\10.3.0\bits\std_function.h:622)
for_each_level_1_and_2_stats_file(const std::string &, std::function<void(const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)>)(const std::string & cache_dir, const std::function<void(const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)> function) (...\cache-4.3\src\Statistics.cpp:84)
collect_counters(const Config & config) (...\cache-4.3\src\Statistics.cpp:99)
Statistics::format_human_readable[abi:cxx11](Config const&)(const Config & config) (...\cache-4.3\src\Statistics.cpp:281)
handle_main_options(int argc, const char * const * argv) (...\cache-4.3\src\ccache.cpp:2772)
ccache_main(int argc, const char * const * argv) (...\cache-4.3\src\ccache.cpp:2839)
main(int argc, char * const * argv) (...\cache-4.3\src\main.cpp:24)
I can't generate a minimal example for this bug. But, since the prebuilt 32 bit versions of ccache do not have this bug I assume that it is caused by the mingw32 environment and not by ccache.
Is this attempting to throw across a module boundary? That can be an issue, but it doesn't appear to be the case here as far as I can see from the backtrace.
Hmm, I guess this is also what is hitting a gcc build of clang: if it tries to link against the static libgcc, it fails to build.
As I said, I tried to create a minimal example where I also threw exceptions over library boundaries. But I could not reproduce the error there. I suspect that ccache makes a system call that breaks the exception handling.
Unfortunately I don't understand enough about exception handling models like dwarf to be able to isolate the issue better.
Seems like this also affects #9088
I purged -static-libgcc from ldflags and the build works without additional changes/patches
Without this I was consistently getting ICE on CI
I guess we should report this upstream, that's a major breakage :O
This also affected the TDM build I was maintaining, and it seems to go back further than I thought. The first time it caused problems with the TDM builds was with gcc-8, and it slowly got worse with newer gcc versions. At first only the 32-bit compiler would occasionally bail on code that had worked before, but later it also failed with static exceptions on 64-bit code. The funny thing about the TDM builds is that they include code that allows throwing exceptions across DLL boundaries even when linked to the static exception runtimes. So this kind of sucked: before all these problems I could build a gcc version of clang that did not rely on the libgcc and libstdc++ DLLs, which is now unfortunately impossible.
Not sure if there is any correlation, but there was a similar problem in ccache with the MIPS toolchain when using the gold linker instead of the bfd linker: https://github.com/ccache/ccache/issues/907
Wow, that's quite a problem :S
I ran into the same or a similar bug while debugging ccache built with 64-bit gcc.
A system test failed with an exception, so I tried to debug it with gdb.
But it did not hit the expected exception. With the debugger attached, the process died at the same point as the 32-bit version, but in contrast it printed an error message: terminate called after throwing an instance of 'core::Error'.
After that, I took some time to look at the bug more closely, and the behavior got weirder the more I looked at it.
32-bit with static linking
The process died on the following assert, without any error message:
File: unwind-dw2.c
1578: static void __attribute__((noinline))
1579: uw_init_context_1 (struct _Unwind_Context *context,
1580: void *outer_cfa, void *outer_ra)
1581: {
1582: void *ra = __builtin_extract_return_addr (__builtin_return_address (0));
1583: _Unwind_FrameState fs;
1584: _Unwind_SpTmp sp_slot;
1585: _Unwind_Reason_Code code;
1586:
1587: memset (context, 0, sizeof (struct _Unwind_Context));
1588: context->ra = ra;
1589: if (!ASSUME_EXTENDED_UNWIND_CONTEXT)
1590: context->flags = EXTENDED_CONTEXT_BIT;
1591:
1592: code = uw_frame_state_for (context, &fs);
>> 1593: gcc_assert (code == _URC_NO_REASON);
1594:
The cause lies in the _Unwind_Find_FDE function: both pointers, seen_objects and unseen_objects, are null:
File: unwind-dw2-fde.c
1029: const fde *
1030: _Unwind_Find_FDE (void *pc, struct dwarf_eh_bases *bases)
1031: {
...
1051: /* Linear search through the classified objects, to find the one
1052: containing the pc. Note that pc_begin is sorted descending, and
1053: we expect objects to be non-overlapping. */
>> 1054: for (ob = seen_objects; ob; ob = ob->next)
...
1061: }
1062:
1063: /* Classify and search the objects we've not yet processed. */
>> 1064: while ((ob = unseen_objects))
1065: {
...
1078: if (f)
1079: goto fini;
1080: }
I tried using a memory breakpoint to see if these pointers are ever set, but couldn't see it.
64-bit static/dynamic
I was searching for the cause of a fmt::v7::format_error.
After I attached the debugger to the 64-bit version, the error message was now a core::Error instead of the fmt::v7::format_error.
The process exits at what seems to be the usual place when no catch block is found:
File: eh_throw.cc
74: extern "C" void
75: __cxxabiv1::__cxa_throw (void *obj, std::type_info *tinfo,
76: void (_GLIBCXX_CDTOR_CALLABI *dest) (void *))
77: {
78: PROBE2 (throw, obj, tinfo);
79:
80: __cxa_eh_globals *globals = __cxa_get_globals ();
81: globals->uncaughtExceptions += 1;
82: // Definitely a primary.
83: __cxa_refcounted_exception *header =
84: __cxa_init_primary_exception(obj, tinfo, dest);
85: header->referenceCount = 1;
86:
87: #ifdef __USING_SJLJ_EXCEPTIONS__
88: _Unwind_SjLj_RaiseException (&header->exc.unwindHeader);
89: #else
90: _Unwind_RaiseException (&header->exc.unwindHeader);
91: #endif
92:
93: // Some sort of unwinding error. Note that terminate is a handler.
94: __cxa_begin_catch (&header->exc.unwindHeader);
>> 95: std::terminate ();
96: }
32-bit dynamic
Then I thought I would debug the dynamic 32-bit version, but the gcc-libs don't have any symbols. So I built the gcc packages locally (at first without making any changes to the PKGBUILD) and installed the gcc-libs package.
Now everything was broken!
Every process crashed at startup and it was impossible to debug any process (even with the 64-bit multiarch gdb).
32-bit static/dynamic with new gcc
I thought that the reason for the strange behavior was in the local build of the gcc libs, so I triggered a github action to build a new gcc package. Then I installed the new packages together with cmake and ninja in an empty environment:
pacman --root new_root -Sy
pacman --root new_root -U mingw-w64-i686-gcc* mingw-w64-i686-libgccjit*
pacman --root new_root -S mingw-w64-i686-cmake mingw-w64-i686-ninja
After that I started a cmd, added the new environment as the only entry in the PATH variable and built the ccache project.
With static linking the behavior was the same as before, but with dynamic linking it terminated with the following error message: terminate called after throwing an instance of 'core::Error'
After several attempts to debug the behavior, it seems to be impossible to reproduce it with an attached debugger. Therefore I enabled the JIT debugger to debug it. As far as I understood the code there, it stopped at the usual place when no catch block is found.
But I was still not able to determine the reason for this behavior. Maybe I can post a callstack when I set up the JIT debugger again.
I'm not sure what is in the current gcc package, but it is not reproducible locally or on github.
It's now broken in the dynamic case too -> #9771
As you correctly predicted I guess
I tried rebuilding ccache without -DSTATIC_LINK=OFF with the rebuilt gcc, and it still seems to fail as originally described.
Ooh, it's pulling in libgcc_s_dw2-1.dll via libhiredis.dll, I figure that's probably messing things up. (it's also coming via libzstd.dll)
Minimal reproducer:
#include <stdio.h>
#include <zstd.h>
int main()
{
try
{
printf("About to throw\n");
#ifdef BREAK_EXCEPTIONS
printf("Calling zstd: %u\n", ZSTD_versionNumber());
#endif
throw 42;
printf("After throw (unreachable)\n");
}
catch (...)
{
printf("Caught\n");
return 1;
}
return 0;
}
$ g++ -static-libgcc -static-libstdc++ -o testexc.exe testexc.cpp -lzstd
$ ./testexc
About to throw
Caught
$ g++ -static-libgcc -static-libstdc++ -o testexc.exe testexc.cpp -lzstd -DBREAK_EXCEPTIONS
$ ./testexc
About to throw
Calling zstd: 10500
I chose zstd pretty arbitrarily, it could be any DLL that's dynamically linked to libgcc.
Interestingly, it works with either -static-libgcc, -static-libstdc++, or neither, but breaks with both.
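One possible reading of this result, stated here as an assumption rather than a confirmed diagnosis: once a DLL linked to libgcc_s_dw2-1.dll is loaded, the process contains two copies of the unwinder's bookkeeping, and the executable's frame information may end up registered with one copy while the statically linked throw path searches the other. That would be consistent with the empty seen_objects and unseen_objects lists observed earlier in the thread. The toy model below illustrates only that idea. All types and names in it are hypothetical and are not libgcc's, and it does not attempt to explain why only the combination of both flags fails.

```cpp
#include <cstdint>

// Toy model only: hypothetical types, not libgcc's data structures.
struct FrameObject
{
  std::uintptr_t pc_begin;
  std::uintptr_t pc_end;
  FrameObject* next;
};

// One instance stands for one copy of the unwinder's global object list.
struct UnwinderCopy
{
  FrameObject* objects = nullptr;

  // Adds an object to the front of this copy's list.
  void register_object(FrameObject* ob)
  {
    ob->next = objects;
    objects = ob;
  }

  // Linear search for the object whose range contains pc.
  const FrameObject* find(std::uintptr_t pc) const
  {
    for (const FrameObject* ob = objects; ob; ob = ob->next) {
      if (pc >= ob->pc_begin && pc < ob->pc_end) {
        return ob;
      }
    }
    return nullptr;
  }
};

// Registers the executable's frames with one copy, then searches either
// the same copy or a second, empty one (the assumed failure mode).
bool lookup_succeeds(bool search_same_copy)
{
  UnwinderCopy copy_in_dll;
  UnwinderCopy copy_in_exe;
  FrameObject exe_frames{0x1000, 0x2000, nullptr};

  copy_in_dll.register_object(&exe_frames);

  const UnwinderCopy& searched = search_same_copy ? copy_in_dll : copy_in_exe;
  return searched.find(0x1800) != nullptr;
}
```

In this model, lookup_succeeds(true) finds the frame while lookup_succeeds(false) does not, which corresponds to the case where the real unwinder finds no FDE and aborts.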
We have fixed some of the unwinding issues. Does this still reproduce?
$ g++ -static-libgcc -static-libstdc++ -o testexc.exe testexc.cpp -lzstd
$ ./testexc
About to throw
Caught
$ g++ -static-libgcc -static-libstdc++ -o testexc.exe testexc.cpp -lzstd -DBREAK_EXCEPTIONS
$ ./testexc
About to throw
Calling zstd: 10500
I still get the same result with gcc 11.3.0
Here is an upstream comment on this issue: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105507#c5
These are built with my TDM-based toolset and exceptions work again. I guess it all came down to the grep problem we had.