LibRuntime: Move Itanium C++ ABI and runtime support functions from LibC to a new library
As per my message on Discord, I want to:
- not link against libstdc++, libc++, or libunwind in Serenity code;
- still be able to use some of their headers; in particular, `<new>`, `<initializer_list>`, `<typeinfo>`, and `<coroutine>`;
- not provide implementations for any of them;
- leave ports that use C++ exceptions and link with our libc in a working state;
- be able to start a refactor that removes the libc dependency from our libraries when they run on SerenityOS;
- remove dependency cycle with libc & co.;
- and not implement enormous amounts of Itanium ABI.
To do so, I moved the Itanium ABI, the implementations of errno.h, string.h, and strings.h, and a part of stdio.h (struct FILE and *printf*) to a new library that does not depend on anything else from LibC or on other libraries (except for LibSystem and AK). In the future, this library, LibRuntime, should implement enough libc to link (as --whole-archive) with the exceptionless part of libc++abi and compiler-rt (or libgcc and libsupc++) without undefined symbols. Additionally, LibRuntime provides C++-ified wrappers for syscalls, which should make interaction with the kernel more pleasant.
Benefits of adopting LibRuntime include:
- Clearer architecture without dependency cycles.
- No need to add special rules for LibC initialization.
- If ported to Lagom (which I plan to do), less ifdef soup in AK (since, unlike LibCore, it's fine to use LibRuntime there).
- Less low-level C-style C++ code everywhere.
- Clearer distinction between the libc headers we provide for our own use and those we provide only for ports.
- Potential for easier code deduplication in low-level libraries like AK, LibCore, LibC, and LibThreading.
Demo:
TODO:
- [x] Figure out why dynamic loader cannot find `_Unwind_Resume` anymore
- [x] Make LibC link against LibRuntime instead of using object files directly
- [x] Teach DynamicLoader how to pass environ to LibRuntime
- [x] Make the whole thing work with GCC
- [x] Clean up the mess in CMakeLists.txt
- [ ] ~~Implement `__cxa_thread_atexit` (otherwise libc++abi depends on libc++)~~
- [x] Figure out the situation with sanitizers and profiling
- [x] Remove LocalItaniumCXXABI in favor of `crt0_shared.o`
- [x] Test ports
- [x] Test static executables (don't work in the exact same way as before)
Despite this being an absolutely massive PR, all commits are atomic, so, hopefully, it shouldn't be that painful to review.
> be able to start a refactor that removes the libc dependency from our libraries when they run on SerenityOS;
I don't really agree that this is a goal. What's the benefit here? Why can't we just keep "libruntime" as a Lagom-only thing? Why shouldn't we link against the C library?
In the past we made moves to move LibPthread and LibDL into LibC, as the centralized location for runtimey things.
I don't think that avoiding linking against libgcc and compiler-rt should be a goal either. Those libraries are part of "the compiler". The compiler will insert calls to functions in them whenever it feels like it. Adding extra math functions and atomic helpers and what have you to our own Runtime library seems like it'll evolve into unnecessary whack-a-mole on every compiler update or compiler flags change.
If there are things that our LibC doesn't provide that a Port linking against libc++ expects to be part of LibC, then we should teach -lc++ or -lstdc++ to pull in the extra required libraries in each compiler driver. Like, if we need libunwind to be pulled in by libstdc++, rather than assume it's linked to libc, we should tell the compiler that's a thing we need to do.
As a whole, I'm skeptical that this is adding enough value to be worth making the change? What's the cleanup in CMake like? Why do we need things like stdio in a separate library instead of LibC? Why does adding this library between LibC and LibCore break cycles? If we have a bunch of cycles, we should merge more things into LibC, not move things out. LibC is the runtime library.
@ADKaster ~~This is why I tried to ask about this whole thing on Discord before writing any code but nobody seemed to care~~
> Why shouldn't we link against the C library?
Think of my JSSpecCompiler test runner. When I call Core::Process::spawn there with Vector<String> arguments, Core::Process reallocates the arguments and adds zero terminators to convert this vector to char**; then, inside LibC, these zero terminators are trimmed once again and the lost lengths are restored. The kernel actually expects Vector<pair<char*, size_t>>, and we are doing this useless round-trip just because POSIX mandates C APIs for interaction with the system, when in reality we are communicating between two C++ programs which do not need C wrappers.
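To make the round-trip concrete, here is a minimal sketch in standard C++. It is not Serenity code: `std::string_view` stands in for AK strings, and `StringArgument`, `CArguments`, `to_c_arguments`, `from_c_arguments`, and `to_direct_arguments` are hypothetical names invented for illustration. It only shows the shape of the conversions described above.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical stand-in for the (pointer, length) pairs the kernel consumes.
using StringArgument = std::pair<char const*, std::size_t>;

// Owns the zero-terminated copies that a char** interface requires.
struct CArguments {
    std::vector<std::string> storage; // one heap copy per argument
    std::vector<char*> pointers;      // null-terminated argv-style array
};

// Step 1 (C++ -> C): copy every argument and append a zero terminator.
CArguments to_c_arguments(std::vector<std::string_view> const& arguments)
{
    CArguments result;
    result.storage.reserve(arguments.size());
    for (auto argument : arguments)
        result.storage.emplace_back(argument);
    for (auto& copy : result.storage)
        result.pointers.push_back(copy.data());
    result.pointers.push_back(nullptr);
    return result;
}

// Step 2 (C -> C++): rescan every string to recover the length dropped in step 1.
std::vector<StringArgument> from_c_arguments(char* const* argv)
{
    std::vector<StringArgument> result;
    for (; *argv; ++argv)
        result.emplace_back(*argv, std::strlen(*argv));
    return result;
}

// Direct path: the lengths are already known, so nothing is copied or rescanned.
std::vector<StringArgument> to_direct_arguments(std::vector<std::string_view> const& arguments)
{
    std::vector<StringArgument> result;
    result.reserve(arguments.size());
    for (auto argument : arguments)
        result.emplace_back(argument.data(), argument.size());
    return result;
}
```

Both paths end with the same lengths; the C detour just pays for one copy and one `strlen` per argument to get there.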
This pain, of course, is partially alleviated by LibCore/System.h, but we can't use it in AK and LibC, which requires us to reimplement system interfaces in the latter two. Also note that C++ -> C conversion is usually more painful (in terms of the number of useless reallocations) than conversion in the other direction.
Moreover, we had a few instances where we were using C APIs with AK::String and completely forgetting about the zero terminator (most notably with strtol).
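The strtol pitfall can be reproduced with standard C++ alone. This is a sketch under assumptions: `std::string_view` stands in for a non-terminated AK string, and `parse_with_strtol` and `parse_with_from_chars` are hypothetical helpers, not functions from the PR.

```cpp
#include <charconv>
#include <cstdlib>
#include <string_view>

// Buggy pattern: strtol ignores the view's length and keeps reading
// until it hits a non-digit or a zero terminator in the underlying buffer.
long parse_with_strtol(std::string_view view)
{
    return std::strtol(view.data(), nullptr, 10);
}

// Length-aware alternative: std::from_chars stops at view.data() + view.size().
// Error handling is omitted for brevity; value stays 0 if nothing parses.
long parse_with_from_chars(std::string_view view)
{
    long value = 0;
    std::from_chars(view.data(), view.data() + view.size(), value);
    return value;
}
```

Given a view over the first three characters of the buffer `"123456"`, the first helper parses the whole buffer while the second parses only `"123"`. If the underlying buffer had no terminator at all, the strtol variant would read out of bounds.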
On top of that, I don't suggest a nuclear option (as I initially did in #21995): widely used C APIs can be just moved to LibRuntime and no porting will be needed. Here I did this with string.h and strings.h.
> Why can't we just keep "libruntime" as a Lagom-only thing?

We certainly can, but LibRuntime there would basically be the LibCore/System.cpp C++ -> C conversion without Serenity-specific stuff. This feels like a partial measure to me. Why can't we just provide LibRuntime on Serenity too, to remove as much ifdef soup as possible? And I argue Serenity would also benefit in terms of performance from not doing the C++ -> C -> C++ round-trip.
> In the past we made moves to move LibPthread and LibDL into LibC, as the centralized location for runtimey things.

Fine, but why can't we shift this responsibility to LibRuntime, which isn't constrained by POSIX and is not bloated with C-style APIs that have nothing to do with runtime?
> I don't think that avoiding linking against libgcc and compiler-rt should be a goal either.

I haven't said that; I continue to link with them. The three libraries I avoid are LibC, libstdc++, and libc++.
> What's the cleanup in CMake like?

Just the mess I made in the initial versions of the PR.
> Why do we need things like stdio in a separate library instead of LibC?

Unfortunately, stdio.h is used by compiler runtimes, so I basically had no other choice. Fortunately, this is not an exception, because LibRuntime already provides a subset of LibC.
> Why does adding this library between LibC and LibCore break cycles?

It is not between but under LibC. The libraries on Serenity with Clang are linked as follows:
LibRuntime (contains AK) -> LibSystem
LibRuntime -> (statically, `--as-needed`) clang.builtins, libc++abi
LibC -> LibRuntime
LibC -> LibSystem
LibC -> libunwind
libunwind -> LibC
LibC -> (statically, `--as-needed`) clang.builtins, libc++abi
and the rest is unchanged. So, we can use LibRuntime functions everywhere including AK and LibC. This also has a side-effect of simplifying the linker.
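The claim that LibRuntime sits under LibC can be checked mechanically. The sketch below encodes only the edges listed above and runs a reachability query; `Graph`, `serenity_clang_link_graph`, and `depends_on` are hypothetical names for illustration, and no build system is consulted.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

using Graph = std::map<std::string, std::vector<std::string>>;

// Edges copied from the list above (Serenity with Clang).
Graph serenity_clang_link_graph()
{
    return {
        { "LibRuntime", { "LibSystem", "clang.builtins", "libc++abi" } },
        { "LibC", { "LibRuntime", "LibSystem", "libunwind", "clang.builtins", "libc++abi" } },
        { "libunwind", { "LibC" } },
    };
}

// Returns true if `to` is reachable from `from` by following one or more edges.
bool depends_on(Graph const& graph, std::string const& from, std::string const& to)
{
    std::set<std::string> visited;
    std::vector<std::string> stack { from };
    while (!stack.empty()) {
        auto current = stack.back();
        stack.pop_back();
        if (!visited.insert(current).second)
            continue; // already expanded
        auto it = graph.find(current);
        if (it == graph.end())
            continue; // leaf library with no listed dependencies
        for (auto const& dependency : it->second) {
            if (dependency == to)
                return true;
            stack.push_back(dependency);
        }
    }
    return false;
}
```

With these edges, `depends_on(graph, "LibRuntime", "LibC")` is false while `depends_on(graph, "LibC", "LibRuntime")` is true. The only remaining cycle in the listed edges is the one between LibC and libunwind.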
> If we have a bunch of cycles, we should merge more things into LibC, not move things out. LibC is the runtime library.

I doubt that merging libunwind into LibC is a great idea. libunwind is a C++ implementation detail and not a thing we want to have available (at least, publicly) in LibC.
Also, look at this, for example. Seems like it's so much nicer to be able to use C++ APIs inside LibC.