IncludeOS
Upgrade to latest musl 1.2.5 (via 1.1.24)
This upgrades to musl 1.2.5, the latest release.
The update has been tested by building the chainloader and booting the example.
The upgrade is done via 1.1.24, using partial patches from the pthread branch (originally written by @fwsGonzo). See commit messages for more details.
musl has introduced _time64 versions of several functions to provide 64-bit time for 32-bit builds without breaking the existing API. It looks like IncludeOS is already using 64-bit values, so I have just kept the existing versions in IncludeOS. It would probably be useful to have someone more familiar with the IncludeOS time code read through this...
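For context, the mechanism musl uses is symbol redirection in the public headers: on 32-bit targets the old symbols stay around for ABI compatibility, while source compiled against the new headers is bound to the `__*_time64` variants at the assembler level. A minimal paraphrase of the pattern (assumed shape, not verbatim musl; the real macros live in include/features.h and the individual headers):

```cpp
// Paraphrased sketch of musl's _time64 redirection. On 32-bit archs with
// _REDIR_TIME64 defined, the public name is bound to the 64-bit symbol, so
// callers of time() transparently get 64-bit time_t. On 64-bit targets no
// redirection happens, which is why keeping the existing versions works here.
extern "C" {
  long long __time64(long long* t);  // the real 64-bit implementation

  // GNU asm-label trick used by musl's __REDIR macro:
  __typeof__(__time64) time __asm__("__time64");
}
```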
Awesome! Great to get musl unpinned and up to date, thanks!
- ✅ Builds
- ✅ UDP test passes: `nix-shell --argstr unikernel ./test/net/integration/udp/ --run "./test.py"`
- ✅ SMP test passes: `nix-shell --argstr unikernel ./test/kernel/integration/smp/ --run "./test.py"`
- ❌ TCP test enters crash loop: `nix-shell --argstr unikernel ./test/net/integration/tcp/ --run "./test.py"`
I'm pretty sure the TCP test failure is unrelated: it fails during heap initialisation, before libc is initialized. Incidentally, changing the musl patch to make `a_crash` emit a message and `cli; hlt` instead of `ud2` catches the error:
[ WARNING ] Running with sudo
<vm> SeaBIOS (version 1.16.3-debian-1.16.3-2)
<vm> Booting from ROM..* Multiboot begin: 0x9500
<vm> * Multiboot cmdline @ 0x23c035: /nix/store/74g3yzkxj9jbli24vhqg38q0psgng2cj-chainloader-static-i686-unknown-linux-musl-dev/bin/chainloader booted with vmrunner
<vm> * Multiboot end: 0x23c0b5
<vm> * Module list @ 0x23c000
<vm> * Module @ 0x23d000
<vm> * Args: net_tcp.elf.bin booted with vmrunner
<vm> * End: 0x4a6d80
<vm> * Multiboot end: 0x4a6d80
<vm> <Multiboot>Booted with multiboot
<vm> * Boot flags: 0x24f
<vm> * Valid memory (261631 Kib):
<vm> 0x00000000 - 0x0009fbff (639 Kib)
<vm> 0x00100000 - 0x0ffdffff (260992 Kib)
<vm>
<vm> * Booted with parameters @ 0x23c035: /nix/store/74g3yzkxj9jbli24vhqg38q0psgng2cj-chainloader-static-i686-unknown-linux-musl-dev/bin/chainloader booted with vmrunner
<vm> * Multiboot provided memory map (8 entries @ 0x9000)
<vm> 0x0000000000 - 0x000009fbff FREE (639 Kb.)
<vm> 0x000009fc00 - 0x000009ffff RESERVED (1 Kb.)
<vm> 0x00000f0000 - 0x00000fffff RESERVED (64 Kb.)
<vm> 0x0000100000 - 0x000ffdffff FREE (260992 Kb.)
<vm> 0x000ffe0000 - 0x000fffffff RESERVED (128 Kb.)
<vm> 0x00feffc000 - 0x00feffffff RESERVED (16 Kb.)
<vm> 0x00fffc0000 - 0x00ffffffff RESERVED (256 Kb.)
<vm> 0x0000000000 - 0x00ffffffff RESERVED (0 Kb.)
<vm>
<vm> <Multiboot>OS loaded with 1 modules
<vm> * net_tcp.elf.bin booted with vmrunner @ 0x23d000 - 0x4a6d80, size: 2530688b
<vm> * Multiboot begin: 0x9500
<vm> * Multiboot end: 0x3bd2a0
<vm> [x86_64 PC] constructor
<vm> [ Machine ] Initializing heap
<vm> [ Machine ] Main memory detected as 264380224 b
<vm> [ Machine ] Reserving 1048576 b for machine use
<vm>
<vm> *** Libc crashed 😵 ***
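The change is roughly of this shape (a hedged sketch, assuming output goes to the COM1 serial port; the actual patch is the commit linked below):

```cpp
// Hedged sketch: replace musl's ud2-based a_crash with a diagnostic message
// and a halt, so the VM stops instead of rebooting in a loop.
static inline void a_crash()
{
  static const char msg[] = "\n*** Libc crashed ***\n";
  for (const char* p = msg; *p; ++p)
    __asm__ __volatile__("outb %0, %1"   // write byte to COM1 (assumed console)
                         :: "a"(*p), "Nd"((unsigned short)0x3F8));
  __asm__ __volatile__("cli; hlt");      // stop this CPU for good
  __builtin_unreachable();
}
```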
You can bring this in if you'd like, or we can discuss it in a separate PR: https://github.com/alfreb/IncludeOS/commit/e81fb7c7da96b8cae8b43d406b6d868b7d09b66e
I think this is a great opportunity to maybe find an old ghost; I'll try to find some time to investigate.
Here's a backtrace, for reference:
#0 0x0000000000329bc2 in a_crash ()
#1 0x000000000032895e in enframe ()
#2 0x0000000000329840 in alloc_group ()
#3 0x0000000000328853 in alloc_slot ()
#4 0x00000000003297df in alloc_group ()
#5 0x0000000000328853 in alloc_slot ()
#6 0x00000000003297df in alloc_group ()
#7 0x0000000000328853 in alloc_slot ()
#8 0x00000000003285eb in __libc_malloc_impl ()
#9 0x00000000003267a5 in malloc ()
#10 0x000000000023f36b in strdup ()
#11 0x0000000000246f1d in x86::init_libc (magic=<optimized out>, addr=<optimized out>) at /build/source/src/platform/x86_pc/init_libc.cpp:107
#12 0x000000000024769a in long_mode ()
#13 0x0000000000000000 in ?? ()
I've opened a bug for this here: https://github.com/includeos/IncludeOS/issues/2252 and will try to chase it down from there.
Would you like to keep this open while you find the bug, or should we merge? If you're (relatively) confident it's unrelated to the updated musl version, I think we could merge this. It would be nice to track it down before any new release, though.
We don't need the TCP test to work, but I do think we need the SMP test to pass with all CPUs actually active, since there might be changes to how TLS is initialized in the new libc. I hadn't noticed before, but when running the SMP test with the command I posted above it detects 4 CPUs without actually bringing them online. So we need to:
- Enable SMP - we can consider having it on by default for now to have fewer code paths to maintain
- Run the SMP test with both v0.16.0-release and this branch.
- If v0.16.0 has working SMP and this PR has problems, we need to investigate to make sure the new musl doesn't change something important, like how TLS is initialized.
- If v0.16.0 has problems with SMP enabled, we need to go back further, to v0.15.1 or earlier to find when it broke.
As a point of reference, the last time I saw all APs coming online (although I didn't test them) was in the pthread-kernel branch, where I get this output when starting the example with 16 cores:
[ x86 ] Enabling interrupts
[ SMP ] Initializing APs
[ SMP ] Starting APs
AP 8 started at 0x547000
AP 9 started at 0x54b000
AP 15 started at 0x563000
AP 5 started at 0x53b000
AP 10 started at 0x54f000
AP 11 started at 0x553000
AP 13 started at 0x55b000
AP 6 started at 0x53f000
AP 7 started at 0x543000
AP 4 started at 0x537000
AP 1 started at 0x52b000
AP 2 started at 0x52f000
AP 14 started at 0x55f000
AP 12 started at 0x557000
AP 3 started at 0x533000
[ SMP ] All 16 APs are online now
Note that this is after a couple of reboots, so that particular image has problems, possibly related to #2252. I'm not able to reproduce that with pthread-kernel-musl-1.1.24, which has a more recent musl; there we get an infinite crash loop, possibly related to the same issue, but it could also be an unresolved incompatibility with more recent musl.
Steps to reproduce the last known working SMP setup:
git checkout origin/pthread-kernel
export INCLUDEOS_PACKAGE=$(nix-build)
nix-shell
make -j && grubify.sh hello_includeos.elf.bin
sudo qemu-system-x86_64 --nographic --enable-kvm -smp 16 -cpu host -hda hello_includeos.elf.bin.grub.img
libc needs to enable locking for SMP to work properly, and that only happens when threads > 0 in musl. See:
https://github.com/includeos/IncludeOS/blob/master/src/kernel/threads.cpp#L19-L23
and
https://github.com/includeos/IncludeOS/blob/master/src/kernel/threads.cpp#L60-L70
This means supporting an arbitrary musl version is not straightforward: we have to inspect the internal struct for each musl version in order to set the current thread count and to enable locking.
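To make that concrete, here's a hedged sketch of what that version-specific poking amounts to. The struct below mirrors musl 1.2.x's src/internal/libc.h, and the field order has to be kept in sync with the exact musl version being linked:

```cpp
// Mirror of musl's internal state (assumed to match the linked musl 1.2.x;
// any mismatch in field order silently corrupts libc state).
struct musl_libc {
  char can_do_threads;
  char threaded;
  char secure;
  volatile signed char need_locks;   // added in musl 1.2.2
  int threads_minus_1;
  // ... remaining fields (auxv, tls_head, ...) omitted in this sketch
};
extern "C" musl_libc __libc;         // musl's global, resolved at link time

// Roughly what IncludeOS must do before starting APs:
static void enable_musl_locking()
{
  __libc.threaded        = 1;
  __libc.threads_minus_1 = 1;  // pre-1.2.2: lock paths key off this
  __libc.need_locks      = 1;  // 1.2.2+: lock paths key off this instead
}
```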
Yes, there's definitely an SMP issue here. I've added a few more commits to make it easier to debug:
- enable SMP by default. We may want to add a parameter to the nix-package later so it's configurable again.
- disable `FOR_PRODUCTION` by default and update the warning
- include Alfred's `cli; hlt` commit in `a_crash`
- add `os::shutdown` to the example to save some `pkill qemu` invocations
It's now easy to reproduce with `nix-build ./example.nix && qemu-system-x86_64 -kernel ./chainloader -initrd result/bin/hello_includeos.elf.bin -smp 4 -nographic`:
[...]
[ xAPIC ] Enabling xAPIC
ID: 0 Ver: 50014
[ x86 ] Enabling interrupts
__libc.can_do_threads: 65537 __libc.threaded: 0
__libc.threads_minus_1: 0 -> 1
[ SMP ] Initializing APs
[ SMP ] Starting APs
AP 3 started at 0xa84000
AP 2 started at 0xa04000
AP 1 started at 0x984000
***** Lib** Libc crashed 😵 c ***
**crashed 😵 *** Libc crashed 😵 ***
With -smp 1 we get the expected:
================================================================================
IncludeOS VERY_DIRTY (x86_64 / 64-bit)
+--> Running [ Hello world - OS included ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+--> WARNING: No good random source found: RDRAND/RDSEED instructions not available.
+--> To make this warning fatal, re-compile with FOR_PRODUCTION=ON.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Args = result/bin/hello_includeos.elf.bin
Try giving the service less memory, eg. 10MB in vm.json
Service done. Shutting down...
[ Kernel ] Stopping service
[ Kernel ] Powering off
And with `-enable-kvm -cpu host -smp 4` the crash looks different; so far I haven't seen the libc crash message here:
[ x2APIC ] Enabling x2APIC @ 0xfee00900
APIC id: 0 ver: 50014
[ x86 ] Enabling interrupts
__libc.can_do_threads: 65537 __libc.threaded: 0
__libc.threads_minus_1: 0 -> 1
[ SMP ] Initializing APs
[ SMP ] Starting APs
AP 1 started at 0x984000
AP 3 started at 0xa84000
AP 2 started at 0xa04000
>>>> !!! CPU 1 EXCEPTION !!! <<<<
General Protection Fault (13) EIP 0x2c5bb8 CODE 0
RAX: f000ff53f000ff53 R 8: 0000000000000080
RBX: 0000000000000080 R 9: 0000000000000000
RCX: 0000000000000000 R10: 8080808080808080
RDX: 0000000000000000 R11: 0000000000a03a60
RBP: 0000000000000000 R12: 00000000003ec614
RSP: 0000000000843fd0 R13: 0000000000000000
RSI: 0000000000a03e04 R14: 0000000000000000
RDI: 0000000000000001 R15: 0000000080000010
RIP: 00000000002c5bb8 FLA: 0000000000010202
CR0: 0000000080000013 CR4: 0000000000000620
CR1: 0000000000000000 CR8: 0000000000000000
CR2: 0000000000000000 GDT: 0000000000323790 (41)
CR3: 0000000000001000 IDT: 00000000003383c0 (2559)
I've experimented a bit more with this, but no solution yet. I'll include some notes below.
@fwsGonzo do you think changes are required for tls/threads setup in addition to enabling locks in the internal __libc struct? The code you linked uses the new threads code, which is not (yet) available in v15/v16.
In https://github.com/includeos/IncludeOS/pull/2251/commits/57bdaf7acfae66c51e6fad9646faa18e5c8243b4 I updated the __libc struct to match changes in musl. With need_locks = 1 it still crashes with a CPU exception. I also get this crash with -enable-kvm -cpu host.
[ I/O APIC ] Initializing ID=0
Addr: 0xfec00000 Version: 0x20 Intr: 0 Entries: 24
[ xAPIC ] Enabling xAPIC
ID: 0 Ver: 50014
[ x86 ] Enabling interrupts
__libc.can_do_threads: 1
__libc.threaded: 0
__libc.threads_minus_1: 0 -> 1
__libc.need_locks: 0 -> 1 [ SMP ] Initializing APs
[ SMP ] Starting APs
AP 1 started at 0x784000
>>>> !!! CPU 1 EXCEPTION !!! <<<<
General Protection Fault (13) EIP 0x2c9407 CODE 0
RAX: f000ff53f000ff87 R 8: 0000000000000000
RBX: 0000000000000026 R 9: 00000000002e0ae3
RCX: 4b4d564b00000000 R10: 0000000000783db0
RDX: 0000000000000010 R11: 0000000000783a60
RBP: 0000000000783f00 R12: 00000000003ec610
RSP: 0000000000943fd0 R13: 0000000000000000
RSI: 4743544743544743 R14: 000000000037acc0
RDI: ffffffffffffffda R15: 00000000003ec5e0
RIP: 00000000002c9407 FLA: 0000000000000286
CR0: 00000000e0000013 CR4: 0000000000000620
CR1: 0000000000000000 CR8: 0000000000000000
CR2: f000ff53f000ff87 GDT: 0000000000323790 (41)
CR3: 0000000000001000 IDT: 00000000003383c0 (2559)
With -smp 2 and gdb I've tracked it down to RNG::get().init() on my machine (in revenant_main on CPU 1), but I suspect it is actually caused by something not being set up properly in musl (based on @fwsGonzo's comment above).
If I add more cores (e.g. -smp 4) I get "No timeout" messages (which look like they come from the stubbed sys_futex) and a libc crash. Perhaps we get contention on a lock when we add more cores and the crash just happens elsewhere. Or maybe we have multiple issues.
[ x86 ] Enabling interrupts
__libc.can_do_threads: 1
__libc.threaded: 0
__libc.threads_minus_1: 0 -> 1
__libc.need_locks: 0 -> 1 [ SMP ] Initializing APs
[ SMP ] Starting APs
AP 1 started at 0x984000
AP 2 started at 0xa04000
AP 3 started at 0xa84000
No timeout
*** Libc crashed 😵 ***
No timeout
No timeout
I see a similar crash with -smp 2 on the v0.16.0-release branch with SMP enabled and musl 1.1.18:
[ x86 ] Enabling interrupts
__libc.can_do_threads: 1 __libc.threaded: 0
__libc.threads_minus_1: 0 -> 1
[ SMP ] Initializing APs
[ SMP ] Starting APs
AP 1 started at 0x77e000
>>>> !!! CPU 1 EXCEPTION !!! <<<<
General Protection Fault (13) EIP 0x267e87 CODE 0
RAX: f000ff53f000ff97 R 8: 0000000000000000
RBX: 0000000000000026 R 9: 00000000002d9786
RCX: 4b4d564b00000000 R10: 000000000077ddb0
RDX: 0000000000000010 R11: 000000000000001c
RBP: 000000000077df00 R12: 000000000036a350
RSP: 000000000063dfd0 R13: 0000000000000000
RSI: 0000000000000000 R14: 000000000036fb40
RDI: ffffffffffffffda R15: 000000000036a320
RIP: 0000000000267e87 FLA: 0000000000010282
CR0: 0000000080000013 CR4: 0000000000000620
CR1: 0000000000000000 CR8: 0000000000000000
CR2: 0000000000000000 GDT: 0000000000318a90 (41)
CR3: 0000000000001000 IDT: 000000000032a4c0 (2559)
Testing with v0.15.x would require a working conan setup, which I don't have at the moment unfortunately.
I rebased on v0.16.0-release and reran the tests. The following tests are failing on this branch:
- step 5.9: `nix-shell --argstr unikernel ./test/kernel/integration/smp/ --run ./test.py`
- step 6.1: `nix-shell --argstr unikernel ./test/net/integration/bufstore/ --run ./test.py`
- step 6.3: `nix-shell --argstr unikernel ./test/net/integration/dns/ --run ./test.py`
- step 6.7: `nix-shell --argstr unikernel ./test/net/integration/tcp/ --run ./test.py`
With SMP=OFF there are other failures:
- step 5.10: `nix-shell --argstr unikernel ./test/kernel/integration/term/ --run ./test.py`
- step 6.2: `nix-shell --argstr unikernel ./test/net/integration/configure/ --run ./test.py`
- step 6.3: `nix-shell --argstr unikernel ./test/net/integration/dns/ --run ./test.py`
- step 6.4: `nix-shell --argstr unikernel ./test/net/integration/icmp/ --run ./test.py`
- step 6.5: `nix-shell --argstr unikernel ./test/net/integration/icmp6/ --run ./test.py`
- step 6.6: `nix-shell --argstr unikernel ./test/net/integration/slaac/ --run ./test.py`
- step 6.7: `nix-shell --argstr unikernel ./test/net/integration/tcp/ --run ./test.py`
It seems that these failures are related to the new malloc implementation in musl. Compiling with `--with-malloc=oldmalloc` makes all tests pass (with SMP=OFF).
With SMP=ON now only two tests fail:
- step 5.9: `nix-shell --argstr unikernel ./test/kernel/integration/smp/ --run ./test.py`
- step 6.1: `nix-shell --argstr unikernel ./test/net/integration/bufstore/ --run ./test.py`
Source for oldmalloc and mallocng (the default): https://git.musl-libc.org/cgit/musl/tree/src/malloc
I've pushed a new commit to use oldmalloc for now; all tests should be passing with SMP=OFF.
That's interesting!
I'm noticing that they have different locking mechanics, at least on the surface. oldmalloc calls `__wait` in src/thread/__wait.c, which calls `__syscall(SYS_futex, addr, FUTEX_WAIT, ...)` if `libc.need_locks` (call site in src/malloc/oldmalloc/malloc.c).
It also calls `__wake`, defined in src/internal/pthread_impl.h as an inline wrapper around `__syscall(SYS_futex, addr, FUTEX_WAKE, ...)` (call site in oldmalloc/malloc.c).
The new mallocng calls `rdlock`, `wrlock` and `unlock` from src/mallocng/glue.h:
- These are wrappers around `LOCK`/`UNLOCK`, which are macros wrapping `__lock` and `__unlock` from src/internal/lock.h.
- `__lock` calls `__futexwait` in src/thread/__lock.c, but again only if `libc.need_locks`. `__futexwait` wraps `__syscall(SYS_futex, addr, FUTEX_WAIT, ...)` in a similar manner to `__wait`.
- `__unlock`, implemented in the same file, calls `__wake` if multithreaded, which it determines by dereferencing the lock and checking for a negative value; see the comment `/* Check l[0] to see if we are multi-threaded. */`
So while these look quite different, apart from the lock implementation being separated out in the new malloc (the old one defined it in malloc.c), I can't see any obvious difference in how they call into futex. Both implementations also call both brk and mmap.
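To summarize the shared shape of both paths, here's a paraphrased sketch (not verbatim musl; the atomics and futex calls are stand-ins). Locking is skipped entirely until `need_locks` is set, which is exactly why IncludeOS has to flip it before a second CPU touches the heap:

```cpp
#include <atomic>

// Stand-ins for musl internals, for illustration only:
static signed char need_locks = 0;                   // libc.need_locks
static void futex_wait(std::atomic<int>*, int) {}    // __syscall(SYS_futex, FUTEX_WAIT, ...)
static void futex_wake(std::atomic<int>*, int) {}    // __syscall(SYS_futex, FUTEX_WAKE, ...)

void lock_sketch(std::atomic<int>& l)
{
  if (!need_locks) return;               // single-threaded: locking is a no-op
  int expected = 0;
  if (l.compare_exchange_strong(expected, 1)) return;  // uncontended fast path
  while (l.exchange(-1) != 0)            // mark contended (musl encodes waiters
    futex_wait(&l, -1);                  //  with INT_MIN; -1 stands in here)
}

void unlock_sketch(std::atomic<int>& l)
{
  if (l.exchange(0) < 0)                 // negative value => waiters present
    futex_wake(&l, 1);                   // wake one of them
}
```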
Going back to the callstack I posted above, we see that strdup is called here:
https://github.com/MagnusS/IncludeOS/blob/musl1-2-5/src/platform/x86_pc/init_libc.cpp#L107
This is before libc is initialized, when we're setting things up for calling into it for the first time. So it's possible that the new malloc depends on some state that the old one didn't, state that isn't ready before libc is initialized, causing one of the asserts in enframe to fail. We might want to check that malloc_context is properly zero-initialized, as it seems to expect here: https://git.musl-libc.org/cgit/musl/tree/src/malloc/mallocng/malloc.c#n40
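A hedged sketch of such a sanity check (the linker symbols and the printer are assumptions; IncludeOS's linker script may use different names): since mallocng's `ctx` lives in .bss, scanning .bss just before init_libc calls into libc would confirm it really starts out zeroed.

```cpp
// Assumed GNU-style linker symbols delimiting .bss:
extern "C" char __bss_start[], _end[];
// Assumed early, heap-free printer:
extern "C" void kprintf(const char*, ...);

static void verify_bss_zeroed()
{
  for (char* p = __bss_start; p < _end; ++p)
    if (*p != 0) {
      kprintf("<init_libc> .bss not zero at %p before first malloc\n", p);
      __asm__ __volatile__("cli; hlt");
    }
}
```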
The bufstore-test only fails when spinlocks are used - both scoped_spinlock and regular lock/unlock spinlocks will crash. Removing the alignment here fixes the issue: https://github.com/MagnusS/IncludeOS/blob/musl1-2-5/api/smp_utils#L28
If I instead add the alignment with alignas(128) in the BufferStore class's private section, it also works, here: https://github.com/MagnusS/IncludeOS/blob/musl1-2-5/api/net/buffer_store.hpp#L89
Should we remove the alignment from the spinlock_t typedef or could there be other ways to fix this? How much do we gain by always aligning spinlock_t?
From the discussion here I'm not sure whether __attribute__((aligned)) has well-defined behaviour for typedefs: https://stackoverflow.com/questions/15788947/where-can-i-use-alignas-in-c11/34006668#34006668
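A sketch of the two placements under discussion (names taken from the linked files, exact declarations assumed). Notably, the C++11 `alignas` specifier cannot appear in a typedef at all, and the GNU attribute on a typedef, while honored by GCC/Clang, is a nonstandard extension:

```cpp
// Option A (current): alignment baked into the typedef via a GNU attribute.
// Nonstandard, and every object of this type pays the 128-byte alignment.
typedef volatile int spinlock_t __attribute__((aligned(128)));

// Option B: plain typedef, alignment requested per member where false
// sharing actually matters. Standard C++11 and explicit at the use site.
typedef volatile int plain_spinlock_t;

class BufferStore {
  // ...
  alignas(128) plain_spinlock_t plock_ = 0;  // lock gets its own cache line
};
```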
#2266 should fix the bufstore test.
The SMP failure is likely due to this issue, but there may be other issues as well: https://github.com/includeos/IncludeOS/issues/2268
Modern Linux has some new system calls that may not be implemented. I don't know if that's related here, but a few that I can think of:
- clone3
- futex64_time-something
- futex with WAIT_BITSET and WAKE_BITSET
I've never really understood how to implement futex properly, and I suspect it's just not working right. Any code that leads to futex should be assumed to be problematic with newer libcs.
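For what it's worth, here's a hedged sketch of the minimal semantics a futex shim has to provide, including the BITSET variants mentioned above (constants are from Linux's uapi headers; the blocking is replaced with a spin, which is only a placeholder, and the function name/signature is hypothetical, not IncludeOS's actual stub):

```cpp
#include <cerrno>

// Linux futex operation constants (from <linux/futex.h>):
constexpr int FUTEX_WAIT         = 0;
constexpr int FUTEX_WAKE         = 1;
constexpr int FUTEX_WAIT_BITSET  = 9;
constexpr int FUTEX_WAKE_BITSET  = 10;
constexpr int FUTEX_PRIVATE_FLAG = 128;

extern "C" long sys_futex_sketch(volatile int* uaddr, int op, int val)
{
  op &= ~FUTEX_PRIVATE_FLAG;        // private/shared makes no difference here
  switch (op) {
  case FUTEX_WAIT:
  case FUTEX_WAIT_BITSET:
    // Contract: return immediately if *uaddr != val, else block until woken.
    // This sketch spins instead of blocking, which burns the CPU and only
    // makes progress if another core changes *uaddr.
    while (__atomic_load_n(uaddr, __ATOMIC_SEQ_CST) == val)
      __asm__ __volatile__("pause");
    return 0;
  case FUTEX_WAKE:
  case FUTEX_WAKE_BITSET:
    return 0;                       // nothing ever blocks here, nothing to wake
  default:
    return -ENOSYS;                 // unhandled ops must be reported, not ignored
  }
}
```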
I rebased on master and fixed the stack trace test to work with newer musl. With the latest fixes for SMP and the early calls into libc the tests are now passing. I tested both with SMP enabled and disabled.
$ ./test.sh
[...]
🌈✨ Everything is awesome ✨
Fantastic news! I'm adding cores to a few more integration tests to get a run where they not only have SMP enabled but also many cores to wake up. Will get back with review tomorrow.
Couldn't wait 😀 The extended test set from https://github.com/includeos/IncludeOS/pull/2276 passes without issues. Everything is indeed awesome 💪
The only lingering question for me is why we need to use the old malloc implementation - I couldn't find any good reasons for it when reading the code, but we can get to that later. I've spawned an issue for it here https://github.com/includeos/IncludeOS/issues/2277.