IncludeOS
Upgrade to latest musl 1.2.5 (via 1.1.24)
This upgrades to musl 1.2.5, the latest release.
The update has been tested by building the chainloader and booting the example.
The upgrade is done via 1.1.24, using partial patches from the pthread branch (originally written by @fwsGonzo). See commit messages for more details.
musl has introduced _time64 versions of several functions to provide 64-bit time for 32-bit builds without breaking the existing API. It looks like IncludeOS is already using 64-bit values, so I have just kept the existing versions in IncludeOS. It would probably be useful to have someone more familiar with the IncludeOS time code read through this...
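For context, the mechanism musl uses is symbol redirection in the public headers: on 32-bit targets the old symbols stay around for ABI compatibility, while source compiled against the new headers is bound to the `__*_time64` variants at the assembler level. A minimal paraphrase of the pattern (assumed shape, not verbatim musl; the real macros live in include/features.h and the individual headers):

```cpp
// Paraphrased sketch of musl's _time64 redirection. On 32-bit archs with
// _REDIR_TIME64 defined, the public name is bound to the 64-bit symbol, so
// callers of time() transparently get 64-bit time_t. On 64-bit targets no
// redirection happens, which is why keeping the existing versions works here.
extern "C" {
  long long __time64(long long* t);  // the real 64-bit implementation

  // GNU asm-label trick used by musl's __REDIR macro:
  __typeof__(__time64) time __asm__("__time64");
}
```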
Awesome! Great to get musl unpinned and up to date, thanks!
- ✅ Builds
- ✅ UDP test passes: `nix-shell --argstr unikernel ./test/net/integration/udp/ --run "./test.py"`
- ✅ SMP test passes: `nix-shell --argstr unikernel ./test/kernel/integration/smp/ --run "./test.py"`
- ❌ TCP test enters crash loop: `nix-shell --argstr unikernel ./test/net/integration/tcp/ --run "./test.py"`
I'm pretty sure the TCP test failure is unrelated: it fails during heap initialisation, before libc is initialized. Incidentally, changing the musl patch to make `a_crash` emit a message and `cli; hlt` instead of `ud2` catches the error:
[ WARNING ] Running with sudo
<vm> SeaBIOS (version 1.16.3-debian-1.16.3-2)
<vm> Booting from ROM..* Multiboot begin: 0x9500
<vm> * Multiboot cmdline @ 0x23c035: /nix/store/74g3yzkxj9jbli24vhqg38q0psgng2cj-chainloader-static-i686-unknown-linux-musl-dev/bin/chainloader booted with vmrunner
<vm> * Multiboot end: 0x23c0b5
<vm> * Module list @ 0x23c000
<vm> * Module @ 0x23d000
<vm> * Args: net_tcp.elf.bin booted with vmrunner
<vm> * End: 0x4a6d80
<vm> * Multiboot end: 0x4a6d80
<vm> <Multiboot>Booted with multiboot
<vm> * Boot flags: 0x24f
<vm> * Valid memory (261631 Kib):
<vm> 0x00000000 - 0x0009fbff (639 Kib)
<vm> 0x00100000 - 0x0ffdffff (260992 Kib)
<vm>
<vm> * Booted with parameters @ 0x23c035: /nix/store/74g3yzkxj9jbli24vhqg38q0psgng2cj-chainloader-static-i686-unknown-linux-musl-dev/bin/chainloader booted with vmrunner
<vm> * Multiboot provided memory map (8 entries @ 0x9000)
<vm> 0x0000000000 - 0x000009fbff FREE (639 Kb.)
<vm> 0x000009fc00 - 0x000009ffff RESERVED (1 Kb.)
<vm> 0x00000f0000 - 0x00000fffff RESERVED (64 Kb.)
<vm> 0x0000100000 - 0x000ffdffff FREE (260992 Kb.)
<vm> 0x000ffe0000 - 0x000fffffff RESERVED (128 Kb.)
<vm> 0x00feffc000 - 0x00feffffff RESERVED (16 Kb.)
<vm> 0x00fffc0000 - 0x00ffffffff RESERVED (256 Kb.)
<vm> 0x0000000000 - 0x00ffffffff RESERVED (0 Kb.)
<vm>
<vm> <Multiboot>OS loaded with 1 modules
<vm> * net_tcp.elf.bin booted with vmrunner @ 0x23d000 - 0x4a6d80, size: 2530688b
<vm> * Multiboot begin: 0x9500
<vm> * Multiboot end: 0x3bd2a0
<vm> [x86_64 PC] constructor
<vm> [ Machine ] Initializing heap
<vm> [ Machine ] Main memory detected as 264380224 b
<vm> [ Machine ] Reserving 1048576 b for machine use
<vm>
<vm> *** Libc crashed 😵 ***
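The change is roughly of this shape (a hedged sketch, assuming output goes to the COM1 serial port; the actual patch is the commit linked below):

```cpp
// Hedged sketch: replace musl's ud2-based a_crash with a diagnostic message
// and a halt, so the VM stops instead of rebooting in a loop.
static inline void a_crash()
{
  static const char msg[] = "\n*** Libc crashed ***\n";
  for (const char* p = msg; *p; ++p)
    __asm__ __volatile__("outb %0, %1"   // write byte to COM1 (assumed console)
                         :: "a"(*p), "Nd"((unsigned short)0x3F8));
  __asm__ __volatile__("cli; hlt");      // stop this CPU for good
  __builtin_unreachable();
}
```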
You can bring this in if you'd like, or we can discuss it in a separate PR: https://github.com/alfreb/IncludeOS/commit/e81fb7c7da96b8cae8b43d406b6d868b7d09b66e
I think this is a great opportunity to maybe find an old ghost; I'll try to find some time to investigate.
Here's a backtrace, for reference:
#0 0x0000000000329bc2 in a_crash ()
#1 0x000000000032895e in enframe ()
#2 0x0000000000329840 in alloc_group ()
#3 0x0000000000328853 in alloc_slot ()
#4 0x00000000003297df in alloc_group ()
#5 0x0000000000328853 in alloc_slot ()
#6 0x00000000003297df in alloc_group ()
#7 0x0000000000328853 in alloc_slot ()
#8 0x00000000003285eb in __libc_malloc_impl ()
#9 0x00000000003267a5 in malloc ()
#10 0x000000000023f36b in strdup ()
#11 0x0000000000246f1d in x86::init_libc (magic=<optimized out>, addr=<optimized out>) at /build/source/src/platform/x86_pc/init_libc.cpp:107
#12 0x000000000024769a in long_mode ()
#13 0x0000000000000000 in ?? ()
I've opened a bug for this here: https://github.com/includeos/IncludeOS/issues/2252 and will try to chase it down from there.
Would you like to keep this open while you find the bug, or should we merge? If you're (relatively) confident it's unrelated to the updated musl version, I think we could merge this. It would be nice to track it down before any new release, though.
We don't need the TCP test to work, but I do think we need the SMP test to pass with all CPUs actually active, since there might be changes to how TLS is initialized in the new libc. I hadn't noticed before, but when running the SMP test with the command I posted above it detects 4 CPUs without actually bringing them online. So we need to:
- Enable SMP - we can consider having it on by default for now to have fewer code paths to maintain
- Run the SMP test with both v0.16.0-release and this branch.
- If v0.16.0 has working SMP and this PR has problems, we need to investigate to make sure the new musl doesn't change something important, like how TLS is initialized.
- If v0.16.0 has problems with SMP enabled, we need to go back further, to v0.15.1 or earlier to find when it broke.
As a point of reference, the last time I saw all APs coming online (although I didn't test them) was in the pthread-kernel branch, where I get this output when starting the example with 16 cores:
[ x86 ] Enabling interrupts
[ SMP ] Initializing APs
[ SMP ] Starting APs
AP 8 started at 0x547000
AP 9 started at 0x54b000
AP 15 started at 0x563000
AP 5 started at 0x53b000
AP 10 started at 0x54f000
AP 11 started at 0x553000
AP 13 started at 0x55b000
AP 6 started at 0x53f000
AP 7 started at 0x543000
AP 4 started at 0x537000
AP 1 started at 0x52b000
AP 2 started at 0x52f000
AP 14 started at 0x55f000
AP 12 started at 0x557000
AP 3 started at 0x533000
[ SMP ] All 16 APs are online now
Note that this is after a couple of reboots, so that particular image has problems, possibly related to #2252. I'm not able to reproduce that with pthread-kernel-musl-1.1.24, which has a more recent musl; there we get an infinite crash loop, possibly related to the same issue, but it could also be an unresolved incompatibility with more recent musl.
Steps to reproduce the last known working SMP setup:
git checkout origin/pthread-kernel
export INCLUDEOS_PACKAGE=$(nix-build)
nix-shell
make -j && grubify.sh hello_includeos.elf.bin
sudo qemu-system-x86_64 --nographic --enable-kvm -smp 16 -cpu host -hda hello_includeos.elf.bin.grub.img
libc needs to enable locking for SMP to work properly, and that only happens when threads > 0 in musl. See:
https://github.com/includeos/IncludeOS/blob/master/src/kernel/threads.cpp#L19-L23
and
https://github.com/includeos/IncludeOS/blob/master/src/kernel/threads.cpp#L60-L70
This means supporting an arbitrary musl version is not straightforward: we have to inspect the internal struct for each musl version in order to set the current thread count and to enable locking.
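To make that concrete, here's a hedged sketch of what that version-specific poking amounts to. The struct below mirrors musl 1.2.x's src/internal/libc.h, and the field order has to be kept in sync with the exact musl version being linked:

```cpp
// Mirror of musl's internal state (assumed to match the linked musl 1.2.x;
// any mismatch in field order silently corrupts libc state).
struct musl_libc {
  char can_do_threads;
  char threaded;
  char secure;
  volatile signed char need_locks;   // added in musl 1.2.2
  int threads_minus_1;
  // ... remaining fields (auxv, tls_head, ...) omitted in this sketch
};
extern "C" musl_libc __libc;         // musl's global, resolved at link time

// Roughly what IncludeOS must do before starting APs:
static void enable_musl_locking()
{
  __libc.threaded        = 1;
  __libc.threads_minus_1 = 1;  // pre-1.2.2: lock paths key off this
  __libc.need_locks      = 1;  // 1.2.2+: lock paths key off this instead
}
```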
Yes, there's definitely an SMP issue here. I've added a few more commits to make it easier to debug:
- enable SMP by default. We may want to add a parameter to the nix-package later so it's configurable again.
- disable `FOR_PRODUCTION` by default and update the warning
- include Alfred's `cli; hlt` commit in `a_crash`
- add `os::shutdown` to the example to save some `pkill qemu` invocations
It's now easy to reproduce with `nix-build ./example.nix && qemu-system-x86_64 -kernel ./chainloader -initrd result/bin/hello_includeos.elf.bin -smp 4 -nographic`:
[...]
[ xAPIC ] Enabling xAPIC
ID: 0 Ver: 50014
[ x86 ] Enabling interrupts
__libc.can_do_threads: 65537 __libc.threaded: 0
__libc.threads_minus_1: 0 -> 1
[ SMP ] Initializing APs
[ SMP ] Starting APs
AP 3 started at 0xa84000
AP 2 started at 0xa04000
AP 1 started at 0x984000
***** Lib** Libc crashed 😵 c ***
**crashed 😵 *** Libc crashed 😵 ***
With -smp 1 we get the expected:
================================================================================
IncludeOS VERY_DIRTY (x86_64 / 64-bit)
+--> Running [ Hello world - OS included ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+--> WARNING: No good random source found: RDRAND/RDSEED instructions not available.
+--> To make this warning fatal, re-compile with FOR_PRODUCTION=ON.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Args = result/bin/hello_includeos.elf.bin
Try giving the service less memory, eg. 10MB in vm.json
Service done. Shutting down...
[ Kernel ] Stopping service
[ Kernel ] Powering off
And with `-enable-kvm -cpu host -smp 4` the crash looks different; so far I haven't seen the libc crash message here:
[ x2APIC ] Enabling x2APIC @ 0xfee00900
APIC id: 0 ver: 50014
[ x86 ] Enabling interrupts
__libc.can_do_threads: 65537 __libc.threaded: 0
__libc.threads_minus_1: 0 -> 1
[ SMP ] Initializing APs
[ SMP ] Starting APs
AP 1 started at 0x984000
AP 3 started at 0xa84000
AP 2 started at 0xa04000
>>>> !!! CPU 1 EXCEPTION !!! <<<<
General Protection Fault (13) EIP 0x2c5bb8 CODE 0
RAX: f000ff53f000ff53 R 8: 0000000000000080
RBX: 0000000000000080 R 9: 0000000000000000
RCX: 0000000000000000 R10: 8080808080808080
RDX: 0000000000000000 R11: 0000000000a03a60
RBP: 0000000000000000 R12: 00000000003ec614
RSP: 0000000000843fd0 R13: 0000000000000000
RSI: 0000000000a03e04 R14: 0000000000000000
RDI: 0000000000000001 R15: 0000000080000010
RIP: 00000000002c5bb8 FLA: 0000000000010202
CR0: 0000000080000013 CR4: 0000000000000620
CR1: 0000000000000000 CR8: 0000000000000000
CR2: 0000000000000000 GDT: 0000000000323790 (41)
CR3: 0000000000001000 IDT: 00000000003383c0 (2559)
I've experimented a bit more with this, but no solution yet. I'll include some notes below.
@fwsGonzo do you think changes are required for tls/threads setup in addition to enabling locks in the internal __libc struct? The code you linked uses the new threads code, which is not (yet) available in v15/v16.
In https://github.com/includeos/IncludeOS/pull/2251/commits/57bdaf7acfae66c51e6fad9646faa18e5c8243b4 I updated the __libc struct to match changes in musl. With need_locks = 1 it still crashes with a CPU exception. I also get this crash with -enable-kvm -cpu host.
[ I/O APIC ] Initializing ID=0
Addr: 0xfec00000 Version: 0x20 Intr: 0 Entries: 24
[ xAPIC ] Enabling xAPIC
ID: 0 Ver: 50014
[ x86 ] Enabling interrupts
__libc.can_do_threads: 1
__libc.threaded: 0
__libc.threads_minus_1: 0 -> 1
__libc.need_locks: 0 -> 1 [ SMP ] Initializing APs
[ SMP ] Starting APs
AP 1 started at 0x784000
>>>> !!! CPU 1 EXCEPTION !!! <<<<
General Protection Fault (13) EIP 0x2c9407 CODE 0
RAX: f000ff53f000ff87 R 8: 0000000000000000
RBX: 0000000000000026 R 9: 00000000002e0ae3
RCX: 4b4d564b00000000 R10: 0000000000783db0
RDX: 0000000000000010 R11: 0000000000783a60
RBP: 0000000000783f00 R12: 00000000003ec610
RSP: 0000000000943fd0 R13: 0000000000000000
RSI: 4743544743544743 R14: 000000000037acc0
RDI: ffffffffffffffda R15: 00000000003ec5e0
RIP: 00000000002c9407 FLA: 0000000000000286
CR0: 00000000e0000013 CR4: 0000000000000620
CR1: 0000000000000000 CR8: 0000000000000000
CR2: f000ff53f000ff87 GDT: 0000000000323790 (41)
CR3: 0000000000001000 IDT: 00000000003383c0 (2559)
With -smp 2 and gdb I've tracked it down to RNG::get().init() on my machine (in revenant_main on CPU 1), but I suspect it is actually caused by something not being set up properly in musl (based on @fwsGonzo's comment above).
If I add more cores (e.g. -smp 4) I get "No timeout" messages (which look like they come from the stubbed sys_futex) and a libc crash. Perhaps we get contention on a lock when we add more cores and the crash just happens elsewhere. Or maybe we have multiple issues.
[ x86 ] Enabling interrupts
__libc.can_do_threads: 1
__libc.threaded: 0
__libc.threads_minus_1: 0 -> 1
__libc.need_locks: 0 -> 1 [ SMP ] Initializing APs
[ SMP ] Starting APs
AP 1 started at 0x984000
AP 2 started at 0xa04000
AP 3 started at 0xa84000
No timeout
*** Libc crashed 😵 ***
No timeout
No timeout
I see a similar crash with -smp 2 on the v0.16.0-release branch with SMP enabled and musl 1.1.18:
[ x86 ] Enabling interrupts
__libc.can_do_threads: 1 __libc.threaded: 0
__libc.threads_minus_1: 0 -> 1
[ SMP ] Initializing APs
[ SMP ] Starting APs
AP 1 started at 0x77e000
>>>> !!! CPU 1 EXCEPTION !!! <<<<
General Protection Fault (13) EIP 0x267e87 CODE 0
RAX: f000ff53f000ff97 R 8: 0000000000000000
RBX: 0000000000000026 R 9: 00000000002d9786
RCX: 4b4d564b00000000 R10: 000000000077ddb0
RDX: 0000000000000010 R11: 000000000000001c
RBP: 000000000077df00 R12: 000000000036a350
RSP: 000000000063dfd0 R13: 0000000000000000
RSI: 0000000000000000 R14: 000000000036fb40
RDI: ffffffffffffffda R15: 000000000036a320
RIP: 0000000000267e87 FLA: 0000000000010282
CR0: 0000000080000013 CR4: 0000000000000620
CR1: 0000000000000000 CR8: 0000000000000000
CR2: 0000000000000000 GDT: 0000000000318a90 (41)
CR3: 0000000000001000 IDT: 000000000032a4c0 (2559)
Testing with v0.15.x would require a working conan setup, which I don't have at the moment unfortunately.
I rebased on v0.16.0-release and reran the tests. The following tests are failing on this branch:
- step 5.9: `nix-shell --argstr unikernel ./test/kernel/integration/smp/ --run ./test.py`
- step 6.1: `nix-shell --argstr unikernel ./test/net/integration/bufstore/ --run ./test.py`
- step 6.3: `nix-shell --argstr unikernel ./test/net/integration/dns/ --run ./test.py`
- step 6.7: `nix-shell --argstr unikernel ./test/net/integration/tcp/ --run ./test.py`
With SMP=OFF there are other failures:
- step 5.10: `nix-shell --argstr unikernel ./test/kernel/integration/term/ --run ./test.py`
- step 6.2: `nix-shell --argstr unikernel ./test/net/integration/configure/ --run ./test.py`
- step 6.3: `nix-shell --argstr unikernel ./test/net/integration/dns/ --run ./test.py`
- step 6.4: `nix-shell --argstr unikernel ./test/net/integration/icmp/ --run ./test.py`
- step 6.5: `nix-shell --argstr unikernel ./test/net/integration/icmp6/ --run ./test.py`
- step 6.6: `nix-shell --argstr unikernel ./test/net/integration/slaac/ --run ./test.py`
- step 6.7: `nix-shell --argstr unikernel ./test/net/integration/tcp/ --run ./test.py`
It seems that these failures are related to the new malloc implementation in musl. Compiling with `--with-malloc=oldmalloc` makes all tests pass (with SMP=OFF).
With SMP=ON now only two tests fail:
- step 5.9: `nix-shell --argstr unikernel ./test/kernel/integration/smp/ --run ./test.py`
- step 6.1: `nix-shell --argstr unikernel ./test/net/integration/bufstore/ --run ./test.py`
Source for oldmalloc and mallocng (the default): https://git.musl-libc.org/cgit/musl/tree/src/malloc
I've pushed a new commit to use oldmalloc for now; all tests should be passing with SMP=OFF.
That's interesting!
I'm noticing that they have different locking mechanics, at least on the surface. oldmalloc calls `__wait` in src/thread/__wait.c, which calls `__syscall(SYS_futex, addr, FUTEX_WAIT, ...)` if `libc.need_locks` (call site in src/malloc/oldmalloc/malloc.c).
It also calls `__wake`, defined in src/internal/pthread_impl.h as an inline wrapper around `__syscall(SYS_futex, addr, FUTEX_WAKE, ...)` (call site in oldmalloc/malloc.c).
The new mallocng calls `rdlock`, `wrlock` and `unlock` from src/mallocng/glue.h:
- These are wrappers around `LOCK`/`UNLOCK`, which are macros wrapping `__lock` and `__unlock` from src/internal/lock.h.
- `__lock` calls `__futexwait` in src/thread/__lock.c, but again only if `libc.need_locks`. `__futexwait` wraps `__syscall(SYS_futex, addr, FUTEX_WAIT, ...)` in a similar manner to `__wait`.
- `__unlock`, implemented in the same file, calls `__wake` if multithreaded, which it determines by dereferencing the lock and checking for a negative value; see the comment `/* Check l[0] to see if we are multi-threaded. */`
So while these look quite different, apart from the lock implementation being separated out in the new malloc (the old one defined it in malloc.c), I can't see any obvious difference in how they call into futex. Both implementations also call both brk and mmap.
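To summarize the shared shape of both paths, here's a paraphrased sketch (not verbatim musl; the atomics and futex calls are stand-ins). Locking is skipped entirely until `need_locks` is set, which is exactly why IncludeOS has to flip it before a second CPU touches the heap:

```cpp
#include <atomic>

// Stand-ins for musl internals, for illustration only:
static signed char need_locks = 0;                   // libc.need_locks
static void futex_wait(std::atomic<int>*, int) {}    // __syscall(SYS_futex, FUTEX_WAIT, ...)
static void futex_wake(std::atomic<int>*, int) {}    // __syscall(SYS_futex, FUTEX_WAKE, ...)

void lock_sketch(std::atomic<int>& l)
{
  if (!need_locks) return;               // single-threaded: locking is a no-op
  int expected = 0;
  if (l.compare_exchange_strong(expected, 1)) return;  // uncontended fast path
  while (l.exchange(-1) != 0)            // mark contended (musl encodes waiters
    futex_wait(&l, -1);                  //  with INT_MIN; -1 stands in here)
}

void unlock_sketch(std::atomic<int>& l)
{
  if (l.exchange(0) < 0)                 // negative value => waiters present
    futex_wake(&l, 1);                   // wake one of them
}
```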
Going back to the callstack I posted above, we see that strdup is called here:
https://github.com/MagnusS/IncludeOS/blob/musl1-2-5/src/platform/x86_pc/init_libc.cpp#L107
This is before libc is initialized, when we're setting things up for calling into it for the first time. So it's possible that the new malloc depends on some state that the old one didn't, state that isn't ready before libc is initialized, causing one of the asserts in enframe to fail. We might want to check that malloc_context is properly zero-initialized, as it seems to expect here: https://git.musl-libc.org/cgit/musl/tree/src/malloc/mallocng/malloc.c#n40
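A hedged sketch of such a sanity check (the linker symbols and the printer are assumptions; IncludeOS's linker script may use different names): since mallocng's `ctx` lives in .bss, scanning .bss just before init_libc calls into libc would confirm it really starts out zeroed.

```cpp
// Assumed GNU-style linker symbols delimiting .bss:
extern "C" char __bss_start[], _end[];
// Assumed early, heap-free printer:
extern "C" void kprintf(const char*, ...);

static void verify_bss_zeroed()
{
  for (char* p = __bss_start; p < _end; ++p)
    if (*p != 0) {
      kprintf("<init_libc> .bss not zero at %p before first malloc\n", p);
      __asm__ __volatile__("cli; hlt");
    }
}
```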
The bufstore-test only fails when spinlocks are used - both scoped_spinlock and regular lock/unlock spinlocks will crash. Removing the alignment here fixes the issue: https://github.com/MagnusS/IncludeOS/blob/musl1-2-5/api/smp_utils#L28
If I instead add the alignment with alignas(128) in the BufferStore class's private section, it also works, here: https://github.com/MagnusS/IncludeOS/blob/musl1-2-5/api/net/buffer_store.hpp#L89
Should we remove the alignment from the spinlock_t typedef or could there be other ways to fix this? How much do we gain by always aligning spinlock_t?
From the discussion here I'm not sure whether __attribute__((aligned)) has well-defined behaviour for typedefs: https://stackoverflow.com/questions/15788947/where-can-i-use-alignas-in-c11/34006668#34006668
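A sketch of the two placements under discussion (names taken from the linked files, exact declarations assumed). Notably, the C++11 `alignas` specifier cannot appear in a typedef at all, and the GNU attribute on a typedef, while honored by GCC/Clang, is a nonstandard extension:

```cpp
// Option A (current): alignment baked into the typedef via a GNU attribute.
// Nonstandard, and every object of this type pays the 128-byte alignment.
typedef volatile int spinlock_t __attribute__((aligned(128)));

// Option B: plain typedef, alignment requested per member where false
// sharing actually matters. Standard C++11 and explicit at the use site.
typedef volatile int plain_spinlock_t;

class BufferStore {
  // ...
  alignas(128) plain_spinlock_t plock_ = 0;  // lock gets its own cache line
};
```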
#2266 should fix the bufstore test.
The SMP failure is likely due to this issue, but there may be other issues as well: https://github.com/includeos/IncludeOS/issues/2268
Modern Linux has some new system calls that may not be implemented. I don't know if that's related here, but a few that I can think of:
- clone3
- futex64_time-something
- futex with WAIT_BITSET and WAKE_BITSET
I've never really understood how to implement futex properly, and I suspect it's just not working right. Any code that leads to futex should be assumed to be problematic with newer libcs.
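For what it's worth, here's a hedged sketch of the minimal semantics a futex shim has to provide, including the BITSET variants mentioned above (constants are from Linux's uapi headers; the blocking is replaced with a spin, which is only a placeholder, and the function name/signature is hypothetical, not IncludeOS's actual stub):

```cpp
#include <cerrno>

// Linux futex operation constants (from <linux/futex.h>):
constexpr int FUTEX_WAIT         = 0;
constexpr int FUTEX_WAKE         = 1;
constexpr int FUTEX_WAIT_BITSET  = 9;
constexpr int FUTEX_WAKE_BITSET  = 10;
constexpr int FUTEX_PRIVATE_FLAG = 128;

extern "C" long sys_futex_sketch(volatile int* uaddr, int op, int val)
{
  op &= ~FUTEX_PRIVATE_FLAG;        // private/shared makes no difference here
  switch (op) {
  case FUTEX_WAIT:
  case FUTEX_WAIT_BITSET:
    // Contract: return immediately if *uaddr != val, else block until woken.
    // This sketch spins instead of blocking, which burns the CPU and only
    // makes progress if another core changes *uaddr.
    while (__atomic_load_n(uaddr, __ATOMIC_SEQ_CST) == val)
      __asm__ __volatile__("pause");
    return 0;
  case FUTEX_WAKE:
  case FUTEX_WAKE_BITSET:
    return 0;                       // nothing ever blocks here, nothing to wake
  default:
    return -ENOSYS;                 // unhandled ops must be reported, not ignored
  }
}
```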
I rebased on master and fixed the stack trace test to work with newer musl. With the latest fixes for SMP and the early calls into libc the tests are now passing. I tested both with SMP enabled and disabled.
$ ./test.sh
[...]
🌈✨ Everything is awesome ✨
Fantastic news! I'm adding cores to a few more integration tests to get a run where they not only have SMP enabled but also many cores to wake up. Will get back with review tomorrow.
Couldn't wait 😀 The extended test set from https://github.com/includeos/IncludeOS/pull/2276 passes without issues. Everything is indeed awesome 💪
The only lingering question for me is why we need to use the old malloc implementation - I couldn't find any good reasons for it when reading the code, but we can get to that later. I've spawned an issue for it here https://github.com/includeos/IncludeOS/issues/2277.