RIOT icon indicating copy to clipboard operation
RIOT copied to clipboard

build system: use thread-safe stdio

Open maribu opened this issue 8 months ago • 17 comments

Contribution description

build system: introduce bug_% feature category

This introduces a new feature category to declare which bugs are present in a given build that we cannot fix in RIOT, but need to work around. These bugs may be silicon bugs, software bugs in ROM, software bugs in binary blobs needed for a platform, or bugs in the toolchain.

These features behave the same way as arch_% features: Every provided bug is always used, so that we can inspect FEATURES_USED to check which bugs need to be worked around.

build system: use thread-safe stdio

This makes use of the new bug modeling to declare all platforms that can use newlib and have no reentrancy hooks as affected by the non-thread-safe stdio bug. (Which is every platform but ESP* and AVR, the former because the reentrancy hooks are provided, the latter because we do not and never will support newlib on them.)

Building on that, the mpaland-printf package is used when newlib is used and the bug is present. This way we can rely on the stdio being thread-safe on every platform and not causing random crashes at run time.

sys/newlib: drop workaround for stdio

This drops a workaround that initialized newlib's reentrancy structure on boot to reduce the chances of crashes when using the non-thread-safe (unless reentrancy hooks are provided) stdio implementation of newlib.

Now that the newlib stdio implementation is only ever used if it is thread-safe, we no longer need a workaround that reduces the chance of crashes on concurrent use of stdio.

Testing procedure

Now the package mpaland-printf should be pulled in for all non-ESP boards when newlib is used. E.g.

for board in hifive1b same54-xpro native32 native64 arduino-mega2560 esp32-mh-et-live-minikit esp8266-esp-12x z1 nucleo-l011k4; do
    printf '\n%s\n===================\n' $board
    echo "Features:"
    make BOARD=$board --no-print-directory -C examples/basic/hello-world info-features-used | grep -E 'newlib|libc'
    echo "Packages:"
    make BOARD=$board --no-print-directory -C examples/basic/hello-world info-packages
done
hifive1b
===================
Features:
bug_newlib_broken_stdio
newlib
Packages:
mpaland-printf

same54-xpro
===================
Features:
bug_newlib_broken_stdio
newlib
Packages:
cmsis
mpaland-printf

native32
===================
Features:
Packages:

native64
===================
Features:
Packages:

arduino-mega2560
===================
Features:
Packages:

esp32-mh-et-live-minikit
===================
Features:
newlib
Packages:
esp32_sdk

esp8266-esp-12x
===================
Features:
newlib
Packages:
esp8266_sdk

z1
===================
Features:
bug_newlib_broken_stdio
Packages:
mpaland-printf

nucleo-l011k4
===================
Features:
bug_newlib_broken_stdio
picolibc
Packages:
cmsis

With this, the test in tests/sys/snprintf now should also pass for each and every board, except for ESP32 boards. The reason is that, since ESP* provides the reentrancy hooks, newlib's stdio is thread-safe on those. So we do not pull in the alternative printf. For ESP8266 boards newlib is configured to use the non-nano stdio, which passes the test. For ESP32 newlib-nano is used, which does not implement all format specifiers and, therefore, fails the test.

Issues/PRs references

Depends on and includes:

  • [x] https://github.com/RIOT-OS/RIOT/pull/21439

maribu avatar Apr 25 '25 06:04 maribu

Murdock results

:heavy_check_mark: PASSED

4a84f6f58e47c330c813a5eba5ab6efe45d6c64c tests/sys/snprintf: use printf_long_long

Success Failures Total Runtime
10536 0 10536 20m:02s

Artifacts

riot-ci avatar Apr 25 '25 10:04 riot-ci

We should probably give this a full build without fast fail, otherwise you'll be chasing your tail for days 😅

crasbe avatar Apr 25 '25 11:04 crasbe

We should probably give this a full build without fast fail, otherwise you'll be chasing your tail for days 😅

No, actually the memory requirements will go down with this. Unless... both printf variants get linked in into the same app.

See https://github.com/RIOT-OS/RIOT/pull/21439 for a fix of that.

maribu avatar Apr 25 '25 11:04 maribu

in one of your commit messages and in the description of this PR you have every misspelled.

This way we can rely on the stdio being thread-safe on evyer platform and not causing random crashes at run time.

Enoch247 avatar Apr 26 '25 00:04 Enoch247

For MSP430 both .text and .data slightly increase with this. This is not expected and needs to be investigated first.

maribu avatar Apr 27 '25 08:04 maribu

Ah, 64 bit math is expensive on MSP430. Disabling support for long long and .data stays the same and .text goes down when switching to mpaland-printf.

I was hoping that mpaland-printf would be smaller for all platforms while supporting all features. But maybe support for printing long long should be a module to not cause regressions for MSP430. It could still be a default module on 32 bit and 64 bit platforms, were support for long long is super cheap.

maribu avatar Apr 27 '25 09:04 maribu

Btw: you can use the built-in function to compare Build Sizes: https://doc.riot-os.org/advanced-build-system-tricks.html#comparing-build-sizes

Someone (👀👀) recently fixed it :D

crasbe avatar Apr 27 '25 09:04 crasbe

I didn't want to install the RISC-V and ESP32 toolchains as well, so this is just a smaller set.

cbuec@W11nMate:~/RIOTstuff/riot-mpaland/RIOT/examples/basic/hello-world$ OLDBIN=master NEWBIN=mpaland make info-buildsizes-diff
using BOARD="native64" as "native" on a 64-bit system
text    data    bss     dec     BOARD/BINDIRBASE

22      0       0       22      arduino-mega2560
5532    118     904     6554    master
5554    118     904     6576    mpaland

24      0       0       24      native64
27566   1080    59128   87774   master
27590   1080    59128   87798   mpaland

20      0       0       20      nucleo-l011k4
5996    24      1772    7792    master
6016    24      1772    7812    mpaland

-3188   -108    -8      -3304   same54-xpro
12632   116     2308    15056   master
9444    8       2300    11752   mpaland

crasbe avatar Apr 27 '25 10:04 crasbe

I think there is missing a RIOT_CI_BUILD=1,as the longer version (which includes the branch name unless it is master) is probably the cause for the size diff on AVR. (avrlibc's stdio is thread safe naturally and better optimized for 8 bit platforms than mpaland-printf, so sticking with the native printf there makes sense and is done here.) Same with the nucleo-l011k, which only supports picolibc. And on picolibc the native printf is also excellent.

maribu avatar Apr 27 '25 10:04 maribu

Yes, with RIOT_CI_BUILD=1 the sizes are identical for the other boards.

crasbe avatar Apr 27 '25 10:04 crasbe

https://github.com/RIOT-OS/RIOT/pull/21445 now makes 64 bit printing support optional. It makes that support opt-in for MSP430 (due it being expensive there), and opt-out for 32-bit systems (due to it being cheap there).

maribu avatar Apr 27 '25 21:04 maribu

I'd like to do some more testing before hitting the green button on this one. There seems to be something off on RISC-V with this PR, at least with the BUILD_IN_DOCKER=1 toolchain.

maribu avatar Apr 28 '25 19:04 maribu

OK, on RISC-V enabling support for long long printing is also not almost free. I dropped the soft dependency on printf_long_long from arch32_bit (turning opt-out to opt-in).

The soft dependency from arch64_bit to printf_long_long was also not correct: Without printf_long_long on 64-bit, %p will not work. Hence, 64-bit platforms need a hard dependency on printf_long_long.

Finally, on ARM (both legacy and Cortex M) I could not see any difference in memory with printf_long_long enabled. Would be nice to understand the background of this first before merging it. But if that is expected behavior, than enabling printf_long_long by default on ARM makes sense, as it adds more features for no cost.

maribu avatar May 02 '25 07:05 maribu

Would be nice to understand the background of this first before merging it. But if that is expected behavior, than enabling printf_long_long by default on ARM makes sense, as it adds more features for no cost.

Are you still wanting to investigate this before merging? Should this PR be marked as draft?

Enoch247 avatar May 08 '25 00:05 Enoch247

Are you still wanting to investigate this before merging?

Just did and it looks like human error. On ARM it also safes ~1K of .text, so it behaves as expected.

I'll change the default to consistently only support long long when disabled via the printf_long_long module; with the one exception of 64 bit systems.

maribu avatar May 10 '25 14:05 maribu

I think this is ready to get merged :)

maribu avatar May 15 '25 09:05 maribu

I traced down the issue of the increased ROM size for the wolfSSL example to the fact that two assert() implementations are used in it: The newlib's and the RIOT one.

I added a commit that wraps the newlib's __assert_func() to RIOT's assertion failure handlers for more consistent behavior. In addition this will result in that assert() not pulling in low-level printf() functions from newlib even when newlib's stdio is wrapped.

maribu avatar May 27 '25 21:05 maribu

This should now finally pass the CI

maribu avatar Aug 20 '25 11:08 maribu

We should probably do a full build with no fast fail before trying to merge again.

crasbe avatar Aug 20 '25 11:08 crasbe

But hitting the merge button would also do a full build. Doing a full build prior would just cause the 7 days of CPU time for a full build cycle to be spent twice on the same code.

maribu avatar Aug 20 '25 11:08 maribu

Ah sorry for the noise, that's probably the removed printf_long_long support.

Should we rather add printf_long_long to the test application or feature-gate test_int64?

mguetschow avatar Aug 21 '25 08:08 mguetschow

Ah sorry for the noise, that's probably the removed printf_long_long support.

No, its IMO not noise. A test failing with default configuration is IMO a genuine bug :)

Should we rather add printf_long_long to the test application or feature-gate test_int64?

I did both. There might be users with tiny boards that can benefit from easily being able to disable printf_long_long in the test, but having the larger test coverage by default is probably better.

maribu avatar Aug 21 '25 08:08 maribu

Please squash :)

mguetschow avatar Aug 21 '25 10:08 mguetschow

Yay :tada:

Thanks everyone for bearing with me on this one :heart:

maribu avatar Aug 21 '25 16:08 maribu

This resulted in some failing nightly tests, as mpaland-printf prints %p without leading 0x. See the unmerged PR over at https://github.com/mpaland/printf/pull/90.

As the printing format is indeed implementation-defined, the respective tests should probably be adapted to accept output both with and without 0x.

mguetschow avatar Aug 22 '25 08:08 mguetschow

Another problem, at least on my machine:

$ make -C tests/sys/heap_cmd BOARD=nrf52840dk all
make: Entering directory '/home/mikolai/TUD/Code/RIOT/tests/sys/heap_cmd'
Building application "tests_heap_cmd" for "nrf52840dk" with CPU "nrf52".

"make" -C /home/mikolai/TUD/Code/RIOT/pkg/cmsis/ 
"make" -C /home/mikolai/TUD/Code/RIOT/pkg/mpaland-printf/ 
"make" -C /home/mikolai/.riot/pkg/mpaland-printf -f /home/mikolai/TUD/Code/RIOT/pkg/mpaland-printf/mpaland-printf.mk
"make" -C /home/mikolai/TUD/Code/RIOT/boards
"make" -C /home/mikolai/TUD/Code/RIOT/boards/common/init
"make" -C /home/mikolai/TUD/Code/RIOT/boards/nrf52840dk
"make" -C /home/mikolai/TUD/Code/RIOT/boards/common/nrf52xxxdk
"make" -C /home/mikolai/TUD/Code/RIOT/core
"make" -C /home/mikolai/TUD/Code/RIOT/core/lib
"make" -C /home/mikolai/TUD/Code/RIOT/cpu/nrf52
"make" -C /home/mikolai/TUD/Code/RIOT/cpu/cortexm_common
"make" -C /home/mikolai/TUD/Code/RIOT/cpu/cortexm_common/periph
"make" -C /home/mikolai/TUD/Code/RIOT/cpu/nrf52/periph
"make" -C /home/mikolai/TUD/Code/RIOT/cpu/nrf52/vectors
"make" -C /home/mikolai/TUD/Code/RIOT/cpu/nrf5x_common
"make" -C /home/mikolai/TUD/Code/RIOT/cpu/nrf5x_common/periph
"make" -C /home/mikolai/TUD/Code/RIOT/drivers
"make" -C /home/mikolai/TUD/Code/RIOT/drivers/periph_common
"make" -C /home/mikolai/TUD/Code/RIOT/sys
"make" -C /home/mikolai/TUD/Code/RIOT/sys/auto_init
"make" -C /home/mikolai/TUD/Code/RIOT/sys/div
"make" -C /home/mikolai/TUD/Code/RIOT/sys/isrpipe
"make" -C /home/mikolai/TUD/Code/RIOT/sys/libc
"make" -C /home/mikolai/TUD/Code/RIOT/sys/malloc_thread_safe
"make" -C /home/mikolai/TUD/Code/RIOT/sys/newlib_syscalls_default
"make" -C /home/mikolai/TUD/Code/RIOT/sys/preprocessor
"make" -C /home/mikolai/TUD/Code/RIOT/sys/ps
"make" -C /home/mikolai/TUD/Code/RIOT/sys/shell
"make" -C /home/mikolai/TUD/Code/RIOT/sys/shell/cmds
"make" -C /home/mikolai/TUD/Code/RIOT/sys/stdio
"make" -C /home/mikolai/TUD/Code/RIOT/sys/stdio_uart
"make" -C /home/mikolai/TUD/Code/RIOT/sys/test_utils/interactive_sync
"make" -C /home/mikolai/TUD/Code/RIOT/sys/test_utils/print_stack_usage
"make" -C /home/mikolai/TUD/Code/RIOT/sys/tsrb
/usr/lib/gcc/arm-none-eabi/12.2.1/../../../arm-none-eabi/bin/ld: /usr/lib/gcc/arm-none-eabi/12.2.1/../../../arm-none-eabi/lib/thumb/v7e-m+fp/hard/libc_nano.a(lib_a-mstats.o): in function `_mstats_r':
/home/pere/src/newlib-salsa/build_nano/arm-none-eabi/thumb/v7e-m+fp/hard/newlib/libc/stdlib/../../../../../../../../newlib/libc/stdlib/mstats.c:121: undefined reference to `__wrap_fiprintf'
/usr/lib/gcc/arm-none-eabi/12.2.1/../../../arm-none-eabi/bin/ld: /home/pere/src/newlib-salsa/build_nano/arm-none-eabi/thumb/v7e-m+fp/hard/newlib/libc/stdlib/../../../../../../../../newlib/libc/stdlib/mstats.c:121: undefined reference to `__wrap_fiprintf'
/usr/lib/gcc/arm-none-eabi/12.2.1/../../../arm-none-eabi/bin/ld: /usr/lib/gcc/arm-none-eabi/12.2.1/../../../arm-none-eabi/lib/thumb/v7e-m+fp/hard/libc_nano.a(lib_a-nano-mallstatsr.o): in function `_malloc_stats_r':
/home/pere/src/newlib-salsa/build_nano/arm-none-eabi/thumb/v7e-m+fp/hard/newlib/libc/stdlib/../../../../../../../../newlib/libc/stdlib/nano-mallocr.c:523: undefined reference to `__wrap_fiprintf'
/usr/lib/gcc/arm-none-eabi/12.2.1/../../../arm-none-eabi/bin/ld: /home/pere/src/newlib-salsa/build_nano/arm-none-eabi/thumb/v7e-m+fp/hard/newlib/libc/stdlib/../../../../../../../../newlib/libc/stdlib/nano-mallocr.c:525: undefined reference to `__wrap_fiprintf'
/usr/lib/gcc/arm-none-eabi/12.2.1/../../../arm-none-eabi/bin/ld: /home/pere/src/newlib-salsa/build_nano/arm-none-eabi/thumb/v7e-m+fp/hard/newlib/libc/stdlib/../../../../../../../../newlib/libc/stdlib/nano-mallocr.c:527: undefined reference to `__wrap_fiprintf'
collect2: error: ld returned 1 exit status
make: *** [/home/mikolai/TUD/Code/RIOT/tests/sys/heap_cmd/../../../Makefile.include:755: /home/mikolai/TUD/Code/RIOT/tests/sys/heap_cmd/bin/nrf52840dk/tests_heap_cmd.elf] Error 1
make: Leaving directory '/home/mikolai/TUD/Code/RIOT/tests/sys/heap_cmd'

https://github.com/RIOT-OS/RIOT/blob/c0cc617c7f6addd35ee8d2e827f768eabcf6b83b/pkg/mpaland-printf/Makefile.include#L16-L25

suggests that this is failing on purpose, but I'm not sure what to make out of this. Apparently that also didn't happen on CI for some reason (maybe it was not properly rebuilding all modules for the test?).

mguetschow avatar Aug 22 '25 09:08 mguetschow

As the printing format is indeed implementation-defined, the respective tests should probably be adapted to accept output both with and without 0x.

There we go: https://github.com/RIOT-OS/RIOT/pull/21677

mguetschow avatar Aug 22 '25 09:08 mguetschow

Hey, commit build system: use thread-safe stdio introduced a severe performance penalty. Together with the penalty of slipmux, this makes the RIOT shell basically unusable slow (think >7s for running ps). I haven't measured yet how bad the issue is when not using slipmux, I suspect it's still so fast the human eye doesn't realize it otherwise this would have been spotted earlier. There is also the option that this is specific in conjunction to using slipmux but at the moment that seems unlikely to me but further digging is needed.

The issue immediatly goes away when disabling bug_newlib_broken_stdio for my board, so it originates in mpaland-printf.

Teufelchen1 avatar Oct 13 '25 10:10 Teufelchen1

Yes, mpaland-printf sends one char out at a time, as it does not do buffering. With a network as output, unbuffered output makes little sense.

maribu avatar Oct 13 '25 10:10 maribu

@maribu do you have a rough idea how to approach this issue? I feel a bit lost to find a starting point.

Teufelchen1 avatar Oct 13 '25 10:10 Teufelchen1