LLVM-embedded-toolchain-for-Arm
Picolibc release version
We've been running benchmarks to compare performance with other toolchains for Cortex-M55, and the compiler is performing pretty well on the actual compilation - clearly superior to GCC because of its MVE capabilities, if not at the heights of Arm Compiler.
But the overall benchmark results are coming out firmly below the Arm GCC package because of Picolibc, which is optimised for size rather than speed. 45% faster on vectorisable code doesn't compensate for being nearly 4 times slower on parsing operations.
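A rough model makes the trade-off concrete. The sketch below is not taken from the benchmark suite: the function names are hypothetical, the workload split is an assumed parameter, and it reads "45% faster" as time divided by 1.45 and "nearly 4 times slower" as time multiplied by 4.

```c
/* Relative runtime of a mixed workload versus a baseline toolchain
 * (1.0 means "same speed as the baseline", larger means slower).
 *
 * vec_fraction:   assumed share of baseline time spent in vectorisable
 *                 code (0..1); the rest is treated as parsing-style code.
 * vec_speedup:    1.45 models "45% faster" on the vectorisable part.
 * parse_slowdown: 4.0 models "4 times slower" on the parsing part. */
double relative_runtime(double vec_fraction, double vec_speedup,
                        double parse_slowdown)
{
    double parse_fraction = 1.0 - vec_fraction;
    return vec_fraction / vec_speedup + parse_fraction * parse_slowdown;
}

/* Share of vectorisable time at which the two toolchains break even.
 * Solves f / s + (1 - f) * p = 1 for f. */
double break_even_vec_fraction(double vec_speedup, double parse_slowdown)
{
    return (parse_slowdown - 1.0) / (parse_slowdown - 1.0 / vec_speedup);
}
```

Under these assumptions an even 50/50 split comes out roughly 2.3 times slower overall, and the vectorisable share would have to exceed about 90% of baseline time just to break even.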
There is a Meson configuration for picolibc, so could you expose an option and provide the speed-optimised variant in the multilibs?
https://github.com/picolibc/picolibc/issues/258 https://github.com/picolibc/picolibc/pull/259
Another issue we hit was "tinystdio", which made semihosting output intolerably slow - one character at a time. Overall we'd prefer a "newlib" equivalent config of picolibc to the "newlib-nano" equivalent it seems to default to.
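To illustrate why character-at-a-time output hurts so much over semihosting, here is a minimal host-side sketch. It is not picolibc code: `host_write` is a hypothetical stand-in for a semihosting write (on real hardware each such call traps to the debugger), and the buffer size is an arbitrary assumption. It only counts how many host calls each strategy makes.

```c
#include <stddef.h>

/* Hypothetical stand-in for a semihosting write. On a real target each
 * call traps to the debugger, which is the expensive part; here we only
 * count the calls. */
static unsigned long host_calls;

static void host_write(const char *data, size_t len)
{
    (void)data;
    (void)len;
    host_calls++;
}

unsigned long host_call_count(void) { return host_calls; }
void reset_host_call_count(void) { host_calls = 0; }

/* Unbuffered output: one host call per character. */
void put_unbuffered(const char *s)
{
    for (; *s != '\0'; s++)
        host_write(s, 1);
}

/* Line-buffered output: characters are collected and sent in one host
 * call when the buffer fills or a newline is seen. */
#define OUT_BUF_SIZE 64 /* assumed size, chosen only for illustration */
static char out_buf[OUT_BUF_SIZE];
static size_t out_len;

void flush_buffered(void)
{
    if (out_len > 0) {
        host_write(out_buf, out_len);
        out_len = 0;
    }
}

void put_buffered(const char *s)
{
    for (; *s != '\0'; s++) {
        out_buf[out_len++] = *s;
        if (out_len == OUT_BUF_SIZE || *s == '\n')
            flush_buffered();
    }
}
```

Writing a 13-character line costs 13 host calls through `put_unbuffered` and a single call through `put_buffered`, so the per-call trap overhead is paid once per line instead of once per character.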
Although maybe if newlib is being re-added in an upcoming release, then that can be the full library, and picolibc can remain in its "nano"/"tiny" config. But still, a release rather than minsize option for that "tiny" would be good. The minsize speed penalty is very big.
Hi Kevin, thank you for the detailed analysis and great input! Let us see what we can do for the upcoming LLVM 18 based release and get back here.
Another bit of input - having tried just the "release" version on the benchmark that was showing the worst performance hit (an XML parser), that alone didn't help that much. It turns out that the nano malloc implementation was actually having more of an effect.
Compiler (all at -O2) | Relative speed |
---|---|
LLVM 17.0.1 with minsize picolibc and nano malloc | 26% |
LLVM 17.0.1 with release picolibc and nano malloc | 28% |
LLVM 17.0.1 with release picolibc and standard malloc | 99% |
GCC 13.2.1 + newlib | 100% |
Arm Compiler 6.21 | 171% |
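For anyone wanting to check whether their own workload is allocator-bound in the same way, a small allocation-churn loop can be linked against each library variant and timed. This is only a sketch, not the XML parser benchmark used for the table above: the function names, the number of live blocks, and the size range are all assumptions, and it relies on `clock()` being available on the target.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define LIVE_NODES 256 /* assumed number of blocks kept alive at once */

/* Repeatedly replaces small blocks of varying size, loosely mimicking the
 * many short-lived allocations a parser makes. Returns the total number of
 * bytes successfully allocated so the work cannot be optimised away. */
size_t alloc_churn(unsigned rounds)
{
    void *live[LIVE_NODES] = {0};
    size_t total = 0;
    unsigned seed = 12345u; /* fixed seed keeps runs comparable */

    for (unsigned r = 0; r < rounds; r++) {
        for (unsigned i = 0; i < LIVE_NODES; i++) {
            /* Simple linear congruential step for pseudo-random sizes. */
            seed = seed * 1103515245u + 12345u;
            size_t size = 8 + (seed >> 16) % 120; /* 8..127 bytes */

            free(live[i]); /* free(NULL) is a no-op on the first round */
            live[i] = malloc(size);
            if (live[i] == NULL)
                continue;
            memset(live[i], 0, size);
            total += size;
        }
    }
    for (unsigned i = 0; i < LIVE_NODES; i++)
        free(live[i]);
    return total;
}

/* Wall time in seconds for the churn loop, as reported by clock(). */
double time_alloc_churn(unsigned rounds)
{
    clock_t start = clock();
    volatile size_t sink = alloc_churn(rounds);
    (void)sink;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling `time_alloc_churn` with the same round count in builds that differ only in the malloc implementation gives a rough indication of how much of a gap like the one above comes from the allocator.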
https://github.com/ARM-software/LLVM-embedded-toolchain-for-Arm/pull/441 added an option to select between minsize and release builds, and provided release by default for bigger cores. For release builds, fast malloc is also enabled.
The release version of picolibc has been added and the newlib package is now part of the release, so I will close this issue.