RIOT
RIOT copied to clipboard
Monitoring malloc usage
This is still far from ready to be merged, but before investing more time into it I would like to gather some feedback, see below.
Contribution description
This is a first attempt of a malloc monitoring implementation that is meant to be used for debugging purposes when working with (libraries that make use of) malloc and friends. Compared to the MALLOC_TRACING
module already available in RIOT, this is not designed to print the calls to stdout, but to keep track of the amount of memory that is currently dynamically allocated, as well as the all-time max. I was using this while playing with uzlib in #20293.
Discussion points
- Does such an implementation make sense at all? Or can someone think of another, easier way of monitoring the dynamic memory usage? I tried using
mallinfo()
but that didn't report the all-time max correctly. - This just hooks into
sys/malloc_thread_safe
which is afaik not used on all boards, and definitely not onnative
. Is there a list somewhere that shows which boards use which malloc implementation? Would it make sense to hook into all of them? - I would probably implement most of this in a separate module. In that case, do we want
MALLOC_TRACING
to stay around or should it be removed in favor of this implementation? Pinging ~~@benpicco~~ @maribu here since this was originally his contribution.
Testing procedure
None yet.
Cool! I'm almost certain that some standard c libs (looking avrlibc) do not have mallinfo anyway, so this would be a solution to add this globally.
What would also be quite useful is adding a backtrace to malloc_state_rm()
when called on an address not traced (e.g. to detect double-free).
In that case, do we want
MALLOC_TRACING
to stay around or should it be removed in favor of this implementation?
I found that quite useful to trace down a call to free()
with a wild pointer. If malloc_state_rm()
where to print the caller address when called on an address not traced, this would work even better for that use case, as it would only print the offending side.
But I guess it can still be useful to resort to highest output verbosity in some cases and the implementation is pretty low-key to maintain, so I guess it could just remain. Maybe just rename that to malloc_tracing_verbose
, so that the name malloc_tracing
would be available again?
This just hooks into
sys/malloc_thread_safe
which is afaik not used on all boards, and definitely not onnative
.
This is correct. What malloc_thread_safe
is trying to provide is thread-safety on a C lib that does not have reentrant syscalls. E.g. the authors avrlibc
never imagined someone would be running multiple threads on an AVR, so that will never be thread safe. But also for newlib and picolibc the overhead of enabling reentrant syscalls was prohibitive (at least as a default). Still, corruption and crashes (as concurrent calls to malloc()
previously caused) was not acceptable and this fixed the most pressing issue with relatively low overhead.
On native
calls to malloc()
are thread-safe anyway, so this doesn't wrap them. Enabling malloc_thread_safe
there anyway (as dependency for malloc_tracing
) should however not do any harm, other than a bit more overhead.
Thanks for the feedback! I will then go forward and clean this up.
What would also be quite useful is adding a backtrace to
malloc_state_rm()
when called on an address not traced (e.g. to detect double-free).
With backtrace you mean basically the same type of print that is used in malloc_tracing
currently? I can certainly add that.
But I guess it can still be useful to resort to highest output verbosity in some cases and the implementation is pretty low-key to maintain, so I guess it could just remain. Maybe just rename that to
malloc_tracing_verbose
, so that the namemalloc_tracing
would be available again?
If we want to have the option of high verbosity, I would rather offer a configuration option for the new module and integrate the prints there behind a compile-time flag. I would suppose that a configuration option for a module is nicer for the user than having to deal with two different (pseudo)modules.
I would have proposed malloc_monitor
as module name, how does that sound? I'd argue that it is monitoring the dynamic heap instead of tracing calls as malloc_tracing
is currently doing.
This just hooks into
sys/malloc_thread_safe
which is afaik not used on all boards, and definitely not onnative
.This is correct.
Are there any more boards where sys/malloc_thread_safe
is not used? Is enabling it always (also on boards other than native
) as a dependency of the new module a good option?
This is now ready for review.
The module malloc_monitor
contains the functionality described in the initial post and can be configured with Kconfig and the corresponding cflags wrt the maximum number of pointers that can be monitored at once and the verbosity. For verbose output, this completely replaces the functionality of the pseudo-module malloc_tracing
, which is therefore deprecated and scheduled for removal after 2024.07.
As discussed, malloc_monitor
builds on and therefore includes malloc_thread_safe
automatically. Shouldn't do any harm apart from some runtime overhead, but malloc_monitor
is anyway not meant to be used in production code.
Murdock results
:heavy_check_mark: PASSED
f31f82042a5310e3f5b23fdb9c4e940edff56436 sys: add malloc_monitor, deprecate malloc_tracing
Success | Failures | Total | Runtime |
---|---|---|---|
10031 | 0 | 10031 | 10m:50s |
Artifacts
The murdock failure for native64:llvm is apparently only a single example of all boards with TOOLCHAIN=llvm
failing, other locally failing boards are native
and nrf52840dk
.
It seems the wrapping functions for malloc and friends are not properly handled by clang, which however also means that the thread-safety wrappers are ignored by that toolchain. Interestingly, tests/sys/malloc_thread_safety
built with TOOLCHAIN=llvm
does succeed also on nrf52840dk
, where malloc_thread_safe
is pulled in by default.
I have to correct myself, it was a case of the compiler being smarter than the human again :P
Since the memory returned by malloc was never used, all calls to malloc and free where just silently optimized away. Fixed that now by declaring the memory as volatile for the sake of the test.
Yeah, we had a test for calloc()
returning NULL
when the multiplication was overflowing. LLVM notices that the requested memory was never actually used, so it optimized the allocation out and just claimed that the optimized out allocation succeeded (and calloc()
returned a non-NULL
pointer) and the test failed therefore.
./dist/tools/doccheck/check.sh x Error: reached end of file while inside a '```' block!
Unfortunately I can't reproduce this locally. What's the doxygen version that's used in the CI?
Thanks for the review!
One open issue I see is synchronization to avoid data races when accessing the monitor data.
Very good point o.O, thanks for pointing it out! Addressed with https://github.com/RIOT-OS/RIOT/pull/20363/commits/c5b35283725b79673f7909bbb915388b6ce5c892, please have a look :)
@maribu @gdoffe maybe you can have another look to include this in the next release :)
@maribu @gdoffe maybe you can have another look to include this in the next release :)
As a simple user of your work LGTM, good job ! :+1:
Thank you!