puncover Chasing callbacks/function pointers?

I'm using puncover for Zephyr firmware (via the built-in west build -t puncover... command) and have gotten a ton of mileage out of it for determining safe stack sizes for different threads. One issue that's tripped us a number of times, though, is function pointers and callbacks. As an example, we've got a driver with code that looks like this:

static void hwt901b_event_task(const struct device *dev) {
   [-- big while loop, this is the long-running function that a thread executes --]
        if (data->data_cb) {
[-- do a bunch of processing to populate a struct --]
            data->data_cb(dev, data->cb_user_data, &hwt901b);
        }

In this case, I totally get that puncover can't know how much stack the callback function will need (I mean... whole program analysis might be able to determine that for a given ELF file it is only ever set to one value, but that's asking a lot). The issue that we've run into is that puncover does report a definitive worst-case stack usage number, but that number neither includes the callback function nor includes any indication that there's an unknown amount of stack required (which would flag to us that we need to manually do that math).

I'm honestly interested in potentially implementing this functionality myself and would make a PR for it but I:

don't know how much the maintainers/community actually cares about this
haven't dug into the actual source yet to start figuring out what might need to be done to support this

If the answers to those are: "yes, please!" and "this seems like it wouldn't be too much work to get through" then let me know and I can start digging over the Christmas break here.

Dec 10 '24 20:12 tonyarkles

Hey, @tonyarkles, I am glad to read that you got value out of puncover !

I am currently not spending any time on this project, but thought about this use-case before when I was originally writing puncover in the context of Pebble. We had both an event loop and user callbacks for even handlers. If you want to spend the trying this, my loose idea on how to implement it was to "annotate" certain functions to create additional edges to the call graph.

A simple way to implement it could be a set of command line arguments or a configuration file that expresses these edges, e.g. func_a: [func_1, func_2, func_3], func_b: [func_2, func_5], ....

Unfortunately, it's tedious to maintain such a configuration next to the source code. I am sure there are ways to capture this closer to the source code, but since puncover operates on the ELF by parsing output from objdump the options are limited. Maybe you can introduce helper symbols, e.g. a const array that references the callees and uses a name that puncover recognizes to construct this map. In debug builds, this array could also be used to assert that every callback is present at runtime. Not great, I know.

Dec 11 '24 11:12 HBehrens

Hej! I am also very interested in this feature.

I loocked for an fitting example and found one from my Prusa mini 3D printer firmware. It has some callbacks used when a screen/window on the display is closed.

Here on the function page of gui_loop_until_dialog_closed it has 3 callees, all the callbacks hide behind std::function<void ()>::operator()() const (18). From what I understood on my printer that std::function goes out to 4 callbacks. My proposal to add those via command-line options first, like

parser.add_argument('--add-dynamic-callee', '--add_dynamic_callee', action='append', help="adds a dynamic callee to another function i.e. --add-dynamic-callee 'func_a->func_1' to add func_1 as a calee to func_a ")

So it would be possible to add those four callback at the cli with

--add-dynamic-callee 'gui_loop_until_dialog_closed->msg_box(MsgBoxType, string_view_utf8, std::array const&, MsgBoxDefaultButton)' \
--add-dynamic-callee 'gui_loop_until_dialog_closed->static_unique_ptr ScreenFactory::Screen()' \
--add-dynamic-callee 'gui_loop_until_dialog_closed->marlin_client::gcode(char const*)' \
--add-dynamic-callee 'gui_loop_until_dialog_closed->static_unique_ptr ScreenFactory::Screen()'

or with the configargparse from this PR with this config:

elf_file: ~/git/Prusa-Firmware-Buddy/build/firmware
build_dir: ~/git/Prusa-Firmware-Buddy/build
add-dynamic-callee: [
    gui_loop_until_dialog_closed->msg_box(MsgBoxType, string_view_utf8, std::array const&, MsgBoxDefaultButton)
    gui_loop_until_dialog_closed->static_unique_ptr ScreenFactory::Screen()
    gui_loop_until_dialog_closed->marlin_client::gcode(char const*)
    gui_loop_until_dialog_closed->static_unique_ptr ScreenFactory::Screen() 
]

What do you think of the format to add the callbacks/function pointers ^^?

Feb 15 '25 20:02 paulwuertz

As an more usable extension I could think of adding those to the webapp so one could dynamically add an callee in the table at the top of the function page by clicking an '+'-bottom at the end of all statically know ones like this:

If the cli version is done and you are interested I would open this as another issue.

And another issue then would be for me to export whatever was manually selected on the webapp as a cli config to run it afterwards directly without repeating the process of adding callbacks manually later on ^^

Feb 15 '25 20:02 paulwuertz

Added an PR https://github.com/HBehrens/puncover/pull/108 tackling the issue :)

Feb 16 '25 19:02 paulwuertz

Hey, @tonyarkles, I am glad to read that you got value out of puncover !

A ton of value! We've got relatively complex firmware running on low-memory embedded hardware and puncover has helped a ton for right-sizing our stacks to safely free up resources as much as possible. Thank you!

A simple way to implement it could be a set of command line arguments or a configuration file that expresses these edges, e.g. func_a: [func_1, func_2, func_3], func_b: [func_2, func_5], ....

Unfortunately, it's tedious to maintain such a configuration next to the source code. I am sure there are ways to capture this closer to the source code, but since puncover operates on the ELF by parsing output from objdump the options are limited. Maybe you can introduce helper symbols, e.g. a const array that references the callees and uses a name that puncover recognizes to construct this map. In debug builds, this array could also be used to assert that every callback is present at runtime. Not great, I know.

I haven't dug too much into the objdump output but based on the comments in this thread I'm going to try to find some time to dig a bit into how puncover is actually working under the hood. Off the top of your head, do you know if it's possible to tell from the underlying objdump output that there's a call to a function pointer (and hence an unknown quantity of stack usage)? If that's straightforward then it seems like maintaining the list of edges could at least get easier in a conservative way. By identifying a function pointer dereference and looking at the current edge map it could warn the user "hey, I don't know how much stack the caller will use because I don't know what functions this pointer could point at".

There is still the issue of detecting when that mapping changes, for example if you make a new callback function that replaces an old one. I know that, in general, this probably turns into the halting problem but it might also be possible to do some simpler analysis with some kind of directive like "the 2nd parameter passed to ublox_set_pvt_callback(dev, handle_pvt) will be called by ublox_parse_message()".

Thanks again for riffing on this even though it's not a tool you're actively working on. It really has been valuable for us and I'm hoping I'll get a chance to add a bit more value back to the project.

Mar 22 '25 16:03 tonyarkles