esp-hal
Relax some of the restrictions of the `#[ram]` macro
See https://github.com/esp-rs/esp-hal/pull/448#issuecomment-1535446796 and https://github.com/esp-rs/esp-hal/pull/448#issuecomment-1535979574 for more details.
You'd written:
> The idea behind the #[ram] macro is for the "initial" function that should be placed into ram, after that any functions that might get called inside there, should probably just have #[inline(always)].
And I agree that this would be a very good idea! I'm fairly sure that's not what #[ram] currently does, though: it seems to apply only to the function it's placed on and not to the callees, at least every time I've tried it. I apologize for not reaching out about that sooner; I was hoping I'd find a nice solution for it, but instead I am merely burdened with a bunch of context about things I've tried :upside_down_face:
So far I've been following the callees and inlining/relocating them by inspection; originally at the Rust source level and most recently by scrutinizing disassembly. As you might expect, this is a lot of effort for a non-trivial error rate, and it's a process that more or less needs to restart from scratch every time. Plus, the effects that might trigger a change are extremely non-local, especially if you've turned on LTO as we have.
Suffice to say, I'd definitely like a #[ram] macro that walked the whole graph! I didn't see an obvious way to implement it, though, since proc macros appear to operate mainly over tokens? I wasn't sure even who to ask to walk the entire call graph, but I suspect a macro running as part of the compiler would need to go pretty severely "out of bounds" to do so, because rustc is only (intended to be) looking at a single crate at a time.
I did a (brief) survey to see if I could find other approaches in the wild; I can't find my notes on it at the moment (probably for the best, heh), but the most promising thing I found was https://github.com/japaric/cargo-call-stack : it claims to be able to produce a .dot for the call graph of an entire program, which'd be a very useful reference (implementation or result)! It didn't work right away on our binary, and I haven't gotten back yet to see if I can figure out why, but that at least felt like I was on the train into the right city for a solution, if not quite yet arriving at a destination.
> Suffice to say, I'd definitely like a #[ram] macro that walked the whole graph! I didn't see an obvious way to implement it, though, since proc macros appear to operate mainly over tokens? I wasn't sure even who to ask to walk the entire call graph, but I suspect a macro running as part of the compiler would need to go pretty severely "out of bounds" to do so, because rustc is only (intended to be) looking at a single crate at a time.
I would love this too; however, I think this would need compiler support, perhaps a mix of both rustc and LLVM. I think we could maybe aim for a tool to check that all code called within the #[ram] macro is actually in ram; this would at least save you the trouble of manually inspecting the ELF. I believe there is some prior art in the esp-idf tooling.
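For illustration, the core of such a checking tool could be a plain reachability walk. The sketch below is only a rough outline under stated assumptions: the `CallGraph` type, the `callees_outside_ram` function, and the idea that the set of RAM-resident symbols has already been extracted from the ELF are all hypothetical, and none of this is existing esp-hal or esp-idf tooling.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical call-graph representation: caller symbol -> direct callees.
type CallGraph = BTreeMap<String, Vec<String>>;

/// Walks every function reachable from `root` (including `root` itself) and
/// returns, in sorted order, those that are NOT in the `in_ram` set.
/// `in_ram` is assumed to have been collected elsewhere, e.g. from the
/// symbols that ended up in the RAM text section of the ELF.
fn callees_outside_ram(
    graph: &CallGraph,
    root: &str,
    in_ram: &BTreeSet<String>,
) -> Vec<String> {
    let mut seen: BTreeSet<String> = BTreeSet::new();
    let mut stack = vec![root.to_string()];

    // Iterative depth-first walk; `seen` also guards against cycles.
    while let Some(func) = stack.pop() {
        if !seen.insert(func.clone()) {
            continue; // already visited
        }
        if let Some(callees) = graph.get(&func) {
            stack.extend(callees.iter().cloned());
        }
    }

    seen.into_iter().filter(|f| !in_ram.contains(f)).collect()
}
```

An empty result would mean everything reachable from the `#[ram]` entry point is in RAM; anything else is a list of symbols to inline or relocate. Indirect calls (function pointers, `dyn`) would only be covered to the extent that the call graph itself captures them.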
> I did a (brief) survey to see if I could find other approaches in the wild; I can't find my notes on it at the moment (probably for the best, heh), but the most promising thing I found was https://github.com/japaric/cargo-call-stack : it claims to be able to produce a .dot for the call graph of an entire program, which'd be a very useful reference (implementation or result)! It didn't work right away on our binary, and I haven't gotten back yet to see if I can figure out why, but that at least felt like I was on the train into the right city for a solution, if not quite yet arriving at a destination.
I never thought about using cargo-call-stack for this purpose, but it would indeed be very useful in this case! I know it can get confused by some types, particularly dyn types, but if you're using dyn in an interrupt then you probably don't care (or don't know enough to care) about the performance hit. I'd be interested to see where this goes. AFAIK the esp-idf tooling is more "fuzzy"; having an actual call graph to work with would be incredible.
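As a rough sketch of what "having an actual call graph to work with" might look like, the snippet below reads edges out of a .dot file into a caller-to-callees map. It assumes a deliberately simplified DOT shape, one `"caller" -> "callee"` edge per line with names that contain neither `->` nor `[`; the real cargo-call-stack output may well differ (for example, by using numbered nodes with labels), and `parse_dot_edges` is a hypothetical helper, not part of any existing tool.

```rust
use std::collections::BTreeMap;

/// Parses edge lines of the (assumed, simplified) form
/// `"caller" -> "callee";` into a map from caller to direct callees.
/// Lines without `->` (graph header, closing brace, node declarations)
/// are skipped.
fn parse_dot_edges(dot: &str) -> BTreeMap<String, Vec<String>> {
    let mut graph: BTreeMap<String, Vec<String>> = BTreeMap::new();

    for line in dot.lines() {
        let line = line.trim().trim_end_matches(';');
        let Some((lhs, rhs)) = line.split_once("->") else {
            continue; // not an edge line
        };
        // Drop a trailing attribute list such as `[style=dashed]`.
        let rhs = rhs.split('[').next().unwrap_or(rhs);

        let caller = lhs.trim().trim_matches('"');
        let callee = rhs.trim().trim_matches('"');
        if caller.is_empty() || callee.is_empty() {
            continue;
        }
        graph
            .entry(caller.to_string())
            .or_default()
            .push(callee.to_string());
    }
    graph
}
```

A map like this could then be walked from the `#[ram]` entry point to list everything it can reach. A real implementation would more likely use a proper DOT parser, since Rust symbol names can contain characters that break this line-based approach.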