feat: track callers in panicking functions
This implements a feature similar to the `#[track_caller]` attribute in Rust or the `HasCallStack` typeclass in GHC Haskell.
- The panicking primitives `panicWithPosWithDecl` and `panicWithPos` are changed to `StackInfo.panic`, which bundles the module/decl/position information into a struct `StackInfo`. `panicWithInfo` is a wrapper over `StackInfo.panic` which accepts the `StackInfo` by auto param, using the `stack_info` tactic.
- The `stack_info` tactic is a macro for `first | assumption | exact stack_info_here`, a poor man's typeclass inference with a tactic fallback.
- The `stack_info_here` term macro expands to a literal containing the declaration, module, and line information for the application in which it is called.
- `panic!` used to synthesize the decl/module/line info and call `panicWithPosWithDecl` and `panicWithPos`; now it is just a trivial wrapper over `panicWithInfo`, whose auto param does the equivalent thing.
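Concretely, the pieces fit together roughly as in the following sketch. The names match the PR, but the signatures and the implementation of `stack_info_here` below are illustrative assumptions, not the PR's literal code:

```lean
import Lean

-- Illustrative sketch only; the PR's actual definitions may differ.
structure StackInfo where
  module : String
  decl   : String
  line   : Nat

def StackInfo.panic [Inhabited α] (info : StackInfo) (msg : String) : α :=
  panic s!"{info.module}.{info.decl}:{info.line}: {msg}"

open Lean Elab Term in
/-- Expands to a `StackInfo` literal for the position at which it occurs. -/
elab "stack_info_here" : term => do
  let pos  := (← getFileMap).toPosition ((← getRef).getPos?.getD ⟨0⟩)
  let decl := (← getDeclName?).getD .anonymous
  let md   := toString (← getMainModule)
  return mkAppN (mkConst ``StackInfo.mk)
    #[mkStrLit md, mkStrLit decl.toString, mkNatLit pos.line]

-- Poor man's typeclass inference: take a `StackInfo` from the local context
-- if one is available, otherwise synthesize it on the spot.
macro "stack_info" : tactic =>
  `(tactic| first | assumption | exact stack_info_here)

def panicWithInfo [Inhabited α] (msg : String)
    (info : StackInfo := by stack_info) : α :=
  info.panic msg
```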
The upshot is that when you call `panic!`, it now either synthesizes the position on the spot or gets it from an argument of type `StackInfo` if one is in the context. Even more importantly, functions with an `(info : StackInfo := by stack_info)` auto param will also do the same thing, meaning that it's possible to bubble up panic blame.
Generally speaking, we want to put these `(info : StackInfo := by stack_info)` auto params on any function that panics due to a precondition violation rather than a bug in the code itself. Helpfully, these are all marked with a `!` in the code (like `Option.get!`), so they are easy to find.
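As a hypothetical example (reusing the sketch above), such a function would look like this:

```lean
-- Hypothetical: a panicking function that forwards panic blame to its caller.
def head! [Inhabited α] (xs : List α) (info : StackInfo := by stack_info) : α :=
  match xs with
  | a :: _ => a
  -- `stack_info`'s `assumption` branch finds `info` in the local context here,
  -- so the reported position is the caller's, not this line.
  | []     => panicWithInfo "head! of empty list"
```

A call like `head! ([] : List Nat)` would then report the position of that call rather than the position of the `panicWithInfo` inside `head!`.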
I would have liked to use a typeclass like GHC does instead of an auto param. The inference behavior of typeclasses is much closer to what we want here: auto params have to go at the end of the declaration, which is awkward, and they are also a lot more verbose than a `[HasCallStack]`-style instance argument would be. But currently there is no way to trigger a tactic from a typeclass search, or even from a typeclass search failure, as this case would require.
Question: Do we still need the `panic!` macro? In this PR it is completely trivial: `panic! $arg` expands to `panicWithInfo $arg`, which is just a regular definition (sketched below). You might think it is needed because of interpolation, but actually it doesn't even do that: you have to write `panic! s!"..."` if you want interpolation. Should I make it do interpolation? It would be more consistent that way. (Maybe it doesn't support interpolation for some reason to do with staging, since `panic` is available quite early, but it was refactored in this PR to come a bit later, so that is probably not an issue anymore.)
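For reference, the trivial expansion in question would be approximately this (a sketch, not necessarily the PR's exact source):

```lean
-- Approximately the whole `panic!` macro in this PR: plain sugar over
-- `panicWithInfo`, whose auto param supplies the position information.
macro "panic! " msg:term : term => `(panicWithInfo $msg)
```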
We want to make string interpolation the default anyhow: #407
This PR now implements `panic!` interpolation as well. I cleaned up the commits, and it has a whopping three `update-stage0`s in it, so it's likely to rot quickly.
I am very concerned about this PR. It increases the size of the generated code quite a bit (see some of the stage0 `.c` files). I didn't carefully investigate the issue (nor do I have time to), but the problem seems to be that extra code has to be emitted at every call site using this new approach. We also currently don't have a mechanism for controlling code generation for debug vs. release builds. BTW, I am very happy with the call-stack dump at panic that was recently added. I think we can survive with that feature until we add better support for debugging. I am marking this PR to be discussed at the next dev meeting.
Can you run the benchmark? I did anticipate that code-size growth could be an issue, since there are now many more `_closed_n` terms, corresponding to all the new places errors can be thrown from. AFAICT that's a fundamental problem here: panics used to occur in only a few places, like `Option.get!` itself; now every use of `get!` gets its own panic location information, and that info has to be stored in the binary. If we ever get debug info, this problem will be much worse.
An alternative solution would be to just flag how many call frames to drop from the call stack when reporting the panic, but this would (1) require us to have the complete call stack and (2) require that the function of interest has not been optimized out. I don't think we can reliably guarantee either one (and we can do (1) only on Linux). Having better and more "official" stack introspection would be hugely helpful here, but that seems like a relatively long-term goal compared to this implementation, which is one of the few things we can do to improve the debugging experience today.
!bench
Here are the benchmark results for commit be81b562ab6117987f0f5c5b2e956d9eba1ca860. There were significant changes against commit fd5f3a5bad48e48ecdfdea22d96c9a13753a9d78:
| Benchmark | Metric | Change |
| --- | --- | --- |
| `libleanshared.so` | binary size | 3% |
| `rbmap_library` | maxrss | 1% (23.4 σ) |
| `stdlib` | instructions | 4% (2664.3 σ) |
| `stdlib` | maxrss | 3% (41.7 σ) |
| `stdlib` | task-clock | 3% (35.3 σ) |
| `stdlib` | size bytes `.olean` | 4% |
| `stdlib` | size lines C | 4% |
| `tests/bench/` interpreted | instructions | 1% (589.7 σ) |
| `tests/bench/` interpreted | maxrss | 1% (15.7 σ) |
| `tests/compiler` | sum binary sizes | 2% |
One way to decrease the cost of this in Lean itself is to have an option, `set_option callerInfo false`, to be set inside Lean, which would make the `caller_info` macro use the caller info of the function being called (i.e. `get!` or `find!`) instead of a newly synthesized one. Also, the whole thing can go behind a thunk and some auxiliary functions, to avoid doing anything other than passing a pointer on the hot path (see the sketch below). I'd really like some way to have the default user experience get this debug info, though, even if it doesn't outweigh the performance implications inside the Lean implementation.
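To make the thunk idea concrete, here is a minimal sketch reusing the `StackInfo` sketch from above (the name `panicWithInfoLazy` is made up for illustration):

```lean
-- Hypothetical: the hot path passes only a closure pointer; the `StackInfo`
-- is materialized only if the panic actually fires.
def panicWithInfoLazy [Inhabited α] (msg : String) (info : Unit → StackInfo) : α :=
  (info ()).panic msg

-- A call site would pass e.g. `fun _ => stack_info_here`; since that body is
-- a closed term, it compiles to passing a constant closure pointer.
```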
I'm going to stop rebasing this PR until we come to a consensus about the direction to go with it.