RFCs
Demangle Symbols in Debuggers (LLDB, GDB)
Summary
Related issue which is closed: https://github.com/nim-lang/Nim/issues/8596
- Nim can be debugged with LLDB (or GDB)
- Name mangling causes UX issues with debugging in LLDB and GDB by requiring you to refer to Nim symbols in their mangled form.
- The suggested workaround is quite hacky. For variables, you have to print all local or global variables, then scan the GUI or terminal output for the one you want. For other symbols, such as setting a breakpoint on a function, I can see this being quite frustrating.
- Preferably we would not have to refer to names in their mangled form and could, e.g., write print x rather than print x_<nim-specific-mangle>
- This is a common problem, as noted from the forums:
- https://forum.nim-lang.org/t/9735
- https://forum.nim-lang.org/t/9906
Description
Here are my findings from researching LLDB. I have not researched GDB. I am posting them here in case others want to implement this, or can spot something I have missed in my proposed solution.
For LLDB, one needs to:
- (Required) Let LLDB know how to identify the mangling scheme & how to de-mangle a symbol
- (Optional) Implement a Language plugin for deeper LLDB integration
References:
- For D: https://reviews.llvm.org/D110578
- For Swift: Apple's fork of LLDB implements a Language plugin for Swift, along with code for identifying and demangling Swift's mangling scheme.
From reading the source: LLDB needs a mangling scheme that can be told apart from all others, along with code to de-mangle it. All mangling schemes used by other languages/compilers (C++/Itanium, C++/MSVC, D, Rust) use a prefix to identify which scheme/compiler produced the name.
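To make the prefix idea concrete, here is a minimal sketch in Python (not LLDB's actual C++ code) of classifying a symbol by its leading characters. The prefixes are the well-known ones for each scheme: `_Z` for Itanium, `?` for MSVC, `_R` for Rust v0 and `_D` for D. The names `ManglingScheme` and `classify` are made up for this illustration.

```python
from enum import Enum


class ManglingScheme(Enum):
    NONE = "none"
    ITANIUM = "itanium"
    MSVC = "msvc"
    RUST_V0 = "rust-v0"
    D = "d"


# (prefix, scheme) pairs; the first matching prefix wins.
_PREFIXES = (
    ("?", ManglingScheme.MSVC),
    ("_R", ManglingScheme.RUST_V0),
    ("_D", ManglingScheme.D),
    ("_Z", ManglingScheme.ITANIUM),
)


def classify(symbol: str) -> ManglingScheme:
    """Guess the mangling scheme of a symbol from its prefix alone."""
    for prefix, scheme in _PREFIXES:
        if symbol.startswith(prefix):
            return scheme
    # No known prefix: treat the symbol as a plain (C-style) name.
    return ManglingScheme.NONE
```

A symbol emitted through Nim's C backend has none of these prefixes, so a classifier like this returns `NONE` for it. That is the gap the rest of this proposal tries to close.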
For Nim: identifying the mangling scheme/language from a mangled name is more complex. This is because Nim is compiled into a target language that uses an existing mangling scheme. If we had control over the binary or Debug Symbol output file (e.g. DWARF), I believe this would be easier, but again: since the target language's compiler is being used it is slightly more complex.
To solve this with today's standard Nim compiler, here are my researched steps:
- Contain/embed a unique constant identifier within each symbol to identify that this symbol was output from the Nim compiler. Modifications to be done here: https://github.com/nim-lang/Nim/blob/502a4486aeb8d0a5dcdf86540522d3dc16960536/compiler/ccgutils.nim#L71
- Unfortunately, such an identifier could collide with identifiers used by C or C++ code in existing codebases. Using Unicode characters in the identifier would make conflicts rare, but would require C99 or above
- This probably requires an RFC and further discussion
- Modify LLDB:
- Modify the Mangled class
- Add mangling scheme enum entry for Nim here: https://github.com/llvm/llvm-project/blob/main/lldb/include/lldb/Core/Mangled.h#L41-L48
- Classify if the symbol originates from the Nim compiler with the above knowledge: https://github.com/llvm/llvm-project/blob/main/lldb/source/Core/Mangled.cpp#L42-L79
- Implementation seems to require one-level deep recursion
- Call & implement demangling code in C++
- Getting this accepted into LLDB might be difficult (because the mangled names are also valid C/C++ identifiers). Perhaps a compiler option similar to Apple's LLDB (see here) or a run-time flag would be appropriate (this seems to require many modifications to LLDB; maybe the LLVM folks know best)
- (optional): implement a Language plugin. Why? Deeper integration with LLDB
- See here for Swift's language plugin; here is the Language class: https://github.com/apple/llvm-project/blob/40e3ca95e3f05c7b5286092d52a33a751a717a5e/lldb/source/Plugins/Language/Swift/SwiftLanguage.h#L26
- Docs seem to be lacking, but it seems to be for:
- Help/docs on symbols
- De-mangling function names without the parameters that are mangled into the name (GetDemangledFunctionNameWithoutArguments)
- Probably other things; for Swift it seems to be related to the REPL integration with LLDB
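The required steps above (embed a marker, classify, de-mangle) could be sketched roughly as follows. This is Python for illustration only; a real implementation would live in the Nim compiler and in LLDB's C++ sources. Everything here is an assumption: the marker `_NimSym_`, the length-prefixed layout, and the function names are hypothetical and do not reflect how the Nim compiler mangles today. The "one-level recursion" is read here as: if the C++ backend's compiler wraps the name in a simple Itanium `_Z<len><name>...` form, unwrap it once and retry.

```python
import re
from typing import Optional, Tuple

# HYPOTHETICAL marker; the Nim compiler does not emit this today.
NIM_MARKER = "_NimSym_"

_DIGITS = re.compile(r"\d+")


def _read_part(text: str, pos: int) -> Optional[Tuple[str, int]]:
    """Read one <decimal length><chars> component starting at pos."""
    m = _DIGITS.match(text, pos)
    if m is None:
        return None
    length = int(m.group())
    start, end = m.end(), m.end() + length
    if length == 0 or end > len(text):
        return None
    return text[start:end], end


def mangle(module: str, name: str, unique_id: str) -> str:
    """Hypothetical scheme: marker, length-prefixed module and name, id."""
    return f"{NIM_MARKER}{len(module)}{module}{len(name)}{name}_{unique_id}"


def demangle(symbol: str) -> Optional[str]:
    """Return 'module.name' if symbol carries the marker, else None."""
    if not symbol.startswith(NIM_MARKER):
        return None
    first = _read_part(symbol, len(NIM_MARKER))
    if first is None:
        return None
    module, pos = first
    second = _read_part(symbol, pos)
    if second is None:
        return None
    return f"{module}.{second[0]}"


def demangle_any(symbol: str) -> Optional[str]:
    """Try the Nim marker directly, then once inside a simple Itanium name."""
    direct = demangle(symbol)
    if direct is not None:
        return direct
    if symbol.startswith("_Z"):
        inner = _read_part(symbol, 2)  # _Z<len><name><params>
        if inner is not None:
            return demangle(inner[0])
    return None
```

With a scheme like this, a debugger could map `print main.x` to the mangled symbol and back. The marker is still a valid C identifier, so the conflict concern raised above applies to this sketch as well.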
Alternatives
Here are some alternatives I can think of, but will likely require more work:
- Modify the Nim compiler to output the target assembly directly (or via LLVM). This is related to NIR
- It would likely be easier to convince the LLVM/LLDB team to merge the name de-mangling changes for Nim if they did not conflict with C/C++ symbols
- Write a debugger in Nim. Pros:
- Would offer a chance to integrate with the compiler, i.e. to evaluate nimscript in the debugger or to modify the program at run-time / to provide a REPL similar to Swift
- Reading & modifying the LLDB code is hard with all the OOP/abstraction
Examples
No response
Backwards Compatibility
My proposed solution will change the way the Nim compiler mangles names. For backward compatibility, one could offer a flag to mangle the old way, though I don't think this flag would be necessary: just re-compile your source if you want debugging support.
Links
Mangling & D:
- https://dlang.org/blog/2017/12/20/ds-newfangled-name-mangling/
LLDB codepointers:
- Language Plugin: https://github.com/apple/llvm-project/tree/40e3ca95e3f05c7b5286092d52a33a751a717a5e/lldb/source/Plugins/Language/Swift
- Mangled: https://github.com/apple/llvm-project/blob/40e3ca95e3f05c7b5286092d52a33a751a717a5e/lldb/source/Core/Mangled.cpp#L319-L341
- Guess lang: https://github.com/apple/llvm-project/blob/next/lldb/source/Core/Mangled.cpp#L471-L476
Writing a debugger:
- https://www.timdbg.com/posts/writing-a-debugger-from-scratch-part-1/
- https://opensource.com/article/18/1/how-debuggers-work
+1 Please let's write our own debugger.
Implementing this for GDB would be a similar process ^1. Imo adding support to existing debuggers is better than writing our own, since it means less maintenance and allows easy integration with existing tools.