LLVM backend always produces `DW_AT_language=C99` in DWARF debug symbols.
Zig Version
0.16.0-dev.1225+bf9082518
Steps to Reproduce and Observed Behavior
I'm currently working on a fork of GDB with support for Zig's DWARF extensions and types. Currently the x86_64 backend produces Zig's language code (0x27) in the DWARF output for a Zig compilation unit, but the LLVM backend produces the C99 language code. We really want this language code to be set correctly on all backends so that a debugger can detect what language it's debugging.
On an x86_64 machine:
zig build-exe -fllvm ./test.zig -femit-bin=llvm_test
zig build-exe -fno-llvm ./test.zig -femit-bin=x86_test
Then use objdump or any equivalent DWARF dumping tool to check the field. On the LLVM binary:
objdump -g ./llvm_test | grep 'DW_AT_language'
produces:
<3a325> DW_AT_language : 12 (ANSI C99)
And then on the x86_64 backend:
objdump -g ./x86_test | grep 'DW_AT_language'
produces:
<d5463> DW_AT_language : 39 (Zig)
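The check above can also be scripted. The following is a minimal sketch, assuming the `objdump -g` output has the same `DW_AT_language : <decimal> (<name>)` shape as the lines shown in this report; the function name `language_codes` is hypothetical and exists only for illustration.

```python
import re

# DWARF language codes as they appear in the objdump output in this report.
DW_LANG_C99 = 0x0C  # 12, printed as "ANSI C99"
DW_LANG_ZIG = 0x27  # 39, printed as "Zig"

# Matches lines such as "<3a325> DW_AT_language : 12 (ANSI C99)".
# Assumption: the code is printed in decimal, as in the output above.
_LANG_LINE = re.compile(r"DW_AT_language\s*:\s*(\d+)")


def language_codes(objdump_output: str) -> set[int]:
    """Return every DW_AT_language code found in `objdump -g` text."""
    return {int(m.group(1)) for m in _LANG_LINE.finditer(objdump_output)}
```

Feeding it the captured output of `objdump -g ./llvm_test` and checking whether `DW_LANG_ZIG` is in the result would flag the behavior described here without reading the dump by eye.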
Expected Behavior
Zig compilation units should use the Zig language code on all backends.
This is where the LLVM backend is hardcoded to always output C99 as the language code. If you change the value to std.dwarf.LANG.Zig, you see the correct behavior:
https://github.com/ziglang/zig/blob/ee4df4ad3edad160fb737a1935cd86bc2f9cfbbe/lib/std/zig/llvm/ir.zig#L1629
I'm working on a fix here: https://github.com/ziglang/zig/compare/master...The-Briel-Deal:zig:pass_down_lang_llvm
I'm just trying to figure out whether there is a case where another language's code should be passed down. I don't think there is, but there could be a case I'm not thinking of.
Since LLVM (the library) does not support emitting Zig debugging extensions, LLVM debugging info and self-hosted debugging info are incompatible. The language is therefore currently used to distinguish between "real" Zig debugging info and "fake" C debugging info, which is the best LLVM is currently able to support.
Thanks @jacobly0! That makes more sense. That said, I'd like to devise some way to distinguish between a C99 program and a Zig LLVM program. My reasoning is that I'd like my GDB fork to display Zig's types better. For example, some types (optionals) are currently displayed as {some: 1, data: 81273682}, since the data portion is just interpreted as an integer unless you cast it.
So even though we won't get the Zig extension instructions, I'd still like the language to be distinct from C99 so that I can properly display data from Zig.
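To illustrate the kind of display logic I have in mind, here is a rough sketch. It assumes the LLVM backend keeps describing an optional as a struct with the `some` and `data` fields seen above, with `some` acting as a 0/1 flag; `decode_optional` is a hypothetical helper, not part of GDB's API.

```python
from typing import Any, Optional


def decode_optional(raw: dict) -> Optional[Any]:
    """Interpret a {some, data} struct as an optional payload.

    Assumption: `some` is a 0/1 flag and `data` holds the payload.
    """
    if raw["some"] == 0:
        return None  # the optional is null, so `data` is not meaningful
    return raw["data"]
```

A pretty-printer could only apply this safely if it knows the compilation unit is Zig rather than C99, which is why a distinct language code would help.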
I'm also not against using the x86 backend for debugging but there are still a few kinks that need to be worked out.
Oh and also, thank you for your Zig LLDB fork! That ended up being a super nice reference and a handy tool for most debugging. Do you still maintain that?
Makes sense, but I would argue that the LLVM debug info should be improved first before it's worth supporting, since the current state is not likely to be compatible with a future state, which ideally would simply match what the x86_64 backend is already emitting. However, since we are not planning to use LLVM for debug builds long-term, this is very low priority for the core team.
The LLDB fork exists to demonstrate and test that the debug info emitted by the self-hosted backends is sufficient to do debugger things. I only maintain it enough to serve that purpose and my own debugging needs.
Fair enough. I'd really like to help get either the x86_64 or the LLVM backend to the point where people can have a stable debugging experience with Zig. If the x86_64 backend is the primary focus for the Zig team, then I can focus there. I've run into a couple of issues with the x86_64 backend's DWARF output, so I might start by opening an issue for one of those and go from there.
Side note: since it seems like you have a good amount of experience with Zig's DWARF output, is there a channel I could reach out through to ask a few questions?
You could create a topic on Zulip.