Generating DWARF debug info.
Hi
Great work on the compiler!!
I'm working as a professor at Luleå University of Technology, Sweden, and I'm currently preparing to run the course in Compiler Technology. We have used OCaml before (and I like teaching compilers using a functional language). This year I'd like to use LLVM as a backend (previously the students wrote their own backend, but using LLVM makes more sense).
I have played around a bit with the moe (https://llvm.moe/ocaml/) LLVM bindings and got your compiler to build against LLVM 8 (I'm not sure everything works, but at least the examples I tried do).
I'm not sure we will use Tiger as the language for the students; we may define our own.
In any case, I'd like to ask you a bit about LLVM and debug information.
As I understand it, you do not currently generate debug info. Do you think it is possible with the moe bindings? I looked briefly at the LLVM docs, and they use a special builder for DI. I could not find any DIBuilder in the moe documentation. It could be that it's possible, but the moe documentation is not that complete (no examples etc.).
Take a simple program, hello.c:
#include <stdio.h>
int main(void)
{
    int c = 0;
    int d = c + 1;
    printf("hello, world\n");
    return 0;
}
In the IR (.ll) without any optimizations, you would find something like this:
; ModuleID = 'hello.c'
source_filename = "hello.c"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"
@.str = private unnamed_addr constant [14 x i8] c"hello, world\0A\00", align 1
; Function Attrs: noinline nounwind optnone sspstrong uwtable
define dso_local i32 @main() #0 !dbg !9 {
  %1 = alloca i32, align 4
  %2 = alloca i32, align 4
  %3 = alloca i32, align 4
  store i32 0, i32* %1, align 4
  call void @llvm.dbg.declare(metadata i32* %2, metadata !13, metadata !DIExpression()), !dbg !14
  store i32 0, i32* %2, align 4, !dbg !14
  call void @llvm.dbg.declare(metadata i32* %3, metadata !15, metadata !DIExpression()), !dbg !16
  %4 = load i32, i32* %2, align 4, !dbg !17
  %5 = add nsw i32 %4, 1, !dbg !18
  store i32 %5, i32* %3, align 4, !dbg !16
  %6 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([14 x i8], [14 x i8]* @.str, i32 0, i32 0)), !dbg !19
  ret i32 0, !dbg !20
}
; Function Attrs: nounwind readnone speculatable
declare void @llvm.dbg.declare(metadata, metadata, metadata) #1
declare i32 @printf(i8*, ...) #2
attributes #0 = { noinline nounwind optnone sspstrong uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { nounwind readnone speculatable }
attributes #2 = { "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!3, !4, !5, !6, !7}
!llvm.ident = !{!8}
!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 8.0.0 (tags/RELEASE_800/final)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, nameTableKind: None)
!1 = !DIFile(filename: "hello.c", directory: "/home/pln/ocaml/ocaml-llvm-tutorial/part2")
!2 = !{}
!3 = !{i32 2, !"Dwarf Version", i32 4}
!4 = !{i32 2, !"Debug Info Version", i32 3}
!5 = !{i32 1, !"wchar_size", i32 4}
!6 = !{i32 7, !"PIC Level", i32 2}
!7 = !{i32 7, !"PIE Level", i32 2}
!8 = !{!"clang version 8.0.0 (tags/RELEASE_800/final)"}
!9 = distinct !DISubprogram(name: "main", scope: !1, file: !1, line: 4, type: !10, scopeLine: 5, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !0, retainedNodes: !2)
!10 = !DISubroutineType(types: !11)
!11 = !{!12}
!12 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!13 = !DILocalVariable(name: "c", scope: !9, file: !1, line: 6, type: !12)
!14 = !DILocation(line: 6, column: 9, scope: !9)
!15 = !DILocalVariable(name: "d", scope: !9, file: !1, line: 7, type: !12)
!16 = !DILocation(line: 7, column: 9, scope: !9)
!17 = !DILocation(line: 7, column: 13, scope: !9)
!18 = !DILocation(line: 7, column: 15, scope: !9)
!19 = !DILocation(line: 8, column: 5, scope: !9)
!20 = !DILocation(line: 10, column: 5, scope: !9)
Sorry for all the noise. In any case, the interesting parts are !9 and the metadata nodes that follow it. In C++, you would create a DI builder and build type, variable and location nodes (if I understand correctly), then attach them through the instruction builder (llbuilder in moe).
I haven't looked at the implementation behind the moe bindings; maybe they were never finished. Even though the LLVM OCaml bindings are shipped with LLVM, the Kaleidoscope example is terribly dated (about 10 years old) and incomplete: it just states that it's straightforward to build the DWARF info by looking at some .ll files generated with debug info from C. Maybe it is, I just don't get how...
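To make the question more concrete: one fallback I can imagine, if the bindings really have no DI builder, is to describe the debug nodes in plain OCaml and print them in the textual .ll syntax shown in the listing. The sketch below is only that, a sketch. The node type and the render functions are hypothetical names of my own, not part of the moe/LLVM bindings; it uses only the OCaml standard library, does no string escaping, and covers just four of the node kinds from the listing.

```ocaml
(* Hypothetical model of a few debug-info nodes; NOT an LLVM binding API.
   Integer fields such as [scope], [file] and [ty] are metadata ids (!N). *)
type node =
  | File of { filename : string; directory : string }
  | Basic_type of { name : string; size : int; encoding : string }
  | Local_var of { name : string; scope : int; file : int; line : int; ty : int }
  | Location of { line : int; column : int; scope : int }

(* Render one node in textual IR syntax. Assumes the strings need no escaping. *)
let render (n : node) : string =
  match n with
  | File { filename; directory } ->
    Printf.sprintf "!DIFile(filename: \"%s\", directory: \"%s\")"
      filename directory
  | Basic_type { name; size; encoding } ->
    Printf.sprintf "!DIBasicType(name: \"%s\", size: %d, encoding: %s)"
      name size encoding
  | Local_var { name; scope; file; line; ty } ->
    Printf.sprintf
      "!DILocalVariable(name: \"%s\", scope: !%d, file: !%d, line: %d, type: !%d)"
      name scope file line ty
  | Location { line; column; scope } ->
    Printf.sprintf "!DILocation(line: %d, column: %d, scope: !%d)"
      line column scope

(* Prefix a rendered node with its metadata id, e.g. "!14 = ...". *)
let render_numbered ((id, n) : int * node) : string =
  Printf.sprintf "!%d = %s" id (render n)

let () =
  (* Ids and values copied from the hello.c listing above. *)
  let nodes =
    [ (1, File { filename = "hello.c";
                 directory = "/home/pln/ocaml/ocaml-llvm-tutorial/part2" });
      (12, Basic_type { name = "int"; size = 32; encoding = "DW_ATE_signed" });
      (13, Local_var { name = "c"; scope = 9; file = 1; line = 6; ty = 12 });
      (14, Location { line = 6; column = 9; scope = 9 }) ]
  in
  List.iter (fun n -> print_endline (render_numbered n)) nodes
```

This should reproduce the !1, !12, !13 and !14 lines of the listing, but it sidesteps the bindings completely and would still need the !dbg attachments on the instructions, so I would much rather use a proper DI builder if one exists.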
So what do you think?
Best regards, and thanks in advance for any input. /Per (you can reach me at [email protected] if you feel that it's better to take the discussion there than to use the issue tracker...)