ldc icon indicating copy to clipboard operation
ldc copied to clipboard

Non optimized IR for reference types in CondExp is messy

Open ghost opened this issue 2 years ago • 6 comments

It seems that ldc does not have the notion of reference type, leading to add an indirection for temporaries. Here is an example

The d code

class A
{
    B b;
}

class B
{
    int member;
}

B test(A a) {
    return a ? a.b : null;
}

The interesting IR part (full test case)

define %example.B* @_D7example4testFCQp1AZCQv1B(%example.A* %a_arg) #1 !dbg !10 {
  %a = alloca %example.A*, align 8                ; [#uses = 3, size/byte = 8]
  %condtmp = alloca %example.B**, align 8         ; [#uses = 4, size/byte = 8]
  %.makelvaluetmp = alloca %example.B*, align 8   ; [#uses = 2, size/byte = 8]
  store %example.A* %a_arg, %example.A** %a, align 8, !dbg !13 ; [debug line = app/example.d:12:3]
  %1 = load %example.A*, %example.A** %a, align 8, !dbg !14 ; [#uses = 1] [debug line = app/example.d:13:5]
  %2 = icmp ne %example.A* %1, null, !dbg !14     ; [#uses = 1] [debug line = app/example.d:13:5]
  br i1 %2, label %condtrue, label %condfalse, !dbg !14 ; [debug line = app/example.d:13:5]

condtrue:                                         ; preds = %0
  %3 = load %example.A*, %example.A** %a, align 8, !dbg !14 ; [#uses = 1] [debug line = app/example.d:13:5]
  %4 = getelementptr inbounds %example.A, %example.A* %3, i32 0, i32 2 ; [#uses = 1, type = %example.B**]
  store %example.B** %4, %example.B*** %condtmp, align 8, !dbg !14 ; [debug line = app/example.d:13:5]
  br label %condend

condfalse:                                        ; preds = %0
  store %example.B* null, %example.B** %.makelvaluetmp, align 8, !dbg !14 ; [debug line = app/example.d:13:5]
  store %example.B** %.makelvaluetmp, %example.B*** %condtmp, align 8, !dbg !14 ; [debug line = app/example.d:13:5]
  br label %condend

condend:                                          ; preds = %condfalse, %condtrue
  %5 = load %example.B**, %example.B*** %condtmp, align 8, !dbg !14 ; [#uses = 1] [debug line = app/example.d:13:5]
  %6 = load %example.B*, %example.B** %5, align 8, !dbg !14 ; [#uses = 0] [debug line = app/example.d:13:5]
  %7 = load %example.B**, %example.B*** %condtmp, align 8, !dbg !14 ; [#uses = 1] [debug line = app/example.d:13:5]
  %8 = load %example.B*, %example.B** %7, align 8, !dbg !14 ; [#uses = 1] [debug line = app/example.d:13:5]
  ret %example.B* %8, !dbg !14                    ; [debug line = app/example.d:13:5]
}

condtemp type is B*** but could be B**.

Also note that %5 and %6 are not used at all.

ghost avatar Apr 11 '23 07:04 ghost

This is a result of LLVM being SSA and this extra indirection is how this is done. Also this this is unoptimised IR, what do you expect?

thewilsonator avatar Apr 11 '23 09:04 thewilsonator

I expected more something like that

define ptr @temp.test(ptr %0) #0 {
entry:
  %1 = icmp ne ptr %0, null
  br i1 %1, label %2, label %5

2:                                                ; preds = %entry
  %3 = getelementptr inbounds %temp.A, ptr %0, i32 0, i32 1
  %4 = load ptr, ptr %3, align 8
  br label %6

5:                                                ; preds = %entry
  br label %6

6:                                                ; preds = %5, %2
  %7 = phi ptr [ %4, %2 ], [ null, %5 ]
  ret ptr %7
}   

ghost avatar Apr 11 '23 09:04 ghost

or with typed pointers and copy of the parameter in a local

define %temp.B* @temp.test(%temp.A* %0) #0 {
entry:
  %1 = alloca %temp.A*, align 8
  store %temp.A* %0, %temp.A** %1, align 8
  %2 = load %temp.A*, %temp.A** %1, align 8
  %3 = icmp ne %temp.A* %2, null
  br i1 %3, label %4, label %8

4:                                                ; preds = %entry
  %5 = load %temp.A*, %temp.A** %1, align 8
  %6 = getelementptr inbounds %temp.A, %temp.A* %5, i32 0, i32 1
  %7 = load %temp.B*, %temp.B** %6, align 8
  br label %9

8:                                                ; preds = %entry
  br label %9

9:                                                ; preds = %8, %4
  %10 = phi %temp.B* [ %7, %4 ], [ null, %8 ]
  ret %temp.B* %10
}

ghost avatar Apr 11 '23 09:04 ghost

Well I think that the problem is that the DMD front-end produces an intermediate representation, then ldc uses that intermediate representation to create a second intermediate representation, and then that second representation illustrates how bad, or maybe just "dated", is the first.

So... this might be a DMD problem after all. I'm pretty sure that LDC non-optimized IR would be better if not based on the dated intermediate DMD step.

Still, that's not great.

ghost avatar Apr 11 '23 10:04 ghost

You can also view the AST coming from DMD that LDC operates on; there is no difference with the source input code (i.e. still the ternary expression without extra intermediaries).

LDC's conversion to IR uses a temporary variable condtmp instead of a phi-node. https://github.com/ldc-developers/ldc/blob/0d4d711aed785e40f8d894187f8be0d361b8589a/gen/toir.cpp#L2066-L2113 condtmp is a pointer to the result variable (type B**) instead of a copy (type B*) probably to avoid having to make a copy which may not be semantically correct for all types (because of constructors).

The unnecessary loads %5 and %6 may be because of accidental interaction between different visitors in the code. It's loaded into result at the end of visit(CondExp) and then loaded again for the return statement?

If you want to try and improve it, you're welcome! But for unoptimized IR, it is expected that we create relatively bad code to simplify compiler implementation...

JohanEngelen avatar Apr 11 '23 11:04 JohanEngelen

Thnaks for the advanced analysis. I'll try to open an issue in bugzilla about the two useless loads but keep open for now please. The point is that the report shows that there's an opportunity to create a better IR.

ghost avatar Apr 14 '23 11:04 ghost