64-line program takes 14s to compile
Compiling the following program with `ldc2 -betterC -O2 --enable-asserts=false a.d` takes 14 seconds.
```d
import core.stdc.stdlib : malloc;

extern(C) int main() {
    return 0;
}

struct Foo {
    struct A { Foo x; Foo y; }
    struct B { Foo z; }
    struct C { Foo w; }

    this(A* a) { kind = Kind.a; a_ = a; }
    this(B* b) { kind = Kind.b; b_ = b; }
    this(C* c) { kind = Kind.c; c_ = c; }

    enum Kind { a, b, c }
    Kind kind;
    union { A* a_; B* b_; C* c_; }
}

struct Bar {
    long[4] x;
    Bar[] children;
}

Bar[] toBarArray(Foo[] xs) {
    return map(xs, (Foo x) => toBar(x));
}

Out[] map(Out, In)(In[] xs, Out delegate(In) cb) {
    Out* res = cast(Out*) malloc(Out.sizeof * xs.length);
    foreach (i, x; xs)
        res[i] = cb(x);
    return res[0 .. xs.length];
}

T[] allocArr(T)(scope T[] xs) {
    return map!(T, T)(xs, x => x);
}

Bar toBar(Foo foo) {
    return matchFoo!Bar(
        foo,
        (Foo.A a) => Bar(1, allocArr([toBar(a.x), toBar(a.y)])),
        (Foo.B b) => toBar(b.z),
        (Foo.C c) => toBar(c.w));
}

T matchFoo(T)(
    Foo foo,
    T delegate(Foo.A) cbA,
    T delegate(Foo.B) cbB,
    T delegate(Foo.C) cbC,
) {
    final switch (foo.kind) {
        case Foo.Kind.a:
            return cbA(*foo.a_);
        case Foo.Kind.b:
            return cbB(*foo.b_);
        case Foo.Kind.c:
            return cbC(*foo.c_);
    }
}
```
It's hard to get a smaller repro for this issue, since changing just about anything makes the compile time fast again. I've tested the following changes:

- With `-O1` instead of `-O2`: 0.1s
- Without `-betterC`: 2s
  - Interestingly, in ldc 1.27.1, it took 14s with or without `-betterC`.
- Without `--enable-asserts=false`: 2s
- Without `Foo.A` (and related code): 0.1s
  - Similar if removing `Foo.B` or `Foo.C`. It seems the union must have at least 3 members.
- If I use an `alias` in `map` instead of a `delegate`: 0.1s (see the sketch after this list)
  - (meaning: `Out[] map(alias cb, Out, In)(In[] xs) {`)
- If I write a switch in `toBar` instead of using `matchFoo`: 0.1s
- Without `long[4] x;` in `Bar`: 0.1s
  - With `long[1] x;` instead: 5s
- If `allocArr` is just `assert(0);`: 0.1s
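Roughly, the alias-based variant I tested looks like this (reconstructed from memory, so the exact code may differ slightly; it reuses the same `malloc` import as above):

```d
// Alias-based map: the callback is a template alias parameter rather than a
// runtime delegate, so each call site gets its own instantiation.
Out[] map(alias cb, Out, In)(In[] xs) {
    Out* res = cast(Out*) malloc(Out.sizeof * xs.length);
    foreach (i, x; xs)
        res[i] = cb(x);
    return res[0 .. xs.length];
}

// Call sites change accordingly, e.g.:
Bar[] toBarArray(Foo[] xs) {
    return map!(x => toBar(x), Bar)(xs);
}

T[] allocArr(T)(scope T[] xs) {
    return map!(x => x, T)(xs);
}
```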
Tested with ldc2 1.28.0, based on DMD 2.098.0 and LLVM 12.0.1.
It's 8.23 secs on my box with LDC v1.28; with v1.26 (LLVM 11.0.1), it is 2.62 seconds. [Without `-O`, I get < 0.05 secs with both.] Might be another symptom of an LLVM 12 regression, at least partially: https://github.com/ldc-developers/ldc/issues/3824
After a quick glance at the produced .ll file, I haven't spotted anything totally obvious in the meager 465 lines. The optimizer taking such a long time for that little code seems weird indeed.
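For anyone who wants to look at the IR too, the textual `.ll` output can be produced with LDC's `-output-ll` switch, something along these lines (check `ldc2 -help` for the exact spelling on your version):

```sh
# writes a.ll next to the object file
ldc2 -betterC -O2 --enable-asserts=false -output-ll a.d
```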