IR-PGO + LTO: error: Not an IR level instrumentation profile
Combining IR-PGO and LTO fails to build. This was tested on OSX with a simple hello-world program, shown below. The same commands work when using AST-PGO (the -fprofile-instr-[generate|use] versions of the options, i.e. with the instr- bit).
The behavior is as if using LTO switches profile generation back to AST. In fact, in the examples below, changing the LTO builds to use -fprofile-generate / -fprofile-instr-use builds successfully, suggesting that this may be what is happening.
$ ./ldc2-1.8.0-beta1-osx-x86_64/bin/ldc2 --version | head -n 5
LDC - the LLVM D compiler (1.8.0-beta1):
based on DMD v2.078.3 and LLVM 5.0.1
built with LDC - the LLVM D compiler (1.8.0-beta1)
Default target: x86_64-apple-darwin17.4.0
Host CPU: haswell
$ xcodebuild -version
Xcode 9.2
Build version 9C40b
$ cat helloworld.d
void main(string[] args)
{
    import std.stdio;
    writeln("hello world");
}
## Building with PGO - Works fine
$ ./ldc2-1.8.0-beta1-osx-x86_64/bin/ldc2 -O3 -fprofile-generate=profile.pgo.raw -of./helloworld.pgo.instr helloworld.d
$ ./helloworld.pgo.instr
hello world
$ ./ldc2-1.8.0-beta1-osx-x86_64/bin/ldc-profdata merge -o profile.pgo.profdata profile.pgo.raw
$ ./ldc2-1.8.0-beta1-osx-x86_64/bin/ldc2 -O3 -fprofile-use=profile.pgo.profdata -of./helloworld.pgo helloworld.d
## Building with PGO and ThinLTO - Build error
$ ./ldc2-1.8.0-beta1-osx-x86_64/bin/ldc2 -O3 -flto=thin -fprofile-generate=profile.pgo_lto_thin.raw -of./helloworld.pgo.lto_thin.instr helloworld.d
$ ./helloworld.pgo.lto_thin.instr
hello world
$ ./ldc2-1.8.0-beta1-osx-x86_64/bin/ldc-profdata merge -o profile.pgo_lto_thin.profdata profile.pgo_lto_thin.raw
$ ./ldc2-1.8.0-beta1-osx-x86_64/bin/ldc2 -O3 -flto=thin -fprofile-use=profile.pgo_lto_thin.profdata -of./helloworld.pgo.lto_thin helloworld.d
error: profile.pgo_lto_thin.profdata: Not an IR level instrumentation profile
## Building with PGO and FullLTO - Build error
$ ./ldc2-1.8.0-beta1-osx-x86_64/bin/ldc2 -O3 -flto=full -fprofile-generate=profile.pgo_lto_full.raw -of./helloworld.pgo.lto_full.instr helloworld.d
$ ./helloworld.pgo.lto_full.instr
hello world
$ ./ldc2-1.8.0-beta1-osx-x86_64/bin/ldc-profdata merge -o profile.pgo_lto_full.profdata profile.pgo_lto_full.raw
$ ./ldc2-1.8.0-beta1-osx-x86_64/bin/ldc2 -O3 -flto=full -fprofile-use=profile.pgo_lto_full.profdata -of./helloworld.pgo.lto_full helloworld.d
error: profile.pgo_lto_full.profdata: Not an IR level instrumentation profile
Thanks, I can reproduce it too on macOS with LLVM 6 and trunk.
When you remove the -flto=... in the instrumentation (-generate) step, it works.
With Clang, these steps do build, but Clang reports that the only function (main) is outdated (hash mismatch), so effectively it's not working. Without -flto=... in the generate steps, things seem to work fine.
So the workaround for now is to not use -flto=... for the PGO instrumentation phase. This makes some sense: LTO happens after PGO instrumentation is added. This means that the profile gained from instrumentation is wrong or hard to interpret when decorating the IR with the profile in the second step (because the profile info is applied to the non-LTO IR). Still, we shouldn't error the way we do now.
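A minimal sketch of that workaround as a command sequence, based on the hello-world reproduction above. The script only prints the commands (a dry run); the variable names LDC, PROFDATA, SRC and OUT and the function name pgo_lto_workaround are made up for illustration, and the flags are the ones already used in this issue. The only change from the failing sequence is that -flto=thin is dropped from the -fprofile-generate step and kept in the -fprofile-use step.

```shell
#!/bin/sh
# Dry-run sketch of the workaround: instrument WITHOUT LTO, optimize WITH LTO.
# LDC, PROFDATA, SRC and OUT are hypothetical names; adjust to your setup.
LDC=${LDC:-ldc2}
PROFDATA=${PROFDATA:-ldc-profdata}
SRC=${SRC:-helloworld.d}
OUT=${OUT:-helloworld}

pgo_lto_workaround() {
    # Step 1: instrumented build, no LTO option here (this is the workaround).
    echo "$LDC -O3 -fprofile-generate=$OUT.raw -of./$OUT.instr $SRC"
    # Step 2: run the instrumented binary to write the raw profile.
    echo "./$OUT.instr"
    # Step 3: merge the raw profile into an indexed profile.
    echo "$PROFDATA merge -o $OUT.profdata $OUT.raw"
    # Step 4: optimized build; LTO is only enabled in this step.
    echo "$LDC -O3 -flto=thin -fprofile-use=$OUT.profdata -of./$OUT $SRC"
}

pgo_lto_workaround
```

Piping the output into sh would execute the steps for real; whether the final binary actually benefits from the profile is a separate question.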
Thanks, I'll try this. Another piece is how it interacts with building phobos/druntime with LTO, e.g.
ldc-build-runtime --dFlags="-flto=thin" BUILD_SHARED_LIBS=OFF
From your description it sounds like this should work. After all, what this step is doing is generating the IR, not the final optimization. I should be able to try it before too long.
The workaround doesn't work when building with LTO against phobos/druntime. Somewhat obvious in retrospect. The druntime/phobos libs are built with -flto=... and using these libraries when building an app requires specifying the same -flto=... option. This means using it when building the instrumented build. And indeed, dropping it when building the instrumented app results in linker failure.
Reproducing testcase for Lit:
// Test combination of IR-based PGO and LTO
// REQUIRES: PGO_RT
// REQUIRES: LTO
// REQUIRES: atleast_llvm309
// There is an LLVM bug with IR-based LTO on Windows.
// XFAIL: Windows
// RUN: %ldc -flto=full -fprofile-generate=%t.profraw -run %s \
// RUN: && %profdata merge %t.profraw -o %t.profdata \
// RUN: && %ldc -flto=full -c -output-ll -of=%t.use.ll -fprofile-use=%t.profdata %s
// R UN: && FileCheck %s -check-prefix=PROFUSE < %t.use.ll
import ldc.attributes : weak;

extern (C)
{ // simplify name mangling for simpler string matching

    @weak // disable reasoning about this function
    void hot()
    {
    }

    void luke()
    {
    }

    void cold()
    {
    }

    void function() foo;

    @weak // disable reasoning about this function
    void select_func(int i)
    {
        if (i < 1700)
            foo = &hot;
        else if (i < 1990)
            foo = &luke;
        else
            foo = &cold;
    }

} // extern C

// PROFUSE-LABEL: @_Dmain(
int main()
{
    for (int i; i < 2000; ++i)
    {
        select_func(i);

        // PROFUSE: [[REG1:%[0-9]+]] = load void ()*, void ()** @foo
        // PROFUSE: [[REG2:%[0-9]+]] = icmp eq void ()* [[REG1]], @hot
        // PROFUSE: call void @hot()
        // PROFUSE: call void [[REG1]]()
        foo();
    }

    return 0;
}
I've dug a little more. This is also broken with Clang, at least partly (contrary to what I reported earlier). I'll discuss it with the LLVM team.
FWIW, using -flto=full -defaultlib=druntime-ldc-lto for an IR-PGO build of DMD (both instrumented + PGO'd) works fine on Linux with LDC v1.29 (LLVM 13).