msbuild icon indicating copy to clipboard operation
msbuild copied to clipboard

msbuild does not "recover" from slowdowns

Open kasperk81 opened this issue 2 years ago • 7 comments

when i build runtime libraries, after quickly building 50 or so libraries, msbuild slows down and doesn't recover the speed it initially possessed for rest of the build.

reproduction steps:

# clone dotnet/runtime on linux

# first build native prerequisite
./build.sh clr+libs.native

# then build libs and notice how the build slows down after a first couple of
# libraries are built (and never recovers the speed)
./build.sh libs.src

kasperk81 avatar Jun 28 '22 22:06 kasperk81

Can you please collect a trace of the build by installing dotnet-trace and running your repro with

dotnet trace collect --providers Microsoft-Build -- ./build.sh libs.src

? That should collect perf information using our events.

rainersigwald avatar Jul 07 '22 16:07 rainersigwald

docs say:

Note

Using this option monitors the first .NET 5 process that communicates back to the tool, which means if your command launches multiple .NET applications, it will only collect the first app. Therefore, it is recommended you use this option on self-contained applications, or using the dotnet exec <app.dll> option.

since ./build.sh libs.src invokes dotnet internally multiple, it doesn't work. i didn't find a "follow child" option, provided by similar tools; strace -f or gdb (all versions) and lldb (version 14 onward) follow-fork-mode option.

@hoyosjs

kasperk81 avatar Jul 07 '22 17:07 kasperk81

That's good feedback, but sadly I don't think there's a provision for that. I opened https://github.com/dotnet/diagnostics/issues/3175 to discuss the issue. @rainersigwald, when a build shells out using the MSBuild task, is there a way to get an equivalent invocation from the binlog such that dotnet exec msbuild could reproduce the issues in a single instance?

hoyosjs avatar Jul 07 '22 20:07 hoyosjs

No, unfortunately MSBuild is multi-process by design and while you can build with -m:1 to avoid that, it dramatically alters the perf characteristics of the build so it's probably not something that would be helpful here.

A binlog might have sufficient information to observe and possibly troubleshoot the slowdown, though.

rainersigwald avatar Jul 07 '22 20:07 rainersigwald

@rainersigwald, here is artifacts/log/Debug/Build.binlog from ./build.sh clr+libs --binarylog: https://github.com/kasperk81/runtime/releases/download/msbuild_binlogs/Build.binlog

weirdly enough, the total duration is smaller than core compile task alone:

image

kasperk81 avatar Jul 08 '22 19:07 kasperk81

the total duration is smaller than core compile task alone

That is not unexpected: total duration is "total wall-clock time", while task times are "sum of wall-clock time", and we can run many of them in parallel on your 16-core machine.

Taking a quick look at the log nothing is jumping out at me; there are some long serialized segments but they all seemed inherent to the nature of the repo, for example compiling/linking/crossgenning corelib before unblocking a bunch of stuff (that presumably depends on corelib).

rainersigwald avatar Jul 08 '22 19:07 rainersigwald

nothing is jumping out at me

ok, in that case, we can close this. it just always feels like after quickly building couple of libraries, the build slows down then it continues on degraded speed. it is probably tied to project-to-project dependency graph; the leaf nodes build quicker than the others.

kasperk81 avatar Jul 08 '22 20:07 kasperk81