[3pt] graph-prototype: Optimize compilation time

Open ivan-cukic opened this issue 1 year ago • 1 comments

GP is a TMP-heavy library. See if compilation times can be optimized without sacrificing the runtime performance.

Primary focus - document:

identify bottlenecks -- what are the heaviest template constructs
iterate on SFINAE vs. concepts
evaluate PCH
...

Jul 14 '23 09:07 ivan-cukic

This task can be tackled using Clang's -ftime-trace compile flag and its built-in compile-time instrumentation, as described in a bit more detail here, and as partially implemented and tested in PR#299.

How to compile and enable compile-time profiling:

cmake -GNinja -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_CXX_FLAGS="-fuse-ld=lld -ftime-trace" ..

This produces a set of .json files, one per compilation unit, containing the compile-time trace information. They can be inspected either with Chrome's built-in interface (enter chrome://tracing in the URL bar) or via the https://ui.perfetto.dev/ site (same functionality, perhaps a slightly nicer UI). Both yield flame graphs that can be drilled down into, much like any runtime performance analysis.
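For a quick text summary before opening a flame graph, a single trace file can also be scanned with a few lines of Python. The following is a minimal sketch, not part of the thread: the helper names (summarize_trace, summarize_file) are hypothetical, and it assumes the trace follows the Chrome trace-event layout with a top-level "traceEvents" list, complete events marked "ph": "X", durations in microseconds, and template work reported under the names "InstantiateClass" and "InstantiateFunction" with the instantiated entity in args.detail. These names may differ between Clang versions, so check them against an actual trace.

```python
import json
import sys
from collections import defaultdict
from pathlib import Path

# Assumption: these are the event names Clang uses for template instantiation.
TEMPLATE_EVENTS = {"InstantiateClass", "InstantiateFunction"}


def summarize_trace(trace, top=10):
    """Return (detail, total_ms, count) for the costliest template instantiations."""
    totals = defaultdict(lambda: [0, 0])  # detail -> [total microseconds, count]
    for event in trace.get("traceEvents", []):
        # Only complete events ("ph" == "X") carry a "dur" field (microseconds).
        if event.get("ph") != "X" or event.get("name") not in TEMPLATE_EVENTS:
            continue
        detail = event.get("args", {}).get("detail", "<unknown>")
        totals[detail][0] += event.get("dur", 0)
        totals[detail][1] += 1
    # Rank by accumulated duration, largest first.
    ranked = sorted(totals.items(), key=lambda item: item[1][0], reverse=True)
    return [(name, usec / 1000.0, count) for name, (usec, count) in ranked[:top]]


def summarize_file(path, top=10):
    """Load one -ftime-trace .json file and summarize it."""
    return summarize_trace(json.loads(Path(path).read_text()), top)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize_trace.py <path/to/file.json>
    for name, ms, count in summarize_file(sys.argv[1]):
        print(f"{ms:10.1f} ms  {count:5d}x  {name}")
```

Note that the durations are inclusive: an instantiation that triggers nested instantiations includes their time, so the totals overlap and should be read as a ranking rather than summed.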

Post-Processing using https://github.com/aras-p/ClangBuildAnalyzer/ (i.e. creating histograms of common template patterns):

Either on a single .json file or a whole build sub-directory:

ClangBuildAnalyzer --all <artifacts_folder> <capture_file>
   Processing all files and saving to '<capture_file>'...
   done in 4.2s. Run 'ClangBuildAnalyzer --analyze <capture_file> > <text output>' to analyze it.

You may want to set up a default ClangBuildAnalyzer.ini and, notably, adjust maxNameLength (default: 70) so that the full template and compile-unit names are shown.

As general guidance: first try to reduce the file- and template-specific overheads before tackling global compile-time optimisations such as PCH, unity builds, etc., since the latter may hide structural problems that re-emerge only very (too?) late in further development.

Apr 23 '24 07:04 RalphSteinhagen
