
RelWithDebInfo returns faster benchmark results than Release, I have no idea why

Open agoscinski opened this issue 4 years ago • 3 comments

I am not sure how we should deal with this. I don't think this is for the feat/interpolator branch to "solve", hence this separate issue.

RelWithDebInfo returns faster benchmark results than Release, I have no idea why

Might be an issue of code cache explosion: when using -O3 (which I believe is the default for Release), the inliner can go crazy, resulting in very large assembly/machine code. This negatively impacts performance, since the CPU has to fetch and decode all this code, which does not fit in the L1 cache anymore.

What is the time difference between -O2 and -O3 here? Also, does this happen with GCC or clang?

Originally posted by @Luthaf in https://github.com/cosmo-epfl/librascal/pull/113#issuecomment-533188183
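
With GCC, CMake's stock flags are `-O3 -DNDEBUG` for Release and `-O2 -g -DNDEBUG` for RelWithDebInfo (unless the project overrides them), so comparing the two build types is close to the -O2 vs -O3 comparison asked for above. A minimal sketch of how the timings could be compared, assuming one build directory per build type; the directory names in the usage comment are hypothetical, and wall-clock time includes process start-up, so this is only a rough check:

```python
import statistics
import subprocess
import time


def median_runtime(cmd, repeats=5):
    """Run `cmd` `repeats` times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the benchmark exits with a non-zero status.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_seconds, candidate_seconds):
    """Return how many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


# Usage (hypothetical paths, adapt to the actual build layout):
#   t_release = median_runtime(["build-release/profile_scalar_cubic_spline"])
#   t_reldeb = median_runtime(["build-relwithdebinfo/profile_scalar_cubic_spline"])
#   print(f"RelWithDebInfo speedup over Release: {speedup(t_release, t_reldeb):.2f}")
```

Taking the median rather than the mean keeps a single noisy run from dominating the result.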

With GNU (g++ 7.4.0), the speedup of RelWithDebInfo over Release is sometimes marginal (1.01 for profile_matrix_cubic_spline), but can also be 1.2 (for profile_scalar_cubic_spline). Indeed, the LLC miss count is 3030k (Release) vs 2220k (RelWithDebInfo) for profile_matrix_cubic_spline.

With clang I actually have problems running the benchmarks; added to my TODOs (yey)

Originally posted by @agoscinski in https://github.com/cosmo-epfl/librascal/pull/113#issuecomment-533224198
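
To put the LLC miss counts quoted above in relative terms, here is a small worked calculation. It uses only the two numbers reported for profile_matrix_cubic_spline; the variable names are illustrative:

```python
# LLC miss counts reported above for profile_matrix_cubic_spline, in thousands.
llc_misses_release = 3030
llc_misses_relwithdebinfo = 2220

# How many times more misses Release incurs than RelWithDebInfo.
ratio = llc_misses_release / llc_misses_relwithdebinfo

# Fraction of Release's misses that RelWithDebInfo avoids.
reduction = 1 - llc_misses_relwithdebinfo / llc_misses_release

print(f"Release incurs {ratio:.2f}x the LLC misses of RelWithDebInfo")  # 1.36x
print(f"RelWithDebInfo has {reduction:.1%} fewer LLC misses")  # 26.7%
```

Note that this is the benchmark where the reported speedup was only 1.01, so the gap in miss counts does not translate directly into runtime here.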

agoscinski, Sep 20 '19 08:09