gentooLTO LTO benchmarks

Hi guys, I have always wondered how much faster an LTO-optimized system is compared to a default one. One could use the Phoronix Test Suite, but I wrote some simple bash scripts that run 28 benchmarks. Here are the results:

Test on Intel Xeon E3-1265L V2, 2.50GHz, RUNS=20
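For anyone who wants to reproduce this kind of measurement without the original bash scripts, here is a minimal sketch of a timing loop in Python. It is not the script used for the table: the function name `time_command`, the placeholder workload, and the choice of the sample standard deviation as the "±" value are all assumptions.

```python
import statistics
import subprocess
import sys
import time

RUNS = 20  # same repeat count as the RUNS=20 quoted above


def time_command(cmd, runs=RUNS):
    """Run cmd `runs` times; return (mean, sample std dev) of wall time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    # Assumption: the spread is the sample standard deviation of the runs.
    spread = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return statistics.mean(samples), spread


if __name__ == "__main__":
    # Placeholder workload: start a Python interpreter that does nothing.
    mean, spread = time_command([sys.executable, "-c", "pass"], runs=5)
    print(f"{mean:.3f}±{spread:.3f} s")
```

Replace the placeholder command with the real benchmark invocation (for example a compression run on a fixed input file).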

Default Gentoo: -march=native -O2 -march=native
Default LTO: -march=native -O3 ${GRAPHITE} ${DEVIRTLTO} ${IPAPTA} ${SEMINTERPOS} ${FLTO} -fuse-linker-plugin
Only LTO: -march=native -O2 ${FLTO} -fuse-linker-plugin

World was rebuilt every time; however, the script lists only the packages that need to be rebuilt for the benchmarks.

For relative performance, + means faster (better) and - means slower (worse).
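The percentage columns in the table are consistent with taking the ratio of the two run times and propagating the two uncertainties to first order. The sketch below is inferred from the numbers, not taken from the original scripts, and the function name `relative_speed` is made up for illustration.

```python
import math


def relative_speed(t_ref, s_ref, t_new, s_new):
    """Return (percent, uncertainty) of how much faster t_new is than t_ref.

    Positive means the new build is faster, negative means slower.
    """
    ratio = t_ref / t_new
    # First-order error propagation for a quotient of independent measurements.
    rel_err = math.sqrt((s_ref / t_ref) ** 2 + (s_new / t_new) ** 2)
    return (ratio - 1.0) * 100.0, ratio * rel_err * 100.0


# bash row: Default Gentoo 12.369±0.017 s, Default LTO 11.480±0.074 s
pct, err = relative_speed(12.369, 0.017, 11.480, 0.074)
print(f"{pct:+.2f}±{err:.2f}")  # prints +7.74±0.71
```

Other rows agree with this formula to within rounding of the printed times, for example lz4: 2.321 s against 4.879 s gives about -52.4%.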

Bench	Default Gentoo [s]	Default LTO [s]	Default LTO [% rel. to Gentoo]	Only LTO [s]	Only LTO [% rel. to Gentoo]
bash	12.369±0.017	11.480±0.074	+7.74±0.71	11.419±0.017	+8.31±0.22
ash	8.208±0.015	8.100±0.055	+1.33±0.71	8.061±0.021	+1.82±0.32
dash	4.472±0.010	4.298±0.041	+4.04±1.02	4.280±0.014	+4.48±0.41
bc	16.764±0.011	18.458±0.014	-9.17±0.09	17.723±0.012	-5.41±0.09
java	27.484±0.089	27.678±0.122	-0.70±0.54	27.368±0.069	+0.43±0.41
lammps	22.069±0.217	21.985±0.234	+0.38±1.45	22.502±0.232	-1.92±1.40
lzop	2.422±0.008	2.487±0.031	-2.61±1.26	2.420±0.01	+0.08±0.53
lz4	2.321±0.008	4.879±0.035	-52.42±0.38	2.335±0.009	-0.60±0.51
zstd	1.744±0.020	1.777±0.029	-1.86±1.96	1.732±0.014	+0.69±1.41
gzip	25.607±0.013	24.893±0.035	+2.86±0.15	25.261±0.021	+1.36±0.10
pigz	5.472±0.008	5.156±0.025	+6.13±0.54	5.472±0.010	0.00±0.23
zopfli	21.823±0.015	21.756±0.052	+0.31±0.25	21.918±0.022	-0.43±0.12
pigz.zopfli	6.348±0.008	5.885±0.028	+7.86±0.53	5.973±0.010	+6.27±0.22
xz	13.280±0.020	13.244±0.041	+0.27±0.35	13.307±0.015	-0.20±0.19
lrzip	12.597±0.051	12.876±0.060	-2.17±0.60	12.600±0.053	-0.02±0.58
gcc	281.756±0.453	280.858±0.485	+0.31±0.24	276.708±0.379	+1.82±0.22
ccache	23.567±3.722	23.421±3.672	+0.62±22.39	23.177±3.649	+1.68±22.68
clang	704.925±0.490	578.851±0.557	+21.77±0.14	711.065±0.467	-0.86±0.09
eix	2.291±0.613	2.775±0.584	-17.44±28.10	2.562±0.520	-10.57±30.03
emerge	20.137±2.266	18.546±2.189	+8.57±17.71	18.878±1.715	+6.66±15.43
normalize	0.240±0.009	0.346±0.012	-30.63±3.54	0.237±0.007	+1.26±4.83
flac	7.572±0.123	7.074±0.248	+7.03±4.14	8.071±0.110	-6.18±1.99
ogg	7.925±0.131	7.580±0.380	+4.55±5.52	7.934±0.118	-0.11±2.22
lame	14.085±0.130	12.112±0.337	+16.28±3.41	13.896±0.125	+1.36±1.31
mencoder	22.345±0.202	22.532±0.397	-0.82±1.96	22.366±0.218	-0.09±1.33
jpegtran	23.133±0.007	23.620±0.062	-2.06±0.26	23.108±0.006	+0.10±0.04
optipng	44.846±0.013	39.837±0.035	+12.57±0.10	44.052±0.014	+1.80±0.04
zopflipng	14.085±0.023	13.756±0.047	+2.39±0.39	14.375±0.025	-2.01±0.23
average [%]			-1.79		+0.28

It looks like we gain performance in most cases, but in some (bc, lzop, lz4, lrzip, normalize, jpegtran) we lose. Most of the losses seem to be caused by the advanced optimizations rather than by LTO itself (the exceptions are bc and flac, which are also slower with LTO only).
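One way to separate real regressions from noise is to flag only the benchmarks whose slowdown exceeds a multiple of the quoted uncertainty. The sketch below uses the Default LTO percentages from the table; the 2-sigma threshold and the function name `significant_losses` are assumptions chosen for illustration.

```python
# (benchmark, Default LTO change in %, uncertainty in %), copied from the table
DEFAULT_LTO = [
    ("bash", 7.74, 0.71), ("ash", 1.33, 0.71), ("dash", 4.04, 1.02),
    ("bc", -9.17, 0.09), ("java", -0.70, 0.54), ("lammps", 0.38, 1.45),
    ("lzop", -2.61, 1.26), ("lz4", -52.42, 0.38), ("zstd", -1.86, 1.96),
    ("gzip", 2.86, 0.15), ("pigz", 6.13, 0.54), ("zopfli", 0.31, 0.25),
    ("pigz.zopfli", 7.86, 0.53), ("xz", 0.27, 0.35), ("lrzip", -2.17, 0.60),
    ("gcc", 0.31, 0.24), ("ccache", 0.62, 22.39), ("clang", 21.77, 0.14),
    ("eix", -17.44, 28.10), ("emerge", 8.57, 17.71),
    ("normalize", -30.63, 3.54), ("flac", 7.03, 4.14), ("ogg", 4.55, 5.52),
    ("lame", 16.28, 3.41), ("mencoder", -0.82, 1.96),
    ("jpegtran", -2.06, 0.26), ("optipng", 12.57, 0.10),
    ("zopflipng", 2.39, 0.39),
]


def significant_losses(rows, n_sigma=2.0):
    """Names of benchmarks that are slower by more than n_sigma uncertainties."""
    return [name for name, pct, err in rows
            if pct < 0 and abs(pct) > n_sigma * err]


print(significant_losses(DEFAULT_LTO))
# ['bc', 'lzop', 'lz4', 'lrzip', 'normalize', 'jpegtran']
```

With this threshold the flagged set is exactly the six packages named above; java, zstd, eix and mencoder are also negative but within their error bars.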

Also, averaging over all packages shows a loss of 1.79% for default gentooLTO and a fairly small gain of 0.28% for LTO only.
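The two averages appear to be geometric means of the per-benchmark speed ratios rather than plain arithmetic means of the percentages. That is an inference from the numbers in the table, not something stated in the post, and the function names below are made up for illustration.

```python
import math

# Percent columns copied from the table, in row order (bash ... zopflipng).
DEFAULT_LTO = [7.74, 1.33, 4.04, -9.17, -0.70, 0.38, -2.61, -52.42, -1.86,
               2.86, 6.13, 0.31, 7.86, 0.27, -2.17, 0.31, 0.62, 21.77,
               -17.44, 8.57, -30.63, 7.03, 4.55, 16.28, -0.82, -2.06,
               12.57, 2.39]
ONLY_LTO = [8.31, 1.82, 4.48, -5.41, 0.43, -1.92, 0.08, -0.60, 0.69,
            1.36, 0.00, -0.43, 6.27, -0.20, -0.02, 1.82, 1.68, -0.86,
            -10.57, 6.66, 1.26, -6.18, -0.11, 1.36, -0.09, 0.10,
            1.80, -2.01]


def geometric_mean_percent(percents):
    """Geometric mean of the speed ratios (1 + p/100), returned as a percent."""
    logs = [math.log(1.0 + p / 100.0) for p in percents]
    return (math.exp(sum(logs) / len(logs)) - 1.0) * 100.0


def arithmetic_mean_percent(percents):
    """Plain average of the percentages, for comparison."""
    return sum(percents) / len(percents)


for label, column in (("Default LTO", DEFAULT_LTO), ("Only LTO", ONLY_LTO)):
    print(f"{label}: geometric {geometric_mean_percent(column):+.2f}%, "
          f"arithmetic {arithmetic_mean_percent(column):+.2f}%")
```

The geometric means come out at about -1.79% and +0.28%, matching the table, while the arithmetic means are roughly -0.5% and +0.3%. The gap for Default LTO is almost entirely due to lz4, whose halved speed weighs more heavily in a geometric mean.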

Should we optimize the packages separately? What do you think?

Sep 04 '20 15:09 jfikar

This looks really cool, thanks for taking the time to do world rebuilds. I'll give it a shot soon. One suggestion: mention the versions used. If certain versions are blocked, a prefix could be used to build in.

Sep 05 '20 03:09 jiblime

Wow, thanks for doing this! I wouldn't be surprised at all at this point if Graphite were to blame for some of the negative discrepancies shown. One interesting result is lz4 taking around twice the amount of time.

Sep 26 '20 15:09 InBetweenNames

Tried lz4, as the one "suffering" the most. On my Ryzen I did not see any difference: -O2, -O3 and the full-fledged LTO package showed the same performance, within maybe a 1% gap. The ebuild is dead simple and obviously doesn't fiddle with CFLAGS. So it may be some Intel+GCC issue...

Jan 21 '21 20:01 kanyck

More results from another benchmark for your viewing pleasure: https://openbenchmarking.org/result/1307063-UT-GCCOPTIMI03

Feb 16 '21 04:02 WillPower3309

Looks like Graphite doesn't bring any benefits, to put it mildly...

Feb 18 '21 12:02 kanyck

Looks like a ten-year-old benchmark, first of all.

Feb 18 '21 16:02 pchome

IMHO Graphite should be disabled by default. I had some ugly and unpredictable bugs due to it, and not only does it not seem to bring any performance benefits, it can also significantly slow down some programs, like zstd.

Feb 18 '21 17:02 barolo

> IMHO Graphite should be disabled by default. I had some ugly and unpredictable bugs due to it, and not only does it not seem to bring any performance benefits, it can also significantly slow down some programs, like zstd.

I believe Clear Linux enables or disables Graphite on a per-bundle basis. They've never said that Graphite is enabled system-wide.

Feb 18 '21 18:02 addeps3

> Looks like Graphite doesn't bring any benefits, to put it mildly...

For a lot of the tests in the link I sent, lower is better on the scale; however, there are definitely some conflicting results there.

Feb 18 '21 18:02 WillPower3309
